Suricata InLine with igb NICs
-
Maybe don't change this unless you run into other issues, but the remarks at the link below suggest that hyperthreading (which you have enabled) may limit your throughput.
https://calomel.org/freebsd_network_tuning.html
# Disable Hyper Threading (HT), also known as Intel's proprietary simultaneous
# multithreading (SMT) because implementations typically share TLBs and L1
# caches between threads which is a security concern. SMT is likely to slow
# down workloads not specifically optimized for SMT if you have a CPU with more
# than two(2) real CPU cores. Secondly, multi-queue network cards are as much
# as 20% slower when network queues are bound to both real CPU cores and SMT
# virtual cores due to interrupt processing collisions.
# machdep.hyperthreading_allowed="0"  # (default 1, allow Hyper Threading (HT))
That last sentence seems to apply in your situation. They note they've used the config with an i350. I don't see a lot of netmap-specific configuration in there, so ymmv.
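If you do decide to test it, the tunable from that page would go into a loader file such as /boot/loader.conf.local and take effect after a reboot. A minimal sketch (filename assumed; adjust to however you manage loader tunables):
# /boot/loader.conf.local -- test disabling SMT/Hyper-Threading
machdep.hyperthreading_allowed="0"
After the reboot, sysctl machdep.hyperthreading_allowed should read back 0.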
This is unrelated to the "bad pkt" error.
-
@boobletins said in Suricata InLine with igb NICs:
Ok -- I tried to thumbs-up some of your posts hoping that will help with Akismet.
It should. Users with a reputation of 5 or more should never see Akismet.
I voted a few posts too so that is now the case.
Steve
-
boobletins/stephenw10...I tried posting again but unfortunately I keep getting:
Post content was flagged as spam by Akismet.com
I apologize, I keep trying to post.
I'll try posting a little at a time again if it will let me.
-
ifconfig igb3 [I redacted out IP/MAC addresses]:
igb3: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 1500
options=1000b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,NETMAP>
ether
hwaddr
inet6 %igb3 prefixlen 64 scopeid 0x4
inet netmask broadcast
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
-
The only other System Tunable I changed was from here:
https://www.netgate.com/docs/pfsense/hardware/tuning-and-troubleshooting-network-cards.html?highlight=tuning
net.inet.ip.intr_queue_maxlen (Maximum size of the IP input queue): 3000
I believe it was originally set to 1000. I just never changed it back. I can change it back if need be.
Although I've kept some of the tunables in my loader.conf.local file for testing, I've commented them out with #, so nothing there should be loading:
#hw.igb.rxd="1024"
#hw.igb.txd="1024"
#hw.igb.enable_aim=1
#hw.igb.num_queues=0
#kern.ipc.nmbclusters="1000000"
#hw.pci.enable_msi=0
#hw.igb.max_interrupt_rate="32000"
#hw.igb.fc_setting=0
#hw.igb.txd=4096
#hw.igb.rxd=4096
-
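(Side note: an easy way to tell whether that IP input queue depth actually matters on this box is the companion drop counter -- a quick check, assuming the stock FreeBSD sysctl names:)
sysctl net.inet.ip.intr_queue_maxlen   # currently 3000 here (1000 is the default)
sysctl net.inet.ip.intr_queue_drops    # increments whenever the IP input queue overflows
If the drops counter stays at 0, reverting to the default should be harmless.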
sysctl -a | grep nmbclusters
kern.ipc.nmbclusters: 4076726
sysctl -a | grep msi
hw.ixl.enable_msix: 1
hw.sdhci.enable_msi: 1
hw.puc.msi_disable: 0
hw.pci.honor_msi_blacklist: 1
hw.pci.msix_rewrite_table: 0
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1
hw.mfi.msi: 1
hw.malo.pci.msi_disable: 0
hw.ix.enable_msix: 1
hw.igb.enable_msix: 1
hw.em.enable_msix: 1
hw.cxgb.msi_allowed: 2
hw.bce.msi_enable: 1
hw.aac.enable_msi: 1
machdep.disable_msix_migration: 0
sysctl -a | grep num_queues
hw.ix.num_queues: 0
hw.igb.num_queues: 0
-
dmesg | grep igb3 [I redacted out IP/MAC addresses]
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/4096, RX 8/4096
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
arpresolve: can't allocate llinfo for on igb3
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
igb3: link state changed to UP
arpresolve: can't allocate llinfo for on igb3
I added a dev.netmap.buf_size entry to System Tunables and set the value to 4096. I restarted pfSense and then pushed as much traffic through it as I could. I didn't get any netmap_grab_packets errors. I'm now wondering if there is a maximum netmap buffer size.
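For anyone following along, this is roughly what that change amounts to from a shell -- a sketch, assuming the System Tunables entry applies the value as an ordinary sysctl at boot:
sysctl dev.netmap.buf_size        # stock default is 2048
sysctl dev.netmap.buf_size=4096   # what the System Tunables entry sets; the reboot lets netmap re-allocate its buffers at the new size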
I look forward to doing whatever change/testing we can to find a solution. Thanks for the continued help!
-
@newuser2pfsense said in Suricata InLine with igb NICs:
I'm now wondering if there is a maximum netmap buffer size.
With 64 GB of RAM, you should be able to take that tunable very high, but there isn't a need unless the bad pkt error returns with a larger second number (the reported packet length). I'd wait and see; otherwise you are locking up memory for no reason.
I can't explain how you're getting packets of size ~2100 with an MTU of 1500. Maybe JUMBO_MTU allows for that (I don't know). It could also be that something else on your network has a larger MTU setting. I'm not sure how FreeBSD handles those situations. If you're interested you can check any switches and clients to see and adjust accordingly.
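(On the firewall itself that check is quick; switches and clients have to be checked per device. A sketch:)
ifconfig -a | grep mtu        # shows the MTU for every interface on the firewall
sysctl dev.netmap.buf_size    # the largest frame netmap will accept before logging "bad pkt"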
If the system is handling as much throughput as you can throw at it, then I'd leave everything alone for now.
If you run into throughput or interrupt issues, then consider disabling hyperthreading. The dmesg output indicates that queues are being bound to both physical and virtual (SMT) cores, which may become an issue depending on how hard you push the interfaces.
-
Once the applicable system tunables are nailed down and some "good" typical values are established, this thread should be made a sticky post, or a new, dedicated sticky post created containing the relevant settings. The netmap bad packets error has plagued a lot of Suricata Inline IPS Mode users.
-
boobletins...I'll let it run for a while with all of the tweaks we've made and check it periodically for any netmap_grab_packets errors.
bmeeks...I agree.
-
I let my system run for just over a week and I noticed this evening that I couldn't access the interwebs for some reason. I restarted my pfSense computer and everything seemed to go back to normal. I then noticed a few minutes ago the following on the console:
kernel 492.136807 [1071] netmap_grab_packets bad pkt at 878 len 4939
kernel 490.136919 [1071] netmap_grab_packets bad pkt at 667 len 4939
kernel 489.136703 [1071] netmap_grab_packets bad pkt at 933 len 4939
kernel 488.636876 [1071] netmap_grab_packets bad pkt at 875 len 4939
kernel 488.435620 [1071] netmap_grab_packets bad pkt at 806 len 4939
kernel 488.235492 [1071] netmap_grab_packets bad pkt at 766 len 4939
Interesting. I guess I'm going to have to bump up my dev.netmap.buf_size from 4096 to a larger value. I have 64 GB of RAM in my pfSense computer, so maybe I'll bump it up to 8192 and see how that works. Has anyone had a related experience after tuning their system?
Update - Since changing the buffer size to 8192, I've noticed webpages load a tad slower.
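In case it helps when weighing the 8192 setting: the memory netmap reserves is roughly buf_num x buf_size, so doubling buf_size doubles that footprint. Both knobs are visible in one place (names assumed from the stock netmap sysctls):
sysctl dev.netmap.buf_size dev.netmap.buf_num
# approximate netmap buffer pool = buf_size * buf_num bytes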
-
I still periodically see packets larger than my MTU and dev.netmap.buf_size. I haven't been able to track down the source. After tuning, it's down to something like once per week, often without any interface hiccup.
I opened a support question here: https://redmine.openinfosecfoundation.org/issues/2720 -- but so far there's no information. I don't think it's a Suricata issue: I'm no expert, but I don't see anything in the Suricata netmap code that would be adding length to packets.
It's possible that this type of noise is always there, but the netmap configuration is more sensitive to violations of MTU/buf_size.
Really the error message just indicates that a packet was dropped because it exceeded the available buffer length. I believe the interface flap after that is due to the watchdog cycling the interface because it sees high packet loss (or latency). Packets are presumably dropped all the time by the OS and we're only aware of them because we're looking for netmap errors now.
For the record: my logs show the last errors on 12/6 with the same packet size you have above:
kernel: 338.512666 [1071] netmap_grab_packets bad pkt at 1054 len 4939
kernel: 338.714285 [1071] netmap_grab_packets bad pkt at 1073 len 4939
kernel: 338.914864 [1071] netmap_grab_packets bad pkt at 1089 len 4939
kernel: 339.423360 [1071] netmap_grab_packets bad pkt at 1203 len 4939
kernel: 340.414473 [1071] netmap_grab_packets bad pkt at 1484 len 4939
kernel: 342.414619 [1071] netmap_grab_packets bad pkt at 1542 len 4939
kernel: 346.414451 [1071] netmap_grab_packets bad pkt at 2009 len 4939
The same size strikes me as a little odd -- what's putting packets of that exact size on the wire? They happen so rarely now that I don't want to run a pcap for weeks to catch them. I don't see any particularly odd traffic at the time in my logs (though of course the bad packets are dropped, so if they're all bad nothing would show up).
I'd be curious to know the output of "sysctl -a | grep missed_packets" -- or, more precisely, I'd be curious whether you could note those numbers now and compare them after the next "bad pkt" errors, to see if the NIC counters are still incremented by netmap or if we lose that reporting. If they are still accurately incremented on a packet miss, then we should be able to compare inline mode to legacy mode and see whether there's any significant increase in packet loss with netmap. I suspect there isn't; netmap is just louder about its misses.
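A low-effort way to make that comparison is to snapshot the counters now and diff them after the next burst of errors, e.g.:
sysctl -a | grep missed_packets > /root/missed_before.txt
# ...wait for the next "bad pkt" burst, then:
sysctl -a | grep missed_packets > /root/missed_after.txt
diff /root/missed_before.txt /root/missed_after.txt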
-
Here are a few Netmap-related links I've found. There are some references in these about various netmap errors and issues, particularly around stripping of VLAN tags and problems with flow control. Lots of the issues are NIC driver specific.
https://github.com/luigirizzo/netmap/blob/master/LINUX/README
http://freebsd.1045724.x6.nabble.com/Netmap-ixgbe-stripping-Vlan-tags-td5838105.html
-
@bmeeks said in Suricata InLine with igb NICs:
Here are a few Netmap-related links I've found. There are some references in these about various netmap errors and issues, particularly around stripping of VLAN tags and problems with flow control. Lots of the issues are NIC driver specific.
https://github.com/luigirizzo/netmap/blob/master/LINUX/README
http://freebsd.1045724.x6.nabble.com/Netmap-ixgbe-stripping-Vlan-tags-td5838105.html
I've read through the man pages and netmap code several times now (which is why I'm so confident I know what the errors mean: https://github.com/luigirizzo/netmap/blob/master/sys/dev/netmap/netmap.c#L1169 )
VLAN tag stripping isn't an issue for me, but there was an interesting bit in that link:
When you switch an interface to netmap mode it does a soft-reset first. That reverts the vlanhwfilter configuration to default on. It's not netmap that does it but the driver. It seems to happen in or around ixgbe_setup_vlan_hw_support().
I just tested this on igb and em drivers, and both keep the vlanhwfilter setting across a netmap restart (along with other settings -- checksum offloading most importantly).
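(If anyone wants to repeat that check, the capability shows up in the interface options line and can be toggled with ifconfig -- a sketch, using igb3 as the example interface:)
ifconfig igb3 | grep options    # look for VLAN_HWFILTER in the options= list
ifconfig igb3 -vlanhwfilter     # turn it off, restart Suricata/netmap, then re-check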
I think that the "bad pkt" error is "resolved" for both of us and probably more broadly. My guess is that the remaining errors are normal network noise that is just noisier than usual because the code writes to syslog for every dropped (because malformed) packet.
A flood of "bad pkt" errors would result if a user had a misconfiguration or an incompatible card (eg MTU set to 9000 to support jumbo frames with a netmap.buf_size of default 2048 would result in huge numbers of "bad pkt" errors) -- and so we're thinking that "bad pkt" means something isn't working correctly. Really it just means what it says -- a bad packet was received that is in violation of both our MTU and our buf_size. The packet should be dropped. Why someone is sending us a packet of size 4939 when we're advertising an MTU of 1500 is a good question.
To really get to the bottom of it I would need to capture and decode one of the oversized packets. Without having to capture enormous amounts of traffic, the best way I can think of would be to recompile netmap/Suricata with an expanded error message that outputs the packet in base64 to the log for analysis. I'll see how complicated it is.
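One way to avoid weeks of full capture might be a size-filtered capture using tcpdump's length primitive; whether the host stack even sees these frames while netmap owns the rings is an open question, so this may simply come up empty:
tcpdump -i igb3 -s 0 -w /root/oversize.pcap 'greater 1514'   # only frames longer than a standard 1500-byte-MTU Ethernet frame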
I started a write-up on how to troubleshoot netmap errors and got discouraged when my initial post was rejected by Akismet. If I write it up and send it your way, can you get it posted Bill?
-
@boobletins said in Suricata InLine with igb NICs:
I started a write-up on how to troubleshoot netmap errors and got discouraged when my initial post was rejected by Akismet. If I write it up and send it your way, can you get it posted Bill?
Sure! Write it up in a format that would make a good Sticky Post to put at the top of this forum along with the others. Give it a title to make it clear what it's about. Be sure to give yourself the credit in the notes, and you can even ask one of the Forum moderators to post the sticky for you. I have asked them to post mine in the past. You can ask @jimp or @johnpoz if they will make it a Sticky Post in this forum. If you run into difficulties or have a question, just let me know.
-
Nice work.
-
@boobletins said in Suricata InLine with igb NICs:
I'll see how complicated it is.
It looks like this would require a kernel rebuild which I'm not really up to -- I'd then have to run that experimental build on my firewall (and I've never built one before).
-
@boobletins said in Suricata InLine with igb NICs:
@boobletins said in Suricata InLine with igb NICs:
I'll see how complicated it is.
It looks like this would require a kernel rebuild which I'm not really up to -- I'd then have to run that experimental build on my firewall (and I've never built one before).
Enabling debugging or extra error messages from within netmap itself will require rebuilding the kernel module. Though I've never done it, you might be able to build a compatible module with debugging enabled using a vanilla FreeBSD 11.2 machine. Then just copy the kernel module over to pfSense. If you have virtual machines, you could do this with not much risk. Just save a snapshot, install the new netmap kernel module and give it a try. If it breaks badly, just restore the previous snapshot.
Turning on debugging with Suricata is relatively easy, but I don't think any really useful information will be gleaned from Suricata itself. I think this issue is between the NIC drivers, the netmap kernel module and the kernel itself.
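Before going the rebuild route, it may also be worth checking whether the stock module already exposes a verbosity knob; netmap normally has one as a sysctl (assuming it is present in this build):
sysctl dev.netmap.verbose      # 0 by default
sysctl dev.netmap.verbose=1    # extra netmap messages in the system log; set back to 0 afterwards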
-
That's how I built the hyperscan module -- a fresh VM and then followed the directions.
If we think it's possible to do something similar and just copy over the netmap module, then I can try that.
-
@boobletins said in Suricata InLine with igb NICs:
That's how I built the hyperscan module -- a fresh VM and then followed the directions.
If we think it's possible to do something similar and just copy over the netmap module, then I can try that.
I think it should work just fine. There is nothing customized about the netmap module in pfSense. They just enable the module to be built along with the kernel. I believe there are some minor pfSense-specific tweaks to the FreeBSD kernel code, but nothing to the netmap module itself.