Suricata InLine with igb NICs
-
@boobletins said in Suricata InLine with igb NICs:
So here are some initial suggestions. Please keep in mind that I've been working on this for ~1 week (in other words: not long), and I'm not a FreeBSD, pfSense, or Suricata expert.
Start by making a backup of your configuration.
Do these first:
My understanding is that flow control should be off on any netmap interface. You have bi-directional flow control enabled:
dev.igb.0.fc: 3
Disable flow control on all active interfaces using system tunables. Set dev.igb.0.fc=0 (and dev.igb.1.fc=0).
Actively set energy efficient ethernet to disabled:
dev.igb.0.eee_disabled=1
Actively force IPv6_TXCSUM6 off by adding the following to config.xml in a shellcmd tag:
ifconfig igb0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
(see above in this thread for a link on where/how to do that).
Edit:
To be clear: anywhere I have a command that says "igb0" or "igb.0" you will want to duplicate that for igb1 and any other interface you're running netmap on.
So you will need 2 shellcmd lines in config.xml, and two new system tunables for flow control, etc.
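To make the "duplicate it per interface" step concrete, here is a minimal sketch that prints the tunables and shellcmd lines suggested above for a list of igb unit numbers. The helper names and the example unit list are hypothetical; only the tunable names and ifconfig flags come from this thread.

```python
# Hypothetical helpers: only the tunable names and ifconfig flags come from the thread.
OFFLOAD_FLAGS = "-rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso"


def tunables_for(unit):
    """Return the system tunables suggested in this thread for one igb unit."""
    return {
        f"dev.igb.{unit}.fc": "0",            # flow control off
        f"dev.igb.{unit}.eee_disabled": "1",  # energy efficient ethernet off
    }


def shellcmd_for(unit):
    """Return the config.xml shellcmd line that disables offloads on one interface."""
    return f"<shellcmd>ifconfig igb{unit} {OFFLOAD_FLAGS}</shellcmd>"


def plan(units):
    """Collect tunables and shellcmd lines for every netmap interface unit."""
    tunables = {}
    for unit in units:
        tunables.update(tunables_for(unit))
    return tunables, [shellcmd_for(unit) for unit in units]


if __name__ == "__main__":
    # Assumption: netmap runs on igb0 and igb1; change the list to match your box.
    tunables, shellcmds = plan([0, 1])
    for name, value in tunables.items():
        print(f"{name}={value}")
    for line in shellcmds:
        print(line)
```

This only generates text to copy into the UI and config.xml; it does not change any setting by itself.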
Consider changing later:
Set rx processing limit:
dev.igb.0.rx_processing_limit: -1
It looks like your txd and rxd are both set to 1024 currently. I suggest you move those to 4096:
hw.igb.txd=4096
hw.igb.rxd=4096
By changing your txd and rxd we may need to revisit your netmap buf/ring (memory settings).
We may also revisit your interrupt and queue settings.
Boobletins, I will need to revisit this later. Currently, I am happy with just adjusting buf_size to 4096 and disabling IPv6. I haven't had any alerts since, and my Internet will be down for a while because I'm moving.
-
So you're running netmap/IPS mode on igb0 (LAN), igb1 (OPT?), and igb3 (WAN)?
What type of CPU is in the machine (# of cores?, is hyper-threading enabled)? How much RAM?
Are you saturating all 3 active interfaces? Or just 2?
Start by making a backup of your configuration.
First disable flow control (as discussed above):
You have the following on all igb interfaces, which means bi-directional flow control is enabled:
dev.igb.0.fc: 3
Change to fc=0 on all netmap interfaces in system tunables. This will take ethernet flow control out of the picture in favor of higher level flow control (TCP) which is less likely to mess with buffering and clog things up.
Let's look at what generates this particular netmap error:
From http://web.mit.edu/freebsd/head/sys/dev/netmap/netmap.c
/*
 * put a copy of the buffers marked NS_FORWARD into an mbuf chain.
 * Take packets from hwcur to ring->head marked NS_FORWARD (or forced)
 * and pass them up. Drop remaining packets in the unlikely event
 * of an mbuf shortage.
 */
static void
netmap_grab_packets(struct netmap_kring *kring, struct mbq *q, int force)
{
    u_int const lim = kring->nkr_num_slots - 1;
    u_int const head = kring->ring->head;
    u_int n;
    struct netmap_adapter *na = kring->na;

    for (n = kring->nr_hwcur; n != head; n = nm_next(n, lim)) {
        struct mbuf *m;
        struct netmap_slot *slot = &kring->ring->slot[n];

        if ((slot->flags & NS_FORWARD) == 0 && !force)
            continue;
        if (slot->len < 14 || slot->len > NETMAP_BUF_SIZE(na)) {
            RD(5, "bad pkt at %d len %d", n, slot->len);
            continue;
        }
        slot->flags &= ~NS_FORWARD; // XXX needed ?
        /* XXX TODO: adapt to the case of a multisegment packet */
        m = m_devget(NMB(na, slot), slot->len, 0, na->ifp, NULL);

        if (m == NULL)
            break;
        mbq_enqueue(q, m);
    }
}
I'm no C expert, but as I read this code there are 2 ways to generate your error in netmap:
- a slot is of size less than 14
- a slot is of size greater than the netmap buffer can handle
I don't know what the magic number 14 represents, but let's assume it's some kind of minimum packet size we can't control. If that's the case, then the bad_pkt error is generated from packets that are actually bad.
That's not what you have. The error is telling us the current hwcur value (the first number - the slot number in the ring) and the length or size of the slot (eg #777 with len 2154).
So this is a memory issue. The error would be better off saying something like "dropped a packet because it was too short or too large!" -- but that would be useful to others and is thus verboten ;)
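The length test from the quoted C can be restated as a small sketch to see which side of the check a given log line trips. One assumption of mine, not something the quoted source states: 14 bytes happens to be the size of an Ethernet header (two 6-byte MAC addresses plus a 2-byte EtherType), which would explain the lower bound. The function name is hypothetical.

```python
# Assumption: the magic number 14 is the Ethernet header length
# (6-byte destination MAC + 6-byte source MAC + 2-byte EtherType).
ETHER_HDR_LEN = 14


def is_bad_pkt(slot_len, buf_size):
    """Mirror the check in netmap_grab_packets: too short or larger than the buffer."""
    return slot_len < ETHER_HDR_LEN or slot_len > buf_size


if __name__ == "__main__":
    # Lengths taken from the errors in this thread, checked against buf_size=2048.
    for length in (2154, 2147, 1514):
        print(length, is_bad_pkt(length, 2048))
```

With buf_size at 2048, the 2154-byte slots fail the upper bound, not the lower one, which matches the reading that this is about buffer size rather than malformed packets.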
edited: Removed incorrect speculation. Skip to my latest post.
-
@boobletins said in Suricata InLine with igb NICs:
I guess it depends on what NETMAP_BUF_SIZE(na) is returning. It should be either the available memory for netmap buffers, or the available kernel buffers (for the host adapter).
From: https://github.com/luigirizzo/netmap/blob/master/sys/dev/netmap/netmap_kern.h
#define NETMAP_BUF_SIZE(_na) ((_na)->na_lut.objsize)
...
struct netmap_adapter {
    ...
    struct netmap_lut {
        struct lut_entry *lut;
        struct plut_entry *plut;
        uint32_t objtotal; /* max buffer index */
        uint32_t objsize;  /* buffer size */
    };

    /* memory allocator (opaque)
     * We also cache a pointer to the lut_entry for translating
     * buffer addresses, the total number of buffers and the buffer size.
     */
    struct netmap_mem_d *nm_mem;
    struct netmap_mem_d *nm_mem_prev;
    struct netmap_lut na_lut;
It's returning netmap adapter buffer size.
Let's see.
Your dev.netmap.buf_size is 2048, and the slot lengths it was trying to process were all > 2048 when the errors were generated.
That makes a certain kind of sense. Why were the slots larger?
Wait. What's your MTU set to on these interfaces? It has to be > 2048? Check this with 'ifconfig igb0' for each interface.
Some sanity checks when enabling netmap would save people a lot of headaches. If your MTU is 10000 and your dev.netmap.buf_size=2048, then netmap will always choke.
Know that if you set dev.netmap.buf_size to some obscenely high number to cover an equally high MTU, netmap will preallocate all of that memory and sit on it.
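A rough way to see the memory cost: as I understand it, the preallocated pool is approximately the number of buffers times the buffer size. The sketch below assumes the buffer count comes from dev.netmap.buf_num and uses 163840 purely as an example value; check `sysctl dev.netmap.buf_num` on your own system before trusting the numbers. The function names are hypothetical.

```python
def netmap_pool_bytes(buf_num, buf_size):
    """Approximate netmap buffer pool size: buffer count times bytes per buffer."""
    return buf_num * buf_size


def to_mib(num_bytes):
    """Convert bytes to MiB for readability."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Assumption: 163840 buffers; read the real value from dev.netmap.buf_num.
    buf_num = 163840
    for buf_size in (2048, 4096, 8192):
        pool = netmap_pool_bytes(buf_num, buf_size)
        print(f"buf_size={buf_size}: about {to_mib(pool):.0f} MiB")
```

The point is only that the cost scales linearly: doubling buf_size doubles the memory netmap sits on, whether or not large packets ever arrive.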
-
boobletins...Presently I'm using Inline IPS Mode and I only have Suricata running on my WAN and that's igb3. I'm using igb0 and igb1 as well for my WLAN and LAN.
CPU:
Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Current: 4000 MHz, Max: 4001 MHz
8 CPUs: 1 package(s) x 4 core(s) x 2 hardware threads
AES-NI CPU Crypto: Yes (active)
Memory:
64 Gig
System Tunables addition:
Tunable Name              Description                          Value
dev.igb.0.fc              disable flow control                 0
dev.igb.1.fc              disable flow control                 0
dev.igb.2.fc              disable flow control                 0
dev.igb.3.fc              disable flow control                 0
dev.igb.0.eee_disabled    disable energy efficient ethernet    1
dev.igb.1.eee_disabled    disable energy efficient ethernet    1
dev.igb.2.eee_disabled    disable energy efficient ethernet    1
dev.igb.3.eee_disabled    disable energy efficient ethernet    1
config.xml addition (I had to take the beginning < and ending > out to get it to display):
shellcmd>ifconfig igb0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso</shellcmd
shellcmd>ifconfig igb1 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso</shellcmd
shellcmd>ifconfig igb2 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso</shellcmd
shellcmd>ifconfig igb3 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso</shellcmd
shellcmd>ifconfig em0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso</shellcmd
igb0,1,2,3 all have an MTU of 1500, which I believe is the default. I haven't set any values for this myself.
-
@newuser2pfsense said in Suricata InLine with igb NICs:
boobletins...Presently I'm using Inline IPS Mode and I only have Suricata running on my WAN and that's igb3. I'm using igb0 and igb1 as well for my WLAN and LAN.
dev.igb.3.fc disable flow control 0
Previously you had dev.igb.3.fc=3. Does the "bad pkt" error persist with dev.igb.3.fc=0?
Just to confirm, could you double check and paste me the full output from
ifconfig igb3
Please paste any additional system tunables you've set via the UI and your full loader.conf.local (minus any sensitive data).
Please then manually double check and paste the output from these commands:
sysctl -a | grep nmbclusters
sysctl -a | grep msi
sysctl -a | grep num_queues
dmesg | grep igb3
The above is not busy work. I'm having you manually confirm because I found that a few settings didn't take effect when I set them in loader.conf.local; I needed to put some in the UI system tunables.
We have more settings to tinker with. I made a bunch of changes before the errors went away, but I'm trying to narrow down the issue before just throwing a bunch of new settings at you. I'm pretty confident we can get this working on your igb since it's working on mine with 0 errors for over a week now.
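If eyeballing the sysctl output gets tedious, the comparison can be sketched like this: paste the output in as text and list every tunable whose live value differs from what you intended. The function names are hypothetical, and the sample values are taken from earlier posts in this thread.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines as printed by sysctl into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def mismatches(expected, sysctl_text):
    """Return {name: (wanted, actual)} for settings that did not take effect.

    A tunable missing from the output is reported with actual=None.
    """
    actual = parse_sysctl(sysctl_text)
    return {
        name: (wanted, actual.get(name))
        for name, wanted in expected.items()
        if actual.get(name) != wanted
    }


if __name__ == "__main__":
    pasted = "dev.igb.3.fc: 3\nhw.igb.num_queues: 0\n"
    wanted = {"dev.igb.3.fc": "0", "hw.igb.num_queues": "0"}
    print(mismatches(wanted, pasted))
```

The sketch deliberately works on pasted text instead of calling sysctl itself, so it can run anywhere, not just on the firewall.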
-
boobletins...I keep getting the following error message from the page when posting the information you requested; frustrating to say the least:
Error
Post content was flagged as spam by Akismet.com
I'll do what I can to get the information in.
The errors do persist:
408.786592 [1071] netmap_grab_packets bad pkt at 186 len 2154
950.583865 [1071] netmap_grab_packets bad pkt at 433 len 2154
530.551894 [1071] netmap_grab_packets bad pkt at 810 len 2147
530.547133 [1071] netmap_grab_packets bad pkt at 807 len 2147
360.440859 [1071] netmap_grab_packets bad pkt at 728 len 2154
764.263927 [1071] netmap_grab_packets bad pkt at 311 len 2154
-
Ok -- I tried to thumbs-up some of your posts hoping that will help with Akismet.
I am interested in those results -- mostly because I think something is putting packets into your hardware buffers that are greater than 2048. They also seem to be consistently in the 2100 range. I can't explain what is doing that or why if your MTU is actually 1500. Maybe there's some kind of overhead with vlan tagging, qos, etc that I'm not aware of.
The why doesn't really matter if all you want is a fix. If you raise netmap's buf_size (and the packet sizes stay below the new maximum) then the errors should disappear.
Currently your dev.netmap.buf_size is set to 2048. If you, for example, double that to 4096, then all of the current errors would be covered by the new larger netmap buf_size (do this in the UI under system tunables).
Since I don't understand how you're getting packets that are > 2048 with an MTU of 1500, I can't promise it won't come back with even larger numbers, but that change would cover all of the errors you've pasted so far.
As I say above, you may get additional errors by changing dev.netmap.buf_size -- let me know if that's the case.
For the record: I have an MTU of 1500 and a dev.netmap.buf_size of 1920 is enough to prevent errors.
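To pick a new value from the logs instead of guessing, something like the sketch below pulls the lengths out of the "bad pkt" lines and rounds the largest one up. Rounding to the next power of two is my own assumption (it matches the 2048 to 4096 doubling suggested above), not a netmap requirement; the function names are hypothetical.

```python
import re

# Matches the kernel message format seen in this thread.
BAD_PKT = re.compile(r"netmap_grab_packets bad pkt at (\d+) len (\d+)")


def bad_pkt_lengths(log_text):
    """Return the 'len' value from every bad pkt line in the log text."""
    return [int(match.group(2)) for match in BAD_PKT.finditer(log_text)]


def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    size = 1
    while size < n:
        size *= 2
    return size


def suggested_buf_size(log_text, current):
    """Suggest a buf_size covering the largest logged packet, never below current."""
    lengths = bad_pkt_lengths(log_text)
    if not lengths:
        return current
    return max(current, next_power_of_two(max(lengths)))


if __name__ == "__main__":
    log = (
        "408.786592 [1071] netmap_grab_packets bad pkt at 186 len 2154\n"
        "530.551894 [1071] netmap_grab_packets bad pkt at 810 len 2147\n"
    )
    print(suggested_buf_size(log, 2048))
```

As noted above, this only covers the sizes seen so far; a later, larger packet would still be dropped.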
-
Maybe don't change this unless you run into other issues, but the remarks at the link below suggest that hyperthreading (which you have enabled) may limit your throughput.
https://calomel.org/freebsd_network_tuning.html
# Disable Hyper Threading (HT), also known as Intel's proprietary simultaneous
# multithreading (SMT) because implementations typically share TLBs and L1
# caches between threads which is a security concern. SMT is likely to slow
# down workloads not specifically optimized for SMT if you have a CPU with more
# than two(2) real CPU cores. Secondly, multi-queue network cards are as much
# as 20% slower when network queues are bound to both real CPU cores and SMT
# virtual cores due to interrupt processing collisions.
#
machdep.hyperthreading_allowed="0"  # (default 1, allow Hyper Threading (HT))
That last sentence seems to apply in your situation. They note they've used the config with an i350. I don't see a lot of netmap-specific configuration in there, so ymmv.
This is unrelated to the "bad pkt" error.
-
@boobletins said in Suricata InLine with igb NICs:
Ok -- I tried to thumbs-up some of your posts hoping that will help with Akismet.
It should. Users with a reputation of 5 or more should never see Akismet.
I voted a few posts too, so that is now the case.
Steve
-
boobletins/stephenw10...I tried posting again but unfortunately I keep getting:
Post content was flagged as spam by Akismet.com
I apologize, I keep trying to post.
I'll try posting a little at a time again if it will let me.
-
ifconfig igb3 [I redacted out IP/MAC addresses]:
igb3: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 1500
options=1000b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,NETMAP>
ether
hwaddr
inet6 %igb3 prefixlen 64 scopeid 0x4
inet netmask broadcast
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
-
The only other System Tunable I changed was from here:
https://www.netgate.com/docs/pfsense/hardware/tuning-and-troubleshooting-network-cards.html?highlight=tuning
net.inet.ip.intr_queue_maxlen , Maximum size of the IP input queue, 3000
I believe it was originally set to 1000. I just never changed it back. I can change it back if need be.
Although I've kept some of the tunables in my loader.conf.local file for testing, I've commented them out with #, so nothing there should be loading:
#hw.igb.rxd="1024"
#hw.igb.txd="1024"
#hw.igb.enable_aim=1
#hw.igb.num_queues=0
#kern.ipc.nmbclusters="1000000"
#hw.pci.enable_msi=0
#hw.igb.max_interrupt_rate="32000"
#hw.igb.fc_setting=0
#hw.igb.txd=4096
#hw.igb.rxd=4096
-
sysctl -a | grep nmbclusters -
kern.ipc.nmbclusters: 4076726
sysctl -a | grep msi -
hw.ixl.enable_msix: 1
hw.sdhci.enable_msi: 1
hw.puc.msi_disable: 0
hw.pci.honor_msi_blacklist: 1
hw.pci.msix_rewrite_table: 0
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1
hw.mfi.msi: 1
hw.malo.pci.msi_disable: 0
hw.ix.enable_msix: 1
hw.igb.enable_msix: 1
hw.em.enable_msix: 1
hw.cxgb.msi_allowed: 2
hw.bce.msi_enable: 1
hw.aac.enable_msi: 1
machdep.disable_msix_migration: 0
sysctl -a | grep num_queues -
hw.ix.num_queues: 0
hw.igb.num_queues: 0
-
dmesg | grep igb3 [I redacted out IP/MAC addresses] -
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/4096, RX 8/4096
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
arpresolve: can't allocate llinfo for on igb3
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xdf000000-0xdf0fffff,0xdf600000-0xdf603fff irq 19 at device 0.3 on pci2
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address:
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: link state changed to DOWN
igb3: link state changed to UP
igb3: permanently promiscuous mode enabled
igb3: link state changed to DOWN
igb3: link state changed to UP
arpresolve: can't allocate llinfo for on igb3
I added a dev.netmap.buf_size to System Tunables and set the value to 4096. I restarted pfSense and then really throttled as much traffic going through it as I could. I didn't get any netmap_grab_packets errors. I'm now wondering if there is a maximum netmap buffer size.
I look forward to doing whatever change/testing we can to find a solution. Thanks for the continued help!
-
@newuser2pfsense said in Suricata InLine with igb NICs:
I'm now wondering if there is a maximum netmap buffer size.
With 64 GB of RAM, you should be able to take that tunable very high, but there isn't a need unless the bad pkt error returns with larger len values. I'd wait and see; otherwise you are locking up memory for no reason.
I can't explain how you're getting packets of size ~2100 with an MTU of 1500. Maybe JUMBO_MTU allows for that (I don't know). It could also be that something else on your network has a larger MTU setting. I'm not sure how FreeBSD handles those situations. If you're interested you can check any switches and clients to see and adjust accordingly.
If the system is handling as much throughput as you can throw at it, then I'd leave everything alone for now.
If you run into throughput or interrupt issues, then consider disabling hyperthreading. The dmesg output indicates that you're binding queues to virtual and hardware cores which may become an issue depending on how hard you're saturating the interfaces.
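A quick way to see this from the dmesg paste is to count how many distinct CPUs the queues are bound to and compare that with the number of physical cores. This sketch is only a count comparison: the dmesg lines don't say which logical CPU ids are hyperthread siblings, so it can't tell you which queues share a core. The function names are hypothetical.

```python
import re

# Matches lines such as "igb3: Bound queue 0 to cpu 0".
BOUND = re.compile(r"(\w+): Bound queue (\d+) to cpu (\d+)")


def queue_bindings(dmesg_text, ifname):
    """Return {queue: cpu} for one interface; later boots overwrite earlier ones."""
    bindings = {}
    for match in BOUND.finditer(dmesg_text):
        if match.group(1) == ifname:
            bindings[int(match.group(2))] = int(match.group(3))
    return bindings


def queues_exceed_physical_cores(bindings, physical_cores):
    """True when queues are spread over more logical CPUs than there are real cores."""
    return len(set(bindings.values())) > physical_cores


if __name__ == "__main__":
    dmesg = " ".join(f"igb3: Bound queue {n} to cpu {n}" for n in range(8))
    bindings = queue_bindings(dmesg, "igb3")
    # The i7-6700K in this thread reports 4 cores x 2 hardware threads.
    print(bindings, queues_exceed_physical_cores(bindings, 4))
```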
-
Once the applicable system tunables are nailed down and some "good" typical values are established, this thread should be made a "sticky post" or else a new single "sticky post" created containing the relevant settings. The netmap bad packets error has plagued a lot of Suricata Inline IPS Mode users.
-
boobletins...I'll let it run for a while with all of the tweaks we've made and check it periodically for any netmap_grab_packets errors.
bmeeks...I agree.
-
I let my system run for just over a week and I noticed this evening that I couldn't access the interwebs for some reason. I restarted my pfSense computer and everything seemed to go back to normal. I then noticed a few minutes ago the following on the console:
kernel 492.136807 [1071] netmap_grab_packets bad pkt at 878 len 4939
kernel 490.136919 [1071] netmap_grab_packets bad pkt at 667 len 4939
kernel 489.136703 [1071] netmap_grab_packets bad pkt at 933 len 4939
kernel 488.636876 [1071] netmap_grab_packets bad pkt at 875 len 4939
kernel 488.435620 [1071] netmap_grab_packets bad pkt at 806 len 4939
kernel 488.235492 [1071] netmap_grab_packets bad pkt at 766 len 4939
Interesting. I guess I'm going to have to bump up my dev.netmap.buf_size from 4096 to a larger value. I have 64 Gig of RAM in my pfSense computer so maybe I'll bump it up to 8192 and see how that works. Has anyone had a related experience after tuning their system?
Update - Since changing the buffer size to 8192, I've noticed webpages load a tad slower.
-
I still periodically see packets larger than my mtu and netmap.buf_size. I haven't been able to track down the source. After tuning it's down to something like once per week - often without any interface hiccup.
I opened a support question here: https://redmine.openinfosecfoundation.org/issues/2720 -- but so far there's no information. I don't think it's a Suricata issue --
I'm no expert, but I don't see anything in the Suricata netmap code that would be adding length to packets.
It's possible that this type of noise is always there but the netmap configuration is more sensitive to violations of mtu/buf_size.
Really the error message just indicates that a packet was dropped because it exceeded the available buffer length. I believe the interface flap after that is due to the watchdog cycling the interface because it sees high packet loss (or latency). Packets are presumably dropped all the time by the OS and we're only aware of them because we're looking for netmap errors now.
For the record: my logs show the last errors on 12/6 with the same packet size you have above:
kernel: 338.512666 [1071] netmap_grab_packets bad pkt at 1054 len 4939
kernel: 338.714285 [1071] netmap_grab_packets bad pkt at 1073 len 4939
kernel: 338.914864 [1071] netmap_grab_packets bad pkt at 1089 len 4939
kernel: 339.423360 [1071] netmap_grab_packets bad pkt at 1203 len 4939
kernel: 340.414473 [1071] netmap_grab_packets bad pkt at 1484 len 4939
kernel: 342.414619 [1071] netmap_grab_packets bad pkt at 1542 len 4939
kernel: 346.414451 [1071] netmap_grab_packets bad pkt at 2009 len 4939
The same size strikes me as a little odd -- what's putting packets of that exact size on the wire? They happen so rarely now that I don't want to run a pcap for weeks to catch them. I don't see any particularly odd traffic at the time in my logs (though of course the bad packets are dropped, so if they're all bad nothing would show up).
I'd be curious to know the output of "sysctl -a | grep missed_packets". More precisely, I'd be curious to know whether, if you note those numbers now and compare them after "bad pkt" errors, the NIC counters are being incremented by netmap or whether we lose that reporting. If the counter is still accurately incremented on a packet miss, then we should be able to compare inline to legacy mode to see if there's any significant increase in packet loss with netmap mode. I suspect there isn't; it's just louder about its misses.
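The before/after comparison could be sketched as below: save the grep output once now, once after the next "bad pkt" burst, and print only the counters that moved. The counter name and the numbers in the example are made up for illustration; use whatever names your own sysctl output shows. The function names are hypothetical.

```python
def parse_counters(text):
    """Parse 'name: integer' lines from sysctl output, skipping non-numeric values."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def counter_deltas(before_text, after_text):
    """Return {name: increase} for counters present in both snapshots that changed."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


if __name__ == "__main__":
    # Illustrative values only; the counter name is an assumption.
    before = "dev.igb.3.mac_stats.missed_packets: 10\ndev.igb.0.mac_stats.missed_packets: 0\n"
    after = "dev.igb.3.mac_stats.missed_packets: 17\ndev.igb.0.mac_stats.missed_packets: 0\n"
    print(counter_deltas(before, after))
```

If the delta stays at zero across a burst of "bad pkt" errors, that would suggest the NIC counters don't see the drops netmap reports.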