Inline Suricata NIC selection


  • Banned

    Hi,
    I am in the process of putting together some more PFsense boxes and was wondering if anyone has any experience using different NIC's for Suricata Inline use.
    I have been using the i340-T4 but that NIC eventually starts showing lots of random netmap errors in the console. I was considering the i350-T4V2, but that uses the same igb driver and has the same specs as the i340 except for being PCIe2.0 vs PCIe2.1 on the i350. I suspect the same results with both cards since they use the same driver. But maybe the i350 is faster and can handle netmap better?

    Does increased memory or CPU power help the netmap function or is it, as I assume, all done by the NIC's processing power?

    Other than turning off all the offloading functions, are there any other tweaks that would improve the netmap/inline functionality?

    Appreciate any input on this. Thanks


  • Banned

    Anyone?


  • Banned

    Did some research and found Intel specs that compare the i340 (82580) to the i350.
    The i340 uses the Intel 82580 controller; the i350 uses the Intel i350. Both run at a processor speed of 25 MHz.

    Here are the only differences I could find that I think would impact netmap. Please feel free to add any further info here.
    i350 uses ECC self-correcting memory; the i340 does not.

    Here are the other advantages:
    i350 uses less power
    i350 has thermal sensors; pfSense doesn't monitor these yet, but it could someday
    i350 PCIe is 2.1, whereas the i340's is 2.0
    i350 has better virtualization features; not sure if netmap uses any of them
    i350 has better management features

    Not a whole lot of difference, but the i350 is a newer design and may show improvements over the i340 in a live environment.
    I ordered an Intel i350-T4V2 and will put it through some rigorous testing and post the results here.



  • It's not so much the specifics of the hardware as it is the driver created/used for that hardware.  Netmap compatibility must exist at the software layer where the NIC driver meets the operating system (specifically the kernel networking stack).  Netmap is still an emerging technology.  There have been (and probably still are) some issues/bugs in both the FreeBSD implementation of Netmap and in Suricata's use of Netmap.  pfSense itself is out of the picture as it is simply using Netmap as provided by FreeBSD upstream.  Similarly, the Suricata package on pfSense is beholden to Suricata upstream for all the Netmap code.

    I don't have a list of known-to-be compatible drivers.  I think lots of folks have had success with the basic old em series of Intel drivers (em0, em1, etc.).  However many of the newer Intel NICs now use some different FreeBSD drivers.  Some of those drivers are not as Netmap friendly as others.
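    To see which driver is actually bound to a NIC, and whether the kernel exposes netmap at all, a couple of quick checks can be run from a pfSense/FreeBSD shell (illustrative only; output will vary by hardware):

    ```sh
    # List PCI network devices along with the driver attached to each (igb0, em0, etc.)
    pciconf -lv | grep -B4 'network'

    # Netmap support is present if the device node and sysctl branch exist
    ls /dev/netmap
    sysctl dev.netmap
    ```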

    Bill


  • Banned

    I have used the igb driver with the i340-T4 and I get a lot of netmap errors in the console.
    I do have an em0 driver on one of the unused interfaces and will try that this evening with inline.

    I also read that using emulation mode may work better in some cases. I'd like to hear your opinion on that, Bill.

    So far what I have experienced using inline is the netmap errors, plus CPU usage that keeps climbing until the system eventually locks up, anywhere from hours to days later. This is why I moved off inline for now. But I realize the potential of inline and want to resolve any issues.


  • Banned

    Switched the WAN to em0 with inline mode and I still get netmap bad pkt errors in the console, but the real concern is that CPU usage jumped to 50-60% from the 3-5% I saw using Legacy mode, and the entire network slowed down. I am only dropping 13 Emerging Threats categories in dropsid.conf and maybe 200 rules in disablesid.conf. Suricata is only used on the WAN interface.

    pfSense 2.4.1
    Suricata 4.0.0.2
    Cron and pfBlockerNG are the only other installed packages

    Is this normal?



  • Banned

    So it appears that using inline mode on the WAN is not a good idea, because all the hits the WAN takes can kill the network and peg the CPU.
    I am not concerned about anything inside the network, just the WAN, so I don't need to put any other interfaces on Suricata.

    So, does the number of rulesets correspond to the amount of CPU work Suricata inline has to do? Or is it just the rules configured in dropsid.conf?

    What are the factors involved in Inline creating more CPU usage?

    Would an 8 core or faster speed processor be more effective with inline mode on the WAN interface or would that also get bogged down?

    Is anyone else seeing this?

    Here are the monitors using Legacy mode




  • The total count of rules impacts the CPU usage.  All active rules are evaluated regardless of whether they are in dropsid.conf or not.  If the rule is not disabled or commented out (same as default disabled), then CPU cycles are required to evaluate it.  All putting a rule in dropsid.conf does is convert the action from ALERT to DROP.  There is no CPU penalty for DROP as opposed to ALERT.  In fact, you could say dropping traffic is less CPU intensive, because what really goes on with Netmap is that the packet is inspected by Suricata and, if OK, then copied to the kernel network stack.  If the packet generates a drop, it is not copied to the kernel network stack.  So you can argue less CPU work is needed for a drop because the "copy packet to kernel network stack" step is skipped.  However, in real life the difference is just a handful of machine cycles.
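    For reference, a dropsid.conf entry is just a GID:SID pair, one per line; the SIDs below are made-up placeholders, not recommendations:

    ```
    # dropsid.conf - rules listed here have their action changed from ALERT to DROP
    # Format: gid:sid (the SIDs below are placeholders)
    1:2100498
    1:2019401
    ```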

    Rules eat up memory and CPU in either Suricata or Snort.  Think about how each and every network packet has to be analyzed and compared to every enabled rule.  When you have hundreds and hundreds of enabled rules and many, many packets per second of network traffic, your CPU will be quite busy.  This is compounded by putting the same rules on multiple interfaces (say, using the same rules on WAN and LAN).  Some folks have admitted to doing that, but I don't know why.  More CPU cores are better, but note that with Suricata more cores also means more memory usage.  You would have to jack up the TCP stream memory cap settings with an 8-core CPU.

    Using the emulated driver mode for Inline IPS is going to be more CPU intensive.  It can be compared to running VMware Workstation on a Windows machine and then running a Windows VM inside VMware Workstation.  You create extra work for the CPU to virtualize the network driver for Suricata and pfSense.
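    For context, the netmap section that Suricata reads from suricata.yaml looks roughly like the sketch below; the pfSense package generates this for you, and igb0 is an assumed interface name.  In netmap's naming, the `^` suffix refers to the host network stack, which is how the inline copy back to the kernel happens:

    ```yaml
    netmap:
      # inline pair: inspect traffic from the NIC and copy it to the host stack
      - interface: igb0
        copy-mode: ips
        copy-iface: igb0^
      # and the reverse direction, host stack back out the NIC
      - interface: igb0^
        copy-mode: ips
        copy-iface: igb0
    ```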

    Running Legacy Mode blocking with Suricata is perfectly fine and secure enough for most cases.  Netmap will eventually mature and just work with everything, but it is going to take some time.

    Bill


  • Banned

    Once again, a nice, easy, to-the-point explanation of a very complicated feature. Thanks

    I now realize that inline has its issues and should not be used in production. I will go back to it from time to time to see if any improvements have been made.
    I also know that the people here are not the ones who will fix these issues, but I do want to report any findings I have for others to see.

    I have tried most of the supported hardware, and they all show netmap bad pkt errors at a rate of 10-20 per hour.
    I tried the Intel i217, i210, i340 (82580) and 82579 with the same results. I should have an i350 soon and will try that as well, but I expect little change since the i350 is very similar to the i340.


  • Banned

    I received the i350-T4V2 and it experiences the same netmap errors. I let it run for 12 hours and received about 50 bad pkt errors. One interesting thing is that no bad pkt errors came up until an alert/block was triggered, which leads me to believe that the blocks may be causing the bad pkts. I guess the netmap and NIC driver developers have more work to do.

    But if the blocks are truly causing the bad pkts, then the bad pkts are not really an issue. I couldn't care less if an IP that triggered a block had a bad packet; I didn't want that traffic anyway. There is no timestamp on the bad packets on the console, so I have no way to tell which IP had the bad packet or when it was triggered. There were many more blocks than there were bad packets. But maybe the termination of the packet caused by the block also generated the bad pkt message.

    So if anyone knows how to track down the IPs that trigger a bad pkt, let me know and I will research it further.



  • @dcol:

    I received the i350-T4V2 and it experiences the same netmap errors. I let it run for 12 hours and received about 50 bad pkt errors. One interesting thing is that no bad pkt errors came up until an alert/block was triggered, which leads me to believe that the blocks may be causing the bad pkts. I guess the netmap and NIC driver developers have more work to do.

    But if the blocks are truly causing the bad pkts, then the bad pkts are not really an issue. I couldn't care less if an IP that triggered a block had a bad packet; I didn't want that traffic anyway. There is no timestamp on the bad packets on the console, so I have no way to tell which IP had the bad packet or when it was triggered. There were many more blocks than there were bad packets. But maybe the termination of the packet caused by the block also generated the bad pkt message.

    So if anyone knows how to track down the IPs that trigger a bad pkt, let me know and I will research it further.

    If the bad pkt messages only appear with blocked traffic, and the blocks are otherwise legit, then I would not worry about the message.  Probably harmless log spam.

    Did you remember to disable all the hardware checksumming on the Advanced Network tab in pfSense?  You must turn off hardware-based checksums, TCP segmentation offloading and LRO (Large Receive Offloading) when using Inline IPS Mode.  Hardware checksumming is on by default.  If you make changes to these parameters, you need to reboot the firewall for them to take effect.
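    For reference, those GUI checkboxes map to per-interface flags that can also be toggled from a shell; igb0 is an assumed interface name, and the GUI (System > Advanced > Networking) remains the supported way to make the change persistent:

    ```sh
    # Disable hardware checksum offload, TCP segmentation offload and large receive offload
    ifconfig igb0 -rxcsum -txcsum -tso -lro

    # Verify: RXCSUM/TXCSUM/TSO/LRO should no longer appear in the options= line
    ifconfig igb0 | grep options
    ```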

    Bill


  • Banned

    Offloads are all disabled. I have done this since starting Inline. Just looked, and yes, all are set to disabled (checked).

    I am really not sure whether the bad packets displayed on the console come from drops or alerts, because there is no traceable info in the message.
    That is what I was asking. Is there a way to trace them?

    If I know for sure these bad packets are a result of a drop or an alert, I will ignore them. I just don't want good packets being dropped.



  • Dropped and alerted packets (their IP addresses) will show up on the ALERTS tab.  Dropped packets will be shown in red while alerts only will be black (or the default color).

    Bill


  • Banned

    I am familiar with the alert and dropped entries on the tabs; I just don't know if the bad pkts on the console correspond to the entries in the alerts. There is not enough info in the console message to determine whether they were generated by the same packet that generated the alert.

    Is there a log I can turn on, or use, that will give me more detailed info on the bad pkts that are on the console?



  • @dcol:

    I am familiar with the alert and dropped entries on the tabs; I just don't know if the bad pkts on the console correspond to the entries in the alerts. There is not enough info in the console message to determine whether they were generated by the same packet that generated the alert.

    Is there a log I can turn on, or use, that will give me more detailed info on the bad pkts that are on the console?

    No.  If the console messages don't contain any IP or MAC address hints, then there is nothing else to enable.
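    One partial workaround: the same kernel messages that hit the console usually also land in the system log with timestamps, so you can at least see when each bad pkt message fired (paths assumed from a stock pfSense install):

    ```sh
    # Console kernel messages are also recorded, timestamped, in the system log
    grep -i netmap /var/log/system.log

    # On pfSense versions that use binary circular logs, read the log with clog first
    clog /var/log/system.log | grep -i netmap
    ```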

    Bill


  • Banned

    Then I just have to wait until the powers that be fix it.

    I do have one more question for the community.

    Does anyone out there not see the bad pkts in the console?
    If so, what NIC and what interface are you using? I am using inline on the WAN interface. Maybe the WAN is just too active to handle the packets with netmap. I want to make sure that it is not just me.