Some news about upcoming Suricata updates
-
@bmeeks said in Some news about upcoming Suricata updates:
A particularly bothersome bug was fixed in Suricata 6.x with the "workers" runmode. In version 5.x, running "workers" mode will cause a ton of log spam with "packet seen on wrong thread" messages. So I would not enable that mode in the current Suricata package which is based on the 5.0.6 binary. Wait for this coming update to 6.0.3 before experimenting with runmode "workers".
Actually, in Workers mode I got more throughput compared with Autofp. I ran Suricata like this for a year or so. I will test both runmodes in 6.0.3 and compare the results.
Also, it's a good idea to make what you summarized here about the Suricata workflow a sticky.
Much obliged
-
@nrgia said in Some news about upcoming Suricata updates:
@bmeeks said in Some news about upcoming Suricata updates:
A particularly bothersome bug was fixed in Suricata 6.x with the "workers" runmode. In version 5.x, running "workers" mode will cause a ton of log spam with "packet seen on wrong thread" messages. So I would not enable that mode in the current Suricata package which is based on the 5.0.6 binary. Wait for this coming update to 6.0.3 before experimenting with runmode "workers".
Actually, in Workers mode I got more throughput compared with Autofp. I ran Suricata like this for a year or so. I will test both runmodes in 6.0.3 and compare the results.
Also, it's a good idea to make what you summarized here about the Suricata workflow a sticky.
Much obliged
"workers" mode is suggested for the highest performance, but it always gave me tons of "packet seen on wrong thread" errors in my testing on virtual machines. This was also reported by others on the web. The "why" had a lot of complicated explanations, and there is a very long thread on the Suricata Redmine site about that particular logged message. It also showed up in stats when they were enabled. This error message disappeared for me with the 6.x binary branch. During testing of the recent bug fix and netmap change, I could readily reproduce the "packet seen on wrong message" by simply swapping out the binary from 6.0.3 to 5.0.6 without changing another thing. That convinced me it was something in the older 5.x code.
"autofp" mode, which means auto flow-pinned, is the default mode. It offers reasonably good performance on most setups. But if you have a high core-count CPU, then "workers" will likely outperform "autofp". You can choose either mode in the pfSense Suricata package and experiment. Which mode works best is very much hardware and system setup dependent. So it's not a one-size-fits-all scenario. Experimentation is required to see which works best in a given setup.
For very advanced users who want to tinker, there are several CPU Affinity settings and other tuning parameters exposed in the suricata.yaml file. If tinkering there, I would do so only in a lab environment initially. And remember that on pfSense, the suricata.yaml file for an interface is re-created from scratch each time you save a change in the GUI or start/restart a Suricata instance from the GUI. So to make permanent edits, you actually would need to change the settings in the /usr/local/pkg/suricata/suricata_yaml_template.inc file. The actual suricata.yaml file for each interface is built from the template information in that file.
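For anyone who wants to see what those knobs look like, here is a minimal sketch of the relevant section of a stock suricata.yaml (the values are illustrative only, not recommended settings, and on pfSense they would have to be changed in the template file mentioned above to survive a GUI save or service restart):

    # runmode can be workers, autofp, or single
    runmode: workers

    threading:
      set-cpu-affinity: yes
      cpu-affinity:
        - management-cpu-set:
            cpu: [ 0 ]         # pin management threads to core 0
        - worker-cpu-set:
            cpu: [ "1-3" ]     # example only: reserve cores 1-3 for worker threads
            mode: "exclusive"
      detect-thread-ratio: 1.0   # detect threads created per available core

-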
For those of us currently running Snort on capable multi-core hardware, would these enhancements be a good enough reason to start thinking about switching over to Suricata? It sounds like Suricata now has the potential to significantly outperform Snort in inline IPS mode. Thanks in advance for your thoughts on this.
-
@tman222 said in Some news about upcoming Suricata updates:
For those of us currently running Snort on capable multi-core hardware, would these enhancements be a good enough reason to start thinking about switching over to Suricata? It sounds like Suricata now has the potential to significantly outperform Snort in inline IPS mode. Thanks in advance for your thoughts on this.
My opinion is that it's highly dependent on the traffic load. Probably up to about 1 Gigabit/sec or so, there would not be much difference. As you approach 10 Gigabits/sec, then I would say most certainly so. Of course the number and type of rule signatures enabled plays a huge role in performance. So there is also that variable to consider.
Snort today, when using Inline IPS Mode, already has the same netmap patch that I put into Suricata. In fact, I originally wrote that new code for the Snort DAQ. But Snort itself is only a single-threaded application, so the impact of multi-queue support in the Snort DAQ was minimal in terms of performance.
As many have pointed out, Snort3 is multithreaded, so Snort3 could benefit from the multiple host ring support. Unfortunately, Snort3 uses a new data acquisition library called libdaq. Although I contributed it to them quite some time ago (actually about two years or more), and they said they would include it, the upstream Snort team has not yet merged the multi-queue and host stack endpoint netmap patch I submitted into the new libdaq library. As a result, libdaq does not currently support host stack endpoints for netmap. So you can't use it for IPS mode on pfSense currently unless you configure it to use two physical NIC ports for the IN and OUT pathways and bridge between them. It can't work the way Snort2 works, using one NIC port and the kernel host stack as the two netmap endpoints.
-
I did some initial speed tests as follows:
Tested on a 1 Gbps down and 500 Mbps up line
pfSense Test Rig
https://www.supermicro.com/en/products/system/Mini-ITX/SYS-E300-9A-4C.cfm
Service used: speedtest.net
NIC Chipset - Intel X553 1Gbps
Dmesg info:
ix0: netmap queues/slots: TX 4/2048, RX 4/2048
ix0: eTrack 0x80000567
ix0: Ethernet address: ac:1f:*:*:*:*
ix0: allocated for 4 rx queues
ix0: allocated for 4 queues
ix0: Using MSI-X interrupts with 5 vectors
ix0: Using 4 RX queues 4 TX queues
ix0: Using 2048 TX descriptors and 2048 RX descriptors
ix0: <Intel(R) X553 (1GbE)>
Results:
Suricata 5.0.6
3 threads - workers mode: Dwn 374.79 - Up 439.38

Suricata 6.0.3
auto threads - workers mode: Dwn 410.19 - Up 380.12
3 threads - workers mode: Dwn 415.47 - Up 436.63
2 threads - workers mode: Dwn 419.27 - Up 458.21
auto threads - AutoFp mode: Dwn 376.13 - Up 358.58
3 threads - AutoFp mode: Dwn 416.20 - Up 456.34
2 threads - AutoFp mode: Dwn 418.61 - Up 446.36

Please note that if IPS mode (netmap) is disabled, this configuration can obtain the full line speed.
-
@bmeeks Also, in the system logs I see an error about a "reject_sid" list, but we don't even have a sample list there, and I did not use one before.
The log lines are:
63213 [Suricata] Enabling any flowbit-required rules for: LAN...
63213 [Suricata] ERROR: unable to find reject_sid list "none" specified for LAN
63213 [Suricata] Updating rules configuration for: LAN ...
63213 [Suricata] Building new sid-msg.map file for WAN...
63213 [Suricata] Enabling any flowbit-required rules for: WAN...
63213 [Suricata] ERROR: unable to find reject_sid list "none" specified for WAN
63213 [Suricata] Updating rules configuration for: WAN ...
-
@nrgia said in Some news about upcoming Suricata updates:
@bmeeks Also, in the system logs I see an error about a "reject_sid" list, but we don't even have a sample list there, and I did not use one before.
The log lines are:
63213 [Suricata] Enabling any flowbit-required rules for: LAN...
63213 [Suricata] ERROR: unable to find reject_sid list "none" specified for LAN
63213 [Suricata] Updating rules configuration for: LAN ...
63213 [Suricata] Building new sid-msg.map file for WAN...
63213 [Suricata] Enabling any flowbit-required rules for: WAN...
63213 [Suricata] ERROR: unable to find reject_sid list "none" specified for WAN
63213 [Suricata] Updating rules configuration for: WAN ...
I'll reply to this post first.
Most likely there once was a list value that got saved, and then maybe the list was removed. I didn't see that error during testing for this release, and nothing was changed in that part of the code anyway.
To see what might be up, examine your config.xml file in a text editor and look carefully through the <suricata> element tags. The tag names are well labeled and you can follow which tags contain certain parameters. The SID conf files are contained in a list array with the names clearly denoted. Then for each Suricata interface (your WAN, for example), there is an XML tag describing the <reject_sid_conf> file to use for that interface. See if there is a value in that tag for your WAN. It should be empty.
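Purely as an illustration of what to look for (a hypothetical fragment; the exact tag name in a given config.xml may differ slightly), the per-interface entry is of this general shape:

    <!-- hypothetical fragment from one Suricata interface entry in config.xml -->
    <reject_sid_file></reject_sid_file>   <!-- empty: no reject SID list assigned -->
    <!-- a leftover literal value such as "none" in that tag makes the package
         look for a SID conf file by that name and log the error above -->

-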
@nrgia said in Some news about upcoming Suricata updates:
I did some initial speed tests as follows:
Tested on a 1 Gbps down and 500 Mbps up line
pfSense Test Rig
https://www.supermicro.com/en/products/system/Mini-ITX/SYS-E300-9A-4C.cfm
Service used: speedtest.net
NIC Chipset - Intel X553 1Gbps
Dmesg info:
ix0: netmap queues/slots: TX 4/2048, RX 4/2048
ix0: eTrack 0x80000567
ix0: Ethernet address: ac:1f:*:*:*:*
ix0: allocated for 4 rx queues
ix0: allocated for 4 queues
ix0: Using MSI-X interrupts with 5 vectors
ix0: Using 4 RX queues 4 TX queues
ix0: Using 2048 TX descriptors and 2048 RX descriptors
ix0: <Intel(R) X553 (1GbE)>
Results:
Suricata 5.0.6
3 threads - workers mode: Dwn 374.79 - Up 439.38

Suricata 6.0.3
auto threads - workers mode: Dwn 410.19 - Up 380.12
3 threads - workers mode: Dwn 415.47 - Up 436.63
2 threads - workers mode: Dwn 419.27 - Up 458.21
auto threads - AutoFp mode: Dwn 376.13 - Up 358.58
3 threads - AutoFp mode: Dwn 416.20 - Up 456.34
2 threads - AutoFp mode: Dwn 418.61 - Up 446.36

Please note that if IPS mode (netmap) is disabled, this configuration can obtain the full line speed.
Are you testing "through" pfSense or "from" pfSense? That can make a big difference. The most valid test is through pfSense. Meaning from a host on your LAN through the firewall out to a WAN testing site.
While running a speed test through pfSense, run top and see how many CPU cores are running Suricata. I would expect threads to be distributed among the cores, especially in "workers" runmode. Also note that each time you change the runmode setting, you need to stop and restart Suricata.

And finally, remember that a speed test usually represents a single flow, so that will factor into how the load is distributed. A given flow will likely stay pinned to a single thread and core. On the other hand, multiple flows (representing different hosts doing different things) will balance across CPU cores better. This is due to how Suricata assigns threads and flows using the flow hash (calculated from the source and destination IPs and ports). So a simple speed test from one host to another is not going to be able to fully showcase the netmap changes. On the other hand, multiple speed tests from different hosts, all running at the same time, would represent multiple flows and should balance better across the CPU cores. That would better illustrate how the multiple host stack rings are contributing.
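For example (a hedged sketch; the exact thread names depend on the interface and runmode), from a pfSense shell you can watch the distribution with FreeBSD's top while a test runs:

    # -P shows each CPU separately, -H lists individual threads, -S includes system threads
    top -P -H -S
    # watch the Suricata worker threads (named like W#01, W#02, ...) and note
    # which cores they accumulate CPU time on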
-
@bmeeks said in Some news about upcoming Suricata updates:
@nrgia said in Some news about upcoming Suricata updates:
@bmeeks Also, in the system logs I see an error about a "reject_sid" list, but we don't even have a sample list there, and I did not use one before.
The log lines are:
63213 [Suricata] Enabling any flowbit-required rules for: LAN...
63213 [Suricata] ERROR: unable to find reject_sid list "none" specified for LAN
63213 [Suricata] Updating rules configuration for: LAN ...
63213 [Suricata] Building new sid-msg.map file for WAN...
63213 [Suricata] Enabling any flowbit-required rules for: WAN...
63213 [Suricata] ERROR: unable to find reject_sid list "none" specified for WAN
63213 [Suricata] Updating rules configuration for: WAN ...
I'll reply to this post first.
Most likely there once was a list value that got saved, and then maybe the list was removed. I didn't see that error during testing for this release, and nothing was changed in that part of the code anyway.
To see what might be up, examine your config.xml file in a text editor and look carefully through the <suricata> element tags. The tag names are well labeled and you can follow which tags contain certain parameters. The SID conf files are contained in a list array with the names clearly denoted. Then for each Suricata interface (your WAN, for example), there is an XML tag describing the <reject_sid_conf> file to use for that interface. See if there is a value in that tag for your WAN. It should be empty.

Found this line
<reject_sid_file>none</reject_sid_file>
in config.xml
But it's odd because I never even had a sample in the SID Management tab.
I'll delete it then...
-
@nrgia said in Some news about upcoming Suricata updates:
Found this line
<reject_sid_file>none</reject_sid_file>
in config.xml
But it's odd because I never even had a sample in the SID Management tab.
I'll delete it then...

That should get rid of the error. That text got saved in there somehow, so it's looking for a conf file named "none".
-
@bmeeks said in Some news about upcoming Suricata updates:
Are you testing "through" pfSense or "from" pfSense? That can make a big difference. The most valid test is through pfSense. Meaning from a host on your LAN through the firewall out to a WAN testing site.
LAN Host -> pfSense -> speedtest.net
If you know a location to test with multiple connections, I can try. Also, I tried p2p connections like torrents; it reaches 786 Mbps at best.

While running a speed test through pfSense, run top and see how many CPU cores are running Suricata. I would expect threads to be distributed among the cores, especially in "workers" runmode. Also note that each time you change the runmode setting, you need to stop and restart Suricata.

Suricata was stopped and restarted each time I changed the settings. Also, I gave each instance of Suricata 1 minute to settle down.
2 with 2, 3 with 1, 1 with 1 cores; it fluctuates during the speed tests. Also, Suricata is enabled on 2 interfaces, and there are only 4 cores.

And finally, remember that a speed test usually represents a single flow, so that will factor into how the load is distributed. A given flow will likely stay pinned to a single thread and core. On the other hand, multiple flows (representing different hosts doing different things) will balance across CPU cores better. This is due to how Suricata assigns threads and flows using the flow hash (calculated from the source and destination IPs and ports). So a simple speed test from one host to another is not going to be able to fully showcase the netmap changes. On the other hand, multiple speed tests from different hosts, all running at the same time, would represent multiple flows and should balance better across the CPU cores. That would better illustrate how the multiple host stack rings are contributing.