6100 SLOW in comparison to Protectli FW6E
-
If you are using Suricata with Inline IPS Mode, you may notice a throughput improvement by switching the Suricata runmode from "autofp" to "workers" on the INTERFACE SETTINGS tab.
The "workers" runmode can work better with netmap. The default runmode is "autofp" as that is a good general purpose starting point, but some configurations will benefit by switching to "workers". Any change you make to the runmode requires restarting Suricata on the interface before it is effective.
I am interested in what impact switching the runmode has for you, so please report back in this thread. "Workers" mode is best for multi-queue NICs.
Also, if you don't mind, please post the hardware specs (CPU type and speed, particularly) of the Protectli unit you had previously.
-
@bmeeks Did that change and it did a HUGE boost!
I had only Snort rules enabled, with the IPS policy set to Balanced, and got 815 download. After changing this setting it maxed out at 935, which is near the maximum I get. I will now re-enable the other rules I had before and recheck the speed.
The Protectli was a FW6E – 6 Port Intel i7
Intel i7 8550U Quad Core with Hyperthreading (8 threads) at up to 4GHz (turbo boost)
Dual DDR4 memory up to 64GB
mSATA and/or 2.5″ solid state drive
6 Intel Gigabit Ethernet NIC ports
AES-NI
-
@manilx I have reenabled the emerging-3coresec as a test.
Ran a speedtest and got the full download speed of 930. Excellent!
BUT in the middle of repeating that I lost access to the 6100, no ping. The LED was blinking blue, as normal.
Scary!
Powered it off with a 5s press of the power button. Got an orange LED.
Various tries to power it on with a 5s press were unsuccessful. Pulled the power plug.
Turned it on again.
Repeated tests. 930 download.
-
@manilx After some more speedtests the unit died on me twice again! Completely unstable.
This setting kills it after a short while.
Rolling back to a ZFS snapshot.
-
@manilx:
What were the NICs in the Protectli?
There are some open iflib issues with some of the Intel NICs in FreeBSD.
A little surprised workers mode locks up, but it does make me curious. On the Suricata Redmine site some OPNsense users have a similar issue, and OPNsense defaults to workers runmode. We have been trying to determine why.
Their issue is triggered by heavy traffic such as a speed test. Would be interested if you can just let it run a bit with normal traffic flow. Also make sure all hardware offloadings are disabled. That is very critical with inline IPS mode.
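As a rough way to double-check the offloading point above from a shell, the sketch below parses the `options=` line that FreeBSD's `ifconfig` prints for an interface and reports which offload capabilities are still listed. The set of flag names treated as "offloads" and the sample text (including its hex value) are assumptions for illustration only, not output captured from a 6100 or a Protectli.

```python
import re

# Capability names commonly shown by FreeBSD ifconfig for hardware offloads.
# This selection is an assumption for illustration; adjust it for your NIC.
OFFLOAD_FLAGS = {
    "RXCSUM", "TXCSUM", "RXCSUM_IPV6", "TXCSUM_IPV6", "TSO4", "TSO6", "LRO",
}


def enabled_offloads(ifconfig_text: str) -> set[str]:
    """Return the offload capabilities found on the 'options=' line."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if match is None:
        return set()
    listed = set(match.group(1).split(","))
    return listed & OFFLOAD_FLAGS


# Made-up sample resembling `ifconfig em0` output (hex value is illustrative).
SAMPLE_OUTPUT = (
    "em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=481009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWFILTER>\n"
)

# Any name printed here is an offload that is still enabled on the interface.
print(sorted(enabled_offloads(SAMPLE_OUTPUT)))
```

An empty result would suggest the checksum, TSO and LRO offloads are off for that interface; anything else is worth revisiting in the firewall's offloading settings before testing inline IPS mode again.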
-
@bmeeks Protectli had Intel NICs as I posted above.
Yes, it's definitely caused by heavy traffic. But I can't do more tests here. I NEED a stable system. I have nightly backups running from my company.
All hardware offloadings were/are disabled.
-
@manilx said in 6100 SLOW in comparison to Protectli FW6E:
Protectli had Intel NICs as I posted above.
Can you give me the exact model or else the specific FreeBSD driver being used? For example, em, igb, ix, etc.? All Intel NICs are not using the same driver in FreeBSD.
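On FreeBSD the interface name usually carries the driver name as its prefix, so the driver can often be read straight from what the interface is called. A minimal sketch, assuming plain names of the form driver plus unit number (the helper `split_interface` is hypothetical, and VLAN or other composite names are deliberately rejected):

```python
import re


def split_interface(name: str) -> tuple[str, int]:
    """Split a name such as 'em0' into (driver, unit)."""
    match = re.fullmatch(r"([a-z]+)(\d+)", name)
    if match is None:
        raise ValueError(f"not a plain driver+unit interface name: {name!r}")
    return match.group(1), int(match.group(2))


for iface in ("em0", "igb1", "ix0"):
    driver, unit = split_interface(iface)
    print(f"{iface}: driver={driver} unit={unit}")
```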
-
@manilx said in 6100 SLOW in comparison to Protectli FW6E:
Yes, it's definitely caused by heavy traffic.
Unfortunately, I don't have a test bed that seems able to reproduce the problem. Ditto for the upstream Suricata developer working on this same issue. And not even the OPNsense principal developer has been able to duplicate the stall/hang so far as I know. But there is a European user posting in the Suricata Redmine thread here that can reliably reproduce the problem. He is using a virtual environment, so virtual NIC drivers instead of actual hardware.
There is obviously an issue, but so far we have been unable to identify the cause so we can fix it. The first step is being able to reliably reproduce the stall/hang, and thus far that has not been done on a developer's machine.
-
@bmeeks https://eu.protectli.com/product/fw6e/
Interfaces were em0 and em1
This unit is a 6100. The freeze has got me panicked and I really can't afford for it to lose connectivity "at random"
-
@bmeeks Perhaps Netgate can provide you a 6100 for this. It's in their interest also as pfSense is only as good as the packages.....
-
@manilx said in 6100 SLOW in comparison to Protectli FW6E:
@bmeeks https://eu.protectli.com/product/fw6e/
Interfaces were em0 and em1
This unit is a 6100. The freeze has got me panicked and I really can't afford for it to lose connectivity "at random"
Understand. My request was just "if possible", but I certainly appreciate the need for stability.
Some other questions, though, if you will answer them --
On the Protectli hardware, were you also using Inline IPS Mode and able to achieve essentially line-rate speed tests on the Gigabit link with Suricata running? I think that answer is "yes" from reading your posts in this thread, but I want to be sure.
And you say the Protectli Ethernet drivers were the em interfaces. That is potentially a valuable clue as, like I said previously, there are some differences in the drivers that support Intel NIC families on FreeBSD.
I have shared your experience in the Suricata Redmine issue thread I referenced earlier.
-
@bmeeks ANY questions you need!!!!
I used exactly the same config on the Protectli. I imported the config from there actually.
Inline IPS mode, with all ETopen rules activated and I got full 980 download speed.
-
@manilx said in 6100 SLOW in comparison to Protectli FW6E:
@bmeeks ANY questions you need!!!!
I used exactly the same config on the Protectli. I imported the config from there actually.
Inline IPS mode, with all ETopen rules activated and I got full 980 download speed.
Thanks. I will continue looking into this. There are three of us still looking at this issue: me, a Suricata developer, and from time to time the OPNsense principal developer. Whatever fix is found will be incorporated into Suricata upstream, so you can follow the Suricata Redmine Issue if you would like. It might also turn out to be something that needs fixing in FreeBSD itself. This actually seems most likely to me as no Suricata users on Linux have reported anything similar.
-
@manilx said in 6100 SLOW in comparison to Protectli FW6E:
@bmeeks Perhaps Netgate can provide you a 6100 for this. It's in their interest also as pfSense is only as good as the packages.....
I have an SG-5100 which, if I recall, has the same NICs. But I am using that one for my production firewall at the moment so kind of in the same situation as you with regards to needing it up and running.
I will soon have some other spare PC hardware, and perhaps I can grab an Intel NIC card for it that uses the ix and/or igc drivers.
-
@bmeeks I actually started with OPNsense on Proxmox. It ran fine for 9 months until https://forum.opnsense.org/index.php?topic=31338.0
This is when I switched to pfSense.
-
@manilx said in 6100 SLOW in comparison to Protectli FW6E:
@bmeeks I actually started with OPNsense on Proxmox. It ran fine for 9 months until https://forum.opnsense.org/index.php?topic=31338.0
This is when I switched to pfSense.
Yes, the issue you linked to there is part of the same Suricata problem. In fact, the issue thread you linked to on the OPNsense forum was the reason the OPNsense developer opened the Suricata Redmine Issue I linked earlier (#5744).
What happened is that starting with Suricata 6.0.9 the upstream group merged in the same netmap device changes for multiple host rings support that we have been using in pfSense since August 2021. I created the original patch to use multiple host rings with Suricata Inline IPS netmap mode and submitted it upstream. It was rather quickly merged into the Suricata 7.x development branch, but not into the 6.0.x Master release branch at that time. But I did merge the patch into the 6.0.x version of Suricata we were using on pfSense. It has been in all Suricata versions used with pfSense since August 2021.
There were no stalling issues that I am aware of reported on pfSense when the change was merged. I can only recall a single poster (and he posted on the Suricata forum and not here on the Netgate forum) who has had an issue, and his issue was reported about two months ago. But immediately upon rolling out the patch with the release of Suricata 6.0.9, OPNsense users began experiencing the stall/hang. Eventually OPNsense rolled back to the netmap code that was in Suricata 6.0.8 (essentially reverting the multiple host rings patch). That stabilized Suricata for their users. We are still trying to determine what's up with the new patch.
One key difference, and your experience today reiterates the importance of this difference, is that runmode "workers" is the default on OPNsense while runmode "autofp" is the default on pfSense. Seems "workers" mode is the problem child for some reason. Per your testing, switching to "workers" runmode results in the hang/stall. Running with "autofp" mode seems to be stable, but produces lower throughput.
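To put a number on the throughput gap, here is a small sketch using the two download figures reported earlier in this thread (815 with "autofp", 935 after switching to "workers"). The helper name `percent_gain` is just for illustration, and the figures come from single speed tests, so treat the result as a rough indication only.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


autofp_download = 815.0   # reported with runmode "autofp"
workers_download = 935.0  # reported after switching to "workers"

gain = percent_gain(autofp_download, workers_download)
print(f"workers vs autofp: {gain:.1f}% higher download")  # prints 14.7%
```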
-
@manilx
I'll ask a simple question.
Is it really necessary to use Inline mode?
I have repeatedly tried to run various network cards in this mode, most of them Intel (ix, igb, igc) and even, ridiculous as it sounds, one Realtek ... Sooner or later some problem arose, such as a kernel panic. That's why I use legacy mode in production.
The risk of the service being unavailable outweighed my paranoia.
-
@w0w You might have a point there.
I guess that since it was the newer way to do this, and it was there, I thought it should work....
-
@manilx Tried legacy and it's even slower than inline.
Not an option!
-
@manilx
You did reboot your firewall after switching to legacy, didn't you?
I don’t know what’s wrong this time, but usually it is exactly the opposite, something like this:
Should performance differ so much LEGACY/INLINE IDS?
Of course, I understand that a production firewall is completely unsuitable for various kinds of tests, but nevertheless, most likely the problem is in some kind of configuration ... I think so. One of my two firewalls worked fine with Suricata, and its specs are comparable to the 6100... it's a Celeron(R) CPU N3160 with 16GB of memory, but my bandwidth is only about 600 Mbit, so it is possible I did not reach some limit. But I remember that one of the NICs I tested showed slow speed with Inline mode and normal speed with Legacy, which is why I thought it could help.
...