Suricata Often Down/Disabled
-
I've run pfSense on a Dell Inspiron 531 (AMD Athlon 64 X2 4000+ w/ 4 GB RAM) for months without issue. Suricata has performed well positioned in front of a small network. IPS policy selection is set to "high". One monitored interface: WAN. After adding a second Suricata interface, a persistent VPN tunneling through the WAN interface, BOTH the WAN and the VPN become disabled at random times, often several times an hour. This is not workable, and I can't find out why it happens. I'm a relative Suricata/Snort newbie, so I need to post to the forum for assistance.
Points to consider:
1. Is this likely a performance issue with my hardware or configuration?
2. True, the VPN is getting a good deal more alerts than the WAN ever did.
3. Should I not run Suricata on both interfaces at the same time? Regardless, running it on only one interface at a time results in the same behavior: the interface is disabled within minutes.
4. I am running Barnyard 2 on both interfaces.
5. I have increased the default memory allocation for most of the settings in Suricata (in most cases for most configurable fields, 4x default memory) and I am still getting the same behavior despite not hitting the memory limit of the entire system.
6. I have set the IPS Policy Selection to "conservative" (aka, low), and the same behavior results-- both interfaces become disabled frequently.
7. See attached recent Suricata log. I see a lot of rules that result in, "ERRCODE: SC_ERR_INVALID_SIGNATURE(39)". Could this be the issue?
I realize that I am likely missing the "elephant in the front yard" on this one. Thank you all for your time and for direction on troubleshooting this issue!
[suricata VPN Interface.txt](/public/imported_attachments/1/suricata VPN Interface.txt)
-
First up, I can tell you the SC_ERR_INVALID_SIGNATURE error is not the problem. That's just Suricata rejecting some Snort VRT rules that have keywords and content options that Suricata 2.0.x does not recognize. It just prints the error and then throws out those rules. Emerging Threats (now owned by Proofpoint) is a bit better suited for Suricata because they produce a rule set explicitly designed for Suricata. The package automatically downloads that version when you enable ET rules. As I've said several times here on the forum before, there are nearly 800 Snort VRT rules that Suricata will reject and not use because of unsupported keyword and content-modifier options.
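To get a feel for how many rules are being thrown out, the error lines in the Suricata log can simply be counted. The following is a minimal sketch, not part of the package: the sample log lines are hypothetical, and only the `SC_ERR_INVALID_SIGNATURE` tag is taken from the log quoted in this thread.

```python
# Tag printed by Suricata when it rejects a rule (quoted earlier in the thread).
ERR_TAG = "SC_ERR_INVALID_SIGNATURE"


def count_rejected(log_lines):
    """Return (lines carrying ERR_TAG, total lines examined)."""
    lines = list(log_lines)
    rejected = sum(1 for line in lines if ERR_TAG in line)
    return rejected, len(lines)


# Hypothetical sample lines; the real log layout may differ.
sample = [
    "<Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - hypothetical rule A",
    "<Info> -- hypothetical informational line",
    "<Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - hypothetical rule B",
]

rejected, total = count_rejected(sample)
print(f"{rejected} of {total} log lines report a rejected rule")
```

To run it against a real log, pass an open file object for the attached interface log instead of `sample`. A count in the hundreds would be consistent with the rejected VRT rules described above and would not by itself point to a crash.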
It might be the two interfaces are stepping on each other. Suricata operates by putting the monitored interface in promiscuous mode.
Are you saying that previously, with Suricata on just the WAN, it ran fine? And now, after adding the VPN, it crashes no matter which of the two interfaces (VPN or WAN) have Suricata enabled?
Bill
-
Bill,
Thank you for the response– the information about the rejected Suricata rules and Emerging Threats is very helpful to know. It's a good reason to purchase an ET Pro subscription.
You asked, "Are you saying that previously, with Suricata on just the WAN, it ran fine? And now, after adding the VPN, it crashes no matter which of the two interfaces (VPN or WAN) have Suricata enabled?"
The unfortunate answer is, "yes". You have a perfect understanding of the issue: Suricata ran fine (for months, even) with the WAN interface alone (not as many alerts as the VPN interface). Since I added the VPN interface a few days ago, Suricata crashes and disables itself no matter the combination of interfaces. It crashes when Suricata is enabled on both interfaces, or when either of the two (individually) has Suricata running. The crashes occur very soon after re-enabling the interface, usually within minutes to an hour. The VPN connection functions well, with the exception of Suricata crashing. Deleting and re-adding the interfaces has not resolved the issue yet, either.
I wonder... is it possible that if this is not a performance-related issue (by overwhelming the hardware with too many alerts on the VPN interface-- it is, after all, a public-facing VPN server), then maybe the "disabling" behavior I'm seeing is the result of a clever attack of some sort? And if not an attack, a configuration issue? I am wide open to suggestions.
I'll be happy to provide logs of whatever you ask for and thank you for your expertise.
Sincerely
-- Rick
-
I doubt it's performance related. More likely something to do with libpcap interacting with the virtual VPN interface on the actual WAN interface. Just a hypothesis at this point, though. Is there anything printed in the system log at the time of the crash? Is your WAN a standard DHCP setup with your ISP giving out an IP, is it static, or is it something like PPPoE?
Bill
-
Aha! Big clue found in the logs, I believe:
1. I waited until Suricata dropped on one of the two interfaces
2. Under the General System logs, I found these entries:
Mar 2 18:33:12 kernel: swap_pager_getswapspace(16): failed
Mar 2 18:33:12 kernel: swap_pager_getswapspace(16): failed
Mar 2 18:33:11 kernel: pid 76462 (suricata), uid 0, was killed: out of swap space
Mar 2 18:33:11 kernel: swap_pager_getswapspace(16): failed
Mar 2 18:33:11 kernel: swap_pager_getswapspace(16): failed
Mar 2 18:33:11 kernel: swap_pager_getswapspace(2): failed
Mar 2 18:33:11 kernel: swap_pager_getswapspace(16): failed
* See the attached .txt file, which illustrates the Suricata failure in more detail, with more log entries.
3. I noticed during the time of the failure that the memory usage briefly climbed to 95%+ and then backed down to "8% of 3931MB". Also, I noticed that the swap space climbed and remained at 73% of 8191MB after the crash. Suricata is now disabled on one interface (VPN crashed, this time) and the system metrics such as CPU and memory are stable.
* see screenshot that was taken shortly after the crash, and the system information metrics currently remain stable at the time of this writing (30+ min. or so after the crash).
So, what would consume so much swap space? What setting(s) shall I modify to mitigate the problem?
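For anyone hunting for the same symptom, the kill message can be picked out of the system log with a short script. This is only a sketch: the regular expression is modeled on the single "was killed" line quoted above, and the function name `find_kills` is made up for illustration.

```python
import re

# Pattern modeled on the kernel line quoted in this post.
KILL_RE = re.compile(
    r"kernel: pid (?P<pid>\d+) \((?P<name>[^)]+)\), uid (?P<uid>\d+), "
    r"was killed: (?P<reason>.+)$"
)


def find_kills(lines):
    """Return (pid, process name, reason) for every 'was killed' entry."""
    kills = []
    for line in lines:
        match = KILL_RE.search(line.rstrip())
        if match:
            kills.append(
                (int(match.group("pid")), match.group("name"), match.group("reason"))
            )
    return kills


# Three of the log lines from this post.
log = [
    "Mar 2 18:33:12 kernel: swap_pager_getswapspace(16): failed",
    "Mar 2 18:33:11 kernel: pid 76462 (suricata), uid 0, was killed: out of swap space",
    "Mar 2 18:33:11 kernel: swap_pager_getswapspace(2): failed",
]

for pid, name, reason in find_kills(log):
    print(f"{name} (pid {pid}) killed: {reason}")
```

Seeing `suricata` as the killed process confirms that the kernel stopped it after swap ran out, rather than Suricata exiting on its own.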
–----------------------------------------------------------------------------------------------------------------------------------
Bill, you asked, "Is your WAN a standard DHCP setup with your ISP giving out an IP, is it static, or is it something like PPPoE?"
Indeed, my WAN is a standard DHCP setup from a cable provider. I'm not using Dynamic DNS currently.
My VPN is a third-party VPN provider. That VPN server is public facing with no firewall configured, and the IP is dynamic, using the Diffie-Hellman key exchange with forward secrecy.
If there is anything else I can provide, let me know.
Thank you, Bill.
Sincerely,
-- Rick
![pfSense System Information.png](/public/imported_attachments/1/pfSense System Information.png)
pfSense_Log-Suricata_Crashes-Example#1.txt
-
That's a lot of swap space to be consuming. In fact, it really should be zero percent. I think you mentioned that as part of your earlier troubleshooting you went through and quadrupled many of the memory settings for Suricata's configurable parameters. That along with a large enabled set of rules could be why you are consuming all of your 4GB of RAM and then using swap memory.
Go back and set all of the parameters you adjusted back to their defaults, then see how swap usage looks.
Bill
-
Bill,
I did do some more tests last night. I reset the memory configuration to default for both interfaces but to no avail: it was still consuming a large amount of RAM and swap space.
So, this morning I removed the VPN interface altogether (remember that adding this interface was the beginning of the symptoms). However, after a reboot, over 75% of the memory was consumed within minutes while the swap space remained low. I didn't have time to perform much more testing, but watching the memory climb steadily with each refresh of the system stats page, with only the original "WAN" interface installed and enabled, makes me think there's a memory leak somewhere.
I will do more tests tonight as I have time, and if necessary I'll even perform a full wipe/reload to see whether file corruption was the issue.
If you have any additional pointers on troubleshooting memory/swap issues, I'll surely welcome the suggestions.
Thank you, truly, for your time thus far. I'll be sure to report back my results ASAP and in detail so others can benefit from this experience.
All the best,
– Rick
-
Have you taken a look in top from the console to see which process is consuming the most memory? Is it Suricata? 4GB of RAM should be enough unless you are running an extremely large rule set with a ton of connections.
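If watching top interactively is awkward, a one-off process listing can be sorted by resident memory instead. The sketch below assumes the text comes from `ps -axo pid,rss,vsz,comm` with RSS reported in KiB; the sample rows are hypothetical numbers, not measurements from this system, and `top_by_rss` is an illustrative name.

```python
def top_by_rss(ps_output, n=3):
    """Return the n largest processes as (rss_kib, pid, command) tuples."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4 or not parts[1].isdigit():
            continue  # ignore rows that do not have a numeric RSS column
        pid, rss, _vsz, comm = parts
        rows.append((int(rss), int(pid), comm))
    return sorted(rows, reverse=True)[:n]


# Hypothetical listing in the assumed column order: PID, RSS, VSZ, COMMAND.
sample = """\
  PID     RSS     VSZ COMMAND
76462 3145728 5242880 suricata
  312   20480   65536 php-fpm
  101    4096   16384 syslogd
"""

for rss_kib, pid, comm in top_by_rss(sample):
    print(f"{comm:<10} pid={pid:<6} rss={rss_kib / 1024:.0f} MiB")
```

Taking a listing every few minutes and comparing the top entry would show whether one process keeps growing, which is what a leak would look like.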
Bill
-
Bill,
I did not check top before further troubleshooting this evening, unfortunately. However, I have narrowed down the issue somewhat:
At the console this evening, I removed the network interfaces one by one (VPN and WAN) and observed the performance of each permutation after rebooting. Each test instance yielded the same symptoms under default Suricata configuration settings, despite both of the interfaces being removed from Suricata altogether at one point. So, I uninstalled the Suricata package entirely, rebooted and found the system to be stable.
I then installed Snort, configured it to default settings and downloaded the rules with a valid oink code. I found that the memory was stable at 18-27% usage when both the VPN and the WAN interfaces were enabled, and the swap space was stable around 0%. (See screenshot). I then configured the IPS Policy Selection for "Security" on both interfaces. Thereafter, the system continued to remain stable and responsive.
I have yet to uninstall Snort and reinstall Suricata. But the test seems to indicate a corrupt installation or a damaging confluence of configuration settings within Suricata? Performance does not seem to be an issue when Snort is running using the identical network interface configuration and (almost) identical configuration settings.
I'm running out of time this evening, but I will attempt again over the weekend and report back. I'm more hopeful to find the precise answer, now.
Thank you for your input!
Sincerely,
– Rick
::)
-
There were some reported memory leak issues with the Suricata binary, but I don't recall the specific version. Currently the pfSense platform is using 2.0.9 of Suricata, but that will soon change to the new 3.0RC3 branch for pfSense 2.3-BETA. pfSense 2.2.x users will have to remain on the 2.0.9 Suricata version.
If Snort is working for you, then go with it. There is practically no difference in the two packages in terms of protection.
Bill
-
Same here on my new AMD octa-core: I thought Suricata might benefit from it (compared to my old Core 2 Duo), which is why I switched from Snort to Suricata, expecting better multicore support. Snort had been running here without problems for months.
Unfortunately, this also led to constant crashes in the Suricata module (with tons of "error parsing signature" entries in the logs), so I switched back to Snort too.
In the meantime - while suricata behaves like this - I hope that you get these problems under control.
Thanks for your work.
-
Bill:
Thank you for your input and suggestions. I find myself a little weary having spent hours on troubleshooting this issue this week, and while not being a pfSense/IDS/IPS expert by far I know I spent more time than was likely necessary.
I have not yet reverted Snort back to Suricata. Given your thought that they are almost identical (other than some well documented differences such as multi-thread support, ruleset behavior, etc), I believe I will continue to work with Snort. Currently, my system is stable using Snort with both interfaces enabled (VPN and WAN), and I'm now starting to fine tune the rules (more investment in Snort). As an aside, I found that the "http inspect" suite of rules is often indicating a false positive.
Thank you for your time and for actively listening and responding to each of my posts. Sincerely so– that's dedication.
Peter:
I am glad to know someone can corroborate my findings. This helps allay any doubt that my machine was working with a hardware (RAM?)/corrupt software issue. The odd thing about all this is that Suricata ran perfectly smooth for months using just one interface (WAN) from a typical residential cable connection. Once the VPN was established, the memory climbed immediately after reboot like a thermometer in Death Valley… in August. ???
I hope others might be able to gain from this experience. If anyone wants to see any other logs or continue to pursue this matter, feel free to post to this thread and I will respond.
All the best,
-- Rick
-
ok, solved my problem.
Mine was related to network memory buffers (mbuf), as described here: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#mbuf_.2F_nmbclusters
This seems to be a "known bug", or rather a need to manually tune the standard pfSense settings, if you have many CPU cores AND many NICs on your machine (mine has 8 cores and 5 NICs).
After setting nmbclusters to one million (1,000,000), everything is fine again, with no more Suricata crashes (despite the fact that Suricata does not handle all VRT rules, but "that's another roadworks", as we say in Germany ;) ).
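To check whether a machine is close to its mbuf cluster limit before changing anything, the summary line from `netstat -m` can be parsed. This is a rough sketch: it assumes the FreeBSD-style line "N/N/N/N mbuf clusters in use (current/cache/total/max)", and the sample numbers are hypothetical. The limit itself is normally raised through the `kern.ipc.nmbclusters` tunable, as the linked page describes.

```python
import re

# Matches the assumed "current/cache/total/max mbuf clusters in use" line.
CLUSTER_RE = re.compile(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use")


def cluster_usage(netstat_m_output):
    """Return (total, maximum, fraction used), or None if the line is absent."""
    match = CLUSTER_RE.search(netstat_m_output)
    if match is None:
        return None
    _current, _cache, total, maximum = map(int, match.groups())
    if maximum == 0:
        return None  # avoid dividing by zero on an unexpected value
    return total, maximum, total / maximum


# Hypothetical sample line, not taken from the poster's machine.
sample = "24000/2000/26000/26584 mbuf clusters in use (current/cache/total/max)"

usage = cluster_usage(sample)
if usage is not None:
    total, maximum, fraction = usage
    print(f"{total} of {maximum} clusters in use ({fraction:.0%})")
```

A total that sits near the maximum suggests the limit is too low for the number of cores and NICs, which matches the situation described in this post.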