Suricata process dying due to hyperscan problem
-
@masons said in Suricata process dying due to hyperscan problem:
I removed the offending rule (SID 26470), removed the ASLR change and restarted Suricata. The PC interface Suricata instance immediately dumps core with Signal 11 again.
Stopping the Suricata service, making the ASLR change and restarting Suricata, results in the PC interface Suricata instance coming up and staying up.
At least for me, across several VMs, this is very consistent behavior.
I was about to test specifically with that offending rule enabled, but your test results suggest that is a moot point (meaning not the actual cause). I have no proof, but ASLR is definitely a suspect in my mind (at least for the Signal 11 segfault issue). Apparently it does little to help with the "Hyperscan returned fatal error -1" issue, though.
-
@Maltz said in Suricata process dying due to hyperscan problem:
kernel kills Suricata with a "failed to reclaim memory" error
I didn't reread the now-long thread, but did you post your memory usage with Suricata running?
ZFS is supposed to give up cache RAM but can be tuned to reduce usage:
https://docs.netgate.com/pfsense/en/latest/hardware/tune-zfs.html
"The default maximum ARC size (vfs.zfs.arc.max) is automatic (0) and uses 1/2 RAM or the total RAM minus 1GB, whichever is greater."
-
@SteveITS said in Suricata process dying due to hyperscan problem:
I didn't reread the now-long thread, but did you post your memory usage with Suricata running?
It's "28% of 3388 MiB" (4GB Netgate 2100) right now. With any algorithm other than AC-BS, RAM usage ramps up a few minutes after Suricata starts then the kernel kills it.
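As a rough check of what the quoted ZFS default would mean on a box this size, here is a minimal sketch of the arithmetic. It assumes the documentation's "1GB" means 1024 MiB and that the 3388 MiB figure is the total the rule is applied to; the function name is made up for illustration and is not part of pfSense or FreeBSD.

```python
MIB_PER_GIB = 1024  # assumption: the documentation's "1GB" is 1024 MiB


def default_arc_max_mib(total_ram_mib: int) -> int:
    """Return the larger of half the RAM or RAM minus 1 GiB, in MiB.

    This only mirrors the sentence quoted from the Netgate ZFS tuning page;
    it is not read from the kernel.
    """
    half = total_ram_mib // 2
    minus_one_gib = total_ram_mib - MIB_PER_GIB
    return max(half, minus_one_gib)


if __name__ == "__main__":
    total = 3388  # MiB, the figure reported for the 4GB Netgate 2100 above
    arc = default_arc_max_mib(total)
    print(f"default ARC ceiling: {arc} MiB, leaving {total - arc} MiB for everything else")
```

Under those assumptions the ARC is allowed to grow to 2364 MiB, which leaves about 1 GiB for everything else if the cache is not released quickly enough.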
-
@tylerevers said in Suricata process dying due to hyperscan problem:
@bmeeks said in Suricata process dying due to hyperscan problem:
My pull request containing the anticipated fix for this Hyperscan error has been merged. An updated Suricata package has built and should appear as an available update for 2.7.2 CE and 23.09.1 Plus users.
Look for an update to version 7.0.2_2 for the Suricata package. When installed, the new package should pull in version 7.0.2_5 of the Suricata binary.
Fingers crossed this fixes the Hyperscan issue. But as I mentioned previously, since I could never reproduce the error in my small test environment, I can't say with 100% certainty the bug I found and fixed is the actual Hyperscan culprit.
Nearly 20 hours since updating to 7.0.2_2 on 23.09.1 Plus with custom bare metal setup and no Hyperscan crash yet. Pattern Match set to AUTO and Blocking Mode ENABLED. Using all VLANs that traverse a LAGG in my case just as a reminder.
Thanks, Bill!
At roughly the 28-hour mark, the Suricata Interface failed with the Hyperscan issue again.
-
@bmeeks
I have removed all the rules from an interface, but the Hyperscan error still shows up after a few moments for me.
+noaslr is still doing nothing.
Any chance you can provide the dbg pkg of Suricata?
-
@bmeeks
For now, and maybe going forward as a permanent solution, can we just have the package updated to use AC-CS as the default, with a note stating to avoid Hyperscan for its inconsistent performance or something along those lines?
-
@kiokoman said in Suricata process dying due to hyperscan problem:
@bmeeks
i have removed all the rules from an interface but the hyperscan error is still there after a few moments for me.
+noaslr is still doing nothing
any chance you can provide the dbg pkg of suricata?
Not at the moment. I'm trying to reconstruct my package builder for the RELENG_2_7_2 branch of CE (which is the current 2.7.2 release), and that build is failing. Working with the Netgate team on that. Once I get my package builder working again, then I can build a debug package and perhaps share it.
Nothing else can happen until at least after this coming weekend as I am about to be out of town for a few days.
-
@michmoor said in Suricata process dying due to hyperscan problem:
@bmeeks
For now and maybe going forward as a perm solution can we just have the package updated to use AC-CS as the default with a note stating to avoid HyperScan for its inconsistent performance or something along those lines.
I don't see the point in changing the default if users can just simply make the change manually and save it.
And I can't work on this issue anymore until late this Sunday at the earliest as I will be away from all my computing infrastructure until then.
-
@bmeeks said in Suricata process dying due to hyperscan problem:
I don't see the point in changing the default if users can just simply make the change manually and save it.
I think changing the default would be tremendously useful for people who have no way of knowing why Suricata is crashing over a month after the pfSense update that seemingly broke it. People who haven't, or don't have the expertise to, spend hours poring over system logs, find the right log entry to google, and make their way to this thread.
-
@Maltz said in Suricata process dying due to hyperscan problem:
@bmeeks said in Suricata process dying due to hyperscan problem:
I don't see the point in changing the default if users can just simply make the change manually and save it.
I think changing the default would be tremendously useful for people who have no way of knowing why Suricata is crashing over a month after the pfSense update that seemingly broke it. People who haven't, or don't have the expertise to, spend hours poring over system logs, find the right log entry to google, and make their way to this thread.
I beg to differ. Why force everybody to use some settings as a workaround in order to track down an issue? This is not a test branch. As far as I understood from the posts here, this happens only if Suricata is in Legacy Mode. For example, I use Suricata in inline mode on WAN and also on LAN with multiple VLANs and I don't encounter this issue. I'm not saying that we should not attempt to fix this, but forcing all of us to use the proposed defaults is bad practice.
-
@bmeeks said in Suricata process dying due to hyperscan problem:
@kiokoman said in Suricata process dying due to hyperscan problem:
@bmeeks
i have removed all the rules from an interface but the hyperscan error is still there after a few moments for me.
+noaslr is still doing nothing
any chance you can provide the dbg pkg of suricata?
Not at the moment. I'm trying to reconstruct my package builder for the RELENG_2_7_2 branch of CE (which is the current 2.7.2 release), and that build is failing. Working with the Netgate team on that. Once I get my package builder working again, then I can build a debug package and perhaps share it.
Nothing else can happen until at least after this coming weekend as I am about to be out of town for a few days.
Well, I'm not in a hurry. I just like to solve mysteries.
-
Suricata still hangs on interfaces with higher traffic, even though I set it to use AC-KS.
Strangely, the Hyperscan error message still appears even though all interfaces are set to AC-KS:
[104160 - W#03] 2023-12-13 16:18:07 Error: spm-hs: Hyperscan returned fatal error -1.
-
@paulp same here:
[122730 - W#04] 2023-12-12 17:20:11 Error: spm-hs: Hyperscan returned fatal error -1.
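For anyone trying to line up when each interface dies, a minimal sketch for pulling these entries out of a log and counting them per worker might look like the following. It assumes log lines shaped exactly like the two pasted above, and it treats the bracketed fields as a thread ID and a worker name, which is a guess from the samples rather than documented Suricata behavior. The function name and regex are illustrative only.

```python
import re
from collections import Counter

# Matches lines shaped like the two samples in this thread, e.g.
# [104160 - W#03] 2023-12-13 16:18:07 Error: spm-hs: Hyperscan returned fatal error -1.
# Assumption: the bracket holds "<thread id> - <worker name>".
LINE_RE = re.compile(
    r"^\[(?P<tid>\d+) - (?P<thread>[^\]]+)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Error: spm-hs: Hyperscan returned fatal error (?P<code>-?\d+)\.?$"
)


def hyperscan_errors(lines):
    """Return (timestamp, worker, error code) for every matching log line."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            hits.append((m["ts"], m["thread"], int(m["code"])))
    return hits


SAMPLE = [
    "[104160 - W#03] 2023-12-13 16:18:07 Error: spm-hs: Hyperscan returned fatal error -1.",
    "[122730 - W#04] 2023-12-12 17:20:11 Error: spm-hs: Hyperscan returned fatal error -1.",
]

if __name__ == "__main__":
    hits = hyperscan_errors(SAMPLE)
    per_worker = Counter(worker for _, worker, _ in hits)
    for ts, worker, code in hits:
        print(ts, worker, code)
    print(dict(per_worker))
```

To run it against a real log, pass an open file object in place of SAMPLE; the log path differs per interface, so it is left out here.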
-
@NRgia said in Suricata process dying due to hyperscan problem:
@Maltz said in Suricata process dying due to hyperscan problem:
@bmeeks said in Suricata process dying due to hyperscan problem:
I don't see the point in changing the default if users can just simply make the change manually and save it.
I think changing the default would be tremendously useful for people who have no way of knowing why Suricata is crashing over a month after the pfSense update that seemingly broke it. People who haven't, or don't have the expertise to, spend hours poring over system logs, find the right log entry to google, and make their way to this thread.
I beg to differ. Why force everybody to use some settings as a workaround in order to track down an issue? This is not a test branch. As far as I understood from the posts here, this happens only if Suricata is in Legacy Mode. For example, I use Suricata in inline mode on WAN and also on LAN with multiple VLANs and I don't encounter this issue. I'm not saying that we should not attempt to fix this, but forcing all of us to use the proposed defaults is bad practice.
Let me rephrase: Changing the logic around the default "Auto" setting would be useful. If AC-BS is the only one that works reliably in Legacy Mode (it's the only one that works for me at any rate) then that's the one "Auto" should choose.
-
@Maltz
Exactly, which is why I proposed it. As someone pointed out, this isn't the dev branch, this is the 'prod' branch, so change the default to what works for most people instead of having them guess what the issue is, google it, and find their way to a forum topic that already has 205 posts. Then the person needs to read all of it to conclude that Hyperscan doesn't work.
-
This issue is incredibly frustrating to pin down. It took a full day, but I just saw the Hyperscan error on one of my VMs. The ASLR change does seem to significantly improve my ability to start Suricata instances and keep them running, but it's not a viable workaround. I've switched over to AC-BS. I'll watch this for a few days and see what happens.
-
Good news! At least I hope so :)
I changed Pattern Match to AC-BS and for now I see it has been working for about 20 hours.
With the Pattern Matcher Algorithm set to Auto, AC-KS or Hyperscan, Suricata stopped after a few hours on the interfaces that had higher traffic.
-
@bmeeks said in Suricata process dying due to hyperscan problem:
best workaround for you guys is to select AC-CS (the normal default when Hyperscan is not present
I don't seem to have AC-CS as an option on the routers I've checked?
-
@SteveITS
I've read AC-KS is performing well enough.
-
@SteveITS I assumed that was a typo. AC-BS is the only one that works for me.