Suricata process dying due to hyperscan problem
-
After the last error i decided to uninstall everything and reconfigure from scratch
maybe some configuration didn't migrate correctly
now i'm unable to reproduce the error at start
-
@kiokoman said in Suricata process dying due to hyperscan problem:
After the last error i decided to uninstall everything and reconfigure from scratch
maybe some configuration didn't migrate correctly
now i'm unable to reproduce the error at start
This has been the experience of a few other users as well, all the way back to the original release of Suricata 7.x in pfSense. That's what makes this such a maddeningly difficult thing to debug.
-
-
I am continuing to look into this issue. Just sent a new batch of emails to the Suricata development team with questions about some recent changes in this area of the Suricata binary's code.
Still would be nice if I could reliably reproduce this in my test machines with a debug image running.
-
Attention Users hitting the Suricata Hyperscan problem (or other mysterious Suricata stoppages):
To help in pinning down what this problem is, please collect the following information for me when you experience the crash and include it in your post or feedback.
- Are you seeing a Signal 11 or Signal 10 fault logged in the pfSense system log (under STATUS > SYSTEM LOGS) around the time Suricata crashed? If so, include those log entries in your report.
- Before attempting to restart Suricata after finding it stopped or crashed, examine the suricata.log for the interface under the LOGS VIEW tab in the Suricata GUI. Look in that log for any errors mentioning "hyperscan" and include those in your report.
I am trying to determine if a Signal 11 or Signal 10 happens each time Suricata crashes, or if Suricata is sometimes just stopping on its own when it encounters an internal hyperscan error.
Please provide the information requested above when posting about this issue. It is not helpful at all to simply create a reply saying "I'm having this problem, too" with no additional helpful information.
And at this time there is no indication at all the hyperscan crash issue is related to the Legacy Blocking Mode bug shared with Snort. That bug has, I'm fairly confident, been fixed. I think the issue in this thread is something different.
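The two checks requested above can also be run from a shell prompt. What follows is only a minimal sketch: the function names are made up for illustration, and the log locations shown in the usage lines are assumptions about a typical pfSense install, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for gathering the evidence requested above.
# Pass the log file(s) to search as arguments.

# Print kernel log lines showing Suricata exiting on signal 10 or 11.
find_suricata_signals() {
    grep -E '\(suricata\).*exited on signal (10|11)' "$@"
}

# Print suricata.log lines that mention hyperscan (case-insensitive).
find_hyperscan_errors() {
    grep -i 'hyperscan' "$@"
}

# Example usage (paths are assumptions, adjust to your system):
#   find_suricata_signals /var/log/system.log
#   find_hyperscan_errors /var/log/suricata/suricata_*/suricata.log
```

Run both before restarting Suricata, so the report shows whether a signal fault and a hyperscan error occurred together or separately.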
-
-
@kiokoman Do you have a backup config 1) from before upgrading, 2) that wasn’t working and 3) after rebuilding? Might be interesting to compare the Suricata section to see if anything is different across those.
(I usually save one just before upgrading and immediately after)
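A rough way to do that comparison from a shell is sketched below. It assumes the package settings sit inside a `<suricata>` element of the saved config.xml, which I have not verified against every pfSense version; the function names and file names are illustrative only.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the Suricata section of two config backups.

# Print the <suricata>...</suricata> block of one config file.
# Assumption: the package stores its settings under a <suricata> element.
suricata_section() {
    sed -n '/<suricata>/,/<\/suricata>/p' "$1"
}

# Show the lines that differ between the Suricata sections of two backups.
# Returns diff's exit status: 0 if identical, 1 if different.
diff_suricata_sections() {
    a=$(mktemp) && b=$(mktemp) || return 2
    suricata_section "$1" > "$a"
    suricata_section "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example usage (file names are placeholders):
#   diff_suricata_sections config-before-upgrade.xml config-after-rebuild.xml
```

Restricting the diff to that one section keeps unrelated changes elsewhere in the config out of the comparison.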
-
@SteveITS
i have the backup history,
the only difference after reconfiguration was
old not working config:
<stream_bypass>off</stream_bypass>
<stream_drop_invalid>off</stream_drop_invalid>
vs
new config
<stream_bypass>no</stream_bypass>
<stream_drop_invalid>no</stream_drop_invalid>
everything else is the same but i don't have the old generated suricata.yaml
anyway i have a new problem now, before it was not even starting, now i have this after some hours, and only on one interface (i have suricata running on wan and lan, wan(vmx1) is still running ok)
[843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: running in 'auto' checksum mode. Detection of interface state will require 1000 packets
[843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: snaplen set to 1518
[100515 - Suricata-Main] 2023-11-25 12:03:58 Notice: threads: Threads created -> W: 1 FM: 1 FR: 1 Engine started.
[843086 - W#01-vmx2] 2023-11-25 12:03:59 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[843086 - W#01-vmx2] 2023-11-25 12:05:02 Error: spm-hs: Hyperscan returned fatal error -1.
-
@kiokoman said in Suricata process dying due to hyperscan problem:
@SteveITS
i have the backup history,
the only difference after reconfiguration was
old not working config:
<stream_bypass>off</stream_bypass>
<stream_drop_invalid>off</stream_drop_invalid>
vs
new config
<stream_bypass>no</stream_bypass>
<stream_drop_invalid>no</stream_drop_invalid>
everything else is the same but i don't have the old generated suricata.yaml
anyway i have a new problem now, before it was not even starting, now i have this after some hours, and only on one interface (i have suricata running on wan and lan, wan(vmx1) is still running ok)
[843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: running in 'auto' checksum mode. Detection of interface state will require 1000 packets
[843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: snaplen set to 1518
[100515 - Suricata-Main] 2023-11-25 12:03:58 Notice: threads: Threads created -> W: 1 FM: 1 FR: 1 Engine started.
[843086 - W#01-vmx2] 2023-11-25 12:03:59 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[843086 - W#01-vmx2] 2023-11-25 12:05:02 Error: spm-hs: Hyperscan returned fatal error -1.
Those small differences in Boolean values from the config.xml file would not be a factor here. Something is most likely wrong within the Suricata binary itself, but I don't know where, nor do I know that to be absolutely true.
I've had a virtual machine running for 36 hours, with every single ET Open rule enabled and the Snort IPS Connectivity Policy enabled, and have not seen a crash yet. So, this is a strange problem. Positively identifying it is going to require being able to reproduce it easily. Then a debugging version of Suricata can be executed and the precise failure point identified. But so far I cannot reproduce the problem. And even in @kiokoman's case, the problem disappeared for a time and then recurred later under different circumstances (running versus starting up).
There were some upstream changes in the HyperScan portions of Suricata code starting with version 7.0.1. Those were to work around some problems introduced by a behavior change Intel made upstream in the HyperScan library itself. I've been communicating with the Suricata developer team, and they are pretty confident the fixes they made are sufficient. Nobody on Linux seems to be having a problem. The vast majority of Suricata users are on Linux derivatives. Very few users are on FreeBSD, mostly just the pfSense and OPNsense users. I'm not seeing this problem reported on the OPNsense forum, but they are still running the 6.0.x branch of Suricata and not the new 7.x branch.
-
@kiokoman
If you are willing, please try the following workarounds for me.
Perhaps try just the first one initially, and if you still have the crash, then add on the second one. This command will disable ASLR (address space layout randomization) for the Suricata binary.
Execute this from a shell prompt after first stopping all Suricata instances.
- This will disable ASLR for the Suricata binary:
# elfctl -e +noaslr /usr/local/bin/suricata
Each time you make a change above, stop the Suricata processes, make the change, then restart the processes. The change above is not dynamic. It only sets the "turned on/turned off" flag that is read when the target binary is loaded.
This is a shot in the dark based on my theory that perhaps ASLR is tripping up either the HyperScan library or Suricata. I remember unbound had an issue with ASLR a few versions back, and the temporary workaround until upstream fixed the underlying problem in the code was to disable ASLR for the unbound binary.
To reset this back to the default, execute the same command but with a minus ("-") instead of plus ("+"). An example is below:
# elfctl -e -noaslr /usr/local/bin/suricata
Please report back if you try this and let me know if it helps.
-
-
-
Symptom: WAN Suricata instance works just fine, but the PC (one of several LAN side interfaces) interface instance dumps core immediately after starting with Signal 11.
Nov 25 14:43:06 kernel pid 36387 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)
Nov 25 14:43:06 php 94371 [Suricata] Suricata START for PC(vtnet0.700)...
Nov 25 14:43:05 php 94371 [Suricata] Building new sid-msg.map file for PC...
Nov 25 14:43:05 php 94371 [Suricata] Enabling any flowbit-required rules for: PC...
Nov 25 14:43:05 php 94371 [Suricata] Updating rules configuration for: PC ...
Nov 25 14:43:05 php 94371 [Suricata] Building new sid-msg.map file for WAN...
Nov 25 14:43:05 php 94371 [Suricata] Enabling any flowbit-required rules for: WAN...
Nov 25 14:43:04 php 94371 [Suricata] Updating rules configuration for: WAN ...
Nov 25 14:43:04 php-fpm 64493 Starting Suricata on PC(vtnet0.700) per user request...
The Suricata log for the PC interface does not contain any reference to hyperscan.
I tried the ASLR changes that you suggested. The first one didn't appear to work.
elfctl -e +noaslr /usr/local/lib/libhs.so.5.4.0
elfctl: NT_FREEBSD_FEATURE_CTL note not found
elfctl: NT_FREEBSD_FEATURE_CTL note not found
The second one, for the Suricata binary, did work. Now when I start Suricata both instances start and so far they appear to stay running. However, if I shut down Suricata I see the Signal 10 and core dump.
Nov 25 15:26:11 kernel pid 22945 (suricata), jid 0, uid 0: exited on signal 10 (core dumped)
Nov 25 15:26:10 kernel vtnet0.700: promiscuous mode disabled
Nov 25 15:26:10 kernel vtnet0: promiscuous mode disabled
Nov 25 15:26:09 SuricataStartup 93534 Suricata STOP for PC(23822_vtnet0.700)...
Nov 25 15:26:08 kernel vtnet1: promiscuous mode disabled
Nov 25 15:26:06 SuricataStartup 79721 Suricata STOP for WAN(65037_vtnet1)...
-
It appears I spoke too soon. Now my WAN interface instance of Suricata is dumping core and the PC one is staying up. The WAN interface suricata.log file does include the Hyperscan log entry that you're chasing. As you can see from the logs, the instance ran for about 18 minutes before dumping core and reporting the Hyperscan error.
[214708 - RX#01-vtnet1] 2023-11-25 15:27:25 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[214710 - W#02] 2023-11-25 15:45:18 Error: spm-hs: Hyperscan returned fatal error -1.
-
@masons said in Suricata process dying due to hyperscan problem:
elfctl -e +noaslr /usr/local/lib/libhs.so.5.4.0
elfctl: NT_FREEBSD_FEATURE_CTL note not found
elfctl: NT_FREEBSD_FEATURE_CTL note not found
Oops, looks like the library does not offer that option. I'll update my previous post to remove attempting to disable ASLR in the library. But it does work for Suricata (at least disabling ASLR, that is).
-
Hello,
I'm having the same problem.
I have 6 interfaces set up with Suricata, and only 2 of them stop randomly.
One using IX0 on WAN
And the other one using VLAN on IX1
device = '82599ES 10-Gigabit SFI/SFP+ Network Connection'
Using IPS Mode - Legacy Mode
I deleted one of the monitored interfaces in Suricata that was having the issue and duplicated a working one, and got the same error on the new (same as before) interface. I also tried disabling some of the working ones, but nothing changed.
Suricata log
[607907 - W#02] 2023-11-27 08:35:42 Error: spm-hs: Hyperscan returned fatal error -1.
System log.
Nov 27 08:35:42 kernel ix0: promiscuous mode disabled
-
@jowe78 I have Suricata in IPS mode on the WAN interface. It crashed about once a day with the Hyperscan pattern matcher. Yesterday I switched to AC-KS mode and it crashed after half a day of running. There is no error in the logs.
Now I'm switching to AC-BS mode and will keep you updated.
-
@bmeeks
i'll try asap
Hyperscan License Change after 5.4
According to "Accelerate Snort Performance with Hyperscan and Intel Xeon Processors on Public Clouds", versions of Hyperscan later than 5.4 are going to be closed-source:
https://github.com/VectorCamp/vectorscan
how about vectorscan? is there a plan for it?
.....
106211 - Suricata-Main] 2023-11-27 13:12:52 Notice: threads: Threads created -> W: 1 FM: 1 FR: 1 Engine started.
[863533 - W#01-vmx2] 2023-11-27 13:12:53 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[863533 - W#01-vmx2] 2023-11-27 13:13:00 Error: spm-hs: Hyperscan returned fatal error -1.
elfctl did not help for me
i don't have /root/suricata.core after the crash from hyperscan
gdb (b exit) :
27/11/2023 -- 16:09:46 - <Notice> -- Threads created -> W: 1 FM: 1 FR: 1 Engine started.
27/11/2023 -- 16:09:47 - <Info> -- No packets with invalid checksum, assuming checksum offloading is NOT used
27/11/2023 -- 16:10:12 - <Error> -- Hyperscan returned fatal error -1.
[Switching to LWP 925298 of process 75227]
Thread 3 "W#01-vmx2" hit Breakpoint 5, 0x00000008016be454 in exit () from /lib/libc.so.7
(gdb) bt
#0 0x00000008016be454 in exit () from /lib/libc.so.7
#1 0x00000000006de629 in ?? ()
#2 0x000000000061d9ac in ?? ()
#3 0x000000000061ac4e in AppLayerProtoDetectGetProto ()
#4 0x00000000006197c9 in ?? ()
#5 0x0000000000619439 in AppLayerHandleTCPData ()
#6 0x00000000005aee4a in StreamTcpReassembleAppLayer ()
#7 0x00000000005af9e2 in StreamTcpReassembleHandleSegment ()
#8 0x00000000005b2b9f in ?? ()
#9 0x00000000005b15e2 in StreamTcpPacket ()
#10 0x00000000005b7817 in StreamTcp ()
#11 0x00000000006731c1 in ?? ()
#12 0x0000000000672a1a in ?? ()
#13 0x00000000006a5256 in TmThreadsSlotVarRun ()
#14 0x000000000067409e in ?? ()
#15 0x0000000801468ff4 in ?? () from /usr/local/lib/libpcap.so.1
#16 0x00000000006737b7 in ?? ()
#17 0x00000000006a83aa in ?? ()
#18 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
#19 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfdfd000
(gdb) bt full
#0 0x00000008016be454 in exit () from /lib/libc.so.7
No symbol table info available.
#1 0x00000000006de629 in ?? ()
No symbol table info available.
#2 0x000000000061d9ac in ?? ()
No symbol table info available.
#3 0x000000000061ac4e in AppLayerProtoDetectGetProto ()
No symbol table info available.
#4 0x00000000006197c9 in ?? ()
No symbol table info available.
#5 0x0000000000619439 in AppLayerHandleTCPData ()
No symbol table info available.
#6 0x00000000005aee4a in StreamTcpReassembleAppLayer ()
No symbol table info available.
#7 0x00000000005af9e2 in StreamTcpReassembleHandleSegment ()
No symbol table info available.
#8 0x00000000005b2b9f in ?? ()
No symbol table info available.
#9 0x00000000005b15e2 in StreamTcpPacket ()
No symbol table info available.
#10 0x00000000005b7817 in StreamTcp ()
No symbol table info available.
#11 0x00000000006731c1 in ?? ()
No symbol table info available.
#12 0x0000000000672a1a in ?? ()
No symbol table info available.
#13 0x00000000006a5256 in TmThreadsSlotVarRun ()
No symbol table info available.
#14 0x000000000067409e in ?? ()
No symbol table info available.
#15 0x0000000801468ff4 in ?? () from /usr/local/lib/libpcap.so.1
No symbol table info available.
#16 0x00000000006737b7 in ?? ()
No symbol table info available.
#17 0x00000000006a83aa in ?? ()
No symbol table info available.
#18 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
No symbol table info available.
#19 0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7fffdfdfd000
(gdb) info threads
  Id Target Id Frame
  1 LWP 100968 of process 75227 "Suricata-Main" 0x000000080169a6ea in _nanosleep () from /lib/libc.so.7
  2 LWP 924434 of process 75227 "IM#01" 0x000000080169a7ea in _read () from /lib/libc.so.7
* 3 LWP 925298 of process 75227 "W#01-vmx2" 0x00000008016be454 in exit () from /lib/libc.so.7
  4 LWP 925299 of process 75227 "FM#01" 0x0000000800efffdc in ?? () from /lib/libthr.so.3
  5 LWP 925300 of process 75227 "FR#01" 0x0000000800efffdc in ?? () from /lib/libthr.so.3
(gdb) thread apply all bt
Thread 5 (LWP 925300 of process 75227 "FR#01"):
#0 0x0000000800efffdc in ?? () from /lib/libthr.so.3
#1 0x0000000800f10022 in ?? () from /lib/libthr.so.3
#2 0x0000000800f01b9d in ?? () from /lib/libthr.so.3
#3 0x00000000005ecb12 in ?? ()
#4 0x00000000006a87a8 in ?? ()
#5 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
#6 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf9fb000
Thread 4 (LWP 925299 of process 75227 "FM#01"):
#0 0x0000000800efffdc in ?? () from /lib/libthr.so.3
#1 0x0000000800f10022 in ?? () from /lib/libthr.so.3
#2 0x0000000800f01b9d in ?? () from /lib/libthr.so.3
#3 0x00000000005ec633 in ?? ()
#4 0x00000000006a87a8 in ?? ()
#5 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
#6 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfbfc000
Thread 3 (LWP 925298 of process 75227 "W#01-vmx2"):
#0 0x00000008016be454 in exit () from /lib/libc.so.7
#1 0x00000000006de629 in ?? ()
#2 0x000000000061d9ac in ?? ()
#3 0x000000000061ac4e in AppLayerProtoDetectGetProto ()
#4 0x00000000006197c9 in ?? ()
#5 0x0000000000619439 in AppLayerHandleTCPData ()
#6 0x00000000005aee4a in StreamTcpReassembleAppLayer ()
#7 0x00000000005af9e2 in StreamTcpReassembleHandleSegment ()
#8 0x00000000005b2b9f in ?? ()
#9 0x00000000005b15e2 in StreamTcpPacket ()
#10 0x00000000005b7817 in StreamTcp ()
#11 0x00000000006731c1 in ?? ()
#12 0x0000000000672a1a in ?? ()
#13 0x00000000006a5256 in TmThreadsSlotVarRun ()
#14 0x000000000067409e in ?? ()
#15 0x0000000801468ff4 in ?? () from /usr/local/lib/libpcap.so.1
#16 0x00000000006737b7 in ?? ()
#17 0x00000000006a83aa in ?? ()
#18 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
#19 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfdfd000
Thread 2 (LWP 924434 of process 75227 "IM#01"):
#0 0x000000080169a7ea in _read () from /lib/libc.so.7
#1 0x0000000800f0ea13 in ?? () from /lib/libthr.so.3
#2 0x00000000006355ed in AlertPfMonitorIfaceChanges ()
#3 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
#4 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
Thread 1 (LWP 100968 of process 75227 "Suricata-Main"):
#0 0x000000080169a6ea in _nanosleep () from /lib/libc.so.7
#1 0x0000000800f0e82c in ?? () from /lib/libthr.so.3
#2 0x000000080161ec46 in usleep () from /lib/libc.so.7
#3 0x000000000059fa6a in ?? ()
#4 0x000000000059f3b4 in SuricataMain ()
#5 0x00000008015f06fa in __libc_start1 () from /lib/libc.so.7
#6 0x000000000059bea0 in _start ()
(gdb) thread apply all bt full
Thread 5 (LWP 925300 of process 75227 "FR#01"):
#0 0x0000000800efffdc in ?? () from /lib/libthr.so.3
No symbol table info available.
#1 0x0000000800f10022 in ?? () from /lib/libthr.so.3
No symbol table info available.
#2 0x0000000800f01b9d in ?? () from /lib/libthr.so.3
No symbol table info available.
#3 0x00000000005ecb12 in ?? ()
No symbol table info available.
#4 0x00000000006a87a8 in ?? ()
No symbol table info available.
#5 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
No symbol table info available.
#6 0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7fffdf9fb000
Thread 4 (LWP 925299 of process 75227 "FM#01"):
#0 0x0000000800efffdc in ?? () from /lib/libthr.so.3
No symbol table info available.
#1 0x0000000800f10022 in ?? () from /lib/libthr.so.3
No symbol table info available.
#2 0x0000000800f01b9d in ?? () from /lib/libthr.so.3
No symbol table info available.
#3 0x00000000005ec633 in ?? ()
No symbol table info available.
#4 0x00000000006a87a8 in ?? ()
No symbol table info available.
#5 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
No symbol table info available.
#6 0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7fffdfbfc000
Thread 3 (LWP 925298 of process 75227 "W#01-vmx2"):
#0 0x00000008016be454 in exit () from /lib/libc.so.7
No symbol table info available.
#1 0x00000000006de629 in ?? ()
No symbol table info available.
#2 0x000000000061d9ac in ?? ()
No symbol table info available.
#3 0x000000000061ac4e in AppLayerProtoDetectGetProto ()
No symbol table info available.
#4 0x00000000006197c9 in ?? ()
No symbol table info available.
#5 0x0000000000619439 in AppLayerHandleTCPData ()
No symbol table info available.
#6 0x00000000005aee4a in StreamTcpReassembleAppLayer ()
No symbol table info available.
#7 0x00000000005af9e2 in StreamTcpReassembleHandleSegment ()
No symbol table info available.
#8 0x00000000005b2b9f in ?? ()
No symbol table info available.
#9 0x00000000005b15e2 in StreamTcpPacket ()
No symbol table info available.
#10 0x00000000005b7817 in StreamTcp ()
No symbol table info available.
#11 0x00000000006731c1 in ?? ()
No symbol table info available.
#12 0x0000000000672a1a in ?? ()
No symbol table info available.
#13 0x00000000006a5256 in TmThreadsSlotVarRun ()
No symbol table info available.
#14 0x000000000067409e in ?? ()
No symbol table info available.
#15 0x0000000801468ff4 in ?? () from /usr/local/lib/libpcap.so.1
No symbol table info available.
#16 0x00000000006737b7 in ?? ()
No symbol table info available.
#17 0x00000000006a83aa in ?? ()
No symbol table info available.
#18 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
No symbol table info available.
#19 0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7fffdfdfd000
Thread 2 (LWP 924434 of process 75227 "IM#01"):
#0 0x000000080169a7ea in _read () from /lib/libc.so.7
No symbol table info available.
#1 0x0000000800f0ea13 in ?? () from /lib/libthr.so.3
No symbol table info available.
#2 0x00000000006355ed in AlertPfMonitorIfaceChanges ()
No symbol table info available.
#3 0x0000000800f02d25 in ?? () from /lib/libthr.so.3
No symbol table info available.
#4 0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
Thread 1 (LWP 100968 of process 75227 "Suricata-Main"):
#0 0x000000080169a6ea in _nanosleep () from /lib/libc.so.7
No symbol table info available.
#1 0x0000000800f0e82c in ?? () from /lib/libthr.so.3
No symbol table info available.
#2 0x000000080161ec46 in usleep () from /lib/libc.so.7
No symbol table info available.
#3 0x000000000059fa6a in ?? ()
No symbol table info available.
#4 0x000000000059f3b4 in SuricataMain ()
No symbol table info available.
#5 0x00000008015f06fa in __libc_start1 () from /lib/libc.so.7
No symbol table info available.
#6 0x000000000059bea0 in _start ()
No symbol table info available.
(gdb)
-
@jowe78 said in Suricata process dying due to hyperscan problem:
Hello,
I'm having the same problem.
I have 6 interfaces set up with Suricata, and only 2 of them stop randomly.
One using IX0 on WAN
And the other one using VLAN on IX1
device = '82599ES 10-Gigabit SFI/SFP+ Network Connection'
Using IPS Mode - Legacy Mode
I deleted one of the monitored interfaces in Suricata that was having the issue and duplicated a working one, and got the same error on the new (same as before) interface. I also tried disabling some of the working ones, but nothing changed.
Suricata log
[607907 - W#02] 2023-11-27 08:35:42 Error: spm-hs: Hyperscan returned fatal error -1.
System log.
Nov 27 08:35:42 kernel ix0: promiscuous mode disabled
That is really puzzling. It is very hard to pin down what the root cause of this might be. Are the rules different on the interfaces with no issue compared to the interfaces that are crashing?
-
@kiokoman said in Suricata process dying due to hyperscan problem:
106211 - Suricata-Main] 2023-11-27 13:12:52 Notice: threads: Threads created -> W: 1 FM: 1 FR: 1 Engine started.
[863533 - W#01-vmx2] 2023-11-27 13:12:53 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[863533 - W#01-vmx2] 2023-11-27 13:13:00 Error: spm-hs: Hyperscan returned fatal error -1.
elfctl did not help for me
Hmm...I was sort of afraid that might be the result. Another user tried it and it seemed to work very briefly, but then a crash. The random nature of this bug is frustrating. It's happening with different physical interfaces, it happens immediately for some users (they can't even start an interface), but for other users it happens at random points during a long runtime.
@kiokoman said in Suricata process dying due to hyperscan problem:
how about vectorscan? there is plan for it?
Vectorscan is something Suricata upstream would have to incorporate into the binary. All we do on the pfSense side is take the upstream source code for the binary and add the custom blocking plugin for Legacy Blocking Mode.
I'm also unsure at this point what the support level is in Vectorscan for Intel devices. It was first developed to bring hyperscan-like technology to ARM and other non-Intel CPUs.
-
@chrysmon said in Suricata process dying due to hyperscan problem:
@jowe78 I have Suricata in IPS mode on the WAN interface. It crashed about once a day with the Hyperscan pattern matcher. Yesterday I switched to AC-KS mode and it crashed after half a day of running. There is no error in the logs.
Now I'm switching to AC-BS mode and will keep you updated.
There is no error in any log? Always check BOTH the pfSense system log under STATUS > SYSTEM LOGS and the suricata.log under the LOGS VIEW tab in the Suricata GUI.
Different things are going to be logged in each. For example, if Suricata hard crashes, it can't log anything into suricata.log about the crash because the binary died suddenly. But the pfSense operating system will see the binary crash and log information about it in the pfSense system log.
-
@bmeeks I wrote it explicitly because it was unusual: in system.log the last entry about Suricata was a detection log. Nothing about a crash.
Mine is still running with the AC-BS Matcher Algorithm. I even did a manual update, which was successful.
-
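One thing that may be worth checking when switching matcher algorithms: the error lines in this thread are tagged spm-hs, and upstream suricata.yaml has separate mpm-algo (multi-pattern) and spm-algo (single-pattern) keys. Whether the GUI's Matcher Algorithm setting changes both is not established in this thread. The sketch below just prints what a generated suricata.yaml contains; the function name is hypothetical and the path in the usage line is an assumption about where pfSense writes the file.

```shell
#!/bin/sh
# Hypothetical helper: print the pattern-matcher settings from suricata.yaml.
# mpm-algo = multi-pattern matcher, spm-algo = single-pattern matcher.
show_matcher_algos() {
    grep -E '^[[:space:]]*(mpm|spm)-algo:' "$@"
}

# Example usage (path is an assumption, adjust to your system):
#   show_matcher_algos /usr/local/etc/suricata/suricata_*/suricata.yaml
```

If spm-algo still resolves to Hyperscan after the switch, the spm-hs code path would remain in use even though the multi-pattern matcher has changed.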
@bmeeks said in Suricata process dying due to hyperscan problem:
That is really puzzling. It is very hard to pin down what the root cause of this might be. Are the rules different on the interfaces with no issue compared to the interfaces that are crashing?
I have all rulesets applied to all interfaces, but not all rules enabled. There are exceptions: some rules are disabled to ensure functionality. So there are differences between the interfaces.
I will start fresh on one of the interfaces to see how it works.
-
@bmeeks in my case, no (configured) variance between interfaces.
I recently removed Suricata, including all configuration, caches, logs, etc., and installed fresh. Created the first interface, then copied it to create the second. The first interface seems to be stable, but the second will die fairly shortly after start, due to the aforementioned hyperscan problem.