Netgate Discussion Forum

    Suricata process dying due to hyperscan problem

    IDS/IPS
    295 Posts 25 Posters 88.0k Views
    • kiokomanK
      kiokoman LAYER 8 @bmeeks
      last edited by

      @bmeeks
      8vcpu
      16gb ram
      Increasing the stream memory cap up to 2,147,483,648 didn't help.

      ̿' ̿'\̵͇̿̿\з=(◕_◕)=ε/̵͇̿̿/'̿'̿ ̿
      Please do not use chat/PM to ask for help
      we must focus on silencing this @guest character. we must make up lies and alter the copyrights !
      Don't forget to Upvote with the 👍 button for any post you find to be helpful.

      • bmeeksB
        bmeeks @sgnoc
        last edited by

        @sgnoc said in Suricata process dying due to hyperscan problem:

        2023-11-23 19:14:37.481228-05:00 kernel - pid 4814 (suricata), jid 0, uid 0: exited on signal 10 (core dumped)

        Signal 10 is a bus error, which is normally associated with ARM-based hardware. What kind of machine are you running Suricata on? A bus error most commonly results from a non-aligned memory access, and that really can't happen on anything but ARM hardware these days.

        • S
          sgnoc @bmeeks
          last edited by

          @bmeeks I'm using a Netgate XG-7100-U, which has an Intel x64 processor, and I have 24 GB of RAM installed. Only the one interface on Suricata has had that error, so I wouldn't think failing memory is the cause; otherwise other services should be having issues too, I would think?

          • bmeeksB
            bmeeks @sgnoc
            last edited by bmeeks

            @sgnoc said in Suricata process dying due to hyperscan problem:

            This triggered right after the Emerging Threats rules updated and the interface rules reloaded.

            By my calculations using the log timestamps, Suricata finished the rules update and ran for 41 minutes before crashing, so "right after the rules update" is not entirely correct.

            2023-11-23 19:14:37.481228-05:00 	kernel 	- 	pid 4814 (suricata), jid 0, uid 0: exited on signal 10 (core dumped)
            2023-11-23 18:33:08.993276-05:00 	php-cgi 	81475 	[Suricata] The Rules update has finished.
            

            Rules update completed at 18:33:08. That crash happened at 19:14:37, or 41 minutes later.
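
The arithmetic behind that 41-minute figure can be checked with a small shell sketch. The helper names `to_seconds` and `elapsed_minutes` are made up for illustration, and the sketch assumes both timestamps fall on the same day.

```shell
# Convert an HH:MM:SS wall-clock time to seconds since midnight.
# Fractional seconds, if present, are truncated by int().
to_seconds() {
    echo "$1" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + int($3) }'
}

# Whole minutes between two same-day timestamps (start, end).
elapsed_minutes() {
    start=$(to_seconds "$1")
    end=$(to_seconds "$2")
    echo $(( (end - start) / 60 ))
}

# Timestamps taken from the two log lines quoted above.
elapsed_minutes 18:33:08 19:14:37
```

The difference is 2489 seconds, which truncates to 41 whole minutes.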

            Other helpful information the next time this happens would be the content of the suricata.log file around the same time interval. You would need to capture that log BEFORE you restarted Suricata because that log is wiped clean each time Suricata is started or restarted in the GUI.
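
One way to preserve that log before a restart is to copy it to a timestamped file first. This is only a sketch: `preserve_log` is a hypothetical helper, and the log path in the example comment is an assumption about where the package writes per-interface logs, so check the actual directory name on your system.

```shell
# Copy a log file to a timestamped name so a restart cannot wipe it.
# Usage: preserve_log <log_file> <destination_dir>
# Prints the path of the saved copy on success.
preserve_log() {
    log_file=$1
    dest_dir=$2
    [ -f "$log_file" ] || { echo "no such log: $log_file" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    copy="$dest_dir/$(basename "$log_file").$(date +%Y%m%d-%H%M%S)"
    cp -p "$log_file" "$copy" && echo "$copy"
}

# Example (the interface directory name below is hypothetical):
# preserve_log /var/log/suricata/suricata_ix012345/suricata.log /root/suricata-crash
```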

            • S
              sgnoc @bmeeks
              last edited by

              @bmeeks These are the only entries available in the suricata.log file, and by "immediately" I meant that it was the next log entry in line. There was nothing else before the core dump other than the rules reloading. It has not yet occurred again, so hopefully it is an isolated incident and won't occur again.

              • bmeeksB
                bmeeks
                last edited by

                For those of you having the Signal 11 or Signal 10 crashes, it would perhaps be useful if you can submit the core dump backtrace.

                The command to execute at a shell prompt is:

                gdb /usr/local/bin/suricata /root/suricata.core
                

                Then execute these commands within the gdb prompt:

                (gdb) bt
                (gdb) bt full
                (gdb) info threads
                (gdb) thread apply all bt
                (gdb) thread apply all bt full
                

                Capture the output of those commands and post it back here.
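
If capturing an interactive session is awkward, the same commands can probably be run in one pass with gdb's batch mode. The sketch below assumes gdb is installed and that the core file is where the earlier command expects it; `collect_backtrace` and the output file name are hypothetical.

```shell
# Run the gdb commands listed above non-interactively and save the output.
# Usage: collect_backtrace <binary> <core_file> <output_file>
# GDB may be overridden in the environment; it defaults to "gdb".
collect_backtrace() {
    "${GDB:-gdb}" -batch \
        -ex "bt" \
        -ex "bt full" \
        -ex "info threads" \
        -ex "thread apply all bt" \
        -ex "thread apply all bt full" \
        "$1" "$2" > "$3" 2>&1
}

# Example, using the paths from the post above:
# collect_backtrace /usr/local/bin/suricata /root/suricata.core /root/suricata-bt.txt
```

The resulting text file can then be attached to a reply in one piece.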

                • bmeeksB
                  bmeeks @sgnoc
                  last edited by

                  @sgnoc said in Suricata process dying due to hyperscan problem:

                  It has not yet occurred again, so hopefully it is an isolated incident and won't occur again.

                  No, I don't think that is a true statement. It should never have occurred in the first place. The fact it did indicates there is a problem, and so it will happen again. It's only the "when" that is unknown.

                  • S
                    sgnoc @bmeeks
                    last edited by

                    @bmeeks I know it isn't likely, but I can still be hopeful. I'll run the core dump commands the next time it crashes so I can provide the output. Thanks for your help!

                    • kiokomanK
                      kiokoman LAYER 8 @sgnoc
                      last edited by

                      After the last error I decided to uninstall everything and reconfigure from scratch.
                      Maybe some configuration didn't migrate correctly.
                      Now I'm unable to reproduce the error at start.

                      • bmeeksB
                        bmeeks @kiokoman
                        last edited by

                        @kiokoman said in Suricata process dying due to hyperscan problem:

                        After the last error i decided to uninstall everything and reconfigure from scratch
                        maybe some configuration didn't migrate correctly
                        now i'm unable to reproduce the error at start

                        This has been the experience of a few other users as well all the way back to the original release of 7.x Suricata in pfSense. That's what makes this such a maddeningly difficult thing to debug 🤔.

                        • bmeeksB
                          bmeeks
                          last edited by

                          I am continuing to look into this issue. Just sent a new batch of emails to the Suricata development team with questions about some recent changes in this area of the Suricata binary's code.

                          Still would be nice if I could reliably reproduce this in my test machines with a debug image running.

                          • bmeeksB
                            bmeeks
                            last edited by bmeeks

                            Attention Users hitting the Suricata Hyperscan problem (or other mysterious Suricata stoppages):

                            To help in pinning down what this problem is, please collect the following information for me when you experience the crash and include it in your post or feedback.

                            1. Are you seeing a Signal 11 or Signal 10 error fault logged in the pfSense system log (under STATUS > SYSTEM LOGS) around the time Suricata crashed? If so, include those log entries in your report.

                            2. Before attempting to restart Suricata after finding it stopped or crashed, examine the suricata.log for the interface under the LOGS VIEW tab in the Suricata GUI. Examine that log for any errors mentioning "hyperscan". Include those in your report.

                            I am trying to determine if a Signal 11 or Signal 10 happens each time Suricata crashes, or if Suricata is sometimes just stopping on its own when it encounters an internal hyperscan error.

                            Please provide the information requested above when posting about this issue. It is not helpful at all to simply create a reply saying "I'm having this problem, too" with no additional helpful information.

                            And at this time there is no indication at all the hyperscan crash issue is related to the Legacy Blocking Mode bug shared with Snort. That bug has, I'm fairly confident, been fixed. I think the issue in this thread is something different.
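
For anyone with shell access, the two checks requested above can be sketched as a pair of grep helpers. The function names are hypothetical, and both paths in the example comments are assumptions about where the system log and the per-interface suricata.log live; adjust them to your system.

```shell
# Print kernel lines reporting that suricata exited on signal 10 or 11.
# Usage: find_signal_exits <system_log>
find_signal_exits() {
    grep -E '\(suricata\).*exited on signal (10|11)' "$1"
}

# Print suricata.log lines that mention hyperscan (case-insensitive).
# Usage: find_hyperscan_errors <suricata_log>
find_hyperscan_errors() {
    grep -i 'hyperscan' "$1"
}

# Example (both paths are assumptions; adjust to your system):
# find_signal_exits /var/log/system.log
# find_hyperscan_errors /var/log/suricata/suricata_ix012345/suricata.log
```

Run the second check before restarting Suricata, since the log is cleared on restart.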

                            • S
                              SteveITS Galactic Empire @kiokoman
                              last edited by

                              @kiokoman Do you have a backup config 1) from before upgrading, 2) from when it wasn't working, and 3) from after rebuilding? It might be interesting to compare the Suricata section to see if anything is different across those.

                              (I usually save one just before upgrading and immediately after)
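
A rough way to do that comparison from a shell is to extract the Suricata block from each backup and diff the results. This sketch assumes the package settings sit inside a `<suricata>` element in the config backup, which should be confirmed against your own file; the function names and file names are hypothetical.

```shell
# Print only the <suricata>...</suricata> block of a config backup.
extract_suricata_section() {
    sed -n '/<suricata>/,/<\/suricata>/p' "$1"
}

# Show a unified diff of the Suricata sections of two backups.
# Exit status follows diff: 0 = identical, 1 = different.
compare_suricata_sections() {
    old_tmp=$(mktemp) && new_tmp=$(mktemp) || return 2
    extract_suricata_section "$1" > "$old_tmp"
    extract_suricata_section "$2" > "$new_tmp"
    diff -u "$old_tmp" "$new_tmp"
    status=$?
    rm -f "$old_tmp" "$new_tmp"
    return $status
}

# Example (file names are hypothetical):
# compare_suricata_sections config-before-upgrade.xml config-after-rebuild.xml
```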

                              Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                              When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
                              Upvote 👍 helpful posts!

                              • kiokomanK
                                kiokoman LAYER 8 @SteveITS
                                last edited by kiokoman

                                @SteveITS
                                i have the backup history,
                                the only difference after reconfiguration was

                                old not working config:

                                <stream_bypass>off</stream_bypass>
                                <stream_drop_invalid>off</stream_drop_invalid>

                                vs
                                new config
                                <stream_bypass>no</stream_bypass>
                                <stream_drop_invalid>no</stream_drop_invalid>

                                everything else is the same but i don't have the old generated suricata.yaml

                                Anyway, I have a new problem now. Before, it was not even starting; now I get this after some hours, and only on one interface (I have Suricata running on WAN and LAN; WAN (vmx1) is still running OK):

                                [843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: running in 'auto' checksum mode. Detection of interface state will require 1000 packets
                                [843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: snaplen set to 1518
                                [100515 - Suricata-Main] 2023-11-25 12:03:58 Notice: threads: Threads created -> W: 1 FM: 1 FR: 1   Engine started.
                                [843086 - W#01-vmx2] 2023-11-25 12:03:59 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
                                [843086 - W#01-vmx2] 2023-11-25 12:05:02 Error: spm-hs: Hyperscan returned fatal error -1.

                                • bmeeksB
                                  bmeeks @kiokoman
                                  last edited by bmeeks

                                  @kiokoman said in Suricata process dying due to hyperscan problem:

                                  @SteveITS
                                  i have the backup history,
                                  the only difference after reconfiguration was

                                  old not working config:

                                  <stream_bypass>off</stream_bypass>
                                  <stream_drop_invalid>off</stream_drop_invalid>

                                  vs
                                  new config
                                  <stream_bypass>no</stream_bypass>
                                  <stream_drop_invalid>no</stream_drop_invalid>

                                  everything else is the same but i don't have the old generated suricata.yaml

                                  anyway i have a new problem now, before it was not even starting, now i have this after some hours, and only on one interface (i have suricata running on wan and lan, wan(vmx1) is still running ok)

                                  [843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: running in 'auto' checksum mode. Detection of interface state will require 1000 packets
                                  [843086 - W#01-vmx2] 2023-11-25 12:03:58 Info: pcap: vmx2: snaplen set to 1518
                                  [100515 - Suricata-Main] 2023-11-25 12:03:58 Notice: threads: Threads created -> W: 1 FM: 1 FR: 1   Engine started.
                                  [843086 - W#01-vmx2] 2023-11-25 12:03:59 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
                                  [843086 - W#01-vmx2] 2023-11-25 12:05:02 Error: spm-hs: Hyperscan returned fatal error -1.
                                  

                                  Those small differences in Boolean values from the config.xml file would not be a factor here. Something is most likely wrong within the Suricata binary itself, but I don't know where, nor do I know for certain that this is true.

                                  I've had a virtual machine running for 36 hours, with every single ET Open rule enabled and the Snort IPS Connectivity Policy enabled, and have not seen a crash yet. So, this is a strange problem. To positively identify it is going to require being able to reproduce it easily. Then a debugging version of Suricata can be executed and the precise failure point identified. But so far I cannot reproduce the problem. And even in @kiokoman's case, the problem disappeared for a time and then recurred later under different circumstances (running versus starting up).

                                  There were some upstream changes in the HyperScan portions of Suricata code starting with version 7.0.1. Those were to work around some problems introduced by a behavior change upstream made by Intel in the HyperScan library itself. I've been communicating with the Suricata developer team, and they are pretty confident the fixes they made are sufficient. Nobody on Linux seems to be having a problem. The vast majority of Suricata users are on Linux derivatives. Very few users are on FreeBSD, mostly just the pfSense and OPNsense users. I'm not seeing this problem reported on the OPNsense forum, but they are still running the 6.0.x branch of Suricata and not the new 7.x branch.

                                  • bmeeksB
                                    bmeeks @kiokoman
                                    last edited by bmeeks

                                    @kiokoman
                                    If you are willing, please try the following workaround for me.

                                    The command below will disable ASLR (address space layout randomization) for the Suricata binary.

                                    Execute this from a shell prompt after first stopping all Suricata instances.

                                    1. This will disable ASLR for the Suricata binary:
                                    # elfctl -e +noaslr /usr/local/bin/suricata
                                    

                                    Each time you make a change above, stop the Suricata processes, make the change, then restart the processes. The change above is not dynamic. It only sets the "turned on/turned off" flag when loading the target binary.

                                    This is a shot-in-the-dark based on my theory that perhaps ASLR is tripping up either the HyperScan library or Suricata. I remember unbound had an issue with ASLR a few versions back, and the temp workaround until upstream fixed the underlying problem in the code was to disable ASLR for the unbound binary.

                                    To reset this back to the default, execute the same command but with a minus ("-") instead of plus ("+"). An example is below:

                                    # elfctl -e -noaslr /usr/local/bin/suricata
                                    

                                    Please report back if you try this and let me know if it helps.
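
The toggle can be wrapped in a small helper so the two commands are harder to mix up. This is a sketch only: `set_noaslr` and the `DRY_RUN` variable are hypothetical names, and the real command only exists on FreeBSD-based systems. Stop all Suricata instances before changing the flag and restart them afterwards, as described above.

```shell
# Set or clear the noaslr flag on a binary with elfctl (FreeBSD).
# Usage: set_noaslr on|off <binary>
# With DRY_RUN=1 the command is printed instead of executed.
set_noaslr() {
    case $1 in
        on)  flag=+noaslr ;;
        off) flag=-noaslr ;;
        *)   echo "usage: set_noaslr on|off <binary>" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "elfctl -e $flag $2"
    else
        elfctl -e "$flag" "$2"
    fi
}

# Example: preview the command without touching the binary.
# DRY_RUN=1 set_noaslr on /usr/local/bin/suricata
```

Running `elfctl /usr/local/bin/suricata` without `-e` should list the current feature flags, which is a quick way to confirm the change took effect.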

                                    • M
                                      masons @bmeeks
                                      last edited by

                                      @bmeeks,

                                      Symptom: WAN Suricata instance works just fine, but the PC (one of several LAN side interfaces) interface instance dumps core immediately after starting with Signal 11.

                                      Nov 25 14:43:06 	kernel 		pid 36387 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)
                                      Nov 25 14:43:06 	php 	94371 	[Suricata] Suricata START for PC(vtnet0.700)...
                                      Nov 25 14:43:05 	php 	94371 	[Suricata] Building new sid-msg.map file for PC...
                                      Nov 25 14:43:05 	php 	94371 	[Suricata] Enabling any flowbit-required rules for: PC...
                                      Nov 25 14:43:05 	php 	94371 	[Suricata] Updating rules configuration for: PC ...
                                      Nov 25 14:43:05 	php 	94371 	[Suricata] Building new sid-msg.map file for WAN...
                                      Nov 25 14:43:05 	php 	94371 	[Suricata] Enabling any flowbit-required rules for: WAN...
                                      Nov 25 14:43:04 	php 	94371 	[Suricata] Updating rules configuration for: WAN ...
                                      Nov 25 14:43:04 	php-fpm 	64493 	Starting Suricata on PC(vtnet0.700) per user request... 
                                      

                                      The Suricata log for the PC interface does not contain any reference to hyperscan.

                                      I tried the ASLR changes that you suggested. The first one didn't appear to work.

                                      elfctl  -e +noaslr /usr/local/lib/libhs.so.5.4.0
                                      elfctl: NT_FREEBSD_FEATURE_CTL note not found
                                      elfctl: NT_FREEBSD_FEATURE_CTL note not found
                                      

                                      The second one, for the Suricata binary, did work. Now when I start Suricata, both instances start and so far they appear to stay running. However, if I shut down Suricata I see the Signal 10 and core dump.

                                      Nov 25 15:26:11 	kernel 		pid 22945 (suricata), jid 0, uid 0: exited on signal 10 (core dumped)
                                      Nov 25 15:26:10 	kernel 		vtnet0.700: promiscuous mode disabled
                                      Nov 25 15:26:10 	kernel 		vtnet0: promiscuous mode disabled
                                      Nov 25 15:26:09 	SuricataStartup 	93534 	Suricata STOP for PC(23822_vtnet0.700)...
                                      Nov 25 15:26:08 	kernel 		vtnet1: promiscuous mode disabled
                                      Nov 25 15:26:06 	SuricataStartup 	79721 	Suricata STOP for WAN(65037_vtnet1)... 
                                      
                                      • M
                                        masons @masons
                                        last edited by masons

                                        @bmeeks,

                                        It appears I spoke too soon. Now my WAN interface instance of Suricata is dumping core and the PC one is staying up. The WAN interface suricata.log file does include the Hyperscan log entry that you're chasing. As you can see from the logs, the instance ran for about 18 minutes before dumping core and reporting the Hyperscan error.

                                        [214708 - RX#01-vtnet1] 2023-11-25 15:27:25 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
                                        [214710 - W#02] 2023-11-25 15:45:18 Error: spm-hs: Hyperscan returned fatal error -1.
                                        • bmeeksB
                                          bmeeks @masons
                                          last edited by

                                          @masons said in Suricata process dying due to hyperscan problem:

                                          elfctl -e +noaslr /usr/local/lib/libhs.so.5.4.0
                                          elfctl: NT_FREEBSD_FEATURE_CTL note not found
                                          elfctl: NT_FREEBSD_FEATURE_CTL note not found

                                          Oops, looks like the library does not offer that option. I'll update my previous post to remove attempting to disable ASLR in the library. But it does work for Suricata (at least disabling ASLR, that is).

                                          • jowe78J
                                            jowe78
                                            last edited by jowe78

                                            Hello,

                                            I'm having the same problem.

                                            I have 6 interfaces set up with Suricata, and only 2 of them stop randomly:
                                            One using IX0 on WAN
                                            And the other one using VLAN on IX1
                                            device = '82599ES 10-Gigabit SFI/SFP+ Network Connection'
                                            Using IPS Mode - Legacy Mode

                                            I deleted one of the monitored interfaces in Suricata that was having the issue and recreated it by duplicating a working one, but got the same error on the new (same as before) interface. I also tried disabling some of the working ones, but nothing changed.

                                            Suricata log
                                            [607907 - W#02] 2023-11-27 08:35:42 Error: spm-hs: Hyperscan returned fatal error -1.

                                            System log.
                                            Nov 27 08:35:42 kernel ix0: promiscuous mode disabled

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.