Netgate Discussion Forum

    Suricata process dying due to hyperscan problem

    IDS/IPS
    • tylereversT
      tylerevers @bmeeks

      @bmeeks said in Suricata process dying due to hyperscan problem:

      My pull request containing the anticipated fix for this Hyperscan error has been merged. An updated Suricata package has built and should appear as an available update for 2.7.2 CE and 23.09.1 Plus users.

      Look for an update to version 7.0.2_2 for the Suricata package. When installed, the new package should pull in version 7.0.2_5 of the Suricata binary.

      Fingers crossed this fixes the Hyperscan issue. But as I mentioned previously, since I could never reproduce the error in my small test environment, I can't say with 100% certainty the bug I found and fixed is the actual Hyperscan culprit.

      Nearly 20 hours since updating to 7.0.2_2 on 23.09.1 Plus with custom bare metal setup and no Hyperscan crash yet. Pattern Match set to AUTO and Blocking Mode ENABLED. Using all VLANs that traverse a LAGG in my case just as a reminder.

      Thanks, Bill!

      • bmeeksB
        bmeeks @tylerevers

        That sounds encouraging. There definitely was a problem in the custom blocking plugin's code, but perhaps there are two different issues happening in this thread.

        Some users are seeing the Hyperscan error, but others do not see that error and instead are getting Signal 11 segfaults from Suricata.

        • S
          sgnoc @bmeeks

          @bmeeks I'm over 18 hours updated (7.0.2_2 on 23.09.1) and still running stable with blocking enabled and pattern matcher set to Auto. No issues and everything seems stable to this point.

          Previously I was only able to make it maybe 20 minutes before I got the Hyperscan error. I definitely think the update has fixed one of the potential causes for these errors.

          Thanks for your support!

          • M
            Maltz @bmeeks

            @bmeeks I agree there may be two different things going on. My problem is (and continues to be under 7.0.2_2) that the kernel kills Suricata with a "failed to reclaim memory" error. But that isn't limited to Hyperscan - it also happens using AC and AC-KS. Only AC-BS stays running for more than a few minutes. There's nothing in the Suricata log that catches my eye, since the process is killed by the kernel.

            • S
              sgnoc

              I spoke a little too soon. It appears the fix only patched one error, the one that was killing the interfaces sooner. Another problem is still killing them; they just last noticeably longer before the error.

              Before the update, my WAN interface and one LAN interface would fail within about 20 minutes. Now my WAN lasted about 19 to 20 hours, and the LAN interface that was crashing is still running for now.

              I'll work on getting you the full details from the core dump as soon as I can get back to my computer to see if any of it has changed.

              [102796 - W#03] 2023-12-12 11:52:35 Error: spm-hs: Hyperscan returned fatal error -1.
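
For anyone who wants to pull these lines out of a long suricata.log, here is a minimal Python sketch. The regular expression is modelled only on the single line quoted above, so the field layout (PID, thread name, timestamp, module) is an assumption, and `find_hyperscan_errors` is a hypothetical helper name, not part of Suricata or the pfSense package.

```python
import re

# Pattern modelled on the one log line quoted in this post; other
# Suricata log formats may differ, so treat the layout as an assumption.
FATAL_RE = re.compile(
    r"^\[(?P<pid>\d+) - (?P<thread>[^\]]+)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Error: (?P<module>[\w-]+): Hyperscan returned fatal error (?P<code>-?\d+)\."
)


def find_hyperscan_errors(lines):
    """Return one dict per line that reports a fatal Hyperscan error."""
    hits = []
    for line in lines:
        m = FATAL_RE.match(line.strip())
        if m:
            hits.append(m.groupdict())
    return hits


sample = [
    "[102796 - W#03] 2023-12-12 11:52:35 Error: spm-hs: Hyperscan returned fatal error -1.",
]
print(find_hyperscan_errors(sample))
```

Feeding it the lines of each interface's suricata.log gives the timestamp and worker thread of every occurrence, which makes it easier to compare how long each interface ran before failing.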
              
              • M
                masons @bmeeks

                @bmeeks

                Environment

                • Running pfSense CE 2.7.2 with Suricata package 7.0.2_2 and Suricata binary 7.0.2_5
                • No changes to Suricata ASLR
                • Pattern matcher = Auto
                • Legacy blocking mode = Enabled
                • Multiple VLANs on the LAN interface, but Suricata is only running on a single VLAN (interface is called PC)

                Reproducing the issue (with logs)

                When I start the Suricata service the WAN interface starts and continues to run without issue, but the PC interface dies immediately. I do not see the Hyperscan error in the Suricata logs. This is 100% reproducible with this VM.
                WAN (works) Suricata log - https://pastebin.com/qRRa2P48
                PC (crashes) Suricata log - https://pastebin.com/FNcRQnhU

                System log excerpt showing that the PC Suricata instance dumps core.

                Dec 12 10:05:51 	kernel 		pid 10455 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)
                Dec 12 10:05:50 	php 	3903 	[Suricata] Suricata START for PC(vtnet0.700)...
                Dec 12 10:05:50 	php 	3903 	[Suricata] Building new sid-msg.map file for PC...
                Dec 12 10:05:50 	php 	3903 	[Suricata] Enabling any flowbit-required rules for: PC...
                Dec 12 10:05:50 	php 	3903 	[Suricata] Updating rules configuration for: PC ...
                Dec 12 10:05:49 	php 	3903 	[Suricata] Building new sid-msg.map file for WAN...
                Dec 12 10:05:49 	php 	3903 	[Suricata] Enabling any flowbit-required rules for: WAN...
                Dec 12 10:05:49 	php 	3903 	[Suricata] Updating rules configuration for: WAN ...
                Dec 12 10:05:49 	php-fpm 	13080 	Starting Suricata on PC(vtnet0.700) per user request... 
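
To spot this kind of crash in a longer system log, a small Python sketch like the following may help. Signal 11 is SIGSEGV, a segmentation fault. The pattern is modelled on the kernel line in the excerpt above and is an assumption about the general message format; `parse_signal_exit` is a hypothetical helper name.

```python
import re
import signal

# Pattern modelled on the kernel line in the excerpt above (an assumption
# about the general format, not a documented contract).
EXIT_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<proc>[^)]+)\), jid \d+, uid \d+: "
    r"exited on signal (?P<sig>\d+)(?P<core> \(core dumped\))?"
)


def parse_signal_exit(line):
    """Return details of an 'exited on signal N' kernel message, or None."""
    m = EXIT_RE.search(line)
    if m is None:
        return None
    signum = int(m.group("sig"))
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = "unknown"
    return {
        "pid": int(m.group("pid")),
        "process": m.group("proc"),
        "signal": signum,
        "signal_name": name,
        "core_dumped": m.group("core") is not None,
    }


line = ("Dec 12 10:05:51 kernel pid 10455 (suricata), jid 0, uid 0: "
        "exited on signal 11 (core dumped)")
print(parse_signal_exit(line))
```

Lines that do not match, such as the php START messages, simply return None, so the function can be mapped over the whole log.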
                

                Workaround

                This workaround does not require changing the pattern-matcher or disabling the legacy blocking mode. It works consistently across multiple hosts.

                • Stop the Suricata service
                • Go to Diagnostics --> Command Prompt
                • Execute elfctl -e +noaslr /usr/local/bin/suricata
                • Start the Suricata service
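
The elfctl step above can also be scripted. Below is a minimal Python sketch that only builds and prints the command by default. The `+noaslr` form is the one used in this post; the `-noaslr` form for reverting is my reading of the elfctl(1) manual page and is not shown in this thread, so verify it before relying on it. The helper names `elfctl_args` and `run_elfctl` are hypothetical, and the sketch does not stop or start the Suricata service.

```python
import shutil
import subprocess

SURICATA_BIN = "/usr/local/bin/suricata"  # path taken from the step above


def elfctl_args(binary=SURICATA_BIN, disable_aslr=True):
    """Build the elfctl argv that sets (+) or clears (-) the noaslr flag."""
    flag = "+noaslr" if disable_aslr else "-noaslr"
    return ["elfctl", "-e", flag, binary]


def run_elfctl(args, dry_run=True):
    """Print the command; run it only if dry_run is False and elfctl exists."""
    print(" ".join(args))
    if dry_run or shutil.which("elfctl") is None:
        return None
    return subprocess.run(args, check=True)


run_elfctl(elfctl_args())                    # the workaround
run_elfctl(elfctl_args(disable_aslr=False))  # assumed way to revert it
```

Keeping the default at dry-run means the script is safe to try on a workstation first; pass `dry_run=False` on the firewall itself only while Suricata is stopped.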

                In my case both interfaces start and continue to run without further crashes.

                If you compare the failing PC interface suricata.log file with the working suricata.log file, you can see where the process dumps core:
                PC (crashes) Suricata log - https://pastebin.com/FNcRQnhU
                PC (working) Suricata log - https://pastebin.com/AE469T7m

                The crashing instance fails immediately after attempting to parse a rule that it doesn't like. The working instance still sees that error, but continues to run.

                This system log excerpt shows that both interfaces start correctly

                Dec 12 10:58:05 	kernel 		vtnet0.700: promiscuous mode enabled
                Dec 12 10:58:05 	kernel 		vtnet0: promiscuous mode enabled
                Dec 12 10:58:00 	kernel 		vtnet1: promiscuous mode enabled
                Dec 12 10:57:36 	SuricataStartup 	66406 	Suricata START for PC(23822_vtnet0.700)...
                Dec 12 10:57:35 	SuricataStartup 	65014 	Suricata START for WAN(65037_vtnet1)...
                Dec 12 10:57:08 	SuricataStartup 	98203 	Suricata STOP for PC(23822_vtnet0.700)... 
                

                Next steps

                I'm going to try removing the failing rule and then try starting up Suricata without the ASLR mitigation. I'll report back what I find.

                • bmeeksB
                  bmeeks @masons

                  This is very intriguing data. Thank you for the research and for posting the results. This sort of jibes with my original hypothesis that ASLR may be involved here. One of the Netgate kernel developers did not think it was, because the currently documented ASLR bug is in the address sanitizer piece of the LLVM compiler, and he said that was unlikely to be used outside of debug builds. The documentation for the sanitizer says it results in about a 2x slowdown in execution.

                  I also now doubt the documented address sanitizer bug in LLVM is the likely cause, but your testing seems to imply that ASLR is at fault in some manner with this bug. However, other users experiencing the bug have tried disabling ASLR (as you did) and did not see any change in behavior.

                  • M
                    masons @bmeeks

                    @bmeeks

                    I removed the offending rule (SID 26470), removed the ASLR change and restarted Suricata. The PC interface Suricata instance immediately dumps core with Signal 11 again.

                    Stopping the Suricata service, making the ASLR change and restarting Suricata, results in the PC interface Suricata instance coming up and staying up.

                    At least for me, across several VMs, this is very consistent behavior.

                    • bmeeksB
                      bmeeks @masons

                      I was about to test specifically with that offending rule enabled, but your test results suggest that is a moot point (meaning not the actual cause). I have no proof, but ASLR is definitely a suspect in my mind (at least for the Signal 11 segfault issue). Apparently it does little to help with the "Hyperscan returned fatal error -1" issue, though.

                      • S
                        SteveITS Galactic Empire @Maltz

                        @Maltz said in Suricata process dying due to hyperscan problem:

                        kernel kills Suricata with a "failed to reclaim memory" error

                        I didn't reread the now-long thread, but did you post your memory usage with Suricata running?

                        ZFS is supposed to give up cache RAM but can be tuned to reduce usage:
                        https://docs.netgate.com/pfsense/en/latest/hardware/tune-zfs.html
                        "The default maximum ARC size (vfs.zfs.arc.max) is automatic (0) and uses 1/2 RAM or the total RAM minus 1GB, whichever is greater."

                        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
                        Upvote 👍 helpful posts!

                        • M
                          Maltz @SteveITS

                          @SteveITS said in Suricata process dying due to hyperscan problem:

                          I didn't reread the now-long thread, but did you post your memory usage with Suricata running?

                          It's "28% of 3388 MiB" (4GB Netgate 2100) right now. With any algorithm other than AC-BS, RAM usage ramps up a few minutes after Suricata starts then the kernel kills it.
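
To put that figure next to the ARC default quoted earlier in the thread, here is a rough Python sketch. It only applies the quoted rule (half of RAM, or RAM minus 1 GB, whichever is greater) to the 3388 MiB reported here. Treating 1 GB as 1024 MiB is an assumption, as is the premise that this box uses ZFS at all; the function names are hypothetical.

```python
def default_arc_max_mib(total_ram_mib):
    """ARC ceiling per the quoted rule: max(RAM / 2, RAM - 1 GB).

    Assumes 1 GB means 1024 MiB.
    """
    return max(total_ram_mib // 2, total_ram_mib - 1024)


def used_mib(total_ram_mib, percent_used):
    """Convert a 'NN% of TOTAL MiB' dashboard reading to MiB."""
    return round(total_ram_mib * percent_used / 100)


total = 3388  # MiB, as reported in this post
print("default ARC ceiling:", default_arc_max_mib(total), "MiB")
print("currently used:", used_mib(total, 28), "MiB")
```

Under those assumptions the ARC would be allowed to grow to 2364 MiB, while the idle reading of 28% is about 949 MiB. That does not show the ARC is responsible, only how much room the default leaves on a box of this size.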

                          • tylereversT
                            tylerevers @tylerevers

                            At roughly the 28-hour mark, the Suricata Interface failed with the Hyperscan issue again.

                            • kiokomanK
                              kiokoman LAYER 8 @bmeeks

                              @bmeeks
                              I have removed all the rules from an interface, but the Hyperscan error is still there after a few moments for me.
                              +noaslr is still doing nothing.
                              Any chance you can provide the dbg pkg of Suricata?

                              ̿' ̿'\̵͇̿̿\з=(◕_◕)=ε/̵͇̿̿/'̿'̿ ̿
                              Please do not use chat/PM to ask for help
                              we must focus on silencing this @guest character. we must make up lies and alter the copyrights !
                              Don't forget to Upvote with the 👍 button for any post you find to be helpful.

                              • M
                                michmoor LAYER 8 Rebel Alliance @bmeeks

                                @bmeeks
                                For now, and maybe going forward as a permanent solution, can we just have the package updated to use AC-CS as the default, with a note stating to avoid Hyperscan for its inconsistent performance or something along those lines?

                                Firewall: NetGate,Palo Alto-VM,Juniper SRX
                                Routing: Juniper, Arista, Cisco
                                Switching: Juniper, Arista, Cisco
                                Wireless: Unifi, Aruba IAP
                                JNCIP,CCNP Enterprise

                                • bmeeksB
                                  bmeeks @kiokoman

                                  @kiokoman said in Suricata process dying due to hyperscan problem:

                                  @bmeeks
                                  I have removed all the rules from an interface, but the Hyperscan error is still there after a few moments for me.
                                  +noaslr is still doing nothing.
                                  Any chance you can provide the dbg pkg of Suricata?

                                  Not at the moment. I'm trying to reconstruct my package builder for the RELENG_2_7_2 branch of CE (which is the current 2.7.2 release), and that build is failing. Working with the Netgate team on that. Once I get my package builder working again, then I can build a debug package and perhaps share it.

                                  Nothing else can happen until at least after this coming weekend as I am about to be out of town for a few days.

                                  • bmeeksB
                                    bmeeks @michmoor

                                    @michmoor said in Suricata process dying due to hyperscan problem:

                                    @bmeeks
                                    For now, and maybe going forward as a permanent solution, can we just have the package updated to use AC-CS as the default, with a note stating to avoid Hyperscan for its inconsistent performance or something along those lines?

                                    I don't see the point in changing the default if users can just simply make the change manually and save it.

                                    And I can't work on this issue anymore until late this Sunday at the earliest as I will be away from all my computing infrastructure until then.

                                    • M
                                      Maltz @bmeeks

                                      @bmeeks said in Suricata process dying due to hyperscan problem:

                                      I don't see the point in changing the default if users can just simply make the change manually and save it.

                                      I think changing the default would be tremendously useful for people who have no way of knowing why Suricata is crashing over a month after the pfSense update that seemingly broke it: people who haven't spent, or don't have the expertise to spend, hours poring over system logs, finding the right log entry to google, and making their way to this thread.

                                      • N
                                        NRgia @Maltz

                                        I beg to differ. Why force everybody to use certain settings as a workaround while the issue is being tracked down? This is not a test branch. As far as I understand from the posts here, this happens only if Suricata is in Legacy Mode. For example, I use Suricata in inline mode on WAN and also on LAN with multiple VLANs, and I don't encounter this issue. I'm not saying that we should not attempt to fix this, but forcing all of us to use the proposed defaults is bad practice.

                                        • kiokomanK
                                          kiokoman LAYER 8 @bmeeks

                                          Well, I'm not in a hurry. I just like to solve mysteries 🕵

                                          • P
                                            paulp

                                            Suricata still hangs on interfaces with higher traffic, even though I set it to use AC-KS.
                                            It is strange that the same message appears with the hyperscan error, although all interfaces are set to AC-KS:
                                            [104160 - W#03] 2023-12-13 16:18:07 Error: spm-hs: Hyperscan returned fatal error -1.
                                            Suricata-Error.png
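
One possible reading of that log line: the `spm-hs` prefix points at Suricata's single pattern matcher, which suricata.yaml configures through `spm-algo`, separately from the multi-pattern matcher (`mpm-algo`) that a choice such as AC-KS applies to. Whether the pfSense package exposes or changes `spm-algo` is not established anywhere in this thread, so treat that as an open question. The Python sketch below simply reports both settings from the text of a suricata.yaml; the sample YAML is made up for illustration and `matcher_settings` is a hypothetical helper name.

```python
import re


def matcher_settings(yaml_text):
    """Extract top-level mpm-algo / spm-algo values from suricata.yaml text."""
    found = {}
    for key in ("mpm-algo", "spm-algo"):
        m = re.search(rf"^{key}:\s*(\S+)", yaml_text, flags=re.MULTILINE)
        found[key] = m.group(1) if m else None
    return found


# Hypothetical fragment, not copied from any real interface config.
sample_yaml = """\
mpm-algo: ac-ks
spm-algo: auto
"""
print(matcher_settings(sample_yaml))
```

If the generated config for an interface shows `mpm-algo: ac-ks` alongside an `spm-algo` that can still resolve to Hyperscan, that would be consistent with the error appearing even with AC-KS selected, though it would not prove it.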

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.