surricata keeps shutting down
-
I have the same issue on my pfSense 2.5.2 CE. I upgraded from 2.4.5 p1 and every day I can see that the interface suricata is running on is stopped. When I start the interface all is OK again.
This is what I can find in the logfile:
19/7/2021 -- 00:39:12 - <Error> -- [ERRCODE: SC_WARN_JA3_DISABLED(309)] - ja3 support is not enabled
19/7/2021 -- 00:39:12 - <Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - error parsing signature "alert tls $HOME_NET any -> $EXTERNAL_NET any (msg:"ET JA3 Hash - Suspected Cobalt Strike Malleable C2 M1 (set)"; flow:established,to_server; ja3.hash; content:"eb88d0b3e1961a0562f006e5ce2a0b87"; ja3.string; content:"771,49192-49191-49172-49171"; flowbits:set,ET.cobaltstrike.ja3; flowbits:noalert; classtype:command-and-control; sid:2028831; rev:1; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2019_10_15, deployment Perimeter, former_category JA3, malware_family Cobalt_Strike, signature_severity Major, updated_at 2019_10_15, mitre_tactic_id TA0011, mitre_tactic_name Command_And_Control, mitre_technique_id T1001, mitre_technique_name Data_Obfuscation;)" from file /usr/local/etc/suricata/suricata_44278_igb1.140/rules/suricata.rules at line 4388
19/7/2021 -- 00:39:12 - <Error> -- [ERRCODE: SC_WARN_JA3_DISABLED(309)] - ja3(s) support is not enabled
19/7/2021 -- 00:39:12 - <Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - error parsing signature "alert tls $EXTERNAL_NET any -> $HOME_NET any (msg:"ET JA3 Hash - Suspected Cobalt Strike Malleable C2 (ja3s) M1"; flow:established,from_server; ja3s.hash; content:"649d6810e8392f63dc311eecb6b7098b"; tls.cert_subject; content:!"servicebus.windows.net"; flowbits:isset,ET.cobaltstrike.ja3; classtype:command-and-control; sid:2028832; rev:1; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2019_10_15, deployment Perimeter, former_category JA3, malware_family Cobalt_Strike, malware_family Cobalt_Strike, signature_severity Major, updated_at 2019_10_15, mitre_tactic_id TA0011, mitre_tactic_name Command_And_Control, mitre_technique_id T1001, mitre_technique_name Data_Obfuscation;)" from file /usr/local/etc/suricata/suricata_44278_igb1.140/rules/suricata.rules at line 4389
19/7/2021 -- 00:39:16 - <Info> -- 1 rule files processed. 23114 rules successfully loaded, 130 rules failed
Updating rules etc does not help. Rule updates are going fine btw.
EDIT: I have suricata running in Legacy mode.
-
@vjizzle AIUI those error messages pertain to invalid/unsupported rules in a downloaded ruleset, and should not be a problem.
When the web UI reports suricata inactive, do you still see alerts? Are the suricata processes still running:
ps x | grep suricata
It seems, for me at any rate, the UI is not correctly reporting suricata state and, if you do start an instance via the UI, you end up with multiple, unwanted suricata processes.
-
@darcey Hi! I will have to check next time suricata stops on the interface. I can see suricata service still running when it is stopped on the interface.
Let me monitor this and post my findings here.
-
I saw this on one of our routers yesterday, assumed it had crashed, although I couldn't find a log entry for that, and started it. Today I read this, and it's again in that state...Services/Suricata shows it stopped and Status/Services shows it running. /var/log/suricata/suricata_em01532/suricata.log has no entries dated today. Alerts/blocks are as of a few minutes ago.
Shell Output - ps x | grep suricata
2041 - SNs 20:51.96 /usr/local/bin/suricata -i em0 -D -c /usr/local/etc/suricata/suricata_1532_em0/suricata.yaml --pidfile /var/run/suricata_em01532.pid
30675 - SNs 50:21.92 /usr/local/bin/suricata -i em0 -D -c /usr/local/etc/suricata/suricata_1532_em0/suricata.yaml --pidfile /var/run/suricata_em01532.pid
70269 - S 0:00.00 sh -c ps x | grep suricata 2>&1
70888 - S 0:00.00 grep suricata
...which looks like two running instances. I stopped the service via Status/Services, and one of the two exited. Status/Services still showed it as running, so I clicked stop again and the other exited. If I start it from the interface tab now there's just one.
On the Updates tab I clicked Update and it didn't cause the problem but it failed because of a 404 (ERROR: Snort GPLv2 Community Rules md5 download failed). That's probably unrelated as it succeeded about five hours ago.
This is our only router on 2.5.2, and it has Suricata package 6.0.0_11. I checked another running pfSense 21.05 with Suricata 6.0.0_10 and it does not show it stopped.
Tagging @bmeeks.
-
@steveits said in surricata keeps shutting down:
Tagging @bmeeks.
I will look into this. Might be related to a recent change in the rules update logic (to address an issue raised by a user on Redmine), although I did test in a VM before posting the upgrade. This version of the package has been available in the 2.6.0 DEVEL branch for quite some time, and I saw no similar reports there.
-
@vjizzle said in surricata keeps shutting down:
This is what I can find in the logfile:
19/7/2021 -- 00:39:12 - <Error> -- [ERRCODE: SC_WARN_JA3_DISABLED(309)] - ja3 support is not enabled
To eliminate this error, go to the APP PARSERS tab for the interface, and in the TLS Parsers section, click the checkbox to enable JA3 fingerprints. Right now it defaults to "off", but it needs to be on if you have rules that use JA3 fingerprinting. The next Suricata update will change the mode to "auto" to fix this.
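For anyone who wants to confirm from a shell prompt what the interface's generated config currently says, here is a minimal sketch. It assumes the setting shows up in suricata.yaml as a ja3-fingerprints: line in the TLS app-layer section, as it does in stock Suricata configs; exactly what the pfSense package writes there is not shown in this thread, so treat that as an assumption. The helper name ja3_setting is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first uncommented
# "ja3-fingerprints:" line in the given suricata.yaml (prints nothing if the
# key is absent). ASSUMPTION: the pfSense package writes the key in this form.
ja3_setting() {
    sed -n 's/^[[:space:]]*ja3-fingerprints:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# Example call (path taken from the log output quoted in this thread):
#   ja3_setting /usr/local/etc/suricata/suricata_44278_igb1.140/suricata.yaml
```

If it prints a value that disables JA3, any enabled rules using ja3.hash or ja3s.hash would be expected to fail to load as in the log above.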
-
When suricata is restarted after a rules update, the pid files are deleted but not recreated, despite suricata starting successfully and being invoked with the --pidfile option. Stopping/starting the same interfaces works fine via the suricata->interfaces UI.
-
Hi! So this morning I checked and the interface where suricata is enabled was stopped again. The logfile shows only the following error entries:
21/7/2021 -- 00:39:16 - <Error> -- [ERRCODE: SC_WARN_JA3_DISABLED(309)] - ja3 support is not enabled
21/7/2021 -- 00:39:16 - <Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - error parsing signature "alert tls $HOME_NET any -> $EXTERNAL_NET any (msg:"ET JA3 Hash - Suspected Cobalt Strike Malleable C2 M1 (set)"; flow:established,to_server; ja3.hash; content:"eb88d0b3e1961a0562f006e5ce2a0b87"; ja3.string; content:"771,49192-49191-49172-49171"; flowbits:set,ET.cobaltstrike.ja3; flowbits:noalert; classtype:command-and-control; sid:2028831; rev:1; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2019_10_15, deployment Perimeter, former_category JA3, malware_family Cobalt_Strike, signature_severity Major, updated_at 2019_10_15, mitre_tactic_id TA0011, mitre_tactic_name Command_And_Control, mitre_technique_id T1001, mitre_technique_name Data_Obfuscation;)" from file /usr/local/etc/suricata/suricata_44278_igb1.140/rules/suricata.rules at line 4382
21/7/2021 -- 00:39:16 - <Error> -- [ERRCODE: SC_WARN_JA3_DISABLED(309)] - ja3(s) support is not enabled
21/7/2021 -- 00:39:16 - <Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - error parsing signature "alert tls $EXTERNAL_NET any -> $HOME_NET any (msg:"ET JA3 Hash - Suspected Cobalt Strike Malleable C2 (ja3s) M1"; flow:established,from_server; ja3s.hash; content:"649d6810e8392f63dc311eecb6b7098b"; tls.cert_subject; content:!"servicebus.windows.net"; flowbits:isset,ET.cobaltstrike.ja3; classtype:command-and-control; sid:2028832; rev:1; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2019_10_15, deployment Perimeter, former_category JA3, malware_family Cobalt_Strike, malware_family Cobalt_Strike, signature_severity Major, updated_at 2019_10_15, mitre_tactic_id TA0011, mitre_tactic_name Command_And_Control, mitre_technique_id T1001, mitre_technique_name Data_Obfuscation;)" from file /usr/local/etc/suricata/suricata_44278_igb1.140/rules/suricata.rules at line 4383
21/7/2021 -- 00:39:20 - <Info> -- 1 rule files processed. 23122 rules successfully loaded, 130 rules failed
21/7/2021 -- 00:39:20 - <Info> -- Threshold config parsed: 49 rule(s) found
21/7/2021 -- 00:39:20 - <Info> -- 23125 signatures processed. 1235 are IP-only rules, 4076 are inspecting packet payload, 17516 inspect application layer, 103 are decoder event only
21/7/2021 -- 00:39:31 - <Info> -- Using 1 live device(s).
21/7/2021 -- 00:39:31 - <Info> -- using interface igb1.140
21/7/2021 -- 00:39:31 - <Info> -- running in 'auto' checksum mode. Detection of interface state will require 1000ULL packets
21/7/2021 -- 00:39:31 - <Info> -- Set snaplen to 1518 for 'igb1.140'
21/7/2021 -- 00:39:31 - <Info> -- RunModeIdsPcapAutoFp initialised
21/7/2021 -- 00:39:31 - <Notice> -- all 5 packet processing threads, 2 management threads initialized, engine started.
21/7/2021 -- 00:42:00 - <Info> -- No packets with invalid checksum, assuming checksum offloading is NOT used
The last entry is also where the logfile stops. According to the pfSense Services menu suricata should be running. When I check it from the command line I can see this:
10347 - S 0:00.00 sh -c ps x | grep suricata 2>&1
10809 - S 0:00.00 grep suricata
52544 - SNs 6:02.24 /usr/local/bin/suricata -i igb1.140 -D -c /usr/local/etc/suricata/suricata_44278_igb1.140/suricata.yaml --pidfile /var/run/suricata_igb1.14044278.pid
So it seems like suricata is still running on that interface (igb1.140) and when I start the interface it just starts another process. I can then see multiple suricata processes from the command line. What I did (and this was mentioned somewhere on the forum) is stop the suricata service by hitting the stop button several times from the GUI. Eventually it kills all the processes and I can start suricata again. But this seems like a bug.
Please let me know if I can help with troubleshooting.
-
@vjizzle said in surricata keeps shutting down:
this morning I checked and the interface where suricata is enabled was stopped again
Same thing on ours. I had checked a couple of times yesterday and at the end of the day it still looked OK.
I almost upgraded our backup (HA) router in our data center to _11 to test it on 21.05...is it OK to run that way if the primary is _10? I would think so but figured I should ask.
-
@steveits said in surricata keeps shutting down:
I almost upgraded our backup (HA) router in our data center to _11 to test it on 21.05...is it OK to run that way if the primary is _10? I would think so but figured I should ask.
Yes, the version mismatch will be okay for now. I will look into the problem. I have a batch of GUI fixes ready to submit for Suricata, but I have been holding off for about a week working on getting the 6.0.3 binary to work with netmap. So far, that is not going well. So I may go ahead and release the GUI fixes and leave the binary on 5.0.6 for now.
-
@darcey said in surricata keeps shutting down:
When suricata is restarted after a rules update, the pid files are deleted but not recreated, despite suricata starting successfully and being invoked with the --pidfile option. Stopping/starting the same interfaces works fine via the suricata->interfaces UI.
Thank you for these details. One of my suspicions was it had something to do with the PID file, because that's how the GUI controls/indicators on the INTERFACES tab query the daemons.
I will look into this today and see what is going on.
-
For users experiencing this issue, I believe the "stopped" condition showing on the INTERFACES tab is incorrect. If the PID file is missing for a running instance, the code on the INTERFACES tab thinks Suricata is not running, even when it actually is.
The pfSense code that populates the SERVICES menu display uses pgrep looking for 'suricata' to determine if the service is running. So it will show the daemon present (if it is actually running).
So until I get a fix posted, don't 100% trust the display on the INTERFACES tab after an automated rules update. Either check under the SERVICES menu in pfSense, or get a CLI prompt on the firewall and run this command:
ps -ax | grep suricata
It will show any running processes.
WARNING: if Suricata is actually running, and then you go to the INTERFACES tab and click "Start", you will launch an identical clone Suricata process and that will cause you trouble!
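To spot the "running but no PID file" state before clicking anything, the check can be sketched in shell. This is only an illustration, not code from the package: the function name check_pidfiles is hypothetical, and it assumes each daemon was started with a --pidfile argument, as in the ps output posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ps` output on stdin and, for every line that
# carries a --pidfile argument, report whether that PID file exists on disk.
check_pidfiles() {
    awk '
        /--pidfile / {
            # Print the PID (first column) and the word after --pidfile.
            for (i = 1; i < NF; i++)
                if ($i == "--pidfile") print $1, $(i + 1)
        }' | while read -r pid pidfile; do
        if [ -f "$pidfile" ]; then
            echo "OK      pid=$pid pidfile=$pidfile"
        else
            echo "MISSING pid=$pid pidfile=$pidfile"
        fi
    done
}

# Example call on the firewall (-ww asks ps not to truncate long lines):
#   ps -axww | grep suricata | check_pidfiles
```

A MISSING line for a process that is clearly alive would match the condition described above, where the INTERFACES tab shows "stopped" although the daemon is running. More than one line for the same interface suggests a cloned instance.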
-
@steveits said in surricata keeps shutting down:
ERROR: Snort GPLv2 Community Rules md5 download failed
This looks to be this issue.
-
@darcey said in surricata keeps shutting down:
pid files are deleted
That's what I see.
2519 - SNs 17:56.44 /usr/local/bin/suricata -i em0 -D -c /usr/local/etc/suricata/suricata_1532_em0/suricata.yaml --pidfile /var/run/suricata_em01532.pid
ls: /var/run/suricata_em01532.pid: No such file or directory
-
I've reproduced the issue in my test system. I did not see it originally because I had the setting for "Live Rule Updates" enabled on the GLOBAL SETTINGS tab. With that setting, Suricata is not physically stopped and restarted. Instead, it is sent a SIGUSR2, which tells it to reload the rules and configuration.
You can temporarily work around this issue by enabling that setting. I'm working to identify exactly where the failure is, and then will get a fix posted for the pfSense team to review and make available.
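For illustration only, this is roughly what "sending a SIGUSR2" looks like from a shell. The helper name request_rule_reload is hypothetical and this is not the package's code; the GUI setting does the equivalent for you. The PID would come from the ps output, since the PID file may be missing.

```shell
#!/bin/sh
# Hypothetical helper: send SIGUSR2 to the process with the given PID, but
# only if that process exists and we are allowed to signal it.
request_rule_reload() {
    pid="$1"
    if kill -0 "$pid" 2>/dev/null; then
        kill -USR2 "$pid"
    else
        echo "no such process: $pid" >&2
        return 1
    fi
}

# Example call, using a PID taken from the ps output:
#   request_rule_reload 52544
```

The process keeps its PID across the reload, so nothing is stopped and restarted along the way.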
-
@bmeeks said in surricata keeps shutting down:
You can temporarily work around this issue by enabling that setting. I'm working to identify exactly where the failure is, and then will get a fix posted for the pfSense team to review and make available.
I think I just went with the default at the time I installed your package: live updates disabled.
What issues might typically be seen with this enabled?
BTW after reading your earlier posts in this thread, I tried enabling JA3 fingerprinting. I'd overlooked that option.
Once enabled, I obviously no longer get the warnings. However I also found the suricata instance, for the interface on which I enabled JA3, restarted and was recognised as such in the UI, i.e. a pid file was created, whereas it wasn't for the interface with JA3 still disabled.
Hope that makes sense. Many thanks.
-
@darcey said in surricata keeps shutting down:
Is there any reason I might not want to enable live updates?
BTW after reading your earlier posts in this thread, I tried enabling JA3 fingerprinting. I'd overlooked that option.
Once enabled, I obviously no longer get the warnings. However I also found the suricata instance, for the interface on which I enabled JA3, restarted and was recognised as such in the UI, i.e. a pid file was created, whereas it wasn't for the interface with JA3 still disabled.
Hope that makes sense. Many thanks.
The "live rule update" option works pretty well for most folks. There were some edge cases in the past where it was causing a problem, so the default was left at "off". I don't even remember now what those edge cases were. I think it had to do with consuming too much RAM, because with that option enabled, Suricata has to keep two complete copies of the enabled rules in RAM for a short period.
The JA3 issue has nothing to do with the PID file problem. I've found the cause of that. It's a sort of race condition between consecutive calls to suricata_stop() followed immediately by a call to suricata_start(). Adding a simple 10-second time delay between those two calls prevents the problem, but I'm looking for a slightly more elegant solution for the permanent fix. When you saved the change, that bumped Suricata and the PID file was created.
-
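As an illustration of the stop-then-start race described above, here is one common alternative to a fixed delay: poll until the old process has actually exited, then start the new one. This is a sketch in shell under stated assumptions, not the fix that went into the package (whose code is not shown in this thread); wait_for_exit is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: wait until the process with the given PID has exited,
# or give up after a timeout in seconds (default 10).
# Returns 0 once the process is gone, 1 if it is still there at the timeout.
wait_for_exit() {
    pid="$1"
    timeout="${2:-10}"
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example flow (illustrative only): note the old PID, stop the daemon, then
#   wait_for_exit "$old_pid" 10 && echo "safe to start a new instance"
```

Compared with an unconditional 10-second sleep, this returns as soon as the old process is gone and still has an upper bound on how long it waits.
-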
@bmeeks
When I tried enabling the JA3 fingerprint option, I believe I forced a rule update via the web UI. But you now have me doubting myself.
Great work and many thanks for taking the time to explain. I'm going to have a go with the live rules swap. Cheers.
-
A package update with a fix for this problem has been posted for review and merge by the pfSense team. I've asked them to update the DEVEL, RELEASE and pfSense+ branches.
The Pull Request is posted here: https://github.com/pfsense/FreeBSD-ports/pull/1085. It may take a day or two for the updated package to be posted.
-
That's a pretty fast turnaround! :) It also shows as available on 21.05 and 2.5.2 this morning, already.
The "Live Rule Swap on Update" option did work around it as well. Re: defaulting that to off, I also recall earlier posts mentioning RAM usage on lower RAM hardware.