surricata keeps shutting down
-
I've reproduced the issue in my test system. I did not see it originally because I had the setting for "Live Rule Updates" enabled on the GLOBAL SETTINGS tab. With that setting, Suricata is not physically stopped and restarted. Instead, it is sent a SIGUSR2, which tells it to reload the rules and configuration.
You can temporarily work around this issue by enabling that setting. I'm working to identify exactly where the failure is, and then will get a fix posted for the pfSense team to review and make available.
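For anyone curious what the live reload amounts to at the shell level, here is a minimal sketch. It assumes shell access to the firewall and a PID file for the interface in question; reload_rules is a hypothetical helper name and is not part of the pfSense package, which handles this itself when the setting is enabled.

```shell
# Sketch: ask a running Suricata instance to reload its rules without a restart.
# reload_rules is a hypothetical helper; the caller supplies the PID file path.
reload_rules() {
    pidfile="$1"
    if [ ! -r "$pidfile" ]; then
        echo "PID file not readable: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # SIGUSR2 asks Suricata to reload its rules in place; the process keeps running.
    kill -USR2 "$pid"
}

# Example (PID file name taken from output quoted later in this thread):
# reload_rules /var/run/suricata_mvneta1.333184.pid
```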
-
@bmeeks said in surricata keeps shutting down:
You can temporarily work around this issue by enabling that setting. I'm working to identify exactly where the failure is, and then will get a fix posted for the pfSense team to review and make available.
I think I just went with the default at the time I installed your package: live updates disabled.
What issues might typically be seen with this enabled?
BTW, after reading your earlier posts in this thread, I tried enabling JA3 fingerprinting. I'd overlooked that option.
Once enabled, I obviously no longer get the warnings. However, I also found that the Suricata instance for the interface on which I enabled JA3 restarted and was recognised as running in the UI, i.e. a PID file was created, whereas it wasn't for the interface with JA3 still disabled.
Hope that makes sense. Many thanks.
-
@darcey said in surricata keeps shutting down:
@bmeeks said in surricata keeps shutting down:
You can temporarily work around this issue by enabling that setting. I'm working to identify exactly where the failure is, and then will get a fix posted for the pfSense team to review and make available.
I think I just went with the default at the time I installed your package: live updates disabled.
Is there any reason I might not want to enable live updates?
BTW after reading your earlier posts in this thread, I tried enabling JA3 fingerprinting. I'd overlooked that option.
Once enabled, I obviously no longer get the warnings. However I also found the suricata instance, for the interface on which I enabled JA3, restarted and was recognised as such in the UI. i.e. a pid file was created. Whereas it wasn't for the interface with JA3 still disabled.
Hope that makes sense. Many thanks.
The "live rule update" option works pretty well for most folks. There were some edge cases in the past where it caused problems, so the default was left at "off". I don't even remember now what those edge cases were. I think it had to do with consuming too much RAM, because with that option enabled, Suricata has to keep two complete copies of the enabled rules in RAM for a short period.
The JA3 issue has nothing to do with the PID file problem. I've found the cause of that. It's a sort of race condition between consecutive calls: a call to suricata_stop() followed immediately by a call to suricata_start. Adding a simple 10-second time delay between those two calls prevents the problem, but I'm looking for a slightly more elegant solution for the permanent fix. When you saved the change, that bumped Suricata and the PID file was created.
-
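One general alternative to a fixed delay between a stop and a start is to poll until the old process has actually exited, with a timeout as a safety net. The following is only a sketch of that idea in shell. It is not the fix that went into the package (which is written in PHP), and wait_for_exit is a hypothetical helper name.

```shell
# wait_for_exit PID TIMEOUT_SECONDS
# Returns 0 once the process is gone, 1 if it is still alive after the timeout.
wait_for_exit() {
    pid="$1"
    remaining="$2"
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

The point of the design is that the start step runs as soon as the old process is confirmed gone, rather than always waiting the full delay, and a caller can tell when the old process never exited at all.
-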
@bmeeks
When I tried enabling the JA3 fingerprint option, I believe I forced a rule update via the web UI. But you now have me doubting myself.
Great work and many thanks for taking the time to explain. I'm going to have a go with the live rules swap. Cheers.
-
A package update with a fix for this problem has been posted for review and merge by the pfSense team. I've asked them to update the DEVEL, RELEASE and pfSense+ branches.
The Pull Request is posted here: https://github.com/pfsense/FreeBSD-ports/pull/1085. It may take a day or two for the updated package to be posted.
-
That's a pretty fast turnaround! :) It already shows as available on 21.05 and 2.5.2 this morning.
The "Live Rule Swap on Update" option did work around it as well. Re: defaulting that to off, I also recall earlier posts mentioning RAM usage on low-RAM hardware.
-
@bmeeks Hope I'm not resurrecting a thread that should be dead.
I'm seeing the same issue with Suricata randomly stopping on some interfaces and not being able to restart them.
This is the error I see in the log:
6/8/2021 -- 07:59:27 - <Error> -- [ERRCODE: SC_ERR_INITIALIZATION(45)] - pid file '/var/run/suricata_mvneta211853.pid' exists but appears stale. Make sure Suricata is not running and then remove /var/run/suricata_mvneta211853.pid. Aborting!
This is what I see running the grep command mentioned above:
9533 - S 0:00.01 sh -c ps -ax | grep suricata 2>&1
59933 - R 0:00.00 grep suricata
87965 - Rs 0:28.53 /usr/local/bin/suricata -i mvneta1.3 -D -c /usr/local/etc/suricata/suricata_33184_mvneta1.3/suricata.yaml --pidfile /var/run/suricata_mvneta1.333184.pid
When I run ls /var/run/, though, the "stale" PID files show up as below.
suricata_mvneta1.333184.pid
suricata_mvneta114834.pid
suricata_mvneta211853.pid
If I manually delete these I can restart the service on those interfaces, but they ultimately just stop again. I've tried the live update option and that hasn't helped.
For reference I'm running on a Netgate SG-3100 so I would think I have enough horsepower for this as it's only a home system.
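The manual check described above (compare the PID files in /var/run against what is actually running) can be sketched as a small shell function. This is an illustration only: list_stale_pidfiles is a hypothetical helper, and it assumes it is run as root so that the existence test is not blocked by permissions.

```shell
# list_stale_pidfiles DIR
# Prints each suricata_*.pid file in DIR whose recorded PID is not a running process.
list_stale_pidfiles() {
    dir="$1"
    for f in "$dir"/suricata_*.pid; do
        [ -e "$f" ] || continue                  # glob matched nothing
        pid=$(cat "$f" 2>/dev/null)
        case "$pid" in
            ''|*[!0-9]*) echo "$f"; continue ;;  # empty or non-numeric: treat as stale
        esac
        # kill -0 sends no signal; it only tests whether the process exists.
        kill -0 "$pid" 2>/dev/null || echo "$f"
    done
}

# Example: list_stale_pidfiles /var/run
```

Note that this only reports the files. It does not delete them, and it does not explain why the processes went away.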
-
@bitslammer said in surricata keeps shutting down:
Your issue is not the same as the ones posted in this thread. The problem in this thread was a disappearing PID file, not a stale one. You have the "stale" files because Suricata is crashing and not cleaning up after itself. The stale PID files are a symptom, not a cause, of your problem.
Look in the pfSense system log for any Suricata or php-fpm related messages. I'm betting you find some from at least one of those sources.
What version of pfSense+ are you running on your firewall, and what version of the Suricata package?
-
@bmeeks Thanks.
Found the issue:
Aug 26 07:18:27 kernel pid 46136 (suricata), jid 0, uid 0, was killed: out of swap space
Aug 26 07:18:27 kernel pid 55608 (suricata), jid 0, uid 0, was killed: out of swap space
Not sure how to correct it. I was having no issues prior to the 21.05.1 upgrade. I'm running Suricata 6.0.0_14.
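For anyone wanting to search for the same kernel messages from a shell, a rough sketch follows. find_oom_kills is a hypothetical helper, and the log path in the example is an assumption (a plain-text system log at /var/log/system.log); adjust it to wherever the system log lives on your install.

```shell
# find_oom_kills LOGFILE [PROCESS_NAME]
# Prints kernel "was killed: out of swap space" lines for the given process name.
find_oom_kills() {
    logfile="$1"
    name="${2:-suricata}"
    # -F matches fixed strings, so the parentheses are taken literally.
    grep -F "($name)" "$logfile" | grep -F "was killed: out of swap space"
}

# Example (log path is an assumption):
# find_oom_kills /var/log/system.log suricata
```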
-
@bitslammer said in surricata keeps shutting down:
Check top to see what is using up your memory. RAM is limited in the SG-3100, so if you are running a large number of rules, that can be a problem. Other potential problem spots are DNSBL with very large domain blacklists.
Do you have "Live Reload" enabled on the GLOBAL SETTINGS tab for the Suricata rules update? If so, that can result in increased memory usage during rule updates as Suricata keeps two copies of the rules in RAM for a bit as it loads the updated rules alongside the existing older rules. After everything is ready, it dumps the older rules. But for a while, two copies exist, and thus RAM usage goes up.
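As a non-interactive alternative to top, a one-shot listing of the largest processes by resident memory can be sketched as below. top_by_rss is a hypothetical helper; the ps invocation in the example is an assumption about what is convenient, and the RSS figures are in whatever unit ps reports (typically kilobytes).

```shell
# top_by_rss [N]
# Reads "RSS PID COMMAND" lines on stdin and prints the N largest by RSS (default 10).
top_by_rss() {
    sort -rn | head -n "${1:-10}"
}

# Example: list the five largest processes by resident set size.
# ps -A -o rss=,pid=,comm= | top_by_rss 5
```

Running this once before and once during a rule update should show whether Suricata's footprint grows while two copies of the rules are loaded.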
-
@bmeeks Thanks. Kind of figured this was the issue. I'll see what I can clean up and will turn off live update.
-
@bitslammer said in surricata keeps shutting down:
out of swap space
There is an issue in 21.x and 2.5.x with pcscd gradually taking up RAM. There is a patch or you can just stop it after booting.
-
@steveits said in surricata keeps shutting down:
Good catch! I forgot about that bug.
-
@bmeeks said in surricata keeps shutting down:
I forgot about that bug.
I upgraded a bunch of clients this month and went back and stopped it on previous upgrades. Hopefully I have burned it in and won't forget it on one. :) Didn't actually run into a problem, but I think the highest I saw was over 2 GB on our one non-appliance install and it definitely grows over time.
(for lurkers, the intent is to have pcscd not running by default on future versions...would be a non-issue if it didn't have a memory leak)
-
About a day or so after I raised this, the issue was addressed and successfully fixed.
I haven't had an issue since. It all works great.