Suricata won't stop
-
First, as @SteveITS mentioned, if you have the Service Watchdog package installed, DO NOT configure it to monitor Suricata. It does not understand how to properly monitor Suricata processes and can easily wind up starting duplicate processes that cannot be controlled nor monitored via the GUI.
Another possibility is that something disturbed your firewall and caused a cascade of the "restart all packages" command to be executed. That might also result in duplicate Suricata processes on one or more interfaces.
Do this to recover things.
-
Obtain a shell prompt session on the firewall either at the console directly or via SSH.
-
Once in the shell menu, choose menu option "8" to exit to the prompt.
-
Then issue this command sequence to stop active Suricata processes and list all the remaining Suricata processes (the ones that are duplicates):
/usr/local/etc/rc.d/suricata.sh stop && ps -ax | grep suricata
See if any duplicate processes remain listed. If you see any, note the process ID (PID) of the duplicate processes. Then execute this command to kill them:
kill -9 <pid>
You can also use
pkill suricata
instead, if desired. Run
ps -ax | grep suricata
once more to be sure all the duplicate processes are in fact stopped. Once all duplicate or zombie Suricata processes are eliminated, run this command to start up your configured interfaces:
/usr/local/etc/rc.d/suricata.sh start
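With several interfaces configured, spotting duplicates by eye in the ps listing can be tedious. Here is a minimal sketch of a helper that does the counting. It is not part of the Suricata package: the function name find_duplicate_suricata is made up for illustration, and it assumes every Suricata process was started with an "-i <interface>" argument. It reads the output of "ps -ax -o pid,command" on standard input and prints each interface that has more than one Suricata process, followed by the PIDs.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with pfSense or Suricata).
# Reads `ps -ax -o pid,command` output on stdin and prints one line per
# interface that has more than one Suricata process: "<iface>: <pid> <pid> ..."
find_duplicate_suricata() {
    awk '
        # Field 1 is the PID, field 2 the executable; this skips "grep suricata".
        $2 ~ /(^|\/)suricata$/ {
            iface = ""
            # Assumption: the interface name follows the -i argument.
            for (i = 3; i < NF; i++) {
                if ($i == "-i") { iface = $(i + 1); break }
            }
            if (iface == "") next
            count[iface]++
            pids[iface] = pids[iface] " " $1
        }
        END {
            for (iface in count) {
                if (count[iface] > 1) print iface ":" pids[iface]
            }
        }
    '
}
```

Use it as "ps -ax -o pid,command | find_duplicate_suricata". No output means no interface has more than one process. Any PIDs it lists still need to be checked by hand before you kill them.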
-
-
Thank you. They were indeed zombie processes. I did a simple "kill" on them and they were gone.
Afterwards I checked the config, updated the rules, etc. and simply started Suricata from the GUI again. All fine so far! :)
-
The GUI reported that one of the 3 Suricata interfaces had stopped this morning, but it was actually still running:
[2.5.2-RELEASE][admin@thuis]/root: ps aux |grep suri
root 80979 18.9 17.3 1517280 1437448 - SNs 00:03 185:33.68 /usr/local/bin/suricata -i vtnet0.100 -D -c /usr/local/etc/suricata/suricata_248
root 76116 12.9 16.1 1428612 1342984 - SNs 00:03 103:17.29 /usr/local/bin/suricata -i vtnet0.200 -D -c /usr/local/etc/suricata/suricata_538
root 71664 12.0 17.1 1488968 1425076 - SNs 00:03 106:23.64 /usr/local/bin/suricata -i vtnet0.101 -D -c /usr/local/etc/suricata/suricata_331
After doing nothing else but starting the interface again from the GUI, the same problem returns: zombie processes, now 2x for vtnet0.100.
[2.5.2-RELEASE][admin@thuis]/root: ps aux | grep suri
root 34693 92.0 10.0 870448 835276 - Rs 10:48 0:24.17 /usr/local/bin/suricata -i vtnet0.100 -D -c /usr/local/etc/suricata/suricata_248
root 80979 17.9 17.3 1517280 1437448 - SNs 00:03 185:45.36 /usr/local/bin/suricata -i vtnet0.100 -D -c /usr/local/etc/suricata/suricata_248
root 71664 14.9 17.1 1488968 1425076 - SNs 00:03 106:31.94 /usr/local/bin/suricata -i vtnet0.101 -D -c /usr/local/etc/suricata/suricata_331
root 76116 12.9 16.1 1428612 1342984 - SNs 00:03 103:25.45 /usr/local/bin/suricata -i vtnet0.200 -D -c /usr/local/etc/suricata/suricata_538
This is now happening on 2 different pfSense installs (one virtualized on Proxmox, one on bare metal), so it definitely looks like I'm hitting a bug here. The GUI can no longer be trusted regarding the service status of Suricata.
-
Look through your pfSense system log and see if anything Suricata related is logged there. Another user was having trouble with the vtnet driver. In his case, Suricata would not start at all.
What I suspect is happening is one or more Suricata processes are aborting abnormally. Hopefully something will be logged in the system log giving a clue.
When Suricata stops abnormally, it does not clean up its PID file in /var/run. The GUI looks for the presence of that file to determine if a process is active or not. Each PID file is named using the Suricata interface name and a UUID number. So if a running process segfaults or otherwise dies a sudden death, the corresponding PID file is not deleted in /var/run. Thus when the GUI code checks for the file to see if the process is "running", it will be fooled.
-
It "quit" again in the GUI. The log does not go back very far, but I see the following:
Feb 4 11:06:49 php-fpm 99056 [Suricata] Suricata signalled with SIGUSR2 for WIRED (vtnet0.100)...
Feb 4 11:06:48 php-fpm 99056 [Suricata] Building new sid-msg.map file for SECURE...
Feb 4 11:06:48 php-fpm 99056 [Suricata] Enabling any flowbit-required rules for: SECURE...
Feb 4 11:06:47 php-fpm 99056 [Suricata] Updating rules configuration for: SECURE ...
Feb 4 11:06:47 check_reload_status 375 Syncing firewall
Feb 4 11:02:42 php-fpm 337 [Suricata] Suricata signalled with SIGUSR2 for WIRED (vtnet0.100)...
Feb 4 11:02:41 php-fpm 337 [Suricata] Building new sid-msg.map file for SECURE...
Feb 4 11:02:41 php-fpm 337 [Suricata] Enabling any flowbit-required rules for: SECURE...
Feb 4 11:02:40 php-fpm 337 [Suricata] Updating rules configuration for: SECURE ...
Feb 4 11:02:40 check_reload_status 375 Syncing firewall
Feb 4 10:59:28 php-fpm 5511 [Suricata] Suricata signalled with SIGUSR2 for WIRED (vtnet0.100)...
Feb 4 10:59:28 php-fpm 5511 [Suricata] Building new sid-msg.map file for SECURE...
Feb 4 10:59:27 php-fpm 5511 [Suricata] Enabling any flowbit-required rules for: SECURE...
Feb 4 10:59:26 php-fpm 5511 [Suricata] Updating rules configuration for: SECURE ...
A rule update has also happened, which doesn't look wrong:
Feb 4 12:03:12 php 51060 [Suricata] The Rules update has finished.
Feb 4 12:03:12 php 51060 [Suricata] Suricata has restarted with your new set of rules for SECURE...
Feb 4 12:03:12 php 51060 [Suricata] Suricata START for WIRED(vtnet0.100)...
Feb 4 12:02:56 php 51060 [Suricata] Suricata STOP for WIRED(vtnet0.100)...
Feb 4 12:02:56 php 51060 [Suricata] Building new sid-msg.map file for SECURE...
Feb 4 12:02:56 php 51060 [Suricata] Enabling any flowbit-required rules for: SECURE...
Feb 4 12:02:55 php 51060 [Suricata] Updating rules configuration for: SECURE ...
I have disabled the rule updates for now to rule that out.
-
Are you using Legacy Blocking Mode or Inline IPS Mode? Inline IPS Mode with VLANs is problematic due to how the netmap kernel device works with VLANs. In your first post you mentioned the "Blocking pages", so perhaps that means you are using Legacy Mode blocking?
-
Legacy normally but currently have blocking disabled everywhere to rule it out.
-
Okay. That's good.
Are you using any RAM disks?
Have a look at the files in /var/run on the firewall by using DIAGNOSTICS > EDIT FILE to browse to that folder and see the files there. Look for all the Suricata files. You should see one file per configured interface. The filename will have the interface name along with a UUID (random unique identifier) number. This UUID is like the serial number for an interface instance of Suricata. The presence or absence of the file (with the .pid suffix) is how the GUI determines if that Suricata interface instance is running or not running. No file equals "not running" in the logic of the GUI. A present file equals "running" for the GUI.
The PID files are simple text. They contain the process ID (PID) of the Suricata instance that started and created the file.
Take an inventory of Suricata PID files in /var/run and compare that to the actual running instances you see with this command at a shell prompt:
ps -ax | grep suricata
They should all match up. But if something causes a running Suricata instance to abort abnormally, then the PID file is not cleaned up. That will confuse the GUI logic and it will misreport the status of Suricata instances.
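The PID file half of that inventory can be sketched as a small shell function. This is only an illustration and makes assumptions: the name check_suricata_pidfiles is made up, and the glob "suricata*.pid" is a guess at how the files are named, so adjust it to what you actually see in /var/run. For each PID file it reads the recorded PID and uses "kill -0", which sends no signal and only tests whether the process exists, to report the file as "ok" or "stale".

```sh
#!/bin/sh
# Hypothetical helper: report Suricata PID files whose process is gone.
# $1: directory holding the PID files (defaults to /var/run).
# Assumption: the PID files match the glob suricata*.pid.
check_suricata_pidfiles() {
    dir=${1:-/var/run}
    for f in "$dir"/suricata*.pid; do
        [ -e "$f" ] || continue    # the glob matched nothing
        # The PID file is plain text holding the process ID.
        pid=$(head -n 1 "$f" | tr -d '[:space:]')
        # kill -0 delivers no signal; it only checks that the PID exists.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "ok    $f (pid $pid)"
        else
            echo "stale $f (pid ${pid:-unknown})"
        fi
    done
}
```

Run it as root with "check_suricata_pidfiles /var/run". It only covers one direction: a Suricata process that is running without any PID file will not show up here, so you still need to compare against the ps output for that case.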
-
Thanks. I will check it next time it goes wrong.
If I understand your last sentence correctly, you are referring to a Suricata process being stopped but showing as running in the GUI. It's actually the other way around: the process runs (so the PID file should be there), but the GUI still thinks it's stopped.
Let's see what happens when the problem shows up again.
-
The absence or presence of the PID file is what the GUI code is working with. So yeah, an inventory the next time the problem presents is going to be helpful.
Is there anything going on that might result in a file getting deleted? Typically the file should be "locked" by the running process and normal deletion not allowed.
At any rate, a screenshot of the content from /var/run while the issue is present will help me. Also including a screenshot (or the output) from the ps -ax | grep suricata command will be helpful to cross-correlate with the PID files.
Oh, and one last thing I just thought of. The GUI code that is examining the PID files is actually triggered by JavaScript code running on the local client (so in the browser of the device you are connecting to the firewall GUI with). The JavaScript is running a recurring series of Ajax form posts to query the status and update the icons. It's possible there could be problems with that script running in your browser.
-
Rock stable so far. I have re-enabled 12 hour updates to see if that makes it break.
-
If rule updates wind up being the apparent cause, you might consider switching on the "Live Rule Swap on Update" option on the GLOBAL SETTINGS tab. When that option is enabled, Suricata itself is not restarted. Instead, a copy of the updated rules is read into a separate memory area, parsed, and then switched to being active. After that, the previous rule set is removed from memory. The only downside of this feature is a temporary increase in RAM usage during the time two copies of the rules are present in memory as the swap is happening.
But really the only time I've seen rule updates result in duplicate processes is when a user also had the Service Watchdog package installed and monitoring Suricata. In that case, because Service Watchdog simply looks for a running Suricata process, when it does not see one, it calls the shell script to restart it. If this happens during the rules update cron task when that task is already restarting Suricata, then two copies can get started on the same interface depending on timing. The "Live Rule Swap" feature works around that because the running Suricata instances are themselves not restarted.
-
Thanks for the info! I do not have the watchdog package installed.
Does the Live rule reload have any impact on the size of the config file?
Last time I enabled Suricata on a 4th interface, my already 92MB config file grew to 120MB+ causing a total crash of the GUI (php), SSH, etc. Traffic was still running but the only way I could recover was to SCP in, switch back to the previous config file and reboot.
I have not looked into the cause of this issue yet, but at this moment I don't want it to happen again by enabling the live update function.
-
No, changing that should have almost zero effect on the config file size. It literally just stores about 41 or 42 ASCII characters in total, depending on whether it is set to "on" or "off".
I can't imagine why Suricata is making your config file that large. It only stores basic configuration info in ASCII XML. The only thing I can possibly imagine that would be larger is if you had thousands of rules forced enabled or disabled, or you had huge SID management conf files. But even then I can't imagine Suricata adding almost 30 MB of stuff to config.xml.
Have you actually looked in the file to see what is using that amount of data? It may be RRD logging (the input/output stats graph data). That has absolutely nothing to do with Suricata, though.
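If the file is too big for an editor to handle comfortably, a quick way to see what is taking up the space is to list its longest lines from the shell. The function below is a rough sketch only: the name longest_xml_lines is made up, and it assumes each oversized element sits on a single line of the file. It prints the length, the line number, and the first tag of each of the N longest lines.

```sh
#!/bin/sh
# Hypothetical helper: show the N longest lines of an XML file.
# $1: path to the file, $2: how many lines to show (default 5).
# Output format: "<length> <line-number> <first tag on that line>"
longest_xml_lines() {
    file=$1
    n=${2:-5}
    awk '{
        tag = $0
        sub(/^[ \t]*/, "", tag)   # drop leading indentation
        sub(/>.*/, ">", tag)      # keep only the first tag on the line
        print length($0), NR, tag
    }' "$file" | sort -rn | head -n "$n"
}
```

For example, "longest_xml_lines /conf/config.xml 10" lists the ten longest lines, which should make it obvious whether the bulk is inside the Suricata section or somewhere else such as the RRD data.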
Open /conf/config.xml in an editor and either search for, or scroll down to, the <installed_packages> section. The file is plaintext XML. Then within that section search for <suricata>. Only the info between the two tags <suricata></suricata> is used by the Suricata package.
-
Suricata still rock stable.
Regarding the config: it's probably my insane amount of enabled rules that is causing the issue. There are only 5500 lines in the config, but some lines are miles long; even Notepad++ chokes on it. Especially this one:
<customrules>YWxlcnQgdGNwICRIT01FX05FVCBhbnkgLT4gWzY3L... +another few miles of txt
And that times 3 (3 Suricata-enabled interfaces). I will look into more efficient rule enabling. Thanks!
-
Custom Rules and various lists such as SID management conf files are stored in config.xml as Base64-encoded ASCII text within the applicable element tag. Those can get long if you have tons and tons of those kinds of entries. As originally envisioned and coded, the idea was users would have maybe only a dozen or a couple of dozen custom rules max. Nothing inherently wrong with having more, but it will result in a much bigger XML element entry in the config.xml file.
-
Thanks for the explanation.
The problems I got after enabling a 4th Suricata interface was something like this:
https://forum.netgate.com/topic/156679/pfsense-fatal-error-allowed-memory-exhausted-cause
Only able to restore after manual config file replacement and reboot.
This was on 2.4 with Suricata 2.0.5 or something. I haven't dared to try it on 2.5/2.0.6 yet, but perhaps I should open a separate topic for that as it's out of scope for this topic's subject.
Spoiler alert: I have 56 thousand custom rules per interface. It's the urlhaus blocklist, which I haven't updated in a while it seems. If I update now, it would go up to 84k rules :)
-
That is an absurd number of rules to be honest. Are they just IP addresses? If from URLHaus, then I suspect they are actually simple lists of IP addresses/networks to block translated into Suricata/Snort rules syntax. Something like that would be more efficient when used as an IP list in pfBlockerNG-devel or else simply loading a URL table alias in pfSense. Because almost 100% of email traffic today is TLS (and thus encrypted), I can't imagine those rules examining any actual content (meaning data in the packet payloads). That is unless you are doing full MITM proxying of email traffic.
-
It's this list:
https://urlhaus.abuse.ch/downloads/suricata-ids/
-
Did you know that there is now an option under the GLOBAL SETTINGS tab to add your own additional rules URLs? User @viktor_g here on the forums added that new feature a few months ago.
So you could simply copy this URL on that tab as an "additional rules" entry and then Suricata will download the list each time it updates the rules. You would not need to copy all of that text into the Custom Rules dialog, and your config.xml would be considerably smaller as well (since it would no longer need all that Base64-encoded text).