Suricata spawning 2 processes ?
-
You have a duplicate zombie process. There are a few ways this can happen. None of them are "normal".
First, some folks try to run Service Watchdog and have it monitor Suricata (or Snort). That is a big no-no. Service Watchdog does not understand how to properly monitor the IDS/IPS packages and will needlessly try to issue restart commands when the packages are already in the middle of restarting themselves (after a periodic check/update for new rules). This results in two copies of the same instance running on an interface.
The second possibility is that something caused the built-in pfSense command "restart all packages" to run more than once in quick succession. That can also, under some circumstances, result in a duplicate process on the same interface.
No matter how the duplicate got there, you need to kill it. Go to the INTERFACES tab in Suricata and stop all running instances. Now exit to a shell prompt either directly on the firewall or via an SSH session. When the pfSense CLI menu appears, choose option "8" to exit to a shell. Run this command sequence:
ps -ax | grep suricata
The first command will return a list of Process IDs (<pid>) of any remaining Suricata processes. In your case, I would expect it to show just one (since one of the duplicates should have been killed by the GUI command). Note the Process ID of the still running Suricata process.
Run this second command to kill the zombie process:
kill -9 <pid>
Exit the shell session, return to the GUI, and then restart your Suricata instances on the INTERFACES tab in Suricata.
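To spot a duplicate without reading the whole ps listing by eye, the check above can be sketched as a small filter. This is only an illustration and not part of the Suricata package: the function name find_duplicate_suricata is made up here, and it assumes each input line looks like "PID COMMAND ..." (for example from ps -ax -o pid,command) with the interface name following -i, as in the process listing posted further down this thread.

```sh
# Hypothetical helper: read "PID COMMAND ..." lines on stdin and print every
# interface that has more than one Suricata process, followed by the PIDs.
find_duplicate_suricata() {
    awk '
        /bin\/suricata / {
            # Walk the arguments and pick up the interface named after "-i".
            for (i = 2; i < NF; i++) {
                if ($i == "-i") {
                    iface = $(i + 1)
                    count[iface]++
                    pids[iface] = pids[iface] " " $1
                }
            }
        }
        END {
            for (iface in count)
                if (count[iface] > 1)
                    print iface ":" pids[iface]
        }
    '
}
```

On the firewall it could be fed with ps -ax -o pid,command | find_duplicate_suricata. It prints nothing when every interface has a single process, and it only reports; killing the extra PID is still the manual step described above.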
-
@bmeeks Yep, I've done that several times.
I've killed Suricata from the gui, then killed the process from the cmd line. Then I've gone back in and started suricata on the interface.
I've also just plain rebooted the firewall. 1 process per interface will start, and then for whatever reason, not sure when, I will get a duplicate process on igb3. Never on igb1, just igb3.
I'm not using Watchdog for Suricata. I do have it setup for unbound though, but I'm not getting any emails saying watchdog has restarted anything.
Even more perplexing.... today, before posting this message, I stopped suricata on igb3 from the gui. Then I went to the cmd line and killed the process that showed from ps -ax|grep igb3 (was only 1 suricata process at this point).
I figured I'd leave it alone and circle back around. It looks like it has restarted for igb3. I didn't do anything; it restarted on its own.
Which means something is triggering it? It could be the restart all packages, but what would cause that? And why is it only igb3? lol
-
Post the contents of your pfSense system log around these time frames. Let's see what is running on the firewall during that time interval.
Suricata does not restart itself except during rules updates, and then only if the interface was already running. The rules update process saves the current state of running Suricata instances, and only restarts those that were running when the rules update task started. Outside of that, nothing within Suricata restarts itself.
Note though, that within the GUI, stopping an interface is not the same as disabling an interface. A "user manually stopped" interface will still restart if the "restart all packages" command discussed below fires. That's because there is a shell script that includes steps to restart all Suricata interfaces that are not actually disabled via the GUI. That shell script is called when the firewall is rebooted or when the "restart all packages" command executes within pfSense.
The firewall has its own system that detects certain types of events and then issues a "restart all packages" command if one of those events is triggered. Typically that includes things like a physical network interface cycling such as a new DHCP address on the WAN perhaps, a cable hotplug event, or dpinger thinking a monitored gateway went offline and then came back.
-
@bmeeks Ahhh, I think you might be spot on with the restart all packages.
At 11:24 am I stopped suricata on igb3. At 11:26 I installed acme and haproxy (I'm in the process of setting up a self-hosted Bitwarden).
Jan 9 11:26:21 fw pkg-static[87396]: pfSense-pkg-acme-0.7.3 installed
Jan 9 11:26:23 fw check_reload_status[421]: Reloading filter
Jan 9 11:26:23 fw check_reload_status[421]: Starting packages
Jan 9 11:26:24 fw php-fpm[382]: /rc.start_packages: Restarting/Starting all packages.
So that is what did it. I stopped Suricata at 4:19pm after the last post, and it hasn't restarted yet. So I'm going to start Suricata back on igb3 and monitor.
If another process spawns, I'll get the time of that process from ps and try to trace through the logs and see if I can see anything.
Thank you for your time!
-
Yes, installing the additional package caused pfSense to issue the "restart all packages" command. That command, in turn, calls the shell script Suricata registered with the system when it was installed as a Service. That shell script contains steps to start ALL the enabled Suricata interfaces.
The only way for an interface to NOT be listed in the shell script is if the interface is disabled in the Suricata GUI. So a manually stopped interface that is not also disabled will get started again when that shell script executes.
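To check whether a "restart all packages" run lines up with the time a duplicate appeared, the system log can be filtered down to the relevant entries. The sketch below is only an illustration: the function name show_package_restart_triggers is made up, and the patterns are taken from log lines quoted in this thread (rc.start_packages, check_reload_status "Starting packages", rc.linkup, rc.newwanip, and pkg-static install messages), so other trigger messages may exist that it does not catch.

```sh
# Hypothetical helper: read syslog text on stdin and keep only the lines that
# show pfSense (re)starting all packages, or an event seen in this thread
# shortly before such a restart.
show_package_restart_triggers() {
    grep -E 'rc\.start_packages|check_reload_status\[[0-9]+\]: Starting packages|rc\.linkup|rc\.newwanip|pkg-static\[[0-9]+\]: .* installed'
}
```

Assuming the system log is the plain-text file /var/log/system.log (an assumption about your pfSense version), it could be run as show_package_restart_triggers < /var/log/system.log and the timestamps compared with the process start times.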
-
@bmeeks Ok, so as suspected, it happened again last night. It happens when it's loading a new set of rules, but I can't figure out ... why?
[22.05-RELEASE][admin@fw.xxxx.xxxxx]/var/log: ps -ax |grep suricata
78240 - Ss 14:39.46 /usr/local/bin/suricata -i igb3 -D -c /usr/local/etc/suricata/suricata_56281_igb3/suricata.yaml --pidfile /var/run/suricata_igb356281.pid
79017 - SNs 14:37.48 /usr/local/bin/suricata -i igb3 -D -c /usr/local/etc/suricata/suricata_56281_igb3/suricata.yaml --pidfile /var/run/suricata_igb356281.pid
79159 - SNs 15:36.95 /usr/local/bin/suricata -i igb1 -D -c /usr/local/etc/suricata/suricata_5377_igb1/suricata.yaml --pidfile /var/run/suricata_igb15377.pid
[22.05-RELEASE][admin@fw.xxxx.xxxxx]/var/log: ps -p 78240 -o lstart
STARTED
Tue Jan 10 03:30:46 2023
[22.05-RELEASE][admin@fw.xxxx.xxxxx]/var/log: ps -p 79017 -o lstart
STARTED
Tue Jan 10 03:30:55 2023
[22.05-RELEASE][admin@fw.xxxx.xxxxx]/var/log: ps -p 79159 -o lstart
STARTED
Tue Jan 10 03:30:37 2023
Jan 10 03:30:13 fw php[65980]: [Suricata] There is a new set of Emerging Threats Open rules posted. Downloading emerging.rules.tar.gz...
Jan 10 03:30:15 fw php[65980]: [Suricata] Emerging Threats Open rules file update downloaded successfully.
Jan 10 03:30:15 fw php[65980]: [Suricata] Snort VRT rules are up to date...
Jan 10 03:30:15 fw php[65980]: [Suricata] Snort GPLv2 Community Rules are up to date...
Jan 10 03:30:16 fw php[65980]: [Suricata] Updating rules configuration for: WAN ...
Jan 10 03:30:18 fw php[65980]: [Suricata] Enabling any flowbit-required rules for: WAN...
Jan 10 03:30:18 fw php[65980]: [Suricata] Building new sid-msg.map file for WAN...
Jan 10 03:30:19 fw php[65980]: [Suricata] Updating rules configuration for: LAN ...
Jan 10 03:30:21 fw php[65980]: [Suricata] Enabling any flowbit-required rules for: LAN...
Jan 10 03:30:21 fw php[65980]: [Suricata] Building new sid-msg.map file for LAN...
Jan 10 03:30:22 fw php[65980]: [Suricata] Suricata STOP for LAN(igb1)...
Jan 10 03:30:23 fw check_reload_status[421]: Linkup starting igb1
Jan 10 03:30:23 fw kernel: igb1: link state changed to DOWN
Jan 10 03:30:23 fw kernel: igb1: promiscuous mode disabled
Jan 10 03:30:24 fw php-fpm[98151]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1 )
Jan 10 03:30:24 fw check_reload_status[421]: Reloading filter
Jan 10 03:30:26 fw check_reload_status[421]: Linkup starting igb1
Jan 10 03:30:26 fw kernel: igb1: link state changed to UP
Jan 10 03:30:27 fw php-fpm[383]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1 )
Jan 10 03:30:27 fw check_reload_status[421]: rc.newwanip starting igb1
Jan 10 03:30:27 fw check_reload_status[421]: Reloading filter
Jan 10 03:30:28 fw php-fpm[383]: /rc.newwanip: rc.newwanip: Info: starting on igb1.
Jan 10 03:30:28 fw php-fpm[383]: /rc.newwanip: rc.newwanip: on (IP address: 192.168.1.1) (interface: LAN[lan]) (real interface: igb1).
Jan 10 03:30:33 fw php-fpm[383]: /rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through XX.XX.XX.XX
Jan 10 03:30:33 fw php-fpm[383]: /rc.newwanip: Gateway, NONE AVAILABLE
Jan 10 03:30:36 fw dhcpleases[41702]: Could not deliver signal HUP to process 53949: No such process.
Jan 10 03:30:37 fw php[65980]: [Suricata] Suricata START for LAN(igb1)...
Jan 10 03:30:37 fw php[65980]: [Suricata] Suricata has restarted with your new set of rules for LAN...
Jan 10 03:30:37 fw php[65980]: [Suricata] Updating rules configuration for: IOT ...
Jan 10 03:30:37 fw php-fpm[383]: /rc.newwanip: Resyncing OpenVPN instances for interface LAN.
Jan 10 03:30:37 fw php-fpm[383]: /rc.newwanip: Creating rrd update script
Jan 10 03:30:39 fw php-fpm[383]: /rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - 192.168.1.1 -> 192.168.1.1 - Restarting packages.
Jan 10 03:30:39 fw check_reload_status[421]: Starting packages
Jan 10 03:30:39 fw php[65980]: [Suricata] Enabling any flowbit-required rules for: IOT...
Jan 10 03:30:40 fw php[65980]: [Suricata] Building new sid-msg.map file for IOT...
Jan 10 03:30:40 fw php[65980]: [Suricata] Suricata STOP for IoT VLAN(igb3)...
Jan 10 03:30:40 fw php-fpm[82855]: /rc.start_packages: Restarting/Starting all packages.
Jan 10 03:30:40 fw kernel: igb3: promiscuous mode disabled
Jan 10 03:30:45 fw tail_pfb[69885]: [pfBlockerNG] Firewall Filter Service stopped
Jan 10 03:30:45 fw lighttpd_pfb[69553]: [pfBlockerNG] DNSBL Webserver stopped
Jan 10 03:30:45 fw php_pfb[70176]: [pfBlockerNG] filterlog daemon stopped
Jan 10 03:30:45 fw lighttpd_pfb[71840]: [pfBlockerNG] DNSBL Webserver started
Jan 10 03:30:46 fw tail_pfb[75087]: [pfBlockerNG] Firewall Filter Service started
Jan 10 03:30:46 fw php[73292]: [pfBlockerNG] DNSBL parser daemon started
Jan 10 03:30:46 fw check_reload_status[421]: Rewriting resolv.conf
Jan 10 03:30:46 fw php[75434]: [pfBlockerNG] filterlog daemon started
Jan 10 03:30:46 fw SuricataStartup[77795]: Suricata START for IoT VLAN(56281_igb3)...
Jan 10 03:30:55 fw php[65980]: [Suricata] Suricata START for IoT VLAN(igb3)...
Jan 10 03:30:55 fw php[65980]: [Suricata] Suricata has restarted with your new set of rules for IOT...
Jan 10 03:30:55 fw php[65980]: [Suricata] The Rules update has finished.
Jan 10 03:31:59 fw kernel: igb3: promiscuous mode enabled
Jan 10 03:32:28 fw check_reload_status[421]: Linkup starting igb1
Jan 10 03:32:28 fw kernel: igb1: link state changed to DOWN
Jan 10 03:32:28 fw kernel: igb1: promiscuous mode enabled
Jan 10 03:32:29 fw php-fpm[382]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1 )
Jan 10 03:32:29 fw check_reload_status[421]: Reloading filter
Jan 10 03:32:30 fw check_reload_status[421]: Linkup starting igb1
Jan 10 03:32:30 fw kernel: igb1: link state changed to UP
Jan 10 03:32:31 fw php-fpm[98151]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1 )
Jan 10 03:32:31 fw check_reload_status[421]: rc.newwanip starting igb1
Jan 10 03:32:31 fw check_reload_status[421]: Reloading filter
Jan 10 03:32:32 fw php-fpm[98151]: /rc.newwanip: rc.newwanip: Info: starting on igb1.
Jan 10 03:32:32 fw php-fpm[98151]: /rc.newwanip: rc.newwanip: on (IP address: 192.168.1.1) (interface: LAN[lan]) (real interface: igb1).
Jan 10 03:32:37 fw php-fpm[98151]: /rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through XX.XX.XX.XX
Jan 10 03:32:37 fw php-fpm[98151]: /rc.newwanip: Gateway, NONE AVAILABLE
Jan 10 03:32:40 fw dhcpleases[78518]: Could not deliver signal HUP to process 62520: No such process.
Jan 10 03:32:41 fw php-fpm[98151]: /rc.newwanip: Resyncing OpenVPN instances for interface LAN.
Jan 10 03:32:41 fw php-fpm[98151]: /rc.newwanip: Creating rrd update script
Jan 10 03:32:43 fw php-fpm[98151]: /rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - 192.168.1.1 -> 192.168.1.1 - Restarting packages.
Jan 10 03:32:43 fw check_reload_status[421]: Starting packages
Jan 10 03:32:44 fw php-fpm[383]: /rc.start_packages: Restarting/Starting all packages.
Jan 10 03:32:49 fw lighttpd_pfb[11204]: [pfBlockerNG] DNSBL Webserver stopped
Jan 10 03:32:49 fw tail_pfb[11298]: [pfBlockerNG] Firewall Filter Service stopped
Jan 10 03:32:49 fw php_pfb[11314]: [pfBlockerNG] filterlog daemon stopped
Jan 10 03:32:49 fw lighttpd_pfb[14435]: [pfBlockerNG] DNSBL Webserver started
Jan 10 03:32:49 fw tail_pfb[15304]: [pfBlockerNG] Firewall Filter Service started
Jan 10 03:32:49 fw check_reload_status[421]: Rewriting resolv.conf
Jan 10 03:32:50 fw php[15872]: [pfBlockerNG] DNSBL parser daemon started
Jan 10 03:32:50 fw php[15574]: [pfBlockerNG] filterlog daemon started
Jan 10 03:56:00 fw sshguard[18912]: Exiting on signal.
Jan 10 03:56:00 fw sshguard[257]: Now monitoring attacks.
I'm still digging. I wonder if it has to do with the Dynamic DNS stuff I have set up? I use Namecheap and am using Dynamic DNS to keep my DNS records updated.
Just thinking out loud =/
-
@wangel:
I am guessing you are running Inline IPS Mode. The reason I say that is when Suricata is stopping/starting as part of the rules update process I see your interfaces going down and coming back up. That is a side effect of Suricata using the netmap device for inline IPS.
Notice how the interface going down and then coming back up is triggering additional pfSense actions. I suspect those may be at the root of the multiple instances on the same interface.
This line in your log snippet is probably the culprit.
Jan 10 03:30:39 fw check_reload_status[421]: Starting packages
I've put a number of attempted safeguards in the Suricata code over the years to try and stop this, but nothing has been 100% effective.
This can probably help, though. It will cause Suricata to swap out the old rules and swap in the new ones without restarting. If Suricata does not restart, then it won't cycle the interface.
Go to the GLOBAL SETTINGS tab in Suricata, scroll down to the Rules Update section, find the Live Rule Swap on Update checkbox and click it to enable live swap updates. Save that change and restart Suricata on all interfaces. You will need to restart them all for the change to become effective.
That change will cause Suricata to use more RAM during the updates because it will now keep two copies of the rules in RAM for a bit during the swap. One copy will be the old rules, and the other is the new set of rules. Once Suricata completes the swap out, RAM usage will go back to normal as the old rules are discarded.
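To see which component started Suricata on each interface, and when, the START entries can be pulled out of the log. This is a rough sketch under stated assumptions: the function name list_suricata_starts is made up, and the parsing relies on the syslog layout shown in the excerpt above (timestamp in the third field, "name[pid]:" in the fifth, and the interface in parentheses at the end of the line).

```sh
# Hypothetical helper: read syslog text on stdin and print one line per
# "Suricata START" entry as: <interface> <time> <issuing process>.
list_suricata_starts() {
    awk '
        /Suricata START for/ {
            issuer = $5
            sub(/\[.*/, "", issuer)       # "php[65980]:" becomes "php"
            iface = $NF
            sub(/^.*\(/, "", iface)       # drop everything up to "("
            sub(/\).*$/, "", iface)       # drop the trailing ")..."
            sub(/^[0-9]+_/, "", iface)    # "56281_igb3" becomes "igb3"
            print iface, $3, issuer
        }
    '
}
```

Run against the excerpt posted above, this gives two starts for igb3, one from SuricataStartup at 03:30:46 and one from php at 03:30:55, which are the same start times ps reported for the two igb3 processes. The SuricataStartup entry follows the rc.start_packages line at 03:30:40, so it presumably comes from the package restart, while the php entry comes from the rules update.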
-
@bmeeks I'll give that a shot and see what happens.
I'm actually running Suricata in legacy mode. Inline doesn't seem to work for me. Well, it will work for a little bit, then just totally kill my connection and I have to reboot the firewall.
-
If you are using Legacy Mode, then what I described won't be of any benefit. Suricata does not cycle the interfaces in Legacy Mode. It won't hurt to enable Live Rule Swap, but it won't do anything (I will be extremely surprised if it does help).
You need to figure out why your NICs are disconnecting. Those Link and Hot Plug events should not be happening in normal operation.
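A quick way to see how often each NIC is dropping link is to count the kernel's link-down messages per interface. The sketch below is illustrative only: the function name count_link_flaps is made up, and it assumes the kernel message format seen in the log excerpt earlier in this thread ("kernel: igb1: link state changed to DOWN").

```sh
# Hypothetical helper: read syslog text on stdin and print, per interface,
# how many "link state changed to DOWN" kernel messages were logged.
count_link_flaps() {
    awk '
        /kernel: .*: link state changed to DOWN/ {
            iface = $6
            sub(/:$/, "", iface)          # "igb1:" becomes "igb1"
            down[iface]++
        }
        END {
            for (iface in down)
                print iface, down[iface]
        }
    ' | sort
}
```

Comparing the timestamps of those DOWN lines with whatever else was logged just before them is the next step in tracking down what is bouncing the interface.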
-
@bmeeks Got it figured out. It was 100% the Dynamic DNS updater.
No idea why it was bouncing my local interfaces when it thought my WAN IP changed, but that appears to be what was causing the issue. I've disabled it and haven't seen a 2nd process spawn yet.
Will keep monitoring it.
On a side note, I would love to use Inline mode, and I have an Intel card, an Intel I340 to be exact. I have 500 Mbps down and 20 Mbps up. Like I said, Inline will work for "awhile" and then just poo itself. I have to power cycle the firewall to get the interfaces to work again.
I just assumed it was the card/bad driver ? I'd like to switch to inline but I can't go rebooting the firewall every couple of days, hehe.
Thanks!
-
That behavior is similar to what has been reported by OPNsense users of Suricata in IPS mode. The problem was a big issue with them in a recent OPNsense update that included Suricata 6.0.9. It was so bad that the OPNsense team completely stripped out the netmap improvements that were part of 6.0.9. There is an open Suricata Redmine bug on the issue here: https://redmine.openinfosecfoundation.org/issues/5744.
There are also a number of open bug reports that reference the igb network driver in FreeBSD. Those are listed here: https://bugs.freebsd.org/bugzilla/buglist.cgi?quicksearch=igb.
I'm not sure at this point if the issue is within the Suricata binary or netmap within FreeBSD itself. You are the second pfSense user I have heard mention the "hang up" issue.
-
@bmeeks Very Very interesting. At least .... I'm not alone!? LOL
I bought an Intel NIC because everyone says "they are the best to use with pfsense" ... which I don't doubt, but I do want to use inline mode to take that load off my CPU.
Is there a recommended card ? I need a 4 port nic. I guess the "hang up" is a hit or miss thing.
I have another 4 port Intel NIC that I could try. It's the PRO/1000 ET(2)
The only difference is the ET(2) has IPsec offloading and SR-IOV.
-
With netmap, any type of offloading on the NIC is a bad idea. It seldom works right. And the Inline IPS Mode uses the FreeBSD netmap device for the traffic drop mechanism.
I'm not 100% sure the issue is restricted to only certain Intel NICs. Several of the OPNsense users were running virtualized, if I recall correctly, and were still having the issue.
With FreeBSD, the older the NIC the better supported it is likely to be. The upcoming pfSense Plus 23.01 release uses FreeBSD 14.0-CURRENT, so it will have the best support for newer hardware. The current release versions of pfSense are based on FreeBSD 12.3-STABLE.