Suricata 7.0.2 service stop problem
-
Was that STOP command in the logs at 22:16:30 something that you initiated? Do you know where that command came from?
When Suricata stops it can cause the physical interface to bounce, especially if using Inline IPS Mode with the netmap device. In your case that bounce might be triggering CARP actions, and the automatic restart you see might just be a consequence of CARP and the "restart all packages" command being issued by pfSense.
-
Yes, that was a "/usr/local/etc/rc.d/suricata.sh stop" command issued by me.
"When Suricata stops it can cause the physical interface to bounce" -- hm... now that you mention it, I've done a few service stops on several systems and it does seem that only physical interfaces are affected; if Suricata runs on a VLAN interface it can be stopped without any problems. Still, this is weird: it's not OK for an interface on a firewall to go down, even for a short time, just because a service has stopped. This is clearly not in the Suricata rc script. BTW I'm using "workers" run mode, but only in IDS mode. Maybe "live swap" is also a setting that might have something to do with this behaviour. However, I can't find any direct cause-and-effect relationship between those settings and the interface-down events at service stop. I wish I could go to VLAN interfaces everywhere, but I can't.
-
The bouncing of the interface is not something that's part of the pfSense package. That is a behavior of the binary from upstream. This is true for any interface running with the netmap device for IPS. The netmap device code in the operating system kernel bounces the interface when netmap is started and stopped on an interface such as when Suricata itself is stopped or started.
It appears this cycling can also happen even when using PCAP mode with IDS-only or Legacy Blocking Mode.
Suricata stops and restarts for two reasons: (1) if instructed by the admin via the GUI options; or (2) at the end of a scheduled rules update job unless the Live Rule Swap option is enabled.
-
@bmeeks Thanks for the insight; it goes much deeper than it first seemed.
-
Hi All,
I'm having the opposite problem. I have Suricata running on the LAN interface in Legacy Mode. Since updating to 7.0.2 I can't get it to stay running. The interface keeps shutting down. Here are some of the more recent log entries that were generated:
[101114 - Suricata-Main] 2023-11-28 09:29:50 Info: detect: 2 rule files processed. 45807 rules successfully loaded, 137 rules failed
[101114 - Suricata-Main] 2023-11-28 09:29:50 Info: threshold-config: Threshold config parsed: 0 rule(s) found
[101114 - Suricata-Main] 2023-11-28 09:29:51 Info: detect: 45814 signatures processed. 1175 are IP-only rules, 6877 are inspecting packet payload, 31123 inspect application layer, 108 are decoder event only
[101114 - Suricata-Main] 2023-11-28 09:29:51 Warning: detect-flowbits: flowbit 'ET.smb.binary' is checked but not set. Checked in 2027402 and 4 other sigs
[101114 - Suricata-Main] 2023-11-28 09:29:51 Warning: detect-flowbits: flowbit 'file.zip&file.silverlight' is checked but not set. Checked in 28582 and 2 other sigs
[101114 - Suricata-Main] 2023-11-28 09:29:51 Warning: detect-flowbits: flowbit 'file.pdf&file.ttf' is checked but not set. Checked in 28585 and 1 other sigs
[101114 - Suricata-Main] 2023-11-28 09:29:51 Warning: detect-flowbits: flowbit 'file.xls&file.ole' is checked but not set. Checked in 30990 and 1 other sigs
[101114 - Suricata-Main] 2023-11-28 09:29:51 Warning: detect-flowbits: flowbit 'file.onenote' is checked but not set. Checked in 61666 and 1 other sigs
[101114 - Suricata-Main] 2023-11-28 09:30:39 Info: runmodes: Using 1 live device(s).
[294025 - RX#01-igb1] 2023-11-28 09:30:39 Info: pcap: igb1: running in 'auto' checksum mode. Detection of interface state will require 1000 packets
[294025 - RX#01-igb1] 2023-11-28 09:30:39 Info: pcap: igb1: snaplen set to 1518
[101114 - Suricata-Main] 2023-11-28 09:30:40 Notice: threads: Threads created -> RX: 1 W: 24 FM: 1 FR: 1 Engine started.
[294025 - RX#01-igb1] 2023-11-28 09:30:40 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[294046 - W#21] 2023-11-28 09:52:36 Error: spm-hs: Hyperscan returned fatal error -1.
I'm running pfSense 2.7.0 and Suricata 7.0.2 on a Dell R410 with 2x Intel X5675 CPU and 65GB of RAM. Any pointers will be appreciated.
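In case it helps anyone digging through a long suricata.log for lines like the Hyperscan one above, here is a minimal Python sketch that pulls out entries by severity. It assumes every line follows the "[id - thread] date time Level: message" layout seen in the excerpt above; the names LogEntry, parse_entries and entries_at_level are made up for this example and are not part of Suricata or the pfSense package.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class LogEntry(NamedTuple):
    thread_id: int
    thread_name: str
    timestamp: str
    level: str
    message: str


# Assumed layout, taken from the excerpt in this thread:
# [101114 - Suricata-Main] 2023-11-28 09:29:50 Info: detect: ...
LINE_RE = re.compile(
    r"^\[(?P<tid>\d+) - (?P<name>[^\]]+)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Za-z]+): (?P<msg>.*)$"
)


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that matches the assumed layout."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # lines in any other shape are silently skipped
            yield LogEntry(int(m["tid"]), m["name"], m["ts"], m["level"], m["msg"])


def entries_at_level(lines: Iterable[str],
                     levels: Tuple[str, ...] = ("Error",)) -> List[LogEntry]:
    """Return only the entries whose severity is in `levels`."""
    return [e for e in parse_entries(lines) if e.level in levels]


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py /path/to/suricata.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for entry in entries_at_level(fh, ("Error", "Warning")):
            print(entry.timestamp, entry.thread_name, entry.level, entry.message)
```

The log path is left as a command-line argument on purpose, since it differs per interface.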
-
@bmeeks: I think bouncing could be avoided by just killing (-9) Suricata. If the rc_stop() function in /usr/local/etc/rc.d/suricata.sh were modified to send KILL instead of TERM, there would be no bouncing when stopping the service. Without the bouncing, Suricata could be used on the WAN interface with no associated WAN-interface-down actions: no CARP failover, no rc.newwanip, no rc.start_packages, and no more unexpected Suricata restart caused by rc.start_packages silently restarting it right after the user stopped the service. I know killing is a bit of a harsh approach, but losing the WAN interface, even for a second, because the Suricata service was stopped manually is very weird, and seeing Suricata running when you have just stopped it is also unexpected. What is your view on this approach?
BR
Robert
-
Refresh my memory here -- this thread has picked up another poster and now I'm not sure who was asking what.
Are you running with a CARP configuration? If so, Suricata has never been tested or certified to run in that mode. One issue you would have is that "state" is not copied over for Suricata. So, Suricata's tracking of flows and TCP sessions would not seamlessly transition with a CARP failover the same as the pf firewall does. You would very possibly experience dropped traffic because the new Suricata instance would not recognize the existing flows from the previous CARP partner.
I have not seen the behavior you describe with Suricata in my test environment (although I do not use it, nor have I ever tested it, with CARP).
Killing the binary is quite harsh and may well result in certain cleanup actions not happening, which could make a subsequent restart attempt from the GUI fail.
-
@bmeeks: Let's forget CARP for now, it's not important. If you have a setup with a physical WAN interface (no additional layer, like LACP or VLAN) and you happen to run Suricata on that WAN interface, then there is a pretty good chance that the Suricata service cannot be stopped by the usual means (webgui service stop button or suricata.sh stop in the CLI). When Suricata is stopped, the service receives a TERM signal and starts a normal shutdown procedure. That shutdown seems to make the netmap device code in the kernel bounce the WAN interface, which leads to a bunch of other things, like firing up rc.newwanip and rc.start_packages. Subsequently, rc.start_packages restarts Suricata, and you find yourself back in the initial state. In the process you have lost your WAN for several seconds, caused a massive CPU spike, and achieved nothing. In the end the service still runs, not to mention the confusion of the user who realises that Suricata, stopped a minute ago, is running again.
-
@RobertK-1 said in Suricata 7.0.2 service stop problem:
If you have a setup with physical WAN interface (no additional layer, like LACP, VLAN) and you happen to run Suricata on that WAN interface, then you have a pretty good chance that Suricata service cannot be stopped by the usual means (webgui service stop button or suricata.sh stop in CLI).
This is not true at all for most users, or else I would expect the forum to be flooded with posts similar to yours. Not saying you are not seeing the issue, but I am saying it appears specific to your setup rather than being a generic issue with the package code.
I run Suricata on the WAN all the time in my test virtual machines so that I can easily throw nmap attack scans at the test firewall. It works just fine and immediately shuts down cleanly using the GUI icons on the INTERFACES tab. I run it both in Legacy Mode and Inline IPS Mode - swapping back and forth between them for different tests.
Do you by chance have (or have you had in the past) the Service Watchdog package configured to monitor Suricata? If so, that will most definitely cause your problem.
-
@bmeeks: Thanks for the tip. Watchdog is not (and was not) installed. Funny thing is: now that I "need" it, I can't reproduce this problem. It's a rare occasion when I have to stop Suricata for some reason, but I remember seeing the problem last weekend on one of our firewalls. But not now. I've tried all of them through Ansible (suricata.sh stop) and every Suricata instance stopped as it should. However, there have been slight changes since the weekend: hardware checksum offloading was turned off everywhere, all units were rebooted, Suricata was upgraded from 7.0.2_3 to 7.0.3, and there were some Suppress list changes (not that those count). Strange, but basically good.
-
@bmeeks -- I've managed to reproduce the problem, and essentially, it's a user error on my part. It turned out that having the hardware checksum offloading option enabled caused the interface to bounce during Suricata service stop. If the interface happens to be a WAN interface, then rc.newwanip and rc.start_packages come into play, resulting in a silent Suricata restart. Following best practice and keeping hardware checksum offloading disabled can help avoid all of these issues.
-
Thank you for the feedback. The info you provided helped me piece together what is going on.
Some time back Suricata upstream made some changes around the PCAP packet capture mode of operation. PCAP is used in Legacy Blocking Mode with Suricata. When Suricata attempts to bring up a configured interface, it checks for certain options being enabled and attempts to disable them when necessary. Hardware offloading is one of those options. When you disable interface options, FreeBSD will cycle the interface. That leads to the double-start (or apparent restart) issue you were seeing.
It's not something the GUI package code is doing. It's something the Suricata binary used from OISF upstream is doing. Turning off all those NIC hardware offloading options within pfSense is the best fix as then the Suricata binary sees they are not enabled, so it does not attempt to disable them.
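For anyone who wants to double-check which offloads are still active on an interface, here is a rough Python sketch that reads the capability list FreeBSD's ifconfig prints on its "options=" line. The function name enabled_offloads, the set of flag names treated as offloads, and the sample text are choices made for this example; the sample is hand-written for illustration and was not captured from a real box.

```python
import re
from typing import Set

# Capability names that FreeBSD's ifconfig uses for checksum, TSO and LRO
# offloading. Which of these matter for Suricata is an assumption here.
OFFLOAD_FLAGS = {
    "RXCSUM", "TXCSUM", "RXCSUM_IPV6", "TXCSUM_IPV6",
    "TSO4", "TSO6", "LRO",
}

# Match the interface "options=<hex><FLAG,FLAG,...>" line, but not the
# separate "nd6 options=..." line that ifconfig also prints.
OPTIONS_RE = re.compile(r"^\s*options=[0-9a-fA-F]+<([^>]*)>", re.MULTILINE)


def enabled_offloads(ifconfig_output: str) -> Set[str]:
    """Return the offload capabilities listed as enabled in ifconfig output."""
    m = OPTIONS_RE.search(ifconfig_output)
    if not m:
        return set()
    flags = {f.strip() for f in m.group(1).split(",") if f.strip()}
    return flags & OFFLOAD_FLAGS


# Hand-written sample for illustration only (hex values are made up).
SAMPLE = (
    "igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=1<RXCSUM,TXCSUM,VLAN_MTU,TSO4,LRO>\n"
    "\tnd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>\n"
)

if __name__ == "__main__":
    print(sorted(enabled_offloads(SAMPLE)))
```

Feed it the text of "ifconfig igb1" (or whichever interface Suricata runs on). An empty result would suggest there is nothing left for the binary to try to disable at startup.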