Netgate Discussion Forum

Suricata stops randomly with "stale" PID file.

  • M
    ma0f97
    last edited by ma0f97 Feb 27, 2022, 8:25 PM Feb 27, 2022, 8:23 PM

    Hey, Suricata seems to stop randomly on 2 of my 3 interfaces (LAN and WAN; OPT1, which is WireGuard, is fine). The bad thing is that I don't see any fatal error in the suricata.log file:

    27/2/2022 -- 21:04:00 - <Notice> -- This is Suricata version 6.0.4 RELEASE running in SYSTEM mode
    27/2/2022 -- 21:04:00 - <Info> -- CPUs/cores online: 3
    27/2/2022 -- 21:04:00 - <Info> -- SSSE3 support not detected, disabling Hyperscan for MPM
    27/2/2022 -- 21:04:00 - <Info> -- SSSE3 support not detected, disabling Hyperscan for SPM
    27/2/2022 -- 21:04:00 - <Info> -- HTTP memcap: 67108864
    27/2/2022 -- 21:04:00 - <Info> -- fast output device (regular) initialized: alerts.log
    27/2/2022 -- 21:04:00 - <Info> -- http-log output device (regular) initialized: http.log
    27/2/2022 -- 21:04:00 - <Info> -- Using log dir /var/log/suricata/suricata_vtnet139657
    27/2/2022 -- 21:04:00 - <Info> -- Selected pcap-log compression method: none
    27/2/2022 -- 21:04:00 - <Info> -- using normal logging
    27/2/2022 -- 21:04:00 - <Info> -- eve-log output device (regular) initialized: eve.json
    27/2/2022 -- 21:04:00 - <Info> -- Going to log the md5 sum of email subject
    27/2/2022 -- 21:04:00 - <Warning> -- [ERRCODE: SC_WARN_NO_STATS_LOGGERS(261)] - stats are enabled but no loggers are active
    27/2/2022 -- 21:04:00 - <Info> -- SSSE3 support not detected, disabling Hyperscan for SPM
    27/2/2022 -- 21:04:00 - <Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - previous keyword has a fast_pattern:only; set. Can't have relative keywords around a fast_pattern only content
    [...]
    27/2/2022 -- 21:04:05 - <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'file.pdf&file.ttf' is checked but not set. Checked in 28585 and 1 other sigs
    27/2/2022 -- 21:04:05 - <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'file.ppsx&file.zip' is checked but not set. Checked in 26068 and 1 other sigs
    27/2/2022 -- 21:04:38 - <Info> -- Using 1 live device(s).
    27/2/2022 -- 21:04:38 - <Info> -- using interface vtnet1
    27/2/2022 -- 21:04:38 - <Info> -- running in 'auto' checksum mode. Detection of interface state will require 1000ULL packets
    27/2/2022 -- 21:04:38 - <Info> -- Set snaplen to 1518 for 'vtnet1'
    27/2/2022 -- 21:04:38 - <Info> -- Initializing PCAP ring buffer for /var/log/suricata/suricata_vtnet139657/log.pcap.
    27/2/2022 -- 21:04:38 - <Notice> -- Ring buffer initialized with 4 files.
    27/2/2022 -- 21:04:38 - <Info> -- RunModeIdsPcapAutoFp initialised
    27/2/2022 -- 21:04:38 - <Notice> -- all 4 packet processing threads, 2 management threads initialized, engine started.
    27/2/2022 -- 21:05:08 - <Info> -- No packets with invalid checksum, assuming checksum offloading is NOT used
    

    When I click the start button on the Interfaces tab, it won't start, and the log says:

    27/2/2022 -- 21:12:32 - <Notice> -- This is Suricata version 6.0.4 RELEASE running in SYSTEM mode
    27/2/2022 -- 21:12:32 - <Info> -- CPUs/cores online: 3
    27/2/2022 -- 21:12:32 - <Info> -- SSSE3 support not detected, disabling Hyperscan for MPM
    27/2/2022 -- 21:12:32 - <Info> -- SSSE3 support not detected, disabling Hyperscan for SPM
    27/2/2022 -- 21:12:32 - <Info> -- HTTP memcap: 67108864
    27/2/2022 -- 21:12:32 - <Error> -- [ERRCODE: SC_ERR_INITIALIZATION(45)] - pid file '/var/run/suricata_vtnet139657.pid' exists but appears stale. Make sure Suricata is not running and then remove /var/run/suricata_vtnet139657.pid. Aborting!
    

    Removing the stale PID file lets me start the interface again, but after a few minutes it goes red (stopped) again.
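Before deleting a PID file by hand, it can help to confirm that the PID it names really is gone. Below is a minimal Python sketch of that check, assuming the file holds a single decimal PID. The function name `pid_file_is_stale` is made up for illustration, and this is not necessarily how Suricata itself decides that the file is stale.

```python
import os


def pid_file_is_stale(pid_file: str) -> bool:
    """Return True if pid_file does not name a currently running process.

    Hypothetical helper for illustration only.
    """
    with open(pid_file) as fh:
        text = fh.read().strip()
    try:
        pid = int(text)
    except ValueError:
        # An empty or garbled PID file cannot belong to a live process.
        return True
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return True
    try:
        os.kill(pid, 0)  # Signal 0 only checks for existence; nothing is delivered.
    except ProcessLookupError:
        return True  # No such process: the file is stale.
    except PermissionError:
        return False  # The process exists but belongs to another user.
    return False
```

For example, `pid_file_is_stale("/var/run/suricata_vtnet139657.pid")` returning `True` would suggest the file is safe to remove. PIDs can be reused, so a `False` result only says that some process currently has that PID, not that it is Suricata.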

    I am using Suricata 6.0.4 with blocking mode DISABLED on all interfaces. I use the ET (free), Snort Community and Snort paid rules (and the ones from Suricata itself).

    Can anybody help me? I would appreciate it.

    • S
      SteveITS Galactic Empire @ma0f97
      last edited by Feb 27, 2022, 9:23 PM

      @ma0f97 The .pid is left behind when it crashes. Is there anything in the system log file? Out of memory?

      You may not need to run it on WAN as well as internal networks; scanning on WAN happens before the firewall blocks packets so it will end up scanning a lot of packets that the firewall will immediately discard.

      Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
      When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
      Upvote 👍 helpful posts!

      • M
        ma0f97 @SteveITS
        last edited by Feb 27, 2022, 9:38 PM

        @steveits Ah, good idea to check system.log. I got the following:

        <2>1 2022-02-27T20:57:05.836728+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
        <2>1 2022-02-27T20:57:05.836770+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
        <2>1 2022-02-27T20:57:05.836835+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
        <2>1 2022-02-27T20:57:05.836893+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(1): failed
        <3>1 2022-02-27T20:57:11.928940+01:00 PfSense.pfsense.pve kernel - - - pid 11955 (suricata), jid 0, uid 0, was killed: out of swap space
        <6>1 2022-02-27T20:57:11.929050+01:00 PfSense.pfsense.pve kernel - - - vtnet1: promiscuous mode disabled
        <6>1 2022-02-27T20:57:13.396581+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode enabled
        <3>1 2022-02-27T20:58:40.546557+01:00 PfSense.pfsense.pve kernel - - - pid 37067 (suricata), jid 0, uid 0, was killed: out of swap space
        <3>1 2022-02-27T20:59:33.466802+01:00 PfSense.pfsense.pve kernel - - - pid 97260 (suricata), jid 0, uid 0, was killed: out of swap space
        <3>1 2022-02-27T20:59:36.710826+01:00 PfSense.pfsense.pve kernel - - - pid 75009 (suricata), jid 0, uid 0, was killed: out of swap space
        <6>1 2022-02-27T20:59:36.710867+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode disabled
        <13>1 2022-02-27T21:00:00.263859+01:00 PfSense.pfsense.pve php 57413 - - [pfBlockerNG] Starting cron process.
        <6>1 2022-02-27T21:00:08.488845+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode enabled
        <3>1 2022-02-27T21:00:26.606596+01:00 PfSense.pfsense.pve kernel - - - pid 64071 (suricata), jid 0, uid 0, was killed: out of swap space
        <3>1 2022-02-27T21:00:27.866572+01:00 PfSense.pfsense.pve kernel - - - pid 64398 (suricata), jid 0, uid 0, was killed: out of swap space
        <6>1 2022-02-27T21:00:27.866656+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode disabled
        <6>1 2022-02-27T21:01:08.826772+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode enabled
        <3>1 2022-02-27T21:02:22.354979+01:00 PfSense.pfsense.pve kernel - - - pid 77313 (suricata), jid 0, uid 0, was killed: out of swap space
        <3>1 2022-02-27T21:02:56.766758+01:00 PfSense.pfsense.pve kernel - - - pid 58351 (suricata), jid 0, uid 0, was killed: out of swap space
        <3>1 2022-02-27T21:03:32.886845+01:00 PfSense.pfsense.pve kernel - - - pid 89124 (suricata), jid 0, uid 0, was killed: out of swap space
        <2>1 2022-02-27T21:04:33.786861+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
        <2>1 2022-02-27T21:04:33.786933+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(24): failed
        <2>1 2022-02-27T21:04:33.996622+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
        <2>1 2022-02-27T21:04:33.996771+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(24): failed
        <2>1 2022-02-27T21:04:34.626791+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
        <2>1 2022-02-27T21:04:34.626860+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(24): failed
        <2>1 2022-02-27T21:04:34.626890+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(12): failed
        <2>1 2022-02-27T21:04:34.626918+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(23): failed
        

        I guess it has something to do with swap space. What is the best way to mitigate this? Disable swap? Make swap bigger?
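The kill messages in the log above can also be pulled out programmatically, which makes it easier to see how often the process was killed. Below is a rough Python sketch, assuming the log lines look like the ones pasted above. The names `OomKill`, `KILL_RE` and `find_oom_kills` are hypothetical, and the two sample lines are copied from this post.

```python
import re
from typing import Iterable, List, NamedTuple


class OomKill(NamedTuple):
    timestamp: str
    pid: int
    process: str
    reason: str


# Matches kernel lines such as:
#   <3>1 <timestamp> <host> kernel - - - pid N (name), jid 0, uid 0, was killed: <reason>
KILL_RE = re.compile(
    r"^<\d+>\d+\s+(?P<ts>\S+)\s+\S+\s+kernel\b.*?"
    r"pid (?P<pid>\d+) \((?P<proc>[^)]+)\), jid \d+, uid \d+, "
    r"was killed: (?P<reason>.+)$"
)


def find_oom_kills(lines: Iterable[str]) -> List[OomKill]:
    """Collect every 'was killed' kernel message found in the given log lines."""
    kills = []
    for line in lines:
        m = KILL_RE.match(line.strip())
        if m:
            kills.append(OomKill(m["ts"], int(m["pid"]), m["proc"], m["reason"]))
    return kills


# Two lines copied from the log excerpt in this post.
SAMPLE = [
    "<3>1 2022-02-27T20:57:11.928940+01:00 PfSense.pfsense.pve kernel - - - "
    "pid 11955 (suricata), jid 0, uid 0, was killed: out of swap space",
    "<6>1 2022-02-27T20:57:11.929050+01:00 PfSense.pfsense.pve kernel - - - "
    "vtnet1: promiscuous mode disabled",
]

for kill in find_oom_kills(SAMPLE):
    print(kill.timestamp, kill.process, kill.pid, kill.reason)
```

To scan a real log, pass an open file instead of `SAMPLE`, for example `find_oom_kills(open("/var/log/system.log"))`. That path is an assumption and may differ on your system.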

        • S
          SteveITS Galactic Empire @ma0f97
          last edited by Feb 27, 2022, 9:47 PM

          @ma0f97 Are you on 22.01/2.6? There was a memory leak in the pcscd service. If you aren't using IPsec you can stop it; after that, either upgrade or search the forum threads for a patch that disables it properly in older versions. (It might be in the new System Patches package, if that's available for older versions.)

          If you are using IPsec, then you have to stop IPsec before pcscd and then start IPsec again; otherwise IPsec logs a lot of errors. Or just reboot the router as a temporary fix.

          • M
            ma0f97 @SteveITS
            last edited by Feb 27, 2022, 9:51 PM

            @steveits No, I am on

            2.5.2-RELEASE (amd64)
            built on Fri Jul 02 15:33:00 EDT 2021
            FreeBSD 12.2-STABLE

            so I guess I am not affected.

            I think I might add 2 GiB of swap to the disk, following this manual:
            https://people.freebsd.org/~blackend/en_US.ISO8859-1/books/handbook/adding-swap-space.html

            • S
              SteveITS Galactic Empire @ma0f97
              last edited by Feb 27, 2022, 9:53 PM

              @ma0f97 Sorry, to be clear the pcscd service was disabled (memory leak fixed) in 2.6.

              • M
                ma0f97 @SteveITS
                last edited by Feb 27, 2022, 9:59 PM

                @steveits Hm, when I started the interfaces, swap was at 25% or so, but RAM was maxed out. The moment I disabled pcscd, swap immediately went to 99%.
                Either way, I am upgrading now and hope nothing breaks.

                • B
                  bmeeks
                  last edited by bmeeks Mar 1, 2022, 3:02 PM Feb 28, 2022, 4:11 PM

                  Something is chewing up your RAM, most likely the pcscd daemon as @SteveITS mentioned. That issue is solved in the latest pfSense release.

                  The stale PID file is a symptom of another issue, not a problem in and of itself.

                  When your system runs out of available RAM, data within memory (RAM hardware) that is not actively being accessed is written out to a special disk area called swap space. This makes room in hardware RAM for immediate needs. But on the next context switch, some of that data written out to swap has to be read back in. That whole disk I/O business makes your box super sluggish.

                  You essentially NEVER want to see any swap space in use except in very extreme and rare temporary conditions. But if something is chewing up RAM and not releasing it, then the box runs out of physical RAM and starts using the swap area as a fallback. When it then also runs out of swap space, it's "game over" ... 😞.

                  • M
                    ma0f97 @bmeeks
                    last edited by Mar 1, 2022, 9:51 AM

                    @bmeeks Hello, thanks for the detailed explanation. I updated pfSense (nothing broke 😮) and also gave my machine 1 GiB more RAM, and the interfaces are now stable and haven't crashed a single time! The only thing that puzzles me now is why my Proxmox PVE (a VM management OS) showed that only half of the available RAM was in use when pfSense showed 99%. When I run top and look at the Mem stats, I see that the memory in use matches what is reported to Proxmox, but there is an additional portion (about the same size) of "laundry" memory in use, whatever that means.

                    Anyway, the problem I described is now solved. Thanks again, guys.

                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.