Netgate Discussion Forum

    Suricata stops randomly with "stale" PID file.

    IDS/IPS
    9 Posts 3 Posters 1.7k Views
    • M
      ma0f97
      last edited by ma0f97

      Hey, Suricata seems to stop randomly on 2 of my 3 interfaces (LAN and WAN; OPT1, which is WireGuard, is fine). The bad thing is that I don't see any fatal error in the suricata.log file:

      27/2/2022 -- 21:04:00 - <Notice> -- This is Suricata version 6.0.4 RELEASE running in SYSTEM mode
      27/2/2022 -- 21:04:00 - <Info> -- CPUs/cores online: 3
      27/2/2022 -- 21:04:00 - <Info> -- SSSE3 support not detected, disabling Hyperscan for MPM
      27/2/2022 -- 21:04:00 - <Info> -- SSSE3 support not detected, disabling Hyperscan for SPM
      27/2/2022 -- 21:04:00 - <Info> -- HTTP memcap: 67108864
      27/2/2022 -- 21:04:00 - <Info> -- fast output device (regular) initialized: alerts.log
      27/2/2022 -- 21:04:00 - <Info> -- http-log output device (regular) initialized: http.log
      27/2/2022 -- 21:04:00 - <Info> -- Using log dir /var/log/suricata/suricata_vtnet139657
      27/2/2022 -- 21:04:00 - <Info> -- Selected pcap-log compression method: none
      27/2/2022 -- 21:04:00 - <Info> -- using normal logging
      27/2/2022 -- 21:04:00 - <Info> -- eve-log output device (regular) initialized: eve.json
      27/2/2022 -- 21:04:00 - <Info> -- Going to log the md5 sum of email subject
      27/2/2022 -- 21:04:00 - <Warning> -- [ERRCODE: SC_WARN_NO_STATS_LOGGERS(261)] - stats are enabled but no loggers are active
      27/2/2022 -- 21:04:00 - <Info> -- SSSE3 support not detected, disabling Hyperscan for SPM
      27/2/2022 -- 21:04:00 - <Error> -- [ERRCODE: SC_ERR_INVALID_SIGNATURE(39)] - previous keyword has a fast_pattern:only; set. Can't have relative keywords around a fast_pattern only content
      [...]
      27/2/2022 -- 21:04:05 - <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'file.pdf&file.ttf' is checked but not set. Checked in 28585 and 1 other sigs
      27/2/2022 -- 21:04:05 - <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'file.ppsx&file.zip' is checked but not set. Checked in 26068 and 1 other sigs
      27/2/2022 -- 21:04:38 - <Info> -- Using 1 live device(s).
      27/2/2022 -- 21:04:38 - <Info> -- using interface vtnet1
      27/2/2022 -- 21:04:38 - <Info> -- running in 'auto' checksum mode. Detection of interface state will require 1000ULL packets
      27/2/2022 -- 21:04:38 - <Info> -- Set snaplen to 1518 for 'vtnet1'
      27/2/2022 -- 21:04:38 - <Info> -- Initializing PCAP ring buffer for /var/log/suricata/suricata_vtnet139657/log.pcap.
      27/2/2022 -- 21:04:38 - <Notice> -- Ring buffer initialized with 4 files.
      27/2/2022 -- 21:04:38 - <Info> -- RunModeIdsPcapAutoFp initialised
      27/2/2022 -- 21:04:38 - <Notice> -- all 4 packet processing threads, 2 management threads initialized, engine started.
      27/2/2022 -- 21:05:08 - <Info> -- No packets with invalid checksum, assuming checksum offloading is NOT used
      

      When I click the start button on the Interfaces tab, Suricata won't start, and the log says:

      27/2/2022 -- 21:12:32 - <Notice> -- This is Suricata version 6.0.4 RELEASE running in SYSTEM mode
      27/2/2022 -- 21:12:32 - <Info> -- CPUs/cores online: 3
      27/2/2022 -- 21:12:32 - <Info> -- SSSE3 support not detected, disabling Hyperscan for MPM
      27/2/2022 -- 21:12:32 - <Info> -- SSSE3 support not detected, disabling Hyperscan for SPM
      27/2/2022 -- 21:12:32 - <Info> -- HTTP memcap: 67108864
      27/2/2022 -- 21:12:32 - <Error> -- [ERRCODE: SC_ERR_INITIALIZATION(45)] - pid file '/var/run/suricata_vtnet139657.pid' exists but appears stale. Make sure Suricata is not running and then remove /var/run/suricata_vtnet139657.pid. Aborting!
      

      Removing this stale PID file lets me start the interface again, but after a few minutes it goes red again.
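For anyone who wants to script the cleanup described above, here is a minimal sketch, not part of the Suricata package. It assumes the PID file holds a single process ID as text; the helper names `pid_is_running` and `remove_if_stale` are made up for illustration. It only removes the file when no live process has that PID, which is the check the error message asks you to make by hand.

```python
import os
from pathlib import Path


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_if_stale(pid_file: Path) -> bool:
    """Delete pid_file if it does not point at a live process.

    Returns True if the file was removed, False otherwise.
    """
    try:
        pid = int(pid_file.read_text().strip())
    except FileNotFoundError:
        return False  # nothing to clean up
    except ValueError:
        pid = None  # unreadable content cannot name a live process

    if pid is not None and pid > 0 and pid_is_running(pid):
        return False  # the recorded process is alive, leave the file alone
    pid_file.unlink()
    return True
```

It could be called as `remove_if_stale(Path("/var/run/suricata_vtnet139657.pid"))`. Keep in mind that PIDs get reused, so a live PID does not prove the process is Suricata, and clearing the file does nothing about whatever made the process die.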

      I am using Suricata 6.0.4 with DISABLED blocking mode on all interfaces. I use ET (free), Snort Community and Snort paid rules (and the one from Suricata itself).

      Can anybody help me? I would appreciate it.

      • S
        SteveITS Galactic Empire @ma0f97
        last edited by

        @ma0f97 The .pid is left behind when it crashes. Is there anything in the system log file? Out of memory?

        You may not need to run it on WAN as well as internal networks; scanning on WAN happens before the firewall blocks packets so it will end up scanning a lot of packets that the firewall will immediately discard.

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

        • M
          ma0f97 @SteveITS
          last edited by

          @steveits Ah, good idea to check system.log. I got the following:

          <2>1 2022-02-27T20:57:05.836728+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
          <2>1 2022-02-27T20:57:05.836770+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
          <2>1 2022-02-27T20:57:05.836835+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
          <2>1 2022-02-27T20:57:05.836893+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(1): failed
          <3>1 2022-02-27T20:57:11.928940+01:00 PfSense.pfsense.pve kernel - - - pid 11955 (suricata), jid 0, uid 0, was killed: out of swap space
          <6>1 2022-02-27T20:57:11.929050+01:00 PfSense.pfsense.pve kernel - - - vtnet1: promiscuous mode disabled
          <6>1 2022-02-27T20:57:13.396581+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode enabled
          <3>1 2022-02-27T20:58:40.546557+01:00 PfSense.pfsense.pve kernel - - - pid 37067 (suricata), jid 0, uid 0, was killed: out of swap space
          <3>1 2022-02-27T20:59:33.466802+01:00 PfSense.pfsense.pve kernel - - - pid 97260 (suricata), jid 0, uid 0, was killed: out of swap space
          <3>1 2022-02-27T20:59:36.710826+01:00 PfSense.pfsense.pve kernel - - - pid 75009 (suricata), jid 0, uid 0, was killed: out of swap space
          <6>1 2022-02-27T20:59:36.710867+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode disabled
          <13>1 2022-02-27T21:00:00.263859+01:00 PfSense.pfsense.pve php 57413 - - [pfBlockerNG] Starting cron process.
          <6>1 2022-02-27T21:00:08.488845+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode enabled
          <3>1 2022-02-27T21:00:26.606596+01:00 PfSense.pfsense.pve kernel - - - pid 64071 (suricata), jid 0, uid 0, was killed: out of swap space
          <3>1 2022-02-27T21:00:27.866572+01:00 PfSense.pfsense.pve kernel - - - pid 64398 (suricata), jid 0, uid 0, was killed: out of swap space
          <6>1 2022-02-27T21:00:27.866656+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode disabled
          <6>1 2022-02-27T21:01:08.826772+01:00 PfSense.pfsense.pve kernel - - - vtnet0: promiscuous mode enabled
          <3>1 2022-02-27T21:02:22.354979+01:00 PfSense.pfsense.pve kernel - - - pid 77313 (suricata), jid 0, uid 0, was killed: out of swap space
          <3>1 2022-02-27T21:02:56.766758+01:00 PfSense.pfsense.pve kernel - - - pid 58351 (suricata), jid 0, uid 0, was killed: out of swap space
          <3>1 2022-02-27T21:03:32.886845+01:00 PfSense.pfsense.pve kernel - - - pid 89124 (suricata), jid 0, uid 0, was killed: out of swap space
          <2>1 2022-02-27T21:04:33.786861+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
          <2>1 2022-02-27T21:04:33.786933+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(24): failed
          <2>1 2022-02-27T21:04:33.996622+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
          <2>1 2022-02-27T21:04:33.996771+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(24): failed
          <2>1 2022-02-27T21:04:34.626791+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(32): failed
          <2>1 2022-02-27T21:04:34.626860+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(24): failed
          <2>1 2022-02-27T21:04:34.626890+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(12): failed
          <2>1 2022-02-27T21:04:34.626918+01:00 PfSense.pfsense.pve kernel - - - swap_pager_getswapspace(23): failed
          

          I guess it has something to do with swap space. What is the best way to mitigate this? Disable swap? Make swap bigger?
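To see how often this happens, the kill messages in the log above can be tallied. This is a rough sketch that assumes the kernel lines keep the exact shape shown in this thread (`pid N (name), jid N, uid N, was killed: reason`); the function name `count_kills` is only illustrative.

```python
import re
from collections import Counter

# Matches kernel messages of the form seen in the system.log excerpt above.
KILL_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\), jid \d+, uid \d+, "
    r"was killed: (?P<reason>.+)"
)


def count_kills(lines):
    """Count killed processes, keyed by (process name, reason)."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.search(line.strip())
        if match:
            counts[(match.group("name"), match.group("reason"))] += 1
    return counts
```

Feeding it the lines of system.log (for example `count_kills(open(path))`, with `path` pointing at wherever the log lives on your box) gives one counter entry per process name and kill reason.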

          • S
            SteveITS Galactic Empire @ma0f97
            last edited by

            @ma0f97 Are you on 22.01/2.6? There was a memory leak in the pcscd service. If you aren't using IPSec you can stop it; then either upgrade, or search the forum threads for a patch that disables it properly in older versions. (It might be in the new System Patches package, if that's available for older versions.)

            If you are using IPSec then you have to stop IPSec before pcscd, then start IPSec again, or just reboot the router for a temporary fix. Otherwise IPSec logs a lot of errors.

            • M
              ma0f97 @SteveITS
              last edited by

              @steveits No, I am on

              2.5.2-RELEASE (amd64)
              built on Fri Jul 02 15:33:00 EDT 2021
              FreeBSD 12.2-STABLE

              so I guess I am not affected.

              I think I might add 2GiB of SWAP to the disk following this manual:
              https://people.freebsd.org/~blackend/en_US.ISO8859-1/books/handbook/adding-swap-space.html

              • S
                SteveITS Galactic Empire @ma0f97
                last edited by

                @ma0f97 Sorry, to be clear the pcscd service was disabled (memory leak fixed) in 2.6.

                • M
                  ma0f97 @SteveITS
                  last edited by

                  @steveits Hm, when I started the interfaces, swap was at 25% or so, but RAM was maxed out. The moment I disabled pcscd, swap immediately went to 99%.
                  Either way, I am upgrading now and hope nothing breaks.

                  • bmeeksB
                    bmeeks
                    last edited by bmeeks

                    Something is chewing up your RAM, most likely the pcscd daemon as @SteveITS mentioned. That issue is solved in the latest pfSense release.

                    The stale PID file is a symptom of another issue, not a problem in and of itself.

                    When your system runs out of available RAM, data within memory (RAM hardware) that is not actively being accessed is written out to a special disk area called swap space. This makes room in hardware RAM for immediate needs. But on the next context switch, some of that data written out to swap has to be read back in. That whole disk I/O business makes your box super sluggish.

                    You essentially NEVER want to see any swap space in use except in very extreme and rare temporary conditions. But if something is chewing up RAM and not releasing it, then the box runs out of physical RAM and starts using the swap area as a fallback. When it then also runs out of swap space, it's "game over" ... 😞.
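If you want to watch for that condition, one option is to read the output of FreeBSD's `swapinfo -k` and compute how full swap is. The sketch below only parses text handed to it; the function names are illustrative, and it assumes the usual five-column layout (Device, 1K-blocks, Used, Avail, Capacity).

```python
def parse_swapinfo(text):
    """Parse `swapinfo -k` style output into a list of per-device dicts."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        # Expected columns: device, total KiB, used KiB, available KiB, capacity
        if len(fields) != 5 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header or unrelated line
        if fields[0] == "Total":
            continue  # summary row printed when several devices exist
        devices.append(
            {"device": fields[0], "total_kib": int(fields[1]), "used_kib": int(fields[2])}
        )
    return devices


def swap_used_percent(text):
    """Return overall swap usage in percent (0.0 if no swap is configured)."""
    devices = parse_swapinfo(text)
    total = sum(d["total_kib"] for d in devices)
    used = sum(d["used_kib"] for d in devices)
    return 100.0 * used / total if total else 0.0
```

A steadily climbing percentage over several readings points at something holding on to RAM, as described above.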

                    • M
                      ma0f97 @bmeeks
                      last edited by

                      @bmeeks Hello, thanks for the detailed explanation. I updated pfSense (nothing broke 😮) and also gave my machine 1GiB more RAM, and the interfaces are now stable and haven't crashed a single time! The only thing that puzzles me now is why Proxmox VE (a VM management OS) showed only half of the available RAM in use when pfSense showed 99%. When I run top and look at the Mem stats, the memory itself matches what Proxmox reports, but there is an additional portion of about the same size of "laundry" memory in use, whatever this means.
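On FreeBSD, "Laundry" in top is memory holding dirty pages that are queued to be written out before they can be reused. To compare the categories side by side, the `Mem:` line from top can be split into numbers. This is only a sketch: the function name is illustrative, it assumes whole-number values with K, M or G suffixes, and it makes no claim about how Proxmox computes its own figure.

```python
import re

# Conversion factors from top's unit suffixes to MiB.
UNIT_TO_MIB = {"K": 1 / 1024, "M": 1, "G": 1024}
FIELD_RE = re.compile(r"(\d+)([KMG])\s+(\w+)")


def parse_top_mem(line):
    """Return {category: MiB} parsed from a FreeBSD top 'Mem:' line."""
    return {
        name.lower(): int(value) * UNIT_TO_MIB[unit]
        for value, unit, name in FIELD_RE.findall(line)
    }
```

For example, `parse_top_mem(line)["laundry"]` gives the laundry size in MiB, which can then be set against the active and wired figures.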

                      Anyway, the problem I described is now solved. Thanks again, guys.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.