Netgate Discussion Forum

    Number of running processes increasing

    Official Netgate® Hardware
    18 Posts 5 Posters 2.3k Views
    • J
      jonybat @jonybat

      It just came to mind that this could be triggered by the Zabbix monitoring scripts. We are using https://github.com/rbicelli/pfsense-zabbix-template

      The default monitoring template calls get_system_pkg_version(), get_system_pkg_version()['version'] and get_system_pkg_version()['installed_version'] from pkg-utils.inc once per day, which I think could explain why the update_pkg_metadata script is being executed.

      What it doesn't explain is why it gets stuck...

      I have disabled those monitoring items in the meantime.

      • C
        cneep @jonybat

        @jonybat Ran across this thread, which described my own situation very closely and thought I'd chime in with perhaps another datapoint.

        This is a single, non-HA SG-3100 21.02.2-RELEASE with 181 days continuous uptime since last reboot. Three Packages: Cron, zabbix-agent52, zabbix-proxy52

        Like you, I'm seeing a steady increase in processes. Mine started on ~8/16/2021 and has steadily increased for the past ~70 days up to ~400 processes as of today. Prior to 8/16/2021, the number of processes was consistent at ~100 for the entire monitored history (just over 1 year).

        In my case:

        ps axl | wc -l
             406
        
        ps axl | grep "/bin/sh /etc/rc.update_pkg_metadata" | wc -l
             144
        

        ...all in wait state, like yours

        ps axl | grep "/usr/sbin/gpioctl -f /dev/gpioc2 3 duty 150" | wc -l
             70
        

        ...all in iircreq state, like yours

        ps axl | grep "/bin/sh /usr/local/sbin/pfSense-led.sh update 1" | wc -l
             72
        

        ...all in wait state, like yours
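The per-command counts above can also be produced in one pass. The following is a minimal sketch, not something posted in the thread: the function name `tally_commands` is made up for illustration, and it assumes the local `ps` accepts `-A -ww -o args=` (untested on the SG-3100). Taking the snapshot first keeps the counting pipeline's own helper processes out of the totals, which a plain `ps | grep | wc -l` does not guarantee.

```shell
#!/bin/sh
# tally_commands (hypothetical helper): read one command line per input line
# on stdin and print "<count><TAB><command>" for each distinct command,
# most frequent first.
tally_commands() {
    awk '{ n[$0]++ } END { for (c in n) printf "%d\t%s\n", n[c], c }' | sort -rn
}

# Example (not run here): capture the process list first, then count, so the
# awk/sort helpers are not part of the list being counted.
#   snapshot=$(ps -A -ww -o args=)
#   printf '%s\n' "$snapshot" | tally_commands | head -n 5
```

On a box showing this problem, the top entries would be expected to be the rc.update_pkg_metadata, gpioctl and pfSense-led.sh lines quoted above.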

        I have not yet rebooted or upgraded but those were going to be my initial actions after a quick bit of research on the issue (which led me here). Your comments don't give me a lot of hope that either will actually resolve the issue, though.

        No particularly useful information from me so far other than to simply indicate that you don't appear to be a unique case.

        • stephenw10S
          stephenw10 Netgate Administrator

          Hmm, what changed on that date to cause them to start rising? Something locally to you?

          Steve

          • C
            cneep @stephenw10

            @stephenw10 Nothing at all that I can determine. The last intended change would have been around the last reboot, which is confirmed by the Config History. That occurred ~4 months before Zabbix shows the processes started increasing. The firewall would have been left alone to fend for itself for ~4 months prior and 1.5-2 months after the number of processes suddenly started increasing.

            Based on my calculations, the OP's problem started ~June 15, 2021. My seemingly identical problem started ~August 16, 2021. I had originally thought that the process that checks for updates might have triggered the problem, perhaps on discovering an update. That would be an external influence on an otherwise "static" firewall (config-wise). But I don't think the release dates match up exactly with when the problem started. Close, within a couple of weeks maybe, but not exact. Just a thought that didn't seem to pan out...

            • stephenw10S
              stephenw10 Netgate Administrator

              Mmm, that seems likely but there was nothing released on either of those dates that might have caused it. Unless the system was tracking dev snapshots.

              If you reboot do the processes immediately start stacking up again?

              I assume if you disable Zabbix it stops?

              Steve

              • W
                wblanton
                last edited by wblanton

                Coming here to mention that this is currently occurring on my SG-3100 running 21.05.1. It started on 10/29/2021 with the 00:01 cron event that runs "rc.update_pkg_metadata". That event is still showing in my processes with a state of "IN". Later that day, the same three processes listed above appear, all started at the same time and all showing a state of "IN".

                This pattern has repeated every day since: there is an "update_pkg_metadata" at 00:01, and then the set of three processes at some random point that day. According to the status monitoring page, I was steady around ~164 processes running before 10/29. As of right now, I have 250 processes.

                Also, I do not have zabbix installed, but I do have NRPE (which is what alerted me to the issue).
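For anyone who wants an alert on this kind of slow growth, a process-count check along these lines would catch it. This is a minimal sketch rather than the actual NRPE check in use here: the function name `check_proc_count` and the thresholds in the usage comment are assumptions chosen only for illustration. The return codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```shell
#!/bin/sh
# check_proc_count COUNT WARN CRIT (hypothetical helper)
# Prints a one-line status and returns a Nagios-style code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL
check_proc_count() {
    cpc_count=$1
    cpc_warn=$2
    cpc_crit=$3
    if [ "$cpc_count" -ge "$cpc_crit" ]; then
        echo "CRITICAL - $cpc_count processes"
        return 2
    elif [ "$cpc_count" -ge "$cpc_warn" ]; then
        echo "WARNING - $cpc_count processes"
        return 1
    fi
    echo "OK - $cpc_count processes"
    return 0
}

# Example (not run here; thresholds are made-up values):
#   check_proc_count "$(ps -A -o pid= | wc -l)" 200 300
```

With those example thresholds, the ~164 baseline mentioned above would report OK and the current 250 would report WARNING.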

                Edited to show SW version.

                • W
                  wblanton

                  I'm not able to physically access this device, and no one else will be until next week. In the meantime, is there any reason I can't run the following commands to just kill these processes?

                  pkill -f "sh /etc/rc.update_pkg_metadata"
                  pkill -f "sh /usr/local/sbin/pfSense-led.sh update 1"
                  pkill gpioctl
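Before running kill commands like these on a remote box, it can help to review exactly which PIDs a pattern would hit. The sketch below is an illustration only: `match_pids` is a hypothetical helper, the snapshot path is a placeholder, and the `ps -A -ww -o pid= -o args=` invocation in the comments is assumed to work on this platform rather than verified there.

```shell
#!/bin/sh
# match_pids PATTERN (hypothetical helper)
# Reads "PID COMMAND..." lines on stdin and prints the PID of every line
# whose command contains PATTERN as a plain substring (not a regex).
match_pids() {
    awk -v pat="$1" '{ pid = $1; $1 = ""; if (index($0, pat) > 0) print pid }'
}

# Example (not run here): snapshot first, review, then kill.
#   ps -A -ww -o pid= -o args= > /tmp/ps.snapshot
#   match_pids "/etc/rc.update_pkg_metadata" < /tmp/ps.snapshot
#   match_pids "/etc/rc.update_pkg_metadata" < /tmp/ps.snapshot | xargs kill
```

Matching against a saved snapshot means the list that was reviewed is the same list that gets killed.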
                  
                  • stephenw10S
                    stephenw10 Netgate Administrator

                    Yeah, you should be able to do that. Though those processes are not actually doing anything.

                    Power cycling the appliance when you can is what you should be looking to do.

                    Steve

                    • W
                      wblanton @stephenw10

                      @stephenw10 Thanks. We will definitely be trying to do that next week when someone is on site. Right now, I'm just trying to at least get Nagios happy. lol

                      I know that @jimp said that this only happens on rare occasions, but I'm really curious about how often this happens without the users ever noticing. I'm sure most users aren't monitoring things like process counts, so this may really be happening more than one would think.

                      In any event, please let me know if there are any logs or anything like that I can get over to y'all for diagnostics.

                      • stephenw10S
                        stephenw10 Netgate Administrator

                        Thanks. If we think of anything that might be available to check I'll ask.
                        I've managed to hit it a few times myself and never found anything unusual beyond the gpio driver failing to return. I agree, I think I found it only because I was looking for it.

                        Steve

                        • J
                          jonybat

                          Just as an update for whoever might run into this: we haven't experienced it since we disabled the "available version" Zabbix items and rebooted the gateways afterwards. That was over 2 months ago.

                          You can check more info in my previous comment.

                          There is still the open question of why the processes get stuck during this check, but since this isn't that important, I'm going to leave it like this.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.