Netgate Discussion Forum

    Number of running processes increasing

    Official Netgate® Hardware
    18 Posts 5 Posters 2.3k Views
    • stephenw10 Netgate Administrator

      Hmm, two of those are triggered by the 1st one because there is an update available and it's setting the LED to flash to tell you. I wouldn't expect them to stall out like that.

      You should upgrade though when you can. If not then you could disable the update check in System > Updates > Settings.

      Steve

      • jonybat

        Will reboot, update and disable dashboard update check. I'll report back if this happens again.

        Thanks

        • jonybat @jonybat

          This started happening again in one of the gateways. All of them were rebooted and updated 2 weeks ago. Dashboard update check has also been disabled.

          ps axl | grep "/bin/sh /etc/rc.update_pkg_metadata" | wc -l
                20
          
          ps axl | grep "/usr/sbin/gpioctl -f /dev/gpioc2 3 duty 150" | wc -l
                 9
          
          ps axl | grep "/bin/sh /usr/local/sbin/pfSense-led.sh update 1" | wc -l
                10
          
          uptime
          11:50AM  up 14 days, 21:06, 3 users, load averages: 0.76, 0.79, 0.75
          

          Any ideas?

          • stephenw10 Netgate Administrator

            Are you running the 3rd party led_gateways script? That does seem to be inducing issues in 21.05 for some reason. The author just updated it though.

            Steve

            • jonybat @stephenw10

              No, like I said before, these are fresh gateways, only a couple of months old, and the only package that we installed was zabbix-agent 5.2.

              • jonybat @jonybat

                Just came to mind that this could be triggered by the zabbix monitoring scripts. We are using https://github.com/rbicelli/pfsense-zabbix-template

                The default monitoring template calls get_system_pkg_version(), get_system_pkg_version()['version'] and get_system_pkg_version()['installed_version'] from pkg-utils.inc once per day, which I think could explain why the update_pkg_metadata script is being executed.

                What it doesn't explain is why it gets stuck...

                I've disabled those monitoring items in the meantime.

                • cneep @jonybat

                  @jonybat Ran across this thread, which described my own situation very closely and thought I'd chime in with perhaps another datapoint.

                  This is a single, non-HA SG-3100 21.02.2-RELEASE with 181 days continuous uptime since last reboot. Three Packages: Cron, zabbix-agent52, zabbix-proxy52

                  Like you, I'm seeing a steady increase in processes. Mine started on ~8/16/2021 and has steadily increased for the past ~70 days up to ~400 processes as of today. Prior to 8/16/2021, the number of processes was consistent at ~100 for the entire monitored history (just over 1 year).

                  In my case:

                  ps axl | wc -l
                       406
                  
                  ps axl | grep "/bin/sh /etc/rc.update_pkg_metadata" | wc -l
                       144
                  

                  ...all in wait state, like yours

                  ps axl | grep "/usr/sbin/gpioctl -f /dev/gpioc2 3 duty 150" | wc -l
                       70
                  

                  ...all in iircreq state, like yours

                  ps axl | grep "/bin/sh /usr/local/sbin/pfSense-led.sh update 1" | wc -l
                       72
                  

                  ...all in wait state, like yours
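
For anyone who wants the per-state breakdown above in one pass rather than three separate pipelines, here is a minimal sketch. It assumes FreeBSD's `ps` accepts the `mwchan` and `command` output keywords (check ps(1) on your release), and `tally_by_wchan` is a made-up helper name, not something shipped with pfSense. It matches inside awk, so there is no extra `grep` process that could end up in the count.

```shell
#!/bin/sh
# Tally the three commands discussed in this thread by wait channel.
# Input (stdin): lines of "<mwchan> <command...>", for example from
#   ps -axo mwchan,command      # keywords assumed per FreeBSD ps(1)
tally_by_wchan() {
    awk '
        {
            name = ""
            if ($0 ~ /\/etc\/rc\.update_pkg_metadata/)  name = "rc.update_pkg_metadata"
            else if ($0 ~ /pfSense-led\.sh update/)     name = "pfSense-led.sh"
            else if ($0 ~ /\/usr\/sbin\/gpioctl/)       name = "gpioctl"
            # Key on "<wait channel> <short name>"; other processes are ignored.
            if (name != "") count[$1 " " name]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k2
}
```

Usage would be something like `ps -axo mwchan,command | tally_by_wchan`, which prints one line per wait-channel/command pair with its count in front.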

                  I have not yet rebooted or upgraded but those were going to be my initial actions after a quick bit of research on the issue (which led me here). Your comments don't give me a lot of hope that either will actually resolve the issue, though.

                  No particularly useful information from me so far other than to simply indicate that you don't appear to be a unique case.

                  • stephenw10 Netgate Administrator

                    Hmm, what changed on that date to cause them to start rising? Something locally to you?

                    Steve

                    • cneep @stephenw10

                      @stephenw10 Nothing at all that I can determine. The last intended change would have been around the last reboot, which is confirmed by the Config History. That occurred ~4 months before Zabbix shows the processes started increasing. The firewall was left alone to fend for itself for those ~4 months and for the 1.5-2 months since the number of processes suddenly started increasing.

                      Based on my calculations, the OP's problem started ~June 15, 2021. My seemingly identical problem started ~August 16, 2021. I had originally thought that perhaps the process that checks for updates triggered the problem perhaps at the discovery of an update. It would be an external influence on an otherwise "static" firewall (config-wise). But I don't think the release dates match up exactly with when the problem started. Close...a couple of weeks, maybe, but not exact, I don't think. Just a thought that didn't seem to pan out...

                      • stephenw10 Netgate Administrator

                        Mmm, that seems likely but there was nothing released on either of those dates that might have caused it. Unless the system was tracking dev snapshots.

                        If you reboot do the processes immediately start stacking up again?

                        I assume if you disable Zabbix it stops?

                        Steve

                        • wblanton

                          Coming here to mention that this is currently occurring on my SG-3100 running 21.05.1. It started on 10/29/2021 with the 00:01 cron event that runs "rc.update_pkg_metadata". That process is still showing in my process list with a state of "IN". Later that day, the same three processes listed above all started at the same time, and all are showing with a state of "IN".

                          This pattern has repeated every day since: there is an "update_pkg_metadata" at 00:01, and then the set of three processes at some random point that day. According to the status monitoring page, I was steady at ~164 running processes before 10/29. As of right now, I have 250 processes.
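
One way to check the one-batch-per-day pattern described above is to group the stuck processes by start date. This is only a sketch: it assumes `ps -axo lstart,command` prints start times in the five-field form `Fri Oct 29 00:01:00 2021` (the usual `lstart` format, but it may vary with locale), and `starts_per_day` is a hypothetical helper name.

```shell
#!/bin/sh
# Count stuck processes per start date.
# Input (stdin): lines of "<lstart> <command...>", for example from
#   ps -axo lstart,command
# where <lstart> is assumed to look like: Fri Oct 29 00:01:00 2021
starts_per_day() {
    awk '
        /rc\.update_pkg_metadata|pfSense-led\.sh update|sbin\/gpioctl/ {
            day = $5 " " $2 " " $3      # year, month name, day of month
            n[day]++
        }
        END { for (d in n) print d ": " n[d] }
    ' | sort    # lexical order, so not strictly chronological across months
}
```

If the pattern holds, each date after the first stuck run should show roughly the same small count.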

                          Also, I do not have zabbix installed, but I do have NRPE (which is what alerted me to the issue).

                          Edited to show SW version.

                          • wblanton

                            I'm not able to physically access this device, and no one else will be until next week. In the meantime, is there any reason I can't run the following commands to just kill these processes?

                            pkill -f "sh /etc/rc.update_pkg_metadata"
                            pkill -f "sh /usr/local/sbin/pfSense-led.sh update 1"
                            pkill gpioctl
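
Before killing anything on a box nobody can physically reach, it may be worth previewing exactly which PIDs would be hit. A minimal sketch, assuming `ps -axo pid,command` output is piped in; `stuck_pids` is a hypothetical helper, not part of pfSense, and the patterns are just the three command lines quoted earlier in this thread.

```shell
#!/bin/sh
# Print the PIDs of the three stuck command lines seen in this thread.
# Input (stdin): lines of "<pid> <command...>", for example from
#   ps -axo pid,command
stuck_pids() {
    awk '
        /\/etc\/rc\.update_pkg_metadata|pfSense-led\.sh update 1|\/usr\/sbin\/gpioctl / {
            print $1
        }
    '
}

# Preview first:
#   ps -axo pid,command | stuck_pids
# Then, once the list looks right:
#   ps -axo pid,command | stuck_pids | xargs kill
```

The design choice here is simply to separate "find" from "kill", so the match list can be eyeballed before any signal is sent.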
                            
                            • stephenw10 Netgate Administrator

                              Yeah, you should be able to do that. Though those processes are not actually doing anything.

                              Power cycling the appliance when you can is what you should be looking to do.

                              Steve

                              • wblanton @stephenw10

                                @stephenw10 Thanks. We will definitely be trying to do that next week when someone is on site. Right now, I'm just trying to at least get Nagios happy. lol

                                I know that @jimp said that this only happens on rare occasions, but I'm really curious about how often this happens without the users ever noticing. I'm sure most users aren't monitoring things like process counts, so this may really be happening more than one would think.

                                In any event, please let me know if there are any logs or anything like that I can get over to y'all for diagnostics.

                                • stephenw10 Netgate Administrator

                                  Thanks. If we think of anything that might be available to check I'll ask.
                                  I've managed to hit it a few times myself and never found anything unusual beyond the gpio driver failing to return. I agree, I think I found it only because I was looking for it.

                                  Steve

                                  • jonybat

                                    Just as an update for whoever might run into this: we haven't experienced it since disabling the "available version" Zabbix items and rebooting the gateways afterwards. That was over 2 months ago.

                                    You can check more info in my previous comment.

                                    There is still the open question of why the processes get stuck during this check, but since this isn't that important, I'm going to leave it like this.

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.