Netgate Discussion Forum

24.03 causes sustained rise in processes count and memory usage.

General pfSense Questions
42 Posts 5 Posters 2.5k Views
  • K
    keyser Rebel Alliance @dennypage
    last edited by May 8, 2024, 9:10 PM

    @dennypage Thanks - that helped.

    I have about 30.000 lines of these on the 6100 that I showed the monitoring graph from:

    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
    root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]

    Any Ideas what that is about?
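To put a number on those lines rather than eyeballing them, the saved `ps -Haxuww` output can be tallied by thread name. This is only a minimal sketch: it assumes the output was saved as text and that kernel threads show up as a bracketed name at the end of the line, as in the paste above. The function name `count_kernel_threads` is made up for illustration.

```python
import re
from collections import Counter


def count_kernel_threads(ps_output: str) -> Counter:
    """Tally bracketed thread names found at the end of each ps line."""
    counts = Counter()
    for line in ps_output.splitlines():
        # Kernel threads are shown as "[name]" in the last column.
        match = re.search(r"\[([^\]]+)\]\s*$", line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three of the lines pasted above, used as sample input.
sample = """\
root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
root 0 0.0 6.1 0 503440 - DLs 24Apr24 0:00.00 [kernel/netlink_socket (PID]
"""

counts = count_kernel_threads(sample)
# Most frequent thread names first.
print(counts.most_common(5))
```

Run against the full output, the top entry shows at a glance whether one thread name accounts for nearly all of the growth.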

    Love the no fuss of using the official appliances :-)

    • K
      keyser Rebel Alliance @dennypage
      last edited by May 8, 2024, 9:25 PM

      @dennypage One thing that might be “rare” in my setup compared to others is the fact I’m using the new netflow dump feature - not globally but on a couple of specific rules.
      Could it be a leftover from the export feature?

      • D
        dennypage @keyser
        last edited by May 8, 2024, 9:28 PM

        @keyser Other than it's a kernel thread supporting a netlink connection, no. I don't have any on my system. But is this what is growing?

        Also, I see "PID" in your output. Is there more information in the output from ps?

        • K
          keyser Rebel Alliance @dennypage
          last edited by May 8, 2024, 9:34 PM

          @dennypage I assume this is the culprit as 30.000 of those threads is not normal - and very consistent with the growing list of processes (just passed 30.000).

          I can’t immediately connect those lines with anything in the ps output as there is no actual PID to match - just the text “PID”.

          Any pointers to what I could do to get some more useful output?

          • D
            dennypage @keyser
            last edited by May 8, 2024, 9:36 PM

            @keyser said in 24.03 causes sustained rise in processes count and memory usage.:

            I assume this is the culprit as 30.000 of those threads is not normal - and very consistent with the growing list of processes (just passed 30.000).

            Hang on... do you mean that there are "thirty thousand" of those processes?!?

            • S
              stephenw10 Netgate Administrator
              last edited by May 8, 2024, 9:36 PM

              If you disable the rules with the netflow data does it stop increasing?

              • K
                keyser Rebel Alliance @dennypage
                last edited by May 8, 2024, 9:37 PM

                @dennypage I should mention the problem is exactly the same on my 2100 ARM-based box. Thousands of identical lines like those posted here - only the RSS number and date are different (because this box was rebooted the other day, it has only reached about 5000 processes so far).

                • K
                  keyser Rebel Alliance @dennypage
                  last edited by May 8, 2024, 9:37 PM

                  @dennypage said in 24.03 causes sustained rise in processes count and memory usage.:

                  @keyser said in 24.03 causes sustained rise in processes count and memory usage.:

                  I assume this is the culprit as 30.000 of those threads is not normal - and very consistent with the growing list of processes (just passed 30.000).

                  Hang on... do you mean that there are "thirty thousand" of those processes?!?

                  Yes - like the graph shows and so does my ps -Haxuww output.

                  • D
                    dennypage @keyser
                    last edited by May 8, 2024, 9:39 PM

                    @keyser Wow. Yes, I would say that's a problem.

                    I would disable netflow (as @stephenw10 suggested) and see if it stops.

                    • K
                      keyser Rebel Alliance @stephenw10
                      last edited by May 8, 2024, 9:40 PM

                      @stephenw10 can’t really do that as those are my internet access rules 😂
                      I would have a family revolt on my hands if I tried that…

                      But I could ask it not to dump flows on that rule and see if it stabilizes.
                      I will do that now, but it will take about a day before I can verify if that is the cause.

                      • D
                        dennypage @keyser
                        last edited by May 8, 2024, 9:42 PM

                        @keyser I don't think you need to disable the rules, just turn off the netflow output.

                        • S
                          stephenw10 Netgate Administrator
                          last edited by May 8, 2024, 9:49 PM

                          Yup just disable pflow on them as a test.

                          Is that output truncated? Does it look like:

                          0 412221 kernel              netlink_socket (PID mi_switch _sleep taskqueue_thread_loop fork_exit fork_trampoline
                          
                          • K
                            keyser Rebel Alliance @stephenw10
                            last edited by May 8, 2024, 9:53 PM

                            @stephenw10 No, it doesn’t seem truncated - there are other normal lines vastly longer than the 30-odd thousand lines that I posted a few of.
                            The post is a copy of the full lines shown from the ps output.

                            So no, it does not look like the one you posted.

                            • S
                              stephenw10 Netgate Administrator
                              last edited by May 8, 2024, 10:07 PM

                              Hmm. What do you get from procstat -k <pid> using the ID of one of those?

                              • K
                                keyser Rebel Alliance @stephenw10
                                last edited by May 8, 2024, 10:19 PM

                                @stephenw10 said in 24.03 causes sustained rise in processes count and memory usage.:

                                Hmm. What do you get from procstat -k <pid> using the ID of one of those?

                                Since the PID is “0” in all the 30.000 lines, and only the text “PID” is mentioned at the end of each line, I don’t know which PID to actually use with your command.

                                • S
                                  stephenw10 Netgate Administrator
                                  last edited by May 8, 2024, 10:21 PM

                                  Ah, I see. Hmm....

                                  • K
                                    keyser Rebel Alliance @stephenw10
                                    last edited by May 9, 2024, 7:03 AM

                                    @stephenw10 @dennypage It seems it’s not related to the new netflow export feature. On one box I disabled the export on the two rules I’m monitoring (internet access), and on the other box I disabled netflow export globally in the menu (disabled the feature).

                                    On both boxes another ~500 processes were left stranded during the night and inactive memory went up a little more.
                                    Here’s the dump monitor info from the 6100 I showed in the beginning:

                                    4b1aa46b-5795-4a2c-96fe-3e8c3de2cc1e-image.png
                                    1aa6163a-aede-4b46-bd3c-d57b635a4a5a-image.png

                                    I disabled pflow about 10 hours ago, and as the monitoring shows it’s still growing. The memory bump at 2:00am is pfblocker reloading lists.
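For a rough sense of scale, the figures in this post can be turned into a leak rate. This is a back-of-the-envelope sketch only: it assumes the ~500 new threads accumulated over the same ~10 hours since pflow was disabled (the post does not state the window exactly) and that the rate is constant, which real polling may not give. The helper name `leak_rate_per_hour` is hypothetical.

```python
def leak_rate_per_hour(new_threads: int, hours: float) -> float:
    """Average number of stranded threads added per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return new_threads / hours


# Assumed inputs taken from the post: ~500 threads in ~10 hours.
rate = leak_rate_per_hour(500, 10)

# Days needed to reach 30,000 threads at that constant rate.
days_to_30k = 30_000 / rate / 24

print(f"{rate:.0f} threads/hour, about {days_to_30k:.0f} days to reach 30,000")
```

Under those assumptions the overnight rate works out to about 50 threads per hour, or roughly 25 days to reach the 30,000 seen on the 6100.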

                                    • K
                                      keyser Rebel Alliance @stephenw10
                                      last edited by May 9, 2024, 8:34 AM

                                      @stephenw10 @dennypage Afterwards I did some fault-finding by restarting services one at a time to see any impact on processes/memory, and I have found the culprit.

                                      The problem is related to the BSNMPD service (the built in SNMPD) that I’m using to monitor my pfSenses from Zabbix.
                                      When I restart that service, all the thousands of stranded processes and their memory usage are freed, and the boxes are back to their expected levels.
                                      Obviously it starts climbing again, so what can I do to help you guys figure the root cause so it can be fixed?

                                      Any help on “debugging” what’s causing BSNMPD to leave the processes (memory) stranded would be good - it would help me create a more specific Redmine ticket on the issue.

                                      I’m using a community pfSense template in Zabbix and SNMPv2, which is all the built-in snmpd supports.

                                      • K
                                        kprovost @keyser
                                        last edited by May 9, 2024, 8:59 AM

                                        @keyser That's very interesting. That might point in the direction of a netlink file descriptor leak in bsnmpd.

                                        We can probably confirm that with procstat fd <bsnmp pid>.

                                        Is there anything non-default about your snmpd configuration? I monitor my 2100 with librenms and don't see the leak.

                                        • K
                                          keyser Rebel Alliance @kprovost
                                          last edited by keyser May 9, 2024, 9:29 AM May 9, 2024, 9:16 AM

                                          @kprovost I restarted the BSNMPD service about half an hour ago, so there’s only about 50 stranded processes right now.
                                          Your command suggestion seems to indicate you are on the right track as there seems to be a similar amount of leftover references. Here’s the output:

                                          /root: procstat fd 89184
                                          PID COMM FD T V FLAGS REF OFFSET PRO NAME
                                          89184 bsnmpd text v r r------- - - - /usr/sbin/bsnmpd
                                          89184 bsnmpd cwd v d r------- - - - /
                                          89184 bsnmpd root v d r------- - - - /
                                          89184 bsnmpd 0 v c rw------ 3 0 - /dev/null
                                          89184 bsnmpd 1 v c rw------ 3 0 - /dev/null
                                          89184 bsnmpd 2 v c rw------ 3 0 - /dev/null
                                          89184 bsnmpd 3 s - rw------ 1 0 ?
                                          89184 bsnmpd 4 s - rw------ 1 0 UDP *:0 *:0
                                          89184 bsnmpd 5 s - rw------ 1 0 UDP 192.168.255.1:161 *:0
                                          89184 bsnmpd 6 s - rw------ 1 0 UDS 0 0 /var/run/snmpd.sock
                                          89184 bsnmpd 7 s - rw------ 1 0 UDD /var/run/log
                                          89184 bsnmpd 8 v c r------- 1 0 - /dev/pf
                                          89184 bsnmpd 9 s - rw------ 1 0 ?
                                          89184 bsnmpd 10 s - rw------ 1 0 ?
                                          89184 bsnmpd 11 s - rw------ 1 0 ?
                                          89184 bsnmpd 12 s - rw------ 1 0 ?
                                          89184 bsnmpd 13 s - rw------ 1 0 ?
                                          89184 bsnmpd 14 s - rw------ 1 0 ?
                                          89184 bsnmpd 15 s - rw------ 1 0 ?
                                          89184 bsnmpd 16 s - rw------ 1 0 ?
                                          89184 bsnmpd 17 s - rw------ 1 0 ?
                                          89184 bsnmpd 18 v c r------- 1 0 - /dev/null
                                          89184 bsnmpd 19 v c r------- 1 0 - /dev/null
                                          89184 bsnmpd 20 s - rw------ 1 0 UDS 0 0 /var/run/devd.pipe
                                          89184 bsnmpd 21 v c rw------ 1 0 - /dev/mdctl
                                          89184 bsnmpd 22 v c r------- 1 0 - /dev/null
                                          89184 bsnmpd 23 v c r------- 1 0 - /dev/null
                                          89184 bsnmpd 24 s - rw------ 1 0 ?
                                          89184 bsnmpd 25 s - rw------ 1 0 ?
                                          89184 bsnmpd 26 s - rw------ 1 0 ?
                                          89184 bsnmpd 27 s - rw------ 1 0 ?
                                          89184 bsnmpd 28 s - rw------ 1 0 ?
                                          89184 bsnmpd 29 s - rw------ 1 0 ?
                                          89184 bsnmpd 30 s - rw------ 1 0 ?
                                          89184 bsnmpd 31 s - rw------ 1 0 ?
                                          89184 bsnmpd 32 s - rw------ 1 0 ?
                                          89184 bsnmpd 33 s - rw------ 1 0 ?
                                          89184 bsnmpd 34 s - rw------ 1 0 ?
                                          89184 bsnmpd 35 s - rw------ 1 0 ?
                                          89184 bsnmpd 36 s - rw------ 1 0 ?
                                          89184 bsnmpd 37 s - rw------ 1 0 ?
                                          89184 bsnmpd 38 s - rw------ 1 0 ?
                                          89184 bsnmpd 39 s - rw------ 1 0 ?
                                          89184 bsnmpd 40 s - rw------ 1 0 ?
                                          89184 bsnmpd 41 s - rw------ 1 0 ?
                                          89184 bsnmpd 42 s - rw------ 1 0 ?
                                          89184 bsnmpd 43 s - rw------ 1 0 ?
                                          89184 bsnmpd 44 s - rw------ 1 0 ?
                                          89184 bsnmpd 45 s - rw------ 1 0 ?
                                          89184 bsnmpd 46 s - rw------ 1 0 ?
                                          89184 bsnmpd 47 s - rw------ 1 0 ?
                                          89184 bsnmpd 48 s - rw------ 1 0 ?
                                          89184 bsnmpd 49 s - rw------ 1 0 ?
                                          89184 bsnmpd 50 s - rw------ 1 0 ?
                                          89184 bsnmpd 51 s - rw------ 1 0 ?
                                          89184 bsnmpd 52 s - rw------ 1 0 ?
                                          89184 bsnmpd 53 s - rw------ 1 0 ?
                                          89184 bsnmpd 54 s - rw------ 1 0 ?
                                          89184 bsnmpd 55 s - rw------ 1 0 ?
                                          89184 bsnmpd 56 s - rw------ 1 0 ?
                                          89184 bsnmpd 57 s - rw------ 1 0 ?
                                          89184 bsnmpd 58 s - rw------ 1 0 ?
                                          89184 bsnmpd 59 s - rw------ 1 0 ?
                                          89184 bsnmpd 60 s - rw------ 1 0 ?
                                          89184 bsnmpd 61 s - rw------ 1 0 ?
                                          89184 bsnmpd 62 s - rw------ 1 0 ?
                                          89184 bsnmpd 63 s - rw------ 1 0 ?
                                          89184 bsnmpd 64 s - rw------ 1 0 ?
                                          89184 bsnmpd 65 s - rw------ 1 0 ?
                                          89184 bsnmpd 66 s - rw------ 1 0 ?
                                          89184 bsnmpd 67 s - rw------ 1 0 ?
                                          89184 bsnmpd 68 s - rw------ 1 0 ?
                                          89184 bsnmpd 69 s - rw------ 1 0 ?
                                          89184 bsnmpd 70 s - rw------ 1 0 ?
                                          89184 bsnmpd 71 s - rw------ 1 0 ?
                                          89184 bsnmpd 72 s - rw------ 1 0 ?
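Counting the suspicious entries by hand gets tedious once there are thousands, so here is a minimal sketch that tallies them from saved `procstat fd` output. It assumes the whitespace-separated column layout shown in the paste above (PID COMM FD T V FLAGS REF OFFSET PRO NAME) and treats socket entries whose protocol column is `?` as the candidates for the netlink leak kprovost suspects. That reading is an assumption, not something the output proves. The function name `count_unnamed_sockets` is made up for illustration.

```python
def count_unnamed_sockets(procstat_output: str) -> int:
    """Count socket descriptors whose protocol column is shown as '?'."""
    count = 0
    for line in procstat_output.splitlines():
        fields = line.split()
        # fields[3] is the descriptor type (T), fields[8] the protocol (PRO).
        if len(fields) >= 9 and fields[3] == "s" and fields[8] == "?":
            count += 1
    return count


# A few lines copied from the paste above, including the header row.
sample = """\
PID COMM FD T V FLAGS REF OFFSET PRO NAME
89184 bsnmpd 2 v c rw------ 3 0 - /dev/null
89184 bsnmpd 3 s - rw------ 1 0 ?
89184 bsnmpd 5 s - rw------ 1 0 UDP 192.168.255.1:161 *:0
89184 bsnmpd 9 s - rw------ 1 0 ?
89184 bsnmpd 10 s - rw------ 1 0 ?
"""

leaked = count_unnamed_sockets(sample)
print(leaked)
```

In the full paste above there are 59 such `?` socket entries (fd 3, 9 to 17, and 24 to 72), which is in the same range as the roughly 50 stranded threads mentioned. Comparing this count with the thread count over a few samples would show whether the two grow together.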

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.