Netgate Discussion Forum

    CE 2.8.1 bsnmpd Memory Leak

    General pfSense Questions
    70 Posts 10 Posters 8.5k Views 15 Watching
      Averlon

      https://cgit.freebsd.org/src/commit/?id=f1612e7087d7c3df766ff0bf58c48d02fb0e2f6d

      That's pretty much the only commit I found. It comes from https://redmine.pfsense.org/issues/15481

        JD 0

        I've turned off bsnmp on my 4200. It's just leaking too badly -- it nearly ran out of swap overnight, and I also don't want to continue chewing through the lifespan on the flash. This was with Zabbix monitoring dialed back as far as I could reasonably set it and still retain any useful data.

        I've since stood up a FreeBSD host in my Proxmox environment and plan to test a little further there. Though I noticed, interestingly enough, that the bsnmp package is listed as "no maintainer". Perhaps bsnmp isn't the best package to use for this service -- though I understand the intent was for it to be "lightweight".

          keyser Rebel Alliance @JD 0

          I have the exact same issue with my 6100s, which are monitored with the same pfSense SNMP template from Zabbix. I initially created another thread because I first noticed the behaviour as some massive memory jumps in the monitoring graphs, but that was a cosmetic thing. Upon investigation, the issue seems to be exactly the same as in this thread.

          Here’s my thread, which I have closed: https://forum.netgate.com/topic/200635/bsnmp-causing-massive-memory-use-spikes-since-26.03-update/5

          Love the no fuss of using the official appliances :-)

            keyser Rebel Alliance @stephenw10

            @stephenw10 Do you have any insights into how we might diagnose and identify the leak here?

            I initially tried disabling all queries related to pf filter stats because that has previously been a culprit in leaving open files (24.03), but that made no difference.

            There is no doubt it's a plain memory leak, and as far as I can tell it's related to the number of queries you make (I query more stats on one particular box, and the leak grows slightly faster on that one).

              kprovost @keyser

              @keyser As I said in https://forum.netgate.com/post/1229469 : we have been unable to reproduce a leak.
              Try to narrow down the specific OID that triggers the leak. Probably the easiest way to do that is to disable the query to half of them and to see if the leak remains. If it does not, switch the disabled halves and try again.
              Once you identify a leaking half, split that one in half and repeat the exercise until you have it narrowed down, ideally to a single OID, but even a handful would already be a useful step.
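
The halving procedure described above can be sketched as a small bisection loop. This is only an illustration under stated assumptions: `still_leaks` is a hypothetical callback standing in for "poll only these items for a while and check whether bsnmpd keeps growing", the item names are placeholders, and the loop assumes exactly one item triggers the leak.

```python
import math


def narrow_down(items, still_leaks):
    """Bisect a list of polled items down to the one that triggers the leak.

    `still_leaks(subset)` is a hypothetical check: enable only `subset`,
    wait long enough to judge, and return True if memory keeps growing.
    Assumes exactly one item in `items` is responsible.
    """
    candidates = list(items)
    rounds = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        rounds += 1
        # Poll only the first half. If the leak persists the culprit is in it,
        # otherwise it must be in the half that was disabled.
        candidates = first if still_leaks(first) else second
    return candidates[0], rounds


# Placeholder item names; "item-5" plays the role of the leaking one.
items = [f"item-{i}" for i in range(16)]
culprit, rounds = narrow_down(items, lambda subset: "item-5" in subset)
print(culprit, rounds)
print(math.ceil(math.log2(len(items))))  # upper bound on rounds for 16 items
```

The number of rounds grows only with the base-2 logarithm of the item count, which matters when each round means a long observation period.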

                keyser Rebel Alliance @kprovost

                @kprovost Just to let the thread know: 26.03.1 continues (as expected) to exhibit the memory leak issue in BSNMP.
                I have been working diligently on identifying the issue and I’m getting pretty close. But it takes a sh**load of time to do divide and conquer on this issue, as the leak is so slow that it takes around 12 hours to reliably tell whether it is still leaking when some SNMP keys have been disabled in Zabbix.

                I will report back when I have some more hard evidence.

                  keyser Rebel Alliance @kprovost

                  @kprovost @stephenw10 Okay, I now have some important initial findings to share on this issue. I have been following a very structured divide-and-conquer strategy to isolate the issue in BSNMPD on my two "identical" SG6100s, and here are the details:

                  When the Zabbix Template item called "PFsense: Firewall Rules Count" is queried, the memory leak issues occur. I have inserted the specs for this item here:
                  [Screenshot: Zabbix item configuration]

                  The OID is: .1.3.6.1.4.1.12325.1.200.1.11.1.0

                  The size of the memory leak is correlated with how often this OID is queried. By default it is queried every minute, which leads to a slow leak of < 100 MB in a couple of hours. I have shortened the query interval to 5 seconds in order to see the effect, and it then goes to about 300 MB in a couple of hours. NOTE: These are not precise numbers, just graph readings.

                  On the identical 6100 there is no longer an obvious memory leak once I disable that particular template item and restart BSNMPD. I have tried reversing the test on the boxes and as expected the opposite happens.

                  However, two things suggest the leak is not directly tied to every query made against the OID:
                  1: Leaked memory does not seem to scale linearly with queries/time. Querying every 5 s is 12 times more often than once a minute, but the leak has not grown anywhere near 12 times.
                  2: Once the OID has been queried, the memory leak continues even after I disable the item query. The leak seems to slowly fade until it almost flatlines, but it does keep going for at least an hour. This could, however, be a false reading on my part, as I only have the graphs as a baseline for now. I have not yet had the time, or a precise way, to measure the memory consumption of the BSNMPD process specifically.

                  But one thing is clear: I need to restart BSNMPD and not query the OID at all to stop the memory leak altogether.

                  I have not had enough time to determine if there are other items that very slowly leak memory. I will continue to investigate that, but it seems stable when that item is disabled.

                  NB: The memory consumption of BSNMPD is quite noticeable: about 300 MB out of the gate (with this OID disabled). Is that normal?

                  I hope this can help you find the bug in BSNMPD and create a patch :-)

                    tinfoilmatt LAYER 8 @keyser

                    Nice work doing the needful.

                      stephenw10 Netgate Administrator

                      Yup nice work. 👍

                        cjrnz

                        Testing here; I have re-enabled polling with pfsense.rules.count disabled. Cheers!

                          keyser Rebel Alliance @cjrnz

                          @cjrnz There is no doubt here that the mentioned ITEM is the culprit. Here is a graph covering several days, showing the difference between two “identical” SG6100s where one has the ITEM disabled:
                          [Graph: memory use of the two SG6100s over several days]

                          Since yesterday both have been running with the ITEM disabled, and the BSNMPD process memory curve is now completely flat. Inspecting the processes with top shows that one has grown 1 MB in size and the other 2 MB since their restart 16 hours ago. Too early to say if that indicates a VERY VERY slow memory leak or just normal behaviour after a restart.

                          EDIT: I’m no longer sure that it does not scale linearly with the number of queries. As you can see from the graph above, the overall firewall memory leak seems to be in the range of 40-50 MB per 2 hours when doing the standard query every minute. That is very much 1/6th of the 300 MB per 2 hours when querying every 5 seconds.
                          I have also confirmed that the firewall cached-memory graph shows somewhat different behavior from the actual BSNMPD process (inspected from the CLI). So the process might stop leaking as soon as I stop querying the ITEM.

                          Right now I’m just testing long term stability when the ITEM is disabled, so it will be a week or so before I can confirm the linearity with queries that does indeed seem to be there.
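
As a rough cross-check of the figures quoted in this post, the arithmetic can be worked through per query. The sketch below uses only the graph readings given above (40 to 50 MB per 2 hours at one query per minute, about 300 MB per 2 hours at one query every 5 seconds). The 45 MB midpoint is an assumption, and because these are whole-system graph readings the result is indicative at best.

```python
WINDOW_S = 2 * 60 * 60  # both readings cover roughly two hours


def per_query_mb(leaked_mb, interval_s, window_s=WINDOW_S):
    """Leaked megabytes divided by the number of queries in the window."""
    queries = window_s // interval_s
    return leaked_mb / queries


slow = per_query_mb(45, 60)  # assumed midpoint of the 40-50 MB reading
fast = per_query_mb(300, 5)

print(f"queries in 2 h: {WINDOW_S // 60} vs {WINDOW_S // 5}")
print(f"per query: {slow:.3f} MB vs {fast:.3f} MB")
print(f"growth ratio {300 / 45:.1f}x for a query ratio of {60 // 5}x")
```

On these readings, 12 times as many queries go with roughly 6 to 7 times the growth, so the per-query figures agree only to within about a factor of two. That is compatible with a leak driven by the query count, but graph readings this coarse cannot settle whether it is strictly linear.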

                            cjrnz @keyser

                            @keyser ahh, I do like pretty graphs. I've been watching mine on the CLI with:

                            while ( 1 )
                            echo `date -u` bsnmpd VSZ:`ps -Haxuww | grep 'bsnmpd' | grep -v grep | awk '{print $5}'` RSS:`ps -Haxuww | grep 'bsnmpd' | grep -v grep | awk '{print $6}'`
                            sleep 60
                            end
                            

                            I started at
                            Wed Jun 3 20:26:36 UTC 2026 bsnmpd VSZ:292236 RSS:264532
                            and now it's at
                            Thu Jun 4 06:28:07 UTC 2026 bsnmpd VSZ:333196 RSS:272456
                            so, as we know, it grows a little over time anyway (RSS is up by roughly 8 MB over these ten hours; ps reports both values in KiB), but so much more with the rule count included.
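
The two samples above can be turned into a growth rate directly. A minimal sketch, assuming the line format printed by the loop above and that `ps` reports VSZ and RSS in KiB (as FreeBSD's `ps` does); the `parse` helper is illustrative and not part of any tool.

```python
from datetime import datetime


def parse(line):
    """Parse one line from the loop: `date -u` output, then VSZ and RSS."""
    _, month, day, clock, _, year, _, vsz, rss = line.split()
    when = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return when, int(vsz.split(":")[1]), int(rss.split(":")[1])


start = parse("Wed Jun 3 20:26:36 UTC 2026 bsnmpd VSZ:292236 RSS:264532")
end = parse("Thu Jun 4 06:28:07 UTC 2026 bsnmpd VSZ:333196 RSS:272456")

hours = (end[0] - start[0]).total_seconds() / 3600
vsz_growth_kib = end[1] - start[1]
rss_growth_kib = end[2] - start[2]

print(f"{hours:.2f} h, RSS +{rss_growth_kib} KiB, VSZ +{vsz_growth_kib} KiB")
print(f"RSS growth: {rss_growth_kib / 1024 / hours:.2f} MiB/h")
```

For these two samples that works out to an RSS increase of 7924 KiB over about ten hours, with this run having the rule count item disabled.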

                              keyser Rebel Alliance @cjrnz

                              @cjrnz Thanks

                              It’s still a bit early to tell, but there seems to be another minor memory leak occurring, at about 4-5 MB per 24 hours.
                              It is nowhere near the problem with the rules.count ITEM, but nonetheless it looks like a small leak. I will monitor this for another week to properly confirm.

                              Regarding the identified issue in BSNMPD with the rules.count OID: are you Netgate guys on this, and have you created a Redmine issue for the problem?
                              It is, after all, not insignificant and WILL cause issues in production where the number of firewall rules is monitored over SNMP from e.g. Zabbix, Prometheus or the like.

                                stephenw10 Netgate Administrator

                                Yup we are still looking at it. Trying to replicate it here now.

                                  kprovost @keyser

                                  @keyser As Steve said: we're still not having any luck reproducing this.

                                  I've gone over the relevant code several times now and I'm failing to see any path that would cause this to leak memory.

                                  .1.3.6.1.4.1.12325.1.200.1.11.1.0 is the number of labels (or rather, the number of rules with at least one label) we have counters for, not the table itself, which is also kind of odd.

                                  Can you confirm that not querying that node, but querying .1.3.6.1.4.1.12325.1.200.1.11.2 also leaks?

                                  It might also be interesting to see if you can get valgrind to find a leak. Install it and then run bsnmpd under it:
                                      pkg install valgrind
                                      /usr/local/bin/valgrind --leak-check=full --show-leak-kinds=all -s /usr/sbin/bsnmpd -d -c /var/etc/snmpd.conf
                                  Let that run for however long it usually takes for the leak to manifest, then Ctrl-C it. Collect the output. (There will be a lot. There are a number of mostly harmless warnings, and valgrind will also warn about a lot of allocated and still reachable memory.)
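
Since the output will be long, it may help to pull out just the leak summary first. The sketch below assumes valgrind's usual `LEAK SUMMARY` lines of the form `definitely lost: N bytes in M blocks`. The sample text is made up for illustration and is not output from bsnmpd, and `leak_summary` is a hypothetical helper name.

```python
import re

SUMMARY = re.compile(
    r"^==\d+==\s+(definitely lost|indirectly lost|possibly lost|still reachable):"
    r"\s+([\d,]+) bytes in ([\d,]+) blocks",
    re.MULTILINE,
)


def leak_summary(text):
    """Return {category: (bytes, blocks)} from valgrind's leak summary."""
    return {
        kind: (int(size.replace(",", "")), int(blocks.replace(",", "")))
        for kind, size, blocks in SUMMARY.findall(text)
    }


# Made-up sample in the assumed format, not real bsnmpd output.
sample = """\
==1234== LEAK SUMMARY:
==1234==    definitely lost: 1,048,576 bytes in 256 blocks
==1234==    indirectly lost: 0 bytes in 0 blocks
==1234==      possibly lost: 0 bytes in 0 blocks
==1234==    still reachable: 20,480 bytes in 12 blocks
"""
summary = leak_summary(sample)
print(summary["definitely lost"])
```

Comparing the "definitely lost" figure between a short run and a long run would be one way to see whether it grows with time, as opposed to the still reachable memory that the post above expects to be plentiful.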

                                    stephenw10 Netgate Administrator

                                    @keyser said in CE 2.8.1 bsnmpd Memory Leak:

                                    .1.3.6.1.4.1.12325.1.200.1.11.1.0

                                    What value does your ruleset return for that OID?

                                      keyser Rebel Alliance @stephenw10

                                      @stephenw10 said in CE 2.8.1 bsnmpd Memory Leak:

                                      @keyser said in CE 2.8.1 bsnmpd Memory Leak:

                                      .1.3.6.1.4.1.12325.1.200.1.11.1.0

                                      What value does your ruleset return for that OID?

                                      Currently: 368 - which I believe is more or less correct looking at all my rules across 8 interfaces and including hidden rules.
                                      Whenever I make a new rule it is incremented by two which I assume is because of how PF rules are created?

                                        keyser Rebel Alliance @kprovost

                                        @kprovost said in CE 2.8.1 bsnmpd Memory Leak:

                                        @keyser As Steve said: we're still not having any luck reproducing this.

                                        I've gone over the relevant code several times now and I'm failing to see any path that would cause this to leak memory.

                                        .1.3.6.1.4.1.12325.1.200.1.11.1.0 is the number of labels (or rather, the number of rules with at least one label) we have counters for, not the table itself, which is also kind of odd.

                                        Can you confirm that not querying that node, but querying .1.3.6.1.4.1.12325.1.200.1.11.2 also leaks?

                                        When changing the template line to “.1.3.6.1.4.1.12325.1.200.1.11.2” I get: “No Such Object available on this agent at this OID” from the host once it attempts to query it.

                                        EDIT: Could the reason you cannot recreate it be the interface types and count?
                                        I have:

                                        6 tagged VLAN-based interfaces across IX0 and IX3
                                        1 assigned WireGuard interface (tun_wg0)
                                        Mobile Warrior IPsec is enabled, so the IPsec tab is used as well (although not assigned).

                                        None of my physical interfaces is assigned in untagged form.

                                            kprovost @keyser

                                            @keyser said in CE 2.8.1 bsnmpd Memory Leak:

                                            When changing the template line to “.1.3.6.1.4.1.12325.1.200.1.11.2” I get: “No Such Object available on this agent at this OID” from the host once it attempts to query it.

                                            Yeah, that's not a leaf node. You can walk from there, which is what I'd expect Zabbix to do (but I'm not particularly familiar with it).
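
The difference between a GET on a leaf and a walk from a subtree can be illustrated with a toy in-memory agent. Everything below is a simplified model and not bsnmpd's implementation: the two entries under `...11.2` are invented placeholders, and only the `...11.1.0` value (368) comes from this thread.

```python
def oid(text):
    """Turn a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


BASE = ".1.3.6.1.4.1.12325.1.200.1.11"
AGENT = {
    oid(BASE + ".1.0"): 368,                  # value reported in this thread
    oid(BASE + ".2.1.1.1"): "placeholder-a",  # invented table cells
    oid(BASE + ".2.1.1.2"): "placeholder-b",
}


def get(target):
    """A GET only succeeds on an exact instance, i.e. a leaf."""
    return AGENT.get(oid(target), "No Such Object")


def walk(root):
    """A walk returns every instance whose OID starts with the root prefix."""
    prefix = oid(root)
    return [
        value
        for key, value in sorted(AGENT.items())
        if key[: len(prefix)] == prefix
    ]


print(get(BASE + ".1.0"))
print(get(BASE + ".2"))
print(walk(BASE + ".2"))
```

In this model a GET on `...11.2` fails because no instance has exactly that OID, while a walk from the same point returns whatever instances sit beneath it.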

                                            Copyright 2026 Rubicon Communications LLC (Netgate). All rights reserved.