Netgate Discussion Forum

    High memory usage/leak on PF+ 22.05

    General pfSense Questions
    17 Posts 3 Posters 1.7k Views
    • S
      sgnoc
      last edited by

      I've seen a drastic increase in memory usage on my XG-7100U under PF+ 22.05. I don't recall noticing this issue before I upgraded from 22.01. Under the current configuration my system has always run at around 14-18% CPU usage and about 20% memory usage, and I have 24G of RAM installed.

      I noticed very high usage last night and tried shutting down services to see if it would clear, but it did not. After a reboot the stats were back to normal, but when I checked again this morning memory usage was back up to 88%. On the System Activity screen, almost all of the RAM usage appears to be wired memory, at about 20G.

      Here is a snapshot of the top section:
      77882c9a-5d20-44f6-8fc1-38fe0f3501eb-image.png

      I tried looking, but only found a general reference to a possible kernel memory leak. I couldn't find much more information, and that report looked to be for 22.01, where I don't remember noticing this problem. Swap is set to 1024MB with 0% usage, so at least nothing is being paged out. Here are the services I'm running:

      3119b9d3-f1ee-4486-a175-dd4063caa895-image.png

      Any ideas on where I can troubleshoot to narrow down the culprit, or what I might do next? I don't want to have to reboot my router twice a day to clear out leaked memory. Also hoping I don't have to downgrade to 22.01.

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Does the memory actually get exhausted if you don't reboot it?

        There was one other report I saw of something similar, can't seem to find it now...

        • S
          sgnoc
          last edited by

          @stephenw10 It doesn't exhaust the system. It seems to mostly stabilize right around 88%, leaving around 500-600M of free memory. The system doesn't seem quite as responsive, though. I'm not sure whether running at that level of usage triggers any kind of garbage collection, but it is slightly slower to respond than it is after a reboot, once the services have all come back up and things have stabilized. Currently sitting at 18% CPU and 88% memory.

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Ah, it was in 22.01/2.6 not 22.05 but the advice there should still apply:
            https://forum.netgate.com/post/1041592

            • S
              sgnoc
              last edited by

              I believe that may be the post I found previously and then couldn't find again. There doesn't seem to be much there that I can do on my end, short of running the vmstat command and parsing through the output to try to figure out where it is leaking. I looked at the output, but don't understand it entirely. Unlike in their example, my zone 128 doesn't seem to be the largest; my zone 512 does. Even if the same math added up, that doesn't seem to account for a missing 20G of RAM.

              It also looks like that leak should have been fixed in 22.05, so possibly not the same issue that I'm having. I don't believe this issue was occurring on 22.01 for me.

              It sounds like if there is nothing I can really do, the only solution would be to downgrade to 22.01 and hope the next release gets this issue fixed? Is there anything else that I can try to troubleshoot on this?

              Here is the output of vmstat -z. I checked it repeatedly over the last 20 minutes or so, and nothing in the zone list is steadily increasing; the numbers just fluctuate up and down. Attachment: vmstat.txt
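
For anyone else reading vmstat -z output by hand, here is a minimal Python sketch of one way to rank zones by estimated memory, using size times (used + free) as a rough per-zone figure. The function name zone_usage and the sample rows are made up for illustration and are not taken from the attached vmstat.txt. The column order (SIZE, LIMIT, USED, FREE, ...) is assumed from the usual FreeBSD header and should be checked against the real output.

```python
def zone_usage(vmstat_z_text):
    """Return (zone, estimated_bytes) pairs, largest first.

    Assumes each data row looks like 'NAME: SIZE, LIMIT, USED, FREE, ...'.
    The estimate ignores per-slab overhead, so treat it as approximate.
    """
    zones = []
    for line in vmstat_z_text.splitlines():
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue  # header or blank line: no "NAME:" prefix
        fields = [f.strip() for f in rest.split(",")]
        try:
            size, used, free = int(fields[0]), int(fields[2]), int(fields[3])
        except (IndexError, ValueError):
            continue  # row does not match the assumed layout
        zones.append((name.strip(), size * (used + free)))
    return sorted(zones, key=lambda z: z[1], reverse=True)


# Hypothetical sample rows, for illustration only.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
128:                    128,      0,   40000,    1000, 9000000,   0,   0
512:                    512,      0,   90000,    2000, 7000000,   0,   0
mbuf_cluster:          2048, 1000000,   8000,     500,  300000,   0,   0
"""

for name, nbytes in zone_usage(SAMPLE):
    print(f"{name:<16}{nbytes / 2**20:8.1f} MiB")
```

Running the same function on two captures taken some minutes apart and comparing the totals is one way to check whether any single zone is steadily growing.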

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                Hmm, I assume you have tried running with Suricata disabled since that's by far the largest consumer of RAM?

                • S
                  sgnoc
                  last edited by

                  That was some of the testing I did last night when I first discovered the issue. I disabled the Suricata service, but the memory percentage stayed the same. I monitor several interfaces with Suricata, but it has only been using about 9.8% of memory. That is definitely one of the largest consumers, but it doesn't account for the 20G of wired memory I'm seeing on the System Activity screen.

                  I'm only showing around 604M active and 2319M inactive, which is what I would expect. It appears all of the missing memory is in the Wired allocation. I'm not familiar with Wired memory, though.

                  • S
                    sgnoc
                    last edited by

                    Doing some research into Wired memory, it looks like this is memory the kernel allocates to itself and that can't be paged out. With ZFS, the ARC (the filesystem cache) is counted there, and sometimes it grows larger than you'd like. One solution I've seen is to edit /boot/loader.conf and set vfs.zfs.arc_max to a specific acceptable value.

                    Is there a way to set vfs.zfs.arc_max through pfSense without manually editing the loader.conf file, which I presume gets overwritten at some point? I didn't see it in the tunables section, but wasn't sure if it may be hidden somewhere I'm not looking.

                    I was going to try and set it to: vfs.zfs.arc_max="12G"

                    That would limit it to about half the total memory. Adding what the other services use, that should still leave me at least a solid 25-35% free. I also don't want to set it incorrectly and end up with a system that can't boot.

                    I don't mind if ZFS prefers to use more memory, but I don't like it running that close to max if other services need it and ZFS doesn't release enough in time.
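
As a quick check on the numbers, here is a minimal Python sketch that converts the proposed 12G cap into bytes and into a share of installed memory. It assumes "G" means GiB (1024^3 bytes) and that the 24G of RAM is also 24 GiB; the variable names are only illustrative.

```python
GIB = 1024 ** 3  # assumption: "G" in the post means GiB

total_ram = 24 * GIB   # installed memory, per the first post
arc_limit = 12 * GIB   # proposed ARC cap

print(arc_limit)                               # 12884901888
print(f'vfs.zfs.arc_max="{arc_limit}"')        # byte form of the tunable line
print(f"ARC cap is {arc_limit / total_ram:.0%} of RAM")
```

The byte value printed here is the same one used later in the thread when the setting is actually applied.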

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Create the file /boot/loader.conf.local and add custom loader variables there. That file is kept across upgrades and is not overwritten by other setting changes.

                      Steve
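
A minimal Python sketch of adding a variable to loader.conf-style text without creating duplicate entries. The helper set_loader_tunable is hypothetical (it is not a pfSense or FreeBSD tool) and works on a string only, so writing the result to /boot/loader.conf.local is left to the reader. The value shown is the one used elsewhere in this thread.

```python
def set_loader_tunable(conf_text, name, value):
    """Return conf_text with exactly one line of the form name="value".

    An existing assignment of `name` is replaced in place; extra duplicate
    assignments are dropped; commented-out lines are left untouched.
    """
    new_line = f'{name}="{value}"'
    out, replaced = [], False
    for line in conf_text.splitlines():
        if line.split("=", 1)[0].strip() == name:
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # skip any further assignments of the same name
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical existing contents, for illustration only.
existing = '# local loader settings\nvfs.zfs.arc_max="8589934592"\n'
print(set_loader_tunable(existing, "vfs.zfs.arc_max", "12884901888"), end="")
```

Because a bad loader setting can affect booting, it is worth reviewing the resulting text before replacing the real file.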

                      • bmeeksB
                        bmeeks
                        last edited by bmeeks

                        Something is wrong with your Suricata configuration on that box. Look at how many duplicate Suricata processes you have running. Not saying that is the only cause of your memory issue, but it certainly is one of them. For example, I see four duplicate Suricata processes for VLAN 45. And I see two each for VLAN 14 and VLAN 25. You should have only a single Suricata instance running per interface (or VLAN).

                        • S
                          sgnoc
                          last edited by

                          @stephenw10 Thanks, that's what I needed. Last night I added vfs.zfs.arc_max="12884901888" to the /boot/loader.conf file just to test. I watched as the ARC slowly increased the memory usage again, maybe 1-3M every second or two. It leveled off at 12G, as configured. I'm not sure if maybe the updated kernel released with 22.05 changed how it handles not having a configured max arc setting, and increased the usage to 20G out of 24G total memory.

                          I know from reading what I could find that ARC will release the memory when needed, but not always in time. I definitely don't want to get to where swap memory is being used. I'm going to move the config line into the /boot/loader.conf.local like you recommended to stay persistent across updates/upgrades. I think I'm good to go now, and rather prefer being able to control the max arc storage. That would be a great setting to add into the tunable advanced screen.
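
To watch the ARC level off against the configured limit, one option is to compare two sysctl values. This minimal Python sketch parses "name: value" text in the format printed by sysctl. The sample text is made up, and the sketch assumes the FreeBSD OIDs kstat.zfs.misc.arcstats.size and vfs.zfs.arc_max are present on the system; parse_sysctl is an illustrative helper name.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping integer values only."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


# Hypothetical output of:
#   sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max
SAMPLE = """\
kstat.zfs.misc.arcstats.size: 12012345678
vfs.zfs.arc_max: 12884901888
"""

stats = parse_sysctl(SAMPLE)
used = stats["kstat.zfs.misc.arcstats.size"]
limit = stats["vfs.zfs.arc_max"]
print(f"ARC: {used / 2**30:.2f} GiB of {limit / 2**30:.2f} GiB "
      f"({100 * used / limit:.0f}% of limit)")
```

Feeding real sysctl output into parse_sysctl every few seconds would show the same slow climb and plateau described above.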

                          • S
                            sgnoc
                            last edited by

                            @bmeeks I noticed that too, but when I looked into it further, the duplicate entries for each VLAN are running under the same process (same PID), but listed with different states (nanslp, uwait, etc). When I check the process list (ps aux), there is only a single suricata process for each interface configured for monitoring. So, I can't explain why it lists like that on the activity screen, but it looks to be ok overall. Each instance is using around 360M, which I'm fine with since I have 24G total.

                            • bmeeksB
                              bmeeks @sgnoc
                              last edited by

                              @sgnoc said in High memory usage/leak on PF+ 22.05:

                              @bmeeks I noticed that too, but when I looked into it further, the duplicate entries for each VLAN are running under the same process (same PID), but listed with different states (nanslp, uwait, etc). When I check the process list (ps aux), there is only a single suricata process for each interface configured for monitoring. So, I can't explain why it lists like that on the activity screen, but it looks to be ok overall. Each instance is using around 360M, which I'm fine with since I have 24G total.

                              Doh! My bad, that would be the separate threads assigned to a particular interface. So there appear to be 4 threads running on the VLAN 45 interface. That's fine.
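
The difference between rows on the activity screen and actual processes can be checked by grouping the rows by PID. A minimal Python sketch with made-up rows follows; the PIDs, states and interface labels are hypothetical, not copied from the screenshot.

```python
from collections import defaultdict

# Hypothetical thread-level rows: (pid, thread state, command label).
ROWS = [
    (41210, "nanslp", "suricata VLAN 45"),
    (41210, "uwait", "suricata VLAN 45"),
    (41210, "uwait", "suricata VLAN 45"),
    (41210, "select", "suricata VLAN 45"),
    (41388, "nanslp", "suricata VLAN 14"),
    (41388, "uwait", "suricata VLAN 14"),
]


def threads_per_process(rows):
    """Group thread rows by PID; each key is one real process."""
    by_pid = defaultdict(list)
    for pid, state, _command in rows:
        by_pid[pid].append(state)
    return dict(by_pid)


grouped = threads_per_process(ROWS)
for pid, states in grouped.items():
    print(f"PID {pid}: {len(states)} threads ({', '.join(states)})")
print(f"{len(ROWS)} rows, {len(grouped)} processes")
```

Rows that share a PID are threads of one process, which matches what ps aux shows: a single Suricata process per monitored interface.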

                              • stephenw10S
                                stephenw10 Netgate Administrator
                                last edited by

                                Hmm, that's interesting. What size is the boot drive on that 7100?

                                Steve

                                • S
                                  sgnoc @stephenw10
                                  last edited by

                                  @stephenw10 The freebsd-boot partition is 512K and is on the same drive as the OS, which is a 256G M.2 SATA drive. The /boot folder is around 105M, though.

                                  • stephenw10S
                                    stephenw10 Netgate Administrator
                                    last edited by

                                    Thanks. Not something we've seen here but we are investigating.

                                    • S
                                      sgnoc @stephenw10
                                      last edited by sgnoc

                                      Sounds good. Thanks for the help. With the /boot/loader.conf.local variable change, I've been running for more than a day now and it all seems stable, with much lower memory utilization. I feel a lot more comfortable at 58% utilized than at 88%. If I have a sudden increase in traffic or encrypted tunnels, I know there are enough resources available to handle it.

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.