Netgate Discussion Forum

    2.4.5.a.20200110.1421 and earlier: High CPU usage from pfctl

    112 Posts 33 Posters 33.9k Views
    • jimpJ
      jimp Rebel Alliance Developer Netgate @luckman212
      last edited by

      @luckman212 said in 2.4.5.a.20200110.1421 and earlier: High CPU usage from pfctl:

      is there any way to remotely downgrade to 2.4.4 from 2.4.5? I think I have a remote SG3100 hitting this issue and it was on 2.4.2 earlier today, I upgraded it... it's 20 miles away :(

      No

      in case (1) is not possible, is this bug also present in current 2.5.0 builds?

      We haven't tested 2.5.0, but I don't think it is. That could change, though, as we're getting the 2.5.0 builds up onto stable/12 and it may be there.

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • luckman212L
        luckman212 LAYER 8 @jimp
        last edited by luckman212

        @jimp Thank you again. Reading through redmine #10414 it seems like the temporary workaround is:

        • set System > Advanced > Firewall & NAT > Firewall Maximum Table Entries to <65535, e.g. 65000
        • disable Block bogon networks on all interfaces
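Editor's sketch, not from the thread: one way to confirm the first workaround took effect is to read the `table-entries` hard limit that `pfctl -sm` reports. The helper names below are made up for illustration, and the `bogons` table name in the usage comment is an assumption about the default pfSense table.

```shell
#!/bin/sh
# Print the table-entries hard limit from `pfctl -sm` style output on stdin.
table_entries_limit() {
    awk '$1 == "table-entries" { print $NF }'
}

# Succeed if the given limit is below the 65535 threshold from the workaround.
limit_is_below_threshold() {
    [ "$1" -lt 65535 ]
}

# On the firewall itself (as root), usage would look like:
#   limit=$(pfctl -sm | table_entries_limit)
#   limit_is_below_threshold "$limit" && echo "limit $limit is under 65535"
#   pfctl -t bogons -T show | wc -l   # entries loaded in the (assumed) bogons table
```

The parsing is split out from the `pfctl` call so it can be tried against saved output on another machine.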

        The thing is, I've done both of those things on my only 2.4.5 system (a remote SG-3100) and I believe I am still hitting this problem.

        Take a look at this gateway monitoring graph: never seen spikes like this! They're almost all exactly 20 minutes apart. I checked /etc/crontab for any possible jobs that might be running on 20 minute intervals (found nothing). I also searched the filesystem for any references to 1200 seconds and found just one, in /usr/local/www/interfaces_bridge_edit.php stating "...the timeout of address cache entries [..] default is 1200 seconds". Don't know if that's anything.
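Editor's sketch, not from the thread: the "almost exactly 20 minutes apart" observation can be checked numerically rather than by eye. This assumes you have the spike times as epoch seconds, one per line in ascending order; how you extract them from the monitoring data is left open, and both function names are hypothetical.

```shell
#!/bin/sh
# Read spike times (epoch seconds, one per line, ascending) on stdin and
# print the gap in seconds between each pair of consecutive spikes.
spike_gaps() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Read gaps on stdin. Exit 0 only if there is at least one gap and every gap
# is within $2 seconds of the expected period $1.
gaps_match_period() {
    awk -v period="$1" -v tol="$2" '
        { d = $1 - period; if (d < 0) d = -d; if (d > tol) bad = 1; n++ }
        END { exit (n == 0 || bad) ? 1 : 0 }'
}

# Example: are all spikes 1200 s apart, give or take 10 s?
#   spike_gaps < spikes.txt | gaps_match_period 1200 10 && echo "periodic"
```

A consistent 1200-second gap would point at a timer or periodic reload rather than random upstream trouble.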

        Multiple conversations with the ISP and they are assuring me the problem is "on my end", of course. I'd normally set up some Wireshark captures between the ISP equipment and pfSense in this type of situation, but since I'm remote that isn't possible.

        It seems like people are also reporting success on virtual machines by setting CPU cores to 1. Is there any boot flag that we can set here to disable SMP e.g. kern.smp.disabled=1 or hint.lapic.1.disable=1 or is that not necessary?

        update: see below -- disabling SMP seems to have helped.

        • jimpJ
          jimp Rebel Alliance Developer Netgate
          last edited by

          Not just bogons but anything that loads large tables. It could be a URL table alias, pfBlockerNG, or something else.

          • luckman212L
            luckman212 LAYER 8 @jimp
            last edited by

            @jimp said in 2.4.5.a.20200110.1421 and earlier: High CPU usage from pfctl:

            anything that loads large tables. It could be a URL table alias, pfBlockerNG

            This unit doesn't have any aliases defined, and pfBNG is not installed (no packages installed actually).

            • A
              akm22562 @luckman212
              last edited by

              @luckman212 In my case, I had CARP configured with bogons blocked on the WAN interface.

              I can't afford to disable CARP. Disabling bogons was a big help.

              Anyway, just my $0.02.

              • C
                carl2187 @jimp
                last edited by

                @jimp amazing detective work to have already isolated it down to a specific upstream change in FreeBSD!

                Please let the community know if there's anything we can do to help, test, build kernels etc.

                Thanks for all you do!

                • luckman212L
                  luckman212 LAYER 8
                  last edited by luckman212

                  Seems like the fix for this will land in 2.4.5-p1 which is coming soon. But, this person was desperate. So as a test, I put the following in their /boot/loader.conf.local:

                  kern.smp.disabled=1
                  

                  After rebooting, the problem is gone. It's only been an hour, but not a single hiccup so far (🤞 fingers crossed). This is on an SG-3100.
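Editor's sketch, not from the thread: when making the `/boot/loader.conf.local` change on a remote box, it helps if the edit is safe to re-run. The helper below (a hypothetical name) appends the line only when it is not already present. The `sysctl` names in the comments are standard FreeBSD ones, but the values they report on any given platform are not something the thread confirms.

```shell
#!/bin/sh
# Append a "name=value" loader tunable to a file unless that exact line
# already exists.  $1 = file path, $2 = full line to ensure.
ensure_loader_line() {
    file=$1
    line=$2
    touch "$file"
    # Add a trailing newline first if the file does not end with one.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# On the firewall (as root), followed by a reboot:
#   ensure_loader_line /boot/loader.conf.local 'kern.smp.disabled=1'
# After the reboot, this should show whether the tunable was picked up:
#   sysctl kern.smp.disabled kern.smp.cpus
```

The workaround should be removed again once a fixed release is installed, since it leaves the system running on a single core.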

                  • Cool_CoronaC
                    Cool_Corona
                    last edited by Cool_Corona

                    I can confirm this fixes the issue completely!

                    It's not enough to set it under System > Advanced > System Tunables.

                    You have to edit /boot/loader.conf.local manually.

                    It's the same as limiting a VM to only 1 core.

                    • nzkiwi68N
                      nzkiwi68 @luckman212
                      last edited by

                      kern.smp.disabled=1
                      

                      After rebooting, the problem is gone. It's only been an hour, but not a single hiccup so far (🤞 fingers crossed). This is on an SG-3100.

                      But, a quick look online suggests to me that this is disabling all multi CPU support. On busy systems this could be a problem switching your multi core (example XG-1537 with 8 cores plus hyper threading) into a single CPU system!

                      I recommend waiting on 2.4.4-p3, or limping along on 2.4.5, if you can.

                      For me, I shall wait for the official patch / release.

                      • luckman212L
                        luckman212 LAYER 8 @nzkiwi68
                        last edited by

                        @nzkiwi68 said in 2.4.5.a.20200110.1421 and earlier: High CPU usage from pfctl:

                        this is disabling all multi CPU support [..] I recommend waiting on 2.4.4-p3, or limping along on 2.4.5, if you can.

                        100% good advice. In this case I had to upgrade due to another problem, so I was "stuck" on 2.4.5 with a remote system and had no other option. When 2.4.5-p1 / 2.5.0 come out this should not be needed. Losing 1 core is a fair trade for regaining the stability.

                        • Cool_CoronaC
                          Cool_Corona @luckman212
                          last edited by

                          @luckman212

                          Agreed. Limiting to 1 core is a viable option on a home network or on a small B2B setup. A busy connection running IDS/IPS would be running at full load and would not have spare resources left.

                          • D
                            DD @luckman212
                            last edited by

                            @luckman212 In version 2.5 everything is working OK. I switched from 2.4.4-p3 to 2.4.5 and hit this bug (my firewall is on Hyper-V), then I switched to 2.5 and everything is fine. This week pfSense 2.5 was moved to FreeBSD 12.1-STABLE.

                            • Mr_JinXM
                              Mr_JinX @DD
                              last edited by Mr_JinX

                              @DD I've just done a clean install from the latest public download, and it shows FreeBSD version 11.3 stable, but their site shows 12.0 stable: https://docs.netgate.com/pfsense/en/latest/releases/versions-of-pfsense-and-freebsd.html

                              Update:
                              I've just updated to the development release and it's now running 12.1 stable. Do we believe the issues are not present in FreeBSD 12.1 stable?

                              • G
                                ghosterius
                                last edited by

                                Is there any development on this situation? Is a 2.4.5_p1 coming up soon to solve this?
                                I have pfSense running on Hyper-V and there's absolutely nothing I can do on it without having a huge impact (outage, traffic gone, website and console unresponsive).

                                I've attempted reverting back to 2.4.4_p3 but... unfortunately you guys removed the image available for me to reinstall it so... oops!

                                • jimpJ
                                  jimp Rebel Alliance Developer Netgate
                                  last edited by

                                  Read the thread. All the info is here. At least read from https://forum.netgate.com/post/908806 down.

                                  • G
                                    ghosterius @jimp
                                    last edited by

                                    @jimp said in 2.4.5.a.20200110.1421 and earlier: High CPU usage from pfctl:

                                    Read the thread. All the info is here. At least read from https://forum.netgate.com/post/908806 down.

                                    Instead of replying arrogantly, why not assume I've already done that 5 times and try to give a more "to the point" answer?
                                    Why do you think I am asking if there's any development on this situation? Maybe because of your post stating that you're "assessing the next steps" (the same one you just gave me a link to go and read again...).

                                    Reducing my system to a single core installation is not doable (done that... performance went to the ground) and unfortunately reducing the tables to 65000 entries makes pfBlockerNG go wild and start throwing errors.

                                    So... Are you expecting any development on this situation? Is there any kind of patch we can apply? Can we "override" the setting that causes this problem?

                                    • jimpJ
                                      jimp Rebel Alliance Developer Netgate
                                      last edited by

                                      While terse, my response was not "arrogant". It's not arrogant to expect people to read a thread with all of the information (and links to more information). Far too often people pop in and expect others to do the leg work for them when all of the information is here. If you'd read from my linked post down and followed the links, you'd have all your answers. I just skimmed them and checked. The info is here, and in the links (Like https://redmine.pfsense.org/issues/10414 linked in https://forum.netgate.com/post/909130).

                                      There are multiple comments with suggested workarounds and how to enact them.

                                      • G
                                        ghosterius
                                        last edited by

                                        Working in IT as I do, I understand your point and where you're coming from... However, it would have been much nicer to start with: "Have you read the thread already? Did you notice that we've already pointed to possible workarounds?"
                                        To which I'd reply that I've seen the workarounds and that none of them work for me (unfortunately I must add) at least without having another impact....

                                        Thanks for the link to the redmine part, I did not notice that one previously, apologies!

                                        I've noticed that there's a patch for the kernel and good results are visible... Do we know or have an idea when that's coming out?

                                        Thanks for your time.

                                        • jimpJ
                                          jimp Rebel Alliance Developer Netgate
                                          last edited by

                                          We don't all have time to be nice, especially when there is no indication of what the person posting has done. If you had included the additional info about exactly what you had tried in your first comment, that would have been even more helpful. We're not mind readers.

                                          The workarounds do work, at a possible performance penalty (which hurts some deployments more than others). The main workaround is reducing the CPU cores to 1, which is mentioned several times, and that will work 100% of the time for everyone. If that is too much of a performance hit, then you will need to disable all the large tables, move to hardware with faster single cores, or go back to 2.4.4-p3.

                                          No ETA on 2.4.5-p1 other than "Soon" (as in weeks, not months). There is still some testing left to do on fixes for other issues discovered in the release, which are also being rolled into 2.4.5-p1.

                                          • provelsP
                                            provels @ghosterius
                                            last edited by

                                            @ghosterius said in 2.4.5.a.20200110.1421 and earlier: High CPU usage from pfctl:

                                            Is there any development on this situation? Is a 2.4.5_p1 coming up soon to solve this?
                                            I have pfSense running on Hyper-V and there's absolutely nothing I can do on it without having a huge impact (outage, traffic gone, website and console unresponsive).

                                            I've attempted reverting back to 2.4.4_p3 but... unfortunately you guys removed the image available for me to reinstall it so... oops!

                                            Sure. Just reduce to one virtual CPU. EOF

                                            Peder

                                            MAIN - pfSense+ 24.11-RELEASE - Adlink MXE-5401, i7, 16 GB RAM, 64 GB SSD. 500 GB HDD for SyslogNG
                                            BACKUP - pfSense+ 23.01-RELEASE - Hyper-V Virtual Machine, Gen 1, 2 v-CPUs, 3 GB RAM, 8GB VHDX (Dynamic)

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.