Netgate Discussion Forum

    PF States limit reached.

    Firewalling
    • X
      xciter327
      last edited by

      I am trying to troubleshoot the following issue. I have a firewall that occasionally stops working with the above message. When that happens, the gateway goes down and all traffic through the firewall stops. Gateway monitoring is disabled, and state killing on gateway down is disabled.

      It was my understanding that "Firewall adaptive timeouts" are supposed to deal with this; however, that does not seem to be the case. I have the maximum set to 5 million states (the system has 8 GB of RAM), with a "lower adaptive timeout value" of 3 million and a "higher adaptive timeout value" of 4 million. If I am reading the documentation correctly, it should start scaling the timeouts down at 3 million and set them to "0" at 4 million. This does not seem to happen. When I check the number of active states, it shows a value of 5 million+ even though the gateway is down. The only thing that works on the WAN is ARP.
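For reference, the scaling described above can be sketched in a few lines. This follows the adaptive.start / adaptive.end description in pf.conf(5), where every timeout is multiplied by (end - states) / (end - start) once the state count exceeds the start value. The function name is made up for illustration, and the 86400-second tcp.established value is an assumption (pf's default under normal optimization), not something read from this firewall.

```python
def adaptive_scale(states: int, start: int, end: int) -> float:
    """Factor applied to every state timeout, per the pf.conf(5) description."""
    if states <= start:
        return 1.0  # at or below adaptive.start: timeouts are untouched
    if states >= end:
        return 0.0  # at or above adaptive.end: timeouts become zero
    return (end - states) / (end - start)  # linear scaling in between


START, END = 3_000_000, 4_000_000  # values from the post above
TCP_ESTABLISHED = 86_400           # seconds; assumed default, not from this box

for n in (2_000_000, 3_000_000, 3_500_000, 3_900_000, 4_000_000):
    factor = adaptive_scale(n, START, END)
    print(f"{n:>9} states: factor {factor:.2f}, "
          f"tcp.established ~ {TCP_ESTABLISHED * factor:.0f}s")
```

On paper, then, the table should never get past 4 million entries with these settings. The sketch only shows what the documented formula yields; it says nothing about how quickly expired states are actually purged.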

      Besides ICMP (echo reply, echo request, info reply, unreachable, parameter problem, packet too big), the firewall does not reply to any other external traffic. Access to management ports is blocked via a floating rule and an alias.

      Normally the firewall only goes to 20k states.

      0_1537261345199_a3e15d35-e451-4316-a7bf-361af23578ad-image.png

      Now I guess I have a couple of options:

      1. Set "Max. src. states" or "Max. src. conn. Rate".
      2. Enable state killing on gateway down (not very keen on this, as we've had issues before with gateway monitoring, so it's currently disabled).
      3. Other?

      Any suggestions would be welcome.

      • H
        heper
        last edited by

        Is this a huge network that needs 5 million states? Or is there a malfunction?

        • X
          xciter327
          last edited by

          Not huge. A couple of hundred clients NAT-ed on a /29, plus an IPv6 deployment. I think they call it dual stack lite.

          • B
            beatvjiking
            last edited by

            I set max src states to 8192 on my networks. With a few hundred devices, even with dual-stack, you're seeing way too many. I've NATed for thousands of devices and not seen that many states.

            • B
              beatvjiking
              last edited by

              I've seen state tables that size only in instances when malware is in play, or in one case, when an intern for a well-known antimalware company wrote a naive script querying their entire list of malicious domains with no limits on queries per second.

              You may also want to try setting firewall optimization to "aggressive" but the preferable option is to limit max src states.

              • X
                xciter327 @beatvjiking
                last edited by xciter327

                @beatvjiking said in PF States limit reached.:

                I've seen state tables that size only in instances when malware is in play, or in one case, when an intern for a well-known antimalware company wrote a naive script querying their entire list of malicious domains with no limits on queries per second.

                You may also want to try setting firewall optimization to "aggressive" but the preferable option is to limit max src states.

                My thinking exactly. This is a student network, so god knows what they are trying to do.

                Interestingly enough, some time ago I was doing tests with hping and a packet generator (pktgen, I think), and I managed to fully load the device (full CPU, full state table, interfaces at capacity). Normally, the device recovered after I stopped the test. It crashed only once in many tests, and I could not reproduce that crash. This full lockup is something I have never managed to reproduce.

                This is exactly why the states numbers have been raised. I've tested it up to 5M states with no issues.

                • ?
                  A Former User
                  last edited by

                  I read the doco the same way, why doesn't the firewall start just nuking sessions straight away? It shouldn't be possible to hit 5M with your config.

                  I agree that's not the right solution for your problem, but regardless, shouldn't this be working for you?

                  • X
                    xciter327
                    last edited by

                    That was my thinking exactly. I've just added "max src states" to all the firewall rules (which are all pass rules).

                    • X
                      xciter327
                      last edited by xciter327

                      Also, the firewall had a kernel panic on reboot (I decided to reboot it because the graphs were not working).

                      0_1537343874714_df984a18-4f5d-424f-a287-2d0fdd66e793-image.png

                      I checked in /var/crashes and there was no dump.

                      • H
                        heper
                        last edited by

                        So this problem happens every 200 days or so? Uptime in screenshot.....

                        • X
                          xciter327
                          last edited by

                          No, it has happened once a day for the last 3 days. I don't reboot the firewall, I just flush the states: "pfctl -F states all"

                          • X
                            xciter327
                            last edited by

                            Just wanted to report it has not happened since I put the limits on.

                            • B
                              beatvjiking
                              last edited by

                              You can probably find in your logs what device(s) are attempting to open so many sessions and address whatever is happening - i.e. malware or what have you.
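Besides the logs, the state table itself can be tallied by source. A rough sketch follows, assuming the text layout that `pfctl -ss` prints: interface, protocol, endpoints around a `->` or `<-` arrow, an optional parenthesized pre-NAT address just before `->`, and IPv6 ports written as `addr[port]`. The helper names and sample lines are made up for illustration; check the layout against real output before relying on it.

```python
from collections import Counter


def host_of(endpoint: str) -> str:
    """Strip parentheses and the port from one endpoint token."""
    endpoint = endpoint.strip("()")
    if "[" in endpoint:               # assumed IPv6 form: addr[port]
        return endpoint.split("[", 1)[0]
    if endpoint.count(":") == 1:      # assumed IPv4 form: addr:port
        return endpoint.rsplit(":", 1)[0]
    return endpoint                   # no port present


def top_sources(lines, n=10):
    """Count states per originating host and return the n busiest."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        if "->" in tokens:            # outbound: source is just left of the arrow
            i = tokens.index("->")
            if i == 0:
                continue
            src = tokens[i - 1]
        elif "<-" in tokens:          # inbound: source is just right of the arrow
            i = tokens.index("<-")
            if i + 1 >= len(tokens):
                continue
            src = tokens[i + 1]
        else:
            continue                  # not a state line
        counts[host_of(src)] += 1
    return counts.most_common(n)


# Made-up sample lines in the assumed layout (documentation address ranges).
sample = [
    "all tcp 203.0.113.5:51234 (192.168.1.10:51234) -> 198.51.100.7:443  ESTABLISHED:ESTABLISHED",
    "all udp 203.0.113.5:40000 (192.168.1.10:40000) -> 198.51.100.9:53  MULTIPLE:SINGLE",
    "all tcp 192.168.1.1:22 <- 192.168.1.20:50000  ESTABLISHED:ESTABLISHED",
]
print(top_sources(sample))  # [('192.168.1.10', 2), ('192.168.1.20', 1)]
```

In practice the lines would come from a saved dump, for example `pfctl -ss > states.txt` followed by `top_sources(open("states.txt"))`. With millions of states, dumping the table is itself slow, so this is better run before the limit is hit.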

                              • X
                                xciter327
                                last edited by

                                Just to report that it happened again. In my eyes, there are two options: option 1 is that adaptive timeouts are not working; option 2 is that the device is somehow running out of memory. I can see in the monitoring graph that when the states reach roughly 900k, the device becomes unresponsive. I've set much lower adaptive timeouts now and left the max states at 5 million (8 GB of RAM). Max src states is at 8096 on each firewall rule.

                                If anybody has a suggestion on how to simulate a lot of connection states from multiple IPs, I would love to hear it.
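On the memory hypothesis (option 2 above), a back-of-the-envelope check is easy to run. The sketch below assumes roughly 1 KB of RAM per state, the rule of thumb given in the pfSense documentation; the real per-state cost on this box may differ, and the function name is made up for illustration.

```python
BYTES_PER_STATE = 1024  # assumption: ~1 KB per state (rule of thumb, not measured)


def state_table_gib(states: int, bytes_per_state: int = BYTES_PER_STATE) -> float:
    """Approximate RAM used by the state table, in GiB."""
    return states * bytes_per_state / 2**30


# Normal load, the point where the graphs stall, and the configured maximum.
for n in (20_000, 900_000, 5_000_000):
    print(f"{n:>9} states ~ {state_table_gib(n):.2f} GiB")
```

Under that assumption, a full 5 million state table would need a bit under 5 GiB of the 8 GB installed, while 900k states would be well under 1 GiB. If the rule of thumb holds, memory exhaustion alone does not obviously explain a lockup at 900k.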

                                • S
                                  SteveITS Galactic Empire
                                  last edited by

                                  You're not alone: https://www.google.com/search?client=firefox-b-1-d&channel=cus&q=pfsense+pf+states+limit+reached

                                  This one mentions a Spiceworks scan of a large subnet:
                                  https://forum.netgate.com/topic/81059/zone-pf-states-pf-states-limit-reached-how-to-find-the-offender/10
                                  Given that post (simultaneous scan), how often are the adaptive timeouts processed/changed/updated by pfSense? (instantly, every 5 minutes, etc.)


                                  • B
                                    beatvjiking
                                    last edited by

                                    When I recommended 8k states, that was a very high ceiling. It works well in my environment but in most environments that can be far far lower with no negative impacts on user experience. 512 is a reasonable limit to impose on your allow rules. You may want to try that as an alternative to more RAM :)
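To put numbers on the difference between those two limits, here is a rough worst-case bound. It assumes the limit applies per source address per rule, as pf.conf(5) describes max-src-states, and uses a hypothetical 300 clients to stand in for "a couple of hundred". The function name is made up for illustration.

```python
def worst_case_states(clients: int, max_src_states: int, rules: int = 1) -> int:
    """Upper bound on states if every client hits the cap on every rule."""
    return clients * max_src_states * rules


CLIENTS = 300  # hypothetical client count, not taken from the thread

for cap in (8192, 512):
    print(f"max src states {cap:>4}: at most "
          f"{worst_case_states(CLIENTS, cap):,} states per rule")
```

With a cap of 8192, 300 misbehaving clients could still create about 2.4 million states on a single pass rule; with 512 the bound drops to about 150k. IPv6 clients that rotate through several source addresses would each count separately, so the real ceiling can be higher than this bound suggests.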

                                    • DerelictD
                                      Derelict LAYER 8 Netgate
                                      last edited by

                                      Is this running on Hyper-V?


                                      • X
                                        xciter327 @Derelict
                                        last edited by

                                        @Derelict said in PF States limit reached.:

                                        Is this running on Hyper-V?

                                        Appreciate your reply. It is on a physical box: a Supermicro Atom C2750.

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.