Netgate Discussion Forum

    Connections dropping under heavy load

    General pfSense Questions
    18 Posts 3 Posters 1.5k Views
    • DerelictD
      Derelict LAYER 8 Netgate
      last edited by

      That probably has nothing to do with the amount of traffic (30Mb/sec is pretty much nothing) but the number of states.

      What do the state levels look like?

      Chattanooga, Tennessee, USA
      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
      Do Not Chat For Help! NO_WAN_EGRESS(TM)

      • P
        pedreter @Derelict
        last edited by

        Thx @Derelict..

        Values are on average...

        State table size: 2% (30513/1630000)
        and
        MBUF Usage: 1% (28616/2000000)

        i agree 309 Mbps is nothing... :-(

        Thanks!
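The percentages quoted above can be recomputed from the raw counts as a sanity check. This is a minimal sketch; the helper name `usage_percent` is made up for illustration and is not part of pfSense.

```python
def usage_percent(used: int, limit: int) -> float:
    """Return used/limit expressed as a percentage."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * used / limit


# Raw counts as quoted in the post above.
state_table = usage_percent(30513, 1630000)
mbufs = usage_percent(28616, 2000000)

print(f"State table: {state_table:.2f}%")
print(f"MBUF usage:  {mbufs:.2f}%")
```

Both values round to the 2% and 1% shown in the dashboard, so neither limit is anywhere near exhausted.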

        • DerelictD
          Derelict LAYER 8 Netgate
          last edited by

          Do you have State Killing on Gateway Failure checked in System > Advanced, Miscellaneous on either node?

          @pedreter said in Connections dropping under heavy load:

          i agree 309 Mbps is nothing... :-(

          Your OP said 30Mb/sec.

          • P
            pedreter @Derelict
            last edited by

            @Derelict

            Sorry, a typo... 30Mbps... not 309

            Thanks..

            • DerelictD
              Derelict LAYER 8 Netgate
              last edited by

              Do you have State Killing on Gateway Failure checked in System > Advanced, Miscellaneous on either node?

              • P
                pedreter @Derelict
                last edited by

                @Derelict

                State Killing on Gateway Failure is unchecked

                Thanks for your kindness and help, Derelict!

                • DerelictD
                  Derelict LAYER 8 Netgate
                  last edited by

                  On both nodes?

                  Well, something is killing the states. The default expiration of an ESTABLISHED:ESTABLISHED TCP connection is 24 hours with zero traffic.

                  People sometimes see this when adaptive pruning kicks in but at those state table levels that certainly should not be the case.

                  Again, this would have nothing to do with traffic load but something killing the state.
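To see why adaptive pruning should not be a factor at these levels, here is a rough sketch of the linear scaling described in pf.conf(5). It assumes the documented defaults (scaling starts at 60% of the state limit and reaches zero at 120%) and the default 86400-second `tcp.established` timeout; the values actually configured on this firewall are not shown in the thread, and the function name is hypothetical.

```python
def adaptive_timeout(base_seconds: int, states: int, state_limit: int,
                     start_pct: int = 60, end_pct: int = 120) -> int:
    """Scale a pf timeout the way pf.conf(5) describes adaptive timeouts.

    start_pct and end_pct are the assumed defaults for adaptive.start and
    adaptive.end, expressed as a percentage of the state limit.
    """
    start = state_limit * start_pct // 100
    end = state_limit * end_pct // 100
    if states <= start:
        return base_seconds          # no scaling below adaptive.start
    if states >= end:
        return 0                     # timeouts collapse to zero at adaptive.end
    # Linear factor: (adaptive.end - states) / (adaptive.end - adaptive.start)
    return base_seconds * (end - states) // (end - start)


ESTABLISHED = 24 * 60 * 60           # 24 hours in seconds
LIMIT = 1630000                      # Firewall Maximum States from the thread

print(adaptive_timeout(ESTABLISHED, 30513, LIMIT))    # reported state count
print(adaptive_timeout(ESTABLISHED, 1467000, LIMIT))  # hypothetical 90% load
```

Under these assumptions the timeout is untouched until the table holds 978,000 states, far above the roughly 30,000 reported.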

                  • P
                    pedreter @Derelict
                    last edited by

                    @Derelict

                    Your words make sense to me... I will dig in that direction...

                    Thanks again!

                    • P
                      pedreter @Derelict
                      last edited by

                      @Derelict said in Connections dropping under heavy load:

                      adaptive pruning kicks in but at those state table levels that certainly should not be the case.

                      Derelict,

                      Currently I have these values:

                      Firewall Maximum States: 1630000

                      but

                      net.pf.source_nodes_hashsize: 8192
                      net.pf.states_hashsize: 32768

                      Are they correct? Shouldn't they be bigger?

                      Thanks!
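One rough way to reason about the hash size question is the average number of states per hash bucket. The sketch below assumes the state table behaves like a chained hash table with `net.pf.states_hashsize` buckets and evenly spread keys, which is a simplification; the function name is made up for illustration.

```python
def average_chain_length(states: int, hash_buckets: int) -> float:
    """Average entries per bucket, assuming an even spread of keys."""
    if hash_buckets <= 0:
        raise ValueError("hash_buckets must be positive")
    return states / hash_buckets


HASHSIZE = 32768                     # net.pf.states_hashsize from the post

current = average_chain_length(30513, HASHSIZE)      # states reported earlier
at_limit = average_chain_length(1630000, HASHSIZE)   # table completely full

print(f"current:  {current:.2f} states per bucket")
print(f"at limit: {at_limit:.2f} states per bucket")
```

At the state counts reported in this thread the average is below one entry per bucket; longer chains would only appear if the table approached its configured maximum.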

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        That's the default size and they are never normally an issue.

                        One thing you might try here is to disable pfSync on the secondary. It's possible you have an interface mismatch and the secondary is syncing back states onto the wrong interface breaking them.
                        If you no longer lose connections with that disabled check the config of both firewalls match exactly.
                        Though that would not normally be load related.

                        Steve

                        • P
                          pedreter
                          last edited by

                          @stephenw10 said in Connections dropping under heavy load:

                          That's the default size and they are never normally an issue.

                          Thanks Stephen..

                          When I do what you suggest, the state table grows hugely and very quickly... is that normal? It gets back to normal if I reactivate pfsync on the secondary.

                          I am trying to work out whether it makes any difference...

                          Thanks!

                          • stephenw10S
                            stephenw10 Netgate Administrator
                            last edited by

                            How huge? It might be the secondary was killing most states and now it is not...

                            How many clients are behind it?

                            Steve

                            • P
                              pedreter @stephenw10
                              last edited by pedreter

                              @stephenw10

                              Wow... stephenw10, your remark is very interesting... huge means (on average) a sudden growth from 25.0000 to 150.0000 entries... yes, that huge! And it goes back to 25.000 if pfsync is enabled on the secondary again.

                              There are 15 clients behind the pfsense-cluster.

                              Why would the secondary want to kill states?

                              Thanks!

                              • DerelictD
                                Derelict LAYER 8 Netgate
                                last edited by

                                Besides looking at the numbers of states (what is 150.0000 anyway? Is that one hundred fifty thousand or one million five hundred thousand?) does the issue with your states being killed (ssh sessions dying, etc) go away with pfsync disabled?

                                As Steve mentioned the first thing to do is verify all of your interfaces match up.

                                I use Diagnostics > Interfaces for this. The internal interface name (wan, lan, opt1, opt2, etc), the physical interface name (igb0, ix1, re2, vxnet4) all need to match exactly between primary and secondary. The description should not need to match but for consistency I would make them match.

                                What you are seeing is not normal. There is obviously something wrong with your configuration. What that is, is still unknown. I don't think either of us has ever seen this exact behavior before.
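The interface check described above can be sketched as a simple comparison of the internal-name to physical-name mapping on each node. The assignments below are hypothetical, typed in by hand to show a swapped pair; they are not taken from the poster's firewalls, and nothing here reads a real pfSense config.

```python
def interface_mismatches(primary: dict, secondary: dict) -> dict:
    """Return {internal_name: (primary_nic, secondary_nic)} for each difference."""
    names = sorted(set(primary) | set(secondary))
    return {
        name: (primary.get(name), secondary.get(name))
        for name in names
        if primary.get(name) != secondary.get(name)
    }


# Hypothetical assignments as they might be copied from each node's GUI.
primary = {"wan": "igb0", "lan": "igb1", "opt1": "igb2"}
secondary = {"wan": "igb0", "lan": "igb2", "opt1": "igb1"}

for name, (p_nic, s_nic) in interface_mismatches(primary, secondary).items():
    print(f"{name}: primary={p_nic} secondary={s_nic}")
```

An empty result means the two mappings agree; any entry printed is an interface whose synced states would land on a different NIC on the other node.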

                                • P
                                  pedreter
                                  last edited by

                                  Thanks Derelict...

                                  Sorry again for my typo: the correct figure is 150.000

                                  I agree this does not look normal.

                                  I migrated from 2.1.5 (worked so well!) to 2.4.4-p3 by installing 2.4.4 from ISO and then importing the config from an XML file.

                                  There was no error importing the old config (there were no packages installed) and the interface names, descriptions, and physical devices match exactly. In fact CARP is working.

                                  Could the XML import have done something in 2.4.4 to cause this problem? Maybe something has been corrupted?

                                  Thanks again!

                                  • DerelictD
                                    Derelict LAYER 8 Netgate
                                    last edited by Derelict

                                    Doubtful.

                                    You still have not answered the question: does the issue with your states being killed (ssh sessions dying, etc) go away with pfsync disabled?

                                    Perhaps you should post your settings instead of just saying they match. Cannot count the times a poster has said things are one way when they, in fact, are not.

                                    You did update both nodes to 2.4.4-p3 correct?

                                    • stephenw10S
                                      stephenw10 Netgate Administrator
                                      last edited by

                                      I mean 10k states per client does seem..... high! But it depends what those clients are doing. If those are all legitimate states then you could be hitting something else more quickly than we would otherwise expect.

                                      But, yeah, did disabling pfSync on the secondary correct the connection drops you were seeing?

                                      Steve
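The "10k states per client" figure follows directly from the numbers posted earlier in the thread. A quick check, using a throwaway helper name:

```python
def states_per_client(total_states: int, clients: int) -> float:
    """Average number of firewall states held per client."""
    if clients <= 0:
        raise ValueError("clients must be positive")
    return total_states / clients


CLIENTS = 15                         # clients behind the cluster, per the thread

with_pfsync = states_per_client(25000, CLIENTS)      # secondary pfsync enabled
without_pfsync = states_per_client(150000, CLIENTS)  # secondary pfsync disabled

print(f"pfsync enabled:  {with_pfsync:.0f} states per client")
print(f"pfsync disabled: {without_pfsync:.0f} states per client")
```

Whether either average is reasonable depends on what the clients are doing, which the thread has not established.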

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.