Netgate Discussion Forum

    SG-5100 WAN failover at gigabit saturation

    Official Netgate® Hardware
    14 Posts 3 Posters 1.2k Views
    • A
      ashlm @SteveITS

      @steveits Thanks for your reply. Just seems odd that the gateway stays live and has no latency issues when it is the only gateway but once failover is introduced it starts misbehaving.

      Will look into shaping / limiting, but it's definitely a band-aid and not a solution.

      • S
        SteveITS Galactic Empire @ashlm

        @ashlm The latency triggers the failover. Raising the latency threshold to, say, 1500 ms would keep it from triggering. So would increasing the "Time Period" on the gateway, which makes it average over a longer window. That's of course not ideal if the link is always that slow, but that's what we found avoided the 30-second-busy failovers.
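To illustrate why either change helps, here is a rough sketch in Python. It is a simplified model of threshold-on-average monitoring, not the gateway monitor's actual code. The function name `alarm_timeline`, the probe series, and the window sizes are all made up for illustration (assuming roughly two probes per second, so 60 probes is about a 30-second burst).

```python
from collections import deque


def alarm_timeline(rtts_ms, window_size, threshold_ms):
    """Return, per probe, whether the rolling mean RTT exceeds threshold_ms.

    Hypothetical helper: models "average latency over a time period"
    as a plain mean over the last `window_size` probes.
    """
    window = deque(maxlen=window_size)
    alarms = []
    for rtt in rtts_ms:
        window.append(rtt)
        mean = sum(window) / len(window)
        alarms.append(mean > threshold_ms)
    return alarms


# Made-up probe series: quiet link, a ~30 s burst at 1200 ms, quiet again.
rtts = [20] * 60 + [1200] * 60 + [20] * 120

# Short averaging window, 1000 ms threshold: the burst trips the alarm.
short_window = alarm_timeline(rtts, window_size=20, threshold_ms=1000)

# Same threshold, much longer averaging window: the burst is diluted.
long_window = alarm_timeline(rtts, window_size=240, threshold_ms=1000)

# Short window again, but threshold raised above the burst latency.
high_threshold = alarm_timeline(rtts, window_size=20, threshold_ms=1500)

print("short window alarms:", any(short_window))
print("long window alarms:", any(long_window))
print("raised threshold alarms:", any(high_threshold))
```

With these made-up numbers only the short-window case alarms. The trade-off is the one mentioned above: a longer window or higher threshold also delays or suppresses detection when the link really is bad.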

        And yes limiter/shaping is in some ways a band aid but avoids the latency. IOW it's not really a pfSense problem, the problem is the device is flooding the connection, so pfSense is doing what it's been told and failing over when latency spikes.

        In our case it was a client and we aren't on site so it took a long time to catch it while it was happening and track it down to a Mac, by MAC address. We think it was doing a backup or maybe a long video upload, never quite figured that out as we didn't get a great answer from the person. (which is why I think it was a backup)

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

        • A
          ashlm @SteveITS

          @steveits Thanks again for the reply, it's very helpful.

          @steveits said in SG-5100 WAN failover at gigabit saturation:

          The latency triggers the failover.

          Yes, but latency on the gigabit interface reaches the failover threshold (>1s) only when failover is enabled. RTTsd remains below 400ms, well below the failover threshold, when the gigabit interface is set as the solitary gateway, and the gigabit interface remains up for the entire test.

          RTTsd only exceeds 1000ms when failover is enabled.

          • S
            SteveITS Galactic Empire @ashlm

            @ashlm Oh, I get what you're saying now! I hadn't noticed that but wasn't looking for it. That would explain why we only saw it at that client. We thought it was the Mac because that's the only device we ever saw "cause" the problem, on several occasions.

            Since it sounds like you can reproduce it, I suggest opening a case at redmine.pfsense.org and linking to this thread.

            • A
              ashlm @SteveITS

              @steveits Thanks, have done so: "Enabling gateway failover introduces latency increase and causes artificial failover scenario".

              • S
                SteveITS Galactic Empire @ashlm

                @ashlm Is that the right URL? It talks about traffic shaping, and is from 8 days ago. :)

                • stephenw10S
                  stephenw10 Netgate Administrator

                  You're testing that in 21.02?

                  Can you upgrade to 22.01 and see if it's still happening?

                  Steve

                  • A
                    ashlm @stephenw10

                    @stephenw10 Apologies, that's a mistake. It's "22.01-RELEASE (amd64) built on Mon Feb 07 16:37:59 UTC 2022".

                    • stephenw10S
                      stephenw10 Netgate Administrator

                      Ah, Ok. Do you know if this is new behaviour in 22.01?

                      • A
                        ashlm @stephenw10

                        @stephenw10 The same failover scenario manifested in 21.02 on the SG-3100, though that device couldn't achieve gigabit down on the WAN interface and was replaced with the SG-5100 without further testing with a solitary gateway. I updated the SG-5100 to the latest release before deployment, so I can't say for certain whether it would happen on 21.02 on the SG-5100.

                        • A
                          ashlm @ashlm

                          @ashlm The issue is resolved, or rather it is not an issue / not an accurate description. The same latency increase to >1s was recorded while testing the solitary gateway config this morning, so it is no longer confined / attributable to enabling failover.

                          • stephenw10S
                            stephenw10 Netgate Administrator

                            Ah, OK, thanks for the update. I couldn't replicate it here.

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.