Netgate Discussion Forum

    Gateway not switching back after failover

    General pfSense Questions
    13 Posts 4 Posters 1.3k Views
    • P
      Panja
      last edited by

      I have multiple VPN gateways set up in a failover gateway group, with the trigger level set to "Member down".

      So once Tier 1 goes down it switches to Tier 2.
      This works but when Tier 1 comes back up all my clients are still going over the Tier 2 gateway.

      Is there a way to fix this? I've read several threads with the same problem but no fix found...

      Running pfSense CE v2.6.0

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        When the Tier 1 gateways come back up, new connections will use them again, but existing connections on the Tier 2 gateway will remain until they are closed.
        For some types of traffic that can be indefinite, SIP connections for example.
        Most traffic using TCP closes at the end of the session and new connections will be on the restored WAN.

        Steve
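
The behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not how pf is implemented: the class name `ToyFirewall` and its methods are hypothetical. Each connection is pinned to whichever gateway the rule pointed at when the connection's state was created.

```python
class ToyFirewall:
    """Toy model: a state remembers the gateway it was created with."""

    def __init__(self, gateway):
        self.gateway = gateway  # gateway the policy rule currently points at
        self.states = {}        # connection id -> gateway chosen at state creation

    def packet(self, conn_id):
        # An existing state wins; the rule is only consulted for new connections.
        return self.states.setdefault(conn_id, self.gateway)

    def close(self, conn_id):
        # Closing the connection removes its state, so the next packet is "new".
        self.states.pop(conn_id, None)
```

In this model a long-lived connection (SIP, for example) keeps returning the Tier 2 gateway after `gateway` is switched back to Tier 1, until `close()` is called for it.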

        • P
          Panja @stephenw10
          last edited by

          @stephenw10

          Thanks for the reply.
          Is there a way to force all connections back to Tier 1 immediately, rather than after a certain period?
          Btw, in my test case I waited for at least 15 minutes and it did not switch back.

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Not built into pfSense. You could probably script something.

            What connections are you seeing still using the Tier 2 gateway?

            • P
              Panja @stephenw10
              last edited by

              @stephenw10

              This is the situation.

              2 VPN gateways in failover, Tier 1 and Tier 2.
              Laptop is in VLAN LAN_VPN and policy-based routing is applied.
              So my laptop uses the failover group as outgoing.

              Once I bring down Tier 1 the failover does its job and switches to Tier 2.
              Once Tier 1 comes back up my laptop is still going out over the Tier 2 gateway.

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                For new connections or existing connections? If existing what type of traffic?

                What should happen there is that the gateway defined for the group in /tmp/rules.debug changes. Make sure it does.

                Steve

                • P
                  Panja @stephenw10
                  last edited by Panja

                  @stephenw10

                  New and existing connections.

                  I have checked the behaviour of the /tmp/rules.debug file and checked the changes:

                  Normal situation where Tier 1 and Tier 2 are up:

                  GWVPN_WAN1 = " route-to ( ovpnc1 10.10.10.1 ) "
                  GWVPN_WAN2 = " route-to ( ovpnc2 10.20.20.1 ) "
                  GWvpn_gw_group = "  route-to { ( ovpnc1 10.10.10.1 )  }  "
                  

                  When Tier 1 goes down I have:

                  GWVPN_WAN2 = " route-to ( ovpnc2 10.20.20.1 ) "
                  GWvpn_gw_group = "  route-to { ( ovpnc2 10.20.20.1 )  }  "
                  

                  After Tier 1 comes back up and is online (Tier 2 still online as well) I have:

                  GWVPN_WAN2 = " route-to ( ovpnc2 10.20.20.1 ) "
                  GWvpn_gw_group = "  route-to { ( ovpnc2 10.20.20.1 )  }  "
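
For anyone wanting to repeat this check without reading the file by eye, here is a minimal sketch that pulls the route-to macros out of rules.debug text. It assumes the macro lines look exactly like the ones quoted in this post; the function names are made up for illustration, and the sample text is the "after Tier 1 comes back up" output from above.

```python
import re

# Matches one "( interface gateway_ip )" pair inside a route-to macro.
ROUTE_PAIR = re.compile(r"\(\s*(\S+)\s+(\S+)\s*\)")

def parse_gateway_macros(rules_text):
    """Map macro name -> list of (interface, gateway_ip) pairs."""
    macros = {}
    for line in rules_text.splitlines():
        name, sep, value = line.partition("=")
        if not sep or "route-to" not in value:
            continue  # not a route-to macro definition
        macros[name.strip()] = ROUTE_PAIR.findall(value)
    return macros

def group_target(rules_text, group_macro):
    """Return the (interface, gateway_ip) pairs the group macro points at."""
    return parse_gateway_macros(rules_text).get(group_macro, [])

# Sample taken from the post: state after Tier 1 came back online.
SAMPLE = '''GWVPN_WAN2 = " route-to ( ovpnc2 10.20.20.1 ) "
GWvpn_gw_group = "  route-to { ( ovpnc2 10.20.20.1 )  }  "
'''

print(group_target(SAMPLE, "GWvpn_gw_group"))  # [('ovpnc2', '10.20.20.1')]
```

On the firewall itself the text would come from reading /tmp/rules.debug. If the group still resolves to ovpnc2 after Tier 1 is back, the macro was not regenerated.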
                  
                  • stephenw10S
                    stephenw10 Netgate Administrator
                    last edited by

                    Hmm, then the only other possibility is that the ruleset is not actually being reloaded for some reason.
                    Check the gateway used in the running policy rule in the output of: pfctl -vsr (that could be large!)
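
Since that output can be large, a small filter helps. The following is a rough sketch, assuming a Python 3 interpreter is available on the firewall (or that the output is copied to another machine); the function names are hypothetical. It keeps only the loaded rule lines that mention route-to, so the gateway in the running policy rule is easy to spot.

```python
import subprocess

def route_to_lines(pfctl_output, needle="route-to"):
    """Return the lines of `pfctl -vsr` output that mention route-to."""
    return [line.strip() for line in pfctl_output.splitlines() if needle in line]

def loaded_route_to_rules():
    """Run pfctl -vsr (needs root on the firewall) and keep route-to rules."""
    result = subprocess.run(
        ["pfctl", "-vsr"], capture_output=True, text=True, check=True
    )
    return route_to_lines(result.stdout)
```

Comparing the interface in those lines with the group macro in /tmp/rules.debug shows whether the running ruleset matches the generated one.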

                    • P
                      Panja @stephenw10
                      last edited by

                      @stephenw10

                      Thanks!
                      Will have a look later on.

                      Cheers

                      • M
                        madfuzker
                        last edited by

                        This problem is pretty well documented, but there is still no great solution on the pfSense interface side. I would like to see another setting in the Gateway Group entry that lets you set a time for forced pushback to Tier 1. In my situation, I don't see new traffic go back to Tier 1 for hours after it's back up. What happens is: my Tier 1 VPN sees a brief 5-10 second packet loss (maybe they reset servers every 24 hours or something?), but after that all is good again, with no loss for another 24 hours or so. When I run a dnsleaktest.com test on a client PC, it stays on the failover Tier 2 VPN for hours at a time before it ever switches back. One way to force it back to Tier 1 right away is to restart the VPN service on Tier 2, which kicks it back to Tier 1 immediately.

                        So if I had a "Gateway Priority" setting that said "force traffic back to Tier 1 after x seconds of uptime on Tier 1", life would be good! People could customize it how they needed: ~30 seconds if downtime was always small, like in my case, or 2 hours if others wanted that.
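
The decision behind the setting proposed here could look roughly like the sketch below. No such option exists in pfSense as far as this thread establishes; the function and parameter names are invented for illustration, and timestamps are plain seconds.

```python
def should_force_failback(tier1_up_since, now, hold_seconds, active_tier):
    """Decide whether to push traffic back to Tier 1.

    tier1_up_since: time Tier 1 last came up, or None while it is down.
    now:            current time, in the same units (seconds).
    hold_seconds:   required continuous Tier 1 uptime before failing back.
    active_tier:    tier the gateway group currently routes through.
    """
    if active_tier == 1 or tier1_up_since is None:
        return False  # already on Tier 1, or Tier 1 is still down
    return (now - tier1_up_since) >= hold_seconds
```

With `hold_seconds=30`, a Tier 1 link that has been stable for half a minute would trigger failback, while a flapping link that keeps resetting `tier1_up_since` would not.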

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Yes the failback situation could be improved. It exists as it is because when that code was created it was not possible to fail back in any useful way without killing all connections. That should now be possible, though it would still be disruptive.

                          • B
                            BrucexLing
                            last edited by

                            As heavy-handed as it might be, I am still curious to know whether a Reset States command would be sufficient to guarantee failback.

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              As long as the main gateway has come back up, any new states would be created via that. So, yes, I would expect it to fail back.
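
For scripting the state reset discussed here, one possible sketch is below. It assumes that flushing states from the command line with `pfctl -F states` has the same effect on failback as the GUI's Reset States, which this thread does not confirm. `pfctl -k <host>` is the narrower option that only kills states originating from one host. The function names are hypothetical.

```python
import subprocess

def state_reset_command(client_ip=None):
    """Build the pfctl command: flush all states, or only one client's."""
    if client_ip is None:
        return ["pfctl", "-F", "states"]  # flush the whole state table
    return ["pfctl", "-k", client_ip]     # kill states originating from this host

def reset_states(client_ip=None):
    """Run the command on the firewall (needs root); raises if pfctl fails."""
    subprocess.run(state_reset_command(client_ip), check=True)
```

Calling `reset_states()` drops every connection through the firewall, so passing a single client address (for example the laptop in LAN_VPN) is the less disruptive choice.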

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.