Netgate Discussion Forum

    Multi-WAN gateway failover not switching back to tier 1 gw after back online

    Routing and Multi WAN
    119 Posts 35 Posters 53.3k Views
    • S
      serbus

      Hello!

      Ahhhh, gotcha.

      I am having problems following the thread. It is long and old, and seems to cover different (resolved?) problems. Yours could be yet another issue. Maybe a new thread?

      John

      Lex parsimoniae

      • N
        nleaudio

        As far as I know, this is still problematic! Some pfSense boxes I have on dual-WAN setups will switch from tier 1 to tier 2 connections without issue, but switching back when tier 1 is restored does not always work, at least not in a timeframe I would consider usable.

        Bob

        • M
          mo10 @nleaudio

          @nleaudio said in Multi-WAN gateway failover not switching back to tier 1 gw after back online:

          As far as I know, this is still problematic! Some pfSense boxes I have on dual-WAN setups will switch from tier 1 to tier 2 connections without issue, but switching back when tier 1 is restored does not always work, at least not in a timeframe I would consider usable.

          Bob

          I think I had those issues because I imported a configuration onto different hardware. Did you do the same?
          After I did a reset to defaults and set everything up again, it is now switching back fine.

          • gnitingG
            gniting @mo10

            @mo10 to be clear: in your dual-WAN setup, if WAN1 (the default gateway) goes down and pfSense ends up making WAN2 the default, then upon recovery of WAN1, pfSense automatically marks WAN1 as the default again?

            • M
              mo10 @gniting

              @ibbetsion

              This was never a problem for me. It marked the gateway as default correctly but was still sending traffic the wrong way. Saving an interface fixed it until I physically unplugged a cable again.
              Now, after resetting everything, it all runs as expected.

              Do you have problems with multi-WAN? What exactly?

              • gnitingG
                gniting @mo10

                @mo10 my problem is that after recovery, WAN1 never goes back to being the default. I have to use a script to bring down WAN2 so that WAN1 becomes the default again. Not an ideal solution, but it works.

                • M
                  mo10 @gniting

                  @ibbetsion

                  Would you be able to run a test: reset your pfSense (save the configuration first), set up only the multi-WAN, and try again?

                  • S
                    serbus

                    Hello!

                    Assuming a pretty standard multi-WAN setup: WAN1 -> tier1, WAN2 -> tier2, PREFWAN1/PREFWAN2/BALANCE gwgroups.

                    Whether you have states left open on WAN2 after WAN1 comes back up (sticky connections?), or the default in the system routing table doesn't switch back to WAN1 after it recovers (make sure you don't have BALANCE as the default gateway), I believe the best approach is to policy route everything.

                    After WAN1 comes back, does traffic routed to a PREFWAN1 gwgroup still go out WAN2?

                    John

                    Lex parsimoniae
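
Editor's sketch: the failback behaviour everyone in this thread expects from a tiered gateway group can be modelled in a few lines of Python. This is only a toy model of the expected selection rule (lowest-numbered online tier wins), not pfSense's actual code; the `Gateway` class and `pick_gateway` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gateway:
    name: str
    tier: int      # lower number = higher priority (tier 1 is preferred)
    online: bool   # as reported by whatever monitoring is in use


def pick_gateway(group: List[Gateway]) -> Optional[Gateway]:
    """Return the online gateway with the lowest tier, or None if all are down."""
    candidates = [gw for gw in group if gw.online]
    if not candidates:
        return None
    return min(candidates, key=lambda gw: gw.tier)


# A PREFWAN1-style group: WAN1 on tier 1, WAN2 on tier 2.
wan1 = Gateway("WAN1", tier=1, online=True)
wan2 = Gateway("WAN2", tier=2, online=True)
prefwan1 = [wan1, wan2]

print(pick_gateway(prefwan1).name)  # WAN1
wan1.online = False
print(pick_gateway(prefwan1).name)  # WAN2 (failover)
wan1.online = True
print(pick_gateway(prefwan1).name)  # WAN1 (the failback this thread is about)
```

If new connections matched by a PREFWAN1 rule still leave via WAN2 once WAN1 is marked online again, the observed behaviour differs from this model, which is the symptom described above.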

                    • N
                      nleaudio @mo10

                      @mo10

                      Yes, it's quite possible that I moved the config to a new box.

                      And yes, when WAN1 comes back, new connections still go out WAN2.

                      It does appear to recover some time later though - maybe by the following day? I've not looked into it carefully.

                      Bob

                      • M
                        mo10 @nleaudio

                        @nleaudio

                        Are you using DHCP on the WANs or what are you using?

                        • I
                          idiotzoo

                          I have what looks like the same problem. Gateway group with three gateways. When the tier1 goes down (packet loss) tier2 is used. When tier1 comes back, it does not get used and requires manual reconfigure or reboot. No changes I'm aware of to trigger this behaviour. No hints in the logs.

                          • M
                            mo10 @idiotzoo

                            @idiotzoo
                            Are you using DHCP on the WANs, or what are you using on each WAN?
                            Please don't use DHCP; use static instead and report back. Set your main WAN as the upstream gateway.

                            • I
                              idiotzoo @mo10

                              @mo10 The tier2 link is using PPPoE; correct me if I'm wrong, but I can't use PPPoE with a static IPv4 config.

                              I'm not sure what you mean by "Set your main WAN as upstream Gateway".

                              The main WAN link is static. This is a WISP link with a local NAT gateway connected via a VLAN, so the physical link never goes down from pfSense's point of view. The gateway (a Ubiquiti radio) is also no use as an indicator of the connection's health, so I have to ping something further out and use packet loss to determine the link's state.

                              This was working. The only change is the tier3 link appears to have failed entirely, so this is sitting in a pending state. I'm wondering if this is causing the gateway group to behave incorrectly. Next time the issue occurs I'll remove it and see what happens.
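
Editor's sketch: the packet-loss approach described above (probe a remote target, mark the link down when loss gets too high) can be illustrated with a small Python model. It is not pfSense's monitoring code; the `LossMonitor` class, the window size, and both thresholds are hypothetical values chosen for the example. Using a lower threshold for recovery than for failure gives hysteresis, so a link does not flap when loss hovers around a single threshold.

```python
from collections import deque


class LossMonitor:
    """Track recent probe results and derive an online/offline state."""

    def __init__(self, window: int = 20, down_pct: float = 20.0, up_pct: float = 10.0):
        self.results = deque(maxlen=window)  # True = reply received
        self.down_pct = down_pct             # mark offline at or above this loss
        self.up_pct = up_pct                 # mark online again at or below this loss
        self.online = True

    def loss_pct(self) -> float:
        """Percentage of lost probes in the current window."""
        if not self.results:
            return 0.0
        lost = sum(1 for ok in self.results if not ok)
        return 100.0 * lost / len(self.results)

    def record(self, reply_received: bool) -> bool:
        """Store one probe result and return the updated state."""
        self.results.append(reply_received)
        loss = self.loss_pct()
        if self.online and loss >= self.down_pct:
            self.online = False
        elif not self.online and loss <= self.up_pct:
            self.online = True
        return self.online


monitor = LossMonitor(window=10)
for _ in range(10):
    monitor.record(True)
monitor.record(False)
monitor.record(False)
print(monitor.loss_pct(), monitor.online)  # 20.0 False
```

In this model the link only returns to "online" once enough successful probes have pushed the lost ones out of the window, so recovery is always slower than failure.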

                              • M
                                mo10 @idiotzoo

                                @idiotzoo

                                This sounds like a setup error (pending). Do as you say: delete tier 3 from the group and delete the tier 3 interface. Then add everything again.

                                I was asking about DHCP because that was the reason I had problems. I have heard dual PPPoE can cause problems as well, but I am not sure.

                                • I
                                  idiotzoo @mo10

                                  @mo10 Someone on site has verified the tier3 connection is borked at layer 1, so removing it isn't going to hurt anything. Certainly, having a dead link on a lower priority (higher tier) shouldn't cause any issues with the gateway group behaviour, and if it does, this is a bug. But it would be nice to know why a functioning system has stopped working. As this line failure is the only change, I'm hopeful that it at least explains the issue.

                                  • M
                                    mo10 @idiotzoo

                                    @idiotzoo

                                    I have found that there are really strange problems when unplugging and replugging a cable on any WAN port while using DHCP on it.

                                    So maybe you can reproduce your problem by physically unplugging and replugging on your interfaces.

                                    What helped me without needing to reboot: just hit save on any interface.

                                    • I
                                      idiotzoo

                                      I removed the failed wan link from the gateway group, no difference. I've now disabled that interface entirely, still doesn't work.

                                      I'm at a bit of a loss.

                                      Anybody know if there's any debugging I can look at? Right now I only know there's a problem if the users tell me. The gateways all look fine, it just doesn't switch back to the tier1 as it should.
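
Editor's sketch: one low-tech check, rather than waiting for users to complain, is to look at which gateway and interface the system default route currently points to. The Python below parses the text printed by `netstat -rn -f inet` on FreeBSD (which pfSense is based on). It assumes the usual column order Destination, Gateway, Flags, Netif; the function names and the sample table are made up for illustration (203.0.113.1 is a documentation address).

```python
import subprocess
from typing import Optional, Tuple


def read_ipv4_routes() -> str:
    """Run `netstat -rn -f inet` (FreeBSD syntax) and return its text output."""
    result = subprocess.run(
        ["netstat", "-rn", "-f", "inet"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def default_route(table: str) -> Optional[Tuple[str, str]]:
    """Return (gateway, interface) of the IPv4 default route, or None if absent."""
    for line in table.splitlines():
        fields = line.split()
        # Assumed columns: Destination Gateway Flags Netif [Expire]
        if len(fields) >= 4 and fields[0] == "default":
            return fields[1], fields[3]
    return None


# Illustrative sample only, not output captured from a real system.
SAMPLE = """\
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            203.0.113.1        UGS         em0
127.0.0.1          link#2             UH          lo0
"""

print(default_route(SAMPLE))  # ('203.0.113.1', 'em0')
```

On the firewall itself, `default_route(read_ipv4_routes())` would report the live default route. Note that this only covers the system routing table; traffic matched by policy-routing rules that point at a gateway group is not decided by the default route.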

                                      • S
                                        serbus @idiotzoo

                                        @idiotzoo said in Multi-WAN gateway failover not switching back to tier 1 gw after back online:

                                        I have what looks like the same problem. Gateway group with three gateways. When the tier1 goes down (packet loss) tier2 is used. When tier1 comes back, it does not get used and requires manual reconfigure or reboot. No changes I'm aware of to trigger this behaviour. No hints in the logs.

                                        Hello!

                                        What gateway group (failover/loadbalance) are you using as the Default Gateway on System -> Routing -> Gateways?

                                        What gateway group(s) are you using for all your rules with outbound WAN traffic?

                                        John

                                        Lex parsimoniae

                                        • I
                                          idiotzoo

                                          @serbus sorry for the delay in replying.

                                          The system default is wan1 (the fast WAN link).
                                          Outbound traffic with a source on the LAN uses a gateway group called office_internet, with wan1 as tier1 and a slower PPPoE ADSL link as tier2.

                                          • B
                                            basicmonkey

                                            I've got this issue at a client's office.

                                            If I set a two-tier gateway group as the default gateway for IPv4, on failure of tier 1, tier 2 takes over but doesn't switch back to tier 1 on tier 1 recovery (confirmed on the gateway status page). This doesn't happen even after waiting for an hour.

                                            Interestingly, if I set default gateway to the tier 1 link (which then works as expected) and back to the gateway group, the group is still stuck at tier 2.

                                            This is the same with 'member down' and 'packet loss' options. Tier 1 is PPPoE with dynamic gateway (if that makes a difference).

                                            One thing that may be relevant is that both tier 1 and tier 2 have the same gateway IP. Tier 1 is PPPoE over VDSL, tier 2 is L2TP to the same ISP.

                                            pfSense seems to fail to create correct default routes after failures, and I'm often left with no default route despite having working gateways set and active. I need to disable and re-enable the interface to bring it back.

                                            Is this a PPPoE thing?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.