Netgate Discussion Forum

Multi-WAN gateway failover not switching back to tier 1 gw after back online

Routing and Multi WAN
  • S
    satadru @matronyx
    last edited by Aug 3, 2018, 2:38 PM

    @matronyx

    Just for clarification:

    • Default gateway switching is unchecked.

    • A single gateway group with Tier 1 gateway being highest priority, and Tier 2 being lower priority, and member down is the trigger.

    • Firewall rules use that gateway group.

    And that works for failover? If you pull the cord for gateway 1 it switches to gateway 2? And if you reconnect gateway 1 it switches back?

    • M
      matronyx @satadru
      last edited by Aug 4, 2018, 1:13 AM

      @satadru
      Yes that is correct. The switch back and forth between tiers is fully automatic.

      • N
        nleaudio @satadru
        last edited by Sep 6, 2019, 3:12 AM

        Although this is an older thread, I have the same issue with the very latest version as of September 2019. I have three WAN connections, and one of the gateway groups I have configured has two of the gateways in it. My pfSense will fail over to my Tier 2 connection automatically, but when Tier 1 comes back up, it will not go back to Tier 1. I even tried clearing the states: no change. I tried changing which gateway is set as Tier 2, and it just routed all the traffic through that gateway instead of Tier 1. All gateways are up, and show as up.

        What more can I do to debug this? I did not find the "Default Gateway Switching" option where indicated. Indeed, my "default" gateway is the Tier 1 gateway, which the gateway group does not seem to be using.

        My config is a bit complex, but I'm happy to try to debug this. Just need direction. Thanks.

        Bob

        • G
          gniting
          last edited by Sep 6, 2019, 7:53 AM

          I ended up writing a script and running it via cron to achieve the "switch." Yes, it is not elegant, but it gets the job done.

          Here's what I have and I run this as a 5-minute cron job.

          #!/bin/sh
          
          # get active gateway and current time
          CURRENT_TIME="$(date +"%c")"
          CURRENT_GW="$(netstat -rn | grep default | awk '{print $4}')"
          
          if [ "$CURRENT_GW" = "em2" ]; then
          	#check if WAN1 is up or not
          	WAN1_STATUS="$(pfSsh.php playback gatewaystatus brief | grep WANGW | awk '{print $2}')"
          	# variables are quoted so an empty or multi-line result does not break the test
          	if [ "$WAN1_STATUS" = "none" ]; then
          		#WAN1 is back online, stop/start WAN2
          		echo "$CURRENT_TIME: Bringing down WAN2"
          		ifconfig em2 down
          		echo "$CURRENT_TIME: Sleeping for 30s"
          		sleep 30
          		echo "$CURRENT_TIME: Bringing up WAN2"
          		ifconfig em2 up
          	else
          		echo "$CURRENT_TIME: WAN1 is still down"
          	fi
          else
          	echo "$CURRENT_TIME: Nothing to do!"
          fi
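The two lookups in the script above can be made a little more defensive. What follows is only a sketch: it assumes the FreeBSD `netstat -rn` column order (Destination, Gateway, Flags, Netif) and the two-column `name status` output of `gatewaystatus brief` that the script already relies on. The function names `default_route_if` and `gateway_status` are made up for illustration. Matching the gateway name exactly avoids the case where `grep WANGW` also matches a second gateway whose name merely starts with `WANGW`.

```shell
#!/bin/sh
# Hypothetical helpers (not part of pfSense). Both read command output on
# stdin, so they can be tried against saved output before use on a firewall.

# Print the interface of the first default route found on stdin.
# Assumes columns: Destination Gateway Flags Netif
default_route_if() {
	awk '$1 == "default" { print $4; exit }'
}

# Print the status column for exactly one gateway name ($1).
# Assumes one "<name> <status>" pair per line.
gateway_status() {
	awk -v gw="$1" '$1 == gw { print $2; exit }'
}

# Intended use on the firewall (not executed here):
#   CURRENT_GW="$(netstat -rn -f inet | default_route_if)"
#   WAN1_STATUS="$(pfSsh.php playback gatewaystatus brief | gateway_status WANGW)"
```

Restricting `netstat` to `-f inet` keeps an IPv6 default route from adding a second line to `CURRENT_GW`.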
          
          • C
            C_C
            last edited by C_C Oct 3, 2019, 12:45 PM Oct 1, 2019, 1:41 PM

            Hey. Thanks @ibbetsion for the script.

            Here is a slightly modified version that kills firewall states when there are connections remaining on WAN2 and WAN1 is back online.

            Works great for my needs (LTE failover).

            I set it as a cron, every minute:

            */1 * * * * /root/clear_state_back_from_failover_cron.sh >> /root/clear_state_back_from_failover_cron.log
            
            • I also checked "Flush all states when a gateway goes down" in System / Advanced / Miscellaneous.
            • The LTE gateway has monitoring disabled ("Disable Gateway Monitoring" in System / Routing / Gateways). Otherwise monitoring creates states on the interface and the script's state check gives false positives. Also, monitoring would consume data and I did not want that.

            Code:

            #!/bin/sh
            # *** kills firewall states on failover WAN when WAN1 is up ***
            
            WAN1_NAME="WAN_DHCP"
            WAN2_IF=ue0
            WAN2_GW_IP=192.168.3.1
            
            CURRENT_TIME="$(date +"%c")"
            WAN1_STATUS="$(pfSsh.php playback gatewaystatus brief | grep "$WAN1_NAME" | awk '{print $2}')"
            
            if [ "$WAN1_STATUS" = "none" ]; then
            	# the following line may need to be tweaked depending on your needs
            	WAN2_NSTATES="$(pfctl -s state | grep "$WAN2_IF" | grep -v " -> $WAN2_GW_IP" | wc -l)"
            	if [ "$WAN2_NSTATES" -gt 0 ]; then
            		echo "$CURRENT_TIME: WAN1 is online, but connections remain on $WAN2_IF. Killing states."
            		pfctl -F state
            	fi
            else
            	echo "$CURRENT_TIME: WAN1 is down"
            fi
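The state count in the script above treats the gateway IP as a regular-expression prefix, so `192.168.3.1` would also hide states going to `192.168.3.10`. A stricter count could look like the sketch below. It assumes, as the script does, that each line of `pfctl -s state` starts with the interface name and contains ` -> <destination>`. The function name `count_if_states` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count states on one interface, ignoring states whose
# destination is the gateway itself. Reads "pfctl -s state" output on stdin.
#   $1 = interface name (e.g. ue0)
#   $2 = gateway IP to ignore (e.g. 192.168.3.1)
count_if_states() {
	awk -v ifname="$1" -v gw="$2" '
		# index() is a literal substring search, so dots are not wildcards.
		# A destination is the gateway only if the IP is followed by ":" or " ".
		$1 == ifname &&
		index($0, " -> " gw ":") == 0 &&
		index($0, " -> " gw " ") == 0 { n++ }
		END { print n + 0 }
	'
}

# Intended use on the firewall (not executed here):
#   WAN2_NSTATES="$(pfctl -s state | count_if_states "$WAN2_IF" "$WAN2_GW_IP")"
```

Because `awk` prints a bare number, the result can be passed straight to `[ "$WAN2_NSTATES" -gt 0 ]`.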
            
            • C
              cmb991
              last edited by Nov 29, 2019, 11:08 AM

              I'm really surprised pfSense has nothing built in to handle this yet. This has been ongoing since 2017. In my case, my LTE modem (unlimited data) is still in gateway monitoring mode, so I'll be using @ibbetsion's script. Thanks @ibbetsion

              • M
                mo10
                last edited by mo10 Jun 9, 2020, 10:52 PM Jun 7, 2020, 4:47 PM

                EDIT2
                Issues not fixed. If I pull the cable and plug it right back in, Multi-WAN gets messed up and will not switch back correctly.

                EDIT:
                Resetting to defaults and setting everything up again seems to have fixed my issues.

                Old:
                I think I found the cause:

                Multi-WAN (or maybe dpinger) seems not to work properly if the interface goes down and comes back up (unplugging and replugging the cable).
                In my tests I was just unplugging the cable on the WAN port.

                I think the same happens with PPPoE, or any time the link physically goes down and up again.

                In my opinion this should not happen. If a modem reboots, for example, Multi-WAN would stop working.

                I use "Packet Loss" as the trigger level on the gateway group.

                Would love to hear from you, thanks.

                • S
                  serbus
                  last edited by Jun 7, 2020, 6:46 PM

                  Hello!

                  I have several sites using multi wan and gateway groups with a mixture of static, dhcp, and pppoe. They all behave as expected.

                  Are you policy routing all of your WAN bound traffic?

                  "Defining gateway groups is only part of the story. Traffic must be assigned to these gateways using the Gateway setting on firewall rules."

                  https://docs.netgate.com/pfsense/en/latest/routing/multi-wan.html#firewall-rules

                  My experience is that you can't depend on the system routing table having your "preferred" (tier1) default route.

                  John

                  Lex parsimoniae

                  • M
                    mo10 @serbus
                    last edited by mo10 Jun 7, 2020, 6:54 PM Jun 7, 2020, 6:50 PM

                    @serbus said in Multi-WAN gateway failover not switching back to tier 1 gw after back online:

                    Hello!

                    I have several sites using multi wan and gateway groups with a mixture of static, dhcp, and pppoe. They all behave as expected.

                    Are you policy routing all of your WAN bound traffic?

                    "Defining gateway groups is only part of the story. Traffic must be assigned to these gateways using the Gateway setting on firewall rules."

                    https://docs.netgate.com/pfsense/en/latest/routing/multi-wan.html#firewall-rules

                    My experience is that you can't depend on the system routing table having your "preferred" (tier1) default route.

                    John

                    Thank you,

                    The strange thing is: if I switch off the internet connection during testing without pulling the cable, everything works just as expected, every time. As soon as I unplug and replug the cable, I have to re-save the interface settings (for example) to get switching back to the default tier working again.

                    I checked almost every configuration before and nothing really helped.

                    • S
                      serbus
                      last edited by Jun 7, 2020, 7:23 PM

                      Hello!

                      Ahhhh, gotcha.

                      I am having problems following the thread. It is long and old, and seems to cover different (resolved?) problems. Yours could be yet another issue. Maybe a new thread?

                      John

                      Lex parsimoniae

                      • N
                        nleaudio
                        last edited by Jun 8, 2020, 5:10 AM

                        As far as I know, this is still problematic! Some pfSense boxes I have on dual-WAN setups will switch from tier 1 to tier 2 connections without issue, but going back when tier 1 is restored does not always work... At least not in a timeframe I would consider usable.

                        Bob

                        • M
                          mo10 @nleaudio
                          last edited by Jun 8, 2020, 9:25 AM

                          @nleaudio said in Multi-WAN gateway failover not switching back to tier 1 gw after back online:

                          As far as I know, this is still problematic! Some PFSense boxes I have on dual-wan setups will switch from tier 1 to tier 2 connections without issue, but going back when the tier 1 is restored does not always work... At least not in the timeframe I would consider usable.

                          Bob

                          I think I had those issues because I imported a configuration onto different hardware. Did you do the same?
                          After I did a reset to defaults and set everything up again, it now switches back fine.

                          • G
                            gniting @mo10
                            last edited by Jun 8, 2020, 10:39 AM

                            @mo10 to be clear... in your dual-WAN setup, if WAN1 (default gateway) goes down and pfSense ends up making WAN2 the default, then upon recovery of WAN1, pfSense automatically marks WAN1 as default?

                            • M
                              mo10 @gniting
                              last edited by Jun 8, 2020, 10:51 AM

                              @ibbetsion

                              This was never a problem for me. It marked the gateway as default fine, but it was still sending traffic the wrong way. Saving an interface fixed it until I physically unplugged a cable again.
                              Now, after resetting everything, it all runs as expected.

                              Do you have problems with Multi-WAN? What exactly?

                              • G
                                gniting @mo10
                                last edited by Jun 8, 2020, 11:08 AM

                                @mo10 my problem is that post recovery, WAN1 never goes back to being default. I have to use a script to bring down WAN2 so that WAN1 becomes default again. Not an ideal solution but it works.

                                • M
                                  mo10 @gniting
                                  last edited by Jun 8, 2020, 12:23 PM

                                  @ibbetsion

                                  Would you be able to run a test: reset your pfSense (save the configuration first), set up just the multi-WAN, and try again?

                                  • S
                                    serbus
                                    last edited by serbus Jun 8, 2020, 12:29 PM Jun 8, 2020, 12:27 PM

                                    Hello!

                                    Assuming a pretty standard multiwan: WAN1 -> tier1, WAN2 -> tier2, PREFWAN1/PREFWAN2/BALANCE gwgroups.

                                    Whether you have states left open on WAN2 after WAN1 comes back up (sticky connections?), or the default route in the system routing table doesn't switch back to WAN1 after it recovers (make sure you don't have BALANCE as the default gateway), I believe the best approach is to policy route everything.

                                    After WAN1 comes back, does traffic routed to a PREFWAN1 gwgroup still go out WAN2?
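One rough way to answer that question from the shell is to tally firewall states per interface before and after WAN1 recovers and watch which count grows as new connections are made. This is a sketch only: it assumes the first column of `pfctl -s state` is the interface name, as the state-killing script earlier in the thread does, and the function name `states_per_if` is hypothetical. States that were opened during the outage stay on WAN2 until they expire or are killed, so only the growth matters.

```shell
#!/bin/sh
# Hypothetical helper: print "<interface> <state count>" for every interface
# seen in "pfctl -s state" output supplied on stdin, sorted by name.
states_per_if() {
	awk 'NF > 0 { count[$1]++ } END { for (i in count) print i, count[i] }' | sort
}

# Intended use on the firewall (not executed here):
#   pfctl -s state | states_per_if
```

Running it twice a minute or so apart, while generating some new outbound traffic, gives a quick indication of which WAN new connections are leaving through.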

                                    John

                                    Lex parsimoniae

                                    • N
                                      nleaudio @mo10
                                      last edited by Jun 8, 2020, 3:07 PM

                                      @mo10

                                      Yes, it's quite possible that I moved the config to a new box.

                                      And yes, when WAN1 comes back, new connections still go out WAN2.

                                      It does appear to recover some time later though - maybe by the following day? I've not looked into it carefully.

                                      Bob

                                      • M
                                        mo10 @nleaudio
                                        last edited by Jun 11, 2020, 1:30 PM

                                        @nleaudio

                                        Are you using DHCP on the WANs or what are you using?

                                        • I
                                          idiotzoo
                                          last edited by Jun 13, 2020, 11:57 AM

                                          I have what looks like the same problem: a gateway group with three gateways. When tier1 goes down (packet loss), tier2 is used. When tier1 comes back, it does not get used, and it takes a manual reconfigure or a reboot to fix. I'm not aware of any changes that would trigger this behaviour. No hints in the logs.

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.