Failover recovery isn't always working right for me



  • I have a standard failover setup built from guides I found online. Rather than going into the exact details of my setup, I would like to ask in what scenario the following is possible: WAN1 is detected as down from packet loss, traffic switches over to WAN2, then WAN1 comes back up, and pfSense knows it's back up because it sent out the email and everything, but traffic is still going through WAN2. You can then physically unplug the Ethernet cable going to WAN1 and immediately plug it back in, and suddenly everything goes through WAN1 again.

    This is confusing to me, as I do not understand why unplugging and replugging it would suddenly fix it when pfSense already knew the connection was back up.


  • LAYER 8 Netgate

    Regardless of the email received, what is the status of the gateways before you intervene?

    Status > Gateways

    Are you talking about old connections that might have been established out WAN2 while WAN1 was down or new connections?

    How are your gateways, gateway groups, and policy routing configured?



  • @derelict

    Gateway shows up as online.

    New connections. But even if it were an old connection that was established out WAN2 while WAN1 was down: if it doesn't switch over when WAN1 comes up on its own, why would it switch over when unplugging and replugging WAN1? This is the part that's confusing me. I don't understand how unplugging and replugging suddenly gets things working again.

    0_1536165207104_573047d3-0e6f-4b33-b3cc-7dbb13f03b92-image.png

    WAN1 checks IP 4.2.2.1 and WAN2 checks IP 4.2.2.2. I've also set up rules so I can watch it with ping tests myself...

    0_1536165336142_0da9a7a1-d69d-4594-9cf8-b4b69afa3af2-image.png
    0_1536165365713_aef6a3f8-2365-46da-9cd2-1c620bbf7d33-image.png

    Edit: those 192.168.0.3 rules are just so that computer can have "open NAT" for Destiny 2 while another computer is also playing Destiny 2 at the same time.


  • LAYER 8 Netgate

    I would take a good look at the states created and the firewall rules (/tmp/rules.debug) while it is in a state you think isn't working correctly to see if anything does not look right.

    The gateway group in the rules.debug file should contain the gateways in the correct order.
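To make that check less error-prone, here is a minimal Python sketch that pulls out only the lines mentioning `route-to` from the text of a rules file, so the gateway order is easier to eyeball. The helper name `route_to_lines` and the sample excerpt are hypothetical; the macro names, interfaces, and addresses in a real /tmp/rules.debug will differ, and the sample layout is an assumption rather than a guaranteed format.

```python
def route_to_lines(rules_text):
    """Return (line_number, line) pairs for non-comment lines that mention route-to."""
    matches = []
    for number, line in enumerate(rules_text.splitlines(), start=1):
        stripped = line.strip()
        if "route-to" in stripped and not stripped.startswith("#"):
            matches.append((number, stripped))
    return matches


# Hypothetical excerpt for illustration only; not copied from a real system.
SAMPLE = """\
# gateway macros (illustrative only)
GWWAN1_GW = " route-to ( em0 203.0.113.1 ) "
GWWAN2_GW = " route-to ( em1 198.51.100.1 ) "
GWFailover = " route-to { ( em1 198.51.100.1 ) } "
pass in quick on em2 $GWFailover inet from 192.168.0.0/24 to any keep state
"""

if __name__ == "__main__":
    for number, line in route_to_lines(SAMPLE):
        print(f"{number}: {line}")
```

Against a copy of the real file, the same helper could be called as `route_to_lines(open("/tmp/rules.debug").read())`. If the gateway group's line still points at WAN2 while Status > Gateways shows WAN1 online, that would be worth a closer look.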



  • Did you go to Firewall > Rules > LAN, edit the "Default allow LAN to any" rule, and change the gateway to your gateway group? It is under Advanced.



  • @fastfish No, I didn't enable "Default gateway switching". It said it's unnecessary in most all scenarios, which instead use gateway groups, so I just used gateway groups instead. I wasn't sure exactly what the difference was.



  • @rezo That wasn't what I was saying. Please reread.



  • Plugging/unplugging must trigger something that causes it to reset the connections on WAN2. What I found after extensive testing is that once a failover occurs and connections are established on WAN2, it will not break those connections and put them back on WAN1 unless forced to do so.

    I documented my method for getting fail-back to work. It may not be ideal, but it is the only way I could get it working reliably.

    https://forum.netgate.com/topic/135614/failback-from-primary-wan-after-failover-to-secondary-wan
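The behaviour described in this post, where connections established on WAN2 stay there until something resets them, can be illustrated with a toy Python model. This is only a sketch of the idea that a stateful firewall picks the gateway when a state is created and then reuses it. `ToyFirewall`, `route`, and `kill_states` are hypothetical names, not pfSense code or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFirewall:
    """Toy model: the gateway is chosen once, when a state is created."""
    preferred: list                               # gateways in tier order
    online: set = field(default_factory=set)      # gateways currently up
    states: dict = field(default_factory=dict)    # connection id -> gateway

    def route(self, conn_id):
        # An existing state keeps the gateway it was created with.
        if conn_id in self.states:
            return self.states[conn_id]
        # A new connection takes the first online gateway in tier order.
        # (Raises StopIteration if no gateway is online; fine for a toy.)
        gateway = next(gw for gw in self.preferred if gw in self.online)
        self.states[conn_id] = gateway
        return gateway

    def kill_states(self, gateway):
        # Stand-in for whatever resets the connections using a gateway.
        self.states = {c: g for c, g in self.states.items() if g != gateway}


if __name__ == "__main__":
    fw = ToyFirewall(preferred=["WAN1", "WAN2"], online={"WAN2"})
    print(fw.route("game"))   # created while WAN1 is down
    fw.online.add("WAN1")     # WAN1 recovers
    print(fw.route("game"))   # existing state is reused
    print(fw.route("web"))    # new connection follows tier order
    fw.kill_states("WAN2")
    print(fw.route("game"))   # state recreated after the reset
```

In this model only pre-existing connections stay on WAN2 after recovery. It does not by itself explain new connections also leaving through WAN2, which is what was reported earlier in the thread.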

