Failover / Failback not working for me in 2.6.0
-
Failover / Failback not working in 2.6.0
Is this a known issue?
Is it being addressed in 2.7.0?
I am in the process of searching, looking around and figuring it out.
This is a VERY high-level question and, as usual, I will dig further into ALL of
the details as we go.
My scenario is a static IP and gateway on one WAN interface.
Call this the main Internet connection.
On the AUX1 Ethernet interface I have a dynamically provided IP from a secondary Internet provider.
Call this a "backup Internet connection".
If the main static gateway goes down I would like to auto-failover to the secondary.
I am testing failover/failback by physically unplugging the Ethernet connection to one
or both of these.
The secondary dynamic gateway never "comes back" when I plug it back in.
It shows "pending" in the gateway status.
The interface DOES get a dynamic IP when I plug it back in and I can reach that gateway which
is directly available on this ethernet. I can ping it.
But cannot ping past it. A working default route is not being created.
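One way to confirm the "no working default route" observation is to look at the routing table from a shell (`netstat -rn`) and check which default entries exist. The sketch below only parses text in the usual FreeBSD column layout; the sample table, addresses and interface names (`igb0`, `igb1`) are made up for illustration and are not taken from the affected box.

```python
def default_routes(netstat_output):
    """Return (gateway, interface) pairs for every 'default' row in `netstat -rn` text."""
    routes = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # FreeBSD-style rows: Destination Gateway Flags Netif [Expire]
        if len(fields) >= 4 and fields[0] == "default":
            routes.append((fields[1], fields[3]))
    return routes


# Hypothetical sample output; the addresses are documentation ranges.
SAMPLE = """\
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            203.0.113.1        UGS        igb0
127.0.0.1          link#4             UH          lo0
198.51.100.0/24    link#2             U          igb1
"""

print(default_routes(SAMPLE))  # [('203.0.113.1', 'igb0')]
```

If the list is empty, or only names the gateway that is currently unplugged, that would match the symptom of being able to ping the directly connected gateway but nothing past it.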
On the status page it will just sit there saying "pending" forever.
If I twiddle any gateway settings or just re-save them, it springs back to life and works.
But I have not been able to get it to come back automatically.
I do not want or wish to load balance.
I am simply trying to set up backup Internet with automatic failover and failback/restore.
I have tried two approaches:
1- Simply create two gateways and choose Automatic for the default gateway.
2- Create a gateway group containing both gateways, setting the main as Tier 1 and the secondary
as Tier 2.
In either attempt the secondary gateway dies and goes into pending if I unplug its Ethernet and plug it back in and wait for it to come back.
If both gateways are online and I unplug the main, it does failover from the main to the secondary as expected and will failback (restore to using the main) if I plug the main back in.
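The behavior expected from a tiered gateway group can be written down as a small model: among the gateways currently reported online, use the one with the lowest tier number. This is only a sketch of the expectation described above, not pfSense's actual code; the names `WAN_MAIN` and `WAN_BACKUP` and the status strings are assumptions for illustration.

```python
def pick_active_gateway(tiers, status):
    """Pick the online gateway with the lowest tier number, or None if none is online.

    tiers:  gateway name -> tier number (1 = preferred)
    status: gateway name -> "online", "pending" or "down"
    """
    online = [name for name in tiers if status.get(name) == "online"]
    if not online:
        return None
    return min(online, key=lambda name: tiers[name])


# Hypothetical gateway names.
tiers = {"WAN_MAIN": 1, "WAN_BACKUP": 2}

# Both up: the Tier 1 gateway is used.
print(pick_active_gateway(tiers, {"WAN_MAIN": "online", "WAN_BACKUP": "online"}))   # WAN_MAIN
# Main unplugged: failover to Tier 2.
print(pick_active_gateway(tiers, {"WAN_MAIN": "down", "WAN_BACKUP": "online"}))     # WAN_BACKUP
# Backup stuck in "pending" while main is down: nothing usable.
print(pick_active_gateway(tiers, {"WAN_MAIN": "down", "WAN_BACKUP": "pending"}))    # None
```

The last case is the one that matters here: as long as the secondary stays in "pending", it is never a candidate, no matter what the main connection does.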
But once the secondary fails it never comes back as a working gateway and shows pending, regardless of main connection status.
For reference (mostly my own), these reports are very similar.
https://www.reddit.com/r/PFSENSE/comments/103lx0g/multiwan_failover_not_working_correctly/
Any pointers?
Is it broken in 2.6? Or is it supposed to be working just fine and I'm missing something?
-
@n8lbv On the secondary have you tried changing the monitoring IP? Or disabling gateway monitoring, temporarily?
-
@steveits I will test this again on the bench.
I think I did :)
And gateway monitoring was on by default pinging the gateways.
I didn't think to try turning it off (leaving it off?) as that would pretty much imply it's not going to work right?
I will test again and get more specific. Pretty much any change, or just re-saving any portion, would make it pop back up and be active again.
But it would never come back on its own when the gateway was made available again.
Anyhow, I'll get one set up for testing here.
That was out at a client's site; I decided just to give it a try and it didn't work out.
Figured I might try and figure it out here and go back later and do this for them.
-
@n8lbv If gateway monitoring or action is disabled then the interface will always be on. But if it works then that helps diagnose the problem.
-
@steveits I might be a bit confused and may be clearer when I can set up a test.
If the interface is always on (I think you mean gateway always active) then it won't fail?
And I won't be able to test what happens when it fails and comes back.
Just thinking out loud.
I realize you might be meaning something totally different than what I am thinking.
-
@n8lbv I was more thinking of reasons why it would be offline. If the monitoring detects the monitoring IP is unreachable then it would remain offline. I was thinking, if you disabled monitoring of the secondary WAN then does it come online (probably), and is it functional?
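To make the reasoning concrete, here is a rough model of how monitoring results might map to a gateway status. It is a sketch under assumptions: the thresholds are placeholders rather than confirmed pfSense defaults, and treating "no probe data yet" as "pending" is a guess at what the status page is showing.

```python
def gateway_status(loss_pct, latency_ms, monitoring=True,
                   loss_high=20.0, latency_high=500.0):
    """Classify a gateway from its latest monitor results.

    loss_pct / latency_ms are None when no probe data has been collected.
    The threshold defaults are illustrative placeholders.
    """
    if not monitoring:
        # With monitoring disabled the gateway is always treated as up.
        return "online"
    if loss_pct is None or latency_ms is None:
        # Assumption: no data from the monitor shows up as "pending".
        return "pending"
    if loss_pct > loss_high or latency_ms > latency_high:
        return "down"
    return "online"


print(gateway_status(0.0, 9.7))                       # online
print(gateway_status(100.0, 0.0))                     # down
print(gateway_status(None, None))                     # pending
print(gateway_status(None, None, monitoring=False))   # online
```

If disabling monitoring on the secondary WAN makes it usable again after a replug, that would point at the monitor not restarting rather than at the link or the DHCP lease.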
-
@steveits I will try all of this soon.
-Steve
-
Good day,
I would like to second this.
Some things I have done and tested (recreated in a VM as well).
Like OP.
WAN1 - Static IP (Monitor IP 1.1.1.1 - 2.7ms)
WAN2 - Dynamic IP (Monitor IP 8.8.8.8 - 9.7ms)
Created gateway group (Backup) with the following: WAN1 - Tier 1 / WAN2 - Tier 2 / Trigger level: "Packet Loss or High Latency"
Set "Default gateway" on the System gateways page to "Backup".
Same issue with the dynamic link (WAN2) when it fails: it stays in "pending".
If I go to "System->Routing" and hit "Save" and then "Apply Changes" (even though no changes were made),
the WAN2 link comes back online.
Also tested with WAN1 and WAN2 both having static IPs. This issue didn't show up, and the routing moved between links as they went up/down multiple times (testing/disconnect).
Tested removing the monitor IP on WAN2: once WAN2 fails and comes back online, it still stays in the "pending" state.
Testing on 2.6.0-RELEASE (amd64).
This issue has shown up both in a VM and on bare metal.
-
@dataideas-josh Yeah I need to get back to testing this soon.