Failover not reverting to load balance after WAN drops momentarily
-
Hi there,
I have an issue with one of my WAN connections: after a failover from load balance, pfSense routes all traffic via WAN1 only.
If I disconnect/reconnect in Interfaces, a speed test shows that load balancing is working correctly again. I can also see whether both gateways are up when I'm not on site, via the cloud-based wifi controller.
However, WAN2 will sometimes go offline for a few seconds, and after it comes back online pfSense seems to stay failed over to WAN1, with all traffic routed via that gateway only. I know the connection really has reconnected when it comes back, as I can ping.
The first issue is that WAN2 shouldn't really ever go down, as it's the same modems and settings, and WAN1 is solid (although WAN2 is via an Intel PCI card). I have changed the card to see if that would help, but it made no difference.
The second issue is that I can live with a gateway going down for a few minutes, as everything stays up thanks to the failover, but pfSense doesn't seem to revert to load balance afterwards, so I'm getting half the speed I should and a pretty useless WAN2 connection. I'm rarely on site, so a manual disconnect/reconnect every time it goes down isn't a viable option.
Any ideas that I can try/what I've missed?
This used to run as a solid load-balanced system, so I can't see what's changed bar a potential fault on the line, but pfSense SHOULD revert to load balance after WAN2 has come back online, shouldn't it?
Thanks in advance.
-
Any ideas that I can try/what I've missed?
What is your config here? What load balancing method do you use?
- Session based routing
- Service based routing
- Policy based routing
This used to run as a solid load-balanced system so I can't see what's changed bar a potential fault on the line but pfSense SHOULD revert back to load balance after WAN2 has come back online shouldn't it?
What is your config here?
Here is a simple but working example that can be used to set up load balancing and failover together in pfSense: Multi WAN with Load balancing & fail over
-
I have three gateway groups -
Load balance:
WAN1 - tier 1
WAN2 - tier 1
Failover WAN1|WAN2:
WAN1 - tier 1
WAN2 - tier 2
Failover WAN2|WAN1:
WAN1 - tier 2
WAN2 - tier 1
I've set up three rules with the specified gateway groups.
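For reference, here is a minimal Python sketch of how the three gateway groups above would be expected to resolve, assuming the usual tier behaviour: the lowest-numbered tier with an online member is used, and members sharing that tier are balanced. The group names and the `active_gateways` helper are made up for illustration and are not pfSense code.

```python
from typing import Dict, List

# Hypothetical model of the three gateway groups described above:
# group name -> {gateway name: tier}
GROUPS: Dict[str, Dict[str, int]] = {
    "loadbalance": {"WAN1": 1, "WAN2": 1},
    "failover_wan1_wan2": {"WAN1": 1, "WAN2": 2},
    "failover_wan2_wan1": {"WAN1": 2, "WAN2": 1},
}


def active_gateways(group: Dict[str, int], online: Dict[str, bool]) -> List[str]:
    """Return the online gateways in the lowest-numbered tier that has any online member."""
    online_tiers = sorted({tier for gw, tier in group.items() if online.get(gw, False)})
    if not online_tiers:
        return []  # every member of the group is down
    best = online_tiers[0]
    return sorted(gw for gw, tier in group.items()
                  if tier == best and online.get(gw, False))


if __name__ == "__main__":
    # Compare "both up" with "WAN2 down" for each group.
    for status in ({"WAN1": True, "WAN2": True}, {"WAN1": True, "WAN2": False}):
        for name, group in GROUPS.items():
            print(status, name, active_gateways(group, status))
```

Under this model, the load balance group should return to using both gateways as soon as WAN2 is marked online again, which is the behaviour expected in this thread.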
I also have two CCTV units that are routed via a single gateway (via failover WAN1|WAN2, as WAN1 has a static IP). I have also routed any traffic to a specific webmail provider via the WAN1|WAN2 failover, as the webmail security got touchy and couldn't initialise a session with traffic from load balance. If that makes sense? The CCTV units are in an alias, so there's just one rule for them.
These two rules to specific gateways are at the top of the list and below them are the load balance and failover rules and it goes -
Any source to loadbalance
Any source to WAN1|WAN2 failover
Any source to WAN2|WAN1 failover
Thing is, it does work, but only when I change something or restart WAN2. I say WAN2 because it goes offline a few times a day for a couple of minutes at a time, so I usually only ever find it in the state where it has failed over to WAN1. As I said before, after restarting the connection on WAN2 I can confirm load balance is working as it should, as I can see from running a speed test. But every time it fails over, it's like it won't go back to "balance mode".
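As a rough illustration of the rule order listed above, here is a small Python sketch that assumes rules are evaluated top-down and the first match wins, which is how interface rules are normally processed. The alias names (`cctv_alias`, `webmail_alias`), the `Rule` class and `first_match` are placeholders for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    description: str
    source: str       # "any" or a placeholder alias name
    destination: str  # "any" or a placeholder alias name
    gateway: str      # gateway group the rule points at


# Order as described in the post; alias names are hypothetical.
RULES: List[Rule] = [
    Rule("CCTV units", "cctv_alias", "any", "failover_wan1_wan2"),
    Rule("Webmail", "any", "webmail_alias", "failover_wan1_wan2"),
    Rule("Load balance", "any", "any", "loadbalance"),
    Rule("Failover WAN1|WAN2", "any", "any", "failover_wan1_wan2"),
    Rule("Failover WAN2|WAN1", "any", "any", "failover_wan2_wan1"),
]


def first_match(rules: List[Rule], source: str, destination: str) -> Optional[Rule]:
    """Return the first rule whose source and destination both match."""
    for rule in rules:
        if rule.source in ("any", source) and rule.destination in ("any", destination):
            return rule
    return None


if __name__ == "__main__":
    for src, dst in (("lan_host", "internet"),
                     ("cctv_alias", "internet"),
                     ("lan_host", "webmail_alias")):
        rule = first_match(RULES, src, dst)
        print(src, "->", dst, ":", rule.description if rule else "no match")
```

In this simplified model, general traffic always lands on the load balance rule, so the two any-source failover rules below it are never reached. Whether that holds on the real box depends on settings not shown in this thread.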
FYI, before I restart the connection to get the balance back, WAN2 is already back online and I can ping from it.
I hope that makes sense? I'm not great at explaining sometimes. I can show you some screenshots if need be, thanks.
-
So I made a mistake by the looks of things. I had "member down" as trigger level for failovers instead of high packet loss.
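To show the difference between the trigger levels, here is a simplified Python sketch of how a gateway group might decide that a member should be taken out of use. The trigger names mirror the options in the gateway group settings, but the function, its parameters and the threshold values are assumptions for illustration, not values from this setup.

```python
def gateway_counts_as_down(trigger: str,
                           loss_pct: float,
                           latency_ms: float,
                           member_down: bool,
                           loss_limit: float = 20.0,       # assumed threshold, percent
                           latency_limit: float = 500.0    # assumed threshold, ms
                           ) -> bool:
    """Decide whether a group member should be excluded under a given trigger level."""
    if member_down:
        return True  # a gateway that is fully down is excluded at every trigger level
    if trigger == "member_down":
        return False  # loss or latency alone never triggers at this level
    lossy = loss_pct >= loss_limit
    slow = latency_ms >= latency_limit
    if trigger == "packet_loss":
        return lossy
    if trigger == "high_latency":
        return slow
    if trigger == "packet_loss_or_high_latency":
        return lossy or slow
    raise ValueError(f"unknown trigger level: {trigger}")


if __name__ == "__main__":
    # A flaky line with 30% loss that is not marked fully down.
    for trigger in ("member_down", "packet_loss"):
        print(trigger, gateway_counts_as_down(trigger, loss_pct=30.0,
                                              latency_ms=50.0, member_down=False))
```

In this model, a line that is losing packets without being marked fully down is only excluded when the trigger level takes packet loss into account.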
I didn't have direct access to the unit to test by physically pulling one of the WAN connections, but I have email monitoring, which shows that it went down in the early hours of the morning for a few minutes, and the cloud monitoring for the wifi is now showing both gateways as up, so all good.
Not entirely sure when this was changed and why. But ah well…
Thanks all :)
-
So I'm getting a repeat of this issue again. Nightmare.
I've read up on some topics from people seeing similar issues related to the states. Could anybody point me in the right direction on this? When I disconnect and reconnect via Interfaces I get use of both gateways, but before long I get the same issue again, maybe after a gateway goes down for a few minutes.
:-[
Thanks