Failover WAN with working OpenVPN Client
I’m having trouble getting my failover on pfSense to work the way I want.
This is my setup so far…
Gateways (System > Routing > Gateways)
WAN1 – ISP1 Fiber *Default
WAN2 – ISP2 4G
OpenVPN1 – Local server on pfSense
OpenVPN2 – Client outgoing from pfSense to VPNProvider1
Gateway groups (System > Routing > Gateway Groups)
Trigger: Member down
OpenVPN client (VPN > OpenVPN > Clients)
Interface – GW Group HA_WAN
About the internal network: it contains several VLANs, but I will only describe the parts that matter for this function.
On the default LAN I have some firewall rules with the gateway option set. Some of the hosts are servers, which always leave pfSense through the normal WAN connection out to the Internet. All regular clients on the LAN, however, have a rule that sends them out via the OpenVPN2 gateway, that is, the VPN client that anonymizes the traffic.
As I’m using the VPN client for anonymizing, I don’t want any kind of DNS leakage. I handle this by making the DNS servers go via the VPN client at all times, regardless of who sends the query, client or server. This also means there will be no DNS responses if the VPN client is down. I’m fine with this, as long as my VPN client can bring itself up again. And it always should, as long as we have Internet.
As for the VPN client hostname… well, I’ve created two domain overrides so pfSense knows where to resolve it from.
These override the setup at (System > General Setup > DNS Server Settings), which only has the OpenVPN2 gateway (the VPN client).
Domain Overrides (Services > DNS Resolver > General Settings > Domain Overrides)
So far so good… The tunnel comes up successfully every time.
Now to my problem. If I simulate a member down on WAN1, everything switches over to WAN2 as expected. Servers can reach the Internet through this ISP (ping works) but cannot browse, so I have no DNS… The VPN client now complains that it can’t resolve the hostname! That’s odd… It seems that pfSense’s own local services are not affected by the gateway group failover.
How do I know? Well, if I go to (System > Routing), edit my WAN2 4G gateway (currently the live one, as WAN1 is “member down”) and mark it as the default route, everything starts working!
When I bring WAN1 back up, I have to repeat the procedure and set WAN1 as the default route again before the VPN can bring its tunnel up.
Now, I’ve searched for different solutions to this. One of them was to enable “Default gateway switching” under (System > Advanced > Miscellaneous > Load Balancing). The problem is that I have a total of 4 gateways, and my OpenVPN1 local server will always be online. I haven’t found any way to tell pfSense which gateways it should prefer, so with this setting enabled the default route can suddenly become my local OpenVPN server. The setting does not seem to take the weight or tier options on the gateways into account. It seems to just pick the one at the top of the list, which happens to be the local VPN server.
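To show what I mean by "prefer", here is a rough shell sketch of the selection I would expect: pick the online gateway with the lowest tier, and only consider the gateways I list. The input format (NAME TIER STATUS, one gateway per line) and the function name pick_gateway are made up for illustration; this is not something pfSense produces or provides.

```shell
#!/bin/sh
# Hypothetical input on stdin, one gateway per line: NAME TIER STATUS
# Only gateways that are fed in are candidates, so the OpenVPN
# gateways are simply left out of the list.
pick_gateway() {
    awk '$3 == "online" && (best == "" || $2 + 0 < tier) {
             best = $1      # remember the best candidate so far
             tier = $2 + 0  # and its tier, forced to a number
         }
         END { if (best != "") print best; else exit 1 }'
}

# WAN1 is down here, so the tier 2 gateway is selected.
printf '%s\n' 'WAN1 1 down' 'WAN2 2 online' | pick_gateway
```

If no listed gateway is online, the function prints nothing and returns a non-zero status, so a caller can leave the current default route alone.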
No luck there. My last idea is to somehow call a script when my gateway group signals a member down, and have that script switch the default route. Is this possible, or am I missing something in my configuration that keeps it from working as expected?
To clarify, if the wall of text wasn't so inviting… :o
Is it possible to run a script when a gateway group switches gateway?
If WAN2 becomes the active gateway, run a script that sets the default route to WAN2
If WAN1 becomes the active gateway, run a script that sets the default route to WAN1
I’d appreciate a detailed how-to: where to put the scripts on pfSense, and what the script itself should look like to set the new default route, as I’m not familiar with shell scripting.
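From what I have pieced together so far, the script would be something along these lines. This is only a sketch under assumptions: the two gateway addresses are placeholders from the documentation ranges, the names switch_to, set_default_route and DRY_RUN are my own, and how pfSense would trigger the script is exactly the part I don't know. The only real command in it is FreeBSD's `route change default`.

```shell
#!/bin/sh
# Placeholder next-hop addresses (documentation ranges), not real values.
WAN1_GW="192.0.2.1"
WAN2_GW="198.51.100.1"
DRY_RUN="${DRY_RUN:-1}"   # defaults to printing the command only

# Print or run the FreeBSD command that repoints the default route.
set_default_route() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "route change default $1"
    else
        route change default "$1"
    fi
}

# $1 is the name of the gateway that should carry the default route.
switch_to() {
    case "$1" in
        WAN1) set_default_route "$WAN1_GW" ;;
        WAN2) set_default_route "$WAN2_GW" ;;
        *)    echo "unknown gateway: $1" >&2; return 1 ;;
    esac
}

# First argument selects the gateway; WAN1 is assumed if none is given.
switch_to "${1:-WAN1}"
```

With DRY_RUN left at 1 it only prints what it would do. A route changed this way lives only in the kernel routing table and is not stored in the pfSense configuration.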
Thanks in advance.
Typing to myself this far…
I’ve managed a workaround with two static routes, as the issue seems to be only with resolving the hostname in the OpenVPN client, and I have two domain overrides.
Why not just put them as separate static routes to each WAN?
Static routes (System > Routing > Static Routes)
OpenVPN_ns1 > WAN1
OpenVPN_ns2 > WAN2
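As far as I understand, these two entries amount to one host route per name server, each pinned to a different WAN. Roughly, in FreeBSD `route` terms (all addresses below are placeholders from the documentation ranges, and host_route_cmd is just a helper name I made up; it only prints the command):

```shell
#!/bin/sh
# All addresses are placeholders, not my real name servers or gateways.
NS1="203.0.113.10"
NS2="203.0.113.20"
WAN1_GW="192.0.2.1"
WAN2_GW="198.51.100.1"

# Build the FreeBSD command for a host route to one name server.
host_route_cmd() {
    echo "route add -host $1 $2"
}

host_route_cmd "$NS1" "$WAN1_GW"   # OpenVPN_ns1 > WAN1
host_route_cmd "$NS2" "$WAN2_GW"   # OpenVPN_ns2 > WAN2
```

The idea is that whichever WAN is still up, one of the two name servers stays reachable, so the VPN client hostname can still be resolved.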
This actually works: the tunnel comes up on WAN2 and I can confirm traffic flow. But a couple of minutes into the simulated member down (WAN1 unplugged), the tunnel breaks with a flood of new messages in the log.
write UDPv4: No buffer space available (code=55)
I get the same message in the pfSense console when trying to ping something.
[2.3.1-RELEASE][admin@-]/root: ping x.x.x.x
PING x.x.x.x (x.x.x.x): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
Can someone explain why that is happening?
As soon as I bring WAN1 up again, everything works normally.