Dual WAN - Site 2 site failover

pstokman

I'm having trouble configuring site-to-site failover between two pfSense devices, one with dual WAN, the other with a single WAN. I thought I had it working, but when I simulate a WAN failure, the single-WAN side doesn't route properly: it keeps trying to send traffic back through the VPN that's down.

What I've done is configure two OpenVPN servers on the single-WAN appliance and two OpenVPN clients on the dual-WAN one. The servers run on different ports, one on 1194 and the other on 1195. The clients are set to match, and each one goes out through a different WAN port. They connect properly and I can ping back and forth, but when I pull the cable or otherwise disable the WAN1 connection, the server side does not register this and change its route to go over the second tunnel. It stays on the first one, which is down.

I've read about running the client on the LAN side and port forwarding it through gateway groups, but I can't figure out how to forward local traffic to a WAN. None of the posts specify the rules required for it.

Here's a diagram on how I have it:

WAN1 (static) ---.
                  \
                   =---- Internet ----= WAN
                  /
WAN2 (DHCP) -----'

So, running a server on the dual-WAN side is not an option because the second WAN uses DHCP. Its IP can change when the box reboots.

I have already configured a gateway group for both WANs as load-balance, so internet connectivity is not a problem when one WAN goes down. This group is set as the gateway on the LAN side using the Advanced firewall rule. Now I just need the same for the VPN.

There's another thing that's not related to this, but I ran into it while simulating WAN failures. Whenever I pulled the second WAN cable and plugged it back in, its link would go down and up continuously, changing state about every one to two seconds. I never found a solution, and plenty of other users have this problem. One 'solved' it by setting a static IP, but that's not an option here. The only way I can fix it is to reboot pfSense.

pstokman

Ok, got it working with a failover time of 30 seconds, the time it takes for the gateway to be marked as down. I used http://forum.pfsense.org/index.php?topic=32603.0 and even though it states in the caveats that pulling cables is a bad thing and will cause things to fail eventually, it didn't fail for me in that way. I've set my gateway group to load balance, instead of failover as described there.

It may not use both WAN connections to increase VPN bandwidth, which would have been nice, but at least we can continue to work as we used to, and a hiccup in the line doesn't cause a major failure of the VPN. If the outage lasts more than 30 seconds, it will switch to the other WAN anyway. The only strange part is on the server side: it only sees the connection coming from the WAN port that initiated the connection first. When the client switches, that's not reflected on the other pfSense that acts as server. Just a minor point.
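
For anyone wondering where the 30 seconds comes from, here is a rough Python sketch of the idea. It is a simplified model, not pfSense's actual gateway monitoring: the names `Gateway`, `record_probe` and `pick_gateway` are made up for illustration, and the 30-second `down_after` value is just the failover time reported above. A gateway is only treated as down after its probes have been failing continuously for that long, so traffic stays on the dead WAN until the timer expires.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gateway:
    name: str
    # Assumed: seconds of continuous probe failure before the gateway counts as down.
    down_after: float = 30.0
    # Time of the first failed probe in the current failure run, or None if healthy.
    failing_since: Optional[float] = None

    def record_probe(self, now: float, ok: bool) -> None:
        if ok:
            self.failing_since = None      # any successful probe resets the timer
        elif self.failing_since is None:
            self.failing_since = now       # start of a new failure run

    def is_up(self, now: float) -> bool:
        if self.failing_since is None:
            return True
        return now - self.failing_since < self.down_after


def pick_gateway(group: List[Gateway], now: float) -> Optional[str]:
    """Return the first gateway in the group that is still considered up."""
    for gw in group:
        if gw.is_up(now):
            return gw.name
    return None


if __name__ == "__main__":
    wan1, wan2 = Gateway("WAN1"), Gateway("WAN2")
    group = [wan1, wan2]
    wan1.record_probe(0.0, ok=False)   # cable pulled at t=0
    wan2.record_probe(0.0, ok=True)
    for t in (10.0, 29.0, 30.0):
        print(t, pick_gateway(group, t))
```

In this model the VPN client would keep using WAN1 for the first 30 seconds after the cable is pulled and only then move to WAN2, which matches the behaviour described above.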

heper

Using a dynamic routing protocol like OSPF (via Quagga) would provide failover without using gateway groups. Timeouts can be set as you prefer, and it will also fall back without any issue in any scenario.
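
As a rough illustration of why a routing protocol handles this on both ends, here is a small Python sketch. It is a toy model, not Quagga's implementation: `TunnelRoute`, `best_route` and the interface names are hypothetical, and the 40-second `dead_interval` is only an example value. Each side keeps a route per tunnel, drops any tunnel whose neighbor has been silent for longer than the dead interval, and uses the cheapest tunnel that remains.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TunnelRoute:
    interface: str     # illustrative tunnel interface name
    cost: int          # lower cost is preferred
    last_hello: float  # time the last hello was heard from the neighbor


def best_route(routes: List[TunnelRoute], now: float,
               dead_interval: float = 40.0) -> Optional[str]:
    """Pick the lowest-cost tunnel whose neighbor is still alive."""
    alive = [r for r in routes if now - r.last_hello < dead_interval]
    if not alive:
        return None
    return min(alive, key=lambda r: r.cost).interface


if __name__ == "__main__":
    routes = [
        TunnelRoute("ovpns1", cost=10, last_hello=100.0),
        TunnelRoute("ovpns2", cost=20, last_hello=100.0),
    ]
    print(best_route(routes, now=105.0))   # both alive, cheaper tunnel wins
    routes[1].last_hello = 140.0           # only the second tunnel keeps sending hellos
    print(best_route(routes, now=141.0))   # first tunnel has timed out
```

Because both routers run the same check, the server side also stops using the dead tunnel, and it returns to the preferred tunnel once hellos resume. A shorter dead interval gives faster failover at the cost of more sensitivity to brief packet loss.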

pstokman

Ok, but where do I find OSPF or quagga? I've only seen those terms for 1.2.x. I only see RIP under Services with 2.0.1. Looking at the section in the pfSense book, I see that it's a separate package. I don't have that installed. So I've solved it this way.

Plus, with the gateway group, it also load balances normal internet traffic from the LAN and using static routes, I can force certain sites to be accessible only through a certain WAN (e.g. there is an IP restriction on the website).