Is this possible with pfSense?

jamesc

Whilst browsing YouTube I came across this video of a guy with two Verizon FiOS 35 Mbps connections connected to a Zeroshell box. He has then created two Layer 2 OpenVPN tunnels across these connections to another Zeroshell box in his datacentre, which is connected to a large Internet pipe.

He has bonded the VPN connections at both ends and is seeing a true aggregated bandwidth of 70 Mbps at the Verizon FiOS end when he runs speed tests.

Is anything like this possible with pfSense? I'm aware of using gateway groups, but I'm looking for a solution that will give me increased throughput rather than just connection-based round robin load balancing.

Here's the video:
http://www.youtube.com/watch?v=XIeXAqdbpzI&feature=youtube_gdata_player

jimp

We were just discussing that very topic with a customer last night. It might be possible but needs a lot more testing.

The theory is:
1. Install the tap bridge patch on both ends
2. Set up two tap VPN tunnels between the boxes, one on each WAN, to separate IPs or WANs on the far end (no IPs on the OpenVPN interfaces)
3. Create a lagg interface including both VPN interfaces. (The fuzzy part is which lagg modes may actually work or do anything here; maybe roundrobin or Load Balance would be a good place to start)
4. Assign the lagg interface and give it an IP on both ends, then set up your static routes manually by adding a gateway and static routes for the far side network (and the inverse on the opposite side)
5. Add any firewall rules you want to both the lagg and OpenVPN tabs.

No idea if or how well it works in practice, what any actual speed gain or fault tolerance may be, etc. The more people try it and report back the better we can know what works and what doesn't. :-)
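
For anyone trying to picture why the lagg mode in step 3 matters for throughput, here is a rough Python sketch. It is only a toy model, not how lagg is implemented: the port names and addresses are made up, and a CRC of the flow tuple stands in for whatever header hash the real loadbalance mode computes. The point it illustrates is that a per-packet scheduler (roundrobin) spreads a single flow over both tunnels, while a per-flow hash (loadbalance) pins a single flow to one tunnel.

```python
import zlib
from collections import Counter

PORTS = ["ovpnc1", "ovpnc2"]  # hypothetical lagg member names


def roundrobin(packets, ports=PORTS):
    """Per-packet distribution: packet i leaves on port i mod N."""
    return Counter(ports[i % len(ports)] for i, _ in enumerate(packets))


def loadbalance(packets, ports=PORTS):
    """Per-flow distribution: a hash of the flow key selects the port.
    crc32 is a stand-in for the real header hash (assumption)."""
    return Counter(
        ports[zlib.crc32(repr(flow).encode()) % len(ports)] for flow in packets
    )


# One download = one flow: 1000 packets sharing the same (made-up) flow key.
single_flow = [("10.0.0.5", "192.168.1.10", 50000, 5001)] * 1000

rr = roundrobin(single_flow)
lb = loadbalance(single_flow)
print("roundrobin :", dict(rr))   # 500 packets on each port
print("loadbalance:", dict(lb))   # all 1000 packets on a single port
```

In this model only roundrobin can make one speed test go faster than a single WAN. The trade-off is that per-packet scheduling over two paths with different latency can deliver packets out of order, which may eat into the gain.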

jamesc

Thanks Jim, I better get testing then :-)

jamesc

Some progress…

I've installed the tap patch at both ends and have set up two VPN tunnels from my client (on different WANs) to two IPs on my server (again, on different WANs).

OpenVPN is using tap, peer to peer, shared key, no encryption and no tunnel network (using UDP 1194 and 1195). The status on both ends shows the tunnels as up and shows the right IP for each VPN. All good so far.

On each end, the OpenVPN interfaces (ovpnc on the client, ovpns on the server) are assigned to a LAGG in load balancing mode.

The LAGG is assigned to an interface and IP address 10.99.99.1 is assigned to the client. 10.99.99.2 is assigned to the server LAGG.

When I ping 10.99.99.2 from the client I get no reply, just a "host is down" message. If I do a packet capture on the LAGG interface on the client, this is what I see:

14:11:36.254149 ARP, Request who-has 10.99.99.2 tell 10.99.99.1, length 28
14:11:37.268732 ARP, Request who-has 10.99.99.2 tell 10.99.99.1, length 28
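
A capture like this can be checked mechanically. The sketch below is a small Python helper (the function name is made up, and the reply pattern assumes tcpdump's usual "ARP, Reply ... is-at ..." wording) that lists addresses which were asked for but never answered. Unanswered ARP for an address on the same subnet suggests the problem sits at layer 2, in the lagg or the tunnels, rather than in routing.

```python
import re

CAPTURE = """\
14:11:36.254149 ARP, Request who-has 10.99.99.2 tell 10.99.99.1, length 28
14:11:37.268732 ARP, Request who-has 10.99.99.2 tell 10.99.99.1, length 28
"""

REQUEST = re.compile(r"ARP, Request who-has (\S+) tell (\S+),")
REPLY = re.compile(r"ARP, Reply (\S+) is-at (\S+),")  # assumed tcpdump reply format


def unanswered_arp(capture):
    """Return the set of IPs that were requested but never replied."""
    asked, answered = set(), set()
    for line in capture.splitlines():
        if (m := REQUEST.search(line)):
            asked.add(m.group(1))
        elif (m := REPLY.search(line)):
            answered.add(m.group(1))
    return asked - answered


print(unanswered_arp(CAPTURE))  # {'10.99.99.2'}
```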

I have created a pass any rule on both the OpenVPN and LAGG interfaces at both ends.

I think I may be missing the step below, but I don't quite understand what route I set up and to where. Packet capture shows the traffic leaving the correct interface, so I'm a little lost.

set up your static routes manually by adding a gateway and static routes for the far side network (and the inverse on the opposite side)

Cheers

James

jimp

I didn't have much luck either. It may be the lagg mode, you might try a couple different ones (perhaps roundrobin) and see if you get any different results.

I also noticed that lagg is configured at bootup before OpenVPN so it would never work initially at bootup, you have to re-save/apply the lagg after it's booted up all the way.

jamesc

OK, I've switched the lagg mode to round robin and the ping starts replying. The traffic graphs show equal utilisation on each WAN at both ends. However, failover doesn't seem to work. When I down any of the 4 WAN connections, the ping fails :(

jimp

Yeah, that's what I was afraid of. OpenVPN doesn't take the interface down when it's disconnected, so lagg still sees the interface as "up" and doesn't take it out of the lagg.

Curious whether it would work if you ran "ifconfig ovpnc2 down".
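
The idea can be pictured with a toy model. In this Python sketch (the class and function names are invented, and it models one sending side only; the far end makes the same decision independently, based on its own interface state), a round robin lagg skips a port only when the interface itself is down, not when the tunnel behind it has stopped passing traffic.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    link_up: bool     # what lagg can see: the interface flag
    tunnel_ok: bool   # whether the VPN behind it actually passes traffic


def send_roundrobin(ports, n_packets):
    """Send n packets round robin over the ports lagg considers active.
    Returns (delivered, lost)."""
    active = [p for p in ports if p.link_up]
    if not active:
        return 0, n_packets
    delivered = sum(
        1 for i in range(n_packets) if active[i % len(active)].tunnel_ok
    )
    return delivered, n_packets - delivered


# One WAN has failed, but the OpenVPN interface still reports "up".
stale = [Port("ovpnc1", True, True), Port("ovpnc2", True, False)]
# Same failure after the interface has been taken down by hand.
downed = [Port("ovpnc1", True, True), Port("ovpnc2", False, False)]

print(send_roundrobin(stale, 100))   # (50, 50): half the packets go into the dead tunnel
print(send_roundrobin(downed, 100))  # (100, 0): everything uses the surviving tunnel
```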

jamesc

Just tested that, the ping still fails…

When I do an 'ifconfig ovpnc2 up' it all comes back up again and the graphs show equal distribution of traffic.

jimp

It probably has to be downed on both ends then (server and corresponding client)

jamesc

You are right, that seems to work.

Is there an easy way I could test throughput gain? I'm running all this in VMware so there's no real Internet connection. I've just got the WANs going through separate vSwitches.

jimp

If you have a client behind each firewall, you could run iperf between them, but if it's all on the same ESX box it would be hard to tell.

Can't you rate limit switch ports on ESX? You could perhaps limit each one to something like 1Mbit/s and see what happens more clearly.
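
To make a rate-limited test like that easier to read, a small Python helper can compare an iperf result against the per-link limits. This is only a sketch: the function name is made up and the 1.7 Mbit/s figure is an invented example, not a measurement from this thread.

```python
def bonding_report(per_link_mbit, measured_mbit):
    """Compare a measured throughput against the per-link rate limits.

    gain       > 1.0 means faster than the best single link (bonding helps)
    efficiency = fraction of the ideal sum of all links actually achieved
    """
    best_single = max(per_link_mbit)
    ideal_sum = sum(per_link_mbit)
    return {
        "best_single": best_single,
        "ideal_sum": ideal_sum,
        "gain": measured_mbit / best_single,
        "efficiency": measured_mbit / ideal_sum,
    }


# Two links limited to 1 Mbit/s each; 1.7 Mbit/s is a hypothetical iperf result.
report = bonding_report([1.0, 1.0], 1.7)
print(report)
```

A gain close to 1.0 would mean the traffic is effectively using one link; a gain approaching 2.0 with two equal links would indicate real aggregation.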

jamesc

Not sure if switch ports can be rate limited on a standard vSwitch.

What if I change the WAN interfaces on the client firewall to 10Mbps full duplex and then run iperf?

jimp

That might help so long as your ESX box is actually capable of passing 40+Mbit/s through it like that. I was aiming lower so the effect would be easier to see without having to potentially come near the limits of the actual hardware involved.

I can set speed and loss % on VMware Workstation switch ports, so I imagine ESX should be able to do the same, perhaps on the client VM and not on the vSwitch.

jamesc

Could I ask how I would make one of the networks behind my server firewall available to the OpenVPN client?

This is the first time I've worked with TAP, so I'm a little unsure.

I have added a new NIC on my server firewall, IP 192.168.1.2, assigned this to OPT3 and uplinked this to an unused vSwitch, along with a Windows VM on the same subnet that will run the iperf server. I have also created an allow all firewall rule on OPT3.

I have bridged the OPT3 interface in my OpenVPN server setup but from the client I cannot ping 192.168.1.2.

jimp

On the 10.99.99.2 side, add a gateway of 10.99.99.1. Then add a static route for the remote subnet pointing to 10.99.99.1.

Do the opposite on the other end. Add static routes for any networks you want to reach.
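
As a sanity check of what those routes should achieve, here is a minimal Python sketch of the two routing tables and a longest-prefix lookup. The /24 masks and the 192.168.50.0/24 client-side LAN are assumptions for illustration; only 10.99.99.1, 10.99.99.2 and the 192.168.1.x network on OPT3 come from this thread.

```python
import ipaddress

# Client-side firewall (lagg IP 10.99.99.1). Masks are assumed to be /24.
CLIENT_ROUTES = {
    "10.99.99.0/24": "directly connected (lagg)",
    "192.168.1.0/24": "10.99.99.2",  # server-side OPT3 network via the server's lagg IP
}

# Server-side firewall (lagg IP 10.99.99.2).
SERVER_ROUTES = {
    "10.99.99.0/24": "directly connected (lagg)",
    "192.168.1.0/24": "directly connected (OPT3)",
    "192.168.50.0/24": "10.99.99.1",  # hypothetical client-side LAN
}


def next_hop(routes, dest):
    """Longest-prefix match. None means no static route matched
    (the traffic would follow the default route instead)."""
    addr = ipaddress.ip_address(dest)
    matches = [
        ipaddress.ip_network(net) for net in routes
        if addr in ipaddress.ip_network(net)
    ]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[str(best)]


print(next_hop(CLIENT_ROUTES, "192.168.1.2"))    # 10.99.99.2
print(next_hop(SERVER_ROUTES, "192.168.50.10"))  # 10.99.99.1
```

If a lookup for the remote subnet returns None on either side, that side is missing its static route and the traffic will leave via the default gateway instead of the lagg.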

jamesc

Hi Jim,

Struggling to progress this further as it doesn't look like you can change the switch port speed in ESXi 5. I guess VMware Workstation is intended more for development, hence the reason for including that feature.

I did try setting the WAN interfaces on the client to 10 Mbps but the Interfaces widget still reports them as gigabit. I ran two iperf tests, one with the interfaces set to 'auto' and the other with them set to 10 Mbps. The results were pretty much the same.

On another note, I noticed that just clicking 'save' in the OpenVPN config screen (with nothing changed) upsets the lagg and the pings start failing. After opening the lagg configuration and hitting save, the pings start replying again.

Is this a feature that's on the roadmap for a future release? If the bandwidth test is promising, would you say there is much work involved in getting the failover features to work?

jimp

Not sure really. It's something we'd like to see working (VPN bonding) but lagg may not be the most efficient way to get that done in the long run. It's just been a topic lately since Zeroshell is doing it that way.

Reconfiguring the lagg as needed may not be difficult to add in the backend but it seems like there are quite a few issues with doing it that way that may end up making it not really feasible to use.

Another way we'd mentioned is doing MLPPP over the tunnels to bond them but that could be even more of a challenge.