Load balancing not working with Wireguard client

madbrain

Hi,
I set up pfSense with two ISPs, for both failover and load balancing. When I use Ookla speedtest on either my wired desktop PC or my phone on Wifi, the throughput is the sum of the capabilities of each ISP. E.g. if Sail alone is giving me 200 Mbps down / 30 Mbps up, and Verizon alone is giving me 80 Mbps down / 12 Mbps up, then speedtest will show approximately 280 Mbps down / 42 Mbps up.

However, if I use the Wireguard client on my Windows desktop, and connect to a VPN server, and run a speedtest, it appears that only one of the ISPs is being used. The traffic graph (as monitored on my phone, since the VPN blocks access to pfSense on the desktop) shows throughput only on Verizon, but not Sail in this case. I know Wireguard uses UDP. Are UDP datagrams not getting load balanced, somehow ? If they are supposed to, what am I missing ?

Gblenn

@madbrain You won't get a single VPN tunnel to work across two different WAN connections the way you are seeing with speedtest. Likewise, you would not see the combined throughput with iperf if you only ran one single stream. The tunnel has to go out one OR the other WAN, otherwise the return traffic would not know where to go...

In fact, the same is true of all your traffic in such a setup. Each "session", or stream, will be on one single interface. But when you add them up, you will see that you are getting the combined capacity of the two ISPs.

The reason you see that with speedtest is that they run multiple streams, which then get distributed across the two connections.
So if you can run two or more computers with VPN clients doing speedtest, you would get a similar result, when adding them up so to speak...
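
To illustrate the point about sessions, here is a minimal Python sketch of per-connection balancing. It assumes a simple round-robin choice that is remembered per connection; the `Balancer` class, the addresses and the port numbers are made up for the example, and this is not pfSense's actual code.

```python
from collections import Counter
from itertools import cycle

class Balancer:
    """Toy per-connection balancer: a new flow gets the next WAN, then keeps it."""

    def __init__(self, wans):
        self._next_wan = cycle(wans)
        self.states = {}  # flow 5-tuple -> WAN chosen when the flow was first seen

    def route(self, flow):
        if flow not in self.states:
            self.states[flow] = next(self._next_wan)
        return self.states[flow]

balancer = Balancer(["Sail", "Verizon"])

# Speedtest: several TCP connections, each with its own source port.
speedtest_flows = [("tcp", "192.168.1.10", 50000 + i, "203.0.113.5", 8080) for i in range(8)]
speedtest_usage = Counter(balancer.route(flow) for flow in speedtest_flows)

# VPN tunnel: every packet belongs to the same single UDP flow.
tunnel_flow = ("udp", "192.168.1.10", 51820, "198.51.100.7", 51820)
tunnel_usage = {balancer.route(tunnel_flow) for _ in range(1000)}

print(speedtest_usage)  # Counter({'Sail': 4, 'Verizon': 4})
print(tunnel_usage)     # a single WAN, however many packets are sent
```

The eight speedtest connections spread over both WANs, while the thousand tunnel packets all follow the one WAN picked for that flow.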

madbrain

@Gblenn Thanks. I would think this limitation would exist if the VPN software uses only one stream indeed. But if it used 2 or more streams, wouldn't it also benefit from the load balancing on the router side ?

Perhaps Wireguard is limited to one stream. Perhaps other VPN protocols support multiple streams and would work in this case ?

Gblenn

@madbrain Speedtest initiates multiple independent streams, each one making a connection between the endpoints. Therefore they can go separate routes, and utilize your dual WAN setup. Your load balancing setup will take each session and distribute them across the two interfaces, but it can't split up a single session, since that would break it...

A VPN is a connection between two endpoints, and pfSense, or any other equipment in the path, has no idea of what is going on inside it. Everything inside that "tunnel" is encrypted, which takes a lot of resources, and creating multiple smaller "pipes" probably doesn't make sense at all, since it is all going to the same endpoint anyway. My guess is that it would likely create a lot of overhead instead.

So no, I don't think you can find a VPN solution using multiple "streams" that can be split up across multiple paths. What you can do however, is to set up multiple VPNs, which some providers offer, perhaps for the purpose of tunneling to two different countries or sites. But then again, which traffic goes where... Likely you would then set them up as completely different subnets so you know for sure what is going where...

madbrain

@Gblenn I'm aware of how speedtest works.

A VPN protocol could certainly be designed to work similarly by using multiple independent streams, which could be load-balanced by the router across multiple WANs, without the need for the router to decrypt any of the packets or even perform deep packet inspection to figure out it is VPN traffic. Or the VPN client could run on the router itself, be aware of the number of WANs, and appropriately split traffic for each. It would be a non-trivial implementation for sure, especially when the WANs have different performance characteristics, but it is certainly possible. There is actually research going on in this area, which I found yesterday while Googling but can't locate right now. Clearly, this does not apply to the Wireguard protocol. I'm not sure if it applies to any other VPN protocol currently in existence.

Until such a protocol is available, I would still like to be able to direct VPN traffic to the fastest WAN, and not the slower one, which is what I'm seeing occur. It's probably pure bad luck, but it is fairly consistent.

Right now, I have both load-balancing and failover configured for the 2 WANs. I could delete the load balancing rule, and have the fastest ISP be the top priority. But I would lose the aggregate bandwidth by doing so for applications that can benefit from it - the second ISP would only be used when the first one goes down.

Is there a way to keep the load balancing and failover, while causing Wireguard traffic to prefer one WAN over the other ?

Gblenn

@madbrain Well, I guess it doesn't really matter what may or may not be possible if no one is doing it. And like I said, I don't think you will find any VPN provider that does multistream VPN.

So you are stuck with single stream and no possibility to increase throughput through your load balancing setup.

What you can do however, is to create a LAN Pass rule like this:

Select Protocol UDP or TCP/UDP (Wireguard runs over UDP), which lets you specify the port further down.
YourLANIP in this case is the IP of your PC that is running the Wireguard client.
And VPNPort is the port being used for that client's VPN traffic (or ports if there are different ports for different endpoints).

And under Advanced Options, near the bottom, you select the Gateway you want that traffic to exit via.

That's it...

What you don't have with this is the failover capability for the VPN client. Since your policy rule is forcing all the traffic via one of the interfaces, and if that interface is down, you are stuck. I don't know if there is a way to get around this, like prioritizing the faster interface in the loadbalancer.

madbrain

@Gblenn Thanks. I'm afraid partially losing automatic failover to improve outbound VPN performance is not a great trade-off, unfortunately. Wireguard also doesn't use a fixed port. Various VPN servers use different ones. There is a list of the most common ones. If I could create rules that cover all outbound Wireguard connections on one interface, it seems that it would be possible for Netgate to implement automatic failover for them too, but as you say, that functionality seems to be missing for now.

Since I'm always home when I use outbound VPN, perhaps it is simpler to manually turn off the slower WAN, and reverse this, also manually, if the faster WAN goes down.

Gblenn

@madbrain Well, wireguard is not using random ports, and all your clients or endpoints are known to you by looking at the configs for each. So you can easily create an alias containing all those ports and have any and all your wg connections be routed via the desired interface.
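
As a rough way to collect those ports for an alias, the sketch below pulls the port out of every `Endpoint = host:port` line in a WireGuard client config. The `endpoint_ports` helper and the sample config (hosts, ports, placeholder keys) are made up for illustration.

```python
def endpoint_ports(config_text):
    """Return the sorted, de-duplicated ports of all Endpoint lines in a config."""
    ports = set()
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "endpoint":
            # rpartition keeps IPv6 endpoints such as [2001:db8::1]:1637 intact
            _host, _, port = value.strip().rpartition(":")
            if port.isdigit():
                ports.add(int(port))
    return sorted(ports)

# Hypothetical client config; keys are placeholders.
sample = """
[Interface]
PrivateKey = <redacted>
Address = 10.0.0.2/32

[Peer]
PublicKey = <redacted>
Endpoint = vpn1.example.com:51820
AllowedIPs = 0.0.0.0/0

[Peer]
PublicKey = <redacted>
Endpoint = [2001:db8::1]:1637
"""

print(endpoint_ports(sample))  # [1637, 51820]
```

The resulting list is what you would enter by hand into a port alias and use as the destination port in the policy rule.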

But perhaps you should rather be looking at changing the "weight" of the interfaces in your load balancing group. Since they are very different in capacity, your weight should also be set accordingly. Load balancing is a game of statistics, since the idea is to have many single stream connections to share the combined bandwidth of the interfaces.

If you set the weights so that roughly 71% of the traffic is going through the faster interface, you will be using that interface to a much higher degree, which makes sense from a capacity perspective. Then you maintain the failover capability whilst getting a much higher probability of your VPN going out the desired interface.
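
For reference, the 71% figure follows from the downstream capacities mentioned earlier (200 Mbps and 80 Mbps). A small sketch of the arithmetic, with hypothetical helper names; check which values the Weight setting actually accepts before using the result:

```python
from functools import reduce
from math import gcd

def gateway_weights(capacity_mbps):
    """Reduce capacities to the smallest whole-number weights with the same ratio."""
    divisor = reduce(gcd, capacity_mbps.values())
    return {name: mbps // divisor for name, mbps in capacity_mbps.items()}

def traffic_share(weights):
    """Fraction of new connections each gateway should receive on average."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

weights = gateway_weights({"Sail": 200, "Verizon": 80})
shares = traffic_share(weights)

print(weights)                    # {'Sail': 5, 'Verizon': 2}
print(round(shares["Sail"], 3))   # 0.714
```

With weights 5 and 2, about five out of every seven new connections would go out the faster link, so a single VPN tunnel lands there roughly 71% of the time rather than 50%.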

madbrain

@Gblenn Thanks. I will look into doing that.

However, right now, I'm experiencing a more basic problem. The load balancing isn't working at all, even for Speedtest. All traffic goes through one interface. I'm not certain what might have changed in my configuration that could cause this.

I do have an extra NIC and a total of 3 ISPs rather than 2 ISPs before. But even disabling one interface for one of the ISPs does not restore the load balancing functionality.

I followed the guide at https://www.cyberciti.biz/faq/howto-configure-dual-wan-load-balance-failover-pfsense-router/ . That initially worked with 2 ISPs. But it does not work anymore.

I even tried running simultaneous speedtests from 3 different hosts. All the traffic is going to only one WAN interface. This happens regardless of weights I give to each gateway.

madbrain

@madbrain I had a config with load balancing working with 2 ISPs, which I restored.

If I add an interface for the 3rd ISP, without doing anything else, the load balancing stops working.

If I restore the same working config again, and rename the 2 WAN interfaces, the load balancing also stops working. I renamed WAN to Comcast and WAN2 to Sail. I wouldn't expect a cosmetic config change to affect functionality, but perhaps there are references by name somewhere else. I'm not sure what.

Gblenn

@madbrain Sounds really strange that a name change would have such an effect...

And when adding the third interface, all you do is put that into the existing group, also set as Tier 1 like the others? And that breaks the load balancing completely?

And what about policy rules? Do you have a rule both for the loadbalancer and the failover gateway as per that guide?
If so, you need to make sure the balancer rule is above the failover, since rules are handled from the top.
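
A tiny sketch of why the order matters, assuming first-match-wins evaluation from the top; the rule dictionaries and the gateway group names `LoadBalance` and `Failover` are made up for the example:

```python
def first_match(rules, packet):
    """Return the gateway of the first rule that matches, scanning from the top."""
    for rule in rules:
        if rule["match"](packet):
            return rule["gateway"]
    return "default"

def from_lan(packet):
    return packet["iface"] == "LAN"

balancer_rule = {"name": "LAN via balancer", "match": from_lan, "gateway": "LoadBalance"}
failover_rule = {"name": "LAN via failover", "match": from_lan, "gateway": "Failover"}

packet = {"iface": "LAN", "dst_port": 443}

print(first_match([balancer_rule, failover_rule], packet))  # LoadBalance
print(first_match([failover_rule, balancer_rule], packet))  # Failover
```

If both rules match the same traffic, whichever one sits higher decides the gateway, and the lower one is never reached for that traffic.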

BTW, how do you go about changing the names of the gateways? Did you edit the config file or do you copy the gateway, give it a new name and then delete the old one?

Changing name on a gateway can't be done without affecting other things, like the FW rules for example.

madbrain

@Gblenn I agree it is strange. I can't reproduce the issue with the name anymore. But I can definitely reproduce the problem of load balancing not working.

I backed up my settings, started over with a brand new config, and configured all 3 ISPs.

Still, all traffic is being directed at one ISP, even with multiple hosts each initiating multiple connections.

In the following graph I had 4 devices running Ookla Speedtest - a Windows box, a Linux box, a Raspberry Pi4, and my S22 Ultra phone on Wifi. All except the phone were wired.

All traffic got routed to the Comcast WAN. Sail and Verizon WANs were untouched.
What am I missing ?

Gblenn

@madbrain One thing that I notice which you have set different to what I have, is the default gateway IPv4. I have it set to the failover group that I created, in your case "v4LB". Whereas you have it as Automatic...

Gblenn

@madbrain When I looked at instructions you linked to, or e.g. Lawrence Systems, they suggest using the Gateway Group in the LAN rule. But instead I have it set as Default... and it's working fine.

madbrain

@Gblenn I tried setting the default gateway to the load balancer group also. That did not help, unfortunately. All traffic is still going through Comcast.

madbrain

@Gblenn Are you using pfSense CE or Plus ? I'm using Plus. I don't see the same screen as you posted in your screenshot. Where is it at ?

Edit: found under Firewall -> Rules -> LAN -> Edit (IPv4 rule) -> Show advanced -> Gateway . I set the load balancing group for both IPv4 and IPv6.

And miraculously, the traffic started getting distributed across all 3 WANs !

Thanks for the tip. I wonder how you got it to work without setting the gateway.

Gblenn

@madbrain I have always had it set to the gateway group in that setting. It was the firewall rule that is suggested both in the instructions you linked to and by Lawrence Systems. There I keep it at default..

Great that it works now!

Gblenn

I guess now you could take a look at the weighting, to rebalance based on individual capacity of each connection. Not the Tier number, but rather for each individual Gateway (under System / Routing / Gateways) when you expand the Advanced button. First item there is weight...

madbrain

@Gblenn Yes. I set up the weighting. Unfortunately, I ran into some issues with Netflix streaming, where buffering happened even though all 3 WANs were up. Will post a separate thread.

rikazkhan

@madbrain said in Load balancing not working with Wireguard client:

I do have an extra NIC and a total of 3 ISPs rather than 2 ISPs before. But even disabling one interface for one of the ISPs does not restore the load balancing functionality.
