MultiWAN loadbalancing issues
-
Hi all,
I have 4 internet connections (two TD-LTE and two PPPoE), and right now I am only using the two TD-LTE connections. pfSense (ver. 2.3.2_1) is running as a VM on a ProLiant server running ESXi 6.5.0.
The two LTE connections are from the same ISP, and since pfSense can't handle the same subnet and gateway on two different interfaces, I have set the modems to NAT mode. Modem #1 has gateway 192.168.1.1 (pfSense's WAN interface is assigned 192.168.1.10) and Modem #2 has gateway 192.168.2.1 (pfSense is assigned 192.168.2.10). Both gateways are online, and when I ping or traceroute specifically through each interface, everything seems to be in order.
I have set up the load-balancing gateway group with both interfaces as Tier 1, and I have set up the firewall rule to use the load-balancing gateway. However, pfSense only uses WANGW (the default gateway) no matter what. Even if I force the default gateway down, it still uses the first interface (192.168.1.10); by the way, no failover scenario has been configured. Switching the default gateway to the second interface doesn't force the traffic through it either. When both connections are online, traffic only travels through the 192.168.1.1 gateway, while the second gateway sits idle with no activity. IDM and other services that open multiple connections only produce activity on the first gateway, and nothing goes through the second.
I have been trying to solve this issue, and so far nothing has worked. I even deleted the VM and reinstalled pfSense, and rechecked the network config within VMware to see whether something was incorrectly configured there (it doesn't seem so). I am either missing something very simple, or I am way out of my depth here. I'd be grateful if you helped me out. Thanks...
P.S. Sticky connections option is disabled.
P.P.S. I have used 8.8.8.8 and 8.8.4.4 as the gateway monitoring addresses, one per interface, and the DNS server settings use the same public DNS addresses, one per interface.
-
"And I have setup the firewall rule to use the loadbalancing gateway."
How are you testing the trace? You can't test it with traffic generated by the firewall itself, because the policy routing configured in the firewall rules only applies to traffic coming "inbound" on the interface. You would have to test from a client behind the pfSense LAN, not with pfSense's own ping or traceroute tools.
-
Within pfSense, under Diagnostics, you can run both ping and traceroute while specifying the desired interface. That is one way I did it. The other way was physically disconnecting each modem and running tracert in the command prompt window on my Windows PC.
-
You have a firewall rule above your load balancing rule (Default LAN to Any IPv4) that's taking precedence for all traffic generated by your LAN net and sending it to the default gateway. pfSense processes the rule set from the top down. Move the "LAN net to any" rule below the one you have configured with the specified GW group.
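To illustrate why the rule order matters, here is a minimal Python sketch of top-down, first-match rule evaluation. It is a simplified model, not pfSense's actual implementation; the LAN subnet 10.0.0.0/24, the rule descriptions, and the gateway group name LB_GROUP are hypothetical placeholders that do not come from this thread.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Rule:
    description: str
    source: str   # source network in CIDR notation
    gateway: str  # gateway (or gateway group) applied when the rule matches


def first_match(rules: List[Rule], src_ip: str) -> Optional[Rule]:
    """Walk the rules from the top down and return the first one that matches."""
    addr = ip_address(src_ip)
    for rule in rules:
        if addr in ip_network(rule.source):
            return rule  # later rules are never consulted
    return None


LAN = "10.0.0.0/24"  # hypothetical LAN subnet

# Original order: the default allow rule sits above the load-balancing rule.
broken = [
    Rule("Default allow LAN to any", LAN, "default"),
    Rule("LAN to any via gateway group", LAN, "LB_GROUP"),
]

# Corrected order: the load-balancing rule comes first.
fixed = [broken[1], broken[0]]

print(first_match(broken, "10.0.0.50").gateway)  # default
print(first_match(fixed, "10.0.0.50").gateway)   # LB_GROUP
```

In the first ordering the gateway-group rule is never reached, which would match the symptom of all traffic leaving through the default gateway.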
Also, you can't use the ping or traceroute tools inside pfSense to test your load balancing configuration, because their traffic is considered firewall-generated. The rule that specifies your load balancing GW group won't apply to traffic generated by those tools on pfSense. It only applies to traffic coming "inbound" on the specified interface (LAN). Also, multi-WAN load balancing distributes individual connections in a round-robin fashion, so traceroute wouldn't be the best test here. Try running a speed test and then checking the traffic graph in pfSense for both WANs to make sure there is activity on each of them.
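The per-connection round-robin behaviour can be sketched in Python as follows. This is an idealized model under stated assumptions: a gateway's weight is modelled as the number of times it appears in the rotation, and WAN2GW is a hypothetical name for the second gateway (only WANGW is named in this thread).

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List


def build_pool(weights: Dict[str, int]) -> List[str]:
    """Repeat each gateway according to its weight (modelling assumption)."""
    pool: List[str] = []
    for gateway, weight in weights.items():
        pool.extend([gateway] * weight)
    return pool


def assign_connections(weights: Dict[str, int], n_connections: int) -> List[str]:
    """Assign each new connection to the next gateway in the rotation."""
    picker = cycle(build_pool(weights))
    return [next(picker) for _ in range(n_connections)]


weights = {"WANGW": 1, "WAN2GW": 1}

# Eight parallel connections are split across both gateways.
print(Counter(assign_connections(weights, 8)))

# A single connection always stays on one gateway, so it can never
# use more bandwidth than that one link provides.
print(assign_connections(weights, 1))
```

Because the unit being balanced is the connection, a single-connection test such as one traceroute or one download stream only ever exercises one WAN; tests that open several connections at once are needed to see both links carry traffic.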
-
You have a firewall rule above your load balancing rule (Default LAN to Any IPv4) that's taking precedence for all traffic generated by your LAN net and sending it to the default gateway. pfSense processes the rule set from the top down. Move the "LAN net to any" rule below the one you have configured with the specified GW group.
That did it! Thank you! I completely overlooked the firewall rule order. I changed the "default LAN to any" rule to use the LB gateway group and deleted the rule I had created myself.
The two WANs now appear to be load balancing, but not as effectively or efficiently as I would like. Each WAN on its own gives me 25-27 Mb/s (speedtest.com), but combined I don't get anything above 15-17 Mb/s. I have been tweaking the weight ratio (though both have the same speed and are from the same ISP).
Aside from using speedtest, I thought downloading a large file via IDM could be a better way to test the actual bandwidth, but IDM appears to be using only the default gateway. For instance, I set IDM to use 8 connections and set the weight ratio to 4-4 in pfSense, but IDM only talks through the default gateway, while the second gateway is idle with no traffic activity.
In some YouTube videos I have seen people easily aggregating the bandwidth of two connections (illustrated as before and after on speedtest), but so far my attempts have been semi-fruitful (if that's even a word). I will try to research this more on my own, but as always, any help that could save me time, frustration, and energy would be greatly appreciated!
Also, you can't use the ping or traceroute tools inside pfSense to test your load balancing configuration, because their traffic is considered firewall-generated. The rule that specifies your load balancing GW group won't apply to traffic generated by those tools on pfSense. It only applies to traffic coming "inbound" on the specified interface (LAN). Also, multi-WAN load balancing distributes individual connections in a round-robin fashion, so traceroute wouldn't be the best test here. Try running a speed test and then checking the traffic graph in pfSense for both WANs to make sure there is activity on each of them.
I did not know that. Thank you for clarifying the matter for me.
UPDATE: So, I tested load balancing with only the two DSL lines, and now I appear to be getting the aggregated bandwidth of 15 Mb/s (7 Mb/s from DSL A + 8 Mb/s through DSL B). Another puzzling thing about the TD-LTE lines is that when I start downloading a file, the RTT of one of the two connections climbs very rapidly (from 130 ms to 650 ms, at which point the offline state is triggered) while the other one remains pretty stable. Also, at all times the two connections seem to have about a 60 to 70 ms RTT difference!
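The RTT jump described above matters because gateway monitoring compares the probe latency against thresholds to decide a gateway's state. A minimal Python sketch of that comparison follows; the 200 ms and 500 ms thresholds and the state names are assumptions for illustration, not values taken from this thread, so check the thresholds actually configured on each gateway.

```python
def classify_gateway(rtt_ms: float, warn_ms: float = 200.0, down_ms: float = 500.0) -> str:
    """Classify a gateway from its monitored RTT (thresholds are assumed values)."""
    if rtt_ms >= down_ms:
        return "down"
    if rtt_ms >= warn_ms:
        return "warning"
    return "online"


# RTT values reported in the update above.
print(classify_gateway(130))  # online
print(classify_gateway(650))  # down
```

Under these assumed thresholds, a link whose monitored RTT climbs to 650 ms during a download would be treated as down even though it is still passing traffic.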