(config) Issue with outbound load balancing
-
Hi,
I'm new to pfSense, so maybe I'm doing something wrong, but I have read a lot of documentation, in particular the sticky thread in this section: Outbound Load Balancing is replaced.
pfSense seems to always NAT to the same gateway.
I have pfSense installed on a box with 2 NICs: WAN & LAN.
The LAN interface connects to the LAN.
The WAN interface connects to all the Internet connections:
- GW1 (static IP)
- GW2 (static IP)
- GW3 (static IP)
I added each GW in System > Routing. GW1 is the default GW.
I defined a group in System > Routing > Groups:
- Name: Balanced
- GW1 -> Tier 2
- GW2 -> Tier 1
- GW3 -> Tier 1
I didn't define any routes in System > Routing > Routes.
As a (traditional) pf user, my instinct was to create a NAT rule for outgoing traffic (like the traditional definition: nat on $ext_if from $ip to any -> $ext_ip1), but the tip on the gateway groups page explicitly says to create a firewall rule.
So I created (or modified, actually I don't remember if a default rule was set) a firewall rule in Firewall > Rules > LAN:
- Action: Pass
- Disable: unchecked
- Interface: LAN
- Protocol: Any
- Source: LAN subnet
- Destination: Any
- Gateway: Balanced
I have uninstalled Squid, but HTTP traffic always goes through GW1. As GW1 is tier 2 in the group, I think load balancing is not being done at all and the default system gateway is always chosen.
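For reference, the behaviour expected from the group above can be sketched roughly as follows. This is a simplified model and not pfSense's actual code: it assumes that the lowest-numbered tier with at least one online gateway is the one in use, and that new connections are spread round-robin across the gateways in that tier. The function name and the round-robin detail are illustrative assumptions.

```python
from itertools import cycle


def active_gateways(tiers, online):
    """Return the gateways of the lowest-numbered tier that has an online member."""
    by_tier = {}
    for gateway, tier in tiers.items():
        if gateway in online:
            by_tier.setdefault(tier, []).append(gateway)
    if not by_tier:
        return []  # nothing online at all
    return sorted(by_tier[min(by_tier)])


# Tier assignment taken from the "Balanced" group described in this post.
tiers = {"GW1": 2, "GW2": 1, "GW3": 1}

# With every gateway online, only the tier 1 members should carry traffic.
pool = active_gateways(tiers, online={"GW1", "GW2", "GW3"})
picker = cycle(pool)  # assumed round-robin over the active tier
print(pool, [next(picker) for _ in range(4)])

# GW1 (tier 2) should only be used once both tier 1 gateways are down.
print(active_gateways(tiers, online={"GW1"}))
```

Under this model, traffic leaving only through GW1 while GW2 and GW3 are online means the group is not being applied to the traffic at all.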
It seems that outbound LB is not being done, but I don't know why. Can someone help me with this?
Edit: I forgot to say that I'm running 2.0-BETA4, but it should be obvious given the section we're in.
-
-
Is that rule on LAN the only rule?
The rules are processed from the top down, and the first match wins. So if another rule matches that traffic first, it would never hit the rule with the gateway set.
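A minimal sketch of that first-match behaviour, matching on source address only. The `Rule` class, the `first_match` helper and the 192.168.1.0/24 subnet are hypothetical and only illustrate the ordering problem; real rules match on many more fields.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Rule:
    source: str              # source network in CIDR notation
    gateway: Optional[str]   # None means "use the default routing table"


def first_match(rules, src_ip):
    """Walk the rules top down and return the first one whose source matches."""
    addr = ip_address(src_ip)
    for rule in rules:
        if addr in ip_network(rule.source):
            return rule
    return None  # no rule matched


# A generic pass rule placed above the gateway rule shadows it completely.
shadowed = [Rule("192.168.1.0/24", None), Rule("192.168.1.0/24", "Balanced")]
print(first_match(shadowed, "192.168.1.10").gateway)

# With the gateway rule on top, the same traffic is sent to the group.
ordered = list(reversed(shadowed))
print(first_match(ordered, "192.168.1.10").gateway)
```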
-
This is the first (and only) rule.
There's currently nothing else configured on the distribution (as I said Squid is uninstalled until I properly configure the outbound lb).
-
Squid won't work with the outbound LB anyhow so it's best to leave that off.
After changing the rules, reset the states. Testing by refreshing over and over in the same browser would never hit different WANs, since the state/session is probably still open. The best way is to test with curl or multiple browsers from a system on LAN.
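The same kind of test can be scripted. The sketch below is one possible way to do it, not an official procedure: it opens a new HTTP connection per request and counts which public address each request appears to come from. The helper names are made up for illustration, and the demonstration at the bottom uses a stand-in fetcher with documentation-range addresses rather than real measurements.

```python
from collections import Counter
from itertools import cycle
from urllib.request import urlopen


def fetch_public_ip(url, timeout=10):
    """Open a new HTTP connection and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode().strip()


def exit_ip_distribution(fetch, attempts):
    """Call fetch() `attempts` times and count how often each answer appears."""
    return Counter(fetch() for _ in range(attempts))


# Offline demonstration with a stand-in fetcher (placeholder addresses only).
fake_wans = cycle(["203.0.113.2", "203.0.113.3"])
demo = exit_ip_distribution(lambda: next(fake_wans), 6)
print(demo)
```

On a LAN host, a real run would pass something like `lambda: fetch_public_ip(url)`, where `url` points at an IP-echo service of your choice that returns the caller's address as plain text; no particular service is implied here. With sticky connections enabled, a single host may keep landing on the same gateway, so it is worth running from more than one host.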
-
I can see on the routers providing access to GW2 and GW3 that there's no traffic on them:
GW2:
- TX packets: 6000
- RX packets: 8748
- Connected since: 8:13:44
GW3:
- TX packets: 0
- RX packets: 3802
- Connected since: 79:16:52
I understand that I can't see the load balancing by refreshing a web page; furthermore, I've enabled sticky connections.
But after a state table flush I shouldn't go through GW1 (tier 2), and yet the LB always uses GW1 and never GW2 (tier 1) or GW3 (tier 1).
-
Yeah you should be seeing something there.
Check the system logs, see if there is a message about the gateways or gateway group not being resolvable or any similar errors.
-
I ran many tests, so there are many messages:
@System:
Oct 19 10:56:49 php: /system_gateway_groups.php: Removing static route for monitor 212.27.40.240 and adding a new route through 10.1.5.32
Oct 19 10:56:49 php: /system_gateway_groups.php: Removing static route for monitor 194.2.0.20 and adding a new route through 10.1.5.31
Oct 19 10:56:49 php: /system_gateway_groups.php: Removing static route for monitor 62.73.7.254 and adding a new route through 10.1.5.30
Oct 19 10:56:49 check_reload_status: reloading filter
Oct 19 10:56:49 php: /system_gateway_groups.php: ROUTING: change default route to 10.1.5.30
These are not abnormal, as I made changes.
There's also a stranger message:
@System:
Oct 19 10:56:50 php: : Gateways status could not be determined, considering all as up/active.
But according to the message, it should work.
Status > Gateways displays "online" or "Warning, latency" correctly. But Status > Gateways > Groups data is "(GWx), Gathering data" and never changes.
What should be displayed in this table is not nicely described in the doc.
-
They should all say "online" and not "gathering data", which may be part of the problem. There have been a lot of gateway fixes lately, so it's critical to be on the most current snapshot when working with gateway groups and such.
-
They might say 'Gathering data…' for ~10 seconds; if it does not change after that, there is an error.
Possibly a restart would fix it. But I think you are on an older snapshot, so upgrade first.
-
I just updated to the latest build.
The Status > Gateways > Groups table now displays correct data ("Online").
The traffic is still only on GW1. GW2 and/or GW3 are not used.
Edit: clarified the post.
-
We'll need a screenshot of your LAN firewall rules tab then, and the contents of /tmp/rules.debug would also help.
-
I just removed GW1 from the pool, for test purposes.
After flushing the state table, GW1 is still used.
I noticed that now (probably after the upgrade) all GWs are marked as "Default" on System > Routing.
I tried to uncheck the box, but all GWs are still "Default" after applying the changes.
I tried to reboot the server, but nothing changed.
Summary:
- My pool is composed of GW2 & GW3
- All the connections are made on GW1
- All GW are marked as default, and I can't remove this mark
- All GW are up and detected as such in all tables.
- The GW Group table displays correctly "online" state.
P.S. : I've seen your post, doing it now.
-
Here are the files
![fw rules.PNG](/public/imported_attachments/1/fw rules.PNG)
tmp_rules.debug.txt
-
Jimp, did you see anything wrong?
I think the configuration is very basic.
If you want me to reset the whole config, feel free to ask.
-
Try without sticky-address or from 2 different hosts on LAN.
-
It does look fairly basic, aside from the fact that all three gateways are on the same interface. I'm not sure if that is causing an issue or not, but as ermal said, try it without sticky checked under advanced options, see if that makes a difference.
-
OK, I tried deactivating sticky connections and the problem remains the same.
Shouldn't pfSense be able to manage multiple GWs on the same interface?
-
It should work, in theory, but I'm not sure if anyone has thoroughly tested that scenario using them for WAN-type gateways.
-
Hi,
I received the hardware needed to have one interface per GW, and it works now.
No more erroneous information (like every GW displayed as default).
I think there's a bug with this scenario. :)
I'll let you decide whether to tag this thread as resolved (I'm not sure if it should be).