DualWAN set-up Still continues to not work… [ my x-mas wish]

databeestje

FYI: embedded YouTube videos fail to load properly some percentage of the time when you use the default round-robin method of load balancing.

I have an alias, something along these lines:
ext_youtube 64.15.112.0/20, 208.65.152.0/20, 208.117.224.0/20, 208.65.152.0/20, 64.233.167.99/20, 64.233.187.99/20, 72.14.207.99/20, 74.125.8.167/20

Basically a summary of the YouTube netblocks. I do my load balancing by having 2 failover pools, sending a number of things out one link and other sites out the other.
I do have a balance-all pool, but I rarely use it in rules. There are just too many frigging sites that always expect you to come from the same IP address.
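
For illustration only, here is a minimal Python sketch of that idea: match the destination against the alias netblocks and pick a failover pool. The pool names `failover_wan1` and `failover_wan2` and the function `pick_pool` are made up for this example; pfSense does this with an alias and firewall rules, not with code like this.

```python
import ipaddress

# Netblocks copied from the alias above (duplicate entry dropped).
# strict=False because some entries have host bits set below the /20
# boundary and are normalised to their enclosing /20 network.
EXT_YOUTUBE = [
    ipaddress.ip_network(block, strict=False)
    for block in (
        "64.15.112.0/20", "208.65.152.0/20", "208.117.224.0/20",
        "64.233.167.99/20", "64.233.187.99/20", "72.14.207.99/20",
        "74.125.8.167/20",
    )
]


def pick_pool(dst_ip, default_pool="failover_wan2"):
    """Return the (hypothetical) pool name to use for a destination address."""
    addr = ipaddress.ip_address(dst_ip)
    if any(addr in net for net in EXT_YOUTUBE):
        # Everything inside the alias always leaves through the same link.
        return "failover_wan1"
    return default_pool
```

With this, `pick_pool("208.65.153.238")` lands in the first pool, while an address outside the alias falls through to the default pool.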

fvter

So granted that there are restrictions on how the target site may handle incoming traffic; I've never actually denied that. But this is essentially true of almost all environments.

My issue is more with how the load-balancer engine handles this. Why do I have to create specific rules to manage the load balancing and exclude badly responding sites? Doesn't that break the whole principle of having a load-balancing engine?

I don't understand why the engine can't just have enough intelligence to say:

internal_ip starting a session to subnet x.y.z.u
route traffic internal_ip to subnet x.y.z.u on outgoing int-1
keep routing that traffic for n time (secs, minutes)
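
As a rough sketch of the behaviour described in those three steps (an illustration, not how pfSense's balancer is implemented), the Python below remembers which link a given internal IP used for a destination network and keeps using it until the entry times out. The class name `StickyRouter`, the 300-second default, and above all the fixed /24 destination prefix are assumptions; the real size of the destination subnet is not known to the router.

```python
import ipaddress
import itertools


class StickyRouter:
    """Round robin over links, but remember the choice per (source, dest network)."""

    def __init__(self, links, ttl=300.0, prefix_len=24):
        self._rr = itertools.cycle(links)
        self.ttl = ttl                # "n time" from the steps above, in seconds
        self.prefix_len = prefix_len  # assumed size of the destination subnet
        self._table = {}              # (src, dst_net) -> (link, expires_at)

    def route(self, src, dst, now):
        dst_net = ipaddress.ip_network(f"{dst}/{self.prefix_len}", strict=False)
        key = (src, dst_net)
        entry = self._table.get(key)
        if entry is not None and entry[1] > now:
            link = entry[0]           # still sticky: reuse the same link
        else:
            link = next(self._rr)     # new or expired: take the next link
        # Every use refreshes the timeout for this source/destination pair.
        self._table[key] = (link, now + self.ttl)
        return link
```

`now` is passed in explicitly (for example from `time.monotonic()`) so the timeout logic is easy to follow and to test.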

I've set up load balancing on Cisco routers, and this type of issue didn't play a role there!

I am also still perplexed as to why the load balancing/failover fails completely when one link goes down. It doesn't make sense!

just my 0.02€ worth

althornin

@fvter:

I am also still perplexed as to why the load balancing/failover fails completely when one link goes down. It doesn't make sense!

just my 0.02€ worth

This means your config is wrong.
I use dual wan with failover, and it works fine. Including when a link fails.

althornin

I looked at your config, and here is where you have fubared it:

<lbpool><type>gateway</type>
<behaviour>balance</behaviour>
<monitorip>208.67.220.220</monitorip>
<name>WANLoadB</name>
<desc>Load Balancing on the WAN Links</desc>
<port><servers>wan|RRR.TTT.UUU.254</servers>
<servers>wan|208.67.222.222</servers>
<servers>opt1|XXX.YYY.ZZZ.1</servers>
<servers>opt1|208.67.220.220</servers></port></lbpool>

Why are your interfaces in there twice?

The monitor IPs you have in there are wrong: because any interface can ping either 208.67.222.222 or 208.67.220.220, pfSense can't tell whether a link is down.
Remove those extra items from the pool list in all of your lbpools; you have them in every pool.
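
A quick way to spot this kind of mistake is to scan the pool definition for interfaces that appear more than once. The Python sketch below is only an illustration: it assumes each `<servers>` entry has the form `interface|monitor IP`, as in the fragment posted above, and the function name `duplicate_interfaces` is made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def duplicate_interfaces(lbpool_xml):
    """Map each interface listed more than once in a pool to its monitor IPs."""
    pool = ET.fromstring(lbpool_xml)
    by_iface = defaultdict(list)
    for server in pool.iter("servers"):
        # Assumed entry format: "<interface>|<monitor IP>"
        iface, _, monitor = (server.text or "").partition("|")
        by_iface[iface].append(monitor)
    # Keep only the interfaces that show up more than once.
    return {iface: ips for iface, ips in by_iface.items() if len(ips) > 1}
```

Run against the pool above, this reports both `wan` and `opt1`, each with two monitor IPs. A pool with one entry per interface returns an empty dict.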

FieroGT

OK, here is all you have to do; this should fix 90% of the problems you are having. Very simple. I experienced a lot of the same problems, and everything works great now!
Go to System -> Advanced, then enable "Use sticky connections".

That should help a lot with the timeouts and with videos not loading or playing correctly.
Give it a shot and see how that works for you.

BTW, I'm running 3 modems; 2 are load balanced.

-r0b

Cry Havok

@fvter:

So granted that there are restrictions on how the target site may handle incoming traffic; I've never actually denied that. But this is essentially true of almost all environments.

My issue is more with how the load-balancer engine handles this. Why do I have to create specific rules to manage the load balancing and exclude badly responding sites? Doesn't that break the whole principle of having a load-balancing engine?

I don't understand why the engine can't just have enough intelligence to say:

internal_ip starting a session to subnet x.y.z.u

route traffic internal_ip to subnet x.y.z.u on outgoing int-1

keep routing that traffic for n time (secs, minutes)

And how is it going to know what that subnet is? You have nothing to tell you whether that's a /8, /15, /30, etc. without performing lookups (probably WHOIS), which take non-trivial amounts of time and may not even give you accurate information. I know a number of very large load-balancing setups that do nothing more advanced than push a single internal unit (whether that's an IP, a subnet, or whatever) through a single external IP, attempting to balance the load that way rather than dynamically.
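
To make the ambiguity concrete, the small Python sketch below shows the very different networks a single destination address belongs to depending on the prefix length you guess. The function name `candidate_networks` is made up for this example.

```python
import ipaddress


def candidate_networks(dst_ip, prefix_lengths=(8, 15, 30)):
    """Return the network containing dst_ip for each guessed prefix length."""
    return [
        ipaddress.ip_network(f"{dst_ip}/{length}", strict=False)
        for length in prefix_lengths
    ]
```

For 208.65.153.238 the guesses range from a /30 with 4 addresses to a /8 with over 16 million, and nothing in the packet says which one is right.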

fvter

@althornin:

I looked at your config, and here is where you have fubared it:

<lbpool><type>gateway</type>
<behaviour>balance</behaviour>
<monitorip>208.67.220.220</monitorip>
<name>WANLoadB</name>
<desc>Load Balancing on the WAN Links</desc>
<port><servers>wan|RRR.TTT.UUU.254</servers>
<servers>wan|208.67.222.222</servers>
<servers>opt1|XXX.YYY.ZZZ.1</servers>
<servers>opt1|208.67.220.220</servers></port></lbpool>

Why are your interfaces in there twice?

The monitor IPs you have in there are wrong: because any interface can ping either 208.67.222.222 or 208.67.220.220, pfSense can't tell whether a link is down.
Remove those extra items from the pool list in all of your lbpools; you have them in every pool.

I thought about that as well! The original intention was to have two addresses to ping in case of failure, but even when I removed the OpenDNS servers from the config, it still didn't work.
The other IP addresses are the next hop (i.e. the gateway) of each DSL connection.

althornin

Your interfaces should not be in there twice.

marrandy

@cmb:

That's not what I'm saying. When you browse a site, you have multiple HTTP connections out, and some of them are going to get routed out different public IPs. It's round robin on a per-connection basis. That breaks some websites.

You don't have round robin by source IP address?

What I mean by that is that all traffic from e.g. source IP 192.168.1.100 will go via one route, and 192.168.1.101 will then be routed to a different public IP and so on.
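
A minimal Python sketch of that idea, for illustration only: the link is chosen from the source address alone, so every connection from one host leaves through the same link. Note that this uses a simple modulo of the address rather than a true round robin, and the function name `link_for_source` is made up.

```python
import ipaddress


def link_for_source(src_ip, links):
    """Pick a link from the source address only, ignoring the destination."""
    # The same source always maps to the same index, so all of its
    # connections share one outgoing link.
    index = int(ipaddress.ip_address(src_ip)) % len(links)
    return links[index]
```

With two links, 192.168.1.100 maps to the first and 192.168.1.101 to the second, and each keeps that link for every connection it opens.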

How does that work with FTP or VoIP then, as they both use multiple paths for control and data? Wouldn't those also get routed to different public IPs, which would break them?

I don't get that. Round Robin by source IP would fix that issue.

Or am I missing something?

GruensFroeschli

You just described why FTP and VoIP are such problematic protocols.
It is usually solved by forcing these protocols onto only one WAN and not balancing them.

The other possibility is to use sticky connections.

Use sticky connections
Successive connections will be redirected to the servers in a round-robin manner with connections from the same source being sent to the same web server. This "sticky connection" will exist as long as there are states that refer to this connection. Once the states expire, so will the sticky connection. Further connections from that host will be redirected to the next web server in the round robin.
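
Read literally, that description could be sketched in Python as below. This is only a model of the text, not pf's or pfSense's actual code; the class `StickyRoundRobin` and its method names are made up. A source keeps its target for as long as it has at least one open state, and gets the next target in the rotation once all of its states are gone.

```python
import itertools


class StickyRoundRobin:
    """Round robin with per-source stickiness tied to open states."""

    def __init__(self, targets):
        self._rr = itertools.cycle(targets)
        self._sticky = {}  # source -> [target, number of open states]

    def open_state(self, source):
        entry = self._sticky.get(source)
        if entry is None:
            # No live states for this source: take the next target in rotation.
            entry = self._sticky[source] = [next(self._rr), 0]
        entry[1] += 1
        return entry[0]

    def close_state(self, source):
        entry = self._sticky[source]
        entry[1] -= 1
        if entry[1] == 0:
            # Last state expired, so the sticky association expires too.
            del self._sticky[source]
```

This also shows why a host can still switch links in the middle of a browsing session: if all of its states expire between two page loads, the next connection is treated as new.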

However, I don't know what the status of that feature is.
Last I heard, it doesn't work like it should.