Here's a new technique for Load Balancing and Failover across multiple WANs.



  • I have three ISPs going into my box and I'd like to set up my own kind of load balancing.

    At the start of a session with any particular host, the router would ping its IP from every available WAN connection, and whichever WAN returns the ping fastest would carry the main portion of the session.  The router would then keep sessions to that host sticky to that WAN until some kind of timeout.  Of course, you would still set up routing rules to move specific ports and types of connections to specific WANs.
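    A minimal sketch of the selection step, assuming the router has already sent one probe per WAN and collected the round-trip times.  The function name, the WAN labels and the convention that `None` means "no reply" are all made up for illustration; no real pings are sent here.

```python
from typing import Dict, Optional


def pick_fastest_wan(rtts_ms: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the WAN whose probe came back fastest.

    rtts_ms maps a (hypothetical) WAN name to the measured round-trip
    time in milliseconds, or None if that WAN got no reply.
    Returns None when no WAN got a reply at all.
    """
    # Ignore WANs that never answered (link down, or the host drops pings).
    answered = {wan: rtt for wan, rtt in rtts_ms.items() if rtt is not None}
    if not answered:
        return None
    # The lowest round-trip time wins.
    return min(answered, key=answered.get)


# Example probe results for one destination, one entry per WAN.
probes = {"wan1": 42.0, "wan2": 18.5, "wan3": None}
chosen = pick_fastest_wan(probes)
```

    With these example numbers the session would start on wan2.  A WAN that is down simply never answers, so it drops out of the comparison without any special handling.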

    Benefits:

    1. You're nearly guaranteed to use the lowest-latency available link to any particular server.

    2. When one link starts being used up by intensive transfers, the ping time it returns for new connections will lengthen and the router will start using another connection automatically.

    3. When one link goes down, its ping time will be infinite.  This would force all new connections to automatically fail over to the fastest available link still working.

    4. You're pinging the exact server you're trying to connect to instead of pinging the gateway like typical load balancing. (Gateways aren't always ping-able)

    5. It's better than Round Robin because you won't pile up large transfers onto one link just because that's how the connections started.

    6. You can also give each WAN's ping time a "weight" against the other WANs' pings.  For instance, say you've got a WAN that's generally not great for ping times but you still want to use its bandwidth (e.g. Cable vs. Metro-E).  Simply make its ping times "weigh" 50% of what they actually are.  The slower connection with 100ms average ping times would then be used as much as the faster connection with 50ms ping times.  (This will, however, lengthen the time it takes to determine which connection to use by just under double on average, given these variables (whoop-de-doo, another 50ms).)  This whole process could even be automated with self-adjusting weights based on the desired traffic distribution across WANs.  (That might be going a bit far for a proof of concept, though.)
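    The weighting in point 6 could look roughly like this.  It's only a sketch: the function name, the WAN labels and the default weight of 1.0 for unlisted WANs are my own assumptions.  Each measured ping time is multiplied by its WAN's weight before the comparison, so a weight of 0.5 makes a 100ms link compete as if it were a 50ms link.

```python
from typing import Dict, Optional


def pick_weighted_wan(
    rtts_ms: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> Optional[str]:
    """Pick the WAN with the lowest weighted round-trip time.

    A weight below 1.0 makes a WAN look faster than it measured;
    WANs missing from `weights` are assumed to have weight 1.0.
    Returns None when no WAN got a reply.
    """
    scored = {
        wan: rtt * weights.get(wan, 1.0)
        for wan, rtt in rtts_ms.items()
        if rtt is not None  # unanswered probes never win
    }
    if not scored:
        return None
    return min(scored, key=scored.get)


# Cable is slower on paper, but its pings only "weigh" 50%.
weights = {"cable": 0.5, "metro_e": 1.0}
choice_a = pick_weighted_wan({"cable": 90.0, "metro_e": 50.0}, weights)   # 45 vs 50
choice_b = pick_weighted_wan({"cable": 120.0, "metro_e": 50.0}, weights)  # 60 vs 50
```

    Note the catch mentioned above: the router can't declare a winner the moment the first reply arrives, because a later reply on a lightly weighted WAN could still beat it.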

    Cons:

    1. There will be a short delay (tens of milliseconds?) added to the start of each new session.

    2. Sessions would have to be well programmed with memory management in mind for scalability in large traffic environments.

    3. You must choose a link after a certain amount of time (say, the average time of the slowest 5% of successfully returned pings) because some IPs will disregard pings.

    4. A rolling list of IPs and the WAN that was used to connect to them would need to be created, and any IP entry not used within a certain amount of time would need to be deleted.
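    Cons 3 and 4 are mostly bookkeeping, and a rough sketch might look like the following.  Everything here is hypothetical: the names, the 5% cutoff as a parameter, the idle timeout, and the choice to expire entries lazily on lookup plus an occasional purge.  The clock is passed in so the expiry logic can be tried without actually waiting.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


def probe_deadline_ms(recent_rtts_ms: List[float], tail_percent: int = 5) -> float:
    """Average RTT of the slowest `tail_percent` percent of answered pings.

    Used as the cutoff after which the router stops waiting for replies.
    """
    if not recent_rtts_ms:
        raise ValueError("need at least one answered ping")
    ordered = sorted(recent_rtts_ms)
    # Integer math; always keep at least one sample in the tail.
    tail_len = max(1, len(ordered) * tail_percent // 100)
    tail = ordered[-tail_len:]
    return sum(tail) / len(tail)


class StickyTable:
    """Rolling map of destination IP -> WAN, with idle expiry."""

    def __init__(self, idle_timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # ip -> (wan, last_used)

    def remember(self, ip: str, wan: str) -> None:
        """Record which WAN was chosen for this destination."""
        self._entries[ip] = (wan, self._clock())

    def lookup(self, ip: str) -> Optional[str]:
        """Return the sticky WAN for ip, or None if unknown or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        wan, last_used = entry
        now = self._clock()
        if now - last_used > self._idle_timeout_s:
            del self._entries[ip]  # idle too long: forget it, probe again
            return None
        self._entries[ip] = (wan, now)  # using an entry refreshes it
        return wan

    def purge(self) -> int:
        """Drop every idle entry; returns how many were removed."""
        now = self._clock()
        stale = [ip for ip, (_, last) in self._entries.items()
                 if now - last > self._idle_timeout_s]
        for ip in stale:
            del self._entries[ip]
        return len(stale)
```

    Memory use then scales with the number of distinct destinations seen within the idle timeout, which is the knob to turn for large traffic environments.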

    Tell me what you think,
    –AkkerKid

