Load balancer with failover, not quite right.



  • I have four Dell boxes set up: two as load balancers and two as web servers.  Between the four, I'm using five public IP addresses.  One of the IPs is set up as a virtual IP: on port 443 it shows the interface of whichever load balancer is currently master, and on port 80 it switches between the web servers' HTTP interfaces.  Two more show the interfaces of the individual load balancers, and the last two reach the web servers directly, mostly for SSH.

    In order to get everything working, I had to set up four CARP interfaces.

    • First, 10.23.0.26 is set as an alias for both LBs.  Accessing it from a browser brings up the load balancer interface or the website, depending on which port is used.

    • Second, 192.168.1.3 is set as the gateway IP for each web server, with each load balancer using the other box as its failover.

    • The third and fourth assign real IP addresses that point to the private IPs held by the two web servers.

    My concern is that each set I created was forced to have its own new VHID, and I wonder if that's the source of my current problems.
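    As far as I understand it, CARP derives the virtual MAC address for each virtual IP from its VHID (00:00:5e:00:01 followed by the VHID), so separate VHIDs for separate virtual IPs on the same segment would be expected rather than a fault.  A rough sketch of that mapping follows; the VHID numbers and the labels for the two web server addresses are placeholders I made up, not values from my setup.

```python
# Toy illustration only: the VHID numbers below are placeholders, and the
# two web server entries stand in for public IPs not listed in this post.
def carp_virtual_mac(vhid: int) -> str:
    """Return the virtual MAC that CARP associates with a VHID."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be in 1..255")
    return f"00:00:5e:00:01:{vhid:02x}"


carp_groups = {
    "public virtual IP (10.23.0.26)": 1,
    "LAN gateway (192.168.1.3)": 2,
    "public IP for web server 1 (placeholder)": 3,
    "public IP for web server 2 (placeholder)": 4,
}

# One distinct MAC per VHID, so the four groups do not collide.
macs = {name: carp_virtual_mac(vhid) for name, vhid in carp_groups.items()}
for name, mac in macs.items():
    print(f"{mac}  {name}")
```

    If that is right, the number of VHIDs is probably not the problem by itself; what matters is whether the groups fail over together.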
    To test the failover, I've used the following scenarios:

    • If the link between the WAN switch and the first load balancer fails (cut cable), the second load balancer takes over the public bridge as well as the IP forwarding for both web servers, but the web servers' gateway does not fail over.  I can browse to the second web server but not the first.  Likewise in the other direction, the second web server can browse out but the first cannot.

    • If the link between the WAN switch and the second load balancer fails, the first load balancer retains control, but I lose access to the second web server.

    This is all fine, as the intent is for the public to be able to reach a website regardless of individual failure.
    The situation changes if there is a failure on the LAN side, between the load balancers and the switch.

    • If I lose the link between LB2 and the LAN switch, no failover takes place: I lose access to WS2 but keep access to the first web server.  However, if I lose the link between LB1 and the LAN switch, I somehow lose access to both web servers, and neither web server can reach the outside.  In this situation, instead of failing over and making the second load balancer master, I simply get a blue I that says INIT.

    • If I lose the heartbeat between the two, nothing happens.  Everyone can still access everything.

    It seems as though the LAN gateway is not acting as it should.  Instead of every system routing through whichever load balancer is currently master, web server 1 always seems to use load balancer 1 as its gateway and web server 2 always seems to use load balancer 2, so whichever load balancer fails, we lose access to the associated web server.  And regardless of what failed, both web servers always try to send return traffic through load balancer 1, which doesn't do much good.
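    To make the first WAN scenario concrete, here is a toy model of how I picture it, assuming each CARP group elects its master independently per interface.  It is not real CARP: the group names, skew values, and the `group_failover` switch are all made up for illustration.  My understanding is that on the BSDs the net.inet.carp.preempt sysctl is what makes a host demote itself on all CARP interfaces when one link goes down, but I have not confirmed how my boxes are set.

```python
from dataclasses import dataclass, field

DEMOTED_SKEW = 240  # skew given to a demoted host in this toy model


@dataclass
class Balancer:
    name: str
    base_skew: int  # lower value wins the election
    links: dict = field(default_factory=lambda: {"wan": True, "lan": True})


def elect(groups, balancers, group_failover):
    """Return {group: {balancer name: state}} for every CARP group."""
    result = {}
    for group, iface in groups.items():
        candidates, states = {}, {}
        for lb in balancers:
            if not lb.links[iface]:
                states[lb.name] = "INIT"  # no link, cannot take part
                continue
            skew = lb.base_skew
            # With group failover, any dead link demotes the whole host.
            if group_failover and not all(lb.links.values()):
                skew = DEMOTED_SKEW
            candidates[lb.name] = skew
        master = min(candidates, key=candidates.get) if candidates else None
        for name in candidates:
            states[name] = "MASTER" if name == master else "BACKUP"
        result[group] = states
    return result


# Hypothetical names for the four CARP groups and the side they live on.
groups = {
    "public-vip": "wan",
    "lan-gateway": "lan",
    "ws1-public": "wan",
    "ws2-public": "wan",
}
lb1 = Balancer("LB1", base_skew=0)
lb2 = Balancer("LB2", base_skew=100)
lb1.links["wan"] = False  # cut cable between the WAN switch and LB1

independent = elect(groups, [lb1, lb2], group_failover=False)
grouped = elect(groups, [lb1, lb2], group_failover=True)
for label, outcome in (("independent", independent), ("grouped", grouped)):
    for group, states in outcome.items():
        print(f"{label:12} {group:12} {states}")
```

    In the independent case the WAN-side groups move to LB2 while the LAN gateway stays on LB1, which matches the split I see when I cut the WAN cable.  It does not explain the LAN-side case where LB1 shows INIT and LB2 never takes over.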

