Netgate Discussion Forum

    Failover support added for Load balancing in latest snapshot

    Routing and Multi WAN
      sullrich

      Add static routes for the DNS servers forcing the traffic out the correct interfaces.
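The idea above can be sketched as a small helper that pairs each DNS server with the gateway of the WAN it should be reached through and prints FreeBSD-style `route add -host` commands. The addresses and the `DNS_TO_GATEWAY` mapping are placeholders invented for illustration, and on pfSense such routes would normally be entered through the web interface rather than typed by hand.

```python
# Hypothetical mapping of each ISP's DNS server to the gateway of the WAN it belongs to.
# All addresses are documentation placeholders, not values from this thread.
DNS_TO_GATEWAY = {
    "192.0.2.53": "192.0.2.1",
    "198.51.100.53": "198.51.100.1",
}


def static_route_commands(dns_to_gateway):
    """Build one FreeBSD-style host route command per DNS server."""
    return [
        f"route add -host {dns} {gateway}"
        for dns, gateway in sorted(dns_to_gateway.items())
    ]


for command in static_route_commands(DNS_TO_GATEWAY):
    print(command)
```

Pinning each DNS server to one WAN means a lookup always leaves through the interface of the ISP that runs that server.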

        Rockyboa

        Hi all,

        I've been trying this new feature. I have two WANs, and one of them is billed at a very costly per-Mb rate. If my top gateway becomes available again, will it switch back after a failover?

        Also, I was wondering why the gateway in my routing table always stays set to the top one in my pool when I look at my routes.

        Martin

          sullrich

          Yes, it will switch back. I'm not sure what you are asking about the route table, but multi-WAN traffic is not routed via the regular routing table. It is handled by PF itself.

            Veni

            Great function  :D. I have a 100 Mbit line at home connected to the city's MAN and an ADSL line as a secondary link.
            It's great to have something that automates the switch between the WANs if the primary line goes down, instead of manually reconnecting cables as I do today.

            @databeestje:

            If you do have 2 wan, go to Services -> Load Balancer, Create a new pool, type gateway, add the interfaces and monitor IPs, Save and apply.
            Then go to Firewall -> Rules -> Lan and edit the Lan->Any rule, change the gateway from default to your just created pool.

            Good Luck.

            My problem appears at

            add the interfaces

            because only one NIC is in the list, the NIC named "WAN".
            I have my secondary ISP on the OPT1 NIC, but I cannot choose it.

            Both ISPs issue IP addresses via DHCP. The ADSL unit is a modem with 4 switch ports.
            The 100 Mbit MAN line is a simple Ethernet twisted pair cable.

            The computer running pfSense has 1 onboard 3Com and 2 3Com 3C905 PCI cards.

            How do I tell the failover function that the OPT1 NIC is a WAN NIC so that it gets in the list named "Interface Name" @ load_balancer_pool_edit.php page?

              hoba

              Only NICs that have a gateway assigned will be listed in the selection. I guess your OPT1 WAN is not connected and/or has no DHCP lease yet. Make sure it has an IP and gateway assigned first, then revisit the pool creation screen.

                Veni

                Thanks, that did the trick  ;).

                Is there a way of controlling the

                ping interval,
                ping reply timeout,
                how many ping timeouts are needed before it fails over,
                and how many successful pings on the primary ISP are needed before it fails back?

                If at this time it is not possible to manually control the above values,
                is there a way to find out what the values are today, even if they are hardcoded?

                  sullrich

                  @Veni:

                  Thanks, that did the trick  ;).

                  Is there a way of controlling the

                  ping interval,
                  ping reply timeout,
                  how many ping timeouts are needed before it fails over,
                  and how many successful pings on the primary ISP are needed before it fails back?

                  Not currently.

                  @Veni:

                  If at this time it is not possible to manually control the above values,
                  is there a way to find out what the values are today, even if they are hardcoded?

                  1 second timeout, one ping every 5 seconds.  Newer snapshots have been changed to a ping interval of 3 seconds and a timeout of 2 seconds.
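As a rough way to reason about those numbers, here is a minimal sketch of the worst-case time a ping monitor needs to notice a dead link. The number of missed pings required before failover is not stated in this thread, so `failures_required` is an assumption (defaulting to 1), and the function name is made up for illustration.

```python
def worst_case_detection_seconds(interval, timeout, failures_required=1):
    """Upper bound on the time between a link dying and the monitor noticing.

    Assumes the link dies right after a ping was answered, so the next
    `failures_required` pings (sent `interval` seconds apart) all go
    unanswered, and the last one is only declared lost after `timeout` seconds.
    """
    return failures_required * interval + timeout


# Values quoted in the thread: (interval, timeout) in seconds.
for label, interval, timeout in [("older", 5, 1), ("newer", 3, 2)]:
    print(label, worst_case_detection_seconds(interval, timeout))
```

Under this simplified model a shorter interval lowers the bound, while a longer timeout raises it slightly in exchange for tolerating slower replies.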

                    Veni

                    Thanks.
                    That was the fastest response on a web-based forum I have seen :).

                      Veni

                      It's alive ;D.

                      Failover took about 5 seconds at most, and I could browse the web and check my IP address to be sure which ISP I was using.
                      Failback was the same, only a couple of seconds. Thanks everybody :D.

                      A question about port forwarding and failover:
                      When creating a rule under Firewall/NAT/Port Forward, the first parameter is Interface.
                      Is there a way to choose my load balancer pool named "Failover" as the interface parameter,
                      or do I have to clone every PF rule so that it also applies to the OPT1 interface?

                        hoba

                        You have to add separate rules/forwards for each interface.
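To keep the per-interface copies consistent, one option is to describe a forward once and derive a copy for each WAN interface. This is only a sketch of that bookkeeping: the `PortForward` fields, the target address, and the helper name are hypothetical and do not correspond to pfSense's configuration format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PortForward:
    """Illustrative port forward entry (field names are made up)."""
    interface: str
    protocol: str
    external_port: int
    target_ip: str
    target_port: int


def clone_for_interfaces(rule, interfaces):
    """Return one copy of `rule` per interface, identical except for the interface."""
    return [replace(rule, interface=name) for name in interfaces]


web = PortForward("WAN", "tcp", 80, "192.168.1.10", 80)
for forward in clone_for_interfaces(web, ["WAN", "OPT1"]):
    print(forward)
```

Each printed entry would then be entered as its own port forward, one for WAN and one for OPT1.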

                          dscott98

                          I added static routes for my DNS servers and even tried using the OpenDNS servers, but I still can't get DNS to work properly. I can ping outside my network by IP address, but not by domain name.

                            Veni

                            I had a very similar problem. It started a couple of minutes after a reboot and did not affect clients
                            on the network that use the pfSense computer as their DNS server, but pfSense's own Internet DNS lookups (not local static mappings)
                            stopped working. Squid was unable to resolve names, ping from the pfSense console was unable to resolve names, and the Packages tab in the web
                            GUI was unable to resolve names.

                            Hoba posted a response to my issue, and the problem has not shown itself again since.
                            The only thing I still cannot understand is why the problem appeared while I was running on the primary WAN link,
                            and only after a couple of minutes. No failure was ever recorded (nor did I notice one) on the primary WAN link.
                            But still, Hoba's response solved my problem.

                            http://forum.pfsense.org/index.php/topic,3467.0.html

                              nexusone

                              I got this working. Sort of.

                              It's buggy though.

                              Set it all up; LB status shows both links up, and interface status shows both links up. Disconnect WAN1 and it takes close to 5 minutes for it to fail over. While the interface status instantly shows the connection down, the load balancer status takes forever to update.

                              being mindful of the state table i test against a different destination and eventually traffic begins to cross WAN2.

                              Reconnect WAN1, this took 10 minutes for the lb status to show that this connection was back. again the interface status showed it instantly. Traffic never switches back to WAN1. By never I mean I waited for more than 90 minutes. I cleared the state tables etc. The route table shows the WAN1 gw as the default. But all traffic still passes the WAN2 interface.

                              Even if I change the gateway on my outbound rule to explicitly specify only the gw of WAN1 all the traffic passes WAN2. Yes I waited for the rules to build. Yes I flushed the states. Yes both interfaces are up. :)

                              The way the load balancer updates the interface status seems to be screwy. In fact, at times it won't update the interface status of all my pools the same way. See the attached image for an example. Explain that one. :)

                              Running 2-09 snapshot.

                              rebooting restores traffic to wan1. rinse and repeat.

                              suggestions?

                              oh…monitor ips are the farside of both connections on the isp networks.

                              ScreenHunter_2.jpg

                                nexusone

                                Follow-up: I've added static routes for the IPs I'm monitoring on each interface. It made zero difference.

                                  databeestje

                                  @Sn3ak:

                                  Firstly, Let me say great job guys. keep up the good work.

                                  Can someone get an updated/easier howto posted? I think this would help adoption.
                                  I have looked at two different articles, one from the wiki, and one from somewhere
                                  else on the site.  They are slightly different, and that makes things even more confusing
                                  for someone who hasn't done this before.

                                  That being said, I seem to have gotten mine to work well with three WANs. I do have a problem
                                  that has caused me to turn off the load balancing. As soon as I create a firewall rule setting the
                                  default route to the load balancer, I can't access my IPSEC clients.

                                  I have tried to create different rules, etc to get traffic to pass over the IPSEC, but have failed.

                                  I am the IPSEC Host, the rest of the clients are all mobile. I was looking for a way to set IPSEC
                                  to use the default gateway, or force it to one lan, but can't seem to find a way to do so.

                                  I tried creating the following LAN rule, figuring ipsec could communicate to my network, but my
                                  network couldn't communicate back. the ip 111 used below would be the original default gateway
                                  ip.
                                   
                                  *  LAN net  *  192.168.2.0/24  *  111.111.111.111 Default LAN -> IPSEC

                                  Help, please :)

                                  This is confirmed. I have a patch for this, and I will commit it soon. It should show up in our Valentine's release.

                                  You could also create a rule from the LAN subnet to the VPN subnet above the load balancer rule to negate this effect.
                                  We now handle this in the background. I only recently stumbled upon this.
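The workaround relies on rules being evaluated top to bottom, with the first match deciding which gateway is used. Below is a minimal sketch of that ordering effect, not of pfSense's actual rule engine. The `Rule` class, the pool name, and the 192.168.1.0/24 LAN subnet are assumptions for illustration; 192.168.2.0/24 is the VPN subnet from the quoted post.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    """Illustrative firewall rule: source/destination are CIDRs or "any"."""
    source: str
    destination: str
    gateway: str


def matches(cidr, address):
    """True if `address` falls inside `cidr`, or if `cidr` is the wildcard "any"."""
    return cidr == "any" or ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


def pick_gateway(rules, src, dst):
    """Return the gateway of the first rule that matches, scanning top to bottom."""
    for rule in rules:
        if matches(rule.source, src) and matches(rule.destination, dst):
            return rule.gateway
    return None


rules = [
    Rule("192.168.1.0/24", "192.168.2.0/24", "default"),  # LAN -> VPN subnet, placed first
    Rule("192.168.1.0/24", "any", "LoadBalancerPool"),    # everything else -> pool
]

print(pick_gateway(rules, "192.168.1.20", "192.168.2.5"))   # VPN-bound traffic
print(pick_gateway(rules, "192.168.1.20", "203.0.113.9"))   # Internet-bound traffic
```

If the two rules were swapped, the catch-all pool rule would match first and VPN-bound traffic would be sent to the load balancer, which is the symptom described in the quoted post.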

                                    databeestje

                                    @nexusone:

                                    Reconnect WAN1, this took 10 minutes for the lb status to show that this connection was back. again the interface status showed it instantly. Traffic never switches back to WAN1. By never I mean I waited for more than 90 minutes. I cleared the state tables etc. The route table shows the WAN1 gw as the default. But all traffic still passes the WAN2 interface.

                                    Even if I change the gateway on my outbound rule to explicitly specify only the gw of WAN1 all the traffic passes WAN2. Yes I waited for the rules to build. Yes I flushed the states. Yes both interfaces are up. :)

                                    Are both your wan interfaces DHCP perhaps?

                                    I do not know what sort of hardware you have, but in my home setup (with a secondary wireless link) it takes about 45 seconds for the rules to be generated. This is on an Eden 933 with 256 MB RAM. I am running from a CF card, which slows the process down quite a bit.

                                    Also keep in mind that upstream routers commonly implement an ICMP rate limit, which might affect the load balancer's gateway detection.

                                    On the command page you can execute the following command to see if it has regenerated the correct routes.

                                    grep round /tmp/rules.debug
                                    

                                    This should output all the filter rules that use the load balancer pools. You should check if these are correct.

                                    We will be implementing a few more fixes to check for down interfaces in the future as well.
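When checking the output of the grep command above, it can help to pull the interface and gateway pairs out of each matching rule. The sketch below assumes the rules use pf's `route-to { ... } round-robin` form; the sample line is invented for illustration and is not real output from /tmp/rules.debug, and the function name is hypothetical.

```python
import re

# Matches the gateway list of a pf "route-to { ... } round-robin" clause.
ROUTE_TO = re.compile(r"route-to\s*\{([^}]*)\}\s*round-robin")
# Matches one "( interface gateway-ip )" entry inside that list.
HOP = re.compile(r"\(\s*(\S+)\s+([0-9.]+)\s*\)")


def pool_members(rule_line):
    """Return (interface, gateway) pairs from one rule line, or [] if it has no pool."""
    match = ROUTE_TO.search(rule_line)
    if match is None:
        return []
    return HOP.findall(match.group(1))


# Invented example line; interface names and addresses are placeholders.
SAMPLE = (
    "pass in quick on em0 route-to { ( em1 192.0.2.1 ) , ( em2 198.51.100.1 ) } "
    "round-robin from 192.168.1.0/24 to any keep state"
)

for interface, gateway in pool_members(SAMPLE):
    print(interface, gateway)
```

Comparing the extracted pairs against the gateways you expect in the pool is a quick way to see whether the rules were regenerated after a link change.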

                                      nexusone

                                      Neither of my interfaces is DHCP. Static addresses on both. Normally I can ping either of my monitor IPs until I'm blue in the face without any ICMP rate-limiting complications. Hardware is a Dell PowerEdge 860 with 2 Intel gig-e ports and 2 Broadcom gig-e ports. Hardware has been great. The WAN ports are both served by the Broadcom ports.

                                      On the interface status page I see link state changes reflected immediately. The load balancer status page lags anywhere between a few minutes and forever in reflecting these changes. I've checked the routes like you suggested, and it does not appear that the route is being updated when the interface status changes, which in turn affects the load balancer's ability to ping the monitor address and control the pool.

                                      I don't have anything particularly tricky in my config.

                                      2 WANS, both static address connections.
                                      Both are properly configured as both do work with some coaxing.
                                      1 LAN, 1 DMZ (dmz presently not used)
                                      No complicated nat or port forwards or anything.
                                      only a single rule to allow all traffic from lan to pass to *.

                                      Monitor IPs are good. I've checked and double-checked. They are both on the far side of my WAN links and have working static routes that control which interface is used, to avoid "false positives" and so accurately reflect the connection status as up or down.

                                      Any help or suggestions are appreciated. Really need to get this sorted out.

                                        regis

                                        hi all

                                        i'm using 1.0.1-SNAPSHOT-02-14-2007 and trying to use the load balancing feature

                                        my setup is as follows :

                                        LAN 192.168.1.254/24
                                        WAN PPPoE adsl with dynamic IP
                                        WAN2 DHCP from wireless network (Alvarion 5.4 Ghz) static IP

                                        i created a pool (gateway, load balancing) and a new rule for traffic from LAN with my pool as gateway

                                        both WANs are marked online in status/load

                                        but when i try to access websites, i have to click twice to load the page, so i think that only one WAN is working

                                        my system log is filling with these messages :

                                        kernel: arplookup 80.8.244.1 failed: host is not on local network
                                        kernel: arpresolve: can't allocate route for 80.8.244.1

                                        this IP address is the gateway assigned by my first ISP (i have an ip address with a /32 on my WAN)

                                        is something misconfigured, or does a workaround exist for this problem?

                                        thanks

                                          sai

                                          @regis:

                                          hi all

                                          i'm using 1.0.1-SNAPSHOT-02-14-2007 and trying to use the load balancing feature

                                          my setup is as follows :

                                          LAN 192.168.1.254/24
                                          WAN PPPoE adsl with dynamic IP
                                          WAN2 DHCP from wireless network (Alvarion 5.4 Ghz) static IP

                                          When the IP address changes on your WANs, the load balancer will fail. You need static IP addresses.

                                          A workaround is to put a simple router between your pfSense box and the modems.

                                            databeestje

                                            DHCP WANs are supported for balancing, but this requires the DHCP address to remain the same,
                                            e.g. a statically assigned DHCP address or a statically assigned PPPoE address.

                                            This is a product limitation. I am not planning to fix this for now.

                                            Cheers,

                                            Seth

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.