Dual WAN with Failover Not Working



  • Hello,

    We are testing a pfSense appliance (Netgate 3-LAN variety hardware) running pfSense 2.0.3 Release 4G x86 NanoBSD. My configuration (for testing purposes) is as follows:

    LAN: IP=192.168.0.254/24 [DHCP Range .100-.200]
    WAN: IP=192.168.5.9/24 Default Gateway=192.168.5.254
    OPT WAN: IP= 192.168.250.9/24 Default Gateway=192.168.250.254
    (NOT blocking private networks or bogon networks.)

    I have 2 DNS servers set up in SYSTEM>GENERAL SETUP; one assigned to WAN, and the other assigned to OPT WAN.

    In SYSTEM>ROUTING GATEWAYS, the 192.168.5.254 gateway monitors 8.8.4.4 and the 192.168.250.254 gateway monitors 8.8.8.8

    In SYSTEM>ROUTING GROUPS, I have a group called "FAIL_OVER_TEST" with WANGW (WAN) as Tier 1, and WAN2GW (OPT WAN) as Tier 2; trigger level = Packet Loss

    In FIREWALL>RULES LAN I have the default Anti-Lockout rule on top, and the default Allow LAN to ANY rule below that; HOWEVER, I modified that rule and pointed it to my Gateway Group "FAIL_OVER_TEST".

    The pfSense will failover to the OPT WAN when I physically disconnect the WAN ethernet cable, and it will re-connect to the WAN when that cable is reconnected (no problems there).

    Problem is, when I "disable" the Internet on the WAN port, leaving it physically connected to the port on the pfSense, the STATUS>GATEWAYS GATEWAYS page still shows that WAN is online and pinging, even though the PC can't ping 8.8.4.4, and the PING tool in pfSense can't ping 8.8.4.4 either!

    (How I "disable" the Internet on the WAN interface is by creating a firewall rule on the router that feeds the pfSense for 192.168.5.9, blocking all traffic.)

    Please help! I am fairly new to pfSense… not sure if I'm missing something in the config or not? To me, it seems like the software may have a flaw and that it never recognizes when it can't actually ping out anymore, unless the cable is physically removed from the port.

    Thanks,

    Steve



  • This is because you are manually setting the monitor IP address. Unless you are unable to ping the default gateway IP address, don't set the monitor IP. If you are setting the monitor IP address to 8.8.8.8 and 8.8.4.4 and also using them for DNS servers, then make sure you also set the appropriate gateway for each DNS server. You'll need to reboot pfSense after that so that it updates its static routes.



  • I am not using 8.8.8.8 or 8.8.4.4 for my DNS (I've actually used Comcast's DNS 1 for WAN 75.75.75.75 and Comcast's DNS 2 for OPT WAN 75.75.76.76).

    Should the monitor IP addresses for each WAN be blank then? Or should they be the respective default gateway IPs??

    Thank you!!



  • Leave them blank.



  • That was a good idea. It didn't work though. By leaving the monitor IP blank, pfSense uses the default gateway as the monitor then and the default gateway is always ping-able even when the Internet is down.

    For instance, our static IP is 173.xx.xx.xx3 and our default gateway is 173.xx.xx.xx4 (the .xx4 address "belongs" to the Comcast gateway). I can ping .xx4 and unplug the coax cable to the Comcast gateway and .xx4 STILL pings. That's why I was trying to use a monitor IP that was external to my network.

    Now, the weird thing is that even monitoring 8.8.8.8, when I take WAN 1 down, the PING diag tool in pfSense will show 100% loss when pinging 8.8.8.8 (or any Internet IP for that matter); however, in STATUS>GATEWAYS it shows the WAN as still pinging/online?????????

    And I did set up an explicit (STATIC ROUTE) for 8.8.8.8/32 through the 192.168.5.254 WAN gateway.

    ???



  • That happens when both WANs have the same gateway address like if they're from the same ISP. Use 8.8.8.8 as monitor and DNS address for WAN1 and 8.8.4.4 for monitor and DNS address for WAN2. pfSense should automatically create the appropriate interface based static routes. You'll need to reboot to clear it up.



  • I have yet to figure out…  Why won't pfsense let you configure a single monitor IP for multiple WANs as gateway monitor?



  • Ok, tried that too and it didn't work. It shouldn't be a DNS issue because we're trying to ping an IP address so there is no name for DNS to resolve.

    Flaw in the software??



  • pfSense itself won't use the failover unless you've enabled the gateway switching checkbox.



  • No, mine works fine with multiple WANs with the same gateway. In my case the gateway doesn't respond to pings so I'm using 8.8.8.8 and 8.8.4.4. Setting them as the DNS servers for each WAN causes pfSense to create static routes forcing them through a particular logical interface.
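
To illustrate why a per-host static route pins a monitor IP to one WAN, here is a minimal sketch of longest-prefix-match route selection. The route table is hypothetical, modelled on the lab addresses in this thread (8.8.8.8 via the WAN gateway, 8.8.4.4 via the OPT WAN gateway); it is not pfSense's actual routing code, and `next_hop` is an illustrative name.

```python
import ipaddress

# Hypothetical routing table modelled on the lab addresses in this thread.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.5.254"),     # default route via WAN
    (ipaddress.ip_network("8.8.8.8/32"), "192.168.5.254"),    # monitor/DNS host pinned to WAN
    (ipaddress.ip_network("8.8.4.4/32"), "192.168.250.254"),  # monitor/DNS host pinned to OPT WAN
]


def next_hop(destination: str) -> str:
    """Return the gateway of the most specific route containing the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in ROUTES if addr in net]
    # The longest prefix wins, so a /32 host route beats the default route.
    _, gateway = max(matches, key=lambda item: item[0].prefixlen)
    return gateway


if __name__ == "__main__":
    for host in ("8.8.8.8", "8.8.4.4", "75.75.75.75"):
        print(host, "->", next_hop(host))
```

With routes like these, pings to 8.8.4.4 always leave through the OPT WAN gateway even while the default route points at WAN, so each monitor tests only its own link.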



  • Where is the gateway switching check box?



  • It's somewhere in the general settings. That checkbox is only for pfSense's traffic itself to failover. Conditional routing for LAN clients will still failover regardless. This is why you must specify a gateway for each DNS server. That way DNS forwarding works even if gateway switching is disabled.



  • Here is my routing table and gateway status.

    Notice that in STATUS>GATEWAYS it is still showing that it's pinging on WANGW (WAN 1) even though Internet traffic is disabled for that gateway.

    ![Routing Table.JPG](/public/imported_attachments/1/Routing Table.JPG)
    ![Gateway Status.jpg](/public/imported_attachments/1/Gateway Status.jpg)



  • Here is Firewall Rule for LAN, Gateway Groups, and Gateways.

    ![Firewall Rule LAN.jpg](/public/imported_attachments/1/Firewall Rule LAN.jpg)
    ![Gateway Groups.jpg](/public/imported_attachments/1/Gateway Groups.jpg)



  • Here is my General Setup…

    Also, I cannot find the checkbox for gateway switching.

    Hopefully these screenshots help; please let me know if another shot would help.

    THANKS!

    ![General Setup.jpg](/public/imported_attachments/1/General Setup.jpg)



  • Also, just for informational purposes… here is a Visio diagram of the setup I have for testing purposes of dual WAN with failover. I have to do this in a lab environment to prove the concept before I can do this for a client and have their site taken down.

    On "Router 1" in the diagram, I can physically disconnect that ethernet cable to the pfSense WAN and the pfSense WILL failover to the OPT WAN (WAN2); however, as noted in the diagram, when I create a firewall rule on "Router 1" to block any/all Internet traffic the pfSense does not see this... it thinks it can still ping its monitor IP even though the client PC and the pfSense ping tool cannot ping the monitor IP. In essence, the pfSense is failing over for physical loss but NOT packet loss.

    Hope this may add some clarity too.

    ![Dual WAN Test.jpg](/public/imported_attachments/1/Dual WAN Test.jpg)



  • Anyone??

    I'd like to get the pfSense working… just for proof of concept, I tried the exact same network schema setup with a Cisco RV042 Dual WAN router and it worked beautifully with about 10 min of setup.

    Please help.

    Thanks,

    Steve



  • @rober1sf:

    Here is my General Setup…

    Also, I cannot find the checkbox for gateway switching.

    Hopefully these screenshots help; please let me know if another shot would help.

    THANKS!

    That's because you're looking at the wrong menu. It's under System -> Advanced -> Misc, "Allow default gateway switching".
    Give this thread a read  http://forum.pfsense.org/index.php/topic,64612.msg350227.html#msg350227
    I've explained fail-over clearly in there. Good Luck!



  • I found gateway switching but checking that box hasn't made this work either.

    Also, srk3461, I think I've done what your other article says to do… What am I doing wrong? Do I need to have all 3 groups and LAN rules even if I'm NOT load balancing?



  • First make it work with a single pfSense box with two public IP WAN interfaces so you understand exactly how to configure pfSense. Then build your (unnecessarily) complex network around it.



  • Nothing works better for fail-over than two of the same ISP…



  • @kejianshi:

    Nothing works better for fail-over than two of the same ISP…

    …from the same modem.



  • :( Ok. I understand that one doesn't want dual WAN from the same ISP/modem in a production environment but I have to TEST this in a lab environment before I can put it in a production setting where it doesn't work! This "unnecessarily complicated" network as you say is essentially the exact same setup as having 2 modems from 2 ISPs.

    Why does the pfSense NOT fail over to WAN2 when the Internet traffic on WAN1 goes out? And for proof of concept, I used a Cisco RV042 in the same "unnecessarily complicated" network and the Cisco worked.

    I think there is a software flaw with pfSense and the dual WAN failover. You guys using it say it works, but are you dropping the physical link to make it fail? Because that does work. Dropping Internet packets does not work though!!!!!

    I'm begging for help from you guys, so please help if you can; I don't need the sarcasm or rude comments.



  • So, you are saying failover from packet loss doesn't work?

    How are you simulating packetloss?



  • Failover based on a packet loss threshold does work. It works by default.
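
As a rough sketch of what tiered failover on a packet loss trigger means, the following picks the lowest-tier gateway whose measured loss is under a threshold. The `Gateway` class, `pick_gateway` function, and the 20% threshold are assumptions for illustration only; the gateway names come from the FAIL_OVER_TEST group described earlier in the thread, and this is not pfSense's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Gateway:
    name: str
    tier: int            # lower tier number = preferred gateway
    loss_percent: float  # loss measured against this gateway's monitor IP


def pick_gateway(gateways: Iterable[Gateway], loss_threshold: float = 20.0) -> Optional[str]:
    """Return the name of the preferred healthy gateway, or None if all are down.

    loss_threshold is a hypothetical value chosen for this example.
    """
    healthy = [g for g in gateways if g.loss_percent < loss_threshold]
    if not healthy:
        return None
    return min(healthy, key=lambda g: g.tier).name


if __name__ == "__main__":
    group = [Gateway("WANGW", 1, 100.0), Gateway("WAN2GW", 2, 0.0)]
    print(pick_gateway(group))  # Tier 1 is over the threshold, so Tier 2 is chosen
```

The point is that the decision depends entirely on the loss figure the monitor reports: if the monitor still sees replies, Tier 1 stays selected no matter what a separate ping shows.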



  • If failover isn't working based on simulation of packet loss then:

    Either it's broken or

    The settings are wrong or

    Packet loss is not being done effectively to cause a failover.

    That's why I'm asking how the packets are being dropped.



  • On router 1 in my diagram above, I have an outbound firewall rule that blocks all outbound Internet traffic, thus creating Internet packet loss on WAN 1. I know the packet loss is happening too because the pfSense diag ping tool will have 100% loss, but gateway status will still show it pinging.

    In a "real" production setting, I would test this by removing the coax cable from the cable modem because simply unplugging the power to the modem would be link state down, which doesn't happen when the Internet goes down.



  • Are you sure those packets you are blocking are being dropped silently and not rejected with a reply?

    REJECT
        Prohibit a packet from passing. Send an ICMP destination-unreachable back to the source host [unless the icmp would not normally be permitted, eg. if it is to/from the broadcast address].
    DROP (aka DENY, BLACKHOLE)
        Prohibit a packet from passing. Send no response.
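
The difference between the two actions can be sketched from the prober's side. This is a minimal example that assumes a monitor counts only an echo reply as success, so both a silent DROP (timeout) and a REJECT (ICMP destination-unreachable) register as loss. Whether pfSense's monitor treats them the same way is not established in this thread; `Outcome` and `loss_percent` are hypothetical names.

```python
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    REPLY = "echo reply received"
    UNREACHABLE = "ICMP destination-unreachable received (REJECT)"
    TIMEOUT = "no response at all (DROP)"


def loss_percent(outcomes: Sequence[Outcome]) -> float:
    """Percentage of probes that did not get an echo reply.

    Assumption for this sketch: anything other than REPLY counts as loss.
    """
    if not outcomes:
        return 0.0
    lost = sum(1 for outcome in outcomes if outcome is not Outcome.REPLY)
    return 100.0 * lost / len(outcomes)


if __name__ == "__main__":
    probes = [Outcome.REPLY, Outcome.TIMEOUT, Outcome.UNREACHABLE, Outcome.REPLY]
    print(f"{loss_percent(probes):.1f}% loss")
```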



  • When I ping from the workstation behind the pfSense I get response timed out, 100% loss. I believe that is correct, right?



  • Do you own a server on the net anywhere? 
    What I would do is maybe set up a couple of CentOS boxes with public IPs you can ping.
    Use those IPs as your monitor IPs.

    If you shut down the CentOS box (no blocking anything), then that will be for sure packet loss.
    You and a buddy could set up one at your home and one at his if you want to have control over two "gateway" IPs to use.

    I know this sounds like unnecessary work, and it may be, but at least you will know it's not your method of inducing packet loss that is flawed.
    I suppose you could do the same thing entirely in lab environment with no outside internet.

    I don't know if pfsense would know the difference in a packet dropped silently, rejected, or an unreachable offline server. It might.
    Since you didn't tell me if you are dropping packets silently, I assume you aren't sure.



  • @kejianshi:

    DROP (aka DENY, BLACKHOLE)
       Prohibit a packet from passing. Send no response.

    I'm doing a DENY rule in the firewall. I don't think that I need to switch to pinging 2 of my own servers on the Internet (unless you are thinking about something that I'm not) because this setup works in the exact same setup with the Cisco RV042, and the RV042 fails over.

    ![Router 1 Firewall Rule.JPG](/public/imported_attachments/1/Router 1 Firewall Rule.JPG)



  • When it works for others, but not for you, I'd suggest changing your methods or rechecking your settings. That would be a pretty big bug if it really doesn't work.



  • Ok, so I added a laptop on the same LAN as Router 1 (see attached Visio) and changed my monitor IP on WAN1 to that laptop's IP address. Physically unplugging the cable from the laptop kept the physical link on WAN1 up but caused packet loss and a failover to WAN2. Reconnecting the cable to the laptop allowed the pings to begin again, and the connection to WAN1 was then automatically restored as well.

    WORKED like it should! Still confused about why the Cisco RV042 worked with the other setup, but all I can say is that maybe the pfSense "sees" the pings differently from the Router 1 firewall block rule than the Cisco??

    At any rate, ready to try in a REAL production environment (see attached Visio for the "UNCOMPLICATED" network you all were looking for ;D ).

    Thanks to all for helping. If anyone stumbles across this and needs help, I've also created a step-by-step guide, from initial boot/configuration onward, for setting up a Dual WAN with Failover… just PM me for a copy.

    Thanks again!!

    ![Dual WAN Test 2.jpg](/public/imported_attachments/1/Dual WAN Test 2.jpg)
    ![Production Dual WAN.jpg](/public/imported_attachments/1/Production Dual WAN.jpg)



  • Yeah - I wasn't trying to waste your time. I'm glad it's working now.
    I hope your actual install goes well also.

