2.0RC2 - Can not for the life of me get simple Failover working, please help!

  • Hi all,
    I'm running 2.0-RC2 (i386) built on Thu Jun 9 20:28:39 EDT 2011.
    I'm trying to set up a simple A->B failover where if the "A" link fails or gets lossy, pfSense fails over to the "B" gateway.  I don't want any sort of round robin or load balancing, just using the "B" gateway as a standby if "A" goes down.

    Here's how I've configured things, but it's not working. When I tested this by pulling the Ethernet cable from the primary gateway's interface, pfSense correctly marked the gateway as "DOWN" after a brief time, but it did not start routing traffic over the "T1_GW" gateway:

    Help!  I've already spent many, many hours fiddling with this. Admittedly this is only my 2nd pfSense install, but I think I understand the basics well enough to set this up.

  • The "allow default gateway switching" option shouldn't be necessary for regular NAT.

    Check your NAT rules and make sure you have entries for your secondary WAN.

    Also try to work out where and why it goes wrong. Logs, pings, traceroutes and packet captures can help.

  • @heper:

    check your nat rules … make sure you have entries for your secondary wan ....

    Thanks heper, here are my NAT rules. They are on "automatic"; maybe that's the issue? Do I need to configure these manually and add rules for each gateway?

  • That all seems to be all right.

    Are you sure it is not a DNS issue?

    By default pfSense will automatically add the DNS servers of WAN1 if you have it on DHCP.
    It's possible that ISP1's DNS servers are unreachable via ISP2. Therefore, manually add DNS servers for your secondary line and/or add Google's public DNS to do some testing.

  • Yes, I'm pretty sure it's not a DNS issue. My test is just to ping an external IP address (no DNS resolution involved) that I know is pingable from both gateways.  The ping fails once WAN1 is down.  Here are my DNS settings anyway, which I think are correct:

    So, no other ideas?  Isn't this a rather simple setup?  It's kind of scary that we are at RC2 and this isn't working.

  • I've set up a similar, if not more complex, configuration with beta4/beta5/rc1/rc2.

    It should be a 5 minute job on a bad day.

    I've personally never had issues with basic failover/balancing.

    I'm guessing you need to provide the devs with more info if you wish to get this sorted.

  • Today I went back & updated to the Jun 13 build, and magically the failover started working.  ???
    So I guess this is just bugginess that is being sorted out.

    Now, my NEW PROBLEM is that after failing over to the OPT1 (T1) interface, it will not "fail back" to the original primary WAN interface.  Even after I plug the Ethernet back into em3 (WAN), the interface doesn't come back online – it's stuck in a perpetual "gathering data" state on the Dashboard.

    I logged in via SSH and attempted to ping the DNS server assigned to the WAN gateway, and I got "no buffer space available". I then ran:

    ifconfig em3

    all appeared A-OK, so I proceeded to issue:

    ifconfig em3 down

    followed by

    ifconfig em3 up

    ...which restored everything to working order.  ??? So this seems to be a bug. If it can't be fixed, I would like to know how to incorporate the ifconfig em3 down/up commands into some type of script that triggers when the physical link state of the interface changes, to work around it.

    FYI this is on a Netgate Hamakua 1U unit running the 1GB embedded (nanobsd) build.  This unit uses Intel 82574L & Intel 82562GT for its PHY interfaces.

  • I had the exact same issue a couple of weeks ago on an Alix board.  I configured failover and tested it, and after one of the tests I had to restart the primary interface to get it working again.

  • So is there any way to programmatically make pfSense automatically bring the interface down/back up when switching gateways from OFFLINE to ONLINE?  Or can this bug be fixed?  Does this only affect nanobsd builds?
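
The manual down/up fix described in this thread could be wrapped in a small shell helper, roughly as sketched below. This is only a sketch under assumptions, not a confirmed pfSense mechanism: the names `IFACE`, `DRY_RUN`, `BOUNCE_DELAY`, `run`, `link_is_active` and `bounce_iface` are hypothetical, and nothing here has been tested against the bug itself. It defaults to a dry run that only prints the `ifconfig` commands.

```shell
#!/bin/sh
# Sketch only: repeat the manual "ifconfig em3 down / up" fix from this thread.
# IFACE, DRY_RUN and BOUNCE_DELAY are hypothetical knobs, not pfSense settings.

IFACE="${IFACE:-em3}"               # WAN interface named in the thread
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands (safe default)
BOUNCE_DELAY="${BOUNCE_DELAY:-2}"   # seconds to wait between down and up

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reads `ifconfig <iface>` output on stdin; succeeds if the link has carrier.
link_is_active() {
    grep -q 'status: active'
}

# Take the interface down, wait, then bring it back up.
bounce_iface() {
    run ifconfig "$1" down
    sleep "$BOUNCE_DELAY"
    run ifconfig "$1" up
}
```

After sourcing the file as root with `DRY_RUN=0`, something like `ifconfig "$IFACE" | link_is_active && bounce_iface "$IFACE"` would bounce the interface only when the cable is plugged in. FreeBSD's devd reports interface link-state events, so it may be a place to hook this in, but whether such a hook survives on a nanobsd pfSense install, or actually clears the "gathering data" state, is an open question here.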
