Multi-WAN ATT DSL, lost one and failover didn't work

  • So I have three 3 Mb ATT DSL lines to my house, and I have a gateway group set up with all three lines as Tier 1.  I have a separate DNS IP assigned to each line via System>General Setup, and those same DNS IPs are set as the Monitor IP on the corresponding WAN interfaces.  I have a firewall rule set up: "IPv4  LAN Net    *  *    *    LoadBalance    none"  and it is placed above the other default LAN Net rules.

    Yesterday everything was working splendidly.. something today though took down Circuit 2 which was checked as 'Default Gateway' under System>Gateways.

    I was assuming that since I had a Gateway Group (LoadBalance) setup that things would roll over but they did not.

    In Googling tonight I found another option: System>Advanced>Misc> Enable default gateway switching, which was unchecked by default.  I checked it, but from reading the description I should not need it since I have a GW Group set up.

    Do I need to enable that option?  Or can my issue be resolved via FW rules?
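
    For readers following along, here is a rough sketch of how a gateway group with tiers is expected to pick its members: only the lowest tier that still has an online gateway is used, and traffic is balanced across the online members of that tier. The function name, the dict layout, and the up/down status values below are made up for illustration; this is not pfSense code.

```python
# Hypothetical model of gateway-group member selection (not pfSense source code).
# group:  gateway name -> tier number (lower tier is preferred)
# status: gateway name -> True if the monitor considers it online

def active_members(group, status):
    """Return the online gateways in the lowest tier that has any online member."""
    online = {gw: tier for gw, tier in group.items() if status.get(gw, False)}
    if not online:
        return []  # every member is down, nothing to balance across
    best_tier = min(online.values())
    return sorted(gw for gw, tier in online.items() if tier == best_tier)


# All three lines in Tier 1, as described in the post above.
LOAD_BALANCE = {"WAN": 1, "WAN2": 1, "WAN3": 1}

# Assumed status for the example: one line is marked down.
status = {"WAN": True, "WAN2": False, "WAN3": True}
print(active_members(LOAD_BALANCE, status))
```

    With every line in Tier 1, losing one line should simply shrink the pool to the remaining two, provided the traffic actually matches a rule that uses the group.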

  • Was apinger functioning as it should, and not locked to an unrealistic ping time?

  • LAYER 8 Netgate

    What traffic didn't roll over?  New states or old states? Old states can't roll.

    Did you look at the firewall logs or the states for connections you were attempting?  Was it still trying to go out the failed gateway, or were they trying to use the other ones and maybe something else was wrong?

    Was the failed gateway showing as down in Status > Gateways?

    I think you might be chasing a red herring regarding the Circuit 2 which was checked as 'Default Gateway' under System>Gateways.  More info is needed to know if your issue was at all related to that.  I have not heard of that being an issue.  My first thought is that there's something somewhere sending traffic to the default gateway instead of the LoadBalance group.

    What version of pfSense are we talking about?

  • I honestly don't know if it was old states or new states.

    I am on the latest version, 2.2, running on a Dell Optiplex 320 with four 1 Gb NICs.

    Yes the failed GW was showing down under Status-Gateways

    As for the Default GW being checked.. I am unable to blank that option for all interfaces.  Every time I uncheck that box and hit Save it comes right back.  I had imagined that the entire purpose of load balancing a set of WAN interfaces was to NOT have a Default GW.

    In the end, because I couldn't wait very long to get everyone back up, I set WAN2 as Default GW so the traffic could go out.  Also, judging by the interface graphs, both of the currently working WANs (WAN2 and WAN3) are sending and receiving roughly the same amounts of data.

    grandrivers: IDK where to check on apinger.. but I have always had a decent time difference between the external Monitor IPs.  Here's how they're currently set up:

    WAN2    100+ms
    WAN3    60-ish ms

  • Confirming so I didn't look the fool.  Unchecking Default Gateway and saving reverts the DGW back to WAN, where it had previously been on WAN2.  If I go back into WAN, uncheck the Default GW box and hit Save, it re-appears.
    I also had checked Enable default gateway switching under System>Advanced>Misc.

  • LAYER 8 Netgate

    The issue is not whether or not there's a default gateway set.  The issue is whether there are rules directing traffic to the default gateway instead of the gateway group.

  • Yes there is a LAN rule..

    ID Proto Source Port Destination Port Gateway Queue Schedule Description

    * * * LAN Address 80 * * Anti-Lockout Rule

    IPv4 * LAN net * * * LoadBalance none Load Balanced LAN > Out

    IPv4 * LAN net * * * * none Default allow LAN to any rule

    IPv6 * LAN net * * * * none Default allow LAN IPv6 to any rule

    When I edit this rule, the LoadBalance gateway group is selected under Adv features>Gateways.
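
    A small sketch of why the rule order above matters: LAN rules are evaluated top-down and the first match wins, so the LoadBalance rule has to sit above the default allow rule for IPv4 traffic to be policy-routed to the group. The `Rule` class and helper functions are hypothetical stand-ins for illustration, not anything from pfSense itself.

```python
# Hypothetical model of first-match rule evaluation (not pfSense source code).
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    family: str             # "IPv4" or "IPv6"
    source: str             # symbolic source, e.g. "LAN net"
    gateway: Optional[str]  # None means "follow the routing table (default gateway)"
    description: str


# The three LAN rules from the post, in the order they were listed.
LAN_RULES = [
    Rule("IPv4", "LAN net", "LoadBalance", "Load Balanced LAN > Out"),
    Rule("IPv4", "LAN net", None, "Default allow LAN to any rule"),
    Rule("IPv6", "LAN net", None, "Default allow LAN IPv6 to any rule"),
]


def first_match(rules: List[Rule], family: str, source: str) -> Optional[Rule]:
    """Return the first rule matching the packet, scanning top-down."""
    for rule in rules:
        if rule.family == family and rule.source == source:
            return rule
    return None


def egress(rule: Rule) -> str:
    """Name the path the matched rule sends traffic out of."""
    return rule.gateway or "default gateway"


print(egress(first_match(LAN_RULES, "IPv4", "LAN net")))
# Swapping the first two rules shows the failure mode being asked about:
swapped = [LAN_RULES[1], LAN_RULES[0], LAN_RULES[2]]
print(egress(first_match(swapped, "IPv4", "LAN net")))
```

    In the order posted, IPv4 traffic from LAN net hits the LoadBalance rule first; only in the swapped order would it fall through to the default gateway.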

  • LAYER 8 Netgate

    What's the schedule for?

  • There is no schedule on any of the three rules.  My spacing's most likely off.  Maybe this will be better:

  • Banned

    Could you attach something readable please?

  • LAYER 8 Netgate

    Looks fine.  Unplug one and see what happens.

  • Well dammit.. I unplugged WAN2, which is the default GW, and my YouTube stream didn't even hiccup.  Now after plugging it back in, its GW showed up, then down.  Bounced that DSL modem.. still shows as down.  Disabled/re-enabled that WAN2 interface.. still shows as down.  The line is up to the modem though, as indicated by my test laptop.  I'm afraid I'll have to bounce pf, even though I shouldn't need to.

  • LAYER 8 Netgate

    How long did you wait for it to recover before you started messing with it?

  • LAYER 8 Netgate

    Are the modems in bridge mode? (do your pfSense interfaces get public IPs?)

  • Crisis averted.

    But no, I'm cursed with Motorola NVG510s, so I only have IP Passthrough mode.  I've been through all the posts from previous folks with these same DSL modems and for whatever reason I am not afflicted with the same crap they were: DHCP giving out the incorrect subnet, or the wrong gateway…  Maybe I can draw a decent picture for you to better understand.  Now I do understand that IP Passthrough by design should hand off the public IP to a specific internal MAC address; this doesn't happen in my setup for whatever reason.  I can, however, successfully browse the interwebs in all its glory.

    I will admit though, prior to tonight I had the same subnet in use for the DSL modems' DHCP range as I did internally... 192.168.15.x

    I tried rebooting pf in an attempt to resolve WAN2 being down from last night (and it did come back up).  The problem was, though, that for some reason, even though the console was showing all interfaces up, I couldn't even ping the LAN side.  I solved that by changing the subnets from 15 to 10, as I saw massive scrolling text on the pf console complaining about the WAN and LAN sides having the same subnets.

    Now WAN is still down and I have a ticket open with ATT, scheduled for a tech to resolve it tomorrow... but as I sit right now I have WAN2-3 up and operational.
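
    The WAN/LAN subnet clash described above can be checked ahead of time with a few lines of Python. This is only a sketch: the /24 prefix lengths are an assumption (the post only gives 192.168.15.x and the change from 15 to 10), and `overlaps` is a helper named here for illustration.

```python
# Sketch: detect overlapping WAN-side and LAN-side subnets.
# Assumption: both networks are /24; the thread only gives "192.168.15.x".
import ipaddress


def overlaps(net_a: str, net_b: str) -> bool:
    """True if the two CIDR networks share any addresses."""
    return ipaddress.ip_network(net_a).overlaps(ipaddress.ip_network(net_b))


modem_dhcp = "192.168.15.0/24"  # range handed out by the DSL modems (assumed /24)
old_lan = "192.168.15.0/24"     # LAN before the change (assumed /24)
new_lan = "192.168.10.0/24"     # LAN after moving from 15 to 10 (assumed /24)

print(overlaps(modem_dhcp, old_lan))  # same network on both sides of the firewall
print(overlaps(modem_dhcp, new_lan))  # distinct networks after the change
```

    When both sides sit in the same network, the router has no way to tell whether an address lives on the WAN or the LAN side, which is consistent with the LAN being unreachable until the subnet was changed.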
