Failover works for a short time then stops working

  • The setup:
    WAN interface: DSL-Modem in bridged mode (dynamic ip via DHCP)
    WAN2 interface: cable-Modem in bridged mode (dynamic ip via DHCP)

    WAN is the main interface for secure traffic (ssl, ssh etc)
    WAN2 is the main interface for all other traffic
    LAN has 10.0.0.x subnet

    WAN fails over to WAN2 and vice versa.
    No load balancing.
    No sticky states.

    The problem:
    If WAN2 fails, traffic is switched to WAN, which works for a short time (1-2 minutes). After that, all new connections fail. Existing connections (Skype, OpenVPN, running pings) still work.

    One problem might be the cable modem, which starts handing out internal IPs when its connection to the ISP fails (WAN2 then gets a 192.168.100.x IP).
    Usually the WAN-IPs are quasi static and don't change over a long time.

    I read that there are some caveats with load balancing and dynamic IPs, but I'm not sure what they are.

    The question:
    Is the internal IP on WAN2 the cause of all the trouble, or did I miss something? What else could cause this behaviour?

    As a side note:
    I also tried load balancing the two connections, but this didn't work too well. About one out of ten connections failed.

  • pfSense adds static routes behind the scenes to the monitor IPs through the desired gateway. I'm not sure what happens if the gateway changes because the interface is DHCP (that is, if the interface changes from public to private IPs while the rules are still attached to the old public gateway IP). This might cause your problems.

  • You could try adding static routes.

    wan    wan_isp1  wan_gateway
    wan    wan_isp2  wan_gateway
    wan2    wan2_isp1  wan2_gateway
    wan2    wan2_isp2  wan2_gateway

    Make sure you can ping your ISP's DNS server.

    (Again beaten by the fast hoba  :P)

  • Thanks for your replies.
    The change of the gateway results in the following error messages:

    Apr 24 15:37:51 kernel: arpresolve: can't allocate route for
    Apr 24 15:37:51 kernel: arplookup failed: host is not on local network

    But the only effect I would expect from this is that the interface is not marked as UP until the gateway is reachable again. Maybe I'm wrong.

    I suppose you meant adding static routes to the respective ISPs' DNS servers. But as hoba already remarked, static routes for the monitoring IPs are added automatically, so I doubt that this will help. Apart from that, I'm using (pingable) IPs outside each ISP's network to make sure I really have a connection to the outside world and not only to the ISP's network.
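
The monitoring idea described above (ping an address outside the ISP's network through one specific WAN) can be sketched roughly as follows. This is only an illustration, not what pfSense runs internally: the function names are made up, and it assumes a FreeBSD-style `ping` where `-S` sets the source address, `-c` the packet count and `-t` the overall timeout in seconds. Binding the source address does not by itself force the outgoing interface, so the sketch relies on a route for the monitor IP through the matching gateway already being in place.

```python
import subprocess


def build_ping_cmd(monitor_ip, source_ip, count=3, timeout_s=5):
    """Build a FreeBSD-style ping command bound to one WAN address (hypothetical helper)."""
    # -S: source address, -c: number of packets, -t: overall timeout in seconds
    return ["ping", "-S", source_ip, "-c", str(count), "-t", str(timeout_s), monitor_ip]


def link_is_up(monitor_ip, source_ip, runner=subprocess.run):
    """Return True if at least one echo reply came back (ping exit status 0)."""
    result = runner(build_ping_cmd(monitor_ip, source_ip), capture_output=True)
    return result.returncode == 0
```

The `runner` parameter exists only so the check can be exercised without sending real packets; in normal use the default `subprocess.run` is called.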

  • static routes for the monitoring IPs are added automatically

    Didn't know that, but I'll keep adding them anyway :)

    In the tests I've run, the DHCP lease from the Motorola cable modem was never renewed on its own. But a quick release/renew did change the IP to a local one, with the result that internet access seemed to be gone.

    Testing with the following rule:

    • Lan net * Lan address * *

    This did not help, but if I edit a LAN rule, save it and press Apply, I can surf again.

  • Is there a shell command / script that does the same as when you apply a rule?

    My idea goes something like this.
    Since I know which IP I will get, a cron job could watch for it and reload the rules when the interface gets that address.
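
A rough sketch of the cron-job idea above, under several assumptions: the modem's fallback range is taken to be 192.168.100.0/24 (from the 192.168.100.x address mentioned earlier in the thread), `/etc/rc.filter_configure` is assumed to be the script that reloads the filter rules on the installed pfSense version, and the caller is assumed to obtain the current WAN2 address itself (for example by parsing `ifconfig` output). All function names and the state-file approach are illustrative only.

```python
import ipaddress
import subprocess
from pathlib import Path
from typing import Optional

# Assumption: addresses the cable modem hands out while its uplink is down.
MODEM_FALLBACK_NET = ipaddress.ip_network("192.168.100.0/24")

# Assumption: the command that reloads the filter rules; verify it on your system.
RELOAD_CMD = ["/etc/rc.filter_configure"]


def is_fallback_address(addr: str) -> bool:
    """True if addr lies in the modem's private fallback range."""
    return ipaddress.ip_address(addr) in MODEM_FALLBACK_NET


def needs_reload(previous: Optional[str], current: str) -> bool:
    """Reload when the interface enters or leaves the fallback range."""
    if previous is None:
        return is_fallback_address(current)
    return is_fallback_address(previous) != is_fallback_address(current)


def run_once(current: str, state_path: str, runner=subprocess.run) -> bool:
    """One cron iteration: compare with the stored address, reload if needed."""
    path = Path(state_path)
    previous = path.read_text().strip() if path.exists() else None
    path.write_text(current + "\n")  # remember the address for the next run
    if needs_reload(previous or None, current):
        runner(RELOAD_CMD, check=False)
        return True
    return False
```

Storing the last seen address in a file keeps the job from reloading the rules on every run; it only acts when WAN2 switches between a public address and the modem's private range.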

  • I think I solved the problem in my case.
    I hacked the PHP script that generates the config file for dhclient so that an additional option is saved in the config file. Specifically I added the option```
