CARP on new 2.1.5 installation fails after a random amount of time



  • We just built a new CARP cluster of two VMs on VMware, running the latest 2.1.5 version.

    Everything works fine for half an hour or so, then at a random point the WAN VIP stops responding and passing traffic.
    The firewall itself can still ping the WAN VIP address, but it can't be reached from anywhere else. NAT rules etc. all stop working, and internal users on the LAN can't access the internet any more.

    The secondary firewall remains in Backup mode and doesn't attempt to failover.

    I've found that editing the WAN VIP address then saving again usually brings traffic back up.
    Editing any other VIP doesn't seem to have any effect.

    Setup is as follows.

    10.20.30.191/29 WAN GW
    10.20.30.192/29 WAN VIP
    10.20.30.193/29 WAN FW1
    10.20.30.194/29 WAN FW2

    10.10.0.254/24 LAN VIP
    10.10.0.1/24 LAN FW1
    10.10.0.2/24 LAN FW2

    172.17.1.1/24 CARP FW1
    172.17.1.2/24 CARP FW2

    Advanced Outbound NAT enabled and set to WAN VIP
    ESXi set to promiscuous mode accept on all vswitches.
    Each subnet above is on a separate VLAN, except LAN, which is on the default VLAN.
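
One thing that can be checked quickly is whether the WAN addresses as listed are usable host addresses in their /29. This is only a minimal Python sketch using the standard `ipaddress` module; the `describe` helper and the `wan` dictionary are made up for illustration, and the addresses are copied exactly as posted.

```python
import ipaddress

# WAN addresses exactly as listed in the post above.
wan = {
    "WAN GW": "10.20.30.191/29",
    "WAN VIP": "10.20.30.192/29",
    "WAN FW1": "10.20.30.193/29",
    "WAN FW2": "10.20.30.194/29",
}


def describe(cidr):
    """Return the subnet an address falls in and its role in that subnet."""
    iface = ipaddress.ip_interface(cidr)
    net = iface.network
    if iface.ip == net.network_address:
        role = "network address"
    elif iface.ip == net.broadcast_address:
        role = "broadcast address"
    else:
        role = "host address"
    return net, role


report = {name: describe(cidr) for name, cidr in wan.items()}
for name, (net, role) in report.items():
    print(f"{name:8} {wan[name]:18} -> {net} ({role})")
```

As written, 10.20.30.191/29 comes out as the broadcast address of 10.20.30.184/29 and 10.20.30.192/29 as the network address of 10.20.30.192/29, so the gateway and the VIP do not share a subnet. That may simply mean the addresses were changed for posting, so it says nothing certain about the real setup.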

    The network switch has been tested with multicast flood protection on and off, along with any other relevant protections which might shut down a port.

    Anyone any idea what could be going on here?
    Thanks

    [Update]
    We tried adding an IP Alias to the WAN interface and NAT traffic to this IP seems to continue working even when the main WAN VIP stops working.



  • Hi - glad (and sad) to see someone else with the same problem!  ;D

    I've also noticed that disabling and re-enabling CARP on the primary seems to bring traffic back up as well.

    I also see a similar / related post which I replied to here: Topic 81050

    https://forum.pfsense.org/index.php?topic=81050.msg451115#msg451115

    Have you had any feedback from the powers that be?

    Cheers!

    Mitch



  • I had the same problem with a physical setup of two pfSense machines (different hardware but identical pfSense version and O/S).  When I moved the setup to VMware all the problems went away.  I should point out, though, that I'm running the setup on a single host system and I'm using two physical interfaces, LAN & WAN, while the pfSync interfaces are on a private vswitch.

    I do have problems with static IP clients, though.  My understanding is that I should be setting the DNS and gateway to the LAN CARP IP, but it doesn't seem to work.  Any thoughts?



  • I am presently using two separate physical machines.
    I believe the sync interface is connected by a crossover cable.
    The issue doesn't affect the alias IPs, only the main CARP virtual IP.
    So my workaround at the moment is not to put any services on the main floating IP.
    I've seen enough posts to suspect this is a real problem though, not a misconfiguration on our part. What do you think?

