14 of 15 CARP VIPs work, one doesn't (both think they are master) - any ideas?



  • 14 of our 15 CARP IPs work (they sync state correctly); one doesn't. Both the master and the slave think they are master.

    I've checked the following:

    1. All CARP VIP subnet masks match the underlying interface's subnet mask.
    2. All CARP VIPs have a unique VHID (the one that is not working is VHID=16). Unique as far as the pfsync interface is concerned, at least.
    3. The sync interface is a dedicated XOVER port, with a network cable plugged directly between the two FWs (no switch).
    4. If I edit the description of the non-working CARP, the change syncs correctly to the slave.
    5. The advertising frequency skew is 0 on the master and 100 on the slave.

    Note:

    1. There are 4 physical interfaces:
        a) WAN
        b) LAN
        c) XOVER (pfsync)
        d) STAGE LAN.

    All working CARPs are on the LAN and WAN. The non-working one is the only CARP on STAGE LAN.
    The STAGE LAN IP on FW1 is 10.10.10.5/24, the STAGE LAN IP on FW2 is 10.10.10.6/24, and the STAGE LAN CARP is 10.10.10.1/24.
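
    As a quick cross-check of item 1 in the list of checks, the addresses quoted above can be compared with a few lines of Python. This is only a minimal sketch: the helper name `same_subnet` is made up for illustration, and it only confirms that the VIP and the interface address resolve to the same network and prefix length.

```python
import ipaddress


def same_subnet(interface_cidr: str, vip_cidr: str) -> bool:
    """Return True if the VIP and the interface address share network and prefix length."""
    iface = ipaddress.ip_interface(interface_cidr)
    vip = ipaddress.ip_interface(vip_cidr)
    # Comparing the derived networks compares both the network address and the mask.
    return vip.network == iface.network


# Addresses quoted in the post.
stage_vip = "10.10.10.1/24"
for name, addr in (("FW1", "10.10.10.5/24"), ("FW2", "10.10.10.6/24")):
    print(name, same_subnet(addr, stage_vip))
```

    Both lines print True for the addresses given, so a mask mismatch does not look like the cause here.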

    For a long time the setup worked, i.e. we could access servers on the STAGE LAN. However, during this "working" time, the status of the STAGE LAN CARP on FW2 was blank. We have now updated something unrelated, and the status has suddenly changed to master on FW2, and we can't access the staging network.

    Any ideas?


  • Rebel Alliance Developer Netgate

    Can each firewall actually ping the other on that interface?

    Typically these situations arise because the slave cannot see the master on that segment.



  • Good idea. The master can ping the slave OK (I didn't try the other way round).

    There is a STAGE LAN FW rule to allow * * * * *, so I don't think the FW is blocking CARP traffic.

    tcpdump tells me that neither is seeing the other's multicasts.
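
    For anyone repeating that capture, here is a rough sketch of what to look for. CARP advertisements are IPv4 packets with protocol number 112 sent to the multicast group 224.0.0.18, so a capture filtered on `ip proto 112` should show them. The payload layout assumed below (first byte version/type, second byte VHID, third byte advskew) is my reading of the BSD `carp_header` struct, so treat the offsets as an assumption. `build_example` is a hypothetical helper that fabricates a sample packet with zeroed checksums purely to exercise the parser.

```python
import socket
import struct

CARP_PROTO = 112            # IP protocol number used by CARP (shared with VRRP)
CARP_GROUP = "224.0.0.18"   # multicast group the advertisements are sent to


def parse_carp(packet: bytes):
    """Return (src_ip, vhid, advskew) for an IPv4 CARP advertisement, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4          # IPv4 header length in bytes
    if ihl < 20:
        return None
    proto = packet[9]
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    if proto != CARP_PROTO or dst != CARP_GROUP:
        return None
    payload = packet[ihl:]
    if len(payload) < 3:
        return None
    # Assumed layout: byte 0 = version/type, byte 1 = VHID, byte 2 = advskew.
    return src, payload[1], payload[2]


def build_example(src: str, vhid: int, advskew: int) -> bytes:
    """Hypothetical sample packet (checksums left at zero), for illustration only."""
    carp = bytes([0x21, vhid, advskew, 7, 0, 1]) + bytes(30)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(carp), 0, 0, 255, CARP_PROTO, 0,
        socket.inet_aton(src), socket.inet_aton(CARP_GROUP),
    )
    return header + carp


print(parse_carp(build_example("10.10.10.5", 16, 0)))
```

    On a healthy segment the backup should see a steady stream of these from the master's interface address with the expected VHID. Seeing none on either side matches the dual-master symptom.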

    Each FW is connected to a Dell 5524 switch, and the switches are connected together in a stack via the HDMI cable chain. Multicast is enabled by default, and all ports are in the same untagged VLAN.


  • Rebel Alliance Developer Netgate

    If they're not seeing the traffic on the wire, it must be getting blocked by the switch somehow.

    Had this happen with a client a week or two ago, and it turned out to be the firmware on the switch: even with storm control disabled, it was still being applied. Wasn't a Dell switch, though.
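
    When checking the switch side, it can help to know which MAC address to look for in the switch's address table. For IPv4, a CARP VIP uses a virtual MAC in the VRRP range, 00:00:5e:00:01:XX, where XX is the VHID in hex. A minimal sketch (the function name `carp_virtual_mac` is made up for illustration):

```python
def carp_virtual_mac(vhid: int) -> str:
    """Return the IPv4 CARP virtual MAC for a given VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    # The last octet of the virtual MAC is the VHID itself.
    return f"00:00:5e:00:01:{vhid:02x}"


print(carp_virtual_mac(16))
```

    For the VHID mentioned in this thread (16) that gives 00:00:5e:00:01:10.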



  • I had a similar problem on one of my firewalls.

    My solution was to edit the VIP in question on the primary firewall and just put in a - in the description, then save… Then everything started working.



  • When both systems are master, it's because the CARP multicast isn't making it between the primary and secondary. Most commonly that is down to a general connectivity issue between them, but at times the switch(es) aren't passing it, which can happen for a variety of reasons.
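
    A deliberately simplified two-node model of that behaviour, as a sketch only. It assumes the backup promotes itself after roughly 3 * advbase + advskew / 256 seconds without hearing an advertisement (my reading of the BSD implementation), that advbase is 1 second, and that the skew values are the 0 and 100 quoted earlier in the thread. The function names are hypothetical.

```python
def master_down_timeout(advbase: int, advskew: int) -> float:
    """Seconds a backup waits without hearing adverts before promoting itself (assumed formula)."""
    return 3 * advbase + advskew / 256


def resulting_states(adverts_delivered: bool, primary_skew: int = 0, secondary_skew: int = 100):
    """Return (primary_state, secondary_state) once things settle, in this simplified model."""
    if not adverts_delivered:
        # Neither node hears the other, so both time out and claim MASTER.
        return ("MASTER", "MASTER")
    # With adverts flowing, the node advertising the lower skew stays master.
    if primary_skew < secondary_skew:
        return ("MASTER", "BACKUP")
    return ("BACKUP", "MASTER")


print(master_down_timeout(1, 100))
print(resulting_states(adverts_delivered=True))
print(resulting_states(adverts_delivered=False))
```

    The last call shows the dual-master case: nothing is wrong with either node's configuration; each one simply never hears the other.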

    @miloman:

    I had a similar problem on one of my firewalls.

    My solution was to edit the VIP in question on the primary firewall and just put in a - in the description, then save… Then everything started working.

    That couldn't be anything more than a coincidence; the description field does nothing at all other than display a description.

