CARP unstable in a multi-layer setup



  • Dear all,
    I have some problems with my pfSense setup. I have 3 layers of pfSense firewalls, and each pair has CARP enabled between its two nodes. The second and third layers seem fine, but the first layer becomes unstable whenever its backup node is powered on.
    Here is a sketch of the setup:

                                         +--- fw1A --- fw2A --- fw3A [all on switch A]
                                         |
    Internet --- ext firewall (linux) ---+
                 (switch B)              |
                                         +--- fw1B --- fw2B --- fw3B [all on switch B]

    Each pair (fw1A/fw1B, fw2A/fw2B, fw3A/fw3B) uses a crossover cable on a dedicated NIC, while the other NICs are connected to different switches.

    The main difference between the layers is that the front layer (fw1A/fw1B) is also an endpoint for an OpenVPN tunnel (for management) and an IPsec tunnel to a remote site.

    As soon as fw1B comes online, the servers behind fw1A/fw1B start to have an unstable connection to the Internet.

    I have already checked that:

    • all VIP interfaces have different VHIDs
    • pfsync advertisements get sent on the LAN segments (also on the BRIDGED interfaces, and I don't know if this is ok)
    • there are no restrictions on the dedicated pfsync interface
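
    For context on the VHID check: CARP derives the virtual MAC address from the VHID (00:00:5e:00:01:XX, where XX is the VHID in hex), so two different CARP groups using the same VHID on one broadcast domain would share a MAC address. Below is a minimal Python sketch of that check. The inventory, segment names, and VHID values are hypothetical placeholders, not the real configuration.

```python
from collections import defaultdict


def carp_virtual_mac(vhid: int) -> str:
    """Return the virtual MAC address CARP uses for a given VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError(f"VHID out of range: {vhid}")
    return f"00:00:5e:00:01:{vhid:02x}"


def find_vhid_conflicts(vips):
    """Find VHIDs used by more than one CARP pair on the same segment.

    `vips` is an iterable of (pair_name, segment_name, vhid) tuples.
    Returns {(segment, vhid): [pair names]} for every conflict found.
    """
    users = defaultdict(set)
    for pair, segment, vhid in vips:
        users[(segment, vhid)].add(pair)
    return {
        key: sorted(pairs) for key, pairs in users.items() if len(pairs) > 1
    }


# Hypothetical inventory: pair names follow the post, but the segment
# names and VHID numbers are made up for illustration.
EXAMPLE_VIPS = [
    ("fw1A/fw1B", "wan", 1),
    ("fw1A/fw1B", "lan1", 2),
    ("fw2A/fw2B", "lan1", 2),  # deliberate clash with fw1A/fw1B on lan1
    ("fw2A/fw2B", "lan2", 4),
    ("fw3A/fw3B", "lan2", 5),
]

if __name__ == "__main__":
    for (segment, vhid), pairs in find_vhid_conflicts(EXAMPLE_VIPS).items():
        mac = carp_virtual_mac(vhid)
        print(f"{segment}: VHID {vhid} ({mac}) shared by {', '.join(pairs)}")
```

    The check is per segment because adjacent layers share a broadcast domain: the LAN side of one layer is the WAN side of the next.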

    I have followed several troubleshooting guides found on the internet, but none of them has helped. At the moment both fw1A and fw1B use LAGG interfaces as an abstraction over the physical interfaces.

    The hardware of both nodes in each pair is the same, but it differs from layer to layer.

    All pfSense nodes now run version 2.3.4_p1, but the problem first appeared several months ago, after the upgrade to version 2.2.4.

    One last point: at the same time as the upgrade to version 2.2.4, we moved the external firewall to the same switch as node fw1B. The external firewall is a Linux box. Since all of these are production systems, I cannot easily run extensive tests on them.

    I would appreciate any help in solving this issue.

    Claudio