CARP VIP intermittently not working correctly as default gateway



  • I'm having a strange issue on pfSense 2.3 with CARP VIP addresses.

    I have 2 pfSense VMs acting as a security gateway to control access to certain VLANs. Both are configured for pfsync and CARP, and they run on separate hosts with anti-affinity rules. The CARP and pfsync piece works fine: both firewalls are always in sync, the master/secondary designation works correctly, the state tables are synced, etc.

    For each protected VLAN, firewall A's interface IP address is x.x.x.2 and firewall B's is x.x.x.3, with a VIP of x.x.x.1. The VIP is the default gateway I configure on devices that require a static IP address, and it is also the default gateway handed out by DHCP (served by pfSense) on those VLANs. The DHCP address pool starts at .5, so there is no issue with conflicting IP addresses.
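    The addressing layout above can be sanity-checked with a short script. This is only a sketch: the real subnet is elided in this post, so the 192.0.2.0/24 documentation network, the pool end of .254, and the helper name `reserved_conflicts` are all assumptions made for illustration.

```python
import ipaddress

def reserved_conflicts(network, reserved_hosts, pool_start, pool_end):
    """Return reserved addresses that fall inside the DHCP pool.

    reserved_hosts maps a label to the host part of the address
    (for example 1 for x.x.x.1). All values here are hypothetical.
    """
    net = ipaddress.ip_network(network)
    base = int(net.network_address)
    pool = range(pool_start, pool_end + 1)
    return {
        label: str(ipaddress.ip_address(base + host))
        for label, host in sorted(reserved_hosts.items())
        if host in pool
    }

# Assumed example values: VIP at .1, firewalls at .2 and .3, pool from .5.
reserved = {"carp_vip": 1, "firewall_a": 2, "firewall_b": 3}
conflicts = reserved_conflicts("192.0.2.0/24", reserved, 5, 254)
print(conflicts or "no overlap between reserved addresses and the DHCP pool")
```

    With the layout described in this post the result is empty, which matches the claim that the pool cannot hand out the VIP or either firewall address.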

    My problem started about a week ago, which coincidentally was shortly after upgrading to 2.3. Only SOME devices behave strangely when they are configured with the VIP as the default gateway, whether through DHCP or by static assignment. These devices start having issues connecting to Active Directory: they fail to find a domain controller or fail to update policies. Anything using SSL/TLS becomes broken and inaccessible: browsing an HTTPS website times out, and connecting to a resource with any kind of SSL or TLS either does not work or takes a VERY long time. RDP fails to negotiate a secure connection, but only for some connections; others work normally but with more latency than usual.

    Unencrypted resources work, albeit significantly slower than normal.

    Mapped drives only mount correctly 50% of the time. The other 50% of the time they either fail to reconnect with a "resource not available" error, or they mount but trying to browse them hangs the OS for 30-45 seconds before loading.

    Up until today, I had only seen this behavior on a single server, but as of 30 minutes ago it happened to my desktop, which is in one of those protected VLANs and gets its IP address via DHCP from pfSense.

    What's weird is that as soon as this happened to my desktop, I tried to log in to the pfSense box to check the logs, because my desktop was behaving as if the firewall were under extreme load when it wasn't at all. I typed in the pfSense webGUI URL and was presented with a DNS rebind error, which was odd because I had disabled DNS rebind attack checking. After logging in from a different device, I confirmed that DNS rebind checking was definitely disabled. Looking at the logs, I saw these errors for my login attempt:

    All of this weirdness stops immediately if I change my default gateway to the direct, non-VIP interface address of the master firewall. What's so strange is that 99% of the devices operating behind these firewalls work perfectly fine using the VIP as the default gateway, but at random a device will start flaking out until I change its gateway to the non-VIP address.
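    One check that fits the symptoms above is to look at which MAC address an affected client has cached for the VIP. A CARP VIP should resolve to the virtual MAC 00:00:5e:00:01:XX, where XX is the VHID in hex, rather than to either firewall's own interface MAC. The sketch below only does that comparison; the VHID and the observed MAC strings are hypothetical placeholders, and the value to test would be copied by hand from the client's ARP table.

```python
import re

def carp_virtual_mac(vhid: int) -> str:
    """Build the virtual MAC that CARP uses for a given VHID."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"00:00:5e:00:01:{vhid:02x}"

def normalize_mac(mac: str) -> str:
    """Accept ':' or '-' separators and unpadded octets, return aa:bb:... form."""
    parts = re.split(r"[:-]", mac.strip())
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(f"{int(part, 16):02x}" for part in parts)

def gateway_mac_is_carp(observed_mac: str, vhid: int) -> bool:
    """True if the MAC cached for the gateway is the CARP virtual MAC."""
    return normalize_mac(observed_mac) == carp_virtual_mac(vhid)

# Hypothetical values: VHID 1, and a MAC as a Windows client would print it.
print(gateway_mac_is_carp("00-00-5E-00-01-01", 1))
```

    If an affected device shows some other MAC for x.x.x.1 while a healthy device shows the virtual MAC, that would point toward an ARP or switching problem rather than load on the firewall.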

    Any ideas?