CARP + VIPs + 2.1.4 randomly failing after an hour or so



  • Hey,

    We've had a really unusual issue since upgrading to 2.1.4:

    We have 1 CARP address with about 18 VIPs on top. Everything looks fine for about an hour, then it suddenly stops working: no traffic is delivered into the firewall on any VIP layered on CARP. Our VPN, which is on the primary CARP address, still works. This has happened twice now, once on the initial upgrade and again last night when I tried to re-enable CARP for a few select VIPs.

    If the VIP is reassigned to be on the WAN IP it starts working again.

    Is this related to the regression that jimp mentions (https://github.com/pfsense/pfsense/commit/2bf2a1c4c9a4ed1c378891e2b0e55edf3ed1a658) here: https://forum.pfsense.org/index.php?topic=78611.0 or is it a totally different issue?

    It was working beautifully for about 18 months on 2.0.x!

    Thanks,
    Rob



  • I would start with applying the patch that jimp mentions, and see what happens :)

    @jimp:

    If you use IP Alias type VIPs layered on top of CARP VIPs, use the System Patches package to apply this fix (committed this morning):

    https://github.com/pfsense/pfsense/commit/2bf2a1c4c9a4ed1c378891e2b0e55edf3ed1a658



  • @vindenesen:

    I would start with applying the patch that jimp mentions, and see what happens :)

    Indeed. The traffic is working currently, so I'm going to try to reproduce it separately. It's really awkward to test, though, as it appears to work "for a bit" before stopping completely. I don't understand how it works temporarily and then breaks; the description of the bug from jimp implies that IP Aliases on top of CARP just don't work.



  • For the record, I've reproduced this on the firewalls in the office here; the patch appears to solve it. However, I'm still totally confused as to how it's working at all.

    When the bug is present, the IP address of the Alias (let's say .21) isn't assigned to any of the interfaces at all. Yet if I do curl https://x.x.x.21 then I can get through the NAT and the firewall to the webserver underneath. Why is pfSense responding on an IP it doesn't own!?
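
For anyone who wants to confirm the "not assigned to any interface" observation on their own box, here is a rough sketch that scans saved `ifconfig` output for a given address. It assumes FreeBSD-style output (interface header in column 0, indented `inet` lines). The function name, the sample text, the interface name `carp_example`, and the 192.0.2.x addresses are all made up for illustration and are not taken from a real pfSense system.

```python
import re


def interfaces_with_address(ifconfig_text, address):
    """Return names of interfaces whose 'inet' lines list the given IPv4 address."""
    found = []
    current = None
    for line in ifconfig_text.splitlines():
        # Interface header lines start in column 0, e.g. "em0: flags=..."
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # IPv4 address lines are indented, e.g. "<tab>inet 192.0.2.20 netmask ..."
        # The trailing space after "inet" keeps "inet6" lines from matching.
        inet = re.match(r"^\s+inet (\S+)", line)
        if inet and inet.group(1) == address and current is not None:
            found.append(current)
    return found


# Hypothetical sample, shaped like FreeBSD ifconfig output.
SAMPLE = """\
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tinet 192.0.2.10 netmask 0xffffff00 broadcast 192.0.2.255
carp_example: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
\tinet 192.0.2.20 netmask 0xffffff00
"""

print(interfaces_with_address(SAMPLE, "192.0.2.20"))  # ['carp_example']
print(interfaces_with_address(SAMPLE, "192.0.2.21"))  # [] (alias missing)
```

An empty list for the alias address would match the symptom described above, where the IP Alias is not bound anywhere. It does not explain why traffic to that address still gets through.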