BUG: VLAN over LAGG: Adding a new interface breaks existing tunnels.



  • Hi there,

    I've got a pair of boxes running pfSense 2.1.3-RELEASE in a CARP cluster.

    The boxes are identical, with 4 Intel em network cards.  All 4 (em0, em1, em2, em3) are aggregated into a single LACP trunk (command output is anonymized, but accurate):

    lagg0:
            media: Ethernet autoselect
            status: active
            laggproto lacp
            laggport: em3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: em2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

    The LACP port channel is an 802.1Q trunk, and interfaces are created as VLANs over lagg0:

    lagg0_vlan101:
            inet 1.2.3.254 netmask 0xffffffe0 broadcast 1.2.3.255
            nd6 options=1<PERFORMNUD>
            media: Ethernet autoselect
            status: active
            vlan: 101 vlanpcp: 0 parent interface: lagg0
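
    For anyone not used to the hex netmask notation in that output, here is a minimal sketch of decoding it with Python's standard ipaddress module. This is purely illustrative and was not run on the pfSense boxes; the helper name decode_hex_netmask is made up for this example.

```python
import ipaddress

def decode_hex_netmask(addr: str, hex_mask: str) -> ipaddress.IPv4Interface:
    """Combine an address and an ifconfig-style hex netmask (hypothetical helper)."""
    # int(..., 16) accepts the leading "0x" that ifconfig prints.
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    return ipaddress.IPv4Interface(f"{addr}/{mask}")

iface = decode_hex_netmask("1.2.3.254", "0xffffffe0")
print(iface.network)                    # 1.2.3.224/27
print(iface.network.broadcast_address)  # 1.2.3.255
```

    So 0xffffffe0 is a /27, which matches the broadcast address ifconfig reports for lagg0_vlan101.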

    CARP is then used to provide default gateways to all of the networks.

    Currently there are 6 "interfaces" (VLANs) in total, with 6 CARP interfaces providing the 'virtual' gateways or public IP addresses: one for each of the 5 private networks and one for the public network.

    wan_vip11: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
            inet 1.2.3.1 netmask 0xffffffe0
            inet 1.2.3.2 netmask 0xffffffe0
            inet 1.2.3.3 netmask 0xffffffe0
            carp: MASTER vhid 11 advbase 1 advskew 0

    The problem:

    Just now, I added a new VLAN (e.g. 102), created a new interface assigned to lagg0_vlan102, and configured it.  This completely broke IPSec.

    In the web GUI config, everything looked fine.
    In the config file, everything looked fine.

    In the "IPSec Status" tab, however, all of the "down" tunnels had the wrong "Local IP" displayed.  Instead of, for example, 1.2.3.1 being used, another VIP that is assigned as an IP alias to the same CARP interface was chosen (say 1.2.3.6).  This, of course, didn't work at all and all of the tunnels that expected source IP 1.2.3.1 were down.

    Undoing my changes in the GUI didn't fix the issue, and neither did using the Config History feature to revert to a previous locally saved version of the config.  What did work was restoring an .xml backup of the config taken just prior to the change (which I retrieved with "Config History"!) and performing the mandatory system reboot.  That brought things back to normal.

    Some tunnels, however, WERE up.  All of those were using a different external CARP VIP (1.2.3.2), which for some reason was NOT changed internally.  Most, however, were on 1.2.3.1 and got remapped to 1.2.3.6 internally.
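
    In case it helps anyone trying to reproduce this, below is a rough sketch of how the addresses on each interface could be pulled out of saved ifconfig output, so that the list on the CARP interface can be compared before and after adding a VLAN. It is only a text parser written against the output format shown above; the function name inet_addresses and the SAMPLE text are assumptions for illustration, and it says nothing about how pfSense itself picks the IPsec local IP.

```python
import re

# Sample text copied from the anonymized wan_vip11 output in this post.
SAMPLE = """\
wan_vip11: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
        inet 1.2.3.1 netmask 0xffffffe0
        inet 1.2.3.2 netmask 0xffffffe0
        inet 1.2.3.3 netmask 0xffffffe0
        carp: MASTER vhid 11 advbase 1 advskew 0
"""

def inet_addresses(ifconfig_text):
    """Return {interface name: [IPv4 addresses in the order listed]}."""
    result = {}
    current = None
    for line in ifconfig_text.splitlines():
        # Interface headers start in column 0, e.g. "wan_vip11: flags=..."
        header = re.match(r"^(\S+):", line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        # Address lines are indented, e.g. "    inet 1.2.3.1 netmask ..."
        addr = re.match(r"^\s+inet\s+(\d+\.\d+\.\d+\.\d+)\b", line)
        if addr and current is not None:
            result[current].append(addr.group(1))
    return result

def changed_interfaces(before_text, after_text):
    """Names of interfaces whose address list differs between two captures."""
    before = inet_addresses(before_text)
    after = inet_addresses(after_text)
    return sorted(
        name for name in set(before) | set(after)
        if before.get(name) != after.get(name)
    )

print(inet_addresses(SAMPLE)["wan_vip11"])
```

    Feeding it one capture taken before the change and one taken after would at least show whether the set or order of addresses on wan_vip11 moved when the new VLAN interface was added.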

    Of course, I still need to add that extra VLAN and all that comes with it.  Help? :)