System instability when using LAGG + LACP, CARP, and VLAN tagging.



  • I'm trying to move our office firewalls from Cisco PIXes to a pair of pfSense boxes running on this hardware: http://www.abmx.com/1u-high-efficiency-mini-server . We have used this machine before for several single-box pfSense deployments, as well as for CARP deployments.

    With the last few deployments, I've been trying to set up VLAN tags on top of LAGG interfaces, and it works. The only issue is that whenever I make an interface change, I HAVE to reboot the firewall in order for the interface to start passing traffic. So far I haven't had a chance to test it on a different system, but I have tested it on 4 different ABMX 1u boxes and gotten the same result every time.

    That was workable for a while, but today I started having another issue. While building the pfSense boxes for my own workplace (time to eat our own dogfood and all of that), I ran into a problem where any change to any LAGG interface locks up the firewall and leaves it completely unresponsive. I can't think of what would cause this with this config as opposed to any other, except that there are a lot more rules and a lot more VLAN interfaces (our PIX configuration was about 1600 lines long). I've had to scrub the deployment for now (fortunately this happened in testing and not production), but my faith in both my ability to configure these things and pfSense itself is shaken. Any help would be appreciated.

    Here is the syslog that was taken during my attempts to configure interfaces. Unfortunately it didn't show anything that caught my eye.

    Sep 18 13:33:56 php: rc.filter_synchronize: XMLRPC sync successfully completed with https://10.255.255.2:443.
    Sep 18 13:33:56 check_reload_status: Linkup starting em1
    Sep 18 13:33:56 kernel: em1: link state changed to UP
    Sep 18 13:33:56 kernel: lagg1: link state changed to UP
    Sep 18 13:33:56 check_reload_status: Linkup starting lagg1
    Sep 18 13:33:56 check_reload_status: Linkup starting em2
    Sep 18 13:33:56 kernel: em2: link state changed to UP
    Sep 18 13:33:56 check_reload_status: Linkup starting em4
    Sep 18 13:33:56 kernel: em4: link state changed to UP
    Sep 18 13:33:58 php: rc.linkup: Hotplug event detected for OUTSIDE(wan) but ignoring since interface is configured with static IP (66.194.167.252 )
    Sep 18 13:33:58 check_reload_status: rc.newwanip starting lagg1_vlan2361
    Sep 18 13:33:59 php: rc.filter_synchronize: Filter sync successfully completed with https://10.255.255.2:443.
    Sep 18 13:34:00 php: rc.newwanip: rc.newwanip: Informational is starting lagg1_vlan2361.
    Sep 18 13:34:00 php: rc.newwanip: rc.newwanip: on (IP address: 66.194.167.252) (interface: OUTSIDE[wan]) (real interface: lagg1_vlan2361).
    Sep 18 13:34:02 php: rc.newwanip: waiting for pfsync…
    Sep 18 13:34:02 php: rc.newwanip: pfsync done in 0 seconds.
    Sep 18 13:34:02 php: rc.newwanip: Configuring CARP settings finalize...
    Sep 18 13:34:07 php: rc.newwanip: Resyncing OpenVPN instances for interface OUTSIDE.
    Sep 18 13:34:07 kernel: ovpns1: link state changed to DOWN
    Sep 18 13:34:07 check_reload_status: Reloading filter
    Sep 18 13:34:07 check_reload_status: Reloading filter
    Sep 18 13:34:07 php: rc.newwanip: Creating rrd update script
    Sep 18 13:34:07 kernel: ovpns1: link state changed to UP
    Sep 18 13:34:07 check_reload_status: rc.newwanip starting ovpns1
    Sep 18 13:34:09 php: rc.newwanip: pfSense package system has detected an ip change 0.0.0.0 -> 66.194.167.252 ... Restarting packages.
    Sep 18 13:34:09 check_reload_status: Starting packages
    Sep 18 13:34:09 php: rc.newwanip: rc.newwanip: Informational is starting ovpns1.
    Sep 18 13:34:09 php: rc.newwanip: rc.newwanip: on (IP address: 10.0.8.1) (interface: []) (real interface: ovpns1).
    Sep 18 13:34:09 php: rc.newwanip: pfSense package system has detected an ip change -> 10.0.8.1 ... Restarting packages.
    Sep 18 13:34:09 check_reload_status: Starting packages
    Sep 18 13:34:11 php: rc.start_packages: Restarting/Starting all packages.
    Sep 18 13:34:11 php: rc.start_packages: Quagga OSPFd: No config data found.
    Sep 18 13:34:11 php: rc.start_packages: Quagga OSPFd: No config data found.
    Sep 18 13:34:11 php: rc.start_packages: Quagga OSPFd: No config data found.
    Sep 18 13:34:11 php: rc.start_packages: Restarting/Starting all packages.
    Sep 18 13:34:11 php: rc.start_packages: Quagga OSPFd: No config data found.
    Sep 18 13:34:11 php: rc.start_packages: Quagga OSPFd: No config data found.
    Sep 18 13:34:11 php: rc.start_packages: Quagga OSPFd: No config data found.
    Sep 18 13:53:09 check_reload_status: Linkup starting em0
    Sep 18 13:53:09 kernel: em0: link state changed to UP
    Sep 18 13:53:11 php: rc.linkup: Hotplug event detected for TEMPMGMT(opt5) but ignoring since interface is configured with static IP (192.168.254.1 )
    Sep 18 13:53:11 check_reload_status: rc.newwanip starting em0
    Sep 18 13:53:13 php: rc.newwanip: rc.newwanip: Informational is starting em0.
    Sep 18 13:53:13 php: rc.newwanip: rc.newwanip: on (IP address: 192.168.254.1) (interface: TEMPMGMT[opt5]) (real interface: em0).
    Sep 18 13:53:13 check_reload_status: Reloading filter
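
    For anyone who wants to scan a longer log for link flaps around a LAGG change, here is a minimal Python sketch. It assumes the kernel lines keep the exact format shown above ("kernel: <iface>: link state changed to UP|DOWN"); the function name link_transitions and the sample list are made up for illustration and are not part of pfSense.

```python
import re
from collections import defaultdict

# Matches kernel link-state lines such as:
#   "Sep 18 13:33:56 kernel: em1: link state changed to UP"
# Assumption: the syslog format is the one shown in the log above.
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel:\s+"
    r"(?P<iface>\S+):\s+link state changed to\s+(?P<state>UP|DOWN)\s*$"
)


def link_transitions(lines):
    """Return {interface: [(timestamp, state), ...]} in log order."""
    events = defaultdict(list)
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m:
            events[m.group("iface")].append((m.group("ts"), m.group("state")))
    return dict(events)


# A few lines copied from the log above, plus one non-matching line.
sample = [
    "Sep 18 13:33:56 kernel: em1: link state changed to UP",
    "Sep 18 13:33:56 kernel: lagg1: link state changed to UP",
    "Sep 18 13:34:07 kernel: ovpns1: link state changed to DOWN",
    "Sep 18 13:34:07 kernel: ovpns1: link state changed to UP",
    "Sep 18 13:34:07 check_reload_status: Reloading filter",
]

for iface, changes in link_transitions(sample).items():
    print(iface, changes)
```

    An interface that shows several DOWN/UP pairs within a few seconds of a config change is the one worth looking at first; in the excerpt above only ovpns1 bounces, which is expected when OpenVPN is resynced.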



  • Can you please explain your setup behind the pfsense pair? What you describe sounds like STP on the switches is kicking in.



  • @jflsakfja:

    Can you please explain your setup behind the pfsense pair? What you describe sounds like STP on the switches is kicking in.

    A set of two stacked Dell 5524 switches. I've verified that STP isn't blocking, and have actually gone as far as enabling portfast and completely disabling STP on the link aggregation groups in question.



  • You also mentioned LACP. I'm assuming that it's not configured in a way that it spans both switches, correct?

