[SOLVED] pfSense 2.2 VLAN LAGG fails
-
I tried to take 2.2 for a spin but can't get my LAN VLANs to work with LAGG. The upgrade went smoothly, but after the reboot the firewall was inaccessible from the LAN. With the firewall disabled to allow access from the WAN, the interface stats showed a lot of outbound errors on the LAN VLANs and no inbound traffic. I was able to restore LAN traffic by removing one of the interfaces (em1) from the LAGG and assigning it directly to the VLAN. Is LAGG with VLAN broken in 2.2? I was not able to capture any logs because this is my only gateway to the net.
Here is my setup:
Running pfSense 2.1.5 (64bit) on a generic Intel P4 box with a dual-port Intel NIC connected to a Netgear GS724tv2 switch. I have trunking enabled for the two ports on the Netgear switch, which, per Netgear's manual, is "Port Trunking - Manual as per IEEE802.3ad Link Aggregation". On the pfSense box, I have LAGG (LACP) enabled with both em0 and em1 assigned to the LAGG interface. I have multiple VLANs using the same LAGG interface (it seems VLAN can only be assigned to LAGG using LACP).
Also of note, when I first received the Netgear switch, I tried enabling the Netgear trunk for my server's dual-port NIC, but this does not work with Linux bonding. With the Netgear trunk enabled, no traffic passes from the server to the switch. With the Netgear trunk disabled and IEEE802.3ad Linux bonding enabled, traffic passes on only one port on the Netgear switch. I have since switched Linux bonding to roundrobin with the Netgear trunk disabled, which sees traffic on both ports on the Netgear switch.
Any suggestions would be greatly appreciated. Let me know if any specific logs are required and where to retrieve them.
-
lagg with VLANs definitely works fine in 2.2; that's how my home system is set up, as well as at least a couple of test systems. Knowing what the lagg status shows would be helpful, both on the firewall (in the output of "ifconfig") and on the switch.
(it seems VLAN can only be assigned to LAGG using LACP).
LACP is preferable because it interoperates with the switch in ways that the other options can't, but VLANs can be used with all types of LAGG. With the other types, depending on the switch, you may have to not configure the switch with a trunk at all.
Also of note, when I first received the Netgear switch, I tried enabling the Netgear trunk for my server's dual-port NIC, but this does not work with Linux bonding. With the Netgear trunk enabled, no traffic passes from the server to the switch. With the Netgear trunk disabled and IEEE802.3ad Linux bonding enabled, traffic passes on only one port on the Netgear switch. I have since switched Linux bonding to roundrobin with the Netgear trunk disabled, which sees traffic on both ports on the Netgear switch.
That suggests LACP is problematic on the switch. Make sure you have the latest firmware on the switch as that might help.
-
Just upgraded again and all my VLANs are dead.
ifconfig:
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO>
        ether <removed>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWTSO>
        ether <removed>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
bge0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether <removed>
        inet6 <removed>%bge0 prefixlen 64 scopeid 0x3
        inet <removed> netmask 0xfffffe00 broadcast 255.255.255.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex,master>)
        status: active
pflog0: flags=100<PROMISC> metric 0 mtu 33144
pfsync0: flags=0<> metric 0 mtu 1500
        syncpeer: 224.0.0.240 maxupd: 128 defer: on
        syncok: 1
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x6
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWTSO>
        ether <removed>
        inet6 <removed>%lagg0 prefixlen 64 scopeid 0x8
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        laggproto lacp lagghash l2,l3,l4
        laggport: em1 flags=0<>
        laggport: em0 flags=0<>
lagg0_vlan5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=3<RXCSUM,TXCSUM>
        ether <removed>
        inet6 <removed>%lagg0_vlan5 prefixlen 64 scopeid 0x9
        inet <removed> netmask 0xffffff00 broadcast <removed>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        vlan: 5 vlanpcp: 0 parent interface: lagg0
lagg0_vlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=3<RXCSUM,TXCSUM>
        ether <removed>
        inet6 <removed>%lagg0_vlan1 prefixlen 64 scopeid 0xa
        inet <removed> netmask 0xffffff00 broadcast <removed>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        vlan: 1 vlanpcp: 0 parent interface: lagg0
ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 <removed>%ovpns1 prefixlen 64 scopeid 0xb
        inet <removed> --> <removed> netmask 0xffffffff
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 20570
ovpns2: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 <removed>%ovpns2 prefixlen 64 scopeid 0xc
        inet <removed> --> <removed> netmask 0xffffffff
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 20857
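A note on reading the lagg status in that output: on FreeBSD, a laggport that has completed LACP negotiation normally lists ACTIVE, COLLECTING and DISTRIBUTING in its flags, whereas both laggport lines above show flags=0<>. The following is only a minimal Python sketch for checking this from saved ifconfig text. The function names are made up for illustration, and the sample string is copied from the lagg0 section above.

```python
import re

def laggport_flags(ifconfig_text):
    """Return {port_name: [flag, ...]} for every 'laggport:' entry found."""
    ports = {}
    pattern = r"laggport:\s+(\S+)\s+flags=[0-9a-fA-F]+<([^>]*)>"
    for match in re.finditer(pattern, ifconfig_text):
        name, flags = match.groups()
        # An empty <> yields an empty flag list.
        ports[name] = [f for f in flags.split(",") if f]
    return ports

def lacp_negotiated(flags):
    """True if the port shows all three flags expected after LACP negotiation."""
    required = {"ACTIVE", "COLLECTING", "DISTRIBUTING"}
    return required.issubset({f.upper() for f in flags})

# Sample taken from the lagg0 section of the output above.
sample = "laggproto lacp lagghash l2,l3,l4 laggport: em1 flags=0<> laggport: em0 flags=0<>"

for port, flags in laggport_flags(sample).items():
    state = "negotiated" if lacp_negotiated(flags) else "NOT negotiated"
    print(port, state, flags)
```

With the sample above, both em1 and em0 are reported as not negotiated, which is consistent with the VLANs on lagg0 passing no traffic.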
-
Looks like full LACP compliance is required for 2.2. After trying every possible configuration, LACP in 2.2 does not work with the Netgear GS724tv2 switch. I changed the LAGG protocol to FEC since I wanted some form of aggregation. Anything I should be aware of when using FEC? I do get some error packets on reboot (4 errors) but everything else seems to work fine.
-
Shouldn't be any functional difference in LACP from earlier versions to current versions. Where LACP works switch-side, it works the same on both. There must be some difference there, but from the sounds of it, LACP doesn't work properly in general on your switch.
FEC should be fine though. In FreeBSD's lagg, that's just an alias for loadbalance.
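For anyone comparing the static lagg modes: hash-based modes (the lagghash l2,l3,l4 setting in the ifconfig output earlier) pick the outgoing port from a hash of header fields, so one flow stays on one port, while roundrobin alternates ports packet by packet. A rough Python sketch of the difference follows. The CRC32 hash, the addresses and the function names are illustrative assumptions, not FreeBSD's actual implementation.

```python
import zlib
from itertools import cycle

PORTS = ["em0", "em1"]  # the two lagg members from this thread

def hash_port(src, dst, sport, dport):
    """Pick a port from a hash of the flow tuple (illustrative hash only)."""
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    return PORTS[zlib.crc32(key) % len(PORTS)]

def roundrobin_ports(n_packets):
    """Pick ports by simple rotation, ignoring which flow a packet belongs to."""
    rotation = cycle(PORTS)
    return [next(rotation) for _ in range(n_packets)]

# Four packets of the same (hypothetical) flow.
flow = ("192.0.2.10", "192.0.2.20", 50000, 443)
hashed = [hash_port(*flow) for _ in range(4)]

print("hash-based:", hashed)               # same port four times
print("roundrobin:", roundrobin_ports(4))  # alternates between ports
```

The practical consequence is that hash-based modes keep a flow's packets in order but cannot spread a single flow across both links, while roundrobin uses both links for one flow at the cost of possible reordering.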