LAGG (LACP) - UniFi Switch (16XG)
-
Don't filter on the IP address. Capture everything on port 67. You are missing the DHCP broadcasts.
-
-
No, anything sourced on port 68 is from the client. Anything sourced from port 67 is from the server.
I would download that capture into Wireshark and look at what DHCP is actually doing. It looks like it's working there to me, but I can't see much other than two-way traffic.
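A minimal sketch of the capture suggested above, assuming it is run as root on the firewall. The function name capture_dhcp, the interface name lagg0, and the output path are placeholders for illustration, not anything from the actual setup. It filters only on the DHCP ports (no IP address), so the broadcasts are included, and writes a pcap file that can be opened in Wireshark.

```shell
# Capture all DHCP traffic on an interface without filtering on an IP address.
# Usage: capture_dhcp <interface> <pcap-file>
capture_dhcp() {
    iface="$1"
    outfile="$2"
    # -n: do not resolve names; -w: write raw packets to a file for Wireshark.
    # Port 67 is the server side and port 68 is the client side, so this
    # filter matches both directions, broadcasts included.
    tcpdump -n -i "$iface" -w "$outfile" 'port 67 or port 68'
}

# Example (assumed interface name and path):
#   capture_dhcp lagg0 /tmp/dhcp.pcap
```

Stop the capture with Ctrl+C once a client has tried to get a lease, then copy the file off the firewall.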
- 27 days later
-
I had some free time today and made a new discovery. I went to set up one of our other production servers to use a LAGG connection to this switch. As soon as I enabled ports 1+2 on the UniFi 16XG (the ports connected to this second server), I noticed the status LEDs flashing very rapidly on all connected ports. I immediately unplugged the single Ethernet cable running between the 16XG and pfSense and the rapid flashing stopped. However, all devices behind the switch still had access to the internet, which meant the LAGG between the 16XG and pfSense was working!
After our previous troubleshooting session here I left the two direct attach cables connected between pfSense and the 16XG and also tried many different combinations of configurations, including playing around with VLANs. Ultimately I ended up with the original configuration from when I made this post (LAGG assigned to an enabled interface with no IPv4/IPv6 address set and included in the bridge), with one difference: I switched the LAGG interface from LACP to ROUNDROBIN.
As soon as I set a different pair of ports on the switch as an aggregate, the original ports I had set up as an aggregate (11+12) came to life!
At this point I thought maybe the UniFi configuration had never been applied, and that aggregating a different set of ports had somehow finally enabled the original configuration. I then switched from ROUNDROBIN back to LACP, but again we stopped passing packets between the switch and firewall. I rebooted the switch, rebooted the firewall, and switched net.link.lagg.lacp.default_strict_mode between 0 and 1, rebooting the switch and firewall each time, and still no packets would pass.
I finally decided to go back to ROUNDROBIN but packets still would not pass! I proceeded to reboot everything again and still no packets passed! Finally I went back to the UniFi controller and once again enabled a different set of ports to be aggregate and once again packets started passing!
I thought to myself again that maybe switching back to LACP and then using this trick of enabling a different set of ports would kick things off. Unfortunately it did not.
So I'm at a loss here for what's happening. You would think there is an issue with the configuration being applied by the UniFi controller; however, simply switching from ROUNDROBIN over to LACP and then back to ROUNDROBIN forces us to use the trick again to get packets passing.
-
Have you tried setting a MAC address on the interface page?
It seems I have to do this to make my WAN work on my MB8600 modem. I spun my wheels for about an hour until I did so.
-
I have never personally seen a switch that did not work correctly with pfSense LACP.
That said, I have never used a Ubiquiti switch.
I have not seen any reports that it does not work properly.
Are you still messing around with the bridge here? Possible you created a layer 2 loop.
-
I've tried every troubleshooting step with the LAGG in a bridge and as a standalone interface with appropriate firewall rules to allow traffic and no combination will allow packets to pass using LACP.
Currently ROUNDROBIN is working fine in bridge mode; however, I would prefer to get it set up using LACP.
It is a bit troubling that simply changing the LAGG protocol to LACP and then back to ROUNDROBIN breaks the system again, requiring me to fuss around with the switch and set two random unused ports as an aggregate before packets will start passing once more.
-
LAGG (LACP) - UniFi Switch (16XG):
ifconfig -v lagg0
Will your UniFi switch work while your pfSense box has a MAC address of 00:00:00:00:00:00 on that LAGG?
-
Is there a way to force the Lag ID? I tried directly setting the MAC Address on lagg0 however lag id stayed all zeros.
-
Yeah, but that might be the switch.
-
On my picture that is the MAC address that I spoofed on my WAN page. My modem is the other end of the LAGG in my case.
I would assume that his case would be similar.. ??
-
@kklouzal said in LAGG (LACP) - UniFi Switch (16XG):
Is there a way to force the Lag ID? I tried directly setting the MAC Address on lagg0 however lag id stayed all zeros.
Make sure the address you are trying does not exist anywhere else in your system.
The other issue I see is that both your ports appear to have the same MAC address. Are you sure your ports are not in some kind of switch mode?
-
The only difference I can see between my output and yours from the image is that LAG ID is all 0's for mine and yours is set.
Both of your ports are using the same MAC Address too
lag id: -------------- 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1
laggport: em0 - 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1
laggport: em1 - 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1
For your setup, I would assume that 00-90-7f-88-b4-2e is the physical address of em0/em1 on PfSense and 02-10-18-3a-41-f1 is the physical address of your modem, each device on both ends have multiple ports on the same adapter so they are sharing a physical address.
Mine is doing the same thing except with the Chelsio card and my UniFi 16XG switch
lag id: ------------------ 00-00-00-00-00-00 - 00-00-00-00-00-00
laggport: cxgbe0 - 98-be-94-12-d5-e0 - b4-fb-e4-50-50-16
laggport: cxgbe1 - 98-be-94-12-d5-e0 - b4-fb-e4-50-50-16
lag id of all 0's is telling me the link is not setting itself up properly. Switching over to ROUNDROBIN allows packets to pass but only after doing that tricky/hacky thing of going over to the switch and setting two unused ports as aggregate, which will kick off the link and get packets moving, then unaggregating those ports.
I'm leaning more towards something being wrong on the UniFi side of things here. I can't find any mention of this problem elsewhere on the Netgate forums or the UniFi forums, so in all reality I probably have something misconfigured. There aren't many dials to turn or switches to flip without digging into the CLI on our switch. LACP should just work out of the box after aggregating two ports on the switch side.
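Since the all-zero lag id is the symptom being tracked here, a small check like the following could make it easier to re-test after each configuration change. This is only a sketch: the function name lag_id_is_zero is made up for illustration, and it assumes the bracketed lag id format that real `ifconfig -v` output uses, where the partner tuple may wrap onto the next line.

```shell
# Read `ifconfig -v laggN` output on stdin.
# Exit status 0 if a "lag id:" entry is present and every field in it is zero
# (LACP has not negotiated); exit status 1 otherwise.
lag_id_is_zero() {
    awk '
        /lag id:/ {
            found = 1
            line = $0
            sub(/.*lag id:/, "", line)
            # The partner tuple may wrap onto the following line.
            if (line !~ /\]/ && (getline nxt) > 0) line = line nxt
            # Keep only hex digits; any non-zero digit means an ID was set.
            gsub(/[^0-9A-Fa-f]/, "", line)
            if (line ~ /[1-9A-Fa-f]/) nonzero = 1
        }
        END {
            if (found && !nonzero) exit 0
            exit 1
        }
    '
}

# Example (lagg0 as in this thread):
#   ifconfig -v lagg0 | lag_id_is_zero && echo "LACP not negotiated"
```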
-
I'm looking at your first picture at the top of the thread here.
That looks strange to me. Both ports should have an HW: address I believe. And they should be different.
-
Two ports in LACP have the same MAC address. It's perfectly normal.
[2.4.4-RELEASE][root@fw]/root: ifconfig -v lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=6500bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 00:08:a2:0a:59:3f
    inet6 fe80::208:a2ff:fe0a:593f%lagg0 prefixlen 64 scopeid 0xb
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    groups: lagg
    laggproto lacp lagghash l2,l3,l4
    lagg options:
        flags=10<LACP_STRICT>
        flowid_shift: 16
    lagg statistics:
        active ports: 2
        flapping: 0
    lag id: [(8000,00-08-A2-0A-59-3F,016B,0000,0000),
         (0001,CC-4E-24-53-94-00,4E21,0000,0000)]
    laggport: igb4 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,00-08-A2-0A-59-3F,016B,8000,0005),
         (0001,CC-4E-24-53-94-00,4E21,0001,0023)]
    laggport: igb5 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,00-08-A2-0A-59-3F,016B,8000,0006),
         (0001,CC-4E-24-53-94-00,4E21,0001,0024)]
[2.4.4-RELEASE][root@fw]/root: ifconfig -v igb4
igb4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=6500bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 00:08:a2:0a:59:3f
    hwaddr 00:08:a2:0a:59:3f
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
[2.4.4-RELEASE][root@fw]/root: ifconfig -v igb5
igb5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=6500bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 00:08:a2:0a:59:3f
    hwaddr 00:08:a2:0a:59:40
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
-
-
OK?
Is that em0 or em1?
What does
ifconfig -v
show for em0 and em1?
-
I was able to get dynamic 802.3ad LACP working between the switch and a Windows 10 machine with no problems at all. The only log entries I can find on pfSense related to this issue are these:
cxgbe0: Interface stopped DISTRIBUTING, possible flapping
cxgbe1: Interface stopped DISTRIBUTING, possible flapping
-
And what does the switch say?
I can get LACP running between my Brocade, Cisco, and D-Link switches with no problems at all. If your experience points to pfSense, mine points to your switch.
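One way to see what each side is actually saying is to watch the LACPDUs on a member port. LACP frames use the Slow Protocols EtherType 0x8809, so a capture filtered on that shows whether the switch is sending anything at all. The sketch below assumes it is run as root on the firewall; the function name capture_lacpdu is hypothetical, and cxgbe0 is simply the member port named earlier in the thread.

```shell
# Print LACP frames seen on one LAGG member port.
# Usage: capture_lacpdu <member-interface>
capture_lacpdu() {
    iface="$1"
    # -e: show the link-level header, so source MAC addresses are visible.
    # -n: do not resolve names.
    # 0x8809 is the Slow Protocols EtherType that carries LACPDUs.
    tcpdump -e -n -i "$iface" 'ether proto 0x8809'
}

# Example:
#   capture_lacpdu cxgbe0
```

If frames show up only with the firewall's own MAC as the source, the switch side is not sending LACPDUs on that port.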
-
I'm not trying to play a who's-at-fault game here, just trying to pin down the issue so it can be corrected.
The only option left to try is a different NIC to see if that changes things. There could be something physically wrong with the card or with the FreeBSD driver being used; it's an older T4 Chelsio adapter. I'll try one of the built-in Intel adapters and report back.