LAGG (LACP) - UniFi Switch (16XG)
-
OK so look at the VLAN configuration on the switch.
Did you enable a DHCP server on the lagg interface?
Did you add firewall rules to the lagg interface?
-
Yes, the DHCP server was enabled and the LAGG interface has a firewall rule to permit all traffic. The switch doesn't have any VLANs set up; it should just switch traffic for the subnet it's directly attached to.
The fact that our switch is unable to obtain an IP address from DHCP after removing the LAGG interface as a bridge member, directly assigning it an IP address, then enabling DHCP on that subnet tells me there is still a misconfiguration in the LAGG interface somewhere, either on the PfSense or UniFi side?
-
Packet capture for UDP 67 on the lagg and see what you see regarding DHCP traffic. Zero idea what is required for the switch itself to obtain DHCP there. It would have to be the management VLAN at least I would assume.
Personally I would be less concerned with that than I would be with clients connected to the switch on the lagg VLAN getting addresses.
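For anyone who prefers the shell over the GUI capture page, here is a rough sketch of the equivalent capture command, built in Python so the pieces are easy to see. The interface name `lagg0` and the packet count are assumptions for illustration; the filter has no host address so that broadcast DHCP traffic is included.

```python
import shlex


def dhcp_capture_command(interface: str, packet_count: int = 50) -> list[str]:
    """Build a tcpdump argv that captures DHCP traffic on one interface.

    -n  : do not resolve names
    -e  : print link-layer (MAC) headers
    -i  : interface to capture on
    -c  : stop after this many packets
    The filter "udp port 67" matches packets with source OR destination
    port 67, so it catches both directions of a DHCP exchange.
    """
    return [
        "tcpdump", "-n", "-e",
        "-i", interface,
        "-c", str(packet_count),
        "udp port 67",
    ]


# "lagg0" is an assumed interface name; substitute your own.
print(shlex.join(dhcp_capture_command("lagg0")))
```

The printed line can be pasted into a shell on the firewall.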
-
I unplugged both SFP+ cables making up the LAGG and the RJ-45 cable connected to my laptop, then started the packet capture. I then plugged the two SFP+ cables back in, and finally the RJ-45, and waited until my laptop stopped trying to identify the network and set itself a 169.254.x.x (self-assigned) address. Here are the results:
On the UniFi_LAGG interface DHCP server I have 1 static lease set up for the UniFi switch to use 192.168.2.80 as its address. The DHCP server lease range is from 192.168.2.100 to 192.168.2.200, and from this I can conclude that the PfSense DHCP server is attempting to assign the two devices their proper addresses. It would seem that communication from the switch back to PfSense is allowed to pass, however traffic from PfSense towards the switch is being blocked?
I tried this with net.link.lagg.lacp.default_strict_mode set to 1 and 0 but it gave the exact same results. I have again set the value back to the default of 1.
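To make the address reasoning above concrete, here is a small sketch that classifies an address against the pool described in this post. The pool bounds and the static lease come from the post; the sample link-local and pool addresses are made up for illustration.

```python
import ipaddress

# Dynamic pool as described in the post.
POOL_START = ipaddress.ip_address("192.168.2.100")
POOL_END = ipaddress.ip_address("192.168.2.200")


def describe(addr: str) -> str:
    """Say roughly where an address came from, given the pool above."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        # 169.254.0.0/16: the client gave up on DHCP and self-assigned.
        return "link-local (no DHCP lease obtained)"
    if POOL_START <= ip <= POOL_END:
        return "dynamic pool"
    return "outside pool (static lease or manual)"


print(describe("192.168.2.80"))    # the switch's static lease
print(describe("169.254.10.20"))   # hypothetical self-assigned address
print(describe("192.168.2.150"))   # hypothetical pool address
```

An offer for an address in the pool showing up in the capture while the client still ends up link-local suggests the reply is not reaching the client.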
-
Yes, it looks like the DHCP server on the firewall is receiving the requests and responding.
Is there some sort of DHCP snooping or protection in the switch that might be in play here?
Odd that capture never sees the requests coming in. What did you filter on there?
-
I'm not aware of any security measures in place, nor can I find any options in the GUI about security that don't pertain to the UniFi Security Gateway. Even then, those options are all disabled since we don't have that specific device present on our network.
I am still unable to ping PfSense at 192.168.2.1 from my laptop which is connected through the switch even after we set a static IP of 192.168.2.50, Default Gateway of 192.168.2.1 and Primary DNS of 192.168.2.1, each ping attempt results in destination host unreachable. I suppose this could be due to the switch not having a proper address at this point?
I'm also curious as to why net.link.lagg.lacp.default_strict_mode at 1 or 0 both yield the same results, shouldn't one of these values make the LAGG completely unusable?
When I did the Packet Capture these were my settings:
-
@kklouzal said in LAGG (LACP) - UniFi Switch (16XG):
I am still unable to ping PfSense at 192.168.2.1 from my laptop which is connected through the switch even after we set a static IP of 192.168.1.50,
Typo? 192.168.2.50?
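The reason the third octet matters: a quick check of whether the laptop's static address is on the same subnet as the gateway. The /24 prefix length is an assumption here; it was not stated in the thread.

```python
import ipaddress


def same_subnet(host: str, gateway: str, prefix_len: int = 24) -> bool:
    """Return True if host falls inside the gateway's subnet.

    prefix_len=24 is an assumed mask (255.255.255.0).
    """
    network = ipaddress.ip_network(f"{gateway}/{prefix_len}", strict=False)
    return ipaddress.ip_address(host) in network


print(same_subnet("192.168.2.50", "192.168.2.1"))  # True
print(same_subnet("192.168.1.50", "192.168.2.1"))  # False
```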
-
Don't filter on the IP address. Capture everything on port 67. You are missing the DHCP broadcasts.
-
Yes it was a typo, whoops :)
I went ahead and removed the Host Address from the filter, still only capturing outgoing requests for some reason.
-
No, anything sourced on port 68 is from the client. Anything sourced from port 67 is from the server.
I would download that into wireshark and look at what DHCP is actually doing. Looks like it's working there to me, but can't see much other than two-way traffic.
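A tiny sketch of the port rule above, in case it helps when reading the capture. The function name is made up for illustration.

```python
def dhcp_sender(src_port: int) -> str:
    """Classify who sent a DHCP packet from its UDP source port."""
    if src_port == 68:
        return "client"
    if src_port == 67:
        return "server"
    return "not DHCP"


# A line like "0.0.0.0.68 > 255.255.255.255.67" is a client broadcast;
# a line like "192.168.2.1.67 > ..." is the server replying.
print(dhcp_sender(68))
print(dhcp_sender(67))
```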
-
I had some free time today and made a new discovery. I went to set up one of our other production servers to use a LAGG connection to this switch. As soon as I enabled ports 1+2 on the UniFi 16XG (the ports connected to this second server), I noticed the status LEDs flashing very rapidly on all connected ports. I immediately unplugged the single ethernet cable running between the 16XG and PfSense and the rapid flashing stopped. However, all devices behind the switch still had access to the internet, which meant the LAGG between the 16XG and PfSense was working!
After our previous troubleshooting session here I left the two direct attach cables connected between PfSense and the 16XG and also tried many different combinations of configurations including playing around with VLANs. Ultimately I ended up with the original configuration when making this post (LAGG assigned to an enabled interface with no ipv4/v6 address set and included in the bridge) however with one difference, I switched the LAGG interface from LACP to ROUNDROBIN.
As soon as I set a different pair of ports on the switch as an aggregate, the original ports I had set up as an aggregate (11+12) came to life!
At this point I thought maybe the UniFi configuration had never been applied and that aggregating a different set of ports had finally activated the original configuration. I then switched from ROUNDROBIN back to LACP, but then again we stopped passing packets between the switch and firewall. I rebooted the switch, rebooted the firewall, and switched net.link.lagg.lacp.default_strict_mode between 0 and 1, rebooting the switch and firewall each time, and still no packets would pass.
I finally decided to go back to ROUNDROBIN but packets still would not pass! I proceeded to reboot everything again and still no packets passed! Finally I went back to the UniFi controller and once again enabled a different set of ports to be aggregate and once again packets started passing!
I thought to myself again, maybe after switching back to LACP and trying this trick to enable a different set of ports would kick things off, unfortunately it did not.
So I'm at a loss here for what's happening. You would think there is an issue with the configuration being applied on the UniFi controller however simply switching from ROUNDROBIN over to LACP then back to ROUNDROBIN forces us to use the trick again to get packets passing.
-
Have you tried setting a MAC address on the interface page?
Seems I have to do this to make my WAN work on my MB8600 modem. I spun my wheels for about an hour until I did so.
-
I have never personally seen a switch that did not work correctly with pfSense LACP.
That said, I have never used a Ubiquiti switch.
I have not seen any reports that it does not work properly.
Are you still messing around with the bridge here? Possible you created a layer 2 loop.
-
I've tried every troubleshooting step with the LAGG in a bridge and as a standalone interface with appropriate firewall rules to allow traffic and no combination will allow packets to pass using LACP.
Currently ROUNDROBIN is working fine and in bridge mode however I would prefer to get it setup using LACP.
It is a bit troubling that simply changing the LAGG Protocol to LACP then back to ROUNDROBIN breaks the system again requiring me to fuss around with the switch and set two random unused ports as aggregate before packets will start passing once more.
-
LAGG (LACP) - UniFi Switch (16XG):
ifconfig -v lagg0
Will your UniFi switch work while your pfSense box has a MAC address of 00:00:00:00:00:00 on that LAGG?
Yours-
Mine-
-
Is there a way to force the Lag ID? I tried directly setting the MAC Address on lagg0 however lag id stayed all zeros.
-
Yeah, but that might be the switch.
-
On my picture that is the MAC address that I spoofed on my WAN page. My modem is the other end of the LAGG in my case.
I would assume that his case would be similar.. ??
-
@kklouzal said in LAGG (LACP) - UniFi Switch (16XG):
Is there a way to force the Lag ID? I tried directly setting the MAC Address on lagg0 however lag id stayed all zeros.
Make sure the address you are trying does not exist anywhere else in your system..
The other issue I see is that both your ports appear to have the same MAC address.. Are you sure your ports are not in some kind of switch mode?
-
The only difference I can see between my output and yours from the image is that LAG ID is all 0's for mine and yours is set.
Both of your ports are using the same MAC Address too
lag id: -------------- 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1
laggport: em0 - 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1
laggport: em1 - 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1
For your setup, I would assume that 00-90-7f-88-b4-2e is the physical address of em0/em1 on PfSense and 02-10-18-3a-41-f1 is the physical address of your modem. Each device has multiple ports on the same adapter, so the ports on each end are sharing a physical address.
Mine is doing the same thing, except with the Chelsio card and my UniFi 16XG switch:
lag id: ------------------ 00-00-00-00-00-00 - 00-00-00-00-00-00
laggport: cxgbe0 - 98-be-94-12-d5-e0 - b4-fb-e4-50-50-16
laggport: cxgbe1 - 98-be-94-12-d5-e0 - b4-fb-e4-50-50-16
A lag id of all 0's is telling me the link is not setting itself up properly. Switching over to ROUNDROBIN allows packets to pass, but only after doing that hacky thing of going over to the switch and setting two unused ports as aggregate, which kicks off the link and gets packets moving, then unaggregating those ports.
I'm leaning more towards something being wrong on the UniFi side of things here. I can't find mention of this problem anywhere else on the Netgate forums or UniFi forums, so in all reality I probably have something misconfigured. There aren't many dials to turn and switches to flip without digging into the CLI on our switch. LACP should just work out of the box after aggregating two ports on the switch side.
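For what it's worth, the all-zeros check I'm doing by eye can be sketched like this. It parses the simplified "lag id:" summary lines as I typed them above, not the raw `ifconfig -v lagg0` output, and the function name is made up for illustration.

```python
import re

# Dash-separated MAC address, e.g. 98-be-94-12-d5-e0
MAC_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}-){5}[0-9a-fA-F]{2}\b")
ZERO_MAC = "00-00-00-00-00-00"


def lag_id_is_all_zero(line: str) -> bool:
    """True if the line contains MACs and every one of them is all zeros."""
    macs = MAC_RE.findall(line)
    return bool(macs) and all(mac == ZERO_MAC for mac in macs)


mine = "lag id: ------------------ 00-00-00-00-00-00 - 00-00-00-00-00-00"
yours = "lag id: -------------- 00-90-7f-88-b4-2e & 02-10-18-3a-41-f1"
print(lag_id_is_all_zero(mine))   # True
print(lag_id_is_all_zero(yours))  # False
```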