LAGG (LACP) - UniFi Switch (16XG)
-
OK when I asked what interfaces were part of the bridge, that's what I was asking, what are the bridge member interfaces...
I have no idea if adding a lagg as a bridge member even works. You're certainly poking your head into dark corners. I would take the lagg out of the bridge, number it, and see if it behaves in a more consistent manner. As soon as you know that works, add it to the bridge and see if that breaks it. Then you know.
What does
ifconfig -v bridge0
show? And what port is that ssh client plugged into?
You're going to have to describe things in much more detail and with more specificity. Saying "the ethernet cable that allows me to ssh into it" tells us nothing we can act on.
-
I apologize. There are only three connected cables: two SFP+ direct-attach cables running between ports 11 and 12 on the switch and cxgbe0 and cxgbe1 on pfSense. In this configuration no traffic will pass. Adding a third cable (RJ-45) from port 16 on the switch to one of the free ports on pfSense allows us to communicate with the switch and log in via SSH.
I'll try removing the LAGG from the bridge and assigning an IP so we can further narrow down the issue.
-
Not sure why STP is enabled on the lagg0 bridge member. Did you specifically enable that?
-
Yes, however I should leave it disabled since this is the only switch currently in the network topology and having it enabled only makes sense for a larger environment with multiple switches also running the protocol.
I went ahead and removed UniFi_LAGG from the bridge members, assigned it an IP address of 192.168.2.1, enabled DHCP for UniFi_LAGG, removed our 'third ethernet cable', and rebooted the switch. After the switch came online there were no DHCP leases handed out on the 192.168.2 subnet (I was looking for one given to the switch), so I plugged my laptop directly into port 16 on the switch and was unable to receive an IP address. I then manually assigned my laptop an address of 192.168.2.10 and was unable to ping 192.168.2.1.
-
OK so look at the VLAN configuration on the switch.
Did you enable a DHCP server on the lagg interface?
Did you add firewall rules to the lagg interface?
-
Yes, the DHCP server was enabled and the LAGG interface has a firewall rule to permit all traffic. The switch doesn't have any VLANs set up; it should just switch traffic for the subnet it's directly attached to.
The fact that our switch is unable to obtain an IP address from DHCP after removing the LAGG interface from the bridge, directly assigning it an IP address, and then enabling DHCP on that subnet suggests to me there is still a misconfiguration somewhere in the LAGG, on either the pfSense or the UniFi side.
-
Packet capture for UDP 67 on the lagg and see what you see regarding DHCP traffic. Zero idea what is required for the switch itself to obtain DHCP there. It would have to be the management VLAN at least I would assume.
Personally I would be less concerned with that than with whether clients connected to the switch on the lagg VLAN are getting addresses.
-
I unplugged both SFP+ cables making up the LAGG and unplugged the RJ-45 cable connected to my laptop, then started the packet capture. I then plugged the two SFP+ cables back in, followed by the RJ-45, and waited until my laptop stopped trying to identify the network and set itself a 169.254.x.x link-local address. Here are the results:
On the UniFi_LAGG interface DHCP server I have one static lease set up for the UniFi switch to use 192.168.2.80 as its address. The DHCP server lease range is from 192.168.2.100 to 192.168.2.200, and from this I can conclude that the pfSense DHCP server is attempting to assign the two devices their proper addresses. It would seem that communication from the switch back to pfSense is allowed to pass, but traffic from pfSense towards the switch is being blocked? I tried this with net.link.lagg.lacp.default_strict_mode set to 1 and to 0 but it gave the exact same results. I have set the value back to the default of 1.
-
Yes, it looks like the DHCP server on the firewall is receiving the requests and responding.
Is there some sort of DHCP snooping or protection in the switch that might be in play here?
Odd that capture never sees the requests coming in. What did you filter on there?
-
I'm not aware of any security measures in place, nor can I find any security options in the GUI that don't pertain to the UniFi Security Gateway. Even then, those options are all disabled since we don't have that device on our network.
I am still unable to ping pfSense at 192.168.2.1 from my laptop, which is connected through the switch, even after setting a static IP of 192.168.2.50, a default gateway of 192.168.2.1 and a primary DNS of 192.168.2.1. Each ping attempt results in destination host unreachable. I suppose this could be due to the switch not having a proper address at this point?
I'm also curious as to why net.link.lagg.lacp.default_strict_mode at 1 or 0 both yield the same results, shouldn't one of these values make the LAGG completely unusable?
When I did the Packet Capture these were my settings:
-
@kklouzal said in LAGG (LACP) - UniFi Switch (16XG):
I am still unable to ping PfSense at 192.168.2.1 from my laptop which is connected through the switch even after we set a static IP of 192.168.1.50,
Typo? 192.168.2.50?
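The reason the third octet matters can be shown with a small Python sketch. It assumes a /24 mask on the LAGG subnet, which is not stated anywhere in the thread, and `same_subnet` is a hypothetical helper written only for this illustration.

```python
import ipaddress


def same_subnet(host: str, gateway: str, prefix_len: int = 24) -> bool:
    """Return True if host sits in the gateway's subnet.

    prefix_len=24 is an assumption; the thread never states the mask.
    """
    # strict=False lets us build the network from a host address like .1
    net = ipaddress.ip_network(f"{gateway}/{prefix_len}", strict=False)
    return ipaddress.ip_address(host) in net


# Addresses taken from the posts above.
print(same_subnet("192.168.2.50", "192.168.2.1"))  # laptop as intended
print(same_subnet("192.168.1.50", "192.168.2.1"))  # laptop as typed in the quote
```

If the laptop really had been set to 192.168.1.50 it would not be on-link with 192.168.2.1 under that mask, so the ping would fail regardless of the LAGG state.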
-
Don't filter on the IP address. Capture everything on port 67. You are missing the DHCP broadcasts.
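A rough Python sketch of why a host filter hides the requests. It assumes the filter was set to the firewall address 192.168.2.1 (the actual setting is not shown in the thread) and that the client has no lease yet, so it broadcasts from 0.0.0.0 to 255.255.255.255. `Packet`, `matches_host_filter` and `matches_port_filter` are hypothetical names for illustration, not anything pfSense provides, and the reply's destination address is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def matches_host_filter(pkt: Packet, host: str) -> bool:
    """Mimic a 'host <ip>' filter: keep packets to or from that address."""
    return host in (pkt.src_ip, pkt.dst_ip)


def matches_port_filter(pkt: Packet, port: int) -> bool:
    """Mimic a 'port <n>' filter: keep packets with that source or destination port."""
    return port in (pkt.src_port, pkt.dst_port)


# Client without a lease: broadcast from the unspecified address.
discover = Packet("0.0.0.0", "255.255.255.255", 68, 67)
# Server reply from the firewall (destination is an assumed pool address).
offer = Packet("192.168.2.1", "192.168.2.100", 67, 68)

for name, pkt in (("discover", discover), ("offer", offer)):
    print(name,
          "host filter:", matches_host_filter(pkt, "192.168.2.1"),
          "port filter:", matches_port_filter(pkt, 67))
```

Under those assumptions the host filter keeps only the server's reply, while the port-only filter keeps both directions.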
-
Yes it was a typo, whoops :)
I went ahead and removed the Host Address from the filter, still only capturing outgoing requests for some reason.
-
No, anything sourced on port 68 is from the client. Anything sourced from port 67 is from the server.
I would download that into wireshark and look at what DHCP is actually doing. Looks like it's working there to me, but can't see much other than two-way traffic.
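As a quick way to read the capture, here is a minimal Python sketch of that port rule. `dhcp_direction` and `summarize` are hypothetical helpers, and the port list at the bottom is made-up example input, not data from the capture in this thread.

```python
from collections import Counter


def dhcp_direction(src_port: int) -> str:
    """Classify a UDP packet by source port, following the rule above."""
    if src_port == 68:
        return "client -> server"
    if src_port == 67:
        return "server -> client"
    return "not DHCP"


def summarize(src_ports):
    """Count packets per direction for a list of source ports."""
    return Counter(dhcp_direction(p) for p in src_ports)


# Made-up source ports standing in for rows of a capture.
example_ports = [68, 67, 68, 67, 53]
print(summarize(example_ports))
```

Seeing counts in both directions would match the "two-way traffic" reading; a capture showing only one direction would point at the filter or at the path.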
-
I had some free time today and made a new discovery. I went to set up one of our other production servers to use a LAGG connection to this switch. As soon as I enabled ports 1+2 on the UniFi 16XG (the ports connected to this second server) I noticed the status LEDs flashing very rapidly on all connected ports. I immediately unplugged the single ethernet cable running between the 16XG and pfSense and the rapid flashing stopped, yet all devices behind the switch still had access to the internet, which meant the LAGG between the 16XG and pfSense was working!
After our previous troubleshooting session here I left the two direct attach cables connected between PfSense and the 16XG and also tried many different combinations of configurations including playing around with VLANs. Ultimately I ended up with the original configuration when making this post (LAGG assigned to an enabled interface with no ipv4/v6 address set and included in the bridge) however with one difference, I switched the LAGG interface from LACP to ROUNDROBIN.
As soon as I enabled a different set of ports on the switch as an aggregate, the ports I had originally set up as an aggregate (11+12) came to life!
At this point I thought maybe the UniFi configuration had never been applied, and that aggregating a different set of ports had somehow finally enabled the original configuration. I then switched from ROUNDROBIN back to LACP, but then again we stopped passing packets between the switch and firewall. I rebooted the switch, rebooted the firewall, and switched net.link.lagg.lacp.default_strict_mode between 0 and 1, rebooting the switch and firewall each time, and still no packets would pass.
I finally decided to go back to ROUNDROBIN but packets still would not pass! I proceeded to reboot everything again and still no packets passed! Finally I went back to the UniFi controller and once again enabled a different set of ports as an aggregate, and once again packets started passing!
I thought to myself again, maybe after switching back to LACP and trying this trick to enable a different set of ports would kick things off, unfortunately it did not.
So I'm at a loss here for what's happening. You would think there is an issue with the configuration being applied on the UniFi controller however simply switching from ROUNDROBIN over to LACP then back to ROUNDROBIN forces us to use the trick again to get packets passing.
-
Have you tried setting a MAC address on the interface page?
Seems I have to do this to make my WAN work on my MB8600 modem. I spun my wheels for about an hour until I did so.
-
I have never personally seen a switch that did not work correctly with pfSense LACP.
That said, I have never used a Ubiquiti switch.
I have not seen any reports that it does not work properly.
Are you still messing around with the bridge here? Possible you created a layer 2 loop.
-
I've tried every troubleshooting step with the LAGG in a bridge and as a standalone interface with appropriate firewall rules to allow traffic and no combination will allow packets to pass using LACP.
Currently ROUNDROBIN is working fine and in bridge mode however I would prefer to get it setup using LACP.
It is a bit troubling that simply changing the LAGG Protocol to LACP then back to ROUNDROBIN breaks the system again requiring me to fuss around with the switch and set two random unused ports as aggregate before packets will start passing once more.
-
LAGG (LACP) - UniFi Switch (16XG):
ifconfig -v lagg0
Will your UniFi switch work while your pfSense box has a MAC address of 00:00:00:00:00:00 on that LAGG?
Yours-
Mine-
-
Is there a way to force the LAG ID? I tried directly setting the MAC address on lagg0, but the lag id stayed all zeros.
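For comparing outputs after each change, a small Python sketch that pulls the system MACs out of the lag id section of `ifconfig -v lagg0`. The layout it expects (a `lag id:` line holding tuples of the form `(priority,MAC,key,...)`, followed by `laggport:` lines) is an assumption about the ifconfig output format, the `SAMPLE` text is invented for the example rather than copied from either box, and the helper names are hypothetical.

```python
import re

# Matches the start of a tuple such as "(8000,00-00-00-00-00-00," (assumed layout).
TUPLE_RE = re.compile(
    r"\(([0-9A-Fa-f]{4}),((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2}),"
)


def lag_id_system_macs(ifconfig_output: str) -> list:
    """Return the MACs found between 'lag id:' and the first 'laggport:'."""
    start = ifconfig_output.find("lag id:")
    if start == -1:
        return []
    end = ifconfig_output.find("laggport:", start)
    section = ifconfig_output[start:end if end != -1 else None]
    return [m.group(2) for m in TUPLE_RE.finditer(section)]


def has_zero_system_mac(ifconfig_output: str) -> bool:
    """True if any tuple in the lag id section carries an all-zero MAC."""
    return any(mac == "00-00-00-00-00-00"
               for mac in lag_id_system_macs(ifconfig_output))


# Invented sample text, shaped like the assumed format.
SAMPLE = """\
\tlaggproto lacp
\tlag id: [(8000,00-00-00-00-00-00,0000,0000,0000),
\t\t (0000,00-00-00-00-00-00,0000,0000,0000)]
\tlaggport: cxgbe0 flags=0<>
"""

print(lag_id_system_macs(SAMPLE))
print(has_zero_system_mac(SAMPLE))
```

Running it against saved output before and after setting the MAC on lagg0 would show whether the lag id section actually changed.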