2100 DHCP VLAN configuration
-
Hey folks,
I have followed several guides about how to go about setting up VLANs with the 2100 (updated to version 24.11) and its built-in LAN switch ports, and as far as I can tell I've got everything configured correctly - but I'm having some trouble getting devices to get addresses via DHCP.
The symptom is that the client never gets a DHCP address. In the DHCP server logs, I can see the server regularly receiving DHCPREQUESTs on the configured VLAN interface and returning a DHCPOFFER, but those packets apparently never make it back through the switch to the client.
My goal here is to have each switch port drop untagged traffic (or dump it into a VLAN that we don't listen on or care about, which I think is what we get with the default VLAN 1 as long as you don't configure the LAN interface?), but support ~4 tagged VLANs with different non-overlapping subnets so that I can control routing between them with firewall rules. In my screenshots later you'll no doubt notice some of the rules I already applied...I can reapply those easily with Ansible so if I need to nuke them for debugging that's fine but I'd rather just shove an allow-all at the top to avoid that.
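For the "non-overlapping subnets" part of that goal, a quick sanity check can be sketched in Python. This is only an illustration: the VLAN names and the 192.0.2.x ranges below are placeholders from the documentation address block, not the real subnets from this setup, and `overlapping_pairs` is a hypothetical helper name.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(subnets):
    """Return (name, name) pairs whose subnets overlap.

    `subnets` maps a VLAN name to a CIDR string. The names and ranges
    used with this helper here are placeholders, not real config.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Placeholder plan: four VLANs carved out of the documentation range.
planned = {
    "MANAGEMENT": "192.0.2.0/26",
    "OFFICE": "192.0.2.64/26",
    "GUEST": "192.0.2.128/26",
    "IOT": "192.0.2.192/26",
}
print(overlapping_pairs(planned))
```

An empty list means no two planned subnets collide, so routing between them can be controlled purely with firewall rules.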
Reading the docs, I don't think I want to change the port VID in the switch config - my reading of that value is that this would be how you would separate the switch ports out to be separate from each other and I don't really want that, I just want these ports to pretty much all be the same config, so having them all VID 1 seems fine...but maybe that's where I'm going wrong? I don't know.
Here are a few screenshots of what I hope are all of the relevant configuration...
First, the switch VLAN config:
Next the VLAN definitions for the router:
Next the firewall rules for LAN and MANAGEMENT interfaces (LAN being, as far as I can tell, the switch uplink to the router SoC, mvneta1). I think these temporary allow-all rules should make sure we're good to go passing all traffic.
LAN:
MANAGEMENT:
Lastly here's what the DHCP server logs say, this is repeated every few seconds but it's always the same 2 lines:
Anyway I'm sure I'm just missing something, but unfortunately the docs for the switch ports on the 2100 are far from comprehensive and I don't see any obvious explanation as to what I'm missing. Please help!
-
@thalin VLAN200 will leave the 2100 tagged on all ports, so on the other end you need either a switch that can handle VLANs or a client that is configured to handle VLAN200 tagged traffic.
What does the network look like after the 2100? Can you draw a simple diagram, like "internet - 2100 - VLAN-aware-switch - clients"?
And can you post a picture of the switch port configuration?
-
Yup what do you have attached to the 2100 ports? It needs to be something that can handle the tagged traffic.
If you have left the PVID as 1 on all ports, though, the traffic must be arriving tagged as 200, since the DHCP discover traffic has made it through.
-
Hey folks, thanks for the response! I appreciate your willingness to respond and maybe help!
I apologize for not being a bit more detailed here. I thought it was clear that since the 2100 is getting the DHCP request on the management VLAN interface (mvneta1.200), the client device was indeed requesting an address using that VLAN tag.
For context, the client is a Unifi switch (US-8-60W), connected from its port 1 to LAN port 1 on the 2100, with the "Network Override" option on the US-8-60W set to the correct VLAN. This is the setting which tells the Unifi device which VLAN tag to request its management address on. Thus, the DHCP request to the 2100 is tagged on the correct network (not untagged, which would be the default behavior for these devices).
This is how I do most of my Unifi gear in other places (this works great with a Netgate 6100, but it has native ports, not a switch, which I think is the complication that I'm not understanding and getting correct here) even though it's probably unnecessary to have a separate management network. In this case it's also standing in for a more necessary requirement to have some traffic split off because of actual business requirements. Once I get this working I'll be able to replicate it for the other VLANs which will be needed.
So to explicitly state the problem I'm having:
- the traffic is tagged on the way in from the client
- the 2100 gets the request on the tagged vlan
- but the replies never make it back to the client for some reason I don't understand.
Hope that helps clarify the situation!
-
Also here are the requested diagrams/screenshots:
Switch port config:
Network diagram (it really is this simple right now, I'm configuring it through a Wireguard link so there is literally nothing else plugged into the device, and nothing plugged into the US-8-60W - the switch itself is the only client):
-
Ah so the DHCP client here is the switch itself? Hmm.
As you say it appears the dhcp replies never reach the client. Hard to see how that could happen though.
Try running a pcap on mvneta1, including tagged packets, for dhcp traffic on udp ports 67 and 68. Make sure it is actually sending the replies.
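If it helps when going through the capture, here is a rough Python sketch of what "tagged DHCP traffic" looks like at the byte level: an optional 802.1Q tag after the MAC addresses, then IPv4, then UDP on port 67 or 68. It only uses the standard library and works on a raw Ethernet frame. `describe_dhcp_frame` and `build_sample_frame` are made-up helper names, and the sample frame is synthetic, not taken from any capture in this thread.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
DHCP_PORTS = {67, 68}


def describe_dhcp_frame(frame: bytes):
    """Return (vlan_id, src_port, dst_port) for a DHCP-over-IPv4 frame, else None.

    vlan_id is None when the frame carries no 802.1Q tag.
    """
    if len(frame) < 14:
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    offset = 14
    vlan_id = None
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            return None
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan_id = tci & 0x0FFF  # low 12 bits of the tag are the VLAN ID
        offset = 18
    if ethertype != ETHERTYPE_IPV4 or len(frame) < offset + 20:
        return None
    ihl = (frame[offset] & 0x0F) * 4  # IPv4 header length in bytes
    if frame[offset + 9] != IPPROTO_UDP or len(frame) < offset + ihl + 4:
        return None
    src_port, dst_port = struct.unpack("!HH", frame[offset + ihl:offset + ihl + 4])
    if src_port not in DHCP_PORTS and dst_port not in DHCP_PORTS:
        return None
    return vlan_id, src_port, dst_port


def build_sample_frame(vlan_id, src_port, dst_port):
    """Build a minimal synthetic frame (dummy addresses, no DHCP payload)."""
    macs = b"\xff" * 6 + b"\x02" + b"\x00" * 5
    tag = b"" if vlan_id is None else struct.pack("!HH", ETHERTYPE_VLAN, vlan_id)
    ip_header = bytes([0x45]) + b"\x00" * 8 + bytes([IPPROTO_UDP]) + b"\x00" * 10
    udp_header = struct.pack("!HHHH", src_port, dst_port, 8, 0)
    return macs + tag + struct.pack("!H", ETHERTYPE_IPV4) + ip_header + udp_header


# A synthetic server-to-client frame tagged with VLAN 200.
print(describe_dhcp_frame(build_sample_frame(200, 67, 68)))
```

A reply that comes back with `vlan_id` of `None` would be leaving untagged, which a client expecting VLAN 200 would ignore.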
-
Yep this is one of the first things I did, so I have a pcap from pretty early on, before I posted this thread. I can definitely see the offer going out from the 2100. I will see if I can get a pcap from the client perspective today. Anyway, here's a screenshot from the pcap I did in pfSense:
-
Are those redacted addresses public IPs?
Are the MAC addresses correct in the replies?
Is the VLAN tagging correct?
The client is not the switch itself then?
-
@stephenw10 said in 2100 DHCP VLAN configuration:
Are those redacted addresses public IPs?
Nah, just don't want to spew private ip blocks around on the internet (I know, paranoid and probably unnecessary).
@stephenw10 said in 2100 DHCP VLAN configuration:
Are the MAC addresses correct in the replies?
Yes the MAC addresses seem correct. On one of the DHCP Offer packets, the destination MAC is the MAC I see in the Unifi UI for the client device, and the source is the MAC for mvneta1 on the 2100.
@stephenw10 said in 2100 DHCP VLAN configuration:
Is the VLAN tagging correct?
This capture was done on the VLAN interface. I will have to go do another capture on the LAN interface of the 2100 to see if the VLAN tagging is correct - though I assume it is at least for the traffic to be showing up there... I'll do another capture anyway to make sure that the replies are actually tagged.
EDIT: Yep, capture on the parent interface confirms that the offer packet is tagged on the correct VLAN:
@stephenw10 said in 2100 DHCP VLAN configuration:
The client is not the switch itself then?
Incorrect, the client is currently the US-8-60W Unifi switch. I was trying to say that I could swap out the switch as client and get another computer to do captures on the client side since afaik I can't do a client-side pcap from the switch.
-
Hmm, well I guess I would test a client that isn't the switch just in case it has some quirk that prevents it using a VLAN correctly for management. But if you used that same setup on a 6100 it should work here too.
Otherwise a pcap at the client will at least show if it reaches it.
-
@stephenw10 hello again, sorry for the pause. I went off to work on other things for a while, but now I'm back with a bit better test setup which is leaving me ultimately more confused than when I left off (spoiler alert lol).
A bit of a summary refresher on the setup here since it's been so long:
- Netgate 2100 running 24.11, configured to have VLAN 200 tagged on all 4 switch ports as well as port 5, the uplink port.
- The 2100 has a VLAN 200 interface defined (named MANAGEMENT, in case I refer to it as the management network or something later on accidentally) with an IP address of 10.XX.2.1/24
- The 2100 has DHCP configured on the VLAN 200 interface set to give out IPs in the range of 10.XX.2.100/24 - 10.XX.2.254/24
- Unifi US-8-60W PoE switch, with switch port 1 plugged into LAN port 1 of the 2100.
- This switch is configured to use VLAN 200 as its management network interface, and configured to DHCP to get its address.
I could try static IP for the switch management interface, but it would be much more painful to reconfigure it since I am going back and forth between networks to make configuration changes to the switch right now and my local network even though it has the same VLANs, has different IP ranges. The Unifi controller is on my local network (and will be available via VPN tunnel once it's actually on the 2100's network successfully).
New tests
So I have a new mini-pc that has yet to be installed with an actual OS for real usage, so I decided to boot up a Kali live-cd to do some easier pcaps than I was able to do before. I also configured the switch to set up port mirroring for port 1 so I could see what the switch sees when using wireshark on Kali.
A few tests.
Client PC instead of US-8-60W
Kali when plugged into the same port with the same configuration as the switch is able to get a DHCP address just fine on VLAN 200. So there's definitely some incompatibility/configuration problem between the US-8-60W & the 2100.
Mirrored port on US-8-60W
So next I set up port mirroring on the switch so I could get the DHCP conversation from the client switch side. I plugged in the Kali machine to port 3 and set up Wireshark to listen to all traffic on eth0 on that machine so I could just see whatever the switch sees, both tagged and untagged traffic.
Unfortunately for my sanity, the switch does see the DHCP Offer packet from the 2100. I have no idea why the switch isn't doing anything with it. This is especially confusing since it seems to work perfectly well on my home network, using the same VLAN - just with a different IP subnet. Literally just plug the exact same port on the US-8-60W into one of my Unifi switches configured to tag VLAN 200, and my home Netgate 4100 gives it an IP address on VLAN 200 with the correct IP subnet for that network, which it accepts right away.
FWIW I also tried this with another port on the client switch (the management interface is virtual and can attach to any physical port, so it shouldn't matter which port is the uplink to the 2100) and got as far as I can tell the same behavior.
What's Next
I'm a bit at a loss as to what to do next. I have a second US-8-60W device here for another site that I am going to try to provision the same way and see if it has the same behavior. I may also try a Netgate 1100 I have for this other site to see if it exhibits the same behavior as the 2100. Maybe it's just a weird compatibility thing between these two devices, or something. I have no idea at this point.
-
Ok that's some good testing.
So you see the switch send the DHCP request and it is tagged 200?
And you see pfSense reply and that is also tagged?
Try using another client connected to the switch that pulls a lease whilst pcapping the mirror port. There must be some difference between the switch and another client.
It could be that the switch requires something additional. For example, we have seen some ISPs that only respond to DHCP requests when given the right priority tag, or that send a priority tag that causes the replies to be dropped.
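To compare the priority bits between the request and the reply in a capture, the 16-bit tag value in the 802.1Q header can be split into its three fields: 3 bits of priority (PCP), 1 drop-eligible bit (DEI), and the 12-bit VLAN ID. A small Python sketch of that follows. `split_tci` and `tag_differences` are hypothetical helper names, and the example values are made up for illustration, not read from the captures in this thread.

```python
def split_tci(tci: int):
    """Split a 16-bit 802.1Q Tag Control Information value into (pcp, dei, vid)."""
    pcp = (tci >> 13) & 0x7  # top 3 bits: priority code point
    dei = (tci >> 12) & 0x1  # next bit: drop eligible indicator
    vid = tci & 0x0FFF       # low 12 bits: VLAN ID
    return pcp, dei, vid


def tag_differences(request_tci: int, reply_tci: int):
    """Return {field: (request_value, reply_value)} for fields that differ."""
    names = ("pcp", "dei", "vid")
    request_fields = split_tci(request_tci)
    reply_fields = split_tci(reply_tci)
    return {
        name: (req, rep)
        for name, req, rep in zip(names, request_fields, reply_fields)
        if req != rep
    }


# Made-up example: request sent with priority 6 on VLAN 200,
# reply sent with priority 0 on the same VLAN.
print(tag_differences(0xC0C8, 0x00C8))
```

If the only difference between a working and a non-working exchange shows up in `pcp`, that would point at priority tagging rather than the VLAN ID.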