DHCP failing when moving between APs
Right now we have pfSense running on an old computer, acting as a firewall, DHCP server, FreeRADIUS server (with external database) and so forth.
It's been running smoothly for all the time I've been in the company. However, since we switched from AeroHive APs to UniFi UAP-Pro APs, we're experiencing DHCP issues on some devices. It's not always the same devices, and it's not always on the same AP.
We increased the number of APs from 3 to 5, but I'm having a hard time seeing why this should cause issues.
I've checked the DHCP range on both of the wireless network VLANs, and neither is close to being completely used.
They're running on the following ranges:
10.20.0.10 - 10.20.0.254
10.80.0.10 - 10.80.0.254
The heaviest load I've seen on them so far is 76 on VLAN 80 and 45 on VLAN 20 at the same time. But this shouldn't even be close to max capacity, as far as I know?
Does anyone have an idea of what I need to check to make sure everything is set up correctly, or maybe how to increase the capacity of the DHCP server?
What is in the DHCP server logs? Is it even seeing the DISCOVERs and REQUESTs of the clients that are failing? When you get one that fails, note the time and the MAC address of the device and check the logs.
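If the log is long, a small script can pull out everything logged for one MAC. This is only a sketch: it assumes the usual ISC dhcpd wording (DHCPDISCOVER, DHCPOFFER, DHCPREQUEST, DHCPACK, DHCPNAK), the helper name `dhcp_events_for_mac` is made up for illustration, and the sample lines are invented, not taken from any real firewall.

```python
import re

# Message types dhcpd logs during a lease exchange (assumed ISC dhcpd wording).
DHCP_EVENT = re.compile(r"\b(DHCPDISCOVER|DHCPOFFER|DHCPREQUEST|DHCPACK|DHCPNAK)\b")


def dhcp_events_for_mac(log_lines, mac):
    """Return the DHCP message types logged for one client MAC, in log order."""
    mac = mac.lower()
    events = []
    for line in log_lines:
        if mac not in line.lower():
            continue  # line belongs to another client
        match = DHCP_EVENT.search(line)
        if match:
            events.append(match.group(1))
    return events


# Illustrative lines only; interface names and addresses are hypothetical.
sample_log = [
    "Jan  1 10:00:01 dhcpd: DHCPDISCOVER from aa:bb:cc:00:00:01 via em1_vlan20",
    "Jan  1 10:00:02 dhcpd: DHCPOFFER on 10.20.0.15 to aa:bb:cc:00:00:01 via em1_vlan20",
    "Jan  1 10:00:02 dhcpd: DHCPREQUEST for 10.20.0.15 from aa:bb:cc:00:00:01 via em1_vlan20",
    "Jan  1 10:00:02 dhcpd: DHCPACK on 10.20.0.15 to aa:bb:cc:00:00:01 via em1_vlan20",
    "Jan  1 10:00:05 dhcpd: DHCPDISCOVER from aa:bb:cc:00:00:02 via em1_vlan80",
]

working = dhcp_events_for_mac(sample_log, "AA:BB:CC:00:00:01")
missing = dhcp_events_for_mac(sample_log, "aa:bb:cc:00:00:03")
print("working client:", working)
print("failing client:", missing)
```

An empty list for a failing client means the server never logged a DISCOVER or REQUEST from it, which points away from dhcpd and toward whatever sits between the client and the firewall.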
I ran pfSense DHCP with 650 APs and thousands of simultaneous clients. Looking at pfSense for the problem is probably not where the solution will be found (other than using its debugging tools like packet capture and logging).
Increasing the number of APs should not matter to DHCP. It's only concerned with the number of clients with leases.
You do have to be sure that your DHCP pools are large enough to accommodate the device churn and lease time. If you have devices coming and going a lot you might need to increase the pool size or decrease the lease time. You didn't state that you were using captive portal but that also plays into this formula.
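As a rough check using the ranges and peak client counts given in the first post, the pool arithmetic can be sketched like this. The helper name `pool_size` is just for illustration, and the peak figures are connected clients, so the number of leases actually held could be higher if devices leave before their lease expires.

```python
import ipaddress


def pool_size(first, last):
    """Number of addresses in an inclusive DHCP range."""
    return int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1


# (first address, last address, peak clients seen) as stated in the thread.
pools = {
    "VLAN 20": ("10.20.0.10", "10.20.0.254", 45),
    "VLAN 80": ("10.80.0.10", "10.80.0.254", 76),
}

utilization = {}
for name, (first, last, peak) in pools.items():
    size = pool_size(first, last)
    percent = round(100 * peak / size, 1)
    utilization[name] = (size, percent)
    print(f"{name}: {peak} of {size} addresses at peak ({percent}%)")
```

Each range holds 245 addresses, so both pools peak at well under half full.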
There is no REQUEST when the device fails to get an IP assigned.
However, if I set my device to use a known IP in the same range it connects right away.
The error is mainly happening on two APs that are in the same area of the building.
I've tried swapping the APs around and the error still happens in the same spot, now on the other APs. But at the same time, some clients can connect to the network right away on the same AP, so I'm having a hard time seeing exactly where the problem is.
We don't have a lot of guests in the house, so it's mainly the same clients connecting to the different APs.
Right now the lease expires in 10 minutes on both VLANs. But given that we don't even have 250 devices in total, I don't see how it should be a problem, especially when they're split across two VLANs.
None of the networks are using a captive portal, but VLAN 20 is using FreeRADIUS authentication.
"Roaming" between APs does not require a new DHCP lease. The client just continues to use the same lease.
If pfSense is not even seeing a DISCOVER/REQUEST when one is necessary you need to look at the layer 2 - the controller, switches, and APs to see what's going on there.
dhcpd cannot answer a request it never receives.
I would capture on a SPAN/Mirror port on the switch to see if the AP is sending the DHCP DISCOVER/REQUEST from the client.
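When reading such a capture, the thing to look for is DHCP option 53 (message type) in the UDP port 67/68 traffic from the failing client. A minimal sketch of pulling the client MAC and message type out of a raw DHCP payload is below. It assumes you already have the UDP payload bytes (reading the capture file is not shown), and `parse_dhcp` and `build_sample` are hypothetical helpers; the sample packet is synthetic.

```python
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
MESSAGE_TYPES = {1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
                 5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM"}


def parse_dhcp(payload):
    """Return (client_mac, message_type) from a DHCP UDP payload, or None."""
    # Fixed BOOTP header is 236 bytes, followed by the 4-byte magic cookie.
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC_COOKIE:
        return None
    hlen = payload[2]  # hardware address length, 6 for Ethernet
    mac = ":".join(f"{b:02x}" for b in payload[28:28 + min(hlen, 16)])
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break  # truncated option
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            return mac, MESSAGE_TYPES.get(value[0], "UNKNOWN")
        i += 2 + length
    return mac, None


def build_sample(mac_bytes, msg_type):
    """Build a synthetic client DHCP payload for testing the parser."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,            # op=BOOTREQUEST, htype=Ethernet, hlen, hops
        0x12345678, 0, 0x8000,  # xid, secs, flags (broadcast bit set)
        b"", b"", b"", b"",     # ciaddr, yiaddr, siaddr, giaddr (zero-padded)
        mac_bytes, b"", b"",    # chaddr, sname, file (zero-padded)
    )
    return header + DHCP_MAGIC_COOKIE + bytes([53, 1, msg_type, 255])


sample = build_sample(bytes.fromhex("aabbcc000001"), 1)
print(parse_dhcp(sample))
```

If DISCOVERs from the failing client show up on the AP's switch port but never on the firewall's port, something in between is dropping them.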
10 minutes is really short. If you don't have a lot of client churn I'd increase that to at least 3600 seconds (1 hour). At 10 minutes you could possibly be confusing some DHCP clients but that's pretty much a guess.
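To put a number on the difference, here is a back-of-the-envelope sketch. It assumes clients renew at half the lease time (the default T1 in RFC 2131) and uses the combined peak of 76 + 45 clients from the first post; real clients vary, so treat the result as an estimate only.

```python
def renewals_per_hour(clients, lease_seconds, t1_fraction=0.5):
    """Estimate renewal REQUESTs per hour if every client renews at T1."""
    renew_interval = lease_seconds * t1_fraction
    return clients * 3600 / renew_interval


clients = 76 + 45  # peak counts on VLAN 80 and VLAN 20 from the thread

short_lease = renewals_per_hour(clients, 600)   # 10-minute lease
long_lease = renewals_per_hour(clients, 3600)   # 1-hour lease
print(f"10 min lease: ~{short_lease:.0f} renewals/hour")
print(f"1 hour lease: ~{long_lease:.0f} renewals/hour")
```

Neither figure is a real load for dhcpd, but the longer lease means each client depends on a successful exchange six times less often.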
The WPA2 Enterprise RADIUS authentication would have to succeed before a lease request would be put on the network. But it sounds like this is happening on both SSIDs.
Regarding the REQUEST that's exactly what I'm thinking. It makes complete sense that it doesn't receive an IP when there's no REQUEST for it.
I just don't understand why the REQUEST comes in when I set an IP on my device, but not if I set my device to receive the IP from the DHCP.
The problem is happening on both SSIDs, yes, so I don't think FreeRADIUS is the problem.
If you set a static IP address there is no REQUEST because there is no DHCP.
That makes a lot more sense then.
I'm thinking the problem is occurring at the switch the AP is connected to, given that the same AP works without issues when connected to another port of the switch, and other APs fail on the same port as well.
Some managed switches have higher-layer code for things like DHCP snooping and abuse prevention. I'd look there.
The fact that setting a static IP address always works leads me to believe that you have good layer 2 between the APs (and the clients) and the firewall port. So it must be something at a higher layer.
What exactly is the problem?
A client asked for an IP and didn't get one?
Oops, I didn't see the 3 replies...
Well, the problem is that a REQUEST isn't sent to the DHCP server from one (mainly) AP, when clients are connecting to it, even though it's working like a charm on the other APs in the building that are connecting to the same DHCP server.
Is there a "smart" switch between this AP and pfSense?
What happens when you swap it for a less smart switch? ;)
The switch is an HP 2530 PoE switch, and it is the only PoE device I have available at the moment, so unfortunately I can't test with a "dumb" switch.
A quick look shows that this switch can do DHCP snooping, since the manual lists DHCP snooping events for SNMP. So you need to look at the configuration of that switch, or of the port your AP is connected to.
If pfSense does not see the DHCP DISCOVER, then no, it will never offer an IP.