Dhcp lease client hostname missing (with 2.3)
-
After the upgrade to 2.3, I noticed that majority (not all) of my DHCP leases are missing the client hostname entry. This is visible both in the UI but also in the actual dhcpd.lease file. Only minor amount of these clients have the client-hostname entry in their lease. With 2.2.6 this was working for 100% of the clients.
This naturally influences the DNS resolver because I had set that DHCP client hostnames will be registered into the DNS.
Any ideas where and how to start the diagnostics on this one? I tried some simple ones like restarting DHCP and DNS services and forcing the lease renewal on the client side but those didn't change this.
-
I noticed the same. Of the 23 IP's leased, only 4 have host names attached. In 2.6 every lease had a host name.
-
Have you tried turning off the setting to "Register DHCP static mappings in the DNS Resolver" and turning it back on?
If so, perhaps turning off "Register DHCP static mappings in the DNS Resolver"; editing the host name for one of the clients whose name is missing to something and editing it back; then re-enabling "Register DHCP static mappings in the DNS Resolver" may help?
-
According to this thread, the problem seems to be in dhcpd. I'm seeing this issue on two separate pfSense 2.3.1 installations – one of them configured from scratch when 2.3.0 was released. It mostly seems to be Apple devices that lose their hostnames in the dhcpd.leases file. Any suggestions on how to debug this issue? Is there any way to make the dhcpd logs more verbose?
-
I have a DHCP packet capture that includes a couple of hostnames disappearing and a different one that was missing reappearing (only 48 packets over 33 hours). I also have the corresponding dhcpd.log and dhcpd.leases from the end of the capture period (I didn't think to grab dhcpd.leases at the beginning of the capture period). I'd be happy to share these with any pfSense devs that are interested.
-
I never got any resolution on this issue. I sent several packet dumps to cmb and never heard back from him. The packet captures proved that all my devices in question send hostnames and that the hostnames can disappear when their is no DHCP network activity. I find this very frustrating because our internal reporting tools can't resolve IP's to hostnames when this happens and is making our reports very tedious to read. I'm half suspicious that their is some logic error in dhcpd. I think that process has some timed interval where it "cleans up" the leases file. I'm wondering if that process is buggy. Once, I watched the leases until I saw one of the hostnames disappear. What I observed was that the current dhcpd.leases file had the hostname missing but the hdcpd.leases backup file still had the correct hostname. The hostname missing was the only difference between those 2 files. While this is a minor problem overall, it irritates the bosses enough that they are pressuring me to change firewalls to one that doesn't do this. I love pfSense and I wish we could fix this problem before I'm forced to move to some other product. :(
Later edit: It does not seem related to the kind of device as I have mostly android devices.
-
Interesting. I'll have to take a closer look at dhcpd.leases~. Unless some other tool is writing to dhcpd.leases, it seems like it must be a problem with dhcpd itself. It's strange that I haven't come across reports of this issue anywhere else online or in the ISC dhcp mailing list archives yet though.
I took this script and trimmed it down to just poll for changes in dhcpd.leases and archive off a snapshot whenever it changed:
#!/bin/sh outdir="/tmp/dhcpd_snapshots" watchcmnd="cat /var/dhcpd/var/db/dhcpd.leases | md5" execcmnd="cp /var/dhcpd/var/db/dhcpd.leases $outdir/dhcpd_\`date +%F_%T\`.leases" mkdir -p $outdir md5=`eval $watchcmnd` md5Now=$md5 echo "`date +%F_%T`" eval $execcmnd while [ true ] ; do while [ $md5Now = $md5 ] ; do sleep 1 md5Now=`eval $watchcmnd` done echo "`date +%F_%T`" eval $execcmnd md5=$md5Now done
Looking at these snapshots, the dhcpd log, and a packet capture log, here's what I'm seeing so far:
-
ClientA sends a DHCP request at 02:37:59 UTC.
-
This client already has a lease, so dhcpd just renews it by updating the starts, ends, and cltt fields for its lease in dhcpd.leases.
-
At the same time, it updates the cltt field on ClientB's lease from 00:15:13 to 01:14:25 and deletes its client-hostname field at the same time.
Since there's no dhcpd.leases file snapshot at anywhere near 01:14:25, dhcpd must be caching some non-critical changes in memory and only writing them out in batches when one of the leases actually change – that doesn't really explain why AsgardianFW only saw a hostname disappear by itself when compared to the backup leases file though.
cltt is the client's last transaction time (man), so looking back in the packet capture log to 01:14:25, there are two DHCP broadcast requests (and associated ACKs) from ClientB. Both requests and ACKs are identical (except for the IP IDs, transaction IDs, checksums, etc).
The dhcpd log matches up:
01:14:25 pfsense dhcpd: reuse_lease: lease age 3552 (secs) under 25% threshold, reply with unaltered, existing lease 01:14:25 pfsense dhcpd: DHCPREQUEST for 192.168.1.149 from Apple:ClientB via re1 01:14:25 pfsense dhcpd: DHCPACK on 192.168.1.149 to Apple:ClientB via re1 01:14:25 pfsense dhcpd: reuse_lease: lease age 3552 (secs) under 25% threshold, reply with unaltered, existing lease 01:14:25 pfsense dhcpd: DHCPREQUEST for 192.168.1.149 from Apple:ClientB via re1 01:14:25 pfsense dhcpd: DHCPACK on 192.168.1.149 to Apple:ClientB via re1
One odd thing is that the two requests are only ~3ms apart (associated ACKs are sent within ~0.6ms). Since this is UDP and I'm doing the packet capture on pfSense itself and in this case ClientB is an iPhone on wifi, perhaps the first ACK just got lost or delayed long enough that ClientB sent another request (?).
According to the packet capture, both requests include the client hostname. I'm guessing there's something about those requests that dhcpd isn't handling well. Maybe it's dropping the hostname from the leases file whenever it responds with an unaltered existing lease?
I'll need to dig through my logs some more to see how closely other instances of disappearing hostnames match up with this pattern.
-
-
pfSense 2.3.1 is running ISC DHCP Server 4.3.3-P1. Looking through the 4.3.4+ release notes, the only item that stood out to me was:
- Added the lease address to the end of the debug level log message emitted when an existing lease is renewed within the dhcp-cache-threshold. Thanks to Nathan Neulinger at Missouri S&T for suggesting the change. [ISC-Bugs #40598]
There are quite a few bug fixes, but it's difficult to tell if any of them are related to this issue.
-
Added a bug ticket for this issue, as there definitely is something to it.
https://redmine.pfsense.org/issues/6589 -
Vynce, perhaps you're on to something. I can add a little more evidence to the pile. For me, this only seems to happen for devices connected wirelessly. Although, I will confess that almost all of my wired devices have a static lease in DHCP. I do have 1 Windows desktop that is permanently connected via wire and has a dynamic lease. It has never exhibited this problem. I also have 1 Windows laptop that is connected wirelessly from time to time and I've never seen it lose its hostname (but the laptop isn't on all the time). 2 data points may not be enough to draw a proper conclusion, but so far the problem seems to target Android and Apple phones and tablets connected wirelessly (I have 6 different models of these devices so it doesn't appear model specific). At one point I thought it might be related to roaming between access points on my wireless network, but I did observe a loss of hostname of my phone while I was holding it in my hand sitting at the computer watching hostnames (so while I'm not 100% sure, I don't think it changed access points).
-
Please don't hate me for this, but I installed OPNSense (16.7 RC1) in a small, non-essential location (with 2 iPhone users) because it has ISC DHCP Server 4.3.4. I just confirmed that it, too, loses hostnames so the problem isn't addressed in the newer dhcp version.
-
For me, this only seems to happen for devices connected wirelessly.
I doubt it's wireless-related. dhcpd has no way of knowing how a device is connected, a wifi lease looks exactly the same as a wired lease.
Thanks for confirming the behavior hasn't changed in 4.3.4.
This probably needs to be escalated to the ISC dhcpd list to hopefully get feedback there.
-
cmb,
Who should do the escalation? One of us? You? -
I'm not going to have time to see it to resolution with only 2 weeks left. If one of you can push it along, that'd help get to a faster resolution (I hope). Otherwise someone should get to it at some point.
-
Thanks Chris. I've posted to the dhcp-users mailing list. Hopefully they have some suggestions :)
-
-
Are you synchronizing two dhcpd servers that are doing dynamic dns registration?
I believe there is a known limitation in the isc-dhcpd sychronization, in that it transfers the lease between dhcpd servers, but fails to transfer the host name.
-
Thanks for the heads up on that sync issue. In my case it's just a super basic configuration – no redundancy, syncing, or anything.
-
Shawn Routhier (ISC DHCP lead engineer) suggested adding "dhcp-cache-threshold 0;" to dhcpd.conf to disable an optimization that allows dhcpd to skip updating dhcpd.leases when a client requests an updated lease after less than 25% of the lease time.
Here's how I made that change in case anyone else wants to give it a try. Not sure if it's the best way, but the setting has been added to dhcpd.conf at least.
-
Diagnostics -> Edit File.
-
Paste "/etc/inc/services.inc" in the box and click Load.
-
Add "dhcp-cache-threshold 0;" on line 541. There's another location around line 1285 for dhcpdv6 if that applies to your configuration.
-
Click save.
-
Services -> DHCP Server.
-
Click save to regenerate dhcpd.conf and kick dhcpd itself to reload its config.
Now to wait and see what happens :)
-
-
My setup is also pretty simple. The only non-default settings I have are some static leases and the default lease time is set to 86400 (24 hrs). The longer lease time makes is easier to see the problem because the hostname comes back after a lease renewal. The default time of 2 hours somewhat hides the hostname issue unless you're really watching it.
I have made the suggested change to disable the optimization and I'll report back within 24 hours.