Unbound keeps forgetting hostnames registered by DHCP on VLANs
-
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
Logs contain messages about unbound restarts every couple of minutes, although the PID of the unbound process never changes. Might be related?
A very bad sign.
unbound gets restarted on every new lease coming in, and every lease renewal.
What if :
The easy one (once you think about it) : One of your devices is using a Wifi connection, and it sits at the edge of the Wifi signal's range : it will reconnect every x seconds to re-establish the connection. Each reconnect is followed by a DHCP REQUEST, and that restarts unbound.
The bad one (because you've bought a device that really wasn't expensive enough) : It asks for a 10-minute (hard-coded) lease, and will renew that lease every 5 minutes. Now imagine you have several of these devices.
Both situations will constantly restart unbound.
As a reminder : during a restart, your network has no working DNS.
Continue chain-gunning unbound like this, and eventually things will fail ....
That's what you're probably seeing right now : there is a zombie unbound process instance in memory, holding on to TCP and UDP port 53. It sticks in memory, can't be killed anymore as the PID is lost, and a new unbound process can't start as TCP and UDP port 53 are already taken.
That's a deadlock - a bad situation.
Conclusion : short leases, constant unbound restarts, yeah, I'm not surprised by the situation that you're seeing.
"Register DHCP leases in the DNS Resolver" is pretty broken for networks with wifi / devices that behave badly / many devices.
More than a decade ago, I saw this already happening on my small pfSense network, with some Wifi connected phones etc (no aliexpress dumb-ass doorbells that nuke your network, as described above), so I decided that "14235" unbound restarts a day wasn't something good.
But then I used pfSense for my company, a hotel, and I used a captive portal.
That's where I had to disable the "DHCP Registration [ ] Register DHCP leases in the DNS Resolver" option for good, as Wifi and these "chinese" phones were really killing my DNS with constant unbound restarts.
I didn't want to know the hostnames of these devices anyway, as I don't want to connect myself to these devices.
Worse : the host names of the devices can be badly formatted - using illegal chars .... with all the nasty consequences (unbound refuses to start etc).
So, entering :
and be done with it. No more unbound restarts, DNS is rock solid since.
It still restarts once in a while, as I'm messing around with my pfSense a lot.
pfBlockerng also restarts unbound, if needed, but I only update my DNSBL feeds once a week or so (not every hour !!)
Also : this issue will be dealt with in the near future : KEA will have the possibility to 'integrate' the DHCP host names (even the faulty ones, that might trip unbound ^^) into the unbound local cache without a brutal unbound restart : a long outstanding issue will be, finally, resolved.
-
Thanks for your suggestions, @Gertjan!
@Gertjan said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
unbound gets restarted on every new lease coming in, and every lease renewal.
Is unbound really meant to be restarted on every lease renewal? If this is by design, then it should take care not to throw away its runtime data, IMHO!
What if :
[various WiFi scenarios]
I don't think WiFi is causing the problem in my situation, because:
- VLAN1 doesn't show any problem, so the restarts don't seem to affect VLAN1
- All access points are (still) connected to VLAN1, the one which does NOT show any problems.
- Wifi roaming (using identical SSID+PSK) works fine. Very good signal strength all over the site.
- unbound restarts happen much more frequently than lease renewals.
- On the faulty VLANs, there is so far ONLY ONE host (for testing purposes) and this host is using a wired LAN connection.
- When this host is connected to VLAN1, everything works fine.
So, entering :
I've set this also. But only servers have static assignments.
pfBlockerng also restarts unbound, if needed, but I only update my DNSBL feeds once a week or so (not every hour !!)
Uh, this might be the catch. I recently installed pfBlockerNG. But I did not think it would restart unbound every couple of minutes.
-
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
Is unbound really meant to be restarted on every lease renewal? If this is by design, then it should take care not to throw away its runtime data, IMHO!
It doesn't.
It writes out the internal cache, and reads that back in when it starts.
It will also read the file which contains all the DHCP registered leases, which was 'just' updated.
But yes, it still has to 'restart' to take into account the new situation.
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
I don't think WiFi is causing the problem in my situation
I was just illustrating possible issues. I haven't seen them all.
What you can do is this : forget the GUI for a moment. Let's get 'interactive'.
So go console or SSH, option 8.
Then :
tail -f /var/log/resolver.log /var/log/dhcpd.log
and watch it .....
Take your phone, and stop the Wifi.
Then activate the wifi ...
The activity you will see is : the dhcp log shows that a lease comes in, and unbound gets restarted.
Draw your conclusions.
I've already made clear that I didn't want that.
But, you do you ^^
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
I recently installed pfBlockerNG.
Every minute, no.
You told it how often it restarts unbound.
It's combined with this :
and this :
and it probably won't restart when the downloaded files didn't change - it's smart enough to know that nothing changed, so no restart is needed.
But, if you have a lot of DNSBL, and some change all the time, and these are big (like millions++ entries) then unbound takes a long time to restart. Some have shown : minutes ( !! ) of DNS outage.
If, and now I'm guessing : what happens if the dhcp_leases process (the one that checks the dhcp leases file, and restarts unbound if it detects a lease change) kicks in during this moment ?
Answer : I don't even want to test this - it's probably bad.
Btw : you also have VLANs - and even a VLAN1 ? Are you sure ? I've read somewhere - this forum - that "1" as a VLAN number was 'not good' ?!
-
@Gertjan said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
this forum - that "1" as a VLAN number was 'not good' ?!
There is nothing wrong with using vlan 1, it's just the native default vlan. It is almost never tagged. In an enterprise setup you will not see it used for data/user traffic. It would be the management vlan if anything.
Here is a thread from 2023 where this was discussed. My thoughts are in there.
https://forum.netgate.com/topic/179554/vlan-1-best-practices
-
@Gertjan said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
Is unbound really meant to be restarted on every lease renewal? If this is by design, then it should take care not to throw away its runtime data, IMHO!
It doesn't.
It writes out the internal cache, and reads that back in when it starts.
It will also read the file which contains all the DHCP registered leases, which was 'just' updated.
But yes, it still has to 'restart' to take into account the new situation.
I can see in the logs that dhcpd sends a HUP every time a lease is updated and unbound reacts with this "Restart" log entry.
But I would not expect a real restart in this situation. I'd expect unbound to simply re-read the dhcp-dns mapping file.
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
I recently installed pfBlockerNG.
Every minute, no.
The "restarts" are not exactly every minute. It seems to be caused by the HUP from dhcpd, about 14 times in the last 50 minutes. But yet again, unbound should not forget the hostnames.
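To put a number on this instead of eyeballing the log, something like the following might help. It is only a rough sketch: the function names are made up for illustration, and the search text "Restart" is simply the wording of the log entry mentioned above, so adjust it if your resolver log words it differently.

```shell
# Count log lines that contain a given text, e.g. unbound's "Restart" entries.
# Both arguments are required; nothing here is specific to pfSense.
# (count_log_matches is a hypothetical helper name.)
count_log_matches() {
    pattern="$1"
    logfile="$2"
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so "|| true" keeps the function's status clean.
    grep -c -F -- "$pattern" "$logfile" || true
}

# Convert "N events in M minutes" into a whole-number hourly rate
# (integer division, so the result is rounded down).
events_per_hour() {
    events="$1"
    minutes="$2"
    echo $(( events * 60 / minutes ))
}
```

Usage would be along the lines of `count_log_matches Restart /var/log/resolver.log`, keeping in mind that the file only covers the period since the last log rotation. For the figure above, `events_per_hour 14 50` prints 16.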
It's combined with this :
This was set to "hourly" (I kept the defaults, IIRC). But even this is far from the time scale of several minutes.
I've set it to daily now.
and this :
This was set to daily. Changed it to weekly now.
But, if you have a lot of DNSBL, and some change all the time, and these are big
No, I've only the ADs_Basic
(like millions++ entries) then unbound takes a long time to restart. Some have shown : minutes ( !! ) of DNS outage.
Ummm... There's no outage. I get an answer, but the answer is empty.
If, and now I'm guessing : what happens if the dhcp_leases process (the one that checks the dhcp leases file, and restarts unbound if it detects a lease change) kicks in during this moment ?
The problem happens much more often than just once an hour. And why would I be the first one to see such a problem?
Btw : you also have VLANs - and even a VLAN1 ? Are you sure ? I've read somewhere - this forum - that "1" as a VLAN number was 'not good' ?!
VLAN1 is usually meant to be the management network as it is the default VLAN which also goes along with the trunk.
PS: I got a crash report today (but no reboot). For me, it looks like PIMD crashed. But I have no clue how to get any clue out of this dump. Should I send it somewhere for help to diagnose?
-
@doejohn re: unbound restart on lease renewal: https://redmine.pfsense.org/issues/5413
But no, wouldn’t expect to lose any records. Are you using an internal/fake domain or a real one?
-
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
It seems to be caused by the HUP from dhcpd,
Exactly !
There is a process called "dhcpleases" :
[23.09.1-RELEASE][root@pfSense.bhf.tld]/root: ps ax | grep 'dhcpleases '
97311 - Is 0:00.02 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d brit-hotel-fumel.net -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
that detects if a lease is modified or added, by checking this file : /var/dhcpd/var/db/dhcpd.leases
If that's the case, it will update the /var/unbound/dhcpleases_entries.conf file and restart == HUP the unbound process.
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
But yet again, unbound should not forget the hostnames.
If it forgot one, check /var/unbound/dhcpleases_entries.conf at that moment. Is it still there ?
If it isn't, unbound isn't aware of the host anymore. The fault would be upstream. Check /var/dhcpd/var/db/dhcpd.leases, the DHCP server's scratch pad file, to see if the host is still there, with a valid time etc.
Anyway : this dhcpleases process will be gone soon, as the new KEA dhcp server will have a better way to include DHCP hosts into the local DNS unbound cache, without unbound being restarted.
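The two checks above (first the unbound entries file, then the dhcpd leases file) could be scripted roughly like this. Take it as a sketch under assumptions: check_dhcp_host is a made-up helper name, the default paths are the ones from the ps output above, and the first check is a plain substring search because the exact layout of the entries file is not shown here. The second check relies on the `client-hostname "name";` statement that ISC dhcpd writes into its leases file.

```shell
# Report whether a host name shows up in unbound's DHCP entries file and in
# the ISC dhcpd leases file. The paths can be overridden (arguments 2 and 3)
# so the function can be tried on copies of the files.
# (check_dhcp_host is a hypothetical helper name.)
check_dhcp_host() {
    host="$1"
    entries="${2:-/var/unbound/dhcpleases_entries.conf}"
    leases="${3:-/var/dhcpd/var/db/dhcpd.leases}"

    # Case-insensitive substring match: "nas" would also match "nas2".
    if grep -qiF -- "$host" "$entries" 2>/dev/null; then
        echo "unbound entries: present"
    else
        echo "unbound entries: missing"
    fi

    # dhcpd records the name the client sent as: client-hostname "name";
    if grep -qiF -- "client-hostname \"$host\"" "$leases" 2>/dev/null; then
        echo "dhcpd leases: present"
    else
        echo "dhcpd leases: missing"
    fi
}
```

If the first line says "missing" while the second says "present", the name got lost between dhcpd and the entries file. If both say "missing", the fault is further upstream, as described above. Note that "present" in the leases file does not yet tell you whether the lease is still valid; for that you still have to look at the lease's time stamps.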
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
No, I've only the ADs_Basic
Ok, like me. So when unbound needs to be restarted, it won't take much time.
Basically, unbound is restarted for the same reason : a bunch of host names (== the DNSBL file, with many host names) was changed : a restart is needed to make unbound aware of this.
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
Ummm... There's no outage. I get an answer, but the answer is empty.
Not found or 'time out' : same result.
-
@SteveITS said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
@doejohn re: unbound restart on lease renewal: https://redmine.pfsense.org/issues/5413
But no, wouldn’t expect to lose any records. Are you using an internal/fake domain or a real one?
Fake domain XXXX.lan
-
@doejohn unbound restarting on dhcp events has been an issue for as long as I can remember. Unless you have a small network, long lease times, etc., i.e. very few dhcp events, unbound restarting on events is going to be problematic at best.
If you have lots of dhcp events, and unbound is constantly restarting, you're going to have a bad day. And if you're doing something that makes unbound take longer than a couple of seconds to restart, say large lists in pfblocker, etc., you're going to have a worse day.
The current recommendation has been to not register dhcp if the restarting of unbound is going to cause you problems. If you have a couple of events a day, and each takes 2 seconds, then other than losing your cache you might not even notice any issues. But if you have 100 restarts a day, and they take 2 minutes.. Yeah, it's going to be a problem most likely.
-
@Gertjan said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
There is a process called "dhcpleases" :
[23.09.1-RELEASE][root@pfSense.bhf.tld]/root: ps ax | grep 'dhcpleases '
97311 - Is 0:00.02 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d brit-hotel-fumel.net -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
that detects if a lease is modified or added, by checking this file : /var/dhcpd/var/db/dhcpd.leases
If that's the case, it will update the /var/unbound/dhcpleases_entries.conf file and restart == HUP the unbound process.
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
But yet again, unbound should not forget the hostnames.
If it forgot one, check /var/unbound/dhcpleases_entries.conf at that moment. Is it still there ?
I see. Unfortunately, I can't reproduce it any more.
What have I done to "fix" it? I have no idea.
- There was the crash report (pimd?)
- I have set the pfBlockerNG cron setting to daily and the list download to weekly. But I have set them back to hourly/daily, hoping to reproduce it again.
- I have restarted unbound
When it happens again: how can I check which names are filtered by pfBlockerNG?
Anyway : this dhcpleases process will be gone soon, as the new KEA dhcp server will have a better way to include DHCP hosts into the local DNS unbound cache, without unbound being restarted.
Is there a time frame for this change?
@doejohn said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
Ummm... There's no outage. I get an answer, but the answer is empty.
Not found or 'time out' : same result.
For the end-user, result is the same.
But for debugging, there's a big difference. An answer like "I don't know such a host" is fundamentally different from long delays because something expensive has to be sorted out at startup, eventually timing out.
So, most of the time I see the former, "I don't know such a host". I also (occasionally) see significant delays, which can be explained by DNSBL startup delays. But most of the time I see immediate "Don't know" replies.
-
@johnpoz said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
@doejohn unbound restarting on dhcp events has been an issue for as long as I can remember. Unless you have a small network, long lease times, etc., i.e. very few dhcp events, unbound restarting on events is going to be problematic at best.
Lease time is currently set to 2 hours. AFAIR, this was the default setting. I always keep defaults as long as there's no urgent reason to change.
I'll try 24 hours.
-
@doejohn so with a 2 hour lease, every hour (50% mark) a client will renew - this will cause an event. If you have 1 client, that means unbound will be restarting every hour. If you have 100, it's a lot of restarts ;)
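As a back-of-the-envelope check, the arithmetic can be written down like this. It is a simplified sketch, not a measurement: renewals_per_day is a made-up helper name, and it assumes every client stays online all day and renews exactly at the 50% mark of its lease.

```shell
# Estimate how many lease renewals (and so unbound restarts, if DHCP
# registration is enabled) happen per day.
# Assumptions: all clients are always online and each one renews at half
# the lease time. (renewals_per_day is a hypothetical helper name.)
renewals_per_day() {
    clients="$1"
    lease_seconds="$2"
    # Renewal interval = lease / 2, so renewals per client per day
    # = 86400 / (lease / 2) = 86400 * 2 / lease.
    echo $(( clients * 86400 * 2 / lease_seconds ))
}
```

With a 2 hour lease (7200 seconds), `renewals_per_day 1 7200` prints 24 and `renewals_per_day 100 7200` prints 2400. New leases from devices that reconnect come on top of that.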
-
@johnpoz said in Unbound keeps forgetting hostnames registered by DHCP on VLANs:
@doejohn so with a 2 hour lease, every hour (50% mark) a client will renew - this will cause an event. If you have 1 client, that means unbound will be restarting every hour. If you have 100, it's a lot of restarts ;)
That's right.
But yet again: 2 hours is the default setting (I just double-checked). And I have a relatively small network, a total of only about 15 leases here.
If such a small number of hosts is causing problems with the default setting, then increasing the default should definitely be considered.