Frequent unbound restarts
-
Thank you both for the suggestions. I'll look at the logs for concurrent events.
-
I compared the dns and dhcp logs and there is a correlation.
For every instance of
May 10 10:39:24 unbound 85179:0 info: service stopped (unbound 1.6.1).
there is an instance of
May 10 10:39:24 dhcpleases Sending HUP signal to dns daemon(85179)
Occasionally, it takes more than one attempt:
May 10 10:33:06 dhcpleases Sending HUP signal to dns daemon(85179)
May 10 10:33:06 dhcpleases kqueue error: unkown
May 10 10:33:05 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
May 10 10:33:05 dhcpleases Sending HUP signal to dns daemon(13085)
Sometimes the HUP messages are preceded by a DHCPACK and sometimes by dhcpd restarting. It appears to be "normal", although messy would be a better word to describe it.
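For anyone who wants to repeat the comparison, here is a rough sketch of how the two logs could be matched up by timestamp. It assumes both logs have been exported to plain text files in the syslog format shown above; the function name is made up for illustration.

```sh
#!/bin/sh
# Count unbound "service stopped" lines whose timestamp (month, day, time)
# also appears on a dhcpleases "Sending HUP signal" line.
# Arguments: $1 = resolver log, $2 = DHCP log (plain text exports).
count_hup_restarts() {
    awk '
        # First file (DHCP log): remember the timestamp of every HUP sent.
        NR == FNR {
            if ($0 ~ /dhcpleases Sending HUP signal/) hup[$1 " " $2 " " $3] = 1
            next
        }
        # Second file (resolver log): count stops that share a timestamp.
        /unbound.*service stopped/ {
            total++
            ts = $1 " " $2 " " $3
            if (ts in hup) matched++
        }
        END { printf "%d of %d unbound stops coincide with a dhcpleases HUP\n", matched, total }
    ' "$2" "$1"
}
```

Matching on the exact second is crude: a stop logged one second after the HUP would be missed, so treat the result as a lower bound.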
-
On second thought, maybe inefficient would be a better way to describe it. This is a home system. There are under 20 dhcp and dhcpv6 leases and under 30 reservations. In a large network with numerous devices, unbound would be spending more time restarting than resolving dns queries. Not very scalable.
-
There has been lots of discussion on this. I have not paid much attention to the threads because I don't have it registering DHCP leases, only the devices I have set as static via DHCP. Other devices just get something out of the DHCP pool. These would normally just be guest devices, since I set up a reservation for all my other devices.
I have zero need to be able to resolve by name a friend's phone, for example, that might be on my guest wifi network once in a blue moon.
But I do recall there being multiple threads and much discussion on this. I believe pfblocker came up in the conversations as well, as some sort of contributor to the restarts, since it reloads its lists into unbound when they update.
For now I would disable dhcp leases registration and see if that lowers the number of restarts you see.
-
Thanks for your reply. I did search and noted one thread in particular that was started a long time ago about this. It sounds like the same issue.
I have reservations for all of my servers, desktops and laptops. Virtually all of the dynamic leases are mobile phones or tablets. I agree that it doesn't serve much purpose to register these leases. I occasionally have 30 tablets on the network in addition to the usual mobile phones. I'll take a look at the log when that happens next time. Irrespective of the utility of registering leases, the implementation (requiring unbound to restart) seems to be inherently inefficient.
-
I haven't looked at the implementation that closely, but Unbound has a 'reload' subcommand that should be enough to reload all of the configuration file without requiring a full restart of the daemon. However, Unbound is run under chroot in pfSense, which might be the reason why a restart is required to fully reload all of the settings. This is guesswork on my part based on what I know about Unbound and chroot'ed daemons in general.
-
Disable DHCP Registration and configure Host Overrides in DNS Resolver. ;)
As for unbound, BBCan177 is presently trying to use live reload to make changes to the unbound DB instead of having it reload the conf file at every change. But he ran into problems with the DB getting out of sync with the conf file.
So at some point in the future, maybe DHCP could use the same technique.
-
Can you be more specific about host overrides? Maybe I'm missing something, but I didn't see any such setting.
-
Under Services / DNS Resolver / General Settings ?
https://doc.pfsense.org/index.php/Unbound_DNS_Resolver
Host Overrides allows creation of custom DNS responses/records to create new entries that do not exist in DNS outside the firewall, or to override DNS responses for other hosts.
This is where I enter the DHCP reservations so the hostnames are resolved by unbound.
Maybe you can just keep Static DHCP enabled to get the same result.
-
I'm glad that someone else posted on this topic, albeit back in 2017.
I enabled TLD yesterday, and pfblocker is set to update at 0030.
Sometime around 0200 this morning my pfsense box started sending out multiple emails from the service watchdog that the dhcpd service and unbound had stopped and it was restarting those services. However I did not receive my daily email from the pfsense box, normally that comes in around 5-530am.
I'm unable to ssh into the pfsense box, and it was unresponsive this morning.
Unfortunately I start my work week today, so I can't even delve into it.
Luckily I do have a current backup, so I'm thinking I will follow the above, disable DHCP Registration and configure Host Overrides in DNS Resolver, and see if that solves the issues.
2.4.2
8gb ram
![Screenshot (20).png](/public/imported_attachments/1/Screenshot (20).png)
-
It seems like a good interim "hack" would be to patch the code to prevent unbound from restarting with dhcp updates and create a cron job to do it on a more "controlled" basis.
Does anyone know which files in the php code need changing? Is it under /usr/local/www ?
-
You will find what you're looking for in /etc/inc.
Check services.inc and system.inc.
It's far easier to disable "DHCP Registration" completely, and give every host that you want to be able to resolve at any time a static, fixed MAC DHCP lease entry.
When these hosts renew, they will receive the same IP every time, and unbound will not be told to restart.
Remember : the real issue is unbound itself : it reads its config file when it starts. If some host-IP information becomes known afterwards - unbound can only be made aware when it restarts. A cron solution would make your device "non resolvable" for a certain time, the time it takes before the cron job runs.
-
@gertjan said in Frequent unbound restarts:
unbound can only be made aware when it restarts
pfblockerNG-devel is changing unbound internal db (Resolver Live Sync) using unbound-control cmd. So maybe at some point DHCP server could be doing something similar.
-
Is there really no other way to refresh other than restarting Unbound and reloading everything?
I mean this is not only a problem with the "Register DHCP leases in the DNS Resolver" but also with Remote OpenVPN clients that are using the DNS Resolver.
Every time a Remote Client connection is initiated or stopped, Unbound has to restart. If you are using pfBlocker-ng, that could mean upwards of 60 seconds of downtime.
I thought I read here on the forum that a fix was being developed but I guess I am mistaken.
Either Netgate or upstream needs to do something about this. Having to disable important features for the sake of uptime seems like quite a big problem for me.
Is there any way to escalate this?
-
DHCP could change the Unbound in-memory db as pfblockerNG does with Live Reload, instead of restarting Unbound.
-
Hi all
Is there any update on this? I'm trying to use PFBlockerNG-devel but this causes unbound restarts to take a noticeable amount of time and, since they happen every 10-30 minutes, that makes the whole package unusable.
I did a quick check on github and I see where the C code is sending a HUP to DNS. I could change this to call unbound-control like pfBlockerNG does (though I can't assume unbound is running the way PFBlocker can, so it will be slightly more complicated). Would such a pull request be accepted?
EDIT: Alternatively, we could change the unbound startup script to catch HUP and call unbound-control instead of restarting? That might be more robust.
-
@jasonArloUser said in Frequent unbound restarts:
I'm trying to use PFBlockerNG-devel but this causes unbound restarts
as per pfBlocker settings, it will reload restart unbound every :
@jasonArloUser said in Frequent unbound restarts:
to take a noticeable amount of time
Other options are : use less feeds. Or a bigger system.
@jasonArloUser said in Frequent unbound restarts:
on github and I see where the C code is sending a HUP to DNS
The DHCP daemon ?
Do what @RonpfS mentioned.
Give all important devices static mac mappings, and the DHCP server daemon won't restart unbound any-more.
-
That's what I'm doing, but the fact is that the current implementation is wrong. When a new host enters the DNS local zone because it gets a DHCP lease, the correct thing to do is reload that zone, not restart the DNS service.
-
Note that I agree with you.
bind doesn't go wacka when an attached interface goes down. Or a zone needs to be reloaded. Etc.
But bind is huuuuuuge to set up (read : error prone) - and the code footprint is even bigger.
Putting a GUI in front of bind is daunting - a never ending story.
But ..... for me the case was solved a long time ago.
I just don't have unbound restarted any-more when a lease comes in.
Problem solved.
I had to list all my IPv4 devices - they all use DHCP except for 2 servers, which are static.
Most of my devices have a static MAC entry, so the DNS info is already stored in the DNS.
New, other devices that enter my network can connect to my 'known' devices like printers etc.
Other, already present devices don't need to connect to these new devices, so a DNS entry isn't needed.
-
@Gertjan said in Frequent unbound restarts:
Give all important devices static mac mappings, and the DHCP server daemon won't restart unbound any-more.
That is a workaround, not a solution. Furthermore, one of the goals of DHCP is to reduce manual intervention. Having to manually set up (possibly thousands of) static DHCP entries is not much better than setting static IPs directly on clients.
-
@Gertjan Oh I agree that Bind would be way too much effort here. But the solution is actually very close. All that needs to happen is that the DHCP code that is currently sending a HUP signal calls unbound-control to load the local zone instead, as that is all it can be affecting.
In any case, I've done what you did so I don't expect to feel the pain much anymore but if I get annoyed at adding devices to the static mapping maybe I'll try to submit a patch request. :)
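To make that idea concrete, here is a minimal sketch of what a per-lease update might look like. The helper name, hostname, domain and addresses are made up; the `local_data` and `local_data_remove` subcommands are documented in unbound-control(8), but the config path is an assumption about a pfSense install, and this is not how dhcpleases works today.

```sh
#!/bin/sh
# Build the resource-record string that `unbound-control local_data` expects.
# Arguments: $1 = hostname, $2 = domain, $3 = IPv4 or IPv6 address.
# (lease_to_rr is a hypothetical helper, not part of pfSense or Unbound.)
lease_to_rr() {
    case "$3" in
        *:*) rrtype=AAAA ;;   # a colon in the address means IPv6
        *)   rrtype=A ;;
    esac
    printf '%s.%s. IN %s %s\n' "$1" "$2" "$rrtype" "$3"
}

# Example, not executed here (config path assumed):
#   unbound-control -c /var/unbound/unbound.conf local_data "$(lease_to_rr tablet1 home.lan 192.168.1.50)"
#   unbound-control -c /var/unbound/unbound.conf local_data_remove "tablet1.home.lan."
```

Entries added this way live only in the running resolver, so they would still have to be written to the hosts file as well to survive a real restart.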
-
I had a look at this issue, and if memory serves me well, it's not the DHCP daemon that kicks unbound, but the dhcpleases process (the IPv4 version), which is simply stopped when you uncheck :
This little program updates the /etc/hosts file and signals unbound to "reload".
unbound's default reaction will be : restart - which is what is causing all this.
I hope that unbound will be more intelligent one day and 'watch' files, so it reloads only the file(s) that need to be reloaded.
I didn't look at how Live Sync (unbound) is implemented ... but I tend to say it isn't done as 'bind' does it. But it's a beginning.
True is : pfSense should use it for its hosts file / DHCP changes - if possible.
Btw :
Same story, from years ago https://forum.netgate.com/topic/79375/unbound-frequently-restarts-on-2-2-is-this-normal
https://forum.netgate.com/topic/80517/unbound-seems-to-be-restarting-frequently
etc.
Today, it's nearly 2020 ..... and where is the doc about this "Live sync" : NOT here https://nlnetlabs.nl/documentation/unbound/ (and that's the source).
-
do I understand correctly that the "Register DHCP leases in DNS resolver" functionality is totally unusable, forcing us to do a lot of manual work registering devices' DHCP addresses as static leases? wtf? first FQDN alias resolution, now this.. pfSense is starting to be unusable.
i mean, how do the rest of you deal with this? :/
-
"Register DHCP leases in DNS resolver" is usable.
It's just a setting that must be understood so you can recognise DNS situations, and do something about it.
With many devices (dozens or hundreds) and/or a very short DHCP lease, unbound, the resolver, could get restarted rather often.
What about forcing a long DHCP lease like a day or 2 ?
If very heavy DNS packages like pfBlocker are used also, the start-up time of unbound gets impacted.
All of this is related to what device you use to run pfSense. An i7 with an SSD will not sweat - a loaded SG-1100 could give noticeable DNS outages. Added to that : the cache is lost.
Also : only new leases, introducing new devices to the network, will restart unbound. A successful DHCP renewal doesn't.
This is one of the many reasons why "visitors" belong on their own network with "Register DHCP leases in DNS resolver" set to OFF. I don't care what their IP is, neither their host name.
@work, I have about 40 devices.
I tend to fix every device to a known IPv4 and IPv6 address. I still tend to use the IPv4 address as a device number, but I know that concept will vanish when IPv4 fades out. I use MAC leases of course, and have to set these up for every new device. At the same moment, I choose a simple, short, representative name for the device.
These leases are also placed into the "hosts" file ( == "Registered").
@home : I couldn't care less. I do not need to know the name of the phone a visitor brought along with him. Neither the IP.
My couple of own devices are - as above - locked to a "MAC based lease".
So I'm not using "Register DHCP leases in DNS resolver".
So, IMHO : it's close to a no-problem.
( but for others, it could be the next planet killer )
-
@Gertjan thx for the reaction, and for the suggestions and ideas as well. very helpful - i'm by far not a super-skilled pfsense admin, so i much appreciate the ideas.
just one clarification, if you could: I haven't found a way to enable DHCP leases DNS registration for particular subnet - i only see a global checkbox in DNS Resolver settings. could you please point me to a place this can be configured per interface/subnet?
-
For every LAN type of network, you have a DHCP server with its dedicated settings :
For every DHCP, you can set and maintain, at the bottom of the page, the "DHCP Static Mappings for this Interface".
On the Status > DHCP Leases page you can also choose what lease you want to add as a Static lease.
-
@Gertjan said in Frequent unbound restarts:
This is one of the many reasons why "visitors" belong on their own network with "Register DHCP leases in DNS resolver" set to OFF. I don't care what their IP is, neither their host name.
this is what I referred to. the above sounds like there should be an option to disable "Register DHCP leases in DNS resolver" feature on each network (subnet) separately.
your last screenshot refers to a different functionality - setting dhcp static leases.
-
@jt said in Frequent unbound restarts:
setting dhcp static leases.
These behave like the non static leases, and will never change.
It has its own check box. I always have that checked.
True is : the "dhcpleases process", that collects DHCP server leases for any DHCP server process, will restart the DNS system (unbound or the forwarder, dnsmasq) if new leases come in. There is no "per interface" choice.
If you select :
you will see the 'dhcpleases' process running :
ps ax | grep dhcpl
It's this process that 'HUPs' unbound - or the forwarder, dnsmasq.
I guess things were designed this way a long time ago, without really knowing that it could one day concern a looooot of devices on a network (10 Mbits half duplex was 'blindingly fast' in the past ...).
As said earlier : unbound will get reprogrammed sooner or later so that it rereads changes in its config files on the fly some day.
-
@Gertjan unbound doesn't need to be reprogrammed. It already has the ability to reload configuration with the unbound-control command. The issue here is that the process that updates the leases doesn't use that, it sends the process a HUP. This is a bug as the lease program knows it can only be affecting the local zone so there is no possible reason to do anything other than reload that zone. Sending HUP and expecting unbound to somehow know that HUP means a reload of the local zone would be incorrect.
There are 2 simple ways to fix this issue: (a) fix the script to call unbound-control as it should, or (b) in the shell command that starts unbound it is probably possible to set up a HUP handler that calls unbound-control. The second solution is wrong here because HUP should really reload everything, but at least it would get things working without having to rewrite anything else.
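As a rough illustration of option (a), the lease program would only need to push the difference between the old and new host lists. The sketch below works on two hosts-style files ("IP name ..." per line); the function name and file layout are assumptions for illustration, not the actual dhcpleases code.

```sh
#!/bin/sh
# Print which "IP name" pairs were added or removed between two hosts-style files.
# Arguments: $1 = previous file, $2 = current file.
# (hosts_delta is a hypothetical helper; only the first name per line is used.)
hosts_delta() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }            # skip comments and blank lines
        { key = $1 " " $2 }
        FILENAME == ARGV[1] { prev[key] = 1; next }
        { cur[key] = 1; if (!(key in prev)) print "add", key }
        END { for (k in prev) if (!(k in cur)) print "remove", k }
    ' "$1" "$2" | sort
}
```

Each "add" line would then map to one `unbound-control local_data` call and each "remove" line to one `unbound-control local_data_remove` call, leaving the cache and every other zone untouched.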
-
@jasonArloUser said in Frequent unbound restarts:
@Gertjan unbound doesn't need to be reprogrammed. It already has the ability to reload configuration with the unbound-control command. The issue here is that the process that updates the leases doesn't use that, it sends the process a HUP. This is a bug as the lease program knows it can only be affecting the local zone so there is no possible reason to do anything other than reload that zone. Sending HUP and expecting unbound to somehow know that HUP means a reload of the local zone would be incorrect.
I know. Still waiting for this to happen.
Look at the word reload here. Or from the authors.
What I read here is : it just restarts, or something close to stop and start.
Cache is lost - all config files are read in again ... The only benefit is that the process isn't destroyed and recreated.
This means, for me, that when a relatively small /etc/hosts file has to be read in again, unbound also rereads other config files like the "/var/unbound/pfb_dnsbl.*conf" from pfBlocker, if it is installed and activated.
-
Hi
Just wanted to add I came here after googling this event in the logs.
I have a feeling the restarts started or became worse when my ISP enabled IPv6. I did have the check box set for Register DHCP leases in the DNS Resolver, but with IPv4 noticed no issues.
When I enabled IPv6 I found web pages didn't zip in quite as fast as they did when it was IPv4 only, but were okay after the first visit. I put this down to the web browser using a fallback mechanism and then caching the results of it for a while. A few weeks on I saw the restarts in the log every second or so and came here.
So I've disabled the DHCP registration option, the resets have stopped, and web pages are loading without the extra latency. That makes sense, as the results are staying in the DNS resolver cache and there are no delays from hitting pfSense while the service is restarting.
It's like I've had a speed upgrade.
Regards
Phil
-
I installed pfSense about a month ago and have been trying to track down my random internet drops for weeks. Finally saw the HUP signal message in the logs this morning and realised it happened on every DHCP request, which led me to this page.
I'm a little in shock over how absolutely stone age this is. A really basic ability causes it to restart the whole thing. What the hell?!
Static DHCP as a workaround is great and all, but as others have said, it's nowhere near a solution. It's manually doing the job for DHCP & DNS.
Anyone tested if this is fixed in any of the later v5 releases etc? Or is that not where the problem lies?
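For anyone wanting to put a number on how often the HUP is sent, a quick sketch like the following can tally the messages per hour. It assumes the DHCP log has been exported to a plain text file in the usual syslog format; the function name is made up.

```sh
#!/bin/sh
# Count dhcpleases HUP signals per hour in a syslog-style text file.
# Argument: $1 = DHCP log exported as plain text.
hups_per_hour() {
    awk '
        /dhcpleases Sending HUP signal/ {
            split($3, t, ":")                      # $3 is HH:MM:SS
            count[$1 " " $2 " " t[1] ":00"]++      # bucket by month, day, hour
        }
        END { for (k in count) print k, count[k] }
    ' "$1" | sort
}
```

The sort is plain text order, which is good enough within a single month.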
-
I never actually tested (looked for) this : if a new lease comes in and the option is checked, then unbound is HUPped.
A lease renewal : no, as only the duration is updated, which is not a DNS 'thing'.
-
how's the BIND implementation currently doing? would switching from unbound help here?
-
bind checks the config files it's using, and parses them again when it detects that one was 'touched' by some other process.
What I do know is this : it won't ditch the cache when this happens.
It can also unbind from and bind to interfaces when they come and go.
But I can't tell if it would be better.
bind is huge. Setup has to be done manually, even if you use a GUI like pfSense's.
bind needs users to look at the manual (huge also) for sure, otherwise DNS becomes a mess.
I don't bother : practically all my devices on my trusted LANs have static MAC leases, for IPv4 and IPv6, so 'my' unbound doesn't restart very often (less than once a day, probably even less frequently).
-
I have this problem after adding pfBlockerNG with a significantly large DNSBL list. Will try the following to reduce/eliminate impact:
- Reduce pfBlockerNG lists to a more reasonable size
- Add more static IP leases where reasonable
- Increase lease time
-
[Post deleted. My problem is not related to this.]
-
If you are getting disruption to VoIP calls that's clearly not directly DNS related. It may in fact be nothing to do with Unbound at all and in fact is just a symptom of something else that also causes Unbound to take far longer to reload.
Have you been seeing this before 2.4.5 or just since upgrading? If it's only in 2.4.5 you are probably hitting this:
https://redmine.pfsense.org/issues/10414
Try opening
top -aSH
and also pinging the firewall and then go to Status > Filter Reload in the gui and reload the filter.
If you see pings spike and processes shoot up to the top of the top table, pfctl, sshd, dpinger etc, then you almost certainly are hitting that.
Steve
-
Appreciate the feedback, thanks. I guess I am still digging on my issue because my son just confirmed to me that my specific problem is not yet resolved.
-
I'm assuming you are running this at home and don't have a massive number of dhcp clients?
There are thousands of users in the same situation, including me, who are not hitting this. I think it's likely that Unbound reloading causing disruption is in fact a symptom of something else rather than a cause.
Steve