Frequent unbound restarts
-
Appreciate the feedback, thanks. I guess I am still digging on my issue because my son just confirmed to me that my specific problem is not yet resolved.
-
I'm assuming you are running this at home and don't have a massive number of dhcp clients?
There are thousands of users in the same situation, including me, who are not hitting this. I think it's likely that Unbound reloading and causing disruption is in fact a symptom of something else rather than a cause.
Steve
-
Check the cron table, install the Cron package to make it easy. What is running at those intervals?
Steve
-
I have turned off DHCP DNS registration for guest users and so on and I have static DHCP leases for all my known devices but I still have the internet go offline once per day. The fact remains: this is a bug in pfSense, in that the lease script is calling HUP. It needs to be changed to reload only the local cache as described above.
-
This is the open bug covering this issue: https://redmine.pfsense.org/issues/5413
There is an open pull request there. dhcpleases is a binary though so would need to be compiled and swapped out to test.
Steve
-
@jasonArloUser said in Frequent unbound restarts:
I have turned off DHCP DNS registration for guest users and so on and I have static DHCP leases for all my known devices but I still have the internet go offline once per day. The fact remains: this is a bug in pfSense, in that the lease script is calling HUP. It needs to be changed to reload only the local cache as described above.
If you remove the check, as shown here:
then the 'dhcpleases' process, for example:
2930 - Ss 0:00.01 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d your-local-domain.tld -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
will not run: unbound will not get restarted by a new DHCP lease.
If (your) unbound is restarting too often, some other process is responsible for this.
Btw: my 'unbound' process restarted 5 days ago, and I guess it was me doing so by changing a setting.
@stephenw10 said in Frequent unbound restarts:
This is the open bug covering this issue: https://redmine.pfsense.org/issues/5413
There is an open pull request there. dhcpleases is a binary though so would need to be compiled and swapped out to test.
Unbelievable that this issue still exists after 4 years. One might ask: is it really an issue, as circumventing it is rather easy to do?
Also, unbound is a rather small resolver that handles the job very well.
It is 'unbound' that would have to be rewritten by its authors so it can dynamically reread its config files when one changes, as the alternative, bind (named), does.
But bind also has its disadvantages. It's huge, and it's hard to administer when you hide its options (hundreds of them) behind a GUI like pfSense's.
-
I have never hit it myself but clearly some people do. Switching to reload instead of restart does seem like the obvious option here. The fact it hasn't happened yet may imply I'm missing something though.
Steve
-
@stephenw10 It's particularly noticeable if you are using pfBlockerNG - which adds large lists of sites to the unbound config (to provide DNS based blocking) and thus the reload can take some seconds (the restart wasn't noticeable before I implemented pfBlockerNG). This was a major issue for me until I unticked DHCP registration option. But having DHCP registration disabled is a bit lame.
Luke
-
I imagine there is a threshold where the latency for the different processes becomes critical. I run pfBlocker and have DHCP leases enabled and never have an issue. I seemingly have not hit that limit yet.
-
@stephenw10 said in Frequent unbound restarts:
I have never hit it myself but clearly some people do. Switching to reload instead of restart does seem like the obvious option here. The fact it hasn't happened yet may imply I'm missing something though.
Steve
This is what I don't understand either. This seems like a reaaally simple thing to fix.. and yes, I say FIX because this is absolutely a ridiculous flaw.
-
@lucas_nz said in Frequent unbound restarts:
But having DHCP registration disabled is a bit lame.
Shutting down that option is half the work.
This one stays on: and you add all your devices to the "DHCP Static Mappings for this Interface" list.
@stephenw10 said in Frequent unbound restarts:
I run pfBlocker and have dhcp leases enabled and never have an issue.
I bet you didn't select "all the feeds" either ;)
edit : https://redmine.pfsense.org/issues/5413
-
@Gertjan said in Frequent unbound restarts:
I bet you didn't select "all the feeds" either ;)
Indeed I did not.
-
Does anyone know if this has been fixed in the latest update, 2.4.5-p1?
-
@jt said in Frequent unbound restarts:
Does anyone know if this has been fixed in the latest update, 2.4.5-p1?
Everybody knows.
See here - just above.
Again: as soon as NLnet Labs (https://nlnetlabs.nl/projects/unbound/about/), the authors, rewrite unbound to implement something that could be the solution, this won't happen anymore.
As such, it's not a (pfSense) bug. At most, one could say that unbound is good, but not perfect.
The easy workaround is: declare static MAC DHCP leases for all the devices that you need to address 'by name'. These devices often host services to be accessed from your LAN.
-
This is completely wrong. Shutting off services in the device to fix outages is not acceptable. If a service is not supported, don't offer it as a feature.
Also, there is no problem with unbound. I repeat as people seem to not be getting this: THERE IS NO PROBLEM WITH UNBOUND. The problem is with the other software. Unbound has a way to reload specific zones via a command. The DHCP lease scripts incorrectly send a HUP to the process. THIS IS A BUG. It's not a design choice, it's not a different way to do it. It's broken, incorrectly written software. A lease can never affect any other zone than local so the local zone should be reloaded.
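For illustration, here is one possible reading of the targeted update argued for above. It is only a minimal sketch under stated assumptions, not how dhcpleases or pfSense actually works. `local_data` and `local_data_remove` are existing `unbound-control` subcommands, and they require remote control to be enabled in unbound. The function name `plan_local_data_updates`, the host names and the addresses are all made up for the example. The function only builds the argument lists and does not run anything.

```python
def plan_local_data_updates(old, new):
    """Return unbound-control argument lists that turn `old` into `new`.

    `old` and `new` map fully qualified host names to IPv4 addresses.
    Nothing is executed here; the caller decides what to do with the plan.
    """
    commands = []
    # Hosts whose lease disappeared: drop their local data.
    for name in sorted(old.keys() - new.keys()):
        commands.append(["unbound-control", "local_data_remove", name])
    # New or changed hosts: replace the record, leave everything else alone.
    for name in sorted(new):
        if old.get(name) != new[name]:
            if name in old:
                commands.append(["unbound-control", "local_data_remove", name])
            commands.append(
                ["unbound-control", "local_data", name, "IN", "A", new[name]]
            )
    return commands


if __name__ == "__main__":
    before = {"nas.home.arpa": "192.168.1.10"}
    after = {"nas.home.arpa": "192.168.1.10", "printer.home.arpa": "192.168.1.20"}
    for argv in plan_local_data_updates(before, after):
        print(" ".join(argv))
```

The point of the sketch is that a single new lease turns into one or two small commands, while unchanged hosts and the rest of the resolver state are not touched.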
If one pays for pfSense, is the support any better than this forum? Because this forum is just dismissive of seemingly every problem. IPv6 support is also broken and no one cares about that either.
-
@jasonArloUser said in Frequent unbound restarts:
This is completely wrong.
I need workarounds (we all do, I guess), as the issue can't be solved easily.
In another thread I showed the source code of unbound: how it handles the reception of OS signals such as SIGHUP.
Whatever you send to unbound, there is no "reload the config" functionality: it restarts.
That is the reason why I tend to say: do the next best thing and limit the number of SIGHUPs that get sent to unbound. Which means: stop the DHCP registration. Not perfect, I know, because now the admin has some work to do to compensate for this initial loss of functionality, let's say 30 seconds of work for every device in the network. I mean: add DHCP static leases for every device that has to have a known host name in your network.
@jasonArloUser said in Frequent unbound restarts:
THERE IS NO PROBLEM WITH UNBOUND
I prefer to say : it's not perfect ;)
What I know is that the process that handles new DHCP leases so they get signalled to the DNS subsystem, known as "dhcpleases", is working as it should. It signals the current DNS system (unbound in our case) that there is a new host name available.
This dhcpleases doesn't say "here is xxx.localdomain.local with IP a.b.c.d".
This process observes the file /var/dhcpd/var/db/dhcpd.leases, maintained by the dhcpd server daemon.
When it changes, it parses /var/dhcpd/var/db/dhcpd.leases, rewrites /var/unbound/dhcpleases_entries.conf, gets the pid of unbound by reading /var/run/unbound.pid and sends a SIGHUP to it (unbound).
And unbound parses all the config files again ... by restarting.
It's not smart enough to detect that a line (or more) was added to or removed from the /var/unbound/dhcpleases_entries.conf file ....
Do you see this any differently?
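The last two steps of that flow (rewrite the entries file, read the pid file, send SIGHUP) can be sketched roughly as below. This is a simplified illustration only: the real dhcpleases is a C binary, and the sketch leaves out the file watching and the lease parsing. The function names are hypothetical, and the `local-data:` line layout written to the entries file is an assumption based on unbound's config syntax, not copied from pfSense.

```python
import os
import signal
from pathlib import Path


def render_entries(hosts, domain):
    """Render host name -> IPv4 pairs as unbound `local-data:` lines (assumed layout)."""
    lines = []
    for name, ip in sorted(hosts.items()):
        lines.append(f'local-data: "{name}.{domain} IN A {ip}"')
    return "\n".join(lines) + "\n"


def publish_leases(hosts, domain, entries_path, pid_path, send_signal=os.kill):
    """Rewrite the entries file, then send SIGHUP to the pid found in the pid file.

    `send_signal` defaults to os.kill and can be swapped for a stub when testing.
    """
    Path(entries_path).write_text(render_entries(hosts, domain))
    pid = int(Path(pid_path).read_text().strip())
    # This is the step the thread is about: one signal for every change,
    # no matter how small the change to the entries file was.
    send_signal(pid, signal.SIGHUP)
    return pid
```

Note that nothing in this flow tells the resolver which line changed. It only says "something changed", which is why the whole configuration gets read again.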
I obtained the "how it works" by reading the code (the C language is part of my professional education). Still, why not, I could be wrong ^^
Keep in mind: unbound is a lightweight DNS resolver with DNSSEC and forwarding capabilities. It does not advertise more.
It's not 'bind'. (bind is huge ...)
Btw: I'm not judging the system as a whole, just trying to understand how it works. I like to understand why things happen. It's a needed step if solutions need to be found.
edit: in the past, and in the present, unbound stops and starts fast.
Of course, when it restarts, its DNS cache is gone ..... no good, but many didn't notice, so ok ...
Then someone came by and invented "DNSBL" for pfSense, and a way to repopulate the local DNS cache with pre-built replies that made it possible to screen out some IPs and domain names. You know the one I'm talking about.
The size of the config files read by unbound at startup exploded. Before, several kilobytes was usual. Now, check the forum: people don't even blink when they explain that they have a million or more DNSBL entries in their 'unbound' config files .... Why would one stop at a certain size if you can have it all? "Just select them all". Consequences are unknown, thus non-existent ....
The poor unbound has to parse them all at startup ..... because a stupid DHCP lease came in.
And during that time, tens of seconds, no more DNS .... and that, that was noticed. And here we are.
So, according to you: whose fault is this? ;)
-
@Gertjan said in Frequent unbound restarts:
Keep in mind : unbound is a light weight DNS Resolver with DNSSEC and forward capabilities. It does not advertise more.
It's not 'bind'. (bind is huge ...)
True, but Netgate encourages people to use this as their DNS solution. Another way to read it is that they sell $5300 devices that can't even serve DNS properly.
The poor unbound has to parse them all at startup ..... because a stupid DHCP lease came in.
And during that time, tens of seconds, no more DNS .... and that, that was noticed. And here we are.
So, according to you: whose fault is this? ;)
It is true that a typical "fat" client with a DNS cache has no issues with DNS not being available for a few moments. However, on my network, with numerous embedded devices & sensors (and yes, even a Google Home device) which do NOT do any DNS caching but always do a DNS lookup, we have continuous measurement interruptions every time Unbound does a restart, even if it only takes 500 ms on my device.
That is operational impact for Netgate device customers (not people running some free version of pfSense somewhere), for an issue that has been known already for three years (start of this thread).
BTW: Thanks for your investigative work, was an interesting read!
-
Hello!
On a very small network (sg-3100) with 10 or so regular devices (devices not coming and going) with DHCP leases, I am seeing around 10 dns restarts per hour initiated by dhcpd ("pfSense dhcpleases: Sending HUP signal to dns daemon"). Stock OOTB snort and pfb. No feed or rule craziness. I am not getting any blowback from users about internet flakiness.
Does dhcpd restart the DNS on lease renewals or DHCPREQUEST/DHCPACK traffic? I can't tell from the logs. It is hard to match up those restarts to any specific dhcpd activity.
The default lease time is only 2hrs. On a large network with lots of leases that could be many restarts? Maybe a longer lease time?
John
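A back-of-the-envelope estimate of how the lease time could affect the restart count, under two assumptions that this thread does not confirm: each client renews at half its lease time (the usual DHCP default), and every renewal ends in one HUP. The second assumption is exactly the open question above, so treat the result as an upper bound at best. The function name is made up for the example.

```python
def renewals_per_hour(clients, lease_seconds, renew_fraction=0.5):
    """Estimated renewals per hour if each client renews at a fraction of its lease."""
    renew_interval = lease_seconds * renew_fraction
    return clients * 3600 / renew_interval


if __name__ == "__main__":
    # 10 clients with a 2 hour lease versus a 24 hour lease.
    print(renewals_per_hour(10, 2 * 3600))
    print(renewals_per_hour(10, 24 * 3600))
```

Under these assumptions the count scales with the number of clients and inversely with the lease time, so a longer lease would reduce it proportionally.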