Frequent unbound restarts
-
[Post deleted. My problem is not related to this.]
-
If you are getting disruption to VoIP calls, that's clearly not directly DNS related. It may in fact have nothing to do with Unbound at all and may just be a symptom of something else that also causes Unbound to take far longer to reload.
Have you been seeing this before 2.4.5 or just since upgrading? If it's only in 2.4.5 you are probably hitting this:
https://redmine.pfsense.org/issues/10414
Try opening
top -aSH
and also pinging the firewall, then go to Status > Filter Reload in the GUI and reload the filter.
If you see pings spike and processes such as pfctl, sshd and dpinger shoot up to the top of the top table, then you are almost certainly hitting that.
Steve
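To make "pings spike" a little less subjective, here is a minimal sketch that scans captured ping output for round-trip times far above the median. The 10x threshold, the sample output and the address 192.168.1.1 are assumptions for illustration only; nothing here is part of pfSense.

```python
import re
import statistics

# Matches the round-trip time in typical ping output lines, e.g. "time=0.412 ms".
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")

def parse_rtts(ping_output):
    """Return the round-trip times (in ms) found in captured ping output."""
    return [float(m.group(1)) for m in RTT_PATTERN.finditer(ping_output)]

def find_spikes(rtts, factor=10.0):
    """Return (index, rtt) pairs whose RTT exceeds factor times the median RTT."""
    if not rtts:
        return []
    baseline = statistics.median(rtts)
    return [(i, rtt) for i, rtt in enumerate(rtts) if rtt > factor * baseline]

# Hypothetical capture taken while reloading the filter.
sample = """\
64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.412 ms
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.398 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=187.3 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.405 ms
"""
print(find_spikes(parse_rtts(sample)))  # -> [(2, 187.3)]
```

Save the ping output to a file while you reload the filter and feed its contents to `parse_rtts`; a spike that lines up with the reload points at the issue linked above rather than at Unbound.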
-
Appreciate the feedback, thanks. I guess I am still digging on my issue because my son just confirmed to me that my specific problem is not yet resolved.
-
I'm assuming you are running this at home and don't have a massive number of dhcp clients?
There are thousands of users in the same situation, including me, who are not hitting this. I think it's likely that Unbound reloading causing disruption is in fact a symptom of something else rather than a cause.
Steve
-
This post is deleted!
-
Check the cron table; installing the Cron package makes that easy. What is running at those intervals?
Steve
-
This post is deleted!
-
I have turned off DHCP DNS registration for guest users and so on, and I have static DHCP leases for all my known devices, but the internet still goes offline once per day. The fact remains: it is a bug in pfSense that the lease script sends a HUP. It needs to be changed to reload only the local cache as described above.
-
This is the open bug covering this issue: https://redmine.pfsense.org/issues/5413
There is an open pull request there. dhcpleases is a binary though so would need to be compiled and swapped out to test.
Steve
-
@jasonArloUser said in Frequent unbound restarts:
I have turned off DHCP DNS registration for guest users and so on, and I have static DHCP leases for all my known devices, but the internet still goes offline once per day. The fact remains: it is a bug in pfSense that the lease script sends a HUP. It needs to be changed to reload only the local cache as described above.
If you remove the check, as shown here :
then process 'dhcpleases' - example :
2930 - Ss 0:00.01 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d your-local-domain.tld -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
will not run : unbound will not get restarted by a new DHCP lease.
If (your) unbound is restarting too often, some other process is responsible for this.
Btw : my 'unbound' process restarted 5 days ago - and I guess it was me doing so by changing a setting.
@stephenw10 said in Frequent unbound restarts:
This is the open bug covering this issue: https://redmine.pfsense.org/issues/5413
There is an open pull request there. dhcpleases is a binary though so would need to be compiled and swapped out to test.
Unbelievable that this issue still exists after 4 years. One might ask : is it really an issue, as circumventing it is rather easy to do ?
Also, unbound is a rather small resolver that handles the job very well.
It is 'unbound' that would have to be rewritten by its authors so that it can dynamically reread its config files when one changes - as the alternative, bind (named), does.
But bind also has its disadvantages. It's huge. And it is hard to administer when you hide its (hundreds of) options behind a GUI like pfSense's.
-
I have never hit it myself but clearly some people do. Switching to reload instead of restart does seem like the obvious option here. The fact it hasn't happened yet may imply I'm missing something though.
Steve
-
@stephenw10 It's particularly noticeable if you are using pfBlockerNG, which adds large lists of sites to the unbound config (to provide DNS based blocking), so the reload can take some seconds (the restart wasn't noticeable before I implemented pfBlockerNG). This was a major issue for me until I unticked the DHCP registration option. But having DHCP registration disabled is a bit lame.
Luke
-
I imagine there is a threshold where the latency for the different processes becomes critical. I run pfBlocker and have dhcp leases enabled and never have an issue. I seemingly have not hit that limit yet.
-
@stephenw10 said in Frequent unbound restarts:
I have never hit it myself but clearly some people do. Switching to reload instead of restart does seem like the obvious option here. The fact it hasn't happened yet may imply I'm missing something though.
Steve
This is what I don't understand either. This seems like a reaaally simple thing to fix.. and yes, I say FIX because this is absolutely a ridiculous flaw.
-
@lucas_nz said in Frequent unbound restarts:
But having DHCP registration disabled is a bit lame.
Shutting down that option is half the work.
This one stays on :
and you add all your devices to the "DHCP Static Mappings for this Interface" list.
@stephenw10 said in Frequent unbound restarts:
I run pfBlocker and have dhcp leases enabled and never have an issue.
I bet you didn't select "all the feeds" either ;)
edit : https://redmine.pfsense.org/issues/5413
-
@Gertjan said in Frequent unbound restarts:
I bet you didn't select "all the feeds" either ;)
Indeed I did not.
-
Does anyone know if this has been fixed in the latest update, 2.4.5-p1?
-
@jt said in Frequent unbound restarts:
Does anyone know if this has been fixed in the latest update, 2.4.5-p1?
Everybody knows.
See here - just above.
Again : as soon as [nlnetlabs.nl](https://nlnetlabs.nl/projects/unbound/about/) (the authors) rewrite unbound to implement something that could be the solution, this won't happen anymore.
As such, it's not a (pfSense) bug. At most, one could say that unbound is good, but not perfect.
The easy workaround is : declare static MAC DHCP leases for all the devices that you need to address 'by name' - these devices often host services to be accessed from your LAN.
-
This is completely wrong. Shutting off services in the device to fix outages is not acceptable. If a service is not supported, don't offer it as a feature.
Also, there is no problem with unbound. I repeat as people seem to not be getting this: THERE IS NO PROBLEM WITH UNBOUND. The problem is with the other software. Unbound has a way to reload specific zones via a command. The DHCP lease scripts incorrectly send a HUP to the process. THIS IS A BUG. It's not a design choice, it's not a different way to do it. It's broken, incorrectly written software. A lease can never affect any other zone than local so the local zone should be reloaded.
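As a minimal sketch of the incremental update argued for here, the following compares the previous and current lease-derived host lists and builds only the `unbound-control local_data` and `unbound-control local_data_remove` calls needed to reconcile them. Those two subcommands are documented in unbound-control(8); everything else (the function name, the host names, the assumption that remote control is enabled in the resolver's configuration, and the omission of PTR records) is illustrative and is not the code from the pull request.

```python
def plan_incremental_update(old_hosts, new_hosts):
    """Return unbound-control argument lists that turn old_hosts into new_hosts.

    Both arguments map a fully qualified host name to an IPv4 address.
    (Hypothetical helper; not part of pfSense or unbound.)
    """
    commands = []
    # Names that disappeared, or whose address changed, are removed first.
    for name in sorted(old_hosts):
        if new_hosts.get(name) != old_hosts[name]:
            commands.append(["unbound-control", "local_data_remove", name])
    # New or changed names are (re)added with their current address.
    for name in sorted(new_hosts):
        if old_hosts.get(name) != new_hosts[name]:
            commands.append(
                ["unbound-control", "local_data", name, "IN", "A", new_hosts[name]])
    return commands

# Hypothetical before/after state: one lease changed, one lease is new.
old = {"laptop.lan.example": "192.168.1.20", "phone.lan.example": "192.168.1.21"}
new = {"laptop.lan.example": "192.168.1.20", "phone.lan.example": "192.168.1.35",
       "tv.lan.example": "192.168.1.40"}

for cmd in plan_incremental_update(old, new):
    print(" ".join(cmd))
```

The commands are only printed, not executed. Actually running them (for example with `subprocess.run`) would change the live resolver's local data without a restart, which is the behaviour being asked for in this thread.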
If one pays for pfSense, is the support any better than this forum? Because this forum is just dismissive of seemingly every problem. IPv6 support is also broken and no one cares about that either.
-
@jasonArloUser said in Frequent unbound restarts:
This is completely wrong.
I need, and I guess we all need, workarounds, as the issue can't be solved easily.
In another thread I showed the source code of unbound : how it handles the reception of OS signals like SIGHUP, for example.
Whatever you send to unbound : there is no "reload the config" functionality - it restarts.
That is the reason why I tend to say : do the next best thing : limit the number of SIGHUPs that get sent to unbound. Which means : stop the DHCP registration. Not perfect, I know, because now the admin has some work to do to compensate for this initial loss of functionality - let's say 30 seconds of work for every device in the network. I mean : add DHCP static leases for every device that has to have a known host name in your network.
@jasonArloUser said in Frequent unbound restarts:
THERE IS NO PROBLEM WITH UNBOUND
I prefer to say : it's not perfect ;)
What I know is that the process that handles new DHCP leases so they get signalled to the DNS subsystem, known as "dhcpleases", is working as it should. It signals the current DNS system (unbound in our case) that there is a new host name available.
This dhcpleases doesn't say "here is xxx.localdomain.local with IP a.b.c.d".
This process observes the file /var/dhcpd/var/db/dhcpd.leases - maintained by the dhcpd server daemon.
When it changes, it parses /var/dhcpd/var/db/dhcpd.leases, rewrites /var/unbound/dhcpleases_entries.conf, gets the pid of unbound by reading /var/run/unbound.pid and sends a SIGHUP to it (unbound).
And unbound parses all the config files again ... by restarting.
It's not smart enough to detect that a line (or more) was added to or removed from the /var/unbound/dhcpleases_entries.conf file ....
Do you see this any different ?
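The sequence described above can be sketched roughly as follows. This is not the dhcpleases source (which is C): the file watching is left out, the lease parsing is heavily simplified, and the `local-data:` line format written to the entries file is an assumption about what unbound is given. The demo runs against temporary files with a fake signal sender, so no real process is signalled.

```python
import os
import re
import signal
import tempfile

# Simplified pattern for an ISC dhcpd lease block that carries a client-hostname.
LEASE_PATTERN = re.compile(
    r'lease\s+(\d+\.\d+\.\d+\.\d+)\s*\{[^}]*?client-hostname\s+"([^"]+)"')

def parse_leases(text):
    """Map hostname -> IP; a later lease for the same name overrides an earlier one."""
    return {host: ip for ip, host in LEASE_PATTERN.findall(text)}

def render_entries(hosts, domain):
    """Render one local-data line per host (assumed entries file format)."""
    return "".join(
        f'local-data: "{host}.{domain} IN A {ip}"\n'
        for host, ip in sorted(hosts.items()))

def refresh(leases_path, entries_path, pid_path, domain, send_signal=os.kill):
    """Rewrite the entries file from the leases file, then SIGHUP the resolver."""
    with open(leases_path) as f:
        hosts = parse_leases(f.read())
    with open(entries_path, "w") as f:
        f.write(render_entries(hosts, domain))
    with open(pid_path) as f:
        pid = int(f.read().strip())
    # The whole resolver is signalled, however small the change to the file was.
    send_signal(pid, signal.SIGHUP)
    return hosts

# Demo with temporary files and a recording stand-in for os.kill.
sent = []
with tempfile.TemporaryDirectory() as tmp:
    leases = os.path.join(tmp, "dhcpd.leases")
    entries = os.path.join(tmp, "dhcpleases_entries.conf")
    pidfile = os.path.join(tmp, "unbound.pid")
    with open(leases, "w") as f:
        f.write('lease 192.168.1.50 {\n  client-hostname "laptop";\n}\n')
    with open(pidfile, "w") as f:
        f.write("12345\n")  # hypothetical pid
    refresh(leases, entries, pidfile, "home.example",
            send_signal=lambda pid, sig: sent.append((pid, sig)))
    with open(entries) as f:
        written = f.read()

print(written, end="")
print(sent)
```

Note that one new lease produces one changed line in the entries file, yet the signal still makes unbound reread everything it was configured with.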
I obtained the "how it works" by reading the code (the C language is part of my professional education). Still, why not, I could be wrong ^^
Keep in mind : unbound is a lightweight DNS resolver with DNSSEC and forwarding capabilities. It does not advertise more.
It's not 'bind'. (bind is huge ...)
Btw : I'm not judging the system as a whole, just trying to understand how it works. I like to understand why things happen. It's a needed step if solutions need to be found.
edit : in the past, and in the present, unbound stops and starts fast.
Of course, when it restarts, its DNS cache is gone ..... no good, but many didn't notice, so ok ...
Then someone came by and invented "DNSBL" for pfSense, a way to populate the local DNS data with prebuilt replies that made it possible to screen out some IPs and domain names. You know the one I'm talking about.
The size of the config files read by unbound at startup exploded. Before, several kilobytes was usual. Now, check the forum : people don't even blink when they explain that they have a million or more DNSBL entries in their 'unbound' config files .... Why would one stop at a certain size if you can have it all ? "Just select them all". Consequences are unknown, thus nonexistent ....
The poor unbound has to parse them all at startup ..... because a stupid DHCP lease came in.
And during that time, tens of seconds, no more DNS .... and that, that was noticed. And here we are.
So, according to you : whose fault is this ? ;)