Unbound Appears to restart frequently and fails to resolve domains sometimes.
-
Oct 1 21:29:31 unbound 15524:1 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
The DNSSEC option isn't activated by default.
It should work (it does for me), but it raises a more important question: what else did you change from the defaults?
Btw: Unbound can't fetch the primary built-in DS key. It's very strange that this one times out; it's as if priming against the root servers isn't working. Is your network connection OK?
Like (one example, many more exist): if you checked "Services => DNS Resolver => General Settings => DHCP Registration" and your pfSense is subjected to a DHCP hail-storm, then Unbound would restart like a machine gun.
Hi Gertjan, I think you described exactly my problem. Any time I enable DHCP registration in the resolver, unbound restarts a lot. I mean a lot…
Mar 28 07:33:06 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 07:33:08 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 07:45:39 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 07:45:41 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 07:45:56 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 07:45:57 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 07:58:38 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 07:58:39 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 08:03:29 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 08:03:31 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 08:03:31 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 08:03:32 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 08:04:41 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 08:04:43 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 08:33:07 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 08:33:09 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 08:45:40 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 08:45:41 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 08:45:56 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 08:45:57 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
That's just a little of it. I'd like to be able to use that feature without killing unbound. What are my options? Disabling DHCP registration is the first option, but what else?
Raffi
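For anyone who wants to put a number on this, here is a minimal Python sketch that pairs each "service stopped" entry with the following "start of service" entry and reports how many restarts occurred and how long each gap was. It assumes the log format shown in the excerpt above; the function names are made up for illustration, and because syslog lines carry no year, the `year` argument is an arbitrary placeholder.

```python
import re
from datetime import datetime

# A few lines copied from the log excerpt above.
SAMPLE_LOG = """\
Mar 28 07:33:06 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 07:33:08 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
Mar 28 07:45:39 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
Mar 28 07:45:41 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
"""

# Matches the timestamp and the stop/start event of an unbound log line.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ unbound: \[\d+:\d+\] info: "
    r"(?P<event>service stopped|start of service)"
)


def parse_events(text, year=2018):
    """Return (datetime, event) tuples. The year is an assumed placeholder."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["event"]))
    return events


def restart_gaps(events):
    """Pair each stop with the next start; return the gaps in seconds."""
    gaps = []
    stopped_at = None
    for ts, event in events:
        if event == "service stopped":
            stopped_at = ts
        elif stopped_at is not None:
            gaps.append((ts - stopped_at).total_seconds())
            stopped_at = None
    return gaps


events = parse_events(SAMPLE_LOG)
gaps = restart_gaps(events)
print(f"{len(gaps)} restarts, seconds between stop and start: {gaps}")
```

The gap only measures the time between the two log entries, so treat it as a rough lower bound on how long the resolver was unavailable.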
-
@Raffi.:
….
What are my options? Disabling DHCP registration is the first option, but what else?
On my LAN, all devices have been present for months if not years. I gave them all static DHCP leases, so my unbound doesn't restart often - March 25 (3 days ago now) was the last time, actually.
I have "Static DHCP - Register DHCP static mappings in the DNS Resolver" checked, of course.
-
Thanks Gertjan, most of my clients have static DHCP reservations also. That is working fine and not causing unbound to restart. I have been reading up on multiple previous threads about this issue, and at some point you described it as normal behavior for unbound to restart when the DHCP leases are written. I will leave DHCP registration unchecked in that case. It has been known not only to cause the resolver to restart, but also to make name resolution very slow and unresponsive at times. That's probably because it's busy restarting and obviously can't resolve during that process.
-
I'm glad to read this topic! I kinda suffered from slow DNS resolution with unbound and it had nothing to do with unbound being slow, of course. It seems to me that this should've been documented better because it made me wonder a couple hours before I figured out what was causing my nightmare. Anyways, I'll share my experience just in case another reader finds value in it.
Having "Register DHCP leases in the DNS Resolver" or "Register DHCP static mappings in the DNS Resolver" checked conveniently lets hostnames on the network resolve in the blink of an eye, but it does cause unbound to restart. In my particular case, I use unbound to blacklist (return 0.0.0.0 for) a list of nearly 100,000 hostnames, so I expect unbound to take longer to restart because it has to load my blacklist.conf.
Since all DNS and DHCP services in my network are handled by pfsense, this caused quite a bit of a problem for me. Guest network devices, dev VMs and testing scripts that deliberately rename hostnames added further restarts, as you can imagine.
Just four days ago I added a secondary DNS service (forwarder) to my LAN because of the many slowdowns I suffered. This has eradicated the issue for me. Basically, having dnsmasq on a secondary box caching and forwarding to pfsense, and to an external resolver when the pfsense resolver is unavailable, has kept everyone quiet - especially the one who MUST be obeyed (aka my wife).
In the DHCP options, I set the secondary DNS box as the first resolver for clients in all LANs. I also increased DHCP leases to 15 days. In the end, the workaround was to use a secondary Linux server that was already running on the network anyway.
Does anyone have suggestions or a different way to handle unbound restarts in pfsense?
-
Here is one of the more popular threads on this issue: https://forum.pfsense.org/index.php?topic=89589.msg765049#msg765049. Some have reported success with the various solutions posted there. Others link to fixes in different threads. Unfortunately, none of the fixes worked for me. I still have DHCP registration unchecked. The static DHCP registration is not an issue for me, though. Luckily, the clients I actually care about resolving have static reservations anyway.
Good luck!
Raffi
-
Same problem for me. Activating NOTIFY for this thread.
-
I ended up just using dnsmasq with dnscrypt-proxy on a secondary box as my primary DNS server for all internal networks. Unbound is also limited when doing DNS over TLS (it is slow since it does not reuse connections).
This is my workaround:
1- pfSense is still my DHCP Server and Secondary DNS. (still registering DHCP leases in the DNS Resolver).
- DHCP leases 15 days
- Increased DNS TTL in Unbound and forward to upstream over TLS - (initial query is slow but once cache kicks in it is all good).
2- LANs DNS 1 - Linux Box: dnsmasq with dnscrypt-proxy 2.0.9 (forwards the local domain to pfsense so that LAN hostnames can be resolved)
-
@ralphys how did you achieve having pfSense as your DHCP server and using it as a secondary DNS? Do you use the forwarder to forward requests to the primary DNS? Or how did you implement it? I'm in much the same situation and am looking for a solution.
-
@ceofreak said in Unbound Appears to restart frequently and fails to resolve domains sometimes.:
@ralphys how did you achieve having pfSense as your DHCP server and using it as a secondary DNS? Do you use the forwarder to forward requests to the primary DNS? Or how did you implement it? I'm in much the same situation and am looking for a solution.
Let me try to help with that.
1- I'm not covering how to set up dnsmasq and dnscrypt-proxy, but I will add some general configuration as guidance.
pfSense Unbound Config
Services => DNS Resolver
Enable the options below:
DNSSEC: Enable DNSSEC Support
DNS Query Forwarding: Enable Forwarding Mode
DHCP Registration: Register DHCP leases in the DNS Resolver
Static DHCP: Register DHCP static mappings in the DNS Resolver
Custom Options:
server:
forward-zone:
    name: "."
    forward-ssl-upstream: yes
    forward-addr: 1.0.0.1@853
    forward-addr: 9.9.9.9@853
server:
    private-domain: "plex.direct"
At this point your DNS queries will be forwarded to upstream servers from pfSense as requests come in (if not in the cache).
Services => DHCP Server:
DNS servers: 192.168.1.2 <= This is the linux box with dnsmasq/dnscrypt
DNS servers: 192.168.1.1 <= This is pfSense as secondary DNS Server
At this point, when a client requests a lease, DHCP provides the lease along with the primary and secondary DNS servers.
If you have multiple VLANs, add the primary and secondary DNS servers for each of them. E.g.:
VLAN 90
Primary DNS: 192.168.1.2 <= assuming this is the IP of your linux box with dnsmasq/dnscrypt.
Secondary DNS 192.168.90.1 <= pfSense as secondary DNS for VLAN 90.
... and so on.
Default lease time : 1296000
Maximum lease time : 2592000
With that in place, it is a matter of configuring your primary DNS (dnsmasq on the Linux box with dnscrypt-proxy).
This is my current configuration as reference:
/etc/dnsmasq.conf
listen-address=127.0.0.1,192.168.1.2
port=53
bind-interfaces
# upstream DNS Server (pfsense)
expand-hosts
server=/lab.domain.net/192.168.1.1
domain=lab.domain.net,192.168.1.0/24
rebind-domain-ok=/plex.direct/
resolv-file=/etc/resolv.dnsmasq
strict-order
# advanced options
filterwin2k
cache-size=100000
dns-forward-max=1000
neg-ttl=60
max-ttl=3600
min-cache-ttl=600
# logging
log-facility=/var/log/dnsmasq.log
log-queries
log-async=10
You will also need to configure dnscrypt-proxy. Basically, all your clients will use dnsmasq (192.168.1.2 in my example configuration above) as the primary DNS. Dnsmasq will forward all requests to dnscrypt-proxy on your Linux box, and your requests leave your network encrypted.
As you can see in dnsmasq.conf, requests for the local domain are forwarded to pfsense instead of dnscrypt-proxy:
server=/lab.domain.net/192.168.1.1
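To illustrate what that line does, here is a rough Python sketch of the routing decision: names under the local domain go to pfSense, everything else goes to the default upstream. It is only a simplified model of domain-specific forwarding, not dnsmasq's actual code; `pick_upstream`, `RULES` and the `"dnscrypt-proxy"` label are hypothetical names used for illustration.

```python
def pick_upstream(qname, domain_rules, default_upstream):
    """Return the upstream for qname, preferring the most specific matching domain."""
    name = qname.rstrip(".").lower()
    best = None
    for domain, upstream in domain_rules.items():
        d = domain.strip(".").lower()
        # Match the domain itself or any name below it (on a label boundary).
        if name == d or name.endswith("." + d):
            if best is None or len(d) > len(best[0]):
                best = (d, upstream)
    return best[1] if best else default_upstream


# Mirrors server=/lab.domain.net/192.168.1.1 from the configuration above.
RULES = {"lab.domain.net": "192.168.1.1"}
# Placeholder label for whatever resolv.dnsmasq points at (assumed to be dnscrypt-proxy).
DEFAULT = "dnscrypt-proxy"

for host in ("nas.lab.domain.net", "example.com"):
    print(host, "->", pick_upstream(host, RULES, DEFAULT))
```

With these rules, a hypothetical host such as nas.lab.domain.net would be sent to 192.168.1.1 (pfSense), while example.com falls through to the default upstream.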
That should give you an idea.
Cheers!
-
@ralphys thanks for this long answer! I will run some tests as soon as I get around to it and report back here.