Unbound Appears to restart frequently and fails to resolve domains sometimes.



  • Hi,
    I have no formal training in IT apart from web dev, but I've been fiddling with networks for years. The granularity pfSense provides is amazing compared to anything available to consumers. I'm currently running 2.3.4-RELEASE-p1 (amd64) on an old IBM System x3200 M3 that I was given after it was removed from service (no hardware issues). I ran this setup for a few weeks with a few test machines attached and had no problems. pfSense may have been updated since then, so an update may have introduced the problem.
    I've switched between dnsmasq and unbound, and I have not noticed any issues with dnsmasq. With unbound, after opening several sites that I have not visited before in quick succession (or after browsing for a while), I get the following errors in Chrome.
    First:
    ERR_NAME_RESOLUTION_FAILED
    Then:
    DNS_PROBE_FINISHED_NXDOMAIN

    If I look at my DNS Resolver logs, there are a huge number of unbound restarts.

    Oct 1 21:29:24 unbound 15524:0 info: 1.000000 2.000000 3
    Oct 1 21:29:24 unbound 15524:0 info: server stats for thread 2: 7 queries, 2 answers from cache, 5 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 1 21:29:24 unbound 15524:0 info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0
    Oct 1 21:29:24 unbound 15524:0 info: average recursion processing time 0.685782 sec
    Oct 1 21:29:24 unbound 15524:0 info: histogram of recursion processing times
    Oct 1 21:29:24 unbound 15524:0 info: [25%]=0.32768 median[50%]=0.643216 [75%]=0.940536
    Oct 1 21:29:24 unbound 15524:0 info: lower(secs) upper(secs) recursions
    Oct 1 21:29:24 unbound 15524:0 info: 0.016384 0.032768 1
    Oct 1 21:29:24 unbound 15524:0 info: 0.262144 0.524288 1
    Oct 1 21:29:24 unbound 15524:0 info: 0.524288 1.000000 2
    Oct 1 21:29:24 unbound 15524:0 info: 1.000000 2.000000 1
    Oct 1 21:29:24 unbound 15524:0 info: server stats for thread 3: 20 queries, 5 answers from cache, 15 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 1 21:29:24 unbound 15524:0 info: server stats for thread 3: requestlist max 2 avg 0.333333 exceeded 0 jostled 0
    Oct 1 21:29:24 unbound 15524:0 info: average recursion processing time 0.626031 sec
    Oct 1 21:29:24 unbound 15524:0 info: histogram of recursion processing times
    Oct 1 21:29:24 unbound 15524:0 info: [25%]=0.301466 median[50%]=0.498074 [75%]=0.910804
    Oct 1 21:29:24 unbound 15524:0 info: lower(secs) upper(secs) recursions
    Oct 1 21:29:24 unbound 15524:0 info: 0.065536 0.131072 1
    Oct 1 21:29:24 unbound 15524:0 info: 0.131072 0.262144 2
    Oct 1 21:29:24 unbound 15524:0 info: 0.262144 0.524288 5
    Oct 1 21:29:24 unbound 15524:0 info: 0.524288 1.000000 4
    Oct 1 21:29:24 unbound 15524:0 info: 1.000000 2.000000 3
    Oct 1 21:29:24 unbound 15524:0 notice: Restart of unbound 1.6.1.
    Oct 1 21:29:24 unbound 15524:0 notice: init module 0: validator
    Oct 1 21:29:24 unbound 15524:0 notice: init module 1: iterator
    Oct 1 21:29:24 unbound 15524:0 info: start of service (unbound 1.6.1).
    Oct 1 21:29:31 unbound 15524:1 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:29:31 unbound 15524:2 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:29:31 unbound 15524:1 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:29:31 unbound 15524:0 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:29:31 unbound 15524:2 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:29:31 unbound 15524:2 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:29:31 unbound 15524:0 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:40:00 unbound 15524:0 info: service stopped (unbound 1.6.1).
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 0: 278 queries, 27 answers from cache, 251 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 0: requestlist max 6 avg 1.11952 exceeded 0 jostled 0
    Oct 1 21:40:00 unbound 15524:0 info: average recursion processing time 0.673429 sec
    Oct 1 21:40:00 unbound 15524:0 info: histogram of recursion processing times
    Oct 1 21:40:00 unbound 15524:0 info: [25%]=0.129252 median[50%]=0.386494 [75%]=0.889721
    Oct 1 21:40:00 unbound 15524:0 info: lower(secs) upper(secs) recursions
    Oct 1 21:40:00 unbound 15524:0 info: 0.000000 0.000001 4
    Oct 1 21:40:00 unbound 15524:0 info: 0.016384 0.032768 19
    Oct 1 21:40:00 unbound 15524:0 info: 0.032768 0.065536 31
    Oct 1 21:40:00 unbound 15524:0 info: 0.065536 0.131072 9
    Oct 1 21:40:00 unbound 15524:0 info: 0.131072 0.262144 44
    Oct 1 21:40:00 unbound 15524:0 info: 0.262144 0.524288 39
    Oct 1 21:40:00 unbound 15524:0 info: 0.524288 1.000000 55
    Oct 1 21:40:00 unbound 15524:0 info: 1.000000 2.000000 32
    Oct 1 21:40:00 unbound 15524:0 info: 2.000000 4.000000 15
    Oct 1 21:40:00 unbound 15524:0 info: 4.000000 8.000000 3
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 1: 144 queries, 15 answers from cache, 129 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 1: requestlist max 3 avg 0.387597 exceeded 0 jostled 0
    Oct 1 21:40:00 unbound 15524:0 info: average recursion processing time 0.762516 sec
    Oct 1 21:40:00 unbound 15524:0 info: histogram of recursion processing times
    Oct 1 21:40:00 unbound 15524:0 info: [25%]=0.16384 median[50%]=0.435159 [75%]=0.958833
    Oct 1 21:40:00 unbound 15524:0 info: lower(secs) upper(secs) recursions
    Oct 1 21:40:00 unbound 15524:0 info: 0.000000 0.000001 2
    Oct 1 21:40:00 unbound 15524:0 info: 0.016384 0.032768 9
    Oct 1 21:40:00 unbound 15524:0 info: 0.032768 0.065536 12
    Oct 1 21:40:00 unbound 15524:0 info: 0.065536 0.131072 4
    Oct 1 21:40:00 unbound 15524:0 info: 0.131072 0.262144 21
    Oct 1 21:40:00 unbound 15524:0 info: 0.262144 0.524288 25
    Oct 1 21:40:00 unbound 15524:0 info: 0.524288 1.000000 26
    Oct 1 21:40:00 unbound 15524:0 info: 1.000000 2.000000 21
    Oct 1 21:40:00 unbound 15524:0 info: 2.000000 4.000000 5
    Oct 1 21:40:00 unbound 15524:0 info: 4.000000 8.000000 4
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 2: 102 queries, 17 answers from cache, 85 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 2: requestlist max 5 avg 0.294118 exceeded 0 jostled 0
    Oct 1 21:40:00 unbound 15524:0 info: average recursion processing time 0.666056 sec
    Oct 1 21:40:00 unbound 15524:0 info: histogram of recursion processing times
    Oct 1 21:40:00 unbound 15524:0 info: [25%]=0.08192 median[50%]=0.386662 [75%]=0.889567
    Oct 1 21:40:00 unbound 15524:0 info: lower(secs) upper(secs) recursions
    Oct 1 21:40:00 unbound 15524:0 info: 0.000000 0.000001 1
    Oct 1 21:40:00 unbound 15524:0 info: 0.016384 0.032768 7
    Oct 1 21:40:00 unbound 15524:0 info: 0.032768 0.065536 12
    Oct 1 21:40:00 unbound 15524:0 info: 0.065536 0.131072 5
    Oct 1 21:40:00 unbound 15524:0 info: 0.131072 0.262144 8
    Oct 1 21:40:00 unbound 15524:0 info: 0.262144 0.524288 20
    Oct 1 21:40:00 unbound 15524:0 info: 0.524288 1.000000 14
    Oct 1 21:40:00 unbound 15524:0 info: 1.000000 2.000000 13
    Oct 1 21:40:00 unbound 15524:0 info: 2.000000 4.000000 3
    Oct 1 21:40:00 unbound 15524:0 info: 4.000000 8.000000 2
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 3: 33 queries, 6 answers from cache, 27 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 1 21:40:00 unbound 15524:0 info: server stats for thread 3: requestlist max 3 avg 0.333333 exceeded 0 jostled 0
    Oct 1 21:40:00 unbound 15524:0 info: average recursion processing time 0.580928 sec
    Oct 1 21:40:00 unbound 15524:0 info: histogram of recursion processing times
    Oct 1 21:40:00 unbound 15524:0 info: [25%]=0.25559 median[50%]=0.475136 [75%]=0.881072
    Oct 1 21:40:00 unbound 15524:0 info: lower(secs) upper(secs) recursions
    Oct 1 21:40:00 unbound 15524:0 info: 0.032768 0.065536 1
    Oct 1 21:40:00 unbound 15524:0 info: 0.065536 0.131072 1
    Oct 1 21:40:00 unbound 15524:0 info: 0.131072 0.262144 5
    Oct 1 21:40:00 unbound 15524:0 info: 0.262144 0.524288 8
    Oct 1 21:40:00 unbound 15524:0 info: 0.524288 1.000000 7
    Oct 1 21:40:00 unbound 15524:0 info: 1.000000 2.000000 5
    Oct 1 21:40:00 unbound 15524:0 notice: Restart of unbound 1.6.1.
    Oct 1 21:40:00 unbound 15524:0 notice: init module 0: validator
    Oct 1 21:40:00 unbound 15524:0 notice: init module 1: iterator
    Oct 1 21:40:00 unbound 15524:0 info: start of service (unbound 1.6.1).
    Oct 1 21:40:01 unbound 15524:0 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    Oct 1 21:40:02 unbound 15524:0 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN

    Need advice on what I should investigate next.
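
As a first investigation step, it can help to measure how often the restarts actually happen. The following is a minimal Python sketch, assuming the resolver log has been saved as plain text in the same format as the excerpt above; the function name `restarts_per_hour` and the sample list are illustrative only and are not part of pfSense or unbound.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above, e.g.
# "Oct 1 21:29:24 unbound 15524:0 notice: Restart of unbound 1.6.1."
RESTART = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"unbound\s+\d+:\d+\s+notice:\s+Restart of unbound"
)


def restarts_per_hour(lines):
    """Count 'Restart of unbound' notices, grouped by (month, day, hour)."""
    counts = Counter()
    for line in lines:
        m = RESTART.match(line.strip())
        if m:
            counts[(m["month"], int(m["day"]), int(m["hour"]))] += 1
    return counts


# Sample lines copied from the log excerpt in this post.
sample = [
    "Oct 1 21:29:24 unbound 15524:0 notice: Restart of unbound 1.6.1.",
    "Oct 1 21:29:24 unbound 15524:0 info: start of service (unbound 1.6.1).",
    "Oct 1 21:40:00 unbound 15524:0 info: service stopped (unbound 1.6.1).",
    "Oct 1 21:40:00 unbound 15524:0 notice: Restart of unbound 1.6.1.",
]

for (month, day, hour), n in sorted(restarts_per_hour(sample).items()):
    print(f"{month} {day} {hour:02d}:00  restarts={n}")
```

A steady count every hour would point towards something periodic (DHCP renewals, for instance), while bursts that line up with browsing would point elsewhere.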



  • Try updating to 2.4RC. Unbound restarts have been vastly reduced under 2.4.



  • Thank you marjohn56,
    Upgrade to 2.4 complete. It seems to crash a bit less now and recovers faster, but I have still managed to get it to crash after opening about 10-20 new domains in a browser in succession. The other issue I have now is that under System Logs > DNS Resolver, nothing seems to be written to the log.

    I was experiencing the same symptoms as before at 23:57 and 23:59 but nothing has been output to the log.

    Oct 2 23:54:27 unbound 80091:0 notice: Restart of unbound 1.6.3.
    Oct 2 23:54:27 unbound 80091:0 notice: init module 0: validator
    Oct 2 23:54:27 unbound 80091:0 notice: init module 1: iterator
    Oct 2 23:54:27 unbound 80091:0 info: start of service (unbound 1.6.3).
    Oct 2 23:54:29 unbound 80091:0 info: service stopped (unbound 1.6.3).
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 0: 2 queries, 1 answers from cache, 1 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 1: 3 queries, 0 answers from cache, 3 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 1: requestlist max 2 avg 1 exceeded 0 jostled 0
    Oct 2 23:54:29 unbound 80091:0 info: average recursion processing time 0.434081 sec
    Oct 2 23:54:29 unbound 80091:0 info: histogram of recursion processing times
    Oct 2 23:54:29 unbound 80091:0 info: [25%]=0 median[50%]=0 [75%]=0
    Oct 2 23:54:29 unbound 80091:0 info: lower(secs) upper(secs) recursions
    Oct 2 23:54:29 unbound 80091:0 info: 0.262144 0.524288 2
    Oct 2 23:54:29 unbound 80091:0 info: 0.524288 1.000000 1
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 2: 1 queries, 0 answers from cache, 1 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0
    Oct 2 23:54:29 unbound 80091:0 info: average recursion processing time 0.331310 sec
    Oct 2 23:54:29 unbound 80091:0 info: histogram of recursion processing times
    Oct 2 23:54:29 unbound 80091:0 info: [25%]=0 median[50%]=0 [75%]=0
    Oct 2 23:54:29 unbound 80091:0 info: lower(secs) upper(secs) recursions
    Oct 2 23:54:29 unbound 80091:0 info: 0.262144 0.524288 1
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 3: 3 queries, 0 answers from cache, 3 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:29 unbound 80091:0 info: server stats for thread 3: requestlist max 4 avg 2 exceeded 0 jostled 0
    Oct 2 23:54:29 unbound 80091:0 info: average recursion processing time 0.337855 sec
    Oct 2 23:54:29 unbound 80091:0 info: histogram of recursion processing times
    Oct 2 23:54:29 unbound 80091:0 info: [25%]=0 median[50%]=0 [75%]=0
    Oct 2 23:54:29 unbound 80091:0 info: lower(secs) upper(secs) recursions
    Oct 2 23:54:29 unbound 80091:0 info: 0.131072 0.262144 1
    Oct 2 23:54:29 unbound 80091:0 info: 0.524288 1.000000 1
    Oct 2 23:54:29 unbound 80091:0 notice: Restart of unbound 1.6.3.
    Oct 2 23:54:29 unbound 80091:0 notice: init module 0: validator
    Oct 2 23:54:29 unbound 80091:0 notice: init module 1: iterator
    Oct 2 23:54:29 unbound 80091:0 info: start of service (unbound 1.6.3).
    Oct 2 23:54:31 unbound 80091:0 info: service stopped (unbound 1.6.3).
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 0: 2 queries, 0 answers from cache, 2 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 0: requestlist max 2 avg 1 exceeded 0 jostled 0
    Oct 2 23:54:31 unbound 80091:0 info: average recursion processing time 0.214835 sec
    Oct 2 23:54:31 unbound 80091:0 info: histogram of recursion processing times
    Oct 2 23:54:31 unbound 80091:0 info: [25%]=0 median[50%]=0 [75%]=0
    Oct 2 23:54:31 unbound 80091:0 info: lower(secs) upper(secs) recursions
    Oct 2 23:54:31 unbound 80091:0 info: 0.032768 0.065536 1
    Oct 2 23:54:31 unbound 80091:0 info: 0.262144 0.524288 1
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 2: 1 queries, 0 answers from cache, 1 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0
    Oct 2 23:54:31 unbound 80091:0 info: average recursion processing time 1.557454 sec
    Oct 2 23:54:31 unbound 80091:0 info: histogram of recursion processing times
    Oct 2 23:54:31 unbound 80091:0 info: [25%]=0 median[50%]=0 [75%]=0
    Oct 2 23:54:31 unbound 80091:0 info: lower(secs) upper(secs) recursions
    Oct 2 23:54:31 unbound 80091:0 info: 1.000000 2.000000 1
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 3: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Oct 2 23:54:31 unbound 80091:0 info: server stats for thread 3: requestlist max 0 avg 0 exceeded 0 jostled 0
    Oct 2 23:54:31 unbound 80091:0 notice: Restart of unbound 1.6.3.
    Oct 2 23:54:31 unbound 80091:0 notice: init module 0: validator
    Oct 2 23:54:31 unbound 80091:0 notice: init module 1: iterator
    Oct 2 23:54:31 unbound 80091:0 info: start of service (unbound 1.6.3).



  • @1337cookie:

    Oct 1 21:29:31   unbound   15524:1   info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN
    

    The DNSSEC option isn't activated by default.
    It should work (it does for me), but it raises a more important question: what else did you change from the defaults?
    Btw: Unbound can't fetch the root key that matches its built-in trust anchor. It's very strange that this one times out; it's as if priming against the 13 root servers fails. Is your network connection OK?
    For example (many more exist): if you checked "Services => DNS Resolver => General Settings => DHCP Registration" and your pfSense is subjected to a DHCP hailstorm, then Unbound would restart like a machine gun.



  • @Gertjan:

    Hi Gertjan, I think you described exactly my problem. Anytime I enable DHCP registration in the resolver, unbound restarts a lot. I mean a lot…
    Mar 28 07:33:06 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 07:33:08 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 07:45:39 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 07:45:41 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 07:45:56 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 07:45:57 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 07:58:38 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 07:58:39 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 08:03:29 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 08:03:31 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 08:03:31 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 08:03:32 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 08:04:41 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 08:04:43 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 08:33:07 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 08:33:09 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 08:45:40 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 08:45:41 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).
    Mar 28 08:45:56 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).
    Mar 28 08:45:57 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).

    That's just a little of it. I'd like to be able to use that feature without killing unbound. What are my options? Disabling DHCP registration is the first option, but what else?

    Raffi
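
To put a number on how long the resolver is away during each of these restarts, the stop/start pairs in a log like the one above can be paired up. This is a rough Python sketch under a few assumptions: the log lines look like the excerpt in this post, month names are parsed in an English locale, and the `year` argument is an arbitrary placeholder because the log lines carry no year. `restart_gaps` is a made-up helper name.

```python
import re
from datetime import datetime

# Matches both "service stopped" and "start of service" lines, e.g.
# "Mar 28 07:33:06 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6)."
EVENT = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s.*\bunbound\b.*"
    r"info: (?P<event>service stopped|start of service)"
)


def restart_gaps(lines, year=2000):
    """Return seconds between each 'service stopped' and the next 'start of service'."""
    gaps = []
    stopped_at = None
    for line in lines:
        m = EVENT.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        if m["event"] == "service stopped":
            # Keep the earliest stop if several are logged back to back.
            if stopped_at is None:
                stopped_at = when
        elif stopped_at is not None:
            gaps.append((when - stopped_at).total_seconds())
            stopped_at = None
    return gaps


# Sample lines copied from the log excerpt in this post.
sample = [
    "Mar 28 07:33:06 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).",
    "Mar 28 07:33:08 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).",
    "Mar 28 07:45:39 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).",
    "Mar 28 07:45:41 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).",
    "Mar 28 07:45:56 pfsense unbound: [5370:0] info: service stopped (unbound 1.6.6).",
    "Mar 28 07:45:57 pfsense unbound: [5370:0] info: start of service (unbound 1.6.6).",
]

gaps = restart_gaps(sample)
print(f"{len(gaps)} restarts, {sum(gaps):.0f} s between stop and start in total")
```

The logged gap is only a rough lower bound on the disruption: after each restart the cache starts out empty, so the first lookups are slower than usual.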



  • @Raffi.:

    ….
    What are my options? Disabling DHCP registration is the first option, but what else?

    On my LAN, all devices have been present for months if not years. I gave them all static DHCP leases, so my unbound doesn't restart often; March 25 (3 days ago) was the last time, actually.
    I have

     Static DHCP - Register DHCP static mappings in the DNS Resolver
    

    checked of course.



  • Thanks Gertjan, most of my clients have static DHCP reservations also. That is working fine and not causing unbound to restart. I have been reading up on multiple previous threads about this issue, and at some point you described it as normal behavior for unbound to restart when the DHCP leases are written. I will leave DHCP registration unchecked in that case. It has been known not only to cause the resolver to restart, but also to make name resolution very slow and unresponsive at times. That's probably because it's busy restarting and obviously can't resolve during that process.



  • I'm glad to read this topic! I suffered from slow DNS resolution with unbound, and it had nothing to do with unbound itself being slow, of course. It seems to me that this should have been documented better, because I spent a couple of hours puzzling before I figured out what was causing my nightmare. Anyway, I'll share my experience in case another reader finds value in it.

    Having "Register DHCP leases in the DNS Resolver" or "Register DHCP static mappings in the DNS Resolver" checked in the DNS Resolver settings conveniently lets hostnames on the network resolve in the blink of an eye, but it does cause unbound to restart. In my particular case, I use unbound to blacklist (return 0.0.0.0 for) a list of nearly 100,000 hostnames. Therefore, I do expect unbound to take longer to restart, since it has to load my blacklist.conf.
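
For readers wondering what such a blacklist file might contain, here is a small Python sketch that turns a hostname list into unbound `local-data` lines answering 0.0.0.0. The exact layout of the blacklist.conf mentioned above is not shown in this thread, so the output format is an assumption, and `to_unbound_blocklist` is a hypothetical helper. The generated lines are meant to be included inside unbound's `server:` clause.

```python
def to_unbound_blocklist(hostnames):
    """Render hostnames as unbound local-data lines that answer A queries with 0.0.0.0."""
    seen = set()
    lines = []
    for raw in hostnames:
        host = raw.strip().lower().rstrip(".")
        # Skip blanks, comment lines and duplicates.
        if not host or host.startswith("#") or host in seen:
            continue
        seen.add(host)
        lines.append(f'local-data: "{host}. A 0.0.0.0"')
    return "\n".join(lines) + "\n"


# Illustrative input only; a real list would be read from a file.
example_hosts = [
    "Ads.Example.com",
    "ads.example.com.",
    "",
    "# a comment line",
    "tracker.example.net",
]

print(to_unbound_blocklist(example_hosts), end="")
```

With roughly 100,000 entries, a file like this has to be parsed again on every restart, which fits the longer restart times described above.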

    Since all DNS and DHCP services in my network are handled by pfSense, this caused quite a bit of a problem for me. Guest network devices, dev VMs, and testing scripts that deliberately rename hosts added additional restarts, as you can imagine.

    Just four days ago I added a secondary DNS service (forwarder) in my LAN because of the many slowdowns I suffered. This has eradicated the issue for me. Basically, having dnsmasq on a secondary box caching and forwarding to pfSense, and to an external resolver when the pfSense resolver is unavailable, has kept everyone quiet, especially the one who MUST be obeyed (aka my wife).

    In the DHCP options, I set the secondary DNS box as the first resolver for clients in all LANs. I also increased DHCP leases to 15 days. In the end, the workaround was to use a secondary Linux server that was already running on the network anyway.

    Does anyone have suggestions or a different way to handle unbound restarts in pfSense?



  • Here is one of the more popular threads on this issue: https://forum.pfsense.org/index.php?topic=89589.msg765049#msg765049. Some have reported success with the various solutions posted there. Others have links to fixes on different threads. Unfortunately, none of the fixes worked for me. I still have DHCP registration unchecked. The static DHCP registration is not an issue for me though. Luckily for me, the clients I actually care about resolving have static reservations anyway.

    Good luck!
    Raffi



  • Same problem for me. Activating NOTIFY for this thread post..



  • I ended up just using dnsmasq with dnscrypt-proxy on a secondary box as my primary DNS server for all internal networks. Unbound is also limited when doing DNS over TLS (it is slow since it does not reuse connections).

    This is my workaround:

    1- pfSense is still my DHCP Server and Secondary DNS. (still registering DHCP leases in the DNS Resolver).
      - DHCP leases 15 days
      - Increased DNS TTL in Unbound and forward to upstream over TLS - (initial query is slow but once cache kicks in it is all good).
    2- LANs DNS 1 - Linux Box: dnsmasq with dnscrypt-proxy 2.0.9 (forward local domain to pfsense so that LANs hostnames can be resolved)



  • @ralphys how did you manage to have pfSense as your DHCP server while using it as a secondary DNS? Do you use the forwarder to forward requests to the primary DNS? Or how did you implement it? I'm in much the same situation and am searching for a solution.



  • @ceofreak said in Unbound Appears to restart frequently and fails to resolve domains sometimes.:

    Let me try to help with that.

    1- I'm not including how to set up dnsmasq and dnscrypt-proxy, but I will add some general configuration as guidance.

    pfSense Unbound Config

    Services => DNS Resolver

    Enable the options below:

    DNSSEC: Enable DNSSEC Support
    DNS Query Forwarding: Enable Forwarding Mode
    DHCP Registration: Register DHCP leases in the DNS Resolver
    Static DHCP: Register DHCP static mappings in the DNS Resolver

    Custom Options:

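    server:
    # send every query (the root zone ".") upstream over TLS on port 853
    forward-zone:
        name: "."
        forward-ssl-upstream: yes
        forward-addr: 1.0.0.1@853
        forward-addr: 9.9.9.9@853
    # allow answers for plex.direct to contain private addresses
    server:
        private-domain: "plex.direct"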
    

    At this point your DNS queries will be forwarded to upstream servers from pfSense as requests come in (if not in the cache).

    Services => DHCP Server:

    DNS servers: 192.168.1.2 <= This is the linux box with dnsmasq/dnscrypt
    DNS servers: 192.168.1.1 <= This is pfSense as secondary DNS Server

    At this point, when a client requests a lease, DHCP provides the lease along with the primary and secondary DNS servers.

    If you have multiple VLANs, add the primary and secondary DNS servers for each of them. E.g.:

    VLAN 90
    Primary DNS: 192.168.1.2 <= assuming this is the IP of your linux box with dnsmasq/dnscrypt.
    Secondary DNS: 192.168.90.1 <= pfSense as secondary DNS for VLAN 90.
    ... and so on.

    Default lease time : 1296000 (seconds, i.e. 15 days)
    Maximum lease time : 2592000 (seconds, i.e. 30 days)

    With that in place, it is a matter of configuring your primary DNS (dnsmasq on the Linux box with dnscrypt-proxy).

    This is my current configuration as a reference:

    /etc/dnsmasq.conf

    listen-address=127.0.0.1,192.168.1.2
    port=53
    bind-interfaces
    
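    # local domain: append it to plain hostnames and send its queries to pfSense
    expand-hosts
    server=/lab.domain.net/192.168.1.1
    domain=lab.domain.net,192.168.1.0/24
    # exempt plex.direct from DNS rebinding checks
    rebind-domain-ok=/plex.direct/
    
    # everything else goes to the upstream servers listed in this file, tried in order
    resolv-file=/etc/resolv.dnsmasq
    strict-order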
    
    # advanced options
    filterwin2k
    cache-size=100000
    dns-forward-max=1000
    neg-ttl=60
    max-ttl=3600
    min-cache-ttl=600
    
    # logging
    log-facility=/var/log/dnsmasq.log
    log-queries
    log-async=10
    

    You will also need to configure dnscrypt-proxy. Basically, all your clients will use dnsmasq (192.168.1.2 in my example configuration above) as the primary DNS. dnsmasq will forward all requests to dnscrypt-proxy on your Linux box, so your requests leave your network encrypted.

    As you can see in dnsmasq.conf, requests for the local domain are forwarded to pfSense instead of dnscrypt-proxy for local resolution:

    server=/lab.domain.net/192.168.1.1
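
To make the routing idea concrete, here is a simplified Python model of how a `server=/domain/ip` rule picks the upstream: names in the listed domain (and its subdomains) go to the listed server, and everything else goes to the default upstream. This is only an illustration of the matching logic, not dnsmasq's actual code; `pick_upstream` is a made-up name, and the dnscrypt-proxy address `127.0.0.1#5353` is a hypothetical placeholder, since the real listen address is not given in this post.

```python
def pick_upstream(qname, domain_servers, default_upstream):
    """Return the upstream for qname; the longest matching domain rule wins."""
    name = qname.lower().rstrip(".")
    best = None
    for domain, server in domain_servers.items():
        d = domain.lower().strip(".")
        # A rule matches the domain itself and anything below it.
        if name == d or name.endswith("." + d):
            if best is None or len(d) > len(best[0]):
                best = (d, server)
    return best[1] if best else default_upstream


# Rule taken from the dnsmasq.conf above.
rules = {"lab.domain.net": "192.168.1.1"}
# Hypothetical dnscrypt-proxy listen address (assumption, not from the post).
DNSCRYPT_PROXY = "127.0.0.1#5353"

print(pick_upstream("nas.lab.domain.net", rules, DNSCRYPT_PROXY))
print(pick_upstream("example.com", rules, DNSCRYPT_PROXY))
```

The first lookup lands on pfSense (192.168.1.1), so DHCP-registered hostnames still resolve, while the second goes to the encrypted upstream path.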
    

    That should give you an idea.

    Cheers!



  • @ralphys thanks for this long answer! I will run some tests as soon as I get around to it! Reporting back here.

