Unbound DNS Resolver crashing randomly



  • Still, no one here knows anything about your settings under DNS Resolver or System > General.



  • Default settings. This is a clean install. I did a plain vanilla install this morning to rule out any user-entered settings killing it.

    There is definitely an issue in the resolver. I noticed it happening frequently while using the eBay Android app. I never saw such issues in 2.1.5.



  • Cool. 64-bit? Bare hardware, no VM?



  • Yup  amd64 on i3



  • This issue has become a nuisance. The kids have started to complain about it happening every 30 mins, sometimes twice in 15 mins. I did a clean install again but it's still the same.
    No errors are logged and the service is up the whole time. The only temporary workaround is a manual service restart.

    Is anyone working on fixing this?


  • Since no one can reproduce it, I pretty much doubt anyone's working on it. Maybe you have some lolcats in there?!  :o



  • Could be…




  • I wonder if a packet dump of DNS traffic on the WAN port is in order.



  • Others are seeing a similar issue as well:

    https://forum.pfsense.org/index.php?topic=88272.0


  • …yeah, but nobody wants to play with me anymore ;-) ...not even doktormotor :-D



  • But mine works.
    And in response to the "default" install: I set the following explicitly on the Services: DNS Resolver (General Settings) page:

    Enabled: True
    Network Interfaces: LAN & Localhost
    Outgoing Network Interfaces: All
    All other choices are set to False.



  • I'm guessing it's not really crashed from the sounds of it (read: it's still running). This sounds like the issue in the "lolcats" thread doktornotor linked.

    Go to Services>DNS Resolver, Advanced, make sure you have "Harden Glue" and "Harden DNSSEC data" both enabled.
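
As a rough way to double-check those two checkboxes outside the GUI, here is a minimal Python sketch that reads unbound.conf-style text and reports how the two hardening options are set. It rests on assumptions not confirmed in this thread: that "Harden Glue" and "Harden DNSSEC data" end up as the Unbound options `harden-glue` and `harden-dnssec-stripped`, and that the generated file is at /var/unbound/unbound.conf. The function name `read_options` is made up for illustration.

```python
from typing import Dict, Iterable, Optional

# Assumed mapping of the two GUI checkboxes to unbound.conf option names.
HARDENING = ("harden-glue", "harden-dnssec-stripped")


def read_options(conf_text: str,
                 names: Iterable[str] = HARDENING) -> Dict[str, Optional[str]]:
    """Return the last value set for each option, or None if it is not set."""
    found: Dict[str, Optional[str]] = {name: None for name in names}
    for raw in conf_text.splitlines():
        # Drop trailing comments, then split "key: value".
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in found:
            found[key] = value.strip()
    return found


# Example use on the firewall (path is an assumption):
#   with open("/var/unbound/unbound.conf") as fh:
#       print(read_options(fh.read()))
```

A value of None only means the option is not written in the file, in which case Unbound falls back to its built-in default; it does not by itself mean the option is off.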



  • @cmb:

    I'm guessing it's not really crashed from the sounds of it (read: it's still running). This sounds like the issue in the "lolcats" thread doktornotor linked.

    Go to Services>DNS Resolver, Advanced, make sure you have "Harden Glue" and "Harden DNSSEC data" both enabled.

    Yes.  This was the issue.  This was the new symptom after enabling DNSSEC (without Harden Glue).
    I posted this before I realized it was a symptom of the same issue when DNSSEC was turned on.



  • I'm running pfSense 2.2.2, but the problem also occurred with 2.2.
    Apparently, with Unbound enabled, the machine becomes very unstable after a few days of use.
    I lost the connection to pfSense's LAN interface several times during the afternoon. After I disabled Unbound, the problems ceased.
    I initially used Unbound to rewrite the youtube.com domain. That worked correctly for about a week, then suddenly stopped one day. Restarting the Unbound service fixed it for a while, but the problem came back after a few minutes.
    The problem became serious the next day, when I could no longer log in to the pfSense web interface. I suspected that someone had broken into pfSense and damaged files somehow.
    I reinstalled the machine and everything was OK for about seven days, but this afternoon it started again.
    I lost communication with my pfSense's LAN interface. Unplugging and replugging the LAN cable fixed it, but the problem soon returned.
    Finally I turned off Unbound and everything has stabilized.
    I suspect Unbound is somehow destabilizing the entire operating system on pfSense.



  • I am having the same issue.  Unbound doesn't "crash"; it just ramps up to 100% CPU and becomes unusable.  Sometimes it goes away by itself; other times I have to restart the service to make it usable again.

    Here is a link to the post I made: https://forum.pfsense.org/index.php?topic=93846.msg520894#msg520894

    Interesting problem indeed….



  • @hda thank you for your post. I have Active Directory locally yet I still wanted to use DNS Resolver because all of my OpenVPN clients and DHCP reservations would all resolve. I had the hardest time... it would work for a day or so and then no longer resolve properly via nslookup or the host commands. I had to bounce the resolver service every time.

    Seeing your post made me think that I should be more explicit and specifically pick the interface LAN & Localhost instead of All from the dropdown. I've now been up for 2-3 days with zero issues. Lesson learned that being explicit is not just for programming but for every part of life :)

    Thanks again!
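
Since several posts describe the resolver running but no longer answering, a timestamped probe from a LAN client can help pin down when that happens. Below is a minimal Python sketch using only the standard library: it builds a plain DNS A-record query by hand and sends it over UDP. The function names (`build_query`, `probe`, `watch`), the server address 192.168.1.1 and the test name example.com are placeholders, not anything taken from this thread.

```python
import socket
import struct
import time
from datetime import datetime


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, name: str, port: int = 53, timeout: float = 3.0) -> str:
    """Send one query and return 'ok', 'timeout', 'rcode=N' or an error text."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    # A valid reply has a 12-byte header and echoes the query id.
    if len(reply) < 12 or reply[:2] != query[:2]:
        return "bad reply"
    rcode = reply[3] & 0x0F
    return "ok" if rcode == 0 else f"rcode={rcode}"


def watch(server: str, name: str, interval: float = 30.0, rounds: int = 10) -> None:
    """Print one timestamped probe result per interval."""
    for _ in range(rounds):
        stamp = datetime.now().isoformat(timespec="seconds")
        print(stamp, probe(server, name))
        time.sleep(interval)


# Example (placeholder address for the pfSense LAN IP):
#   watch("192.168.1.1", "example.com", interval=30, rounds=120)
```

Lines that switch from "ok" to "timeout" give a time window to compare against the resolver log and any packet capture.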



  • @JZng you are an all-powerful necromancer, but in a good way at least.



  • @harvy66 I try brother :) didn't mean to wake the dead but what an important concept to pay forward. Leaving All in DNS Resolver > Network Interfaces made the pfSense resolver fail at random.



  • I'm getting similar issues. Looking in my DNS log, I see the following:

    Jan 7 11:03:22	unbound	70295:0	fatal error: Could not read config file: /unbound.conf. Maybe try unbound -dd, it stays on the commandline to see more errors, or unbound-checkconf
    Jan 7 11:03:22	unbound	70295:0	notice: Restart of unbound 1.8.1.
    Jan 7 11:03:22	unbound	70295:0	info: mesh has 0 recursion states (0 with reply, 0 detached), 0 waiting replies, 0 recursion replies sent, 0 replies dropped, 0 states jostled out
    Jan 7 11:03:22	unbound	70295:0	info: server stats for thread 3: requestlist max 0 avg 0 exceeded 0 jostled 0
    Jan 7 11:03:22	unbound	70295:0	info: server stats for thread 3: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Jan 7 11:03:22	unbound	70295:0	info: mesh has 0 recursion states (0 with reply, 0 detached), 0 waiting replies, 0 recursion replies sent, 0 replies dropped, 0 states jostled out
    Jan 7 11:03:22	unbound	70295:0	info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0
    Jan 7 11:03:22	unbound	70295:0	info: server stats for thread 2: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Jan 7 11:03:22	unbound	70295:0	info: mesh has 0 recursion states (0 with reply, 0 detached), 0 waiting replies, 0 recursion replies sent, 0 replies dropped, 0 states jostled out
    Jan 7 11:03:22	unbound	70295:0	info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
    Jan 7 11:03:22	unbound	70295:0	info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimitin
    

    I've got the following settings in my advanced DNS Resolver options:
    [screenshot of the advanced DNS Resolver settings, image not available]

    When I check the DNS Resolver status page, I get the message that the resolver is stopped or disabled.

    Any ideas how I can resolve this, please? My wife is going mad! lol



  • Do what the log says.

    Edit:
    i.e., go to the console, choose option 8 (Shell), and enter:

    unbound-checkconf