DNS issues after upgrading from 2.4.3 to 2.4.4
-
Disable DHCP registration. I'm pretty sure that's going to solve the issue. That's what's causing the resolver to restart. Of course, Unbound can't resolve any DNS queries while the service is restarting. That's why resolution is sometimes very delayed or fails completely. I had this exact same issue. Disabling DHCP registration fixed it. There are many threads on this issue.
https://forum.netgate.com/topic/120838/unbound-appears-to-restart-frequently-and-fails-to-resolve-domains-sometimes/9
https://forum.netgate.com/topic/80517/unbound-seems-to-be-restarting-frequently
-
The problem reoccurs roughly once a day for a couple of minutes. Then pfsense will work smoothly again without further intervention.
It has just turned up again.
@stephenw10 I tried pinging WAN1's monitoring IP via a different WAN line - RTT is very acceptable there. I tried putting a different monitoring IP, but the problem persists.
@Raffi_ I disabled DHCP registration just now. No improvement so far, but I'll keep an eye on it.
Oh, and there was one piece of misinformation above: on the machine we're currently using, pfblocker is still installed, but disabled. Not sure if this is of interest.
-
It hasn't happened again in the last three days. So was it really just disabling DHCP registration?
Well, thanks a bunch for the time being!
-
I hope that was it. Keep an eye on it and let us know if not.
-
Hmm, curious. What's odd is that most people who have that enabled don't hit it. I've never seen any problems with it in testing. It could be simply a scaling issue; it works fine with a few test clients but gets overwhelmed on a large network. But we have many large customer networks not hitting it either.
When you look into what that is doing it's not hard to see why it causes some disruption. It's almost harder to see how it does not! Yet in the majority of cases it doesn't.
Thanks for the update anyway.
Steve
-
@stephenw10 that's interesting. In my scenario it's a relatively small network of about 35 clients, about half of which have DHCP static mappings. I could not get reliable behavior from Unbound without disabling DHCP registration. Although it would be nice to have that feature enabled, having DNS queries fail is not an option.
-
It mostly depends on how quickly unbound restarts. For most people it's very fast, but the more you have in it (say, DNSBL lists), the slower it gets. Also depends on hardware speed.
Another case for offloading DNS to a proper DNS server in some way. Either offload DHCP registration to a proper BIND setup that can handle dynamic registration from dhcpd, or offload the filtering aspects to something like PiHole.
I suppose you could also do something convoluted like set up the DNS Forwarder on an alternate port, with DHCP registration enabled there, and then set up a domain override in the DNS Resolver to send queries for hosts it doesn't know in the domain there. That seems like a bad idea, though. :-)
-
@jimp that would explain it. In my case it's a nearly 10 year old desktop, and even when it was new it wasn't top of the line. First gen i5 with 8 GB RAM and 120 GB SSD. I do use DNSBL with over 100k IPs/URLs. The idea of Bind in conjunction with Unbound is interesting.
-
How many DHCP clients? What is the lease time?
I imagine a large DNSBL being added would slow down the Unbound restart time. Did you ever test it with that disabled?
Steve
-
@stephenw10 The setup has about 15 DHCP clients with the default 7200 second lease time. I don't remember if I tried enabling DHCP registration before setting up DNSBL. I'm currently running the setup in a production environment, so testing that is unfortunately not something I can do.
With DHCP registration enabled, it seems that each time a DHCP request is made, Unbound is restarting. With my lease time set to 2 hours, it makes sense that I was having a lot of trouble with Unbound restarting. I assume increasing the lease time to a day would dramatically reduce the number of times I see the problem in my case.
-
My mistake, almost all 35 or so clients are DHCP. About 16 out of that 35 are DHCP but not statically mapped. I believe DHCP requests are sent out regardless of static mapping.
-
If you have it set to register dhcp clients in dns - then yes I believe unbound restarts.. So if you have lots of clients via dhcp and short lease times you prob have a lot of restarts of unbound.
I do not recall if they ever worked out where unbound doesn't have to restart to add new dhcp client in the dns listing?
I personally don't see the point of registering dhcp clients.. Static makes sense since if you're taking the time to reserve an IP for a client then you prob have need of resolving it via name..
Why do you need to resolve these dhcp clients by name? If you do why not setup a reservation for them ;)
Also if you're on the same network as the clients and dns does not resolve - windows will broadcast for the name ;)
7200 second lease time
That is a really LOW lease time... So every 3600 seconds or so you're going to see a request for renewal... Why would you have the lease so low... Are these clients that are very transient and you have too many clients for the available IPs? I set all my leases to 4 days...
They should prob change that default - seems pretty freaking low.. 2 hours.. a default of 24 hours would prob be a better choice if you ask me.
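To put rough numbers on the lease-time point, here is a minimal Python sketch. It assumes every client renews at half its lease time (the "every 3600 seconds or so" mentioned above) and uses the 16 non-static DHCP clients mentioned earlier in the thread. The function name is made up for illustration, and whether each renewal really triggers an Unbound restart is only what this thread suspects, not something the sketch can confirm.

```python
SECONDS_PER_DAY = 86400


def renewals_per_day(clients: int, lease_seconds: int) -> float:
    """Estimate DHCP renewals per day, assuming each client renews at half the lease time."""
    renew_interval = lease_seconds / 2
    return clients * SECONDS_PER_DAY / renew_interval


# Compare the 2 hour default with 24 hour and 4 day leases for 16 clients.
for label, lease in (("2 hours", 7200), ("24 hours", 86400), ("4 days", 4 * 86400)):
    print(f"{label}: about {renewals_per_day(16, lease):.0f} renewals per day")
```

If each renewal did cause a restart, going from a 2 hour to a 24 hour lease would cut the number of restarts by a factor of 12 in this estimate.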
-
It's still not a huge number of clients though. I have to assume Unbound is slow to start on your system.
Try restarting it manually and check the logs. How long does it actually take?
Steve
-
@johnpoz I agree on all points. I personally don't have a need to resolve DHCP clients. It would be more of a nice to have thing. That's why I never put any effort into finding an alternative solution to having DHCP registration in Unbound or Bind or elsewhere.
Yes, the default 7200 second lease time is much too short. I probably should change that :0
I never realized it was so short until I looked it up to answer Steve. I agree with you that making the default lease time 24 hours sounds more reasonable.
@stephenw10 Is this it? I have this in my log since Unbound restarts every day due to DNSBL cron updates.
Nov 5 00:00:12 unbound 10476:0 notice: Restart of unbound 1.7.3.
Nov 5 00:00:13 unbound 10476:0 notice: init module 0: validator
Nov 5 00:00:13 unbound 10476:0 notice: init module 1: iterator
Nov 5 00:00:13 unbound 10476:0 info: start of service (unbound 1.7.3).
1 second doesn't seem bad, but I think it could be a combination of short lease times, multiple DHCP requests, Unbound restarting or in the process of restarting and all of that creating the perfect storm for delayed/failed DNS queries.
-
There should be some time before that also. I expect the restart process to be between:
Nov 5 21:41:24 unbound 51810:0 info: service stopped (unbound 1.7.3).
and
Nov 5 21:41:25 unbound 51810:0 info: start of service (unbound 1.7.3).
For example.
Though that's still within 1s and I have DNSBL enabled. But far fewer entries than you.
Steve
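For anyone who wants to measure this from a saved log rather than by eye, here is a rough Python sketch. It assumes log lines shaped like the ones quoted in this thread and pairs each "service stopped" entry with the next "start of service" entry. The function name and the year parameter are my own additions for illustration; syslog lines carry no year, so the value only matters for parsing.

```python
import re
from datetime import datetime

# Matches lines such as:
# "Nov 5 21:41:24 unbound 51810:0 info: service stopped (unbound 1.7.3)."
LOG_LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) unbound \d+:\d+ \w+: (.*)$")


def restart_durations(lines, year=2000):
    """Return seconds between each 'service stopped' and the following 'start of service'.

    The year is an arbitrary placeholder because the log lines do not include one.
    """
    stopped_at = None
    durations = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not an unbound log line
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        message = match.group(2)
        if message.startswith("service stopped"):
            stopped_at = stamp
        elif message.startswith("start of service") and stopped_at is not None:
            durations.append((stamp - stopped_at).total_seconds())
            stopped_at = None
    return durations


sample = [
    "Nov 5 21:41:24 unbound 51810:0 info: service stopped (unbound 1.7.3).",
    "Nov 5 21:41:25 unbound 51810:0 info: start of service (unbound 1.7.3).",
]
print(restart_durations(sample))
```

The log only has one-second resolution, so anything faster than that shows up as 0 or 1 second.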
-
I have had issues with DNS upgrading to 2.4.4 as well. It was so intense that I would lose internet connectivity every 3-5 mins. Pinging an external IP would work, but pinging anything with a name (www.google.com) wouldn't.
I also have a VPN client running and initially I thought it might be the VPN causing issues. I was going back and forth with my VPN provider to see what I could do. After a lot of reading I went against their tutorials and stopped forwarding DNS queries and started using DNS resolver. Also did a few other things at the time.
Overall, I ended up installing pfSense 4 times and setting things up over and over again. Finally, for me it turned out to be the excess blocking that I had enabled in pfBlocker. I had subscribed to too many lists, I guess. I ran pfSense without pfBlocker for a week and had no issues. Finally I enabled pfBlocker again, but only subscribed to 1 EasyList. This is a lot less than what I was subscribed to with 2.4.3 and still had no issues (in 2.4.3).
I do get a few more ads on my pages than I would like, but at least my wife isn't on my case every 5 mins. :)
I will keep watching this thread to see if there are other pointers that I can tweak in order to block as many ads, junk sites etc. without losing my mind over dropped DNS requests/unbound restarts.
EDIT: I just checked and it seems that I do have DHCP Registration, Static DHCP Registration & OpenVPN Clients Registration all checked.
I can't remember, however, whether those were all checked when I was having issues or whether I enabled them after I got a stable network.
-
Any idea how many DNSBL entries?
I have 20480 here currently and have never seen any issues.
Steve
-
@stephenw10 Yup, the service stopped entry in my log had the same time stamp as the restart entry so I left it out since it was negligible.
-
@inxsible Disable DHCP registration if you're having issues with unbound restarts. It's a feature you probably don't need anyway, so any minor benefit you get from it is not worth the cost of having unbound restarts triggered.
Also, below is a great video on getting things going with pfblocker. You don't have to use all lists and recommendations, but this is where I started and I don't have many false positives.
https://www.youtube.com/watch?v=QwFpMwXEK5w&list=LLKjPM3pDxt_EiYOfJgxsvQQ&t=305s&index=5
-
@raffi_ Thanks. I will check the video out and see what I can tweak. This is the tutorial that I followed for the pfBlocker setup:
https://www.linuxincluded.com/block-ads-malvertising-on-pfsense-using-pfblockerng-dnsbl-old/
The first couple of times that I set up pfBlocker, I used all the lists that he mentioned in that blog -- except the TLD blocking. Currently, I only use 1 EasyList.