Slow DNS after 22.05
-
Hello. I virtualize pfSense on my Proxmox node and I experienced slow DNS resolving via unbound (default behavior). This was not the case with version 22.01.
Anyone else experiencing this?
Thank you!
-
I need a little more info in order to make a better diagnosis or suggest a troubleshooting path. You say "default behavior", so I am assuming you have DNS Resolver enabled and running as a true resolver and your LAN clients are using pfSense for DNS.
Are DNS lookups slow because unbound (the DNS Resolver) is stopped or frequently restarting, or is unbound running but lookups are taking a very long time to actually execute?
Checking the pfSense system log can help you narrow down which of the two possibilities above might be the cause. If something is causing unbound to frequently restart on your upgraded system, then DNS will appear to be very slow as no client lookups can succeed when the DNS daemon is restarting.
-
I was experiencing the same issue using unbound, the built-in DNS resolver. A manual restart of the service after the upgrade seems to have resolved the slowness for me.
I believe I use all default settings apart from the following advanced settings which were adjusted a long time ago to improve DNS response times:
Prefetch Support - Enabled
Prefetch DNS Key Support - Enabled
Harden DNSSEC Data
Serve Expired - Enabled
-
Netgate SG-1100, after updating to 22.05, log file is full of lines like
Jun 30 17:54:10 unbound 35746 [35746:0] error: recvfrom 23 failed: Protocol not available
Then it will stop the service and restart it.
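To see how often the error fires, the log can be tallied per minute. This is only a rough sketch: it assumes every line is shaped like the one quoted above, and the second and third sample lines are made up for illustration (the `info` line in particular is hypothetical, not real unbound output).

```python
import re
from collections import Counter

# Matches lines shaped like the one quoted in this post, e.g.
# "Jun 30 17:54:10 unbound 35746 [35746:0] error: recvfrom 23 failed: ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}):\d{2} "
    r"unbound \d+ \[\d+:\d+\] (?P<level>\w+): (?P<msg>.*)$"
)

def errors_per_minute(lines):
    """Count unbound 'error' lines per minute, keyed by 'Mon DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "error":
            counts[match.group("stamp")] += 1
    return counts

# First line is copied from the log above; the other two are invented samples.
sample = [
    "Jun 30 17:54:10 unbound 35746 [35746:0] error: recvfrom 23 failed: Protocol not available",
    "Jun 30 17:54:11 unbound 35746 [35746:0] error: recvfrom 23 failed: Protocol not available",
    "Jun 30 17:55:02 unbound 35746 [35746:0] info: hypothetical non-error line",
]

for minute, count in sorted(errors_per_minute(sample).items()):
    print(minute, count)
```

A steady count every minute would point at something persistent, while bursts right after clicking Apply Changes would fit the pattern described later in the thread.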
-
@domnado said in Slow DNS after 22.05:
Netgate SG-1100, after updating to 22.05, log file is full of lines like
Jun 30 17:54:10 unbound 35746 [35746:0] error: recvfrom 23 failed: Protocol not available
Then it will stop the service and restart it.
Do you have DNSSEC enabled by chance? If so, try disabling it for a test and restarting unbound. I'm basing this hunch on the results of a quick Google search for error messages similar to yours with unbound.
If that clears up the problem, then I suspect your SG-1100 needs to be powered all the way down gracefully, and then restarted. So that means doing a shutdown from the DIAGNOSTICS menu, waiting for the shutdown to complete and box to halt, then remove the power for several seconds. Restore power and let it boot up again. The SG-1100, if I recall correctly, can sometimes have its crypto chip get into a state where it fails and the ONLY way to fix it is a power-off reboot. A typical restart will not reset the hardware.
Scratch my first idea, misread the Google result against your error message.
Other Google hits suggest perhaps something going on with the NIC. For example, one user reported similar error messages when his NIC was dropping checksum error packets due to a driver problem.
-
Hope I'm not speaking too soon, I tried the halt and pull power cord for 30 seconds, system has been up for 10 minutes now with zero error messages.
-
@domnado said in Slow DNS after 22.05:
Hope I'm not speaking too soon, I tried the halt and pull power cord for 30 seconds, system has been up for 10 minutes now with zero error messages.
Resetting the crypto chip hardware certainly won't hurt anything, and it may be the solution. I did some quick scans through the unbound source code but was unable to locate that specific error message template text. I was hoping if I found the error message in the source code that it would help identify a possible cause.
-
Unbound started acting up again, same error messages. I did make a change to the DNS Resolver settings, but only to the Network Interfaces section. I had to halt the system and unplug power for it to operate normally again.
-
@domnado said in Slow DNS after 22.05:
Unbound started acting up again, same error messages. I did make a change to the DNS Resolver settings, but only to the Network Interfaces section. I had to halt the system and unplug power for it to operate normally again.
I think you tickled a clue there -- "make a change to the DNS Resolver settings, but only to the Network Interfaces section."
What specifically did you change there? What setting was working versus what setting you changed it to that resulted in the error message?
-
At first "Network Interfaces" was set to ALL. First I changed it to everything but ALL (LAN, WAN IPv6 Link-Local, LAN IPv6 Link-Local, and Localhost), then I changed it to just LAN and Localhost. I also turned off both Prefetch options in Advanced Settings when I selected LAN and Localhost interfaces. Both changes were fine after a halt and power cord pull. The errors only started after clicking the Apply Changes button.
-
I'm having the same problem with slow DNS after 22.05.
I've had my setup (Netgate 2100) for over a year, everything has been fine.
Suddenly DNS queries are timing out.
No, I didn't change anything, other than to install the upgrade when prompted to do so.
Any suggestions?
-
@jax I moved to a virtualized OPNsense instance since the start of the thread. For now, having a better experience. No problems resolving DNS.
-
@mihaifpopa said in Slow DNS after 22.05:
virtualized OPNsense instance
That's good. I'm on a Netgate device and I'd like it to go back to working correctly!
-
@jax What are you seeing when you go to the Diagnostics->DNS Lookup page?
-
First try: about a 9 second wait followed by the correct answer.
Second try: about a 22 second wait followed by the correct answer.
The pfSense display shows that 127.0.0.1 is timing out.
I have no idea why the Netgate device is querying itself.
As soon as it queries the next device upstream it gets an answer.
Name server      Query time
127.0.0.1        938 msec
192.168.xx.xx    48 msec
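A similar per-server comparison can be run from a LAN client. The sketch below is an assumption-laden illustration, not what the Diagnostics page does internally: it hand-builds a minimal A-record query per RFC 1035 and times one UDP round trip to each server. The function names are made up for this example, and it does not check the response ID or contents.

```python
import socket
import struct
import time

def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)

def time_query(server, name, timeout=5.0):
    """Return elapsed milliseconds for one UDP query, or None on failure."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            sock.recvfrom(512)  # any reply counts; contents are not validated
        except OSError:  # includes timeouts
            return None
        return (time.monotonic() - start) * 1000.0

def report(servers, name):
    """Print one timing line per server."""
    for server in servers:
        elapsed = time_query(server, name)
        label = "timeout" if elapsed is None else f"{elapsed:.0f} msec"
        print(f"{server:<16} {label}")
```

Calling `report([...], "example.com")` with the pfSense LAN address and an upstream server in the list should show whether only the local resolver is slow. Repeating the same name will mostly measure the cache, so vary the name between runs.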
-
Ha! In General Setup -> DNS Resolution Behavior I chose "Use remote DNS servers, ignore local DNS" and things look better now. We'll see if that fixes it.
-
@jax Sounds like DNS Resolver is stopped.
Go to the Services->DNS Resolver page and click the "start" icon in the header, or go to Status->Services and click it there.
FWIW reliance on the ISP DNS servers may result in being handed misleading DNS records. Remember when ISPs would resolve unresolvable names and pass you to a search page? This helps you avoid that, among other things.
-
@rcoleman-netgate Okay, I restarted the DNS Resolver and have set the DNS Resolution Behavior back to use local DNS with fallback to remote. We'll see how this goes.
-
Do you run Suricata by any chance??
-
@cool_corona No, I don't.
-
@rcoleman-netgate Goes back to lousy performance. I've set it back to using remote DNS.
-
@jax Do you have any DNS specified in general settings??
-
@cool_corona No, no specified DNS servers. It's just using the default, the upstream WAN DHCP-assigned server.
-
@jax Can you please uncheck it (no DNS server overrides) and test again?
-
@cool_corona What package(s) are installed?
-
@cool_corona Trying it unchecked with local + fallback.
-
@rcoleman-netgate no packages installed, just the default Netgate installation
-
@cool_corona Testing with no dns server overrides as you suggested seems to give me the same good performance that was only achieved previously by bypassing the pfSense resolver.
Can you explain this a little bit, please?
-
@jax It overrides the WAN DHCP DNS provided by your ISP, and that can take some delay out of the equation.
You don't have to handshake and verify the DNS by the ISP, and it goes directly to the 13 root DNS servers.
-
Hmm, there still seems to be weird intermittent slowness in name resolution.
I dunno. This may be beyond my personal ability to debug.
-
The slowness seems to be mostly focused on CDN services.
-
This is really quite frustrating, I'm not getting anywhere debugging this slowness problem.
-
@jax said in Slow DNS after 22.05:
This is really quite frustrating, I'm not getting anywhere debugging this slowness problem.
The first step in troubleshooting is to isolate the problem. Since you've tried a number of things on pfSense itself, why not take pfSense's DNS completely out of the picture?
-
Do this -- in the SYSTEM > GENERAL SETUP page, down in the DNS Settings area, put 8.8.8.8 (the Google DNS server IP) in the DNS Servers box. Save that change.
-
Next, go to SERVICES > DHCP SERVER and, in the Servers section, also put 8.8.8.8 in the DNS Servers box. This will tell the DHCP server to give your LAN clients the Google DNS server for name resolution.
Now pfSense is out of the picture unless you have created any DNS related firewall rules previously. See how things behave with this test setup. If things are good, then you can assume you are having issues with unbound on your box when using the default settings. Those default settings configure the DNS Resolver to "resolver mode" and hand out the address of the pfSense box as the DNS server for your DHCP clients.
If things are still poor, then pfSense is likely not at fault here (assuming you don't have a firewall rule in the way), and you need to look elsewhere for the problem.
If you have any DNS related firewall rules, make sure you are allowing both UDP and TCP for port 53 as some DNS lookups will need to use TCP.
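One way to confirm that both transports get through is to send the same query over UDP and over TCP. The following is a minimal sketch under stated assumptions: the helper names are invented for this example, the query is a hand-built A-record request, and a reply of any kind is treated as success. DNS over TCP prefixes each message with a 2-byte length, which is the only framing difference shown here.

```python
import socket
import struct

def dns_query(name):
    """Minimal A-record query message (RFC 1035), recursion desired."""
    header = struct.pack("!HHHHHH", 0x4242, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii") for part in name.split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)

def frame_for_tcp(message):
    """DNS over TCP prefixes each message with a 2-byte big-endian length."""
    return struct.pack("!H", len(message)) + message

def answers_over_udp(server, name, timeout=3.0):
    """True if any UDP reply arrives from port 53 within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(dns_query(name), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:  # includes timeouts
            return False

def answers_over_tcp(server, name, timeout=3.0):
    """True if the server accepts TCP/53 and starts sending a reply."""
    try:
        with socket.create_connection((server, 53), timeout=timeout) as sock:
            sock.sendall(frame_for_tcp(dns_query(name)))
            # Only the start of the reply (its length prefix) is read.
            return len(sock.recv(2)) > 0
    except OSError:
        return False
```

If `answers_over_udp("8.8.8.8", "example.com")` succeeds but `answers_over_tcp("8.8.8.8", "example.com")` does not, a rule that only passes UDP on port 53 would be one thing to check.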
-
I too am having problems after the 22.05 upgrade, with DNS lookups timing out completely.
The unbound logs do not show any problems.
The config hasn't changed from 22.01, where DNS worked perfectly.
Running bare metal.
Plugins: openvpn client export, nut service for ups, watchdog. That's it.
-
@bmeeks I took your suggestion and this morning things seem to be working better.
We'll see how things go later in the day, thanks for your help.
-
@bmeeks of course this very much suggests pfSense DNS is indeed the problem.
-
@jax said in Slow DNS after 22.05:
@bmeeks of course this very much suggests pfSense DNS is indeed the problem.
But it's not a widespread problem or the forum here would be overflowing with posts about it. There are only a few. Not saying there can't be a problem, but it's not affecting everyone it seems.
It's entirely possible your virtualization environment could be at fault here as well. There could be an issue with the latest pfSense (FreeBSD) version and Proxmox.
-
@bmeeks Thanks again for your help.
I'm on a Netgate device that I purchased with pfSense already installed so no virtualization issues that would be unique to my setup. I have made no software modifications. There are many variables here:
- pfSense DNS
- pfSense DHCP interacting with desktop operating system
- pfSense DNS interacting with service provider premises devices
... and so forth.
In any case, the setup has been working for about 18 months for me. I personally made no changes, and the problem seemed to emerge with pfSense 22.05. However, Correlation ≠ Causation, as we all have been taught. So I suppose I will have to continue to gather clues and see what I can figure out over time.
-
@vaidas I've come to report the same thing.
I thought it was pfBlockerNG-devel, but even with that off I'm seeing CDN content fail (e.g. YouTube). If I bypass unbound by using a client VPN, no problems.
I'm seeing a lot of timeouts and "domain not found" errors, and I have to manually reload pages to get them to load.
No changes were made to the unbound settings between 22.01 and 22.05.
-
@lohphat Yes, I'm seeing the same thing, that the problem seems to be mostly with cdn site resolution.