Slow DNS after 22.05
-
@lohphat said in Slow DNS after 22.05:
Confusing
True.
But changing a setting on a first page that automatically changes (disables) a setting on a second page is more confusing.
All this is IMHO, of course.
Right now, you are hinted to first disable the extra 'special' DNSSEC settings on the second page and then, at last, disable DNSSEC altogether on the main page.
And strange... DNSSEC has worked well for me for several years already.
I'm using DNSSEC, as pfSense is set up out of the box to use it.
Because a good 'flat' classic Internet connection should not interfere with my outgoing traffic.
If this wasn't the case, I would change ISP ASAP.
These :
I've checked years ago.
Never had the need to remove them (which means I rarely visit sites with DNSSEC issues, I guess)
-
@gertjan After all the testing, the setting that seems to have solved my problem is enabling "Serve Expired".
That was not enabled when I was running pre-22.05, so I wonder what changed between 22.01 and 22.05 to alter the behavior in my environment.
So far, after re-enabling DNSSEC and the Experimental 0x20 support, things are working again -- just "Serve Expired" seems to have been the issue.
-
@lohphat I would turn on prefetch as well.
You also might want to set a min TTL. You mention it seems to be a problem with CDN stuff - possibly the TTL is so freaking small that if you're having timing problems with resolving, you could run into timeouts.
I looked at one example you gave - 300 seconds:
;; QUESTION SECTION:
;i.ytimg.com. IN A
;; ANSWER SECTION:
i.ytimg.com. 300 IN A 142.250.191.150
i.ytimg.com. 300 IN A 142.250.191.182
i.ytimg.com. 300 IN A 142.250.191.214
i.ytimg.com. 300 IN A 142.250.191.246
i.ytimg.com. 300 IN A 142.251.32.22
i.ytimg.com. 300 IN A 142.250.190.22
i.ytimg.com. 300 IN A 142.250.190.54
i.ytimg.com. 300 IN A 142.250.190.86
i.ytimg.com. 300 IN A 142.250.190.118
i.ytimg.com. 300 IN A 142.250.190.150
i.ytimg.com. 300 IN A 172.217.0.182
i.ytimg.com. 300 IN A 172.217.1.118
i.ytimg.com. 300 IN A 172.217.2.54
i.ytimg.com. 300 IN A 172.217.4.54
i.ytimg.com. 300 IN A 172.217.5.22
i.ytimg.com. 300 IN A 172.217.4.86
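To check how short the TTLs on a given name really are, the answer section of a dig run can be parsed mechanically. The following is only a minimal sketch: the helper names `parse_answers` and `lowest_ttl` are made up for illustration, and the three sample lines are copied from the output quoted above.

```python
import re

# Answer-section lines as printed by dig: name, TTL, class, type, data.
# These sample lines are copied from the output quoted in the post above.
SAMPLE = """\
i.ytimg.com. 300 IN A 142.250.191.150
i.ytimg.com. 300 IN A 142.250.191.182
i.ytimg.com. 300 IN A 172.217.4.86
"""

RECORD = re.compile(r"^(\S+)\s+(\d+)\s+IN\s+(\S+)\s+(.+)$")


def parse_answers(text):
    """Return (name, ttl, rtype, data) tuples; skip comments and blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        match = RECORD.match(line)
        if match:
            name, ttl, rtype, data = match.groups()
            records.append((name, int(ttl), rtype, data))
    return records


def lowest_ttl(records):
    """Smallest TTL in the answer set, or None if there are no records."""
    return min((ttl for _, ttl, _, _ in records), default=None)


records = parse_answers(SAMPLE)
print(len(records), "records, lowest TTL:", lowest_ttl(records), "seconds")
```

Keep in mind that a TTL read from a caching resolver counts down between queries, so the number seen on a second run can be lower than the value the authoritative servers publish.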
-
Hi, +1 here on the issue with erratic behaviour on DNS lookups since 22.05 update.
I'm going to read the full chain here, but lots of similarities in the bits I've skim-read.
Lots of these entries in the logs for unbound
Jul 13 21:43:18 unbound 982 [982:1] error: recvfrom 22 failed: Protocol not available
I'm using Cloudflare DNS servers, not allowing my WAN connection's DHCP settings to flow through, and have things set to "use remote, ignore local". I don't have these DNS servers set in my DHCP settings.
DNS Forwarder is inactive, DNS Resolver is active.
-
Over the last few days the only change I've made in addition to "Serve Expired" is to add a minimum TTL of 900 sec (the setting's help text doesn't specify units, but I have a long-standing complaint about the lack of even minimal detail in setting help text). I also turned off "Use Experimental 0x20" for DNS spoofing; this too proved unstable over several days (and is a change from 22.01 to 22.05, as it was working fine before).
So yes, something has significantly changed in unbound in the last release.
-
@lohphat said in Slow DNS after 22.05:
something has significantly changed
Yeah, it went from version 1.12 or 1.13.something to 1.15.
I have had zero issues with resolving anything. And unbound currently has been running for
[22.05-RELEASE][admin@sg4860.local.lan]/root: unbound-control -c /var/unbound/unbound.conf status
version: 1.15.0
verbosity: 1
threads: 4
modules: 2 [ validator iterator ]
uptime: 899181 seconds
options: control(ssl)
unbound (pid 87400) is running...
[22.05-RELEASE][admin@sg4860.local.lan]/root:
900k seconds = like 10 days..
While I'm not saying you're not having issues, clearly it's something with your connection or unique to your setup, because if something was wrong with unbound itself, then everyone running 22.05 would be complaining.
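For anyone who wants the exact figure behind "like 10 days", the uptime value from the status output above converts as follows. This is just a small arithmetic sketch; `split_uptime` is a name chosen here for illustration.

```python
def split_uptime(seconds):
    """Break an uptime in seconds into (days, hours, minutes, seconds)."""
    days, rest = divmod(seconds, 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs


# 899181 is the uptime reported by unbound-control in the post above.
print(split_uptime(899181))  # (10, 9, 46, 21)
```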
-
@johnpoz I think you're probably right. The issue is most likely down to a combination of 22.05 running on my specific hardware (an NG 3100, which uses ARM and which someone further up said has quirks on occasion) and my specific setup (which is only a few tweaks away from vanilla).
What I'm hoping is, someone smarter than me will be able to point me in the right direction.
I'm going to try telling my devices to use an external DHCP server, effectively bypassing pfSense and see if that improves things.
-
Hey!
As I said before, I am also having the same problem since 22.05 on my NG-3100, without changing anything else in the configuration. I've also tested different settings with the DNS Resolver, but with no success. In the end I'm now using a DNS resolver installed on my NAS, which is set up as the DNS server in the DHCP settings.
With this everything is fine and works like before. But I'd like to change the settings back to pfSense as the DNS resolver, and I hope the error will be found.
Greetings,
Markus
-
So far so good with DNS servers issued via DHCP to client devices.
Simple things like playing audio via Amazon Echo works, no intermittent problems with websites that I know are up.
Fingers crossed this is a sufficient work around.
-
Hi! Many helpful posts here!
Just wanted to mention that I'm also seeing the intermittently slow resolution described above:
Loading of websites often requires refreshes to have either the site name resolved or the CDN for images or stylesheets. I'd like to emphasize the intermittent nature of the problem -- I have duckduckgo.com set as my default search engine (i.e. a very frequently visited site) and have gotten name resolution errors in the browser time and time again over the last weeks, with no clear pattern for when it's happening.
I have a Netgate 2100 and upgraded from version 22.01 to 22.05 a few weeks ago. The problem started with the upgrade. I had not made changes to the DNS Resolver settings before, so the default of using the DNS servers given via DHCP on WAN was reflected on the front page with three servers listed, 127.0.0.1 being the first. Client devices were given the pfSense IP as their DNS server.
To remedy the situation I tried adding CloudFlare's 1.1.1.1 and 1.0.0.1 as DNS servers in System > General Setup and subsequently unchecked "Allow DNS server list to be overridden by DHCP/PPP on WAN or remote OpenVPN server" but the problem persisted.
Based on replies in this thread, I checked "Serve Expired" on Services > DNS Resolver > Advanced Settings. The problem still occurs from time to time, although seemingly less frequently. Resolution appears slow.
Further, I tried disabling DNSSEC (unchecked "Enable DNSSEC Support" in Services > DNS Resolver > General Settings) and disabled hardening of DNSSEC data (unchecked "Harden DNSSEC Data" in Services > DNS Resolver > Advanced Settings). Failures still occur.
To circumvent these problems I temporarily disabled the DNS Resolver.
I'll be watching this thread, hoping a solution pops up.
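One way to put numbers on intermittent failures like these is to time repeated lookups from a client and count the slow and failed ones. The sketch below is only an illustration under stated assumptions: the function names, the 1-second "slow" threshold and the injectable `resolver` argument are choices made here, not anything from pfSense or unbound. It uses the client's system resolver through `socket.getaddrinfo`, so local caching on the client can hide problems.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname once; return (elapsed_seconds, error_or_None)."""
    start = time.perf_counter()
    try:
        resolver(hostname, None)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = str(exc)
    return time.perf_counter() - start, error


def probe(hostnames, rounds=3, pause=0.0, resolver=socket.getaddrinfo):
    """Look up every hostname `rounds` times, pausing between rounds."""
    results = []
    for _ in range(rounds):
        for host in hostnames:
            elapsed, error = time_lookup(host, resolver)
            results.append((host, elapsed, error))
        time.sleep(pause)
    return results


def summarize(results, slow_threshold=1.0):
    """Count failed lookups and successful lookups slower than the threshold."""
    failures = sum(1 for _, _, err in results if err is not None)
    slow = sum(1 for _, t, err in results if err is None and t >= slow_threshold)
    return {"total": len(results), "failures": failures, "slow": slow}
```

Calling `summarize(probe(["duckduckgo.com", "i.ytimg.com"], rounds=5, pause=1.0))` on a client that uses pfSense as its DNS server would give a rough failure and slow-lookup count to compare before and after a settings change.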
-
Following on from my original reply where it looked like restarting the service resolved... it didn't.
Just wanted to say I have had the same experience - tried many of the suggestions here. I have tried with the resolver/forwarder, with DNSSEC enabled/disabled. Tried pre-fetch keys, harden DNSSEC data.
I have given up with the slow or unresponsive DNS resolution since 22.05 and put my clients on Google DNS over TLS which is working perfectly.
Hopefully somebody can find a solution as I rather liked using the resolver on my SG2100.
-
In summary, my fixes have been stable.
- Enable Serve Expired -- this helped with CDN lookups. This was not set in 22.01.
- Set minimum TTL to 300 seconds. This was not set in 22.01.
- Disable Experimental 0x20 support -- this was working in 22.01 but caused instability in 22.05.
So far things have been stable for over a week. I tried with and without pfBlocker-devel and various attempts to use forwarding or not (it was necessary while I was searching for a fix but I'm back to resolving locally again).
So yes, it seems "something has changed" but there's no smoking gun.
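For reference, the three fixes listed above correspond to plain unbound.conf options. The sketch below renders them as a `server:` clause. The mapping from the pfSense checkboxes to option names is my understanding of the unbound.conf documentation, so treat it as an assumption and check it against your version; on pfSense these are normally set in the GUI, and `render_server_clause` is just a helper written for this illustration.

```python
# Assumed mapping from the pfSense GUI settings to unbound.conf options.
SETTINGS = {
    "serve-expired": "yes",    # "Serve Expired" enabled
    "cache-min-ttl": "300",    # "Minimum TTL for RRsets and Messages", in seconds
    "use-caps-for-id": "no",   # "Experimental Bit 0x20 Support" disabled
}


def render_server_clause(settings):
    """Render the options as an indented unbound.conf server: clause."""
    lines = ["server:"]
    for option, value in settings.items():
        lines.append(f"    {option}: {value}")
    return "\n".join(lines)


print(render_server_clause(SETTINGS))
```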
-
There are a number of bug fixes on Unbound since 1.15.0 which pfSense 22.05 uses, but I don't have enough knowledge of DNS to determine if those fixes are likely to fix these problems.
https://github.com/NLnetLabs/unbound/tags
I find this one solved in 1.16.0 interesting though: https://github.com/NLnetLabs/unbound/issues/670
-
Having this issue with an SG-6100 after going from 22.01 to 22.05 also. So far the Enable Serve Expired seems to be resolving the issue, but time will tell
-
Also seeing these intermittent DNS issues on my 5100 since updating to 22.05.
Haven't had a chance to troubleshoot yet but same issues outlined above.
Will try enabling Serve Expired tomorrow and see if that resolves it.
-
@lohphat said in Slow DNS after 22.05:
Set minimum TTL to 300 seconds. This was not set in 22.01
I enabled Serve Expired but this didn't seem to help in my case.
Experimental 0x20 support was already disabled.
Is the min TTL setting: Minimum TTL for RRsets and Messages?
So far I'm just thinking of rolling back to 22.01.
It seems like whatever was updated in unbound is causing issues for a small subset of us.
-
I may have to roll back as well. Enabling Serve Expired (seemingly) does help a little, but I am still getting DNS timeouts frequently. I have now also set cache-min-ttl (also known as Minimum TTL for RRsets and Messages) to 300 sec. My Experimental 0x20 support has never been enabled.
https://nlnetlabs.nl/documentation/unbound/unbound.conf/
Not sure if this is related (probably should be talking on unbound's GitHub at this point) but I'm seeing a bunch of "outnettcp got tcp error -1" in debug logs when turned up to logging level 4.
-
@kvhs said in Slow DNS after 22.05:
I find this one solved in 1.16.0 interesting though: https://github.com/NLnetLabs/unbound/issues/670
This seems a reasonable trail to start following -- this may be an out of memory/heap issue.
Just curious, for those of us seeing issues are you also running IPv6? I am.
In the bug notes it seems that disabling IPv6 addressed the issue as less memory overhead is needed. I wonder if the unbound changes may necessitate bumping up memory allocation to prevent spurious lookup failures.
-
Just enabled logging level 4 and also see a few 'outnettcp got tcp error -1' errors but no idea if it's related.
Also running IPv6.
Not sure I can actually roll back unless I can use a config backup from 22.05 on 22.01.
Wondering if it would be better if I just wipe and reinstall 22.05, then restore the config, just in case something got messed up with the upgrade.
I believe I saw that @johnpoz runs an SG-5100 too, upgraded from 22.01 to 22.05, and doesn't have the same problems.
-
Add me to the list #nothingtodeclare
Running 22.05 on an Intel-based box, an SG 4100.
I'm using IPv6, although tunnel based, using ipv6.he.net
unbound settings are native, that is, I'm not forwarding; unbound makes use of the "13 main Internet Root servers".
On Services > DNS Resolver > Advanced Settings I have set :
Query Name Minimization
Prefetch Support
Prefetch DNS Key Support
Harden DNSSEC Data
Serve Expired
Keep Probing
Experimental Bit 0x20 Support
Other values are, I guess, default.
On the Services > DNS Resolver > General Settings page :
Network Interfaces : All
Outgoing Network Interfaces : All
DNSSEC : Enabled ( Remember : DNSSEC makes sense only when you are NOT forwarding )
Python Module : Enabled ( As I'm also using pfBlockerng-devel )
Note : DHCP Registration NOT set, which means unbound doesn't get restarted on every DHCP lease event. All known important LAN devices have static MAC DHCP leases.
Static DHCP : enabled (as this one won't restart unbound)
Custom options : None.
Memory usage ? How often unbound restarts ? Requests handled ?
I have all the hard numbers and graphs, so I can see if something is happening, and I can check whether a setting makes any difference.
Look here.
Remember : this is DNS. I can't have or tolerate a 'doesn't work'.
Also : Netgate pfSense comes with a default DNS setup. This one works out of the box (there might be one exception, read below) : why not use that setting and be done with it ?
And no, Netgate does not ask you to forward any DNS requests to some company's remote resolver. pfSense has its own resolver.
Yes, I've tried forwarding, and it did seem to work fine, but I never kept this mode for longer than a couple of days. I guess I don't need a remote resolver as unbound does a good job doing that for me.
Btw : I've been using 22.05 on an SG4100 for a couple of weeks now. Before that, I was using a bare-bones Intel box with a quad Intel NIC setup. Never had any issues except for the major unbound bugs that touched everybody back then, and those were always corrected immediately.
I never had to go back to a previous version, and that for the last 10+ years, since pfSense version 1.x
Network usage : 3 LANs : one major company LAN, one untrusted client "captive portal" LAN with a bunch of access points for the hotel clients, one DMZ type LAN.
No VLAN stuff
I'm using pfBlockerng-devel; it syncs feeds once a week, with a minimal feeds list. I'm just blocking the major ads and bads hit list.
My ISP gives me a good (I guess) uplink with a static IPv4. It's still VDSL copper wire (about 24 Mbit/s down). This will be fibre in the very near future.
I tend to use pfSense functionality that I "know", that I can debug, that I trust, that I understand.
And one last thing :
And please, I do not want to offend anyone here :
I rent a 'big' bare-bones server for my web sites, mail and other stuff like Munin. I handle all my own DNS needs myself, using bind (named), for about 20 domain names. My registrar's name server entries point to my own DNS name servers, a master and two or three DNS backup servers (small VPSs).
For 99.9 % of the time, I check my DNS regularly. For example, I use this site, to mention just one.
Because I'm doing my own DNSSEC, I use this. Many other test sites exist.
The majority of the DNS tests are done remotely and locally using "dig".
And here it is : no one should be handling their own DNS, as this forces you to fully understand what DNS is, how it works, how to see issues and how to deal with them.
I didn't see another way to fully understand this 'DNS' thing.
But, suddenly, once you know all of DNS, DNS will never be an issue any more.
Consider this : take a small 1 $/€ a month VPS and a domain name (5 $/€ a year ?) and play with your own domain name. You'll be massacring loads of misunderstandings pretty fast.
This was called 'learning' back then ;)
Do not underestimate the number of times your local pfSense has no issue at all, but you're simply visiting a site that has issues with its DNS. Just wait it out. Don't start modifying your own setup, as it was good already.
Even Facebook managed to completely disappear from the net, a year or so ago, because some guy really messed up.
-