DNS resolver very slow
-
Hey y'all
Over the last two weeks, my DNS resolver at one site has become so slow that most sites are timing out.
TL;DR - DNS Resolver taking 2000-3000 msecs to resolve many hosts; sites and apps timing out as a result; not using forwarding
We run two redundant internal DNS servers (for AD and internal hosts and domains). Both of those forward to pfSense's Unbound resolver for two reasons:
- hosts not in AD or the local domains
- pfSense is the first box to come up after an outage or power failure - so having it serve as DHCP and the 3rd assigned DNS is a fail-safe
We don't wish to use forwarders or our ISP's DNS.
Note: we've tried with forwarding to 8.8.8.8 and 8.8.4.4 and to ISP's DNS - no material difference.
Question:
What are the best practices for optimizing query result times to root DNS servers with pfSense's Unbound?
Details
- pfSense - 2.6
- internal (AD) DNS - bind 9.16 on FreeBSD
- all hosts hardwired on 10Gbps links
Results
(Tried to pick some random FQDNs that wouldn't be cached)
pfSense / unbound: 10.15.1.1
AD DNS: 10.15.15.57
Random domain using unbound
dig slack.com @10.15.1.1
;; Query time: 996 msec
;; SERVER: 10.15.1.1#53(10.15.1.1) (UDP)
;; WHEN: Thu May 04 16:20:26 UTC 2023
;; MSG SIZE rcvd: 182
Local host using AD DNS
dig server.local.foo.bar @10.15.15.57
;; Query time: 8 msec
;; SERVER: 10.15.15.57#53(10.15.15.57) (UDP)
;; WHEN: Thu May 04 16:22:47 UTC 2023
;; MSG SIZE rcvd: 142
Random domain using AD DNS forwarding to Unbound on pfSense
dig um.edu @10.15.15.57
;; Query time: 820 msec
;; SERVER: 10.15.15.57#53(10.15.15.57) (UDP)
;; WHEN: Thu May 04 16:25:07 UTC 2023
;; MSG SIZE rcvd: 138
Random domain using AD DNS forwarding to Unbound on pfSense IPv6
dig wv.edu @10.15.1.1 AAAA
;; Query time: 860 msec
;; SERVER: 10.15.1.1#53(10.15.1.1) (UDP)
;; WHEN: Thu May 04 16:26:38 UTC 2023
;; MSG SIZE rcvd: 110
-
@spacebass said in DNS resolver very slow:
DNS Resolver taking 2000-3000 msecs
My first thought if I read that is that if IPv6 is configured but isn't working, IPv6 will try to connect and wait ~2s to time out, then IPv4 works.
-
@steveits said in DNS resolver very slow:
My first thought if I read that is that if IPv6 is configured but isn't working, IPv6 will try to connect and wait ~2s to time out, then IPv4 works.
it is a valid and still likely hypothesis
Wouldn't this suggest that IPv6 resolution is working?
I note and appreciate that the response time was reasonably fast!
Also is there a way to build a fail-safe rule? Like to try IPv4 roots first?
- 10.15.1.1 is pfSense unbound
- um.edu is a domain I've never queried before
dig um.edu @10.15.1.1 AAAA
; <<>> DiG 9.10.6 <<>> um.edu @10.15.1.1 AAAA
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 37430
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;um.edu. IN AAAA
;; AUTHORITY SECTION:
edu. 900 IN SOA a.edu-servers.net. nstld.verisign-grs.com. 1683250401 1800 900 604800 86400
;; Query time: 184 msec
;; SERVER: 10.15.1.1#53(10.15.1.1)
;; WHEN: Thu May 04 19:33:47 MDT 2023
;; MSG SIZE rcvd: 110
-
@spacebass said in DNS resolver very slow:
Wouldn't this suggest that IPv6 resolution is working?
One can resolve AAAA records over IPv4 which is what your example is. The question would be, can it connect over IPv6.
um.edu does not have an AAAA record which matches your example. Try cnn.com which does:
dig cnn.com aaaa
or using Quad9:
dig cnn.com aaaa @2620:fe::9
In pfSense there is an option in System/Advanced/Networking to "Prefer IPv4 over IPv6" which is only for pfSense itself.
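The difference between asking for an AAAA record and actually reaching a server over IPv6 could be sketched roughly as follows. This is only an illustration: the helper name `build_dig_command` is made up for the example, and it assumes a dig build that supports the `-4` and `-6` flags, which force the query transport to IPv4 or IPv6.

```python
from typing import List, Optional


def build_dig_command(name: str, rtype: str = "A",
                      server: Optional[str] = None,
                      family: Optional[int] = None) -> List[str]:
    """Build a dig argument list (hypothetical helper, for illustration only).

    rtype  -- record type being asked for, e.g. "A" or "AAAA"
    server -- resolver to query; None lets dig use the system default
    family -- 4 or 6 forces the *transport* (dig -4 / dig -6); None leaves it alone
    """
    if family not in (None, 4, 6):
        raise ValueError("family must be 4, 6 or None")
    cmd = ["dig"]
    if family is not None:
        cmd.append(f"-{family}")
    if server is not None:
        cmd.append(f"@{server}")
    cmd += [name, rtype]
    return cmd


# AAAA record requested from the pfSense resolver over IPv4 transport.
aaaa_over_v4 = build_dig_command("cnn.com", "AAAA", server="10.15.1.1", family=4)
# Same question, but the query itself has to travel over IPv6 (Quad9 here).
aaaa_over_v6 = build_dig_command("cnn.com", "AAAA", server="2620:fe::9", family=6)

print(" ".join(aaaa_over_v4))
print(" ".join(aaaa_over_v6))
```

If the `-6` variant hangs or times out while the `-4` variant answers quickly, that would point at IPv6 connectivity rather than at the records themselves.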
-
@spacebass said in DNS resolver very slow:
um.edu is a domain I've never queried before
not even a valid domain, even via IPv4.. Which is why you got back SOA from one of the .edu NS.
-
@johnpoz said in DNS resolver very slow:
not even a valid domain, even via IPv4.. Which is why you got back SOA from one of the .edu NS.
ha! Didn't even notice that, I was so focused on the query time only
-
@steveits said in DNS resolver very slow:
In pfSense there is an option in System/Advanced/Networking to "Prefer IPv4 over IPv6" which is only for pfSense itself.
thanks, that's helpful
What else would that affect besides Unbound... updates?
Would it affect resolution of VPN peers?
-
@spacebass said in DNS resolver very slow:
What else would that affect besides Unbound... updates?
All that setting means is that pfSense should use IPv4 before IPv6 if it has both. This applies when pfSense itself talks to something, like checking whether a new pfSense version is available or what packages are available.
Keep in mind that if a package like pfBlocker makes the actual request, and not the OS (pfSense), it might try IPv6 first.
-
@johnpoz thanks! that's clear
-
still working on troubleshooting this ... not sure how exactly.
I have 'Prefer to use IPv4 even if IPv6 is available' checked
IPv6 seems to be configured correctly. I get addresses for each LAN client. I can ping6 any internal and external server with an IPv6 address.
But nonetheless, resolves are quite slow whenever IPv6 is enabled.
Might that just be the nature of my routes to root servers? Or could there be something else on my end to check?
-
bumping this one - still looking for some troubleshooting tips.
I've set pfSense to prefer IPv4 but AAAA queries still take between 2-8msecs.
Any tips for troubleshooting or resolving or tweaking?
-
@spacebass said in DNS resolver very slow:
Might that just be the nature of my routes to root servers
Keep in mind the root servers are only asked for the gTLD servers of whatever TLD the domain you're looking for is in: .net, .com, .org, etc.
So you don't talk to the actual roots very often, only when you're looking up something with a TLD that you haven't looked up before, like .biz or .edu.
The resolver then asks those gTLD servers for the authoritative NS for domain.tld, and then talks to that NS directly for whatever host.domain.tld you're looking for.
The roots and the gTLD servers are then cached for quite some time, so you wouldn't really need to ask them again when looking up host.domain.tld or otherthing.domain.tld, since you already know the authoritative name servers for that domain.
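As a rough illustration of that caching behaviour, here is a toy model of which levels a resolver still has to contact. It is a simplification, not how Unbound is implemented: the function name `servers_to_ask` is invented, TTLs, CNAMEs and DNSSEC are ignored, and it assumes the delegated domain is always the last two labels of the name.

```python
from typing import List, Set


def servers_to_ask(hostname: str, cached_zones: Set[str]) -> List[str]:
    """Return which levels still need to be contacted (toy model only).

    cached_zones holds zones whose name servers are already known,
    e.g. {"edu", "domain.edu"}.
    """
    labels = hostname.rstrip(".").split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])  # assumption: domain sits directly under the TLD
    if domain in cached_zones:
        # Authoritative servers are already known: go straight to them.
        return [f"authoritative NS for {domain}"]
    steps = []
    if tld not in cached_zones:
        steps.append(f"root servers (who serves .{tld}?)")
    steps.append(f".{tld} servers (who serves {domain}?)")
    steps.append(f"authoritative NS for {domain}")
    return steps


# Cold cache, then TLD cached, then the domain's own name servers cached.
for cache in (set(), {"edu"}, {"edu", "domain.edu"}):
    print(sorted(cache), "->", servers_to_ask("host.domain.edu", cache))
```

In this model the roots only show up in the first case, which matches the point that a slow route to the root servers is unlikely to explain slowness on every lookup.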
It's quite possible that you have a bad IPv6 connection to the domain or domains you're seeing slow resolution with. Unbound will normally tend to use the fastest NS for a domain when the domain has multiple name servers (which is really a requirement, though I have seen some shit domains where it's the same box with just a different IP).
I would pick a specific domain you're having slowness with, and see what kind of latency you get when you query its name servers directly for some record they are authoritative for.
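A minimal sketch of that check, assuming dig is installed and on the PATH. The function names are illustrative, and the `+time=2 +tries=1` settings are just one arbitrary choice to keep a dead server from stalling the run.

```python
import re
import subprocess
from typing import Dict, List, Optional

# Matches the ";; Query time: 996 msec" line that dig prints.
QUERY_TIME_RE = re.compile(r";; Query time: (\d+) msec")


def parse_query_time(dig_output: str) -> Optional[int]:
    """Extract the query time in msec from dig output, or None if absent."""
    match = QUERY_TIME_RE.search(dig_output)
    return int(match.group(1)) if match else None


def run_dig(args: List[str]) -> str:
    """Run dig with a short timeout and return its stdout."""
    cmd = ["dig", "+time=2", "+tries=1", *args]
    return subprocess.run(cmd, capture_output=True, text=True, timeout=30).stdout


def authoritative_latency(domain: str) -> Dict[str, Optional[int]]:
    """Query each listed name server of `domain` directly for its SOA record."""
    ns_lines = run_dig(["+short", "NS", domain]).splitlines()
    ns_names = [line.strip() for line in ns_lines if line.strip()]
    results: Dict[str, Optional[int]] = {}
    for ns in ns_names:
        # +norecurse: ask the server only for what it is authoritative for.
        output = run_dig([f"@{ns}", domain, "SOA", "+norecurse"])
        results[ns] = parse_query_time(output)  # None means no answer in time
    return results
```

Calling `authoritative_latency("slack.com")` from the affected network would give one number per name server. A `None` or a consistently high value for a particular server would be the thing to look into, and repeating the run with dig's `-4` and `-6` flags would show whether only one address family is slow.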
-
-
@steveits thanks - that's interesting. I'm not forwarding to Quad9 but I'll check and see if disabling ASLR makes a difference.