Strange unbound issue
Apologies if this has been raised before, but I have a very strange issue when attempting to use unbound. I am using 2.2.4.
Basically, I have found that I cannot resolve certain domains initially, but if I wait a few minutes they then resolve.
So for example, if I try to ping www.tescobank.com:
sebsmacbook:~ seb$ ping www.tescobank.com
ping: cannot resolve www.tescobank.com: Unknown host
sebsmacbook:~ seb$ ping www.tescobank.com
ping: cannot resolve www.tescobank.com: Unknown host
sebsmacbook:~ seb$ ping www.tescobank.com
sebsmacbook:~ seb$ dig www.tescobank.com
; <<>> DiG 9.8.3-P1 <<>> www.tescobank.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 42740
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.tescobank.com. IN A
;; Query time: 876 msec
;; SERVER: 10.0.10.1#53(10.0.10.1)
;; WHEN: Thu Aug 13 11:43:05 2015
;; MSG SIZE rcvd: 35
If I then wait a minute and try again:
sebsmacbook:~ seb$ ping www.tescobank.com
PING www.tescobank.com (188.8.131.52): 56 data bytes
sebsmacbook:~ seb$ dig www.tescobank.com
; <<>> DiG 9.8.3-P1 <<>> www.tescobank.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17424
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.tescobank.com. IN A
;; ANSWER SECTION:
www.tescobank.com. 43099 IN A 184.108.40.206
;; Query time: 26 msec
;; SERVER: 10.0.10.1#53(10.0.10.1)
;; WHEN: Thu Aug 13 11:44:59 2015
;; MSG SIZE rcvd: 51
Only a few sites do this, and I have no idea what causes it. Before I give up completely and go back to the DNS forwarder, I wondered whether anyone else has come across this and what they did to fix it. It feels like maybe unbound is being too aggressive with backing off?
too aggressive in backing off?
You see this right? status: SERVFAIL
That you got 26 ms seems odd, when judging by the TTL you are clearly getting that answer from cache.
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.tescobank.com. IN A
;; ANSWER SECTION:
www.tescobank.com. 43126 IN A 220.127.116.11
;; Query time: 1 msec
;; SERVER: 192.168.9.253#53(192.168.9.253)
;; WHEN: Thu Aug 13 09:35:58 CDT 2015
;; MSG SIZE rcvd: 62
You do understand that a resolver talks to the authoritative servers for a domain to get an answer? So yes, if the name servers are on the other side of the globe and they suck, or your connection sucks, then you can have timeouts when doing queries.
So you see, my first query took a while to get that answer:
;; Query time: 670 msec
;; SERVER: 192.168.9.25
But the second time it was instant. I'm curious as to why your local query to pfSense would take 26 ms?
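To compare runs like these without reading the output by eye, the status and query time can be pulled out of dig's text. This is only a minimal sketch: `summarize_dig` is a hypothetical helper name, and it assumes the header and "Query time" lines look like the ones pasted in this thread.

```python
import re

def summarize_dig(output):
    """Extract the response status and query time from dig's text output."""
    status = re.search(r"status:\s*([A-Z]+)", output)
    qtime = re.search(r"Query time:\s*(\d+)\s*msec", output)
    return {
        "status": status.group(1) if status else None,
        "query_ms": int(qtime.group(1)) if qtime else None,
    }

# Header and timing lines copied from the two dig runs in the first post.
failed = ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 42740\n;; Query time: 876 msec"
cached = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17424\n;; Query time: 26 msec"

print(summarize_dig(failed))
print(summarize_dig(cached))
```

A SERVFAIL with a long query time next to a NOERROR with a short one is the pattern being discussed here.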
Are you using DNSSEC? This can slow down queries as well. Run a dig +trace to see the path taken to look up the name you're interested in, and see where in the chain the problem might be. Or maybe the authoritative servers for that domain are just not good, etc.
user@ubuntu:~$ dig www.tescobank.com +trace
; <<>> DiG 9.9.5-3ubuntu0.4-Ubuntu <<>> www.tescobank.com +trace
;; global options: +cmd
. 470293 IN NS i.root-servers.net.
. 470293 IN NS m.root-servers.net.
. 470293 IN NS f.root-servers.net.
. 470293 IN NS d.root-servers.net.
. 470293 IN NS k.root-servers.net.
. 470293 IN NS e.root-servers.net.
. 470293 IN NS h.root-servers.net.
. 470293 IN NS b.root-servers.net.
. 470293 IN NS j.root-servers.net.
. 470293 IN NS c.root-servers.net.
. 470293 IN NS a.root-servers.net.
. 470293 IN NS g.root-servers.net.
. 470293 IN NS l.root-servers.net.
. 470293 IN RRSIG NS 8 0 518400 20150822170000 20150812160000 1518 . FxJN8Ehr2iJlNqWYAz7k1+cIVAHR+CoHSSOc6aL0saMFDprK0wDp2alu aLaK ePXjsQmPKhblwm39Oi0s5a95yyAAe0ENQvvHP8ulTF7J4hf+2hlA wPcbxpcRwLcqLMYL2n8tO2ErPmX JTmx/qCyHtXQFBx4aRp0hisWZ5a/L 6aA=
;; Received 397 bytes from 192.168.9.253#53(192.168.9.253) in 1663 ms
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 86400 IN DS 30909 8 2 E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
com. 86400 IN RRSIG DS 8 1 86400 20150822170000 20150812160000 1518 . enjxq7j0RibyC1CosJbsrdBq9zStRipAId7MShiW9kqlPSYzzK2NmjhH g9D7g T1fyh8lJWKNs/pJZW/lkOEXO8xNUD7BC/bPY9WOITg292WvJflr bpknnMN2U1XeXrJ22xM2tiFhhM13 We0AB4eF6Rdq3z7vll+53mJ5SaYB /yA=
;; Received 741 bytes from 18.104.22.168#53(k.root-servers.net) in 1064 ms
tescobank.com. 172800 IN NS pdns194.ultradns.net.
tescobank.com. 172800 IN NS pdns194.ultradns.com.
tescobank.com. 172800 IN NS pdns194.ultradns.info.
tescobank.com. 172800 IN NS pdns194.ultradns.co.uk.
tescobank.com. 172800 IN NS pdns194.ultradns.biz.
tescobank.com. 172800 IN NS pdns194.ultradns.org.
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0QFMDQRCSRU0651QLVA1JQB21IF7UR NS SOA RRSIG DNSKEY NSEC3PARAM
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN RRSIG NSEC3 8 2 86400 20150820045245 20150813034245 35864 com. DUHg41F+98lGhf7Raspw1f6Sk7jqJjLfmLj5bRqa8KEUT5boTG K1qKNI JyJKvbzSMzEnOkwekQqIXvZrcbKEQAOJGZMNL8NDNL/3E9/mTO46EFkv jjON5AtDDnqWL3bi bNXmFrDkAVw1KYDSiLqJ7wyEXtQ2o3eBPZYP9/2U TW4=
S005QJCBHSRBI0H8A1QO0SSHE8JMM13J.com. 86400 IN NSEC3 1 1 0 - S00CTJ7ECU2P7T725IHBUGP2R2898MVU NS DS RRSIG
S005QJCBHSRBI0H8A1QO0SSHE8JMM13J.com. 86400 IN RRSIG NSEC3 8 2 86400 20150820045732 20150813034732 35864 com. LZc6JJbtmwBJAofMfL5xgb39bpNRHcolinD4VE/ZIjLo1K9wuu kKhXnJ UMAnhK8+0sceuqJ9QZu9Bl278Hv1arT3LEHLuDaX1jMuqRfuZTx0QBcD 3ol5M2/koO27L26e MHYYoj0gOWN6CABohsVc/qFW037uKpPHAPtY16rF 9Y0=
;; Received 823 bytes from 22.214.171.124#53(c.gtld-servers.net) in 892 ms
www.tescobank.com. 600 IN NS gslb1.tescobank.uk.com.
www.tescobank.com. 600 IN NS gslb2.tescobank.uk.com.
;; Received 99 bytes from 2001:502:4612::e6#53(pdns194.ultradns.org) in 180 ms
www.tescobank.com. 43200 IN A 126.96.36.199
;; Received 51 bytes from 188.8.131.52#53(gslb1.tescobank.uk.com) in 105 ms
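To see where in the chain the time goes, the ";; Received ... in N ms" lines of a +trace can be tabulated. The following is a rough sketch, not a dig feature: `trace_hops` and `slowest_hop` are hypothetical helper names, and the regular expression assumes the line format shown in the trace above.

```python
import re

# Matches lines such as:
# ;; Received 99 bytes from 2001:502:4612::e6#53(pdns194.ultradns.org) in 180 ms
HOP = re.compile(r";; Received (\d+) bytes from (\S+)#(\d+)\((\S+?)\) in (\d+) ms")

def trace_hops(output):
    """Return (server name, milliseconds) for each step of a dig +trace."""
    return [(m.group(4), int(m.group(5))) for m in HOP.finditer(output)]

def slowest_hop(output):
    """Return the single step that took longest to answer."""
    return max(trace_hops(output), key=lambda hop: hop[1])

# Two of the timing lines from the trace above.
sample = """\
;; Received 99 bytes from 2001:502:4612::e6#53(pdns194.ultradns.org) in 180 ms
;; Received 51 bytes from 188.8.131.52#53(gslb1.tescobank.uk.com) in 105 ms
"""

for server, ms in trace_hops(sample):
    print(server, ms)
```

A single trace only shows one sample per server, so a slow step is a hint about where to look rather than proof of a fault.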
Thanks for your reply,
First off, the reason for the 26 ms delay is that I was connecting over VPN when I generated those examples, but the issue presents itself when connecting locally as well.
While I do understand that the resolver will go and talk to the authoritative servers in order to get an answer, I thought this process had a timeout which increases after every failure.
Maybe "too aggressive back off" is not the right way to word what I was asking. What I actually meant is that maybe the initial timeout is too small.
Also, and forgive my ignorance: assuming unbound doesn't get a response on the first try, does it respond to the client immediately with a failure and continue to retry, so that a request that comes in later will work because by that point unbound will have succeeded and cached the result?
When I get home I will try a trace and see what it says.
The timeout is on the client side: it is how long the client waits for a DNS response. As to unbound backoff, etc., see here
But your issue when you try to ping is the client timing out, not the server.
tescobank.com NSes reply really slowly at times.
;; Query time: 2160 msec
and your client doesn't wait long enough before timing out in that case. You may get better results by enabling forwarding mode and using Google Public DNS, since they're more likely to have it in cache, hence avoiding that recursion delay.
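The timing argument here can be illustrated with a toy model. This is a sketch of the idea only and not unbound's actual algorithm: it ignores SERVFAIL, retries and TTL expiry. `ToyResolver` is a hypothetical class, the 2160 ms latency is the query time quoted above, and the 1000 ms client timeout is an assumption made for illustration.

```python
class ToyResolver:
    """Toy caching resolver: models timing only, not unbound's real behaviour."""

    def __init__(self, upstream_ms):
        self.upstream_ms = upstream_ms  # name -> assumed authoritative latency (ms)
        self.ready_at = {}              # name -> time (ms) the answer reaches the cache

    def query(self, name, now_ms, client_timeout_ms):
        # The first request for a name starts the recursion.
        if name not in self.ready_at:
            self.ready_at[name] = now_ms + self.upstream_ms[name]
        wait_ms = max(0, self.ready_at[name] - now_ms)
        if wait_ms > client_timeout_ms:
            return "timeout"  # the client gives up; the lookup still completes later
        return "answer"

resolver = ToyResolver({"www.tescobank.com": 2160})
print(resolver.query("www.tescobank.com", now_ms=0, client_timeout_ms=1000))
print(resolver.query("www.tescobank.com", now_ms=60_000, client_timeout_ms=1000))
```

In this model the first query fails because the client stops waiting before the slow name servers reply, while a query a minute later is served from cache.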
Thanks for the info guys.
I tried to do a +trace, but that doesn't work when using unbound unless you change some settings. Since the overall conclusion is that the name servers are slow and the clients themselves are not waiting long enough, I have opted to turn on forwarding in unbound, and this has solved the issue.
Maybe when I get some free time I will tackle trying to increase the timeout on those machines.
Out of interest, which is better as a forwarder: dnsmasq or unbound?
Huh, setting changes? The trace is done from the client, so unless you have port 53 blocked outbound from the clients doing the +trace, what would unbound have to do with it?
If you're going to do pure forwarding, I would say that dnsmasq is better suited. Out of the box, dnsmasq will query all the forwarders and take the first response. I don't believe unbound works that way in forwarder mode.
Also, from a simple config point of view, dnsmasq does not do resolver mode, so if you're using the "forwarder" you know it is going to be forwarding. Unbound is designed more as a true resolver, so while it supports forwarder mode, it makes more sense to me to just use dnsmasq.
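The "query all the forwarders and take the first response" behaviour described above can be sketched as a simple race. This is a generic illustration, not dnsmasq's code: the forwarders are simulated with sleeps, and the names, delays and the 192.0.2.1 documentation address are all made up.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def fake_forwarder(name, delay_s, answer):
    """Stand-in for an upstream DNS server: sleeps, then returns its answer."""
    time.sleep(delay_s)
    return name, answer

def first_response(forwarders):
    """Send the same query to every forwarder and keep whichever replies first."""
    pool = ThreadPoolExecutor(max_workers=len(forwarders))
    futures = [pool.submit(fake_forwarder, *fwd) for fwd in forwarders]
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower forwarders
    return next(iter(done)).result()

# Hypothetical forwarders: (name, simulated delay in seconds, answer).
forwarders = [
    ("forwarder-a", 0.30, "192.0.2.1"),
    ("forwarder-b", 0.02, "192.0.2.1"),
]
print(first_response(forwarders))
```

The point of the race is that one slow upstream does not delay the client, as long as any other upstream answers quickly.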
When using dig +trace I have the same issue as described in this post: I get back an empty response.
Without enabling snoop it would appear that +trace doesn't work.
So far I'm happy with unbound acting as a forwarder, but I may go back to dnsmasq if any specific issues present themselves later on. I just wondered if there was any particular reason not to use unbound in that configuration.
A domain using shitty DNS servers is not a pfSense issue… I'm not exactly sure what solution you are searching for here: their DNS servers take seconds to respond -> broken crap.