Strange unbound issue



  • Apologies if this has been raised before, but I have a very strange issue when attempting to use unbound. I am using pfSense 2.2.4.

    Basically I have found that I cannot resolve certain domains initially, but if I wait a few minutes they then resolve.

    So, for example, if I try to ping www.tescobank.com:

    sebsmacbook:~ seb$ ping www.tescobank.com
    ping: cannot resolve www.tescobank.com: Unknown host
    sebsmacbook:~ seb$ ping www.tescobank.com
    ping: cannot resolve www.tescobank.com: Unknown host
    sebsmacbook:~ seb$ ping www.tescobank.com
    
    sebsmacbook:~ seb$ dig www.tescobank.com
    
    ; <<>> DiG 9.8.3-P1 <<>> www.tescobank.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 42740
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;www.tescobank.com.		IN	A
    
    ;; Query time: 876 msec
    ;; SERVER: 10.0.10.1#53(10.0.10.1)
    ;; WHEN: Thu Aug 13 11:43:05 2015
    ;; MSG SIZE  rcvd: 35
    

    If I then wait a minute and try again:

    sebsmacbook:~ seb$ ping www.tescobank.com
    PING www.tescobank.com (178.17.68.12): 56 data bytes
    
    sebsmacbook:~ seb$ dig www.tescobank.com
    
    ; <<>> DiG 9.8.3-P1 <<>> www.tescobank.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17424
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;www.tescobank.com.		IN	A
    
    ;; ANSWER SECTION:
    www.tescobank.com.	43099	IN	A	178.17.68.12
    
    ;; Query time: 26 msec
    ;; SERVER: 10.0.10.1#53(10.0.10.1)
    ;; WHEN: Thu Aug 13 11:44:59 2015
    ;; MSG SIZE  rcvd: 51
    

    Only a few sites do this, and I have no idea what causes it. Before I give up completely and go back to the DNS forwarder, I wondered if anyone else has come across this and what they did to fix it. It feels like maybe unbound is being too aggressive with backing off?


  • Rebel Alliance Global Moderator

    Too aggressive in backing off?

    You do see this, right? status: SERVFAIL

    It also seems odd that your second query took 26 ms when, going by the TTL, it was clearly answered from cache. For comparison, a cached answer here:

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;www.tescobank.com.            IN      A

    ;; ANSWER SECTION:
    www.tescobank.com.      43126  IN      A      178.17.68.12

    ;; Query time: 1 msec
    ;; SERVER: 192.168.9.253#53(192.168.9.253)
    ;; WHEN: Thu Aug 13 09:35:58 CDT 2015
    ;; MSG SIZE  rcvd: 62

    You do understand that a resolver talks to the authoritative servers for a domain to get an answer? So yes, if the name servers are on the other side of the globe and they suck, or your connection sucks, then you can have timeouts when doing queries.

    You can see that my first query took a while to get that answer:
    ;; Query time: 670 msec
    ;; SERVER: 192.168.9.25

    But the second time it was instant. I'm curious why your local query to pfSense would take 26 ms.

    Are you using DNSSEC? This can slow down queries as well. Run a dig +trace to see the lookup path for the name you're interested in and where in the chain the problem might be. Or maybe it's just that the authoritative servers for that domain are not good.

    user@ubuntu:~$ dig www.tescobank.com +trace

    ; <<>> DiG 9.9.5-3ubuntu0.4-Ubuntu <<>> www.tescobank.com +trace
    ;; global options: +cmd
    .                      470293  IN      NS      i.root-servers.net.
    .                      470293  IN      NS      m.root-servers.net.
    .                      470293  IN      NS      f.root-servers.net.
    .                      470293  IN      NS      d.root-servers.net.
    .                      470293  IN      NS      k.root-servers.net.
    .                      470293  IN      NS      e.root-servers.net.
    .                      470293  IN      NS      h.root-servers.net.
    .                      470293  IN      NS      b.root-servers.net.
    .                      470293  IN      NS      j.root-servers.net.
    .                      470293  IN      NS      c.root-servers.net.
    .                      470293  IN      NS      a.root-servers.net.
    .                      470293  IN      NS      g.root-servers.net.
    .                      470293  IN      NS      l.root-servers.net.
    .                      470293  IN      RRSIG  NS 8 0 518400 20150822170000 20150812160000 1518 . FxJN8Ehr2iJlNqWYAz7k1+cIVAHR+CoHSSOc6aL0saMFDprK0wDp2alu aLaKePXjsQmPKhblwm39Oi0s5a95yyAAe0ENQvvHP8ulTF7J4hf+2hlA wPcbxpcRwLcqLMYL2n8tO2ErPmXJTmx/qCyHtXQFBx4aRp0hisWZ5a/L 6aA=
    ;; Received 397 bytes from 192.168.9.253#53(192.168.9.253) in 1663 ms

    com.                    172800  IN      NS      a.gtld-servers.net.
    com.                    172800  IN      NS      b.gtld-servers.net.
    com.                    172800  IN      NS      c.gtld-servers.net.
    com.                    172800  IN      NS      d.gtld-servers.net.
    com.                    172800  IN      NS      e.gtld-servers.net.
    com.                    172800  IN      NS      f.gtld-servers.net.
    com.                    172800  IN      NS      g.gtld-servers.net.
    com.                    172800  IN      NS      h.gtld-servers.net.
    com.                    172800  IN      NS      i.gtld-servers.net.
    com.                    172800  IN      NS      j.gtld-servers.net.
    com.                    172800  IN      NS      k.gtld-servers.net.
    com.                    172800  IN      NS      l.gtld-servers.net.
    com.                    172800  IN      NS      m.gtld-servers.net.
    com.                    86400  IN      DS      30909 8 2 E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
    com.                    86400  IN      RRSIG  DS 8 1 86400 20150822170000 20150812160000 1518 . enjxq7j0RibyC1CosJbsrdBq9zStRipAId7MShiW9kqlPSYzzK2NmjhH g9D7gT1fyh8lJWKNs/pJZW/lkOEXO8xNUD7BC/bPY9WOITg292WvJflr bpknnMN2U1XeXrJ22xM2tiFhhM13We0AB4eF6Rdq3z7vll+53mJ5SaYB /yA=
    ;; Received 741 bytes from 193.0.14.129#53(k.root-servers.net) in 1064 ms

    tescobank.com.          172800  IN      NS      pdns194.ultradns.net.
    tescobank.com.          172800  IN      NS      pdns194.ultradns.com.
    tescobank.com.          172800  IN      NS      pdns194.ultradns.info.
    tescobank.com.          172800  IN      NS      pdns194.ultradns.co.uk.
    tescobank.com.          172800  IN      NS      pdns194.ultradns.biz.
    tescobank.com.          172800  IN      NS      pdns194.ultradns.org.
    CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0QFMDQRCSRU0651QLVA1JQB21IF7UR NS SOA RRSIG DNSKEY NSEC3PARAM
    CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN RRSIG NSEC3 8 2 86400 20150820045245 20150813034245 35864 com. DUHg41F+98lGhf7Raspw1f6Sk7jqJjLfmLj5bRqa8KEUT5boTGK1qKNI JyJKvbzSMzEnOkwekQqIXvZrcbKEQAOJGZMNL8NDNL/3E9/mTO46EFkv jjON5AtDDnqWL3bibNXmFrDkAVw1KYDSiLqJ7wyEXtQ2o3eBPZYP9/2U TW4=
    S005QJCBHSRBI0H8A1QO0SSHE8JMM13J.com. 86400 IN NSEC3 1 1 0 - S00CTJ7ECU2P7T725IHBUGP2R2898MVU NS DS RRSIG
    S005QJCBHSRBI0H8A1QO0SSHE8JMM13J.com. 86400 IN RRSIG NSEC3 8 2 86400 20150820045732 20150813034732 35864 com. LZc6JJbtmwBJAofMfL5xgb39bpNRHcolinD4VE/ZIjLo1K9wuukKhXnJ UMAnhK8+0sceuqJ9QZu9Bl278Hv1arT3LEHLuDaX1jMuqRfuZTx0QBcD 3ol5M2/koO27L26eMHYYoj0gOWN6CABohsVc/qFW037uKpPHAPtY16rF 9Y0=
    ;; Received 823 bytes from 192.26.92.30#53(c.gtld-servers.net) in 892 ms

    www.tescobank.com.      600    IN      NS      gslb1.tescobank.uk.com.
    www.tescobank.com.      600    IN      NS      gslb2.tescobank.uk.com.
    ;; Received 99 bytes from 2001:502:4612::e6#53(pdns194.ultradns.org) in 180 ms

    www.tescobank.com.      43200  IN      A      178.17.68.12
    ;; Received 51 bytes from 178.17.64.7#53(gslb1.tescobank.uk.com) in 105 ms

    user@ubuntu:~$
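
    To see where in the chain the time goes, the per-hop summary lines of a +trace can be pulled out and compared. Below is a rough Python sketch that does this for the ";; Received ... in N ms" lines copied from the trace above. The function names (parse_trace_hops, slowest_hop) are made up for illustration, and the regular expression assumes dig's summary format as it appears in this thread.

```python
import re

# Matches dig's per-hop summary lines, e.g.
# ";; Received 741 bytes from 193.0.14.129#53(k.root-servers.net) in 1064 ms"
HOP_RE = re.compile(
    r"^\s*;; Received (\d+) bytes from (\S+?)#(\d+)\((\S+?)\) in (\d+) ms"
)


def parse_trace_hops(trace_output):
    """Return one (server_name, address, milliseconds) tuple per hop."""
    hops = []
    for line in trace_output.splitlines():
        match = HOP_RE.match(line)
        if match:
            _size, address, _port, name, ms = match.groups()
            hops.append((name, address, int(ms)))
    return hops


def slowest_hop(hops):
    """Return the hop with the largest response time, or None if empty."""
    return max(hops, key=lambda hop: hop[2], default=None)


# The summary lines copied from the trace in this post.
SAMPLE = """\
;; Received 397 bytes from 192.168.9.253#53(192.168.9.253) in 1663 ms
;; Received 741 bytes from 193.0.14.129#53(k.root-servers.net) in 1064 ms
;; Received 823 bytes from 192.26.92.30#53(c.gtld-servers.net) in 892 ms
;; Received 99 bytes from 2001:502:4612::e6#53(pdns194.ultradns.org) in 180 ms
;; Received 51 bytes from 178.17.64.7#53(gslb1.tescobank.uk.com) in 105 ms
"""

if __name__ == "__main__":
    for name, address, ms in parse_trace_hops(SAMPLE):
        print(f"{ms:>6} ms  {name} ({address})")
```

    Running the trace a few times and comparing the slowest hop between runs should show whether the delay is consistently at one set of name servers or just varies with the connection.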



  • Thanks for your reply,

    First off, the reason for the 26 ms delay is that I was connecting over VPN when I generated those examples, but the issue presents itself when connecting locally as well.

    While I do understand that the resolver will go and talk to the authoritative servers in order to get an answer, I thought this process had a timeout, which increases after every failure.
    Maybe "too aggressive back off" is not the right way to word what I was asking. What I actually meant is that maybe the initial timeout is too small.
    Also, and forgive my ignorance: assuming unbound doesn't get a response on the first try, does it respond to the client immediately with a failure and continue to retry, so that a request that comes in later works because by then unbound has succeeded and cached the result?

    When I get home I will try a trace and see what it says.


  • Rebel Alliance Global Moderator

    The timeout is on the client side: it is how long the client waits for a DNS response. As for unbound's backoff, etc., see here:
    https://www.unbound.net/documentation/info_timeout.html

    But your issue when you try to ping is a client timeout, not the server.



  • tescobank.com NSes reply really slowly at times.

    ;; Query time: 2160 msec
    
    

    and your client doesn't wait long enough before timing out in that case. You may get better results by enabling forwarding mode and using Google Public DNS, since they're more likely to have it in cache, which avoids that recursion delay.
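
    The "fails first, works a minute later" pattern can be modelled with a toy resolver in Python. This is only a sketch under stated assumptions: the CachingResolver class and client_query function are hypothetical, the 1000 ms client wait budget is a made-up figure rather than a measured macOS value, and the model assumes the resolver finishes the lookup and caches the result even after the client has stopped waiting. The 2160 ms recursion time is the slow query time quoted above.

```python
class CachingResolver:
    """Toy resolver: a cold lookup costs `recursion_ms`, a cached one `cache_ms`."""

    def __init__(self, recursion_ms, cache_ms=1):
        self.recursion_ms = recursion_ms
        self.cache_ms = cache_ms
        self.cache = set()

    def lookup_time_ms(self, name):
        """Return how long an answer takes, then remember the name.

        Assumption: the resolver completes the recursion and caches the
        result even if the client has already given up on the query.
        """
        if name in self.cache:
            return self.cache_ms
        self.cache.add(name)
        return self.recursion_ms


def client_query(resolver, name, client_budget_ms):
    """Return 'answer' if the reply arrives within the client's wait budget."""
    elapsed = resolver.lookup_time_ms(name)
    return "answer" if elapsed <= client_budget_ms else "timeout"


if __name__ == "__main__":
    # 2160 ms comes from the thread; the 1000 ms budget is illustrative only.
    resolver = CachingResolver(recursion_ms=2160)
    print(client_query(resolver, "www.tescobank.com", client_budget_ms=1000))
    print(client_query(resolver, "www.tescobank.com", client_budget_ms=1000))
```

    The first call prints "timeout" and the second prints "answer", because the second lookup is served from the toy cache. A forwarder that already has the name cached behaves like the second call from the start.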



  • Thanks for the info guys.

    I tried to do a +trace, but that doesn't work when using unbound unless you change some settings. Since the overall conclusion is that the name servers are slow and the clients themselves are not waiting long enough, I have opted to turn on forwarding in unbound, and this has solved the issue.

    Maybe when I get some free time I will tackle trying to increase the timeout on those machines.

    Out of interest, which is better as a forwarder: dnsmasq or unbound?

    Seb


  • Rebel Alliance Global Moderator

    Huh, setting changes? The trace is done from the client, so unless you have port 53 blocked outbound from your clients, what would unbound have to do with the +trace?

    If you're going to do just pure forwarding, I would say that dnsmasq is better suited. Out of the box, dnsmasq will query all the forwarders and take the first response. I don't believe unbound works that way in forwarder mode.

    Also, from a simple config standpoint, dnsmasq does not do resolver mode, so if you're using the "forwarder" you know it is going to be forwarding. Unbound is designed more as a true resolver, so while it supports forwarder mode, it makes more sense to me to just use dnsmasq.
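
    The "query all the forwarders and take the first response" strategy can be sketched in Python with asyncio. This only illustrates the idea and is not dnsmasq's actual code: ask_forwarder is a stand-in that sleeps instead of sending a DNS query, and the latencies are invented for the example.

```python
import asyncio


async def ask_forwarder(name, address, latency_s):
    """Stand-in for a real DNS query: sleep, then return a fake reply."""
    await asyncio.sleep(latency_s)
    return address, f"reply for {name}"


async def first_response(name, forwarders):
    """Query every forwarder at once and return the first reply to arrive."""
    tasks = [
        asyncio.create_task(ask_forwarder(name, address, latency))
        for address, latency in forwarders.items()
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the slower answers are no longer needed
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


if __name__ == "__main__":
    # Addresses are Google Public DNS; the latencies are made up.
    forwarders = {"8.8.8.8": 0.20, "8.8.4.4": 0.05}
    winner, reply = asyncio.run(first_response("www.tescobank.com", forwarders))
    print(winner, reply)
```

    With this approach a single slow forwarder does not delay the client, at the cost of sending every query to every forwarder.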



  • Hi,

    When using dig +trace I have the same issue as described in this post: I get back an empty response.

    http://unbound.net/pipermail/unbound-users/2010-November/001489.html

    Without enabling snoop it would appear that +trace doesn't work.

    So far I'm happy with unbound acting as a forwarder, but I may go back to dnsmasq if any specific issues present themselves later on. I just wondered if there was any particular reason not to use unbound in that configuration.


  • Banned

    A domain using shitty DNS servers is not a pfSense issue… Not exactly sure what solution you are searching for here - their DNS servers take seconds to respond -> broken crap.