unbound unstable?
-
I use unbound. For the last 8 months or maybe longer I have had intermittent DNS problems. It will work flawlessly for 2 months, then there are problems, then they go away again. Today zoom.us was not resolving for a user.
cnn.com worked (I think no one uses cnn.com on the network), everything else failed.
When I restart unbound it works for about 10 minutes, then fails again.
Any help would be appreciated.
david@iMac % nslookup zoom.us
Server: 192.168.9.1
Address: 192.168.9.1#53
** server can't find zoom.us: SERVFAIL
-
@david_moo seems like it's constantly restarting, from your log
Jan 31 11:11:40 unbound 52588 [52588:0] notice: Restart of unbound 1.12.0.
It's not going to be very usable if it's always restarting. Do you have it set to register DHCP clients? I would turn that off. That has been an issue for a long time, to be honest.
If it only restarts every now and then for a DHCP client, that's not a big issue, but if you have lots of clients, or clients renewing their IPs all the time (really short lease time), then yeah, it can be very problematic.
pfblocker restarting unbound can also cause issues, etc.
But from your log and how often it's restarting, it's not going to be very usable, no.
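To put a number on "how often", a minimal sketch that counts restart lines per hour could look like the following. It assumes the resolver log uses the same line format as the one quoted above; the function name and the second and third sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   Jan 31 11:11:40 unbound 52588 [52588:0] notice: Restart of unbound 1.12.0.
RESTART_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"unbound\b.*Restart of unbound"
)

def restarts_per_hour(lines):
    """Return a Counter mapping 'Mon DD HH' to the number of restarts in that hour."""
    counts = Counter()
    for line in lines:
        m = RESTART_RE.match(line)
        if m:
            key = f"{m['month']} {int(m['day']):02d} {m['hour']}"
            counts[key] += 1
    return counts

sample = [
    "Jan 31 11:11:40 unbound 52588 [52588:0] notice: Restart of unbound 1.12.0.",
    # The next two lines are hypothetical, added only to exercise the parser.
    "Jan 31 11:14:02 unbound 52588 [52588:0] notice: Restart of unbound 1.12.0.",
    "Jan 31 11:14:03 unbound 52588 [52588:0] info: some other message",
]

for hour, count in sorted(restarts_per_hour(sample).items()):
    print(hour, count)
```

More than a handful of restarts in an hour would fit the "constantly restarting" picture, since every restart also throws away the cache.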
-
I do have it set to register DHCP clients. I will turn it off.
I have maybe 200 clients, so that could be it. I have lengthened the lease time from default to 14440. So that should cut the traffic in half for a given number of clients.
Thanks for the help. I at least understand what's going on now.
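A rough back-of-the-envelope sketch of the lease-time reasoning, under some loud assumptions: clients renew at half the lease time (the usual DHCP T1 behaviour), the old default lease was 7200 seconds (not stated in this thread), and every renewal can trigger a registration and therefore a restart. Real clients will not behave this uniformly.

```python
def renewals_per_hour(clients, lease_seconds, renew_fraction=0.5):
    """Estimate DHCP renewals per hour if every client renews at renew_fraction of its lease."""
    renew_interval = lease_seconds * renew_fraction  # seconds between renewals per client
    return clients * 3600 / renew_interval

CLIENTS = 200          # "maybe 200 clients" from the post above
ASSUMED_DEFAULT = 7200 # assumption: default lease time in seconds
NEW_LEASE = 14440      # lease time mentioned in the post above

for lease in (ASSUMED_DEFAULT, NEW_LEASE):
    rate = renewals_per_hour(CLIENTS, lease)
    print(f"lease {lease}s: about {rate:.0f} renewals/hour, one every {3600 / rate:.0f}s")
```

Under these assumptions the longer lease does roughly halve the renewals, but that is still on the order of one potential restart every half minute, which is why turning registration off matters more than the lease time.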
-
@david_moo did you disable DNSSEC for some specific reason? You're not forwarding, so I'm not sure why you would have that off, unless you disabled it in an attempt at fixing the issue?
-
Yes it was disabled in the past to try and fix the issue.
I have reenabled it.
-
@david_moo so since you disabled the DHCP registration, how's it working?
-
I would say no change:
It does find some websites, but not zoom?
david@iMac ~ % sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder
david@iMac ~ % nslookup zoom.us
Server: 192.168.9.1
Address: 192.168.9.1#53
** server can't find zoom.us: SERVFAIL
The unbound server doesn't seem to be restarting (looking at the end of the log).
log2.txt
-
@david_moo well, if you cannot find something specific, try doing a trace and see where you're failing. I am not having any issues resolving that.
But when you resolve, it's possible you're having a hard time talking to the authoritative NS for that domain.
I set it not to do DNSSEC in the trace, so it's easier to read.
[21.05.2-RELEASE][admin@sg4860.local.lan]/root: dig zoom.us +trace +nodnssec
; <<>> DiG 9.16.16 <<>> zoom.us +trace +nodnssec
;; global options: +cmd
. 72211 IN NS i.root-servers.net.
. 72211 IN NS j.root-servers.net.
. 72211 IN NS k.root-servers.net.
. 72211 IN NS l.root-servers.net.
. 72211 IN NS m.root-servers.net.
. 72211 IN NS a.root-servers.net.
. 72211 IN NS b.root-servers.net.
. 72211 IN NS c.root-servers.net.
. 72211 IN NS d.root-servers.net.
. 72211 IN NS e.root-servers.net.
. 72211 IN NS f.root-servers.net.
. 72211 IN NS g.root-servers.net.
. 72211 IN NS h.root-servers.net.
;; Received 239 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
us. 172800 IN NS b.cctld.us.
us. 172800 IN NS f.cctld.us.
us. 172800 IN NS k.cctld.us.
us. 172800 IN NS w.cctld.us.
us. 172800 IN NS x.cctld.us.
us. 172800 IN NS y.cctld.us.
;; Received 402 bytes from 192.203.230.10#53(e.root-servers.net) in 13 ms
zoom.us. 7200 IN NS ns-1137.awsdns-14.org.
zoom.us. 7200 IN NS ns-1772.awsdns-29.co.uk.
zoom.us. 7200 IN NS ns-387.awsdns-48.com.
zoom.us. 7200 IN NS ns-888.awsdns-47.net.
;; Received 176 bytes from 2001:dcd:1::15#53(w.cctld.us) in 33 ms
zoom.us. 60 IN A 170.114.10.71
zoom.us. 172800 IN NS ns-1137.awsdns-14.org.
zoom.us. 172800 IN NS ns-1772.awsdns-29.co.uk.
zoom.us. 172800 IN NS ns-387.awsdns-48.com.
zoom.us. 172800 IN NS ns-888.awsdns-47.net.
;; Received 192 bytes from 2600:9000:5303:7800::1#53(ns-888.awsdns-47.net) in 54 ms
[21.05.2-RELEASE][admin@sg4860.local.lan]/root:
Does your trace work from pfsense?
-
Works from the pfsense shell but not from the desktops.
From pfsense:
[21.05.2-RELEASE][root@pfsense]/root: dig zoom.us +trace +nodnssec
; <<>> DiG 9.16.16 <<>> zoom.us +trace +nodnssec
;; global options: +cmd
. 81920 IN NS b.root-servers.net.
. 81920 IN NS m.root-servers.net.
. 81920 IN NS h.root-servers.net.
. 81920 IN NS e.root-servers.net.
. 81920 IN NS l.root-servers.net.
. 81920 IN NS d.root-servers.net.
. 81920 IN NS j.root-servers.net.
. 81920 IN NS f.root-servers.net.
. 81920 IN NS g.root-servers.net.
. 81920 IN NS a.root-servers.net.
. 81920 IN NS i.root-servers.net.
. 81920 IN NS c.root-servers.net.
. 81920 IN NS k.root-servers.net.
;; Received 239 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
us. 172800 IN NS y.cctld.us.
us. 172800 IN NS b.cctld.us.
us. 172800 IN NS w.cctld.us.
us. 172800 IN NS f.cctld.us.
us. 172800 IN NS k.cctld.us.
us. 172800 IN NS x.cctld.us.
;; Received 432 bytes from 192.112.36.4#53(g.root-servers.net) in 55 ms
zoom.us. 7200 IN NS ns-1137.awsdns-14.org.
zoom.us. 7200 IN NS ns-387.awsdns-48.com.
zoom.us. 7200 IN NS ns-888.awsdns-47.net.
zoom.us. 7200 IN NS ns-1772.awsdns-29.co.uk.
;; Received 176 bytes from 37.209.192.15#53(w.cctld.us) in 38 ms
zoom.us. 60 IN A 170.114.10.80
zoom.us. 172800 IN NS ns-1137.awsdns-14.org.
zoom.us. 172800 IN NS ns-1772.awsdns-29.co.uk.
zoom.us. 172800 IN NS ns-387.awsdns-48.com.
zoom.us. 172800 IN NS ns-888.awsdns-47.net.
;; Received 192 bytes from 205.251.195.120#53(ns-888.awsdns-47.net) in 34 ms
[21.05.2-RELEASE][root@pfsense]/root:
From Desktop:
david@iMac ~ % dig zoom.us +trace +nodnssec
; <<>> DiG 9.10.6 <<>> zoom.us +trace +nodnssec
;; global options: +cmd
;; Received 17 bytes from 192.168.9.1#53(192.168.9.1) in 0 ms
david@iMac ~ %
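One possible reading of the "Received 17 bytes" line: dig +trace starts by asking the configured resolver for the root NS set, a DNS header is 12 bytes, and the question for the root name adds 5 more. A 17-byte reply would then be just header plus question with no records in it, where the working traces show 239 bytes at the same step. The sketch below only does that arithmetic; the helper names are made up for illustration, and it says nothing about why the reply was empty.

```python
import struct

def encode_qname(name):
    """Encode a domain name in DNS wire format: length-prefixed labels, then a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:  # the root name "." has no labels at all
            out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"

def empty_response_size(name, qtype=2, qclass=1):
    """Size of a DNS message holding only the 12-byte header and the echoed question."""
    # id, flags (QR bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!6H", 0, 0x8000, 1, 0, 0, 0)
    question = encode_qname(name) + struct.pack("!2H", qtype, qclass)  # qtype 2 = NS, qclass 1 = IN
    return len(header + question)

print(empty_response_size("."))  # 17: header (12) + root name (1) + type (2) + class (2)
```

If that reading is right, the trace is failing at the very first step, the root NS query to 192.168.9.1, rather than anywhere along the path to the zoom.us nameservers.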
-
@david_moo just try dig zoom.us
SERVFAIL means something else is going wrong, not specifically a problem with resolving the NS. It could be lots of things.
-
Works from pfsense and the Linux boxes. Fails from the macOS boxes (I tried 2). Works from the macOS boxes if I specify another DNS server.
david@iMac Desktop % dig zoom.us +trace +nodnssec
; <<>> DiG 9.10.6 <<>> zoom.us +trace +nodnssec
;; global options: +cmd
;; Received 17 bytes from 192.168.9.1#53(192.168.9.1) in 0 ms
david@iMac Desktop % dig @8.8.8.8 +trace +nodnssec zoom.us
; <<>> DiG 9.10.6 <<>> @8.8.8.8 +trace +nodnssec zoom.us
; (1 server found)
;; global options: +cmd
. 34761 IN NS b.root-servers.net.
. 34761 IN NS f.root-servers.net.
. 34761 IN NS k.root-servers.net.
. 34761 IN NS a.root-servers.net.
. 34761 IN NS d.root-servers.net.
. 34761 IN NS j.root-servers.net.
. 34761 IN NS l.root-servers.net.
. 34761 IN NS g.root-servers.net.
. 34761 IN NS c.root-servers.net.
. 34761 IN NS e.root-servers.net.
. 34761 IN NS h.root-servers.net.
. 34761 IN NS i.root-servers.net.
. 34761 IN NS m.root-servers.net.
;; Received 239 bytes from 8.8.8.8#53(8.8.8.8) in 23 ms
us. 172800 IN NS f.cctld.us.
us. 172800 IN NS y.cctld.us.
us. 172800 IN NS w.cctld.us.
us. 172800 IN NS x.cctld.us.
us. 172800 IN NS b.cctld.us.
us. 172800 IN NS k.cctld.us.
;; Received 404 bytes from 193.0.14.129#53(k.root-servers.net) in 58 ms
zoom.us. 7200 IN NS ns-387.awsdns-48.com.
zoom.us. 7200 IN NS ns-1137.awsdns-14.org.
zoom.us. 7200 IN NS ns-888.awsdns-47.net.
zoom.us. 7200 IN NS ns-1772.awsdns-29.co.uk.
;; Received 176 bytes from 37.209.192.15#53(w.cctld.us) in 39 ms
zoom.us. 60 IN A 170.114.10.75
zoom.us. 172800 IN NS ns-1137.awsdns-14.org.
zoom.us. 172800 IN NS ns-1772.awsdns-29.co.uk.
zoom.us. 172800 IN NS ns-387.awsdns-48.com.
zoom.us. 172800 IN NS ns-888.awsdns-47.net.
;; Received 192 bytes from 205.251.198.236#53(ns-1772.awsdns-29.co.uk) in 24 ms
-
@david_moo what does the iMac say when you just run dig zoom.us?
Does it say SERVFAIL?
What about www.zoom.us, which is just a CNAME that points to zoom.us?
-
same
david@iMac Desktop % dig www.zoom.us +trace
; <<>> DiG 9.10.6 <<>> www.zoom.us +trace
;; global options: +cmd
;; Received 17 bytes from 192.168.9.1#53(192.168.9.1) in 0 ms
But remove the trace and all is OK? A macOS trace issue?
david@iMac Desktop % dig www.zoom.us +nodnssec
; <<>> DiG 9.10.6 <<>> www.zoom.us +nodnssec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34020
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.zoom.us. IN A
;; ANSWER SECTION:
www.zoom.us. 231 IN CNAME zoom.us.
zoom.us. 60 IN A 170.114.10.74
;; Query time: 23 msec
;; SERVER: 192.168.9.1#53(192.168.9.1)
;; WHEN: Mon Jan 31 16:24:26 AST 2022
;; MSG SIZE rcvd: 70
-
@david_moo what does it do with just
dig zoom.us
Before, you got a SERVFAIL. Is that still happening?
-
No it's working.
david@iMac Desktop % dig zoom.us
; <<>> DiG 9.10.6 <<>> zoom.us
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37066
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;zoom.us. IN A
;; ANSWER SECTION:
zoom.us. 60 IN A 170.114.10.77
;; Query time: 23 msec
;; SERVER: 192.168.9.1#53(192.168.9.1)
;; WHEN: Mon Jan 31 17:20:38 AST 2022
;; MSG SIZE rcvd: 52
so in theory everything is ok.
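Since the problem comes and goes, it may help to check the status field over time rather than by eye. A minimal sketch that pulls the status and flags out of dig's text output, fed with the header lines from the output above; the function name is hypothetical, and it only reads text, it does not run dig itself.

```python
import re

HEADER_RE = re.compile(r"->>HEADER<<- opcode: (\w+), status: (\w+), id: (\d+)")
FLAGS_RE = re.compile(r";; flags: ([a-z ]*);")

def parse_dig_header(output):
    """Pull opcode, status, id and flags out of dig's text output, or return None."""
    header = HEADER_RE.search(output)
    if header is None:
        return None  # e.g. a +trace run that printed no answer header
    flags = FLAGS_RE.search(output)
    return {
        "opcode": header.group(1),
        "status": header.group(2),
        "id": int(header.group(3)),
        "flags": flags.group(1).split() if flags else [],
    }

sample = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37066\n"
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n"
)

result = parse_dig_header(sample)
print(result["status"], result["flags"])
```

A failing lookup would show SERVFAIL in the same status field, so logging this value with a timestamp is one way to catch the next bad spell.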
-
@david_moo yeah, sure looks like it to me. SERVFAILs can be tricky to track down sometimes. It's an error, but it's very vague ;) Going forward, with newer versions of DNS clients and servers, it will be possible to get an extended error code back via EDE that gives you a clue as to why exactly it failed, etc.
https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-6
EDNS and EDE can provide lots of info about what is going on, what went wrong, etc.
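For a feel of what that looks like on the wire: per RFC 8914 an EDE is an EDNS option (option code 15) whose data is a 2-byte info code followed by optional UTF-8 extra text. A minimal decoding sketch follows; the function name is hypothetical, the name table is only a subset of the IANA registry linked above, and the sample payload is made up.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS option code for Extended DNS Errors (RFC 8914)

# Subset of the info codes from the IANA registry; not the full list.
EDE_NAMES = {
    0: "Other",
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    9: "DNSKEY Missing",
    22: "No Reachable Authority",
    23: "Network Error",
}

def decode_ede(option_data):
    """Split EDE option data into (info_code, name, extra_text)."""
    if len(option_data) < 2:
        raise ValueError("EDE option data must hold at least the 2-byte info code")
    (info_code,) = struct.unpack("!H", option_data[:2])
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    return info_code, EDE_NAMES.get(info_code, "not in this subset"), extra_text

# Hypothetical payload: info code 22 plus some free-form extra text.
payload = struct.pack("!H", 22) + b"example extra text"
print(decode_ede(payload))
```

So instead of a bare SERVFAIL, a resolver that supports this could say, for example, that it could not reach any authoritative server, or that DNSSEC validation failed.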
-
Awesome. Nice RFC, should help things when implemented.
-
@david_moo here is something that's probably easier to read, with more info
https://tools.ietf.org/id/draft-ietf-dnsop-extended-error-11.html
I think I saw somewhere a while back that Cloudflare was starting to provide EDE codes. Let me see if I can find that article.
edit: here you go
https://blog.cloudflare.com/unwrap-the-servfail/
In the days of just asking your ISP's DNS, it either worked or it didn't when you asked for something. But when you start to run your own actual resolver, like unbound does out of the box, sometimes you need to get a bit deeper into the weeds on why something specific isn't working. SERVFAIL is just a catch-all that doesn't really give you even a hint as to what is wrong ;) Other than that what you asked for failed ;)