Slow DNS after 22.05
-
@bmeeks Thanks again for clarifying ... @rcoleman-netgate do you have any further observations on this issue? It does seem to be Netgate-specific.
-
I had to open a support ticket because I couldn't boot my device after trying to roll back to 22.01. While working through the ticket to get the firmware reflashed, I was told about an IP monitoring option: under "System->Routing->Gateways", edit the IPv4 gateway and change "Monitor IP" to a public DNS server. The system pings the monitor IP at regular intervals, and some ISPs treat this as a DoS attack and temporarily block it, after which the router considers the WAN connection down. Since putting a Google, Cloudflare or OpenDNS server IP address in this field, I have not had any issues with unbound on my SG 1100.
-
Datapoint:
I had DNSSEC enabled in 22.01 and the setting carried over into 22.05 when I upgraded this morning.
After playing with different configs all day, turning off DNSSEC seems to have made things stable for me. I'll keep playing with it.
8jul2022 Update: Nope, it was better after startup but started misbehaving anyway after an hour.
9jul2022 Update: The stable config for me is to disable local DNS resolution and just forward it to the upstream DNS providers. DNSSEC is enabled. I've just added pfBlockerNG-devel into the mix and will see how things work over the day.
9jul2022 Update 2: Had to disable pfBlockerNG-devel due to inability to resolve domains. Just running on unbound only, no filtering.
-
I'm getting a lot of these in the DNS Resolver log with pfBlockerNG-devel uninstalled, DNS forwarding and DNSSEC enabled:
Jul 9 12:43:02 filterdns 82159 failed to resolve host steamusercontent.com will retry later again.
Jul 9 12:43:02 filterdns 82159 failed to resolve host steamstatic.com will retry later again.
Jul 9 12:43:02 filterdns 82159 failed to resolve host steamcontent.com will retry later again.
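To see which hostnames keep failing and how often, the log lines can be tallied. This is only a rough sketch: the regular expression assumes the exact filterdns message wording quoted above, and `failed_hosts` is a made-up helper name, not part of pfSense.

```python
import re
from collections import Counter

# Assumption: the message wording matches the filterdns lines quoted above.
FAILED = re.compile(
    r"filterdns \d+ failed to resolve host (\S+) will retry later again\."
)

def failed_hosts(log_text: str) -> Counter:
    """Count how many times each hostname shows up in a filterdns failure."""
    return Counter(FAILED.findall(log_text))

sample = (
    "Jul 9 12:43:02 filterdns 82159 failed to resolve host steamusercontent.com will retry later again.\n"
    "Jul 9 12:43:02 filterdns 82159 failed to resolve host steamstatic.com will retry later again.\n"
    "Jul 9 12:43:02 filterdns 82159 failed to resolve host steamcontent.com will retry later again.\n"
)

counts = failed_hosts(sample)
for host, n in counts.most_common():
    print(host, n)
```

If the same few hosts account for every failure, that points at those names (or an alias that references them) rather than at the resolver as a whole.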
-
@lohphat said in Slow DNS after 22.05:
DNS forwarding and DNSSEC enabled:
That is never a good idea. If you're going to forward, the resolver you forward to either does DNSSEC or it doesn't. You asking for it does nothing other than cause problems.
If you forward, uncheck DNSSEC in the unbound settings.
For example, 8.8.8.8 does DNSSEC whether you ask it to or not:
$ dig @8.8.8.8 www.dnssec-failed.org
; <<>> DiG 9.16.30 <<>> @8.8.8.8 www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13556
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; Query time: 83 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Jul 09 13:05:11 Central Daylight Time 2022
;; MSG SIZE rcvd: 50
See how that fails, while if I ask 4.2.2.2 it does not:
$ dig @4.2.2.2 www.dnssec-failed.org
; <<>> DiG 9.16.30 <<>> @4.2.2.2 www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25984
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; ANSWER SECTION:
www.dnssec-failed.org. 7200 IN A 68.87.109.242
www.dnssec-failed.org. 7200 IN A 69.252.193.191
;; Query time: 191 msec
;; SERVER: 4.2.2.2#53(4.2.2.2)
;; WHEN: Sat Jul 09 13:05:24 Central Daylight Time 2022
;; MSG SIZE rcvd: 82
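The comparison above can be turned into a quick check on saved dig output. The sketch below is hypothetical helper code, not anything pfSense or unbound ships: it only reads the `status:` field from the dig header and treats SERVFAIL for www.dnssec-failed.org as a sign that the forwarder validates. SERVFAIL can have other causes, so take the result as a hint, not proof.

```python
import re

# The header line format is taken from the dig output shown above.
HEADER = re.compile(r"status: (\w+), id: \d+")

def dig_status(dig_output: str) -> str:
    """Return the response code (e.g. NOERROR, SERVFAIL) from dig's header."""
    match = HEADER.search(dig_output)
    if match is None:
        raise ValueError("no dig header line found")
    return match.group(1)

def appears_to_validate(output_for_broken_domain: str) -> bool:
    """Assumption: the query was for a deliberately broken DNSSEC domain,
    so SERVFAIL suggests the resolver validated and rejected the answer."""
    return dig_status(output_for_broken_domain) == "SERVFAIL"

google = ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13556"
level3 = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25984"

print("8.8.8.8 validates:", appears_to_validate(google))
print("4.2.2.2 validates:", appears_to_validate(level3))
```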
-
@johnpoz Yes, I know, but I'm forwarding to the 9.9.9.9 group of servers and they claim to support DNSSEC.
https://quad9.net/support/faq/#dnssec
-
@lohphat said in Slow DNS after 22.05:
9.9.9.9 group of servers and they claim to support DNSSEC.
There is no need to ask for DNSSEC. They do it whether you ask them to or not.
$ dig @9.9.9.9 www.dnssec-failed.org
; <<>> DiG 9.16.30 <<>> @9.9.9.9 www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 1538
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; EDE: 9 (DNSKEY Missing)
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; Query time: 72 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Jul 09 13:10:14 Central Daylight Time 2022
;; MSG SIZE rcvd: 56
-
@johnpoz OK, done. The option needs better info text. As it is, it seems to imply that it needs to be on to support any DNSSEC at all.
-
@lohphat I would agree with you. It should really have a note under it, something like "If you're going to forward, do not set this."
-
Still getting filterdns errors with DNSSEC unchecked:
Jul 9 15:23:01 filterdns 82159 failed to resolve host steamstatic.com will retry later again.
Jul 9 15:23:01 filterdns 82159 failed to resolve host steamcontent.com will retry later again.
Jul 9 15:23:01 filterdns 82159 failed to resolve host steamusercontent.com will retry later again.
Jul 9 15:18:01 filterdns 82159 failed to resolve host steamstatic.com will retry later again.
Jul 9 15:18:01 filterdns 82159 failed to resolve host steamcontent.com will retry later again.
-
@lohphat that's not an unbound problem
;; QUESTION SECTION:
;steamstatic.com. IN A
;; AUTHORITY SECTION:
steamstatic.com. 1344 IN SOA ns1.valvesoftware.com. admin.valvesoftware.com. 2022041804 3600 900 2419200 3600
;; Query time: 84 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Jul 09 14:31:06 Central Daylight Time 2022
;; MSG SIZE rcvd: 104
$ dig @9.9.9.9 steamusercontent.com
; <<>> DiG 9.16.30 <<>> @9.9.9.9 steamusercontent.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47670
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;steamusercontent.com. IN A
;; AUTHORITY SECTION:
steamusercontent.com. 3600 IN SOA ns1.valvesoftware.com. admin.valvesoftware.com. 2022010300 3600 900 2419200 3600
;; Query time: 68 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Jul 09 14:32:16 Central Daylight Time 2022
;; MSG SIZE rcvd: 109
I get the same results when I just try to resolve them. They seem to be having an issue, or those names aren't meant to resolve in the first place.
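In the steamusercontent.com output the status is NOERROR but the ANSWER count is 0, with only an SOA in the authority section. That combination is usually called NODATA: the name exists but has no record of the type asked for, which fits the "isn't meant to resolve" guess. A minimal sketch of telling the cases apart from dig's header follows; `classify` and its labels are made up for illustration.

```python
import re

STATUS = re.compile(r"status: (\w+)")
ANSWERS = re.compile(r"ANSWER: (\d+)")

def classify(dig_output: str) -> str:
    """Label a dig response as ANSWERED, NODATA, or its error code."""
    status = STATUS.search(dig_output)
    answers = ANSWERS.search(dig_output)
    if status is None or answers is None:
        raise ValueError("dig header not found")
    rcode = status.group(1)
    if rcode == "NOERROR":
        # NOERROR with zero answers: the name exists, the record type does not.
        return "ANSWERED" if int(answers.group(1)) > 0 else "NODATA"
    return rcode  # e.g. SERVFAIL or NXDOMAIN

steam = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47670\n"
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1\n"
)
answered = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25984\n"
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1\n"
)

print(classify(steam))
print(classify(answered))
```

A NODATA reply means the upstream answered correctly, so filterdns retrying that name will keep logging the same failure.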
-
@johnpoz said in Slow DNS after 22.05:
that's not an unbound problem
I get the same results when I just try to resolve them. They seem to be having an issue, or those names aren't meant to resolve in the first place.
What's even stranger is that the steam.exe client isn't running, and the DNS errors are still being logged hours after the client exited.
I'm still getting occasional failed lookups where I have to force reload a page for the domain to resolve.
Something is still broken.
-
@lohphat said in Slow DNS after 22.05:
force reload a page for the domain to resolve.
Are you sure your browser isn't doing DoH? If you feel something isn't resolving, then troubleshoot it vs just thinking X is the problem. If something is taking long to resolve, why?
You're trying to access www.somedomain.tld. Are you sure your browser even asked your DNS for that? If so, why did it not resolve? Where is the delay? You're forwarding, so maybe they just suck at resolving or answering. Maybe they are having the problem?
Did you restart the browser? Maybe it's having an issue with the cache it keeps.
Vs just thinking it's something wrong with unbound, find out where the trouble is. If you ask unbound for xyz, and it goes and asks abc for xyz, did it not get an answer? How long did it take for unbound to ask abc for xyz after you asked it?
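One way to put numbers on "where is the delay" is to time each lookup and record whether it failed. The sketch below is only an illustration: `timed_lookup` and `system_resolve` are made-up names, and `socket.getaddrinfo` goes through whatever resolver the client is configured to use, which is assumed here to be the pfSense box. The demo uses a stub resolver so it runs without a network.

```python
import socket
import time
from typing import Callable, List, Optional, Tuple

def system_resolve(name: str) -> List[str]:
    """Resolve a name through the operating system's configured resolver."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

def timed_lookup(
    name: str, resolve: Callable[[str], List[str]] = system_resolve
) -> Tuple[List[str], Optional[str], float]:
    """Return (addresses, error message or None, elapsed milliseconds)."""
    start = time.perf_counter()
    try:
        addresses, error = resolve(name), None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        addresses, error = [], str(exc)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return addresses, error, elapsed_ms

def slow_stub(name: str) -> List[str]:
    """Stand-in resolver that pretends the upstream took 50 ms."""
    time.sleep(0.05)
    return ["192.0.2.10"]  # documentation-range address, not a real answer

addresses, error, elapsed_ms = timed_lookup("www.somedomain.tld", resolve=slow_stub)
print(addresses, error, f"{elapsed_ms:.0f} ms")

# Real lookup through the system resolver:
# print(timed_lookup("www.somedomain.tld"))
```

Running the real lookup in a loop and logging the slow or failed ones with a timestamp gives something concrete to line up against the unbound log.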
-
@johnpoz said in Slow DNS after 22.05:
@lohphat said in Slow DNS after 22.05:
force reload a page for the domain to resolve.
Are you sure your browser isn't doing DoH? If you feel something isn't resolving, then troubleshoot it vs just thinking X is the problem. If something is taking long to resolve, why?
Some of us experiencing this problem have done a bit of testing, including from the command line. The resolution problem is not specific to browsing; it's an intermittent failure of resolution.
-
@jax said in Slow DNS after 22.05:
it's an intermittent failure of resolution.
This screams unbound restarting.. Is it?
[22.05-RELEASE][admin@sg4860.local.lan]/: unbound-control -c /var/unbound/unbound.conf status
version: 1.15.0
verbosity: 1
threads: 4
modules: 2 [ validator iterator ]
uptime: 457693 seconds
options: control(ssl)
unbound (pid 87400) is running...
[22.05-RELEASE][admin@sg4860.local.lan]/:
457k seconds up, what's that, about 5 days? The only reason it's probably not longer is that I was testing something for some thread here.
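For anyone who wants the arithmetic spelled out, here is a small sketch that pulls the uptime out of saved `unbound-control status` output and converts it. The parsing assumes the `uptime: N seconds` line shown above; `unbound_uptime` and `restarted_recently` are made-up helper names, and the five-minute threshold is an arbitrary choice.

```python
import re
from datetime import timedelta

def unbound_uptime(status_output: str) -> timedelta:
    """Extract the uptime from `unbound-control status` output."""
    match = re.search(r"uptime: (\d+) seconds", status_output)
    if match is None:
        raise ValueError("no uptime line found")
    return timedelta(seconds=int(match.group(1)))

def restarted_recently(
    uptime: timedelta, threshold: timedelta = timedelta(minutes=5)
) -> bool:
    """True if unbound has been up for less than the (arbitrary) threshold."""
    return uptime < threshold

status = (
    "version: 1.15.0\n"
    "verbosity: 1\n"
    "threads: 4\n"
    "modules: 2 [ validator iterator ]\n"
    "uptime: 457693 seconds\n"
    "options: control(ssl)\n"
    "unbound (pid 87400) is running...\n"
)

up = unbound_uptime(status)
print(up)  # 5 days, 7:08:13
print(restarted_recently(up))  # False
```

An uptime of only a few seconds or minutes right after a failed lookup would support the restart theory.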
-
@johnpoz I don't know enough about these mechanisms to respond to your comment. I'd have to go hit the man pages and study more. Unless you can lay this out for me.
-
@jax look in your log: is unbound restarting? If it is, it's hard to resolve something when it's not actually running or only started up a couple of ms ago.
You can check how long it's been up with that command.
Again, get to the root of the problem: if you're having issues, why? Unbound doesn't just say "eh, I don't feel like resolving that right now" ;)
So either it is having a hard time finding what you asked for, or maybe it was in the middle of a restart and that is why that specific query failed.
It's a long-running issue: if you register DHCP leases, unbound restarts, and this can happen quite often depending on the number of DHCP clients, the length of the lease, etc. When unbound restarts, the cache is lost.
With pfBlocker, the length of time for unbound to restart can be much longer than normal.
If you ask unbound for www.domain.tld and you don't get the answer you want, the question is why. There is almost always a logical explanation.
-
@johnpoz For one thing the behavior is new since 22.01 to 22.05 and it's happening with common websites using different apps, not just FF.
I have DoH/DoT disabled in both 22.01 and 22.05 but the incidence of failed lookups since the upgrade is consistent over the last two days.
-
@lohphat said in Slow DNS after 22.05:
is new since 22.01 to 22.05
I went from 22.01 to 22.05 and I'm not seeing any such issue. I've had zero issues resolving stuff.
If you're having an issue, the logical thing to do is troubleshoot why, not "oh, something is wrong with version X vs Y". Maybe there is nothing wrong with Y, but some other variable has been introduced. Like maybe with whatever you're doing now, unbound is taking longer to restart: maybe on your system it used to restart in like .3 seconds, and now it's taking 30. So before you never noticed, but now you do.
-
@johnpoz As soon as I get a lookup error, I go and look at the logs and find nothing out of the ordinary. I have DHCP registration disabled, as I know it restarts unbound, and with pfB-dev that's a non-starter.
I just can't find a smoking gun yet other than it's still happening and others are reporting it too.