Slow DNS after 22.05
-
@vaidas I've come to report the same thing.
I thought it was pfBlockerNG-devel, but even with that off I'm seeing CDN content fail (e.g. YouTube). If I bypass unbound by using a client VPN, no problems.
Seeing a lot of timeouts and "domain not found" errors, and I have to manually reload pages to get them to load.
No changes were made to the unbound settings between 22.01 and 22.05.
-
@lohphat Yes, I'm seeing the same thing: the problem seems to be mostly with CDN site resolution.
-
@jax said in Slow DNS after 22.05:
@bmeeks Thanks again for your help.
I'm on a Netgate device that I purchased with pfSense already installed so no virtualization issues that would be unique to my setup.
Sorry, I confused your post with another at the top of this thread where the OP said they were running on Proxmox.
I see where you said you were running on an SG-2100. That is an ARM-based appliance (not Intel). Another poster in this thread has an SG-3100 in his signature. That is also an ARM-based appliance. Could be an issue with the latest unbound version and ARM hardware. I did notice that when my SG-5100 updated it pulled down a new unbound version as part of the upgrade. I've not seen any issue on my SG-5100, but it is Intel-based hardware.
There have, in the past, been some weird issues with software running on ARM hardware due in part to some quirkiness with the llvm compiler used.
-
@bmeeks Thanks again for clarifying ... @rcoleman-netgate do you have any further observations on this issue? It does seem to be Netgate-specific.
-
I had to open a support ticket because my device wouldn't boot after I tried to roll back to 22.01. While working through that ticket to get the firmware reflashed, I was told about a gateway monitoring option: under System -> Routing -> Gateways, edit the IPv4 gateway and change "Monitor IP" to a public DNS server. The system pings the monitor IP continuously, and some ISPs treat that as a DoS attack and temporarily block it, at which point the router considers the WAN connection down. Since putting a Google, Cloudflare or OpenDNS server IP address in this field I have not had any issues with unbound on my SG-1100.
-
Datapoint:
I had DNSSEC enabled in 22.01 and the setting carried over into 22.05 when I upgraded this morning.
After playing with different configs all day, turning off DNSSEC seems to have made things stable for me. I'll keep playing with it.
8jul2022 Update: Nope, it was better after startup but started misbehaving anyway after an hour.
9jul2022 Update: The stable config for me is to disable local DNS resolution and just forward it to the upstream DNS providers. DNSSEC is enabled. I've just added pfBlockerNG-devel into the mix and will see how things work over the day.
9jul2022 Update 2: Had to disable pfBlockerNG-devel due to inability to resolve domains. Just running on unbound only, no filtering.
-
I'm getting a lot of these in the DNS Resolver log with pfBlockerNG-devel uninstalled, DNS forwarding and DNSSEC enabled:
Jul 9 12:43:02 filterdns 82159 failed to resolve host steamusercontent.com will retry later again.
Jul 9 12:43:02 filterdns 82159 failed to resolve host steamstatic.com will retry later again.
Jul 9 12:43:02 filterdns 82159 failed to resolve host steamcontent.com will retry later again.
-
@lohphat said in Slow DNS after 22.05:
DNS forwarding and DNSSEC enabled:
That is never a good idea. If you're going to forward, where you forward to either does DNSSEC or it doesn't; you asking for it does nothing other than cause problems.
If you forward, uncheck the DNSSEC option in the unbound settings.
For example, 8.8.8.8 does DNSSEC whether you ask it to or not:
$ dig @8.8.8.8 www.dnssec-failed.org
; <<>> DiG 9.16.30 <<>> @8.8.8.8 www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13556
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; Query time: 83 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Jul 09 13:05:11 Central Daylight Time 2022
;; MSG SIZE rcvd: 50
See how that fails, while if I ask 4.2.2.2 it does not:
$ dig @4.2.2.2 www.dnssec-failed.org
; <<>> DiG 9.16.30 <<>> @4.2.2.2 www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25984
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; ANSWER SECTION:
www.dnssec-failed.org. 7200 IN A 68.87.109.242
www.dnssec-failed.org. 7200 IN A 69.252.193.191
;; Query time: 191 msec
;; SERVER: 4.2.2.2#53(4.2.2.2)
;; WHEN: Sat Jul 09 13:05:24 Central Daylight Time 2022
;; MSG SIZE rcvd: 82
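If you want to repeat that check against whatever you forward to, here is a rough sh sketch (run it under sh, not tcsh). The function names are made up for illustration, and reading SERVFAIL on www.dnssec-failed.org as "validating" is an assumption on my part: a SERVFAIL can also come from other upstream failures, so treat the result as a hint only.

```sh
#!/bin/sh
# Print the status field (NOERROR, SERVFAIL, ...) from dig output on stdin.
dig_status() {
    sed -n 's/.*->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Interpret the status a resolver returned for a deliberately broken
# DNSSEC zone. Assumption: SERVFAIL means the resolver validated and
# rejected the answer; NOERROR means it handed the answer back unchecked.
classify_validation() {
    case "$1" in
        SERVFAIL) echo "validating" ;;
        NOERROR)  echo "not validating" ;;
        *)        echo "unknown" ;;
    esac
}

# Usage (needs dig and network access):
#   for r in 8.8.8.8 4.2.2.2 9.9.9.9; do
#       s=$(dig @"$r" www.dnssec-failed.org | dig_status)
#       echo "$r: $s ($(classify_validation "$s"))"
#   done
```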
-
@johnpoz Yes, I know, but I'm forwarding to the 9.9.9.9 group of servers and they claim to support DNSSEC.
https://quad9.net/support/faq/#dnssec
-
@lohphat said in Slow DNS after 22.05:
9.9.9.9 group of servers and they claim to support DNSSEC.
There is no need to ask for DNSSEC. They are doing it whether you ask them to or not:
$ dig @9.9.9.9 www.dnssec-failed.org
; <<>> DiG 9.16.30 <<>> @9.9.9.9 www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 1538
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; EDE: 9 (DNSKEY Missing)
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; Query time: 72 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Jul 09 13:10:14 Central Daylight Time 2022
;; MSG SIZE rcvd: 56
-
@johnpoz OK, done. The option needs better info text; as it is, it seems to imply that it needs to be on to support any DNSSEC at all.
-
@lohphat I would agree with you. It should really have a note under it, something like "If you're going to forward, do not set this."
-
Still getting filterdns errors with DNSSEC unchecked:
Jul 9 15:23:01 filterdns 82159 failed to resolve host steamstatic.com will retry later again.
Jul 9 15:23:01 filterdns 82159 failed to resolve host steamcontent.com will retry later again.
Jul 9 15:23:01 filterdns 82159 failed to resolve host steamusercontent.com will retry later again.
Jul 9 15:18:01 filterdns 82159 failed to resolve host steamstatic.com will retry later again.
Jul 9 15:18:01 filterdns 82159 failed to resolve host steamcontent.com will retry later again.
-
@lohphat That's not an unbound problem:
;; QUESTION SECTION:
;steamstatic.com. IN A
;; AUTHORITY SECTION:
steamstatic.com. 1344 IN SOA ns1.valvesoftware.com. admin.valvesoftware.com. 2022041804 3600 900 2419200 3600
;; Query time: 84 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Jul 09 14:31:06 Central Daylight Time 2022
;; MSG SIZE rcvd: 104
$ dig @9.9.9.9 steamusercontent.com
; <<>> DiG 9.16.30 <<>> @9.9.9.9 steamusercontent.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47670
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;steamusercontent.com. IN A
;; AUTHORITY SECTION:
steamusercontent.com. 3600 IN SOA ns1.valvesoftware.com. admin.valvesoftware.com. 2022010300 3600 900 2419200 3600
;; Query time: 68 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Jul 09 14:32:16 Central Daylight Time 2022
;; MSG SIZE rcvd: 109
I get same results trying to just resolve.. they seem to be having an issue.. Or that isn't meant to resolve in the first place.
-
@johnpoz said in Slow DNS after 22.05:
That's not an unbound problem
I get same results trying to just resolve.. they seem to be having an issue.. Or that isn't meant to resolve in the first place.
What's even stranger is that the steam.exe client isn't running, yet the DNS errors are still being logged hours after the client exited.
I'm still getting occasional failed lookups where I have to force reload a page for the domain to resolve.
Something is still broken.
-
@lohphat said in Slow DNS after 22.05:
force reload a page for the domain to resolve.
You sure your browser isn't doing doh? If you feel something isn't resolving - then troubleshoot it vs just thinking X is the problem. If something is taking long to resolve - why?
You're trying to access www.somedomain.tld. Are you sure your browser even asked your DNS for that? If so, why did it not resolve? Where is the delay? You're forwarding, so maybe they just suck at resolving or answering. Maybe they are having the problem?
Did you restart the browser? Maybe it's having an issue with the cache it keeps.
Instead of just assuming something is wrong with unbound, find out where the trouble is. If you ask unbound for xyz, and it goes and asks abc for xyz, did it not get an answer? How long did it take for unbound to ask abc for xyz after you asked it?
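One way to see where the delay is: ask the same name repeatedly, first against unbound and then against the forwarder directly, and compare the times. A minimal sh sketch, assuming dig is available on the client; the function names and the 192.168.1.1 LAN address are placeholders, not anything from this thread.

```sh
#!/bin/sh
# Print the "Query time" value in milliseconds from dig output on stdin.
# Prints nothing if dig got no answer (no stats line present).
dig_query_ms() {
    awk '/^;; Query time:/ { print $4; exit }'
}

# Query one name against one server COUNT times (default 10), one second
# apart, and print a timestamped line with the query time for each try.
probe() {
    server=$1
    name=$2
    count=${3:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        out=$(dig +tries=1 +time=2 @"$server" "$name")
        ms=$(printf '%s\n' "$out" | dig_query_ms)
        echo "$(date '+%H:%M:%S') $server $name ${ms:-no-answer}"
        i=$((i + 1))
        sleep 1
    done
}

# Usage (addresses are examples only):
#   probe 192.168.1.1 www.somedomain.tld 20   # via unbound on pfSense
#   probe 9.9.9.9     www.somedomain.tld 20   # straight to the forwarder
```

If the direct queries are consistently fast and the ones through unbound are not, the problem is local; if both are slow or failing, look upstream.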
-
@johnpoz said in Slow DNS after 22.05:
@lohphat said in Slow DNS after 22.05:
force reload a page for the domain to resolve.
You sure your browser isn't doing doh? If you feel something isn't resolving - then troubleshoot it vs just thinking X is the problem. If something is taking long to resolve - why?
Some of us experiencing this problem have done a bit of testing, including from the command line. The resolution problem is not specific to browsing; it's an intermittent failure of resolution.
-
@jax said in Slow DNS after 22.05:
it's an intermittent failure of resolution.
This screams unbound restarting.. Is it?
[22.05-RELEASE][admin@sg4860.local.lan]/: unbound-control -c /var/unbound/unbound.conf status
version: 1.15.0
verbosity: 1
threads: 4
modules: 2 [ validator iterator ]
uptime: 457693 seconds
options: control(ssl)
unbound (pid 87400) is running...
[22.05-RELEASE][admin@sg4860.local.lan]/:
457k seconds up, what's that, a bit over 5 days? The only reason it's probably not longer is me testing something for some thread here.
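To save doing that arithmetic in your head, a small sh sketch that pulls the uptime line out of the status output and formats it. The function names are just illustrative.

```sh
#!/bin/sh
# Extract the uptime in seconds from `unbound-control status` output on stdin.
unbound_uptime_secs() {
    awk '/^uptime:/ { print $2; exit }'
}

# Format a number of seconds as days, hours, minutes and seconds.
fmt_uptime() {
    s=$1
    printf '%dd %dh %dm %ds\n' \
        $((s / 86400)) $((s % 86400 / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# Usage on the firewall:
#   fmt_uptime "$(unbound-control -c /var/unbound/unbound.conf status | unbound_uptime_secs)"
```

For the 457693 seconds above this gives 5d 7h 8m 13s. An uptime of only a few minutes on a box that has been running for days is the thing to look for.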
-
@johnpoz I don't know enough about these mechanisms to respond to your comment. I'd have to go hit the man pages and study more. Unless you can lay this out for me.
-
@jax Look in your log. Is unbound restarting? If it is, it's hard to resolve something when it's not actually running or only started up a couple of ms ago.
You can check how long it's been up with that command.
Again, to the root of the problem: if you're having issues, why? Unbound doesn't just say "eh, I don't feel like resolving that right now" ;)
So either it is having a hard time finding what you asked for, or maybe it was in the middle of a restart, which is why that specific query failed.
It's a long-running issue: if you register DHCP leases in the resolver, unbound restarts, and this can be quite often depending on the number of DHCP clients, the length of the lease, etc. When unbound restarts, the cache is lost.
With pfBlocker, the time it takes unbound to restart can be much longer than normal.
If you ask unbound for www.domain.tld and you don't get the answer you want, the question is why. There is almost always a logical explanation.
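To check the log for restarts, something like this sh sketch would do. Two assumptions here that you should verify on your own box: that the resolver log lives at /var/log/resolver.log, and that unbound writes a "start of service" line each time it starts. The function names are made up.

```sh
#!/bin/sh
# Count how many times unbound started, given resolver log lines on stdin.
# Assumption: each start is logged with the text "start of service".
count_unbound_starts() {
    grep -c 'start of service'
}

# Print the syslog timestamp (month, day, time) of each start, so you can
# line the restarts up against the times your lookups failed.
unbound_start_times() {
    awk '/start of service/ { print $1, $2, $3 }'
}

# Usage on the firewall (log path is an assumption):
#   count_unbound_starts < /var/log/resolver.log
#   unbound_start_times  < /var/log/resolver.log
```

If the restart times cluster around DHCP lease activity, the DHCP registration setting is the first thing to turn off as a test.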