DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times
-
@RickyBaker there you go - some actual useful info
So you're having some sort of issue with dnssec.. I would expect that query to fail - that fqdn is a test fqdn for making sure dnssec is working.. But now we are seeing the servfail reason..
So now when normal queries fail we might get to the bottom of why you're getting servfail vs an answer to what you asked for.
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
does this mean that my plex server isn't using pfsense for dns resolving?
No, what it means is it's asking the local cache at 127.0.0.53, and your command shows that it points to 10.10.10.1
Clearly went over this already like 6 days ago...
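To illustrate the point about 127.0.0.53: on a host running systemd-resolved, /etc/resolv.conf normally lists only that loopback stub, and the real upstream (10.10.10.1 here) is what resolvectl reports. A minimal sketch in Python, assuming you pass in the text of a resolv.conf-style file; the function names are made up for illustration.

```python
import ipaddress
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses on 'nameserver' lines, skipping comment lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def is_local_stub(address: str) -> bool:
    """True when the address is loopback, i.e. a local cache such as 127.0.0.53."""
    return ipaddress.ip_address(address).is_loopback
```

If the only nameserver is a loopback address, the client is still using whatever that local stub forwards to, so it does not mean pfSense is being bypassed.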
-
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
So you're having some sort of issue with dnssec.
When looking up dnssec-failed.org, what would you expect?
https://www.internetsociety.org/resources/deploy360/2013/dnssec-test-sites/
-
@Gertjan exactly - like I said ;)
-
First, I would like to again apologize for my lack of knowledge. I promise I'm not trying to be difficult or annoying. This is all foreign terminology and concepts to me, but I'm trying my best and can't quantify how much I appreciate the time you're taking
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
So you're having some sort of issue with dnssec.. I would expect that query to fail - that fqdn is a test fqdn for making sure dnssec is working.. But now we are seeing the servfail reason..
So now when normal queries fail we might get to the bottom of why you're getting servfail vs an answer to what you asked for.
What do you mean by a normal query? How is this NOT a normal query? (ducks:)) What's the next step you'd like to see to further clarify?
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
No, what it means is it's asking the local cache at 127.0.0.53, and your command shows that it points to 10.10.10.1
Clearly went over this already like 6 days ago...
ahh that makes sense, sorry I missed that earlier. so does this mean i should be constantly trying new websites i don't ever visit to avoid it falling back to local cache? or is that a fundamental misunderstanding of the steps
@Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
When looking up dnssec-failed.org, what would you expect ?
thank you for the links. It somehow moved me closer AND farther away from understanding. I have AT&T fiber, so why did it attempt a Comcast-run DNSSEC fail website? Is going to this website something built into the dig command? Also, correct me if I'm wrong, but I believe y'all had me re-enable DNSSEC just because it was good practice. I can see how this failing is symptomatic of my greater problems, but it's odd to me that what's manifesting itself is something I've been told is really optional and best practice, not required.
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
that fqdn is a test fqdn for making sure dnssec is working.. But now we are seeing the servfail reason..
All of this leaves me a little lost as to next steps. I keep going back to this line. I know what fqdn stands for, but this collection of words together just doesn't make sense to me, and I believe it's the key to understanding what I need to do next. as always, thanks for everything and further guidance would be greatly appreciated.
-
dnssec-failed.org
Just for reference I see SERVFAIL for it via Google or others.
>dig dnssec-failed.org @8.8.8.8
; <<>> DiG 9.16.44 <<>> dnssec-failed.org @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 64906
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; EDE: 9 (DNSKEY Missing): (No DNSKEY matches DS RRs of dnssec-failed.org)
;; QUESTION SECTION:
;dnssec-failed.org. IN A
;; Query time: 120 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon May 13 10:38:02 Central Daylight Time 2024
;; MSG SIZE rcvd: 97
https://bluecatnetworks.com/blog/the-top-four-dns-response-codes-and-what-they-mean/
"a SERVFAIL is the DNS server telling you, “Hey, I can’t give you the answer for that query.”"
-
@SteveITS well yeah, forwarding and trying to do dnssec is going to be problematic.. But that dnssec-failed.org should always fail.. It's meant to fail, as a way to validate your dnssec is working..
So yeah, if you query any NS that is doing dnssec (google, quad9, etc.) then it would fail.. But if you query some NS that isn't doing dnssec then it would pass..
example
; <<>> DiG 9.16.50 <<>> @8.8.8.8 dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 3602
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; EDE: 9 (DNSKEY Missing): (No DNSKEY matches DS RRs of dnssec-failed.org)
;; QUESTION SECTION:
;dnssec-failed.org. IN A
;; Query time: 95 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon May 13 10:54:20 Central Daylight Time 2024
;; MSG SIZE rcvd: 97
But if you ask something not doing dnssec..
$ dig @4.2.2.2 dnssec-failed.org
; <<>> DiG 9.16.50 <<>> @4.2.2.2 dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39041
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;dnssec-failed.org. IN A
;; ANSWER SECTION:
dnssec-failed.org. 300 IN A 96.99.227.255
;; Query time: 52 msec
;; SERVER: 4.2.2.2#53(4.2.2.2)
;; WHEN: Mon May 13 10:55:08 Central Daylight Time 2024
;; MSG SIZE rcvd: 62
This is another example of why it makes no sense to check the option to use dnssec if you're forwarding.. Either where you forward is already doing dnssec, or it isn't. Most of the major players do; some have different IPs you can query that don't, but pretty much all of them do dnssec. If where you forward does not do dnssec, asking for it in the unbound settings isn't going to do anything other than, more than likely, cause failures..
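For anyone who wants to script the comparison above, here is a minimal sketch in Python. It assumes you already have dig's text output as a string, pulls out the status and any EDE line, and treats SERVFAIL for dnssec-failed.org as a sign that the resolver validates. The names DigResult, parse_dig and looks_validating are made up for illustration, and SERVFAIL can have other causes, so treat the result as a hint only.

```python
import re
from typing import NamedTuple, Optional


class DigResult(NamedTuple):
    status: str         # e.g. "NOERROR", "SERVFAIL", "NXDOMAIN"
    ede: Optional[str]  # Extended DNS Error text, if dig printed one


def parse_dig(output: str) -> DigResult:
    """Extract the response status and the optional EDE line from dig output."""
    status = re.search(r"status:\s*([A-Z]+)", output)
    if status is None:
        raise ValueError("no 'status:' field found; is this dig output?")
    ede = re.search(r"^; EDE:\s*(.+)$", output, re.MULTILINE)
    return DigResult(status.group(1), ede.group(1).strip() if ede else None)


def looks_validating(dnssec_failed_output: str) -> bool:
    """Given dig output for dnssec-failed.org, guess whether the resolver validates.

    A validating resolver is expected to answer SERVFAIL for this test name;
    a non-validating one returns NOERROR with an A record.
    """
    return parse_dig(dnssec_failed_output).status == "SERVFAIL"
```

Fed the 8.8.8.8 output above it reports SERVFAIL with the "DNSKEY Missing" EDE text, and fed the 4.2.2.2 output it reports NOERROR with no EDE.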
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
All of this leaves me a little lost as to next steps.
The next step is to wait till you fail again.. You were seeing servfail - but we didn't know why or what was the reason for it. Now that you have enabled logging of servfail details.. Next time you have a problem - we can hope to see why.. And then address that..
Also have you updated to 2.7.2 yet? This should be your next step to be honest..
-
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
Also have you updated to 2.7.2 yet? This should be your next step to be honest..
no i have not but I can prioritize. i know it SHOULD be easy and smooth but i'm so nervous. especially with it not updating by itself.
@SteveITS said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
https://bluecatnetworks.com/blog/the-top-four-dns-response-codes-and-what-they-mean/
thanks this is a very useful article
-
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
The next step is to wait till you fail again..
and what specific commands should i be running? I assume you don't need the resolvectl one, just "dig www.netgate.com" or www.msn.com?
-
@RickyBaker yeah and looking in the log.. So we can see what it logs for failure if the dig output doesn't show us as much detail on it, etc.
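Since the failure window can be short, one option is to let a small script do the waiting. This is a rough sketch in Python, not something from the thread: it assumes dig is installed and on the PATH, that the resolver to test is 10.10.10.1, and that anything other than "status: NOERROR" counts as a failure. The function names and the log file name are made up for illustration.

```python
import subprocess
import time
from datetime import datetime
from typing import Callable, List, Optional


def run_dig(name: str, server: str) -> str:
    """Run the real dig binary once, with a short timeout and a single try."""
    proc = subprocess.run(
        ["dig", f"@{server}", name, "+tries=1", "+time=3"],
        capture_output=True, text=True,
    )
    return proc.stdout


def check_once(names: List[str], server: str,
               dig: Callable[[str, str], str] = run_dig) -> List[str]:
    """Return one timestamped entry per lookup that did not come back NOERROR."""
    failures = []
    for name in names:
        output = dig(name, server)
        if "status: NOERROR" not in output:
            stamp = datetime.now().isoformat(timespec="seconds")
            failures.append(f"{stamp} {name} via {server} FAILED\n{output}")
    return failures


def watch(names: List[str], server: str, log_path: str,
          interval_s: float = 10.0, rounds: Optional[int] = None,
          dig: Callable[[str, str], str] = run_dig) -> None:
    """Poll forever (or `rounds` times), appending any failures to log_path."""
    done = 0
    while rounds is None or done < rounds:
        failures = check_once(names, server, dig)
        if failures:
            with open(log_path, "a", encoding="utf-8") as fh:
                fh.write("\n".join(failures) + "\n")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

Something like watch(["www.netgate.com", "www.msn.com"], "10.10.10.1", "dns-failures.log") would then leave timestamps that can be matched against the resolver log after the fact. It has to run on a device that actually sees the problem.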
-
@johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
yeah and looking in the log.. So we can see what it logs for failure if the dig output doesn't show us as much detail on it, etc.
great thanks for the clarification
-
@RickyBaker spent all afternoon waiting with my computer for an outage, finally went to bed around 1130pm. Happened right away and resolved by the time i sprinted downstairs. stay tuned
-
been hunting non-stop but the network has "unfortunately" been very stable this week. This morning my wife said she was experiencing the DNS NX issue on her phone right when we woke up, but when i fired up my phone I wasn't experiencing the problem. Went about my morning and a few minutes later, while on the head without a laptop, it happened to me. I fired up my ssh app and ssh'ed into the plex server and got this:
...but then i realized i was doing it on another machine that may not be experiencing the problem. I don't know why I didn't put that together before but the DNS issue USUALLY affects all devices at once but obviously not always. unfortunately my phone's local ssh session doesn't have the dig command. I'll look at installing it to increase my chance of catching it. Unfortunately I forgot that i only have a few minutes to screenshot the logs before they roll off and I missed it. I'm optimistic i'll catch it this weekend.
-
@RickyBaker Still hunting, frustratingly the problem has def gotten less frequent and shorter in duration (but still ever present, my wife agrees, i'm not crazy). More often than before, it's also happening on individual devices while other devices work fine. It happened on my PC and when I ran the dig command on my plex debian box it was fine.
In the log though I did find this around when I tried the dig command:
I also found this which looks shady to me:
Since it seems to be singular devices at a time now i'm slowly figuring out how to run dig commands on all the different OSes in my house. I have Android and linux and am following a tutorial for Windows now...
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
following a tutorial for Windows now...
So i was trying to follow this tutorial but when I went to install it said it was already installed (I used the legacy windows version). I had tried to install it before but then typing the dig command still returned a "command not found". The installer suggested i remove the old one from add/remove programs but I couldn't find anything under BIND or ISC and the last installed program was discord a LONG time ago.
I tried to continue with the tutorial but it asked where BIND was installed to add it to the PATH (which I'm sure was my problem the first time around) but I don't know where it's installed and a windows search for BIND or ISC is expectedly noisy. any suggestions?
I'll keep plugging at it but it's an annoying speed bump that's really slowing down the troubleshooting...
-
Finally got one!!!
I pasted everything in the log back a few minutes here in case the totality of it is useful: https://pastebin.com/w2SGh8P0
@johnpoz Sorry for the delay in getting this I swear i was trying the whole time. thanks for the patience.
-
@johnpoz got another one! though it does seem to be happening with a lot less frequency for some reason, i've just gotten better at catching them during the quick window of opportunity:
-
This one got a NXDOMAIN error:
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
This one got a NXDOMAIN error:
That is a URL not a hostname so it should fail. Remove the /apps/staff (as shown in the prior post).
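To make the URL vs hostname point concrete, here is a small sketch in Python that strips a pasted URL down to the bare hostname dig expects. The function name is made up for illustration and www.example.com is a placeholder, not the site from the screenshot.

```python
from urllib.parse import urlsplit


def hostname_for_dig(text: str) -> str:
    """Return just the hostname from a URL or host/path string."""
    text = text.strip()
    if "://" not in text:
        # Without a scheme, urlsplit only sees a host if the text starts with "//".
        text = "//" + text
    host = urlsplit(text).hostname
    if not host:
        raise ValueError("no hostname found in input")
    return host
```

So both "https://www.example.com/apps/staff" and "www.example.com/apps/staff" come out as "www.example.com", which is the form to hand to dig.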
Searching for "exceeded the maximum number of sends" looks like DNSSEC...:
https://community.ipfire.org/t/servfail-exceeded-the-maximum-number-of-sends/7645
https://www.reddit.com/r/pihole/comments/11hqrco/intermittent_servfail_when_using_unbound/
this one talks about not using UDP for DNS...?!
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270824
This one talks about torrenting and DNSSEC:
https://www.reddit.com/r/opnsense/comments/1cinuyn/unbound_dns_issues_freezes_randomly/