DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times
-
@RickyBaker Still hunting, frustratingly the problem has def gotten less frequent an shorter in duration (but still ever present, my wife agrees, i'm not crazy). It's also happening more on individual devices where other devices work fine more often than before. It happened on my PC and when I ran the dig command on my plex debian box it was fine
In the log though I did find this around when I tried the dig command:
I also found this which looks shady to me:
Since it seems to be singular devices at a time now i'm slowing figuring out how to run dig commands on all the different OS's in my house. I have Android and linux and am following a tutorial for Windows now...
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
following a tutorial for Windows now...
So i was trying to follow this tutorial but when I went to install it said it was already installed (I used the legacy windows version). I had tried to install it before but then typing the dig command still returned a "command not found". The installer suggested i remove the old one from add/remove programs but I couldn't find anything under BIND or ISC and the last installed program was discord a LONG time ago.
I tried to continue with the tutorial but it asked where BIND was installed to add it to the PATH (which I'm sure was my problem the first time around) but I don't know where it's installed and a windows search for BIND or ISC is expectedly noisy. any suggestions?
I'll keep plugging at it but it's an annoying speed bump that's really slowing down the troublshooting...
-
Finally got one!!!
I pasted everything in the log back a few minutes here in case the totality of it is usefulhttps://pastebin.com/w2SGh8P0
@johnpoz Sorry for the delay in getting this I swear i was trying the whole time. thanks for the patience.
-
-
-
@johnpoz got another one! though it does seem to be happening with a lot less frequency for some reason, i've just gotten better at catching them during the quick window of opportunity:
-
This one got a NXDOMAIN error:
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
This one got a NXDOMAIN error:
That is a URL not a hostname so it should fail. Remove the /apps/staff (as shown in the prior post).
Searching for "exceeded the maximum number of sends" looks like DNSSEC...:
https://community.ipfire.org/t/servfail-exceeded-the-maximum-number-of-sends/7645
https://www.reddit.com/r/pihole/comments/11hqrco/intermittent_servfail_when_using_unbound/this one talks about not using UDP for DNS...?!
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270824This one talks about torrenting and DNSSEC:
https://www.reddit.com/r/opnsense/comments/1cinuyn/unbound_dns_issues_freezes_randomly/ -
About :
This very issue (or whatever it is) has its own thread on NLnetLabs (the author of unbound) exceeded the maximum nameserver nxdomains.
One of the authors of unbound is also answering.
Some tips are present.Btw : this is DNS at its finest. I'll take this one home tonight, need to read it again.
Latest posts in that thread are just hours ago.
Here you go :
server: qname-minimisation: no aggressive-nsec: no infra-keep-probing: yes infra-cache-max-rtt: 2000 infra-host-ttl: 0 outbound-msg-retry: 32 max-sent-count: 128
dono what the impact will be ....
I've never seen this "exceeded the maximum nameserver nxdomains" message myself. -
@Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
Here you go :
is the suggestion to throw this into the custom options section of the dns resolver? I'll check out all the links, was just looking to confirm the suggestion you had forwarded on...
-
Exact.
Like this :
-
@Gertjan awesome, thanks for clarification. It's been added. I'll read up on all these threads while I wait for it to fail...
-
Just to link that other thread, which two of us linked above, to this one:
https://forum.netgate.com/topic/188297/sporadic-dns-issues-cryptic-error-in-logs/ -
@SteveITS said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
this one talks about not using UDP for DNS...?!
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270824reading through these now, this one caught my eye because one of the only packages i have installed is UDP Broadcast Relay (in order to forward across the VLAN's I set up). Though i'll be honest, I really don't know much about UDP/TCP and not sure if this is the same ballpark as the Broadcast Relay. I can try the tcp-upstream: yes option after I feel confident the last iteration of changes didn't solve the issue.
-
@SteveITS said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
https://www.reddit.com/r/opnsense/comments/1cinuyn/unbound_dns_issues_freezes_randomly/
This is another interesting theory, but I searched my log and I don't see anything referencing a tracker. Though I did just notice my enphase solar controller also just got a bunch of servfails...
-
@SteveITS said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
https://community.ipfire.org/t/servfail-exceeded-the-maximum-number-of-sends/7645
just to dovetail all my thoughts on reading these links: this one seems very promising but doesn't have the total solution included. Though it also claims that DNSSEC was the culprit and I'm quite certain I was still experiencing the issue with DNSSEC disabled....
-
Just to stay on top of things: I've yet to experience an outage like i'm used to (DNS_PROBE_FINISHED_NXDOMAIN). I was out of the house so its atypical but not unheard of (the length of time that is). However last night I tried to open an ebay link and the app wasn't able to bring up the item. It was an odd error in the app, but the really odd thing was that I was able to load webpages in chrome. Just documenting my journey, i'll be sure to grab any NXDOMAIN issues i catch in the coming days...
-
@Gertjan just noticed an interesting thing: all 5 of my Nest Protects (Google Fire Alarms) are reporting not being connected to the internet (WiFi issue). It was last checked about 18 hours ago (a few hours after I made the changes, so maybe it checked a few times before it stopped retrying). The History showed that they have been connected and without error for as long as the History goes back so seems likely connected to these specific changes....But also no outages yet besides that weird ebay one that was not the same as the usual.
-
@RickyBaker if I'm not mistaken this is WAY more servfail's that i was previously experiencing:
-
@RickyBaker 3 instances of squeakydoor.nest.com servfail and 543 instances (probably a third as many fails) of time.google.com or some derivation of. I was not seeing this before the latest round of changes...
-
@RickyBaker The images above were lost due to the forum error...do you have forwarding enabled? This:
-
@RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:
if I'm not mistaken this is WAY ...
Scrap all the lines that terminate with "localdomain"
a3ayogyffhzcp1.iot.us-east-1.amazonaws.com.localdomain
as that domain doesn't exist - so that's a fail for sure.
The DNS request, coming from a LAN device, was wrong.
It should have asked fora3ayogyffhzcp1.iot.us-east-1.amazonaws.com.
You saw the terminating dot ? That means that .com. is the TLD. Without the terminating dot unbound starts by adding it's own local domain first, which will fail.
But there are a bunch of IOT in place here, and these aren't known for their nice DNS requesting.
Coupled with the huge forest of DNS domain servers, domain name server, as we have to deal with "amazonaws" here .... (world's best organized DNS mess ever).
To make things even worse : the Time To Live (TTL) is set to 60 seconds. So a request A record has to be request again within 60 seconds.
Yeah, things can get messy quick. "Lets add another IOT" ^^Btw : this nicely looking host name actually resolves :
[24.03-RELEASE][root@pfSense.bhf.tld]/root: dig a3ayogyffhzcp1.iot.us-east-1.amazonaws.com +short 54.209.119.230 52.71.151.159 54.164.100.117 52.87.91.214 54.162.199.177 52.70.244.97 54.147.162.149 52.4.223.197