Diagnostics / DNS Lookup, using Resolver, always shows Query time 0 msec
-
Ok, so very interesting. I decided to disable the DNS Resolver by going to Services > DNS Resolver and unchecking "Enable DNS Resolver", and I instead went over to DNS Forwarder and enabled that. I again checked "Register DHCP leases" and "Register DHCP static mappings", and nothing else, and hit save.
Now, when I go to Diagnostics > DNS Lookup and look up any domain, I see results for 127.0.0.1. Google.com for example gives me 11 msec, and Google DNS and OpenDNS give me 12 and 13 msec respectively (those are the 2 DNS servers I have configured for forwarding).
So now I'm curious on a couple of things:
•If I click "Lookup" a few times on the DNS Lookup page for Google.com, 127.0.0.1 shows me different results each time. If I'm understanding correctly though, both the DNS Forwarder and the DNS Resolver should cache. But instead, I never see 127.0.0.1 with a response of 0 msec. It's always somewhere between 10 and 15 msec, telling me it's checking the forwarder every time. Last I checked Google A records didn't have a TTL of half a second. What's happening here?•I switched back to the DNS Resolver and went back to Diagnostics > DNS Lookup, and now when I look something up, 127.0.0.1 doesn't even appear in the list. I see timings for the Google and OpenDNS servers which are specified in General setup, but I don't see 127.0.0.1 like I did before. Does this make any sense? Any setting that would have turned that off from being viewable here?
EDIT: I went over to System > General Setup and checked "Disable DNS Forwarder" to not use DNS Forwarder/Resolver for the firewall, saved, and then unchecked it, and saved, and now 127.0.0.1 is back again, but still showing 0 msec for every lookup on DNS Lookup diagnostics page. It's either a bug for only the DNS Resolver (works fine with Dnsmasq), or I need to reinstall.
EDIT 2: Ok, and even weirder, if I use the DNS Resolver, enable forwarding, and stop and start the resolver, and head over to the DNS Lookup diagnostics page, I can search something like apple.com and I see 127.0.0.1 with a result of 11 msec or so, and every other search after that is 0 msec (as in, not just for Apple.com, but ANY domain searched after the very first lookup).
-
I wouldn't call myself an pfSense expert. Just a techie that's getting comfortable with the software and I've spent a majority of my time working with the resolver. So, I'd only be guessing at this point. I agree your results seem inconsistent, and you seem concerned. (Have things got worse, or can you get it back to where it was in the beginning?).
One time, I had made too many changes to truly isolate an issue. So, I started from scratch, which allowed me to isolate the issue. I then restored my original configuration (and packages), now knowing the source of my issue. Maybe it has to do with the firmware of your new hardware? I don't think I'll be much help at this point, aside from reiterating that usually the key to troubleshooting an issue like this is to isolate it from other variables. (Oh, also, I've found making multiple changes to pfSense w/o a reboot can result in inconstant results)
-
You have to remove all the other variables... What else on your network might be doing queries for what your doing queries for.
Just up the logging level and it will show you exactly the query and the answers.. Or just sniff for the dns traffic on your wan, etc.
Did you get an answer? If so is that answer wrong in some way? What does it matter if it shows zero or not? Maybe its a glitch in the gui, the browser? Clearly its not possible for the dns query to happen in 0msec if there was an actual query that had to be resolved.. If if just asking the authoritative NS directly since its already cached.
The time it takes to resolve something be it cached or not has little to do with anything - unless its not resolving.. Or the info your getting back is wrong in some way?
Here lets take the gui out of the equation and use dig
So here is query for www.cisco.com
$ dig @192.168.9.253 www.cisco.com ; <<>> DiG 9.12.3-P1 <<>> @192.168.9.253 www.cisco.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31861 ;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.cisco.com. IN A ;; ANSWER SECTION: www.cisco.com. 3600 IN CNAME www.cisco.com.akadns.net. www.cisco.com.akadns.net. 300 IN CNAME wwwds.cisco.com.edgekey.net. wwwds.cisco.com.edgekey.net. 21600 IN CNAME wwwds.cisco.com.edgekey.net.globalredir.akadns.net. wwwds.cisco.com.edgekey.net.globalredir.akadns.net. 3600 IN CNAME e2867.dsca.akamaiedge.net. e2867.dsca.akamaiedge.net. 20 IN A 23.34.70.245 ;; Query time: 570 msec ;; SERVER: 192.168.9.253#53(192.168.9.253) ;; WHEN: Fri Jan 04 05:19:11 Central Standard Time 2019 ;; MSG SIZE rcvd: 224
You can see took a while to get back answer... So here is 2nd query for it - clearly cached as you see the TTLs are not full
$ dig @192.168.9.253 www.cisco.com ; <<>> DiG 9.12.3-P1 <<>> @192.168.9.253 www.cisco.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56829 ;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.cisco.com. IN A ;; ANSWER SECTION: www.cisco.com. 3590 IN CNAME www.cisco.com.akadns.net. www.cisco.com.akadns.net. 290 IN CNAME wwwds.cisco.com.edgekey.net. wwwds.cisco.com.edgekey.net. 21590 IN CNAME wwwds.cisco.com.edgekey.net.globalredir.akadns.net. wwwds.cisco.com.edgekey.net.globalredir.akadns.net. 3590 IN CNAME e2867.dsca.akamaiedge.net. e2867.dsca.akamaiedge.net. 10 IN A 23.34.70.245 ;; Query time: 0 msec ;; SERVER: 192.168.9.253#53(192.168.9.253) ;; WHEN: Fri Jan 04 05:19:21 Central Standard Time 2019 ;; MSG SIZE rcvd: 224
So now I flush the cache on unbound
[2.4.4-RELEASE][root@sg4860.local.lan]/root: unbound-control -c /var/unbound/unbound.conf flush_zone cisco.com ok removed 8 rrsets, 0 messages and 0 key entries [2.4.4-RELEASE][root@sg4860.local.lan]/root:
Now do the query again.
$ dig @192.168.9.253 www.cisco.com ; <<>> DiG 9.12.3-P1 <<>> @192.168.9.253 www.cisco.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1101 ;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.cisco.com. IN A ;; ANSWER SECTION: www.cisco.com. 3600 IN CNAME www.cisco.com.akadns.net. www.cisco.com.akadns.net. 284 IN CNAME wwwds.cisco.com.edgekey.net. wwwds.cisco.com.edgekey.net. 21585 IN CNAME wwwds.cisco.com.edgekey.net.globalredir.akadns.net. wwwds.cisco.com.edgekey.net.globalredir.akadns.net. 3585 IN CNAME e2867.dsca.akamaiedge.net. e2867.dsca.akamaiedge.net. 5 IN A 23.34.70.245 ;; Query time: 93 msec ;; SERVER: 192.168.9.253#53(192.168.9.253) ;; WHEN: Fri Jan 04 05:25:45 Central Standard Time 2019 ;; MSG SIZE rcvd: 224
Do you have unbound set to answer 0 ttl? Do you have it set to prefetch? You do understand that can mess up even dumping cache right?
You sure browser just not returning cache for you, etc.
Its clearly impossible for response time to be 0 or less than 1ms if a query was actually made and not returned from cache.
BTW.. my query is direct to 192.168.9.253 (pfsense lan IP) because my PC using point to my pihole as default dns and that also caches, etc. etc.
You prob want to test with records that are not so convoluted.. Something a bit more direct without cnames and other CDNs and NS envolved.. lets do it with something simple like www.pfsense.org Notice how above the TTLs of some of those records in the cname paths from the other zone which were cached and not flushed by my command.
$ dig @192.168.9.253 www.pfsense.org ; <<>> DiG 9.12.3-P1 <<>> @192.168.9.253 www.pfsense.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1648 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.pfsense.org. IN A ;; ANSWER SECTION: www.pfsense.org. 278 IN A 208.123.73.69 ;; Query time: 0 msec ;; SERVER: 192.168.9.253#53(192.168.9.253) ;; WHEN: Fri Jan 04 05:35:53 Central Standard Time 2019 ;; MSG SIZE rcvd: 60
Clearly that was from cache - see the TTL on the record is 278 seconds left... I then flushed it
[2.4.4-RELEASE][root@sg4860.local.lan]/root: unbound-control -c /var/unbound/unbound.conf flush_zone pfsense.org ok removed 4 rrsets, 3 messages and 0 key entries [2.4.4-RELEASE][root@sg4860.local.lan]/root:
Then did the query again and you can see it got the full 300 second TTL returned because it had to actually resolve it vs cache.
$ dig @192.168.9.253 www.pfsense.org ; <<>> DiG 9.12.3-P1 <<>> @192.168.9.253 www.pfsense.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36716 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.pfsense.org. IN A ;; ANSWER SECTION: www.pfsense.org. 300 IN A 208.123.73.69 ;; Query time: 69 msec ;; SERVER: 192.168.9.253#53(192.168.9.253) ;; WHEN: Fri Jan 04 05:36:01 Central Standard Time 2019 ;; MSG SIZE rcvd: 60
Lets go back to what exactly is your problem or what do you think is a problem? Or what exactly is the concern or question?
If anything I would like the gui to give back some more info like what the TTL of the record returned is.. From that you can normally tell if actually resolved or cached.. Unless you got really lucky and did a query for exactly when the TTL had common TTL time left on it from a longer TTL ;)
-
I've had this same exact problem for a while now. I originally discovered it because I was trying to troubleshoot slow queries. Then I was curious about why it always displayed 0 msec, but ultimately my solution was to ignore it :)
It doesn't appear to be impacting anything as far as I can tell. It's interesting that it's happening on a fresh install too.
https://forum.netgate.com/topic/135748/dns-query-time-always-0-msec -
@johnpoz said in Diagnostics / DNS Lookup, using Resolver, always shows Query time 0 msec:
You have to remove all the other variables... What else on your network might be doing queries for what your doing queries for.
Nothing else would be. This is for a home network, and I'm the only one home, some of these domains I've looked up are entirely guesses and literally a dictionary word followed by ".com", which I'm confident no application on my computer would be looking up in the background. If it was in a larger organization, I'd definitely understand that someone else could be performing those queries, but nope; just me.
Did you get an answer? If so is that answer wrong in some way? What does it matter if it shows zero or not? Maybe its a glitch in the gui, the browser?
Most likely it's just a glitch in the gui, and if that's the case, I'm more than happy to wait on it to be fixed or wait for the next release and see what happens after an upgrade or a fresh install of that release. Again, everything is working, so it's not a big deal, I was moreso just curious as to why, or if anyone else had come across it, which is looks like others have.
Here lets take the gui out of the equation and use dig
Yep, dig or other DNS response tools give me the timings I'd expect.
Do you have unbound set to answer 0 ttl?
Didn't even know that was possible. Is that an option in the GUI somewhere? All pfSense configurations have been done GUI side, with no manual editing of configuration files.
Do you have it set to prefetch?
I do have Prefetch DNS Key Support enabled, but not Prefetch Support. That wouldn't cause an issue would it? It's been enabled in past installations on different hardware a couple years ago and I would always see timings show correct on the DNS Lookup page.
Lets go back to what exactly is your problem or what do you think is a problem? Or what exactly is the concern or question?
As covered before, this is really just a curiosity thing and making sure I'm not doing something drastically wrong since I know this feature has worked perfectly fine in the past. Appreciate your time and your explanations though; it all makes perfect sense and is in line with what I'd expect. About the only thing I can think of at this stage is some exact combination of options I have selected for the DNS Resolver (Unbound) in the pfSense GUI is making it show 0 msec, or it's a bug altogether in the GUI regardless of what options I have selected.
If anything I would like the gui to give back some more info like what the TTL of the record returned is.. From that you can normally tell if actually resolved or cached.. Unless you got really lucky and did a query for exactly when the TTL had common TTL time left on it from a longer TTL ;)
Yeah I agree; this would be extremely handy.
@raffi_ said in Diagnostics / DNS Lookup, using Resolver, always shows Query time 0 msec:
I've had this same exact problem for a while now. I originally discovered it because I was trying to troubleshoot slow queries. Then I was curious about why it always displayed 0 msec, but ultimately my solution was to ignore it :)
It doesn't appear to be impacting anything as far as I can tell. It's interesting that it's happening on a fresh install too.
https://forum.netgate.com/topic/135748/dns-query-time-always-0-msecAh, thank you very much for posting this. That at least confirms my suspicions that it's not just me, which really, just saves me from pointlessly reinstalling thinking my install is broken (which I would have found surprising seeing how fresh everything is), which probably would have led to me still seeing a query time of 0 msec anyway. Thanks!
-
I have never really used that gui much... And never really looked any deeper into it showing zero.. for something that is NOT In the cache... But you would have to make sure you FLUSH the whole thing to be sure... But making up some random thing gives you what the SOA for the tld,
;; QUESTION SECTION: ;lksjfldsjfljslfjdsljdlfdsjlfdsjljfdslf.com. IN A ;; AUTHORITY SECTION: com. 900 IN SOA a.gtld-servers.net. nstld.verisign-grs.com. 1546634914 1800 900 604800 86400
So guess what is cached in that respect the SOA, the roots down to your TLD also cached, etc. etc.. So while we all KNOW it can not be zero.. Maybe the way the browser does the query and the response is giving unbound some time that is not counted to do the query, and then when code your gui is using asks again it comes back as 0..
Really need a DEV to take a look.. And make sure what your going to look up is NOT actually in the cache, etc..
Here flagging @jimp he might be able to shed some light to what exactly happens when the gui page is used to ask pfsense to lookup something.
When I need to troubleshoot anything to do with dns, I always just use dig... Only time I really use that gui is for examples for users to validate if pfsense can actually resolve or not.
As to answering Zero and Prefetch - yes both options that can be turned on, and can be very helpful.. I am not sure if on by default because I always make sure they are both on - unless testing something.. You should see a drastic increase in your cache hit % turning on the prefetch option.
-
@johnpoz said in Diagnostics / DNS Lookup, using Resolver, always shows Query time 0 msec:
As to answering Zero and Prefetch - yes both options that can be turned on, and can be very helpful.. I am not sure if on by default because I always make sure they are both on - unless testing something..
Interesting! If I may ask, why do you turn on Serve Expired? That seems like something we wouldn't want turned on if DNS records for a website changed but our pfSense installs were instead giving us an older expired value (I understand that 99% of the time, it'd be just fine since most DNS records don't change that frequently). How does pfSense know when to drop that expired value then or update it? Unless I'm misunderstanding what that option does.
-
@johnpoz Thanks for the tip on the advanced resolver options. I'll give them a try. They are both disabled by default, FYI.
-
Zero TTL allows to answer the query while relooking up the record after it answers... It hands the client the 0 ttl so new entry will be gotten next time client asks.
So in typical case where the IP has not changed and the TTL has just expired, client gets what it wants and can talk to the IP it needs to talk to.. If its no longer working - that would fail and the client would ask again anyway since the TTL was handed to it as ZERO.
By this time the resolver has gotten new IP with fresh ttl, etc.
Where it can help is those odd ball look ups where the NS for the domain are not the best.. So this allows the client to get a instant answer... Vs having to wait for possible extra ms for it to be resolved - just because it hasn't been a very looked up record and was not prefetched, etc.
Then next time the client asks it will get whatever the normal TTL on the record is.. It should help with some of the pesky domains that users might have complained they had to hit refresh in their browser, etc some of these domains NS are pretty shitty ;) So sometimes it can be delayed resolving directly.. It can help with those for sure... You would have to look at your stats to see if you have some NS with high long response times or time outs, etc.
-
@johnpoz Got it! In that case, I'll enable 'em both. Thanks!