DNS Forwarder - Excessive (20x) latency vs external server
-
Hi everyone,
This is my first time playing with the DNS Forwarder, and it's not going well. As it is right now, I'm getting 2-3x the delay when querying pfSense vs requesting a record from an external server (8.8.4.4, 4.2.2.2, etc.).
Currently the FW's DNS servers are configured as:
127.0.0.1
8.8.4.4
4.2.2.2
Querying it repeatedly for the same host does not improve the response time:
While using pfsense:
$ while [ true ]; do time nslookup google.com | grep real; sleep 1; done
real 0m1.047s
user 0m0.000s
sys 0m0.008s
real 0m1.057s
user 0m0.004s
sys 0m0.004s
real 0m1.046s
user 0m0.004s
sys 0m0.004s
real 0m1.053s
user 0m0.000s
sys 0m0.008s
While directly going to 8.8.4.4:
$ while [ true ]; do time nslookup google.com | grep real; sleep 1; done
real 0m0.052s
user 0m0.000s
sys 0m0.008s
real 0m0.048s
user 0m0.008s
sys 0m0.000s
real 0m0.053s
user 0m0.000s
sys 0m0.008s
real 0m0.049s
user 0m0.004s
sys 0m0.004s
What am I doing wrong?
Thanks!
-
That's not a valid means of testing DNS lookup times. Check the response time in dig's output.
-
No offense, but I think it's perfectly valid. I originally used dig, and it alone does not accurately display the latency while using pfsense as a DNS server/forwarder. The times between nslookup/dig are identical btw:
While using:
$ while [ true ]; do time dig google.com | grep Query; sleep 1; done
;; Query time: 38 msec
real 0m1.048s
user 0m0.004s
sys 0m0.004s
;; Query time: 45 msec
real 0m1.055s
user 0m0.000s
sys 0m0.008s
;; Query time: 38 msec
real 0m1.047s
user 0m0.008s
sys 0m0.000s
;; Query time: 35 msec
real 0m1.044s
user 0m0.004s
sys 0m0.004s
Without:
$ while [ true ]; do time dig google.com | grep Query; sleep 1; done
;; Query time: 34 msec
real 0m0.044s
user 0m0.000s
sys 0m0.008s
;; Query time: 40 msec
real 0m0.046s
user 0m0.000s
sys 0m0.004s
;; Query time: 37 msec
real 0m0.046s
user 0m0.000s
sys 0m0.008s
;; Query time: 36 msec
real 0m0.046s
user 0m0.000s
sys 0m0.008s
;; Query time: 38 msec
real 0m0.047s
user 0m0.000s
sys 0m0.008s
I'm looking at a 20x increase in latency while using the FW.
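The two numbers being compared here measure different things: dig's "Query time" covers only the query that was answered, while `time` reports how long the whole command took. A minimal sketch for printing both side by side is below. The helper names `query_ms` and `compare_times` are made up for illustration, and the nanosecond timestamp assumes GNU `date` (the `%N` format is not available in every `date` implementation).

```shell
# Extract the millisecond value from dig's ";; Query time: N msec" line (read from stdin).
query_ms() {
    awk '/^;; Query time:/ { print $4 }'
}

# Run dig once and print the query time dig reports next to the elapsed wall-clock time.
# Hypothetical helper; assumes GNU date, since %N (nanoseconds) is a GNU extension.
compare_times() {
    start=$(date +%s%N)
    reported=$(dig "$@" | query_ms)
    end=$(date +%s%N)
    echo "reported=${reported}ms wall=$(( (end - start) / 1000000 ))ms"
}

# Example (not run here): compare_times google.com @8.8.4.4
```

In the output above, the reported value is around 40 ms while the wall-clock value is around 1 s, so the extra second is spent somewhere other than in the answered query itself.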
-
I think I should also correct the Post subject/issue I'm having.
It does seem that the device is caching (reduced query times), however I'm experiencing severe latency while doing so.
Where should I be looking? I'm not seeing any CPU spikes or NIC drops when I try this. The CPU is 99-100% idle.
Proof it is indeed caching:
$ while [ true ]; do time dig pfsense.org | grep Query; sleep 1; done
;; Query time: 86 msec
real 0m1.095s
user 0m0.000s
sys 0m0.008s
;; Query time: 40 msec
real 0m1.050s
user 0m0.008s
sys 0m0.000s
;; Query time: 52 msec
real 0m1.062s
user 0m0.004s
sys 0m0.004s
;; Query time: 36 msec
real 0m1.046s
user 0m0.000s
sys 0m0.008s
Edit: Although the Query times still aren't any faster than external servers… physics says it should be.
-
That isn't valid because the time nslookup takes to run doesn't necessarily have any exact relation to how fast the DNS server responds.
This is along the lines of what's typical to see: the first query depends on how fast your configured DNS servers respond, and subsequent ones within the TTL are answered from cache.
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
;; Query time: 70 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
;; Query time: 28 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
;; Query time: 1 msec
[cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
;; Query time: 1 msec
-
You clearly have something wrong - is that 1m I am seeing in your post?
Here is from my linux box to my pfsense box (dns forwarder)
@ubuntu:~$ while [ true ]; do time nslookup google.com | grep real; sleep 1; done
real 0m0.669s
user 0m0.036s
sys 0m0.112s
real 0m0.021s
user 0m0.004s
sys 0m0.012s
real 0m0.021s
user 0m0.012s
sys 0m0.004s
real 0m0.020s
user 0m0.004s
sys 0m0.008s
-
You clearly have something wrong - is that 1m I am seeing in your post?
Here is from my linux box to my pfsense box (dns forwarder)
@ubuntu:~$ while [ true ]; do time nslookup google.com | grep real; sleep 1; done
real 0m0.669s
user 0m0.036s
sys 0m0.112s
real 0m0.021s
user 0m0.004s
sys 0m0.012s
real 0m0.021s
user 0m0.012s
sys 0m0.004s
real 0m0.020s
user 0m0.004s
sys 0m0.008s
Yeah, I definitely have something wrong. I'm trying to figure out what to debug/look at, but I'm lost. In my post, you're seeing 1.05~ seconds to complete the process vs 0.05~ seconds when I'm not using the forwarder. While 1.05 seconds isn't a 'ton', it's extremely noticeable when browsing the web.
As I said earlier, it's clearly not a caching problem, there's something causing the gigantic delay for the FW to process my request.
What should I be looking at?
-
Just so we're clear, the caching mechanism is working:
dig pfsense.org | grep Query
;; Query time: 39 msec
dig pfsense.org | grep Query
;; Query time: 0 msec
dig eff.org | grep Query
;; Query time: 37 msec
dig eff.org | grep Query
;; Query time: 0 msec
-
How many domains in the search list?
Use Wireshark to see what is really going on with the DNS queries.
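For the search-list question, a quick way to see what the client is configured with is to read its resolver file. The sketch below assumes a Linux-style client where that file is `/etc/resolv.conf`; the function names `search_domains` and `nameservers` are made up for illustration. Note that nslookup applies the search list by default, while dig ignores it unless `+search` is given.

```shell
# Print the search domains from a resolv.conf-style file, one per line.
# Defaults to /etc/resolv.conf; if several "search" lines exist, the last one is used.
search_domains() {
    awk '$1 == "search" { n = 0; for (i = 2; i <= NF; i++) d[++n] = $i }
         END { for (i = 1; i <= n; i++) print d[i] }' "${1:-/etc/resolv.conf}"
}

# Print the configured nameservers in the order the resolver will try them.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example (not run here): search_domains | wc -l; nameservers
```

If the first nameserver listed is not the pfSense box, the client may not be querying the forwarder first at all, which is worth ruling out before capturing packets.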
-
How many domains in the search list?
Use Wireshark to see what is really going on with the DNS queries.
I was only using localhost + 2 domains (8.8.4.4 and 4.2.2.2). When I was watching via tcpdump, it looked like the box queried both DNS servers simultaneously. Is that normal? I was under the assumption it would only use server #3 if no response was heard from server #2 within 'x' time.
Another strange thing, the ridiculous latency is now gone, but I haven't changed anything yet… Has anyone experienced something like this before?
Unrelated questions:
- Is there any way I can view what's in the cache?
- What's the default number of entries for the cache? It doesn't seem to maintain a very large amount... What's the 'supported' way of increasing the size?
Thanks again,
-
Yes, pfSense queries all DNS servers simultaneously and uses the first response. That is normal for pfSense. That is my understanding anyway.
Look at the query response in tcpdump. The TTL will be in there. That’s how long the record should remain available in cache, assuming it’s not purged for some other reason. I don’t know what size limitation pfSense may have, but it’s probably more than you’ll run into.
P.S. Modified my pfSense forwarder to query the DNS servers sequentially. My primary DNS server responds quickest nearly every time anyway. So the additional queries don't really add any benefit, except when primary DNS server is down query responses will be slower. But that is rare.
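On the TTL point: it can also be read from dig's answer section without a packet capture. This is a minimal sketch; `answer_ttl` is a made-up helper name, and it expects the output of `dig +noall +answer <name>` on stdin. When the forwarder is answering from cache, repeated queries typically show the TTL counting down rather than resetting.

```shell
# Print "name ttl" for each record in dig's answer section (read from stdin).
# Expects lines such as: pfsense.org.  300  IN  A  192.0.2.10
# Comment lines starting with ";" and short lines are skipped.
answer_ttl() {
    awk 'NF >= 5 && $1 !~ /^;/ { print $1, $2 }'
}

# Example (not run here): dig +noall +answer pfsense.org | answer_ttl
```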
-
P.S. Modified my pfSense forwarder to query the DNS servers sequentially. My primary DNS server responds quickest nearly every time anyway. So the additional queries don't really add any benefit, except when primary DNS server is down query responses will be slower. But that is rare.
I wouldn't do that in most circumstances, you'll have much more consistent performance with the defaults, and it's not like doubling, tripling or quadrupling your DNS requests has any notable impact on bandwidth or anything else.
-
Unrelated questions:
- Is there any way I can view what's in the cache?
- What's the default number of entries for the cache? It doesn't seem to maintain a very large amount… What's the 'supported' way of increasing the size?
http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
and related searches for dnsmasq will find your answers.
-
@cmb:
Unrelated questions:
- Is there any way I can view what's in the cache?
- What's the default number of entries for the cache? It doesn't seem to maintain a very large amount… What's the 'supported' way of increasing the size?
http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
and related searches for dnsmasq will find your answers.
Thanks. I've found the answer to #2, however I haven't been able to find a way to view the entries in the cache without either:
a) using debug mode
or
b) killing the process
Is there a way to do it while it is running normally without killing it?
-
@cmb:
P.S. Modified my pfSense forwarder to query the DNS servers sequentially. My primary DNS server responds quickest nearly every time anyway. So the additional queries don't really add any benefit, except when primary DNS server is down query responses will be slower. But that is rare.
I wouldn't do that in most circumstances, you'll have much more consistent performance with the defaults, and it's not like doubling, tripling or quadrupling your DNS requests has any notable impact on bandwidth or anything else.
Not suggesting that you or anyone else should do this. Just pointing out that it can be done because the OP asked about pfSense simultaneous DNS queries behavior.
As mentioned previously, since my primary DNS server is the first to respond nearly 100 percent of the time the main benefit of the others is if/when the primary goes down, which is rare. I’ll stick with sequential queries. Don’t consider mine to be "most circumstances".
But let’s not hijack this thread. I’ve posted details for doing this in another thread where it can be discussed on topic.