DNS Forwarder - Excessive (20x) latency vs external server



  • Hi everyone,

    This is my first time using the DNS Forwarder, and it's not going well. Right now, a lookup through pfSense takes about 20x as long as requesting the same record directly from an external server (8.8.4.4, 4.2.2.2, etc.).

    Currently the FW's DNS servers are configured as:
    127.0.0.1
    8.8.4.4
    4.2.2.2

    Querying it repeatedly for the same host does not improve the response time:
    While using pfsense:

    $ while [ true ]; do time nslookup google.com | grep real; sleep 1; done
    
    real	0m1.047s
    user	0m0.000s
    sys	0m0.008s
    
    real	0m1.057s
    user	0m0.004s
    sys	0m0.004s
    
    real	0m1.046s
    user	0m0.004s
    sys	0m0.004s
    
    real	0m1.053s
    user	0m0.000s
    sys	0m0.008s
    
    

    While directly going to 8.8.4.4:

    $ while [ true ]; do time nslookup google.com | grep real; sleep 1; done
    
    real	0m0.052s
    user	0m0.000s
    sys	0m0.008s
    
    real	0m0.048s
    user	0m0.008s
    sys	0m0.000s
    
    real	0m0.053s
    user	0m0.000s
    sys	0m0.008s
    
    real	0m0.049s
    user	0m0.004s
    sys	0m0.004s
    

    What am I doing wrong?

    Thanks!



  • That's not a valid means of testing DNS lookup times. Check the response time in dig's output.
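
A minimal sketch of that suggestion, assuming dig's usual ";; Query time: N msec" statistics line. The function name `query_time_ms` is made up for illustration, and 192.168.1.1 in the usage comment is a placeholder for the pfSense LAN address.

```shell
# Print the server-reported latency in milliseconds from dig output on stdin.
# Relies on the ";; Query time: N msec" line in dig's statistics section.
# Hypothetical usage:  dig google.com @192.168.1.1 | query_time_ms
query_time_ms() {
  awk '/^;; Query time:/ { print $4; exit }'
}
```

This reports only the time dig spent waiting for the answer, not the total run time of the command.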



  • No offense, but I think it's perfectly valid. I originally used dig, and its Query time alone does not reflect the latency I see while using pfSense as a DNS server/forwarder. The wall-clock times for nslookup and dig are identical, btw:

    While using pfSense:

    $ while [ true ]; do time dig google.com | grep Query; sleep 1; done
    ;; Query time: 38 msec
    
    real	0m1.048s
    user	0m0.004s
    sys	0m0.004s
    ;; Query time: 45 msec
    
    real	0m1.055s
    user	0m0.000s
    sys	0m0.008s
    ;; Query time: 38 msec
    
    real	0m1.047s
    user	0m0.008s
    sys	0m0.000s
    ;; Query time: 35 msec
    
    real	0m1.044s
    user	0m0.004s
    sys	0m0.004s
    
    

    Without pfSense:

    $ while [ true ]; do time dig google.com | grep Query; sleep 1; done
    ;; Query time: 34 msec
    
    real	0m0.044s
    user	0m0.000s
    sys	0m0.008s
    ;; Query time: 40 msec
    
    real	0m0.046s
    user	0m0.000s
    sys	0m0.004s
    ;; Query time: 37 msec
    
    real	0m0.046s
    user	0m0.000s
    sys	0m0.008s
    ;; Query time: 36 msec
    
    real	0m0.046s
    user	0m0.000s
    sys	0m0.008s
    ;; Query time: 38 msec
    
    real	0m0.047s
    user	0m0.000s
    sys	0m0.008s
    

    I'm looking at a 20x increase in latency while using the FW.



  • I think I should also correct the Post subject/issue I'm having.

    It does seem that the device is caching (reduced query times). However, I'm experiencing severe latency while it does so.

    Where should I be looking? I'm not seeing any CPU spikes, NIC drops, etc. when I try this. The CPU stays 99-100% idle.

    Proof it is indeed caching:
    $ while [ true ]; do time dig pfsense.org | grep Query; sleep 1; done
    ;; Query time: 86 msec

    real 0m1.095s
    user 0m0.000s
    sys 0m0.008s
    ;; Query time: 40 msec

    real 0m1.050s
    user 0m0.008s
    sys 0m0.000s
    ;; Query time: 52 msec

    real 0m1.062s
    user 0m0.004s
    sys 0m0.004s
    ;; Query time: 36 msec

    real 0m1.046s
    user 0m0.000s
    sys 0m0.008s

    Edit: The Query times still aren't any faster than with external servers, though… physics says they should be.



  • That isn't valid because the time nslookup takes to run doesn't necessarily have any exact relation to how fast the DNS server responds.

    This is along the lines of what's typical to see: the first query depends on how fast your configured DNS servers respond, and subsequent ones within the TTL are answered from cache.

    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
    ;; Query time: 70 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
    ;; Query time: 1 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
    ;; Query time: 1 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
    ;; Query time: 1 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.5|grep Query
    ;; Query time: 1 msec

    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
    ;; Query time: 28 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
    ;; Query time: 1 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
    ;; Query time: 1 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
    ;; Query time: 1 msec
    [cmb@fbsd83 ~]$ dig forum.pfsense.org @10.x.x.1|grep Query
    ;; Query time: 1 msec
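
The gap between a command's run time and the server's response time can be worked out from the numbers posted earlier in the thread. This is only an arithmetic sketch; the helper name `unexplained_ms` is hypothetical, and it says nothing about where the extra time goes.

```shell
# Print how many milliseconds of a command's wall-clock time are NOT covered
# by the DNS server's own answer time.
#   $1 = "real" time in seconds as printed by `time` (e.g. 1.048)
#   $2 = dig's reported Query time in milliseconds (e.g. 38)
unexplained_ms() {
  awk -v wall="$1" -v query="$2" 'BEGIN { printf "%d\n", wall * 1000 - query + 0.5 }'
}

unexplained_ms 1.048 38   # via pfSense: prints 1010
unexplained_ms 0.044 34   # direct:      prints 10
```

With the posted figures, about one second per lookup is spent outside the answered query when going through pfSense, versus about 10 ms when querying the external server directly.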


  • You clearly have something wrong. Is that 1m I am seeing in your post?

    Here is from my linux box to my pfsense box (dns forwarder)

    @ubuntu:~$ while [ true ]; do time nslookup google.com | grep real; sleep 1; done

    real    0m0.669s
    user    0m0.036s
    sys    0m0.112s

    real    0m0.021s
    user    0m0.004s
    sys    0m0.012s

    real    0m0.021s
    user    0m0.012s
    sys    0m0.004s

    real    0m0.020s
    user    0m0.004s
    sys    0m0.008s



  • @johnpoz:

    You clearly have something wrong - is that 1m I am seeing in your post?

    Here is from my linux box to my pfsense box (dns forwarder)

    @ubuntu:~$ while [ true ]; do time nslookup google.com | grep real; sleep 1; done

    real    0m0.669s
    user    0m0.036s
    sys     0m0.112s

    real    0m0.021s
    user    0m0.004s
    sys     0m0.012s

    real    0m0.021s
    user    0m0.012s
    sys     0m0.004s

    real    0m0.020s
    user    0m0.004s
    sys     0m0.008s

    Yeah, I definitely have something wrong. I'm trying to figure out what to debug or look at, but I'm lost. In my post, you're seeing about 1.05 seconds to complete the process vs about 0.05 seconds when I'm not using the forwarder. While 1.05 seconds isn't a 'ton', it's extremely noticeable when browsing the web.

    As I said earlier, it's clearly not a caching problem. Something is causing a gigantic delay when the FW processes my request.

    What should I be looking at?



  • Just so we're clear, the caching mechanism is working:

    dig pfsense.org | grep Query
    ;; Query time: 39 msec
    dig pfsense.org | grep Query
    ;; Query time: 0 msec
    dig eff.org | grep Query
    ;; Query time: 37 msec
    dig eff.org | grep Query
    ;; Query time: 0 msec



  • How many domains in the search list?

    Use Wireshark to see what is really going on with the DNS queries.
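
One quick way to answer the search-list question on a Linux or BSD client, assuming the list lives on a `search` line in /etc/resolv.conf. The helper name `search_domain_count` is made up for this sketch.

```shell
# Count the domains on the last "search" line of a resolv.conf-style file.
# Defaults to /etc/resolv.conf; pass another path as $1. Prints 0 if the
# file has no "search" line.
search_domain_count() {
  awk '$1 == "search" { n = NF - 1 } END { print n + 0 }' "${1:-/etc/resolv.conf}"
}
```

Running `search_domain_count` with no argument checks the client's own resolver configuration. It does not cover settings pushed by other mechanisms.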



  • @NOYB:

    How many domains in the search list?

    Use Wireshark to see what is really going on with the DNS queries.

    I was only using localhost plus 2 DNS servers (8.8.4.4 and 4.2.2.2). When I was watching via tcpdump, it looked like the box queried both DNS servers simultaneously. Is that normal? I was under the assumption it would only use server #3 if no response was heard from server #2 within 'x' time.

    Another strange thing, the ridiculous latency is now gone, but I haven't changed anything yet…  Has anyone experienced something like this before?

    Unrelated questions:

    1. Is there any way I can view what's in the cache?
    2. What's the default number of entries for the cache? It doesn't seem to maintain a very large amount... What's the 'supported' way of increasing the size?

    Thanks again,



  • Yes, pfSense queries all DNS servers simultaneously and uses the first response.  That is normal for pfSense.  That is my understanding anyway.

    Look at the query response in tcpdump. It includes the TTL, which is how long the record should remain available in cache, assuming it's not purged for some other reason. I don't know what size limitation pfSense may have, but it's probably more than you'll run into.

    P.S. Modified my pfSense forwarder to query the DNS servers sequentially.  My primary DNS server responds quickest nearly every time anyway.  So the additional queries don't really add any benefit, except when primary DNS server is down query responses will be slower.  But that is rare.
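
The TTL mentioned above can also be read from dig's answer section rather than a packet capture. A small sketch, assuming answer lines shaped like "name. TTL IN TYPE data" as printed by `dig +noall +answer`; the function name `answer_ttls` is hypothetical, and 192.168.1.1 in the usage comment is a placeholder for the forwarder's address.

```shell
# Print the TTL (in seconds) of each record in dig's answer section, read
# from stdin. Lines that do not look like "name TTL IN ..." are ignored.
# Hypothetical usage:  dig +noall +answer pfsense.org @192.168.1.1 | answer_ttls
answer_ttls() {
  awk '$3 == "IN" && $2 ~ /^[0-9]+$/ { print $2 }'
}
```

When the same name is queried repeatedly against a caching forwarder, the TTL it reports should count down between queries until the entry expires.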



  • @NOYB:

    P.S. Modified my pfSense forwarder to query the DNS servers sequentially.  My primary DNS server responds quickest nearly every time anyway.  So the additional queries don't really add any benefit, except when primary DNS server is down query responses will be slower.  But that is rare.

    I wouldn't do that in most circumstances. You'll have much more consistent performance with the defaults, and it's not like doubling, tripling or quadrupling your DNS requests has any notable impact on bandwidth or anything else.



  • @NetworkNubbin:

    Unrelated questions:

    1. Is there any way I can view what's in the cache?
    2. What's the default number of entries for the cache? It doesn't seem to maintain a very large amount… What's the 'supported' way of increasing the size?

    http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
    and related searches for dnsmasq will find your answers.



  • @cmb:

    @NetworkNubbin:

    Unrelated questions:

    1. Is there any way I can view what's in the cache?
    2. What's the default number of entries for the cache? It doesn't seem to maintain a very large amount… What's the 'supported' way of increasing the size?

    http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
    and related searches for dnsmasq will find your answers.

    Thanks. I've found the answer to #2, however I haven't been able to find a way to view the entries in the cache without either:
    a) using debug mode
    or
    b) killing the process

    Is there a way to do it while it is running normally without killing it?



  • @cmb:

    @NOYB:

    P.S. Modified my pfSense forwarder to query the DNS servers sequentially.  My primary DNS server responds quickest nearly every time anyway.  So the additional queries don't really add any benefit, except when primary DNS server is down query responses will be slower.  But that is rare.

    I wouldn't do that in most circumstances. You'll have much more consistent performance with the defaults, and it's not like doubling, tripling or quadrupling your DNS requests has any notable impact on bandwidth or anything else.

    Not suggesting that you or anyone else should do this.  Just pointing out that it can be done because the OP asked about pfSense simultaneous DNS queries behavior.

    As mentioned previously, since my primary DNS server is the first to respond nearly 100 percent of the time the main benefit of the others is if/when the primary goes down, which is rare.  I’ll stick with sequential queries.  Don’t consider mine to be "most circumstances".

    But let’s not hijack this thread.  I’ve posted details for doing this in another thread where it can be discussed on topic.

