Intermittent failure of DNS requests



  • Hi All,

    I deployed pfSense on my home network and am seeing a very strange intermittent DNS failure.

    I have some VLANs on the internal side; the WAN is currently connected to the public internet via a Fritzbox router. pfSense serves as the DNS forwarder for all internal clients using the default settings (local gateway IP) and uses my provider's upstream servers as its DNS source.
    Occasionally, when I request a web page or even run a ping, I do not get a proper DNS answer but a "no such domain" error. This happens without a noticeable delay, and when I hit reload or resend the command a few seconds later, I get a proper answer.

    The problem disappears when I point the clients at the upstream DNS or the Fritzbox, so I assume it's something within pfSense.

    Has anybody ever seen such behaviour? Is there a timeout I can adjust, or some other issue I am overlooking? As I am currently migrating, the firewall has "allow any" rules on all interfaces.

    Oli



  • Anybody? It is getting worse and worse and is now pretty much unusable. I switched my clients to the upstream DNS, as I get DNS failures on roughly 20% of the requests!

    Oliver



  • @oliwel:

    Can you please post an nslookup with the failing response?

    Add log-queries to your DNS Forwarder advanced options, look at the resolver log, and then paste the output of the corresponding nslookup.


  • Rebel Alliance Global Moderator

    So you get something like this?

    C:\>nslookup
    Default Server:  pfsense.local.lan
    Address:  192.168.1.253

    > sllssjf.lsjdlsjfsflskfdjsf.com
    Server:  pfsense.local.lan
    Address:  192.168.1.253

    *** pfsense.local.lan can't find sllssjf.lsjdlsjfsflskfdjsf.com: Non-existent domain

    But then you try again and it resolves? Could you give an example of an actual record you're querying? When you say it uses your provider's DNS: are you querying them directly, or is pfSense asking the router in front of it, which in turn asks your provider's DNS?

    Do you have pfSense set for sequential queries in the DNS forwarder settings?

    [checkbox] Query DNS servers sequentially
    If this option is set, pfSense DNS Forwarder (dnsmasq) will query the DNS servers sequentially in the order specified (System - General Setup - DNS Servers), rather than all at once in parallel.

    How many upstream DNS servers do you have set in pfSense? 1, 2, 4, etc. I point to my ISP's DNS along with a couple of other public ones. Just habit from when my ISP's DNS (Comcast) really used to blow chunks. They have made great strides over the years and now use anycast, etc. But asking a few public upstream servers and using the one that answers fastest only costs you a few bits per query, versus asking one at a time, waiting for a timeout, then asking the next one, and so on. It makes it more likely you will get a response rather than a timeout.
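    To make the difference between the two modes concrete, here is a minimal sketch in Python. It is only an illustration of "ask everyone and take the first answer" versus "ask one at a time", not how dnsmasq is actually implemented. The server names, delays, and the `fake_lookup` helper are all made up for the example; a real lookup function would send an actual DNS query.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical upstream servers: name -> (delay in seconds, answer or None).
FAKE_SERVERS = {
    "slow": (0.3, "answer-from-slow"),
    "fast": (0.01, "answer-from-fast"),
    "dead": (0.0, None),
}


def fake_lookup(server):
    """Stand-in for a real DNS query against one upstream server."""
    delay, answer = FAKE_SERVERS[server]
    time.sleep(delay)
    if answer is None:
        raise TimeoutError(server)
    return answer


def query_sequential(servers, lookup):
    """Ask the servers one at a time, in order; return the first answer."""
    for server in servers:
        try:
            return lookup(server)
        except Exception:
            continue  # this server failed, move on to the next one
    return None


def query_parallel(servers, lookup):
    """Ask all servers at once; return whichever answer arrives first."""
    if not servers:
        return None
    pool = ThreadPoolExecutor(max_workers=len(servers))
    try:
        futures = [pool.submit(lookup, server) for server in servers]
        for future in as_completed(futures):
            try:
                return future.result()
            except Exception:
                continue  # ignore failed servers, wait for the next reply
        return None
    finally:
        # Do not block on the slower lookups that are still running.
        pool.shutdown(wait=False)
```

    With the server order `["dead", "slow", "fast"]`, the sequential version has to get past the dead server and then wait for the slow one, while the parallel version returns as soon as the fast one replies.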



  • Thanks for your answers. @bryan.paradis: As I am new to pfSense and have not used dnsmasq before, can you please point me to some docs or give an example of how to enable logging?

    @johnpoz: This is exactly what happens. Here are the results from my Ubuntu workstation within 2 seconds: I got two failures and a result on the third try.

    
    oliwel@platin ~ $ dig www.bus-profi.de @10.16.6.1
    
    ; <<>> DiG 9.9.2-P1 <<>> www.bus-profi.de @10.16.6.1
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18937
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;www.bus-profi.de.		IN	A
    
    ;; Query time: 32 msec
    ;; SERVER: 10.16.6.1#53(10.16.6.1)
    ;; WHEN: Thu Feb 27 08:03:15 2014
    ;; MSG SIZE  rcvd: 34
    
    oliwel@platin ~ $ dig www.bus-profi.de @10.16.6.1
    
    ; <<>> DiG 9.9.2-P1 <<>> www.bus-profi.de @10.16.6.1
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23048
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;www.bus-profi.de.		IN	A
    
    ;; Query time: 31 msec
    ;; SERVER: 10.16.6.1#53(10.16.6.1)
    ;; WHEN: Thu Feb 27 08:08:12 2014
    ;; MSG SIZE  rcvd: 34
    
    oliwel@platin ~ $ dig www.bus-profi.de @10.16.6.1
    
    ; <<>> DiG 9.9.2-P1 <<>> www.bus-profi.de @10.16.6.1
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45941
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
    
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 512
    ;; QUESTION SECTION:
    ;www.bus-profi.de.		IN	A
    
    ;; ANSWER SECTION:
    www.bus-profi.de.	6902	IN	CNAME	bus-profi.de.
    bus-profi.de.		6902	IN	A	81.169.145.152
    
    ;; Query time: 28 msec
    ;; SERVER: 10.16.6.1#53(10.16.6.1)
    ;; WHEN: Thu Feb 27 08:08:13 2014
    ;; MSG SIZE  rcvd: 75
    
    

    The upstream servers in positions 1 and 2 are those of my provider, and 3 and 4 are the Google ones (8.8.8.8 and 8.8.4.4), using parallel query. I also dropped the provider servers and used only Google, but it didn't change anything. As said, the provider DNS works flawlessly when used directly from the clients.
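    Note that in the dig output above the two failing replies carry `status: NOERROR` with `ANSWER: 0`, not `NXDOMAIN`. When collecting many such outputs, a small helper can sort them into categories. This is only a rough sketch: the function name and category labels are invented here, and it relies on nothing more than the `status:` and `ANSWER:` fields that dig prints in its header.

```python
import re


def classify_dig(output: str) -> str:
    """Classify one dig output by its header status and answer count."""
    status = re.search(r"status: (\w+)", output)
    answers = re.search(r"ANSWER: (\d+)", output)
    if status is None or answers is None:
        return "unparsed"  # no recognisable header, e.g. dig timed out
    code = status.group(1)
    count = int(answers.group(1))
    if code == "NXDOMAIN":
        return "nxdomain"
    if code == "NOERROR" and count == 0:
        return "empty-noerror"  # reply arrived but carried no records
    if code == "NOERROR":
        return "answered"
    return "error:" + code
```

    Feeding it the three outputs above would label the first two as `empty-noerror` and the third as `answered`.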

    Oliver



  • @oliwel:

    Services -> DNS Forwarder -> Go down to Advanced and add log-queries -> Save

    Status -> System Logs -> Resolver Log
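
    Once query logging is on, it helps to generate a steady stream of test lookups so that a failing one can be matched against the resolver log. Below is a minimal sketch of such a probe using only the Python standard library. The function names are made up for this example, it only asks for A records over UDP, and it looks at nothing beyond the response header (return code and answer count).

```python
import socket
import struct


def build_query(name, query_id):
    """Build a minimal DNS query packet for an A record (recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Root label, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def parse_header(packet):
    """Return (query_id, rcode, answer_count) from a DNS response header."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return query_id, flags & 0x000F, ancount


def probe(name, server, attempts=20, timeout=2.0):
    """Send repeated queries and count answered, empty/error, and lost replies."""
    counts = {"answered": 0, "empty_or_error": 0, "timeout": 0}
    for i in range(attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(build_query(name, i + 1), (server, 53))
            try:
                data, _addr = sock.recvfrom(512)
            except socket.timeout:
                counts["timeout"] += 1
                continue
        _id, rcode, ancount = parse_header(data)
        if rcode == 0 and ancount > 0:
            counts["answered"] += 1
        else:
            counts["empty_or_error"] += 1
    return counts
```

    For the setup in this thread, `probe("www.bus-profi.de", "10.16.6.1")` would return the three counters for 20 attempts against the pfSense forwarder; running the same call against an upstream server gives a baseline to compare with.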