Netgate Discussion Forum

    Unbound: DNS request timed out for two requests, then returns Non-authoritative answer

    DHCP and DNS
    • johnpozJ
      johnpoz LAYER 8 Global Moderator @Paint
      last edited by johnpoz

      That screams that something is wrong with your Windows machine talking to pfSense in general, if some Linux client gets the answer right away.

      Or it could be that Windows is adding a lot of search suffixes that don't resolve?

      In your nslookup, set debug.

      Example:

      C:\>nslookup
      Default Server:  pi-hole.local.lan
      Address:  192.168.3.10
      
      > set debug
      > www.google.com
      Server:  pi-hole.local.lan
      Address:  192.168.3.10
      
      ------------
      Got answer:
          HEADER:
              opcode = QUERY, id = 2, rcode = NXDOMAIN
              header flags:  response, auth. answer, want recursion, recursion avail.
              questions = 1,  answers = 0,  authority records = 0,  additional = 0
      
          QUESTIONS:
              www.google.com.local.lan, type = A, class = IN
      
      ------------
      ------------
      Got answer:
          HEADER:
              opcode = QUERY, id = 3, rcode = NXDOMAIN
              header flags:  response, auth. answer, want recursion, recursion avail.
              questions = 1,  answers = 0,  authority records = 0,  additional = 0
      
          QUESTIONS:
              www.google.com.local.lan, type = AAAA, class = IN
      
      ------------
      ------------
      Got answer:
          HEADER:
              opcode = QUERY, id = 4, rcode = NOERROR
              header flags:  response, want recursion, recursion avail.
              questions = 1,  answers = 1,  authority records = 0,  additional = 0
      
          QUESTIONS:
              www.google.com, type = A, class = IN
          ANSWERS:
          ->  www.google.com
              internet address = 172.217.6.4
              ttl = 176 (2 mins 56 secs)
      
      ------------
      Non-authoritative answer:
      ------------
      Got answer:
          HEADER:
              opcode = QUERY, id = 5, rcode = NOERROR
              header flags:  response, want recursion, recursion avail.
              questions = 1,  answers = 1,  authority records = 0,  additional = 0
      
          QUESTIONS:
              www.google.com, type = AAAA, class = IN
          ANSWERS:
          ->  www.google.com
              AAAA IPv6 address = 2607:f8b0:4009:811::2004
              ttl = 1428 (23 mins 48 secs)
      
      ------------
      Name:    www.google.com
      Addresses:  2607:f8b0:4009:811::2004
                172.217.6.4
      
      >
      

      Notice how it's asking for www.google.com.local.lan and getting NXDOMAIN for that, and then trying again without the search suffix.

      Try putting a . on the end of the query so it doesn't do the suffix search.

      $ nslookup
      Default Server:  pi-hole.local.lan
      Address:  192.168.3.10
      
      > set debug
      > www.google.com.
      Server:  pi-hole.local.lan
      Address:  192.168.3.10
      
      ------------
      Got answer:
          HEADER:
              opcode = QUERY, id = 2, rcode = NOERROR
              header flags:  response, want recursion, recursion avail.
              questions = 1,  answers = 1,  authority records = 0,  additional = 0
      
          QUESTIONS:
              www.google.com, type = A, class = IN
          ANSWERS:
          ->  www.google.com
              internet address = 172.217.6.4
              ttl = 73 (1 min 13 secs)
      
      ------------
      Non-authoritative answer:
      ------------
      Got answer:
          HEADER:
              opcode = QUERY, id = 3, rcode = NOERROR
              header flags:  response, want recursion, recursion avail.
              questions = 1,  answers = 1,  authority records = 0,  additional = 0
      
          QUESTIONS:
              www.google.com, type = AAAA, class = IN
          ANSWERS:
          ->  www.google.com
              AAAA IPv6 address = 2607:f8b0:4009:811::2004
              ttl = 1325 (22 mins 5 secs)
      
      ------------
      Name:    www.google.com
      Addresses:  2607:f8b0:4009:811::2004
                172.217.6.4
      
      >
      

      Notice it doesn't do the search suffix queries when I add the . on the end..
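
      To make the two nslookup runs above easier to compare, here is a minimal Python sketch of the order in which names get tried. It is only modeled on the debug output shown here (suffixed name first, bare name last) and is not Windows' actual resolver logic; the function name `candidate_names` is made up for illustration.

```python
def candidate_names(name, search_suffixes):
    """Return the names a stub resolver would try, in order (toy model)."""
    if name.endswith("."):
        # Trailing dot: the name is already fully qualified, no suffix is added.
        return [name]
    # Otherwise try the name with each search suffix appended first...
    candidates = [f"{name}.{suffix}." for suffix in search_suffixes]
    # ...and only then the name exactly as typed.
    candidates.append(f"{name}.")
    return candidates


print(candidate_names("www.google.com", ["local.lan"]))
# ['www.google.com.local.lan.', 'www.google.com.']
print(candidate_names("www.google.com.", ["local.lan"]))
# ['www.google.com.']
```

      Every suffixed candidate that is slow to fail adds its own delay before the bare name is ever asked, which is why the trailing dot is a quick test.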

      If the suffix search is what is causing the issue, you could set your zone pf.lan as static so Unbound doesn't try to resolve those names when there is no local record.

      Is this Windows client wired or wireless? If you still have problems even when you don't ask for the suffix, then do a sniff and make sure Unbound is actually getting the query the first time Windows asks for it. You can do this by sniffing on the Windows client with, say, Wireshark, while at the same time sniffing on the pfSense interface with the packet capture under Diagnostics. That way you can make sure pfSense is actually getting all the queries.
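
      If you want something repeatable to fire at the resolver while both captures are running, a small script can help. The following is a rough sketch using only the Python standard library: it builds a single DNS query by hand and times the reply. The helper names `build_query` and `timed_query` and the fixed query ID are assumptions for illustration, and unlike nslookup it sends exactly the name you give it, with no search suffix.

```python
import socket
import struct
import time


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: one question, recursion desired."""
    # Header: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    # Each label is length-prefixed; a zero byte terminates the name.
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


def timed_query(server, name, qtype=1, timeout=2.0):
    """Send one UDP query; return (seconds, rcode), or (None, None) on timeout."""
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        server, 53, type=socket.SOCK_DGRAM)[0]
    packet = build_query(name, qtype)
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(packet, sockaddr)
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None, None
        elapsed = time.monotonic() - start
    if len(reply) < 4:
        return elapsed, None
    return elapsed, reply[3] & 0x0F  # RCODE is the low 4 bits of byte 3


# Example (the server address is a placeholder, use your resolver's IP):
# print(timed_query("192.168.1.1", "www.google.com.pf.lan."))
```

      An rcode of 0 is NOERROR and 3 is NXDOMAIN. A `(None, None)` result while the pfSense capture shows a reply leaving points at something in between dropping it.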

      edit: While it for sure shouldn't be causing this, I am personally not a fan of using ALL for outbound. Maybe Unbound is trying the wrong interface? Try setting your outbound interface to only the interface(s) that can actually be used to resolve, say WAN only, or just use localhost as the outbound interface.

      You don't have any domain forwards or anything set up for, say, that pf.lan network, do you? Or google.com?

      edit2: BTW, what are you trying to do with your ACLs? Just looked at that. Why are you doing a deny with 0.0.0.0? Did you turn off the auto ACLs if you want to do your own?

      An intelligent man is sometimes forced to be drunk to spend time with his fools
      If you get confused: Listen to the Music Play
      Please don't Chat/PM me for help, unless mod related
      SG-4860 24.11 | Lab VMs 2.8, 24.11

      • P
        Paint @johnpoz
        last edited by

        Hi @johnpoz thank you for your continued help. Here are my answers:

        nslookup debug:

        nslookup
        Default Server:  pfSense.pf.lan
        Address:  2001:<redact>::1
        
        > set debug
        > www.google.com
        Server:  pfSense.pf.lan
        Address:  2001:<redact>::1
        
        DNS request timed out.
            timeout was 2 seconds.
        timeout (2 secs)
        DNS request timed out.
            timeout was 2 seconds.
        timeout (2 secs)
        ------------
        Got answer:
            HEADER:
                opcode = QUERY, id = 4, rcode = NOERROR
                header flags:  response, want recursion, recursion avail.
                questions = 1,  answers = 1,  authority records = 0,  additional = 0
        
            QUESTIONS:
                www.google.com, type = A, class = IN
            ANSWERS:
            ->  www.google.com
                internet address = 172.217.6.196
                ttl = 30 (30 secs)
        
        ------------
        Non-authoritative answer:
        ------------
        Got answer:
            HEADER:
                opcode = QUERY, id = 5, rcode = NOERROR
                header flags:  response, want recursion, recursion avail.
                questions = 1,  answers = 1,  authority records = 0,  additional = 0
        
            QUESTIONS:
                www.google.com, type = AAAA, class = IN
            ANSWERS:
            ->  www.google.com
                AAAA IPv6 address = 2607:f8b0:4006:804::2004
                ttl = 236 (3 mins 56 secs)
        
        ------------
        Name:    www.google.com
        Addresses:  2607:f8b0:4006:804::2004
                  172.217.6.196
        

        How do I set my zone pf.lan as static?

        I just tested this on wired windows machines and wireless windows machines. It seems to only happen on wireless devices.

        I'll change my outbound interfaces for Unbound to WAN, HENETV6, and localhost.

        On your question about domain forwards for pf.lan or google.com: no, I have no domain forwards.

        With my ACLs, I am trying to deny all hosts besides specific IPv4/IPv6 subnets. I didn't turn off the auto ACLs. Here is what my access_lists.conf looks like:

        [2.4.5-RELEASE][root@pfSense.pf.lan]/var/unbound: cat access_lists.conf
        access-control: 127.0.0.1/32 allow_snoop
        access-control: ::1 allow_snoop
        access-control: 2001:<redacted>::/64 allow
        access-control: 2001:<redacted>::2/128 allow
        access-control: 2001:<redacted>::/64 allow
        access-control: 172.16.10.1/32 allow
        access-control: 192.168.50.0/24 allow
        access-control: 192.168.95.1/32 allow
        access-control: 192.168.98.1/32 allow
        access-control: 192.168.99.1/32 allow
        access-control: 127.0.0.0/8 allow
        access-control: ::1/128 allow
        access-control: 192.168.1.0/24 allow
        access-control: 192.168.99.0/24 allow
        #BlockDNS
        access-control: 0.0.0.0/0 deny
        access-control: ::/0 deny
        #AllowDNS
        access-control: 127.0.0.0/8 allow
        access-control: fc00::/7 allow
        access-control: 192.168.0.0/16 allow
        access-control: 2001:<redacted>::/48 allow
        

        pfSense i5-4590
        940/880 mbit Fiber Internet from FiOS
        BROCADE ICX6450 48Port L3-Managed Switch w/4x 10GB ports
        Netgear R8000 AP (DD-WRT)

        • johnpozJ
          johnpoz LAYER 8 Global Moderator
          last edited by

          That just looks like Windows isn't able to actually talk to pfSense on port 53.

          I would do a sniff. If you say it doesn't happen on wired, then that points to your wireless having issues.

          As to your ACLs, there is no reason to do a deny, since anything that doesn't match an allow is blocked by default anyway. The only reason you would need a deny is if the IP you wanted to deny fell inside a range you wanted to allow.

          There is no point in allowing networks that are directly attached to pfSense, since the auto ACLs do that for you. The only reason you should have to create ACLs on your own is if you disabled the auto ACLs, or if you wanted to, say, deny a specific IP or range that your auto ACLs would allow.

          But if you're going to do your own ACLs, it would still be better to disable the auto ones. That is not your issue, though. Your issue points to the client just having a hard time talking to pfSense. A sniff would show you this.
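
          To see why the extra deny lines change nothing for the allowed subnets, here is a small Python sketch of how the access-control list gets evaluated. It assumes the rule documented for Unbound, that the most specific matching netblock wins and a client matching nothing is refused; the function name `acl_action` is hypothetical, and only a few IPv4 entries from the list above are used since the IPv6 prefixes are redacted.

```python
import ipaddress


def acl_action(client_ip, rules, default="refuse"):
    """Pick the action of the most specific netblock containing client_ip."""
    addr = ipaddress.ip_address(client_ip)
    best = None
    for netblock, action in rules:
        net = ipaddress.ip_network(netblock)
        # Skip rules of the other address family or that don't contain the client.
        if net.version != addr.version or addr not in net:
            continue
        # Longest prefix wins, regardless of the order the rules are listed in.
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, action)
    return best[1] if best else default


rules = [
    ("192.168.1.0/24", "allow"),
    ("0.0.0.0/0", "deny"),
    ("192.168.0.0/16", "allow"),
]

print(acl_action("192.168.1.50", rules))  # allow (the /24 beats the /0 deny)
print(acl_action("8.8.8.8", rules))       # deny (only 0.0.0.0/0 matches)
print(acl_action("8.8.8.8", rules[:1]))   # refuse (nothing matches)
```

          The practical difference between the last two cases is that deny drops the query silently, while the default refuse sends back an error. Either way the outside client gets no answer.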

          • P
            Paint @johnpoz
            last edited by

            Thank you for the advice. I'll remove my deny ACLs and simplify the allows I added.

            I'll let you know the results of Wireshark and of investigating the issue with wireless clients. Thank you.

            • P
              Paint @Paint
              last edited by

              Nothing odd about the wireshark capture when doing DNS requests.

              I have isolated the issue to Windows 10 machines, wired or wireless connectivity does not matter. Windows 7 and Ubuntu work fine.

              Additionally, if I do nslookup google.com. the response comes back immediately. If I leave off the period at the end, it times out twice before getting a result.

              • johnpozJ
                johnpoz LAYER 8 Global Moderator @Paint
                last edited by

                @paint said in Unbound: DNS request timed out for two requests, then returns Non-authoritative answer:

                If I leave off the period at the end, it times out twice before getting a result.

                Well, what is the search suffix that is being added? If you're using pf.lan as your suffix and you don't have pf.lan set in Unbound as a static zone, Unbound would try to resolve those names. That could introduce delays vs. just sending back an SOA.

                anything.pf.lan should come back almost instantly with just an SOA, since lan is not a valid TLD.

                ;; QUESTION SECTION:
                ;www.pf.lan.                    IN      A
                
                ;; AUTHORITY SECTION:
                .                       3600    IN      SOA     a.root-servers.net. nstld.verisign-grs.com. 2020122301 1800 900 604800 86400
                

                But depending on how you're set up, maybe you have some delay in getting this. You can stop that from happening if you set your zone in Unbound to static. Then only stuff like exists.pf.lan would resolve, and something.pf.lan would just get back NX. That is, if pf.lan is what you have set up as the domain for pfSense and your clients.
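
                The difference between a static zone and the default transparent one can be sketched as a toy decision function. This is only a simplified model of the behaviour described above, not Unbound's code, and `nas.pf.lan` is a made-up host standing in for whatever local records you really have.

```python
def answer_source(qname, zone, local_names, zone_type):
    """Say where the answer for qname would come from (simplified model)."""
    name = qname.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    in_zone = name == zone or name.endswith("." + zone)
    if not in_zone:
        return "recurse"          # not our zone: resolve normally
    if name in local_names:
        return "local data"       # a host override / DHCP record exists
    if zone_type == "static":
        return "NXDOMAIN"         # answered locally, nothing goes upstream
    return "recurse"              # transparent: unknown names are sent upstream


local = {"nas.pf.lan"}  # hypothetical local record
print(answer_source("nas.pf.lan", "pf.lan", local, "static"))                  # local data
print(answer_source("www.google.com.pf.lan", "pf.lan", local, "static"))       # NXDOMAIN
print(answer_source("www.google.com.pf.lan", "pf.lan", local, "transparent"))  # recurse
```

                With static, the suffixed lookups never leave the box, so any delay that remains is not coming from upstream resolution.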

                Your Linux box and other devices are probably not seeing this because they most likely do not use suffix search out of the box.

                As stated before, set debug in nslookup and see what the client is actually asking for when you do not use the . on the end of the query. Maybe they are using something other than pf.lan? Maybe they have a huge list of suffixes they are searching?

                • P
                  Paint @Paint
                  last edited by

                  I changed the System Domain Local Zone Type setting to be Static, as you suggested. It doesn't fix this issue, unfortunately.

                  pf.lan is my local domain for my LAN devices, correct.

                  nslookup with debug on still doesn't give me enough information to confirm that Windows is first trying google.com.pf.lan, then google.com.pf, and finally google.com, where the last query works. If I change the DNS settings on the network adapter to "Append these DNS suffixes (in order)" and add ".", it fixes the problem.

                  • johnpozJ
                    johnpoz LAYER 8 Global Moderator @Paint
                    last edited by johnpoz

                    @paint said in Unbound: DNS request timed out for two requests, then returns Non-authoritative answer:

                    nslookup with debug on still doesn't give me enough information

                    Post the full output of your nslookup after you set debug, without the . set in your "Append these DNS suffixes (in order)" setting.

                    • P
                      Paint @johnpoz
                      last edited by

                      nslookup
                      Default Server:  pfSense.pf.lan
                      Address:  2001:<redact>::1
                      
                      > set debug
                      > google.com
                      Server:  pfSense.pf.lan
                      Address:  2001:<redact>::1
                      
                      DNS request timed out.
                          timeout was 2 seconds.
                      timeout (2 secs)
                      DNS request timed out.
                          timeout was 2 seconds.
                      timeout (2 secs)
                      ------------
                      Got answer:
                          HEADER:
                              opcode = QUERY, id = 4, rcode = NOERROR
                              header flags:  response, want recursion, recursion avail.
                              questions = 1,  answers = 1,  authority records = 0,  additional = 0
                      
                          QUESTIONS:
                              google.com, type = A, class = IN
                          ANSWERS:
                          ->  google.com
                              internet address = 172.217.12.142
                              ttl = 30 (30 secs)
                      
                      ------------
                      Non-authoritative answer:
                      ------------
                      Got answer:
                          HEADER:
                              opcode = QUERY, id = 5, rcode = NOERROR
                              header flags:  response, want recursion, recursion avail.
                              questions = 1,  answers = 1,  authority records = 0,  additional = 0
                      
                          QUESTIONS:
                              google.com, type = AAAA, class = IN
                          ANSWERS:
                          ->  google.com
                              AAAA IPv6 address = 2607:f8b0:4006:819::200e
                              ttl = 30 (30 secs)
                      
                      ------------
                      Name:    google.com
                      Addresses:  2607:f8b0:4006:819::200e
                                172.217.12.142
                      

                      • johnpozJ
                        johnpoz LAYER 8 Global Moderator @Paint
                        last edited by

                        That doesn't make any sense at all.

                        What does the sniff show? Post up the pcaps from the client side and the pfSense side.

                        • P
                          Paint @johnpoz
                          last edited by

                          @johnpoz I think it's a Windows 10 issue, with the way they propagate DNS.

                          I'll PM you the two pcaps.

                          • johnpozJ
                            johnpoz LAYER 8 Global Moderator @Paint
                            last edited by johnpoz

                            Yeah, I got them. This is strange as F. You can see that Unbound sent back a response.

                            response.png

                            But the sniff on the client doesn't show that response ever getting there???

                            That is really odd. The response was sent back in about 1.2 ms, but the client never got it?? What is between the client and pfSense: a switch, wireless? Clearly something filtered that response. It sure looks like it was sent to the correct MAC, etc.

                            That is very odd, but it explains why you're seeing the timeout: you never got the NX response back with the SOA for a.root-servers.net.

                            I am on Windows 10, but not using IPv6 for DNS. Per the sniff, though, the response never got to the OS for it to do anything with.

                            But your other queries do get answered, like your initial PTR for the name server's name based on its IP. That is really, really strange.

                            As a test, could you query something that for sure would come back NX?

                            Say something like test.sljhdlosjdfsljfdsljsdfjls.whateverdomain

                            There is no TLD of whateverdomain ;) That should come back NX.

                            > test.soldjflsjfldsjdfsfd.whateverdomain
                            Server:  pi-hole.local.lan
                            Address:  192.168.3.10
                            
                            *** pi-hole.local.lan can't find test.soldjflsjfldsjdfsfd.whateverdomain: Non-existent domain
                            

                            Or does that just time out?
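                            For a test that does not depend on nslookup or any search suffixes, here is a minimal Python sketch that sends exactly one UDP query for the name as typed and reports whether an answer came back or the request timed out. The function names (build_query, parse_rcode, probe) are hypothetical, and the server address in the usage note is only a placeholder for your resolver.

```python
import socket
import struct


def build_query(name: str, txid: int) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def parse_rcode(response: bytes) -> int:
    """Return the RCODE (low 4 bits of the flags word): 0=NOERROR, 3=NXDOMAIN."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def probe(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one UDP query and return 'NOERROR', 'NXDOMAIN', 'rcode N' or 'timeout'."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, 0x1234), (server, 53))
        try:
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
    rcode = parse_rcode(data)
    return {0: "NOERROR", 3: "NXDOMAIN"}.get(rcode, f"rcode {rcode}")
```

                            Calling probe("192.168.1.1", "test.sljhdlosjdfsljfdsljsdfjls.whateverdomain") with your own resolver address in place of the placeholder should come back NXDOMAIN on a healthy path; "timeout" would mean the response never reached the client.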

                            edit:
                            A sniff is done at the wire - before any security software that could filter it.. So even if there was something odd with the OS saying hey I don't like this NX, I'm going to ignore it, etc., you should still see in the sniff that it got there. But in your host sniff the NX response is not there.. Yet it was clearly put on the wire, per the sniff you did on pfsense.. Something is odd for sure!!

                            edit2:
                            Ok something else odd.. The stuff you are getting back is being sent twice by the server.. But your host is only seeing it once..

                            The NX responses are only being sent once..

                            • P
                              Paint @johnpoz
                              last edited by Paint

                              @johnpoz said in Unbound: DNS request timed out for two requests, then returns Non-authoritative answer:

                              wire - before any like security software that could filter it.. So even if was something odd with the OS saying hey I don't like this nx, and going to ignore it, etc You should still see that in the sniff that it got there. But in your host sniff the response for NX is not there.. But clearly it was put on th

                              I agree.. this is very odd. I run a pfSense box with a LAGG to a brocade layer 3 switch. I have two other unmanaged switches in the house. However, this seems to be happening with all of my windows 10 machines. I do run ipv6, but the issue also occurs on ipv4 (using 192.168.1.1).

                              nslookup
                              Default Server:  pfSense.pf.lan
                              Address:  2001:<redact>::1
                              
                              > set debug
                              > whateverfakedomain.pf.lan
                              Server:  pfSense.pf.lan
                              Address:  2001:<redact>::1
                              
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              *** Request to pfSense.pf.lan timed-out
                              > whateverfakedomain.pf.lan 192.168.1.1
                              Server:  [192.168.1.1]
                              Address:  192.168.1.1
                              
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              DNS request timed out.
                                  timeout was 2 seconds.
                              timeout (2 secs)
                              *** Request to 192.168.1.1 timed-out
                              

                              For completeness, here are my custom options:

                              server:
                              private-domain: "pf.lan."
                              local-zone: "netflix.com" typetransparent
                              local-data: "netflix.com IN AAAA ::"
                              local-zone: "netflix.net" typetransparent
                              local-data: "netflix.net IN AAAA ::"
                              local-zone: "nflxext.com" typetransparent
                              local-data: "nflxext.com IN AAAA ::"
                              local-zone: "nflximg.net" typetransparent
                              local-data: "nflximg.net IN AAAA ::"
                              local-zone: "nflxvideo.net" typetransparent
                              local-data: "nflxvideo.net IN AAAA ::"
                              local-zone: "www.netflix.com" typetransparent
                              local-data: "www.netflix.com IN AAAA ::"
                              local-zone: "customerevents.netflix.com" typetransparent
                              local-data: "customerevents.netflix.com IN AAAA ::"
                              local-zone: "secure.netflix.com" typetransparent
                              local-data: "secure.netflix.com IN AAAA ::"
                              local-zone: "adtech.nflximg.net" typetransparent
                              local-data: "adtech.nflximg.net IN AAAA ::"
                              local-zone: "assets.nflxext.com" typetransparent
                              local-data: "assets.nflxext.com IN AAAA ::"
                              local-zone: "codex.nflxext.com" typetransparent
                              local-data: "codex.nflxext.com IN AAAA ::"
                              local-zone: "dockhand.netflix.com" typetransparent
                              local-data: "dockhand.netflix.com IN AAAA ::"
                              local-zone: "ichnaea.netflix.com" typetransparent
                              local-data: "ichnaea.netflix.com IN AAAA ::"
                              local-zone: "art-s.nflximg.net" typetransparent
                              local-data: "art-s.nflximg.net IN AAAA ::"
                              local-zone: "tp-s.nflximg.net" typetransparent
                              local-data: "tp-s.nflximg.net IN AAAA ::"
                              

                              • GertjanG
                                Gertjan @Paint
                                last edited by

                                @paint said in Unbound: DNS request timed out for two requests, then returns Non-authoritative answer:

                                with all of my windows 10 machine

                                And these are all 20H2 ?

                                No "help me" PM's please. Use the forum, the community will thank you.
                                Edit : and where are the logs ??

                                • johnpozJ
                                  johnpoz LAYER 8 Global Moderator @Paint
                                  last edited by johnpoz

                                  @paint said in Unbound: DNS request timed out for two requests, then returns Non-authoritative answer:

                                  I run a pfSense box with a LAGG to a brocade layer 3 switch

                                  Remove the lagg..

                                  That is where your problem is... Your traffic is being lost unless it's sent twice..

                                  Windows 10 has NOTHING to do with it.. From your sniffs the traffic gets to pfsense, pfsense answers with NX.. But your client doesn't get the answer.

                                  This is not blocked at some software firewall, or the OS ignoring it.. You don't see the packet on the wire.. But it was put on the wire.. So somewhere on the wire it was lost.

                                  Remove your lagg from the equation.

                                  That is not really what I asked for.. Look up something like something.slosjfsldfjs.whateverdomain, not whateverfakedomain.pf.lan

                                  Also do a sniff on the host and on pfsense when you do that - this time at the same time... Not at 2 different times like the last time.. That was easy enough to spot because the source ports were different on your queries..

                                  You're losing traffic that is NX and only sent once.. From your sniff..

                                  When you do a query for something that doesn't exist, you would get back an NX.. Not a timeout.. But the response is not getting back to your client.. So it's a timeout. The response was put on the wire by the server (unbound/pfsense) but the client didn't get it.. It's being lost in your network in between... That is what the sniffs show..

                                  I would do a sniff the same way from one of your linux boxes - also doing a query for something that doesn't exist and would force an NX.. like sljdf.sljdfsdlf.whateverdomain

                                  Linux boxes do not add a search suffix unless you specifically tell them to in a setting.. So that is why you're not seeing the timeouts there.. But it would be interesting to see if, when only 1 NX is sent, your linux boxes also miss the traffic.
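                                  To make the search-suffix point concrete, here is a rough Python sketch of how a suffix-appending client expands one typed name into several queries. The ordering is an assumption modeled on the nslookup debug output earlier in this thread (www.google.com.local.lan was tried before www.google.com); real resolvers differ in detail, and candidate_names is a hypothetical helper, not any OS API.

```python
def candidate_names(name: str, search_suffixes: list[str]) -> list[str]:
    """List the fully qualified names a suffix-appending client might try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no suffixes are appended.
        return [name]
    # Assumed order: the name with each search suffix appended first...
    candidates = [f"{name}.{suffix.strip('.')}." for suffix in search_suffixes]
    # ...and the name as typed last.
    candidates.append(name + ".")
    return candidates
```

                                  With one suffix configured, every lookup of an unqualified name costs at least one extra query that is expected to come back NX. If those NX responses are the ones being lost, each of them turns into a timeout before the real answer shows up.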

                                  Look in your pfsense sniff.. All the traffic that you see on the client has been sent twice..

                                  retrans.png

                                  Stuff that was only put on the wire once, i.e. your NX, the client never got, so timeout..
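                                  That comparison can be sketched in Python, assuming both captures were taken at the same time and already reduced to one record per DNS response. The record layout (client_port, txid, rcode) and the function names are hypothetical; getting those records out of the pcaps is left to whatever tool you prefer.

```python
from collections import Counter


def copies_per_response(responses):
    """Count how many times each (client_port, txid) response appears in a capture."""
    return Counter((r["client_port"], r["txid"]) for r in responses)


def lost_responses(server_side, client_side):
    """Return responses present in the server capture but absent from the client
    capture, mapped to the number of copies the server capture contains."""
    sent = copies_per_response(server_side)
    seen = copies_per_response(client_side)
    return {key: count for key, count in sent.items() if key not in seen}


# Made-up example data: one response seen twice on the server side, one NX seen once.
server_capture = [
    {"client_port": 50001, "txid": 2, "rcode": 0},
    {"client_port": 50001, "txid": 2, "rcode": 0},
    {"client_port": 50002, "txid": 3, "rcode": 3},
]
client_capture = [
    {"client_port": 50001, "txid": 2, "rcode": 0},
]
print(lost_responses(server_capture, client_capture))
```

                                  If every entry it reports has a count of 1, that matches the pattern described here: only responses put on the wire once go missing.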

                                  • P
                                    Paint @Gertjan
                                    last edited by

                                    @gertjan 20H2, 1903 and one other version in between. They all have the same issue

                                    • P
                                      Paint @johnpoz
                                      last edited by

                                      @johnpoz I'll see if removing the LAGG fixes the issue. Thank you!

                                      Yes, I ran the captures at two different times - I originally configured the capture from my pfSense machine wrong.

                                      • johnpozJ
                                        johnpoz LAYER 8 Global Moderator @Paint
                                        last edited by johnpoz

                                        Well it worked to show the problem at least.. But yeah, when troubleshooting stuff like this it is best to do the sniffs at the same time, so that if intermittent packet loss is the problem you can see specifically what happened to a specific packet.. In a normal tcp conversation you could use the seq/ack numbers to track which are which.

                                        But with udp, the source port (different for each query) and the transaction ID can help line up which queries and responses go with each other..
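                                        As a rough illustration of that matching, here is a small Python sketch that pairs each query with the responses sharing its source port and transaction ID, within a single capture. The record fields (src_port, dst_port, txid) and the function names are assumptions for illustration, not the output format of any particular capture tool.

```python
def match_queries(queries, responses):
    """Pair each query with the responses sharing its source port and transaction ID."""
    index = {}
    for resp in responses:
        # A response goes back to the port the query came from, with the same ID.
        index.setdefault((resp["dst_port"], resp["txid"]), []).append(resp)
    return [(q, index.get((q["src_port"], q["txid"]), [])) for q in queries]


def unanswered(queries, responses):
    """Return the queries for which the capture holds no matching response."""
    return [q for q, matches in match_queries(queries, responses) if not matches]
```

                                        Run against the client-side capture, unanswered() would list the queries that ended in a timeout; run against the pfsense-side capture, a query with two matches would be one whose response shows up twice.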

                                        It's still odd to me why the NX is only being sent once, while normal responses are being sent twice.. Maybe that has something to do with the lagg? Very strange.. I do not recall ever seeing such a thing before in troubleshooting dns.. No response sure, lost traffic sure.. But in 20 some years of troubleshooting networking, dns, etc.. I do not recall seeing dupes like that..

                                        The closest thing that comes to mind.. is a bug we had on a cisco switch that was dropping dns inside a vlan.. The bug turned out to trigger when there was no SVI set for that vlan.. When you sniffed the vlan on the switch you could see packets being dropped.. You should always see 2 copies of the packet, one as it enters the switch and one as it leaves the switch.. With the bug, sometimes you would see the packet enter the switch - but not leave the switch.

                                        That one took a bit to track down ;) There were multiple switches in the path.. And we could see the packets leaving the source, and being returned by the server.. But the client was not getting the response - same as you're seeing.. But then we had to follow the path of the traffic through multiple switches in the datacenter.. And some switches did not support sniffing right on the switch.. So we had to set up span ports with a laptop where the packets were being dropped.. Once we figured out where the packets were being lost - it was simple enough to track down the actual bug report.. Adding an SVI to the vlan on that switch, even though it was just doing layer 2, was a workaround until they fixed the bug in a firmware update on the switch.

                                        • P
                                          Paint @johnpoz
                                          last edited by

                                          Thank you, @johnpoz, for your help thus far!

                                          I'm not using any VLANs or tagging on my Brocade ICX6450 switch or in my LAN.

                                          I'll investigate whether I have any settings wrong on the managed switch and then remove the LAGG.

                                          • johnpozJ
                                            johnpoz LAYER 8 Global Moderator @Paint
                                            last edited by johnpoz

                                            No didn't mean to suggest it was the same sort of bug.. That was just the closest thing I could remember to such an issue in like 30 years doing this sort of thing ;)

                                            It's similar in the fact that we see the server sending the response, but the client not getting it - so the packet is being lost somewhere..

                                            You'll also notice in your sniffs that 2 packets are sent for the responses you do get - but your client is only seeing 1 of them..

                                            I have to say it has to be related to your lagg.. But I do not recall ever seeing a server send 2 responses..

                                            Here
                                            response.png

                                            This is the initial PTR the client does for the name of the NS.. 2 responses to it were sent - but only 1 was seen by your client.. It's harder to know for sure which one you got.. Because your sniffs were not done at the same time..

                                            But 2 responses were put on the wire - your client should have seen both of those.. They were sent 0.4 ms apart..

                                            The odd thing for me - why are only some responses being sent twice? The NX responses are only sent once - and those you do not get.. Strange for sure..

                                            edit:
                                            It would be interesting to see if the 2nd packet is the one you get and the first is always lost, sort of thing. That would explain why you're not getting the NX, which is only sent once.

                                            But I am not spotting any difference in the packets that could explain why they might be filtered vs the ones sent twice.. They look the same.. same macs, same transaction ids, same ports.. They are just retrans.. I have to assume you're getting the retrans.. And I guess it's possible that unbound itself is not sending the second copy, but that your switch is resending the packets? That would make more sense really, since I'm not sure why unbound would send out retrans for normal responses, but not for NX.. And why would the retrans be sent so fast? Maybe your switch is doing it??? All the sniff tells us is that they were seen on the wire..

                                            Which is why I guess it has something to do with the lagg..

                                            Would be interesting to see what happens on your linux boxes, where you say you're not seeing the problem, when you do a query for something that is NX.. And in a sniff where we see both normal responses and NX responses - are the NX only being seen once, while the normals are actually seen twice?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.