Netgate Discussion Forum

    Unbound: DNS request timed out for two requests, then returns Non-authoritative answer

    DHCP and DNS
      Paint @Gertjan

      @gertjan 20H2, 1903, and one other version in between. They all have the same issue.

      pfSense i5-4590
      940/880 mbit Fiber Internet from FiOS
      BROCADE ICX6450 48Port L3-Managed Switch w/4x 10GB ports
      Netgear R8000 AP (DD-WRT)

        Paint @johnpoz

        @johnpoz I'll see if removing the LAGG fixes the issue. Thank you!

        Yes, I ran the captures at two different times. I originally misconfigured the capture on my pfSense machine.

          johnpoz LAYER 8 Global Moderator @Paint

          Well, it worked to show the problem at least. But yeah, when troubleshooting stuff like this it is best to do the sniffs at the same time, so that if intermittent packet loss is the problem you can see specifically what happened to a specific packet. In a normal TCP conversation you could use the seq/ack numbers to track which are which.

          But with UDP, the source port (different for each query) and the transaction ID can help line up which queries and responses go with each other.
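
          As a rough illustration of lining up UDP DNS traffic this way, here is a minimal Python sketch. It assumes the packets have already been exported from a capture as (timestamp, source port, destination port, raw DNS payload) tuples; that tuple shape and the names `parse_dns_header`, `pair_queries`, and `make_dns_header` are hypothetical, chosen only for this example. It reads just the first four bytes of the DNS header (transaction ID and flags) and groups packets by client port and transaction ID.

```python
import struct
from collections import defaultdict


def parse_dns_header(payload: bytes):
    """Return (transaction_id, is_response, rcode) from a raw DNS message."""
    if len(payload) < 12:  # the fixed DNS header is 12 bytes long
        raise ValueError("payload too short for a DNS header")
    txid, flags = struct.unpack("!HH", payload[:4])
    is_response = bool(flags & 0x8000)  # QR bit: 0 = query, 1 = response
    rcode = flags & 0x000F              # 0 = NOERROR, 3 = NXDOMAIN
    return txid, is_response, rcode


def pair_queries(packets):
    """Group packets by (client_port, transaction_id).

    `packets` is an iterable of (timestamp, src_port, dst_port, payload)
    tuples. This shape is an assumption made for the sketch.
    """
    conversations = defaultdict(lambda: {"queries": [], "responses": []})
    for ts, src_port, dst_port, payload in packets:
        txid, is_response, rcode = parse_dns_header(payload)
        # The client's ephemeral port is the source of a query
        # and the destination of a response.
        client_port = dst_port if is_response else src_port
        if is_response:
            conversations[(client_port, txid)]["responses"].append((ts, rcode))
        else:
            conversations[(client_port, txid)]["queries"].append(ts)
    return dict(conversations)


def make_dns_header(txid: int, flags: int) -> bytes:
    """Build a bare 12-byte DNS header (demo helper, one question, no records)."""
    return struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)


if __name__ == "__main__":
    demo = [
        (0.0000, 50001, 53, make_dns_header(0x1A2B, 0x0100)),  # query
        (0.0010, 53, 50001, make_dns_header(0x1A2B, 0x8180)),  # response
        (0.0100, 50002, 53, make_dns_header(0x3C4D, 0x0100)),  # query, no reply
    ]
    for key, conv in pair_queries(demo).items():
        if not conv["responses"]:
            print("unanswered query:", key)
```

          Running the same grouping over a server-side and a client-side capture taken at the same time would show which specific queries never got a response on the client.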

          It's still odd to me why NX is only being sent once, while normal responses are being sent twice. Maybe that has something to do with the LAGG? Very strange. I do not recall ever seeing such a thing before in troubleshooting DNS. No response, sure; lost traffic, sure. But in 20-some years of troubleshooting networking, DNS, etc., I do not recall seeing dupes like that.

          The closest thing that comes to mind is a bug we had on a Cisco switch that was dropping DNS inside a VLAN. The bug turned out to trigger if there was no SVI set for that VLAN. When you sniffed the VLAN on the switch you could see packets being dropped. You should always see 2 copies of the packet: one as it enters the switch and one when it leaves the switch. With the bug, sometimes you would see the packet enter the switch but not leave it.

          That one took a bit to track down ;) There were multiple switches in the path. We could see the packets leaving the source and being returned by the server, but the client was not getting the response, same as you're seeing. So we had to follow the path of the traffic through multiple switches in the datacenter. Some switches did not support sniffing right on the switch, so we had to set up span ports with a laptop where the packets were being dropped. Once we figured out where the packets were being lost, it was simple enough to track down the actual bug report. Adding an SVI to the VLAN on that switch, even though it was just doing layer 2, was a workaround until they fixed the bug in a firmware update on the switch.

          An intelligent man is sometimes forced to be drunk to spend time with his fools
          If you get confused: Listen to the Music Play
          Please don't Chat/PM me for help, unless mod related
          SG-4860 24.11 | Lab VMs 2.8, 24.11

            Paint @johnpoz

            Thank you, @johnpoz, for your help thus far!

            I'm not using any VLANs or tagging on my Brocade ICX6450 switch or in my LAN.

            I'll check whether I have any settings wrong on the managed switch and then remove the LAGG.

              johnpoz LAYER 8 Global Moderator @Paint

              No, I didn't mean to suggest it was the same sort of bug. That was just the closest thing I could remember to such an issue in like 30 years doing this sort of thing ;)

              It's similar in the fact that we see the server sending the response, but the client not getting it, so the packet is being lost somewhere.

              You will also notice in your sniffs that 2 packets are sent for the responses you do get, but your client is only seeing 1 of them.

              I have to say it has to be related to your LAGG. But I do not recall ever seeing a server send 2 responses.

              Here
              response.png

              This is the initial PTR the client does for the name of the NS. 2 responses to it were sent, but only 1 was seen by your client. It's harder to know for sure which one you got, because your sniffs were not done at the same time.

              But 2 responses were put on the wire, and your client should have seen both of those. They were sent 0.4 ms apart.

              The odd thing for me is why only some responses are being sent twice. The NX responses are only sent once, and those you do not get. Strange for sure.

              edit:
              It would be interesting to see if the 2nd packet is the one you get and the first is always lost, sort of thing. That would explain why you're not getting the NX, which is only sent once.

              But I am not spotting any difference in the packets that could explain why they might be filtered vs. the ones sent twice. They look the same: same MACs, same transaction IDs, same ports. They are just retrans. I have to assume you're getting the retrans. And I guess it's possible that unbound itself is not sending it, but that your switch is resending the packets. That would make more sense really, since I'm not sure why unbound would send out retrans for normal responses but not NX. And why would the retrans be sent so fast? Maybe your switch is doing it? All the sniff tells us is that they were seen on the wire.
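
              One way to check the "normal responses twice, NX once" pattern across a whole capture, rather than by eye, is to count response copies per client port, transaction ID, and response code. The Python sketch below is illustrative only: it assumes the responses were exported as (timestamp in seconds, client port, raw DNS payload) tuples, and the name `summarize_response_copies` is hypothetical. It cannot tell who put the second copy on the wire; it only reports how many copies were seen and how far apart.

```python
import struct
from collections import defaultdict

# Only the two response codes discussed in the thread are named here.
RCODE_NAMES = {0: "NOERROR", 3: "NXDOMAIN"}


def summarize_response_copies(responses):
    """Count copies of each DNS response seen in one capture.

    `responses` is an iterable of (timestamp_seconds, client_port, payload)
    tuples. This shape is an assumption made for the sketch.
    """
    groups = defaultdict(list)
    for ts, client_port, payload in responses:
        if len(payload) < 12:        # shorter than a DNS header, skip it
            continue
        txid, flags = struct.unpack("!HH", payload[:4])
        if not flags & 0x8000:       # QR bit clear means a query, skip it
            continue
        groups[(client_port, txid, flags & 0x000F)].append(ts)

    summary = []
    for (client_port, txid, rcode), stamps in sorted(groups.items()):
        stamps.sort()
        # Gap between first and last copy, in milliseconds.
        gap_ms = (stamps[-1] - stamps[0]) * 1000 if len(stamps) > 1 else None
        summary.append({
            "client_port": client_port,
            "txid": txid,
            "rcode": RCODE_NAMES.get(rcode, str(rcode)),
            "copies": len(stamps),
            "gap_ms": gap_ms,
        })
    return summary
```

              If every NOERROR entry shows 2 copies a fraction of a millisecond apart and every NXDOMAIN entry shows 1, that matches what the sniffs above suggest. Comparing the same summary from a capture on a Linux client would show whether the duplicates appear there too.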

              Which is why I guess it has something to do with the LAGG.

              It would be interesting to see what happens on your Linux boxes, where you say you're not seeing the problem, when you do a query for something that is NX. In a sniff where we see both normal responses and NX responses, are the NX only being seen once, while normals are actually seen twice?

                Paint @johnpoz

                I'll do some more simultaneous sniffs and share them with you prior to changing anything.

                I don't see any dropped or resent packets on the switch for any of the 48 ports. This is a strange issue for sure. I agree that the next course of action would be removing the LAGG, though it will be fun backing out the LAGG configuration for pfSense and the switch.

                  johnpoz LAYER 8 Global Moderator @Paint

                  There is no wireless involved in these sniffs, right? Clients are all wired?

                  You mention other switches. Could you lay out the physical connections? Are clients connected directly to the managed switch, or are there some dumb switches involved?

                  I'd really like to see if Linux clients show the same duplicate packets from the server response, where the Linux clients are all connected to the same switch(es) as the Windows ones.

                  Possibly something is doing something odd with DNS? But I would assume that would have to be something only a managed switch might do, or wireless.

                    Paint @johnpoz

                    No, I have wireless devices on the network. The issue happens on both wired and wireless clients, as long as they are running Windows 10. The previously sent sniffs are all from wired clients, however.

                    I can map out the network architecture as well. Yes, there are also two unmanaged (dumb) switches in use.
