Netgate Discussion Forum

    Insanely weird issue with DNS resolution to www.cdc.gov

    DHCP and DNS
    52 Posts 15 Posters 7.3k Views
    • M
      mboylan
      last edited by mboylan

      Hi All -- having a ridiculously strange issue with DNS resolution related to www.cdc.gov.

      DNS queries for www.cdc.gov from network clients are returning a SERVFAIL response. When I query the Cloudflare DNS servers directly using dig, the results are okay:

      ; <<>> DiG 9.10.6 <<>> @1.1.1.1 www.cdc.gov
      ; (1 server found)
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62488
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
      
      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 1232
      ;; QUESTION SECTION:
      ;www.cdc.gov.			IN	A
      
      ;; ANSWER SECTION:
      www.cdc.gov.		248	IN	CNAME	www.akam.cdc.gov.
      www.akam.cdc.gov.	1	IN	A	104.86.21.106
      
      ;; Query time: 7 msec
      ;; SERVER: 1.1.1.1#53(1.1.1.1)
      ;; WHEN: Fri Dec 18 14:36:34 PST 2020
      ;; MSG SIZE  rcvd: 79
      

      From the router itself, resolution also appears to be okay on the Diagnostics > DNS Lookup page:
      Screen Shot 2020-12-18 at 2.37.11 PM.png

      But when clients query directly against the router, they get SERVFAIL:

      ; <<>> DiG 9.10.6 <<>> @10.10.0.1 www.cdc.gov
      ; (1 server found)
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 55373
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
      
      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 1232
      ;; QUESTION SECTION:
      ;www.cdc.gov.			IN	A
      
      ;; Query time: 588 msec
      ;; SERVER: 10.10.0.1#53(10.10.0.1)
      ;; WHEN: Fri Dec 18 14:38:34 PST 2020
      ;; MSG SIZE  rcvd: 40
      

      I'm currently not noticing this with any other website except the CDC, but I do feel like I've seen this behavior a handful of other times with random websites.

      Are there additional debug logs I could gather from the router to identify whether this is a bug in Unbound or something else going on? A colleague with the same device (SG-3100), software version, and DNS servers (Cloudflare), but a different ISP, was able to reproduce it.

      Thanks!

      - Mike
      • johnpozJ
        johnpoz LAYER 8 Global Moderator @mboylan
        last edited by johnpoz

        I cannot duplicate it here.

        dig.png

        $ dig @192.168.9.253 www.cdc.gov                                            
                                                                                    
        ; <<>> DiG 9.16.9 <<>> @192.168.9.253 www.cdc.gov                           
        ; (1 server found)                                                          
        ;; global options: +cmd                                                     
        ;; Got answer:                                                              
        ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44762                   
        ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1     
                                                                                    
        ;; OPT PSEUDOSECTION:                                                       
        ; EDNS: version: 0, flags:; udp: 4096                                       
        ;; QUESTION SECTION:                                                        
        ;www.cdc.gov.                   IN      A                                   
                                                                                    
        ;; ANSWER SECTION:                                                          
        www.cdc.gov.            2106    IN      CNAME   www.akam.cdc.gov.           
        www.akam.cdc.gov.       2106    IN      A       23.66.90.90                 
                                                                                    
        ;; Query time: 0 msec                                                       
        ;; SERVER: 192.168.9.253#53(192.168.9.253)                                  
        ;; WHEN: Fri Dec 18 18:01:24 Central Standard Time 2020                     
        ;; MSG SIZE  rcvd: 79                                                       
        

        What are you doing with Cloudflare: forwarding, TLS? If unbound resolved it fine, then your client asking for it would get the cached answer anyway, so your test doesn't make a lot of sense.

        An intelligent man is sometimes forced to be drunk to spend time with his fools
        If you get confused: Listen to the Music Play
        Please don't Chat/PM me for help, unless mod related
        SG-4860 24.11 | Lab VMs 2.7.2, 24.11

        • M
          mboylan
          last edited by

          @johnpoz Forwarding, yes, and I do have TLS enabled, but I can reproduce with it off as well. What would be next steps in trying to figure out what's going on here?

          • johnpozJ
            johnpoz LAYER 8 Global Moderator @mboylan
            last edited by johnpoz

            If you forward, you are at the mercy of whether the place you forward to wants to answer or not.

            Sniff the traffic you forward to Cloudflare, without using TLS. Does unbound actually send the query? What does Cloudflare send back as an answer, if anything?

            I have no issues doing a directed query to 1.1.1.1 and getting an answer. If your sniff, or query logging, shows that unbound is not sending the query on to 1.1.1.1, you will need to figure out why it's failing in unbound. SERVFAIL could be lots of things; it's a pretty generic failure. Basically, something went wrong. It's not specific like NXDOMAIN or REFUSED.
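
One rough way to compare what the router and the upstream return, short of a full packet capture, is to send the same plain UDP query to both and look at the response code in the DNS header. The sketch below is only an illustration and not something posted in the thread: the function names are made up, it uses only the Python standard library, and it sends no EDNS or DNSSEC flags, so it will not behave exactly like dig.

```python
import socket
import struct

# Response codes from the DNS header (low 4 bits of the flags field).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with the RD bit set."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def rcode_name(response: bytes) -> str:
    """Return the response code name from a raw DNS response."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def query_status(server: str, name: str, timeout: float = 3.0) -> str:
    """Send one UDP query to server:53 and return the response code name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_name(response)
```

Calling `query_status("10.10.0.1", "www.cdc.gov")` and `query_status("1.1.1.1", "www.cdc.gov")` from a LAN client (the addresses from the first post) would show whether the two disagree. A timeout raises an `OSError` subclass, which the caller would need to handle.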

            • M
              marshmallow @mboylan
              last edited by

              @mboylan Hi Mike, I have the exact same issue (cdc.gov, Cloudflare, etc). Were you able to figure out what is going on?

              Thanks

              • M
                mboylan @marshmallow
                last edited by

                @marshmallow No, unfortunately not. I did a packet capture on the WAN interface and can see the response coming back, but a capture between the client and the router shows a SERVFAIL. Something is going wrong in passing the response back to the client, and I can’t figure out what. I’ve tried adjusting some of the cache settings, without any luck. Given this is reproducible by several people (at least 3) at this point, I hope Netgate can help figure out what’s going on. The CDC website is kind of important right now. I tried switching to Google for DNS and got the same result.

                • johnpozJ
                  johnpoz LAYER 8 Global Moderator @mboylan
                  last edited by

                  Post the pcap showing what you asked for and what they sent back.

                  It's not a pfSense thing; I resolve it just fine.

                  • bmeeksB
                    bmeeks
                    last edited by

                    I have an additional data point for this discussion.

                    Let me start by saying I am not a DNS guru like @johnpoz, so my DNS troubleshooting skills are more limited.

                    I have a Windows 2012 R2 DNS server in my network. It's actually part of my Active Directory. I currently have the Windows DNS server resolving via the root servers. I have unbound configured in default mode on my pfSense firewall, so it is the default DNS server for pfSense itself and it also resolves via the roots. unbound has a domain override for my private AD domain, so it asks my AD DNS server for any local stuff. All my network clients point to the AD DNS and use the AD DHCP server (which hands out the AD DNS IP as the DNS server for my LAN).

                    On pfSense, at a shell prompt using dig and the local unbound server, both "cdc.gov" and "www.cdc.gov" resolve just fine. Interestingly, "www.cdc.gov" is a CNAME that points to "www.akam.cdc.gov". The IP for that host is in a totally different IP block than "cdc.gov". I did not go searching to verify this, but my guess is the CDC is using the Akamai CDN for their web site hosting. That would make sense for loading issues.

                    But on Windows DNS, "www.cdc.gov" will not resolve. It produces a SERVFAIL type of error. I tried turning off DNSSEC, clearing the cache, restarting the server (and even uttering some magic spells ... 🙂), and it just would not work when resolving via the root servers. However, when I turned off resolving on the Microsoft side and just told my AD DNS to forward to unbound on pfSense, everything worked. So in my case, it appears the Microsoft DNS server does not like something about the info returned for "www.cdc.gov", at least when it resolves it itself. It seems happy to serve up the reply to requesting clients when it gets it via forwarding to unbound.

                    I've also had other sporadic weirdness in the past with resolving using the Microsoft DNS server and DNSSEC (at least in the 2012 R2 variant I have). So I am turning off resolving on the Microsoft side and just switching over to let it forward to unbound on pfSense. I don't really "need" the AD setup, so I may unwind it at some point. The only real reason I've kept it around is the DFS feature supporting a shared data setup in my LAN.

                    • M
                      marshmallow @johnpoz
                      last edited by

                      @johnpoz

                      Not being a DNS expert myself, I wonder if this can shed some light on the issue:

                      https://community.cloudflare.com/t/cdc-gov-not-resolving/228798/3

                      • bmeeksB
                        bmeeks @marshmallow
                        last edited by bmeeks

                        Thanks for the link with the possible answer to the riddle. Strange that unbound does not seem particularly bothered by the DNS reply, but other DNS resolvers don't seem to like it. Based on the link you shared, it seems the root issue is with the CNAME record in their DNS and it's not a problem with anything on pfSense.

                        • johnpozJ
                          johnpoz LAYER 8 Global Moderator @bmeeks
                          last edited by johnpoz

                          @bmeeks said in Insanely weird issue with DNS resolution to www.cdc.gov:

                          it's not a problem with anything on pfSense.

                          Nothing to do with unbound, or pfSense.

                          There’s a subset of nameservers for akam.cdc.gov that doesn’t return keys https://dnsviz.net/d/www.cdc.gov/dnssec/ so if you’re unlucky it’s going to fail. I added another workaround so it should be better.

                          So let's state this once again: when you forward, you are at the mercy of where you forward.

                          This does not have, and never had, anything to do with pfSense or unbound. It is a Cloudflare problem, or to be honest a CDC problem with DNSSEC on some of their servers. But when you forward somewhere, it becomes that forwarder's problem.

                          If you cannot resolve a CNAME that something points to, whether you're asking for DNSSEC or not, then sure, you can have problems. If they have something wrong with their DNSSEC, you can quite often have more problems. This seems to be a group of name servers (NS) that are part of that whole process and are having issues. If you try to talk to those, you have problems; if whoever you forward to has issues talking to them, you could have problems too.

                          This is why it's always better to resolve: you can trace such problems yourself, rather than it being luck of the draw whether the place you forward to is having issues, which could just be a connectivity issue to some NS in the chain while they are resolving.

                          If you look at what they linked to:
                          https://dnsviz.net/d/www.cdc.gov/dnssec/

                          You can see that some of the NS are having issues, but not all of them, so it's going to be hit or miss. I have never seen the problem, probably because I'm not talking to those specific NS. They have several of them.

                          Look at all the NS for the domain the CNAME points to:

                          ;; QUESTION SECTION:
                          ;akam.cdc.gov.                  IN      NS
                          
                          ;; ANSWER SECTION:
                          akam.cdc.gov.           86393   IN      NS      a8-67.akam.net.
                          akam.cdc.gov.           86393   IN      NS      a5-66.akam.net.
                          akam.cdc.gov.           86393   IN      NS      a9-64.akam.net.
                          akam.cdc.gov.           86393   IN      NS      a1-43.akam.net.
                          akam.cdc.gov.           86393   IN      NS      a2-64.akam.net.
                          akam.cdc.gov.           86393   IN      NS      a28-65.akam.net.
                          

                          That is a huge CDN, which, depending on which region of the globe you're in, could even point to other name servers. If some of those have bad or old info, and those are the ones you're trying to talk to, then you could have issues.
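
To check the "hit or miss" idea, each listed name server can be asked directly instead of relying on whichever one a resolver happens to pick. The sketch below is a hypothetical helper, not something from the thread: it pulls the NS targets out of a dig answer section like the one above and builds one dig command per server. The helper names and the choice to ask for DNSKEY are assumptions; `+dnssec` and `+norecurse` are standard dig options.

```python
# Answer section copied from the dig output quoted above.
SAMPLE = """\
akam.cdc.gov.           86393   IN      NS      a8-67.akam.net.
akam.cdc.gov.           86393   IN      NS      a5-66.akam.net.
akam.cdc.gov.           86393   IN      NS      a9-64.akam.net.
akam.cdc.gov.           86393   IN      NS      a1-43.akam.net.
akam.cdc.gov.           86393   IN      NS      a2-64.akam.net.
akam.cdc.gov.           86393   IN      NS      a28-65.akam.net.
"""


def ns_targets(answer_section: str) -> list[str]:
    """Extract NS target hostnames from dig answer-section lines."""
    targets = []
    for line in answer_section.splitlines():
        fields = line.split()
        # Expected layout: owner, TTL, class, type, target.
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "NS":
            targets.append(fields[4].rstrip("."))
    return targets


def per_server_commands(answer_section: str, name: str, rtype: str = "DNSKEY") -> list[str]:
    """Build one non-recursive, DNSSEC-enabled dig command per name server."""
    return [
        f"dig +dnssec +norecurse @{ns} {name} {rtype}"
        for ns in ns_targets(answer_section)
    ]


if __name__ == "__main__":
    for command in per_server_commands(SAMPLE, "akam.cdc.gov"):
        print(command)
```

Running the printed commands one by one and comparing the answers would show whether only some of the servers return incomplete DNSSEC data.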

                          • M
                            mboylan @bmeeks
                            last edited by mboylan

                            @bmeeks What doesn't make a lot of sense in my case, though, is that my clients are using the pfSense box as their DNS server. pfSense is forwarding the query to Cloudflare, getting a response, and then somehow that response is not making it back to the clients. This seems different from your case, where once you told your Windows server to forward to unbound, it started working. I'm already doing that, and I get SERVFAIL. I'm happy to escalate to Cloudflare, but seeing as I can query the host from the pfSense box itself, as well as directly against Cloudflare using dig from my clients (but NOT when forwarding through unbound), I'm hard pressed to believe it's a Cloudflare issue. :-/

                            Edit: I can post the packet captures later today.

                            • bmeeksB
                              bmeeks @mboylan
                              last edited by

                              I agree your issue does not make sense. Are you 100% positive those clients are actually using unbound on pfSense? As I posted, in my case letting the AD DNS server forward to unbound on pfSense solved the issue. And I have unbound on pfSense resolving, not forwarding. I think in your case you have it forwarding to Cloudflare, if I recall correctly. But then you said that on pfSense itself unbound can resolve "www.cdc.gov". I assume that is with the Cloudflare forwarding in place?

                              • T
                                tman222
                                last edited by

                                Saw this thread last night and for kicks tried to go to www.cdc.gov, and the page would not load. Tried again this morning with a dig www.cdc.gov and it came back with SERVFAIL. This is using a Pi-hole / Unbound setup (i.e. clients talk to Pi-hole, Pi-hole forwards the DNS query to pfSense/Unbound if not cached, and Unbound then resolves if not already cached). Tried again this afternoon (a few hours ago) and now all is working fine (i.e. DNS resolves properly and the page loads fine). I made no changes on my end in the meantime.

                                I think @johnpoz might be on to something - perhaps the related name servers aren't or weren't properly configured and that causes issues. I do have DNSSEC enabled as well on Unbound - could that have been what was failing?

                                • johnpozJ
                                  johnpoz LAYER 8 Global Moderator
                                  last edited by johnpoz

                                  Just look at
                                  https://dnsviz.net/d/www.cdc.gov/dnssec/

                                  They have quite a few problems going on. It's not Cloudflare's job to fix it. It's the domain owner's job to make sure their DNS works correctly and is valid.

                                  I would contact the CDC webmaster and show them the dnsviz link above. Tell them to fix their shit.

                                  All kinds of stuff is wrong:

                                  net to edgekey.net: The following NS name(s) were found in the authoritative NS RRset, but not in the delegation NS RRset (i.e., in the net zone): a11-65.akam.net, ns1-2.akam.net, a9-65.akam.net, a3-65.akam.net
                                  net to edgekey.net: The following NS name(s) were found in the delegation NS RRset (i.e., in the net zone), but not in the authoritative NS RRset: ns1-66.akam.net, ns4-66.akam.net, ns5-66.akam.net, ns7-65.akam.net
                                  www.akam.cdc.gov/CNAME: The server returned CNAME for www.akam.cdc.gov, but records of other types exist at that name.
                                  

                                  That it resolves sometimes at all is just luck to be honest ;)

                                  They have issues way up the chain..

                                  gov to cdc.gov: The following NS name(s) were found in the authoritative NS RRset, but not in the delegation NS RRset (i.e., in the gov zone): icdc-us-ns1.cdc.gov, icdc-us-ns3.cdc.gov, icdc-us-ns2.cdc.gov
                                  gov to cdc.gov: The following NS name(s) were found in the delegation NS RRset (i.e., in the gov zone), but not in the authoritative NS RRset: auth00.ns.uu.net, auth100.ns.uu.net
                                  

                                  So again, it's all going to depend on which NS you're talking to, and what info they have or don't have.

                                  NS.png

                                  Sometimes it will work, sometimes it won't. The CDC is who should get this fixed.

                                  If a domain has issues with its DNSSEC and you forward to somewhere that does DNSSEC, like Cloudflare, then your own DNSSEC setting being on or off isn't going to do anything. It should be OFF if you forward. The place you forward to either does DNSSEC or it doesn't. There is no point in asking for DNSSEC when you forward. If you want DNSSEC when you forward, then pick a place to forward to that does DNSSEC. I have been over this countless times ;)

                                  Edit: Even asking Cloudflare you get different responses, depending, I assume, on which of their NS you hit via anycast.

                                  ;www.cdc.gov.                   IN      A
                                  
                                  ;; ANSWER SECTION:
                                  www.cdc.gov.            78      IN      CNAME   www.akam.cdc.gov.
                                  www.akam.cdc.gov.       3378    IN      CNAME   www.cdc.gov.edgekey.net.
                                  www.cdc.gov.edgekey.net. 20544  IN      CNAME   e9313.dscb.akamaiedge.net.
                                  e9313.dscb.akamaiedge.net. 20   IN      A       23.222.138.25
                                  
                                  ;; Query time: 15 msec
                                  ;; SERVER: 1.1.1.1#53(1.1.1.1)
                                  ;; WHEN: Tue Dec 29 06:17:04 Central Standard Time 2020
                                  ;; MSG SIZE  rcvd: 152
                                  
                                  
                                  A second later:
                                  
                                  ;www.cdc.gov.                   IN      A
                                  
                                  ;; ANSWER SECTION:
                                  www.cdc.gov.            76      IN      CNAME   www.akam.cdc.gov.
                                  www.akam.cdc.gov.       19      IN      A       23.222.138.25
                                  
                                  ;; Query time: 132 msec
                                  ;; SERVER: 1.1.1.1#53(1.1.1.1)
                                  ;; WHEN: Tue Dec 29 06:17:05 Central Standard Time 2020
                                  ;; MSG SIZE  rcvd: 79
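
To make the difference between those two answers easier to see, the CNAME chain in each answer section can be walked programmatically. This is a small sketch under stated assumptions: `cname_chain` is a made-up helper, it expects the five-column dig layout shown above, and it keeps only one record per owner name, which is enough for these samples but not for names with several A records.

```python
# Answer sections copied from the two dig outputs above.
FIRST = """\
www.cdc.gov.            78      IN      CNAME   www.akam.cdc.gov.
www.akam.cdc.gov.       3378    IN      CNAME   www.cdc.gov.edgekey.net.
www.cdc.gov.edgekey.net. 20544  IN      CNAME   e9313.dscb.akamaiedge.net.
e9313.dscb.akamaiedge.net. 20   IN      A       23.222.138.25
"""

SECOND = """\
www.cdc.gov.            76      IN      CNAME   www.akam.cdc.gov.
www.akam.cdc.gov.       19      IN      A       23.222.138.25
"""


def cname_chain(answer_section: str, start: str) -> list[str]:
    """Follow CNAME records from start until an A record or a dead end."""
    records = {}
    for line in answer_section.splitlines():
        fields = line.split()
        # Expected layout: owner, TTL, class, type, value.
        if len(fields) == 5 and fields[2] == "IN" and fields[3] in ("CNAME", "A"):
            records[fields[0].lower()] = (fields[3], fields[4])

    chain = [start]
    current = start.lower()
    seen = set()  # guards against CNAME loops
    while current in records and current not in seen:
        seen.add(current)
        rtype, value = records[current]
        chain.append(value)
        if rtype == "A":
            break
        current = value.lower()
    return chain


if __name__ == "__main__":
    print(" -> ".join(cname_chain(FIRST, "www.cdc.gov.")))
    print(" -> ".join(cname_chain(SECOND, "www.cdc.gov.")))
```

For the first answer the chain passes through edgekey.net and akamaiedge.net before reaching the address; for the second it goes from www.akam.cdc.gov straight to the same address.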
                                  

                                  The cdc really should fix up their shit ;)

                                  • timtraceT
                                    timtrace @johnpoz
                                    last edited by

                                    @johnpoz said in Insanely weird issue with DNS resolution to www.cdc.gov:

                                    The cdc really should fix up their shit ;)

                                    I’m experiencing this problem, also. When I disable DNSSEC the problem goes away and CDC.GOV loads.

                                    Can anything else be done as a workaround, which wouldn’t have as broad a scope as toggling DNSSEC?

                                    Thank you —

                                    • johnpozJ
                                      johnpoz LAYER 8 Global Moderator @timtrace
                                      last edited by johnpoz

                                      @timtrace said in Insanely weird issue with DNS resolution to www.cdc.gov:

                                      Can anything else be done as a workaround

                                      One way would be to do a domain override for cdc.gov to, say, 9.9.9.10, which is the Quad9 address that doesn't do DNSSEC, so that shouldn't fail. Any NS that doesn't do DNSSEC would work for the override.

                                      Another option should be to set unbound not to do DNSSEC for that domain, in the custom options box:

                                      server:
                                      domain-insecure: "cdc.gov"

                                      You would think they would have fixed their shit by now, to be honest. You might actually have to do it for the domains the CNAMEs point to as well, if you don't do the domain override forwarding to a non-DNSSEC NS.

                                      But it looks like they currently have just the one CNAME, www.akam.cdc.gov, so cdc.gov as the insecure domain should work.

                                      Worst case, you add the other domains as insecure as well:

                                      www.akam.cdc.gov.       3378    IN      CNAME   www.cdc.gov.edgekey.net.
                                      www.cdc.gov.edgekey.net. 20544  IN      CNAME   e9313.dscb.akamaiedge.net.
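
For that worst case, the custom options text could be generated from a list of zones rather than typed by hand. This is only a sketch: `domain_insecure_options` is a made-up helper, and using the parent zones edgekey.net and akamaiedge.net for the CNAME targets is an assumption, not something confirmed in the thread. Unbound does allow `domain-insecure` to be given more than once.

```python
def domain_insecure_options(domains: list[str]) -> str:
    """Render unbound custom options that disable DNSSEC validation per domain."""
    lines = ["server:"]
    for domain in domains:
        # Strip any trailing root dot so the names match the form used above.
        lines.append(f'domain-insecure: "{domain.rstrip(".")}"')
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed zone list: cdc.gov plus the parent zones of the CNAME targets.
    print(domain_insecure_options(["cdc.gov", "edgekey.net", "akamaiedge.net"]))
```

With that assumed list, the result is a `server:` line followed by one `domain-insecure` line per zone, ready to paste into the custom options box.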
                                      

                                      Whoever is in charge of their DNS should really be fired.

                                      sad.png

                                      • timtraceT
                                        timtrace @johnpoz
                                        last edited by

                                        @johnpoz said in Insanely weird issue with DNS resolution to www.cdc.gov:

                                        server:
                                        domain-insecure: "cdc.gov"

                                        Thanks, man! That worked perfectly.

                                        • M
                                          mboylan @johnpoz
                                          last edited by

                                          @johnpoz Thanks! This option fixed the issue immediately.

                                          • G
                                            gsmithe @johnpoz
                                            last edited by

                                            @johnpoz said in Insanely weird issue with DNS resolution to www.cdc.gov:

                                            Another option should be to set unbound not to do dnssec for that domain.. In the options box
                                            server:
                                            domain-insecure: "cdc.gov"

                                            Thank you! Worked for me, too.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.