Netgate Discussion Forum

    Unbound not advertising logincdn.msauth.net correctly to clients

    • G
      General Platypus
      last edited by General Platypus

      Hi guys, so I've been noticing an issue with ms authentication over the past few days. At first I thought it might have to do with 2.5.0, so I downgraded to 2.4.5-p1 ... factory reset, default config, no custom packages... but I'm still seeing the issue.

      [Screenshot: console output]

      It might reply once, but any subsequent queries all fail.

      The issue persists across multiple OSes (Windows, macOS, and Android), leading me to suspect that it is an Unbound-related issue.

      The odd part is, DNS Lookup on pfSense itself works perfectly fine every single time:

      [Screenshot: pfSense response]

      Which itself may or may not be correct. This is what I'm pulling via dnslookup.online:

      [Screenshot: dnslookup results]

      Unfortunately, this is beyond my level of expertise. I've flipped DNS providers, but seeing the same issue with 8.8.8.8 as well. That leads me to suspect the issue resides with how Unbound processes the request and passes it on to a client.

      I should also mention that I'm consistently seeing this issue with logincdn.msauth.net only. Other domains, like msn.com, respond 100% of the time.

      Is anyone else able to replicate this problem?
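For anyone trying to replicate this from a client, here is a minimal sketch that resolves a name several times through the system resolver and tallies the outcomes. The function names `probe` and `summarize` are made up for illustration, and the client OS may cache answers, so a run of successes does not prove Unbound answered every time.

```python
import socket
from collections import Counter


def probe(name, attempts=5, resolve=socket.getaddrinfo):
    """Resolve `name` repeatedly and record the outcome of each attempt."""
    outcomes = []
    for _ in range(attempts):
        try:
            infos = resolve(name, None)
            # Each getaddrinfo tuple ends with the socket address; item 0 is the IP.
            addresses = sorted({info[4][0] for info in infos})
            outcomes.append(("ok", addresses))
        except socket.gaierror as exc:
            outcomes.append(("fail", str(exc)))
    return outcomes


def summarize(outcomes):
    """Count how many attempts succeeded and how many failed."""
    return Counter(status for status, _ in outcomes)
```

Calling `summarize(probe("logincdn.msauth.net"))` and `summarize(probe("msn.com"))` side by side from an affected client should show whether only the first name fails after its first lookup.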

      • GertjanG
        Gertjan @General Platypus
        last edited by

        @general-platypus said in Unbound not advertising logincdn.msauth.net correctly to clients:

        Is anyone else able to replicate this problem?

        Nope.

        C:\Users\Gauche>nslookup logincdn.msauth.net
        Server:  pfsense.local.net
        Address:  2001:470:dead:beef:2::1
        
        Non-authoritative answer:
        Name:    cs1227.wpc.alphacdn.net
        Address:  192.229.221.185
        Aliases:  logincdn.msauth.net
                  lgincdn.trafficmanager.net
                  lgincdnvzeuno.azureedge.net
                  lgincdnvzeuno.ec.azureedge.net
        

        The ping test itself is rather useless.
        There is no law that says that a host has to reply to ping.

        Your unbound doesn't do much : it just forwards the requests to 1.1.1.1 and 1.0.0.1.

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • G
          General Platypus @Gertjan
          last edited by General Platypus

          @gertjan Correct, the server doesn't have to reply to ping, but if you see the error message, it says ping cannot find the host, which is a lookup issue.

          nslookup works the first time, but fails on any subsequent lookups. I understand that Unbound forwards requests to 1.1.1.1, but in this case, while pfSense itself is able to nslookup the domain 100% of the time, clients are not able to get it from Unbound, as indicated by "Server failed" error - an unusual error message from the DNS server which usually means something went horribly wrong.

          Considering this particular dns entry has a boat-load of cnames, I suspect the issue lies there. The Unbound pipeline to the client is choking and failing. Hopefully it's not something silly like a 256 byte buffer overflow?? The combined CNAME character count just happens to be 284. 🤔

          I'm genuinely surprised more people aren't seeing this, unless I am the only pfSense user out there who happens to use Outlook and Firefox. Chrome masks the issue because cache survives browser restarts.

          Here is a "hack" that's currently working for me:

          [Screenshot]

          Now that we've stripped out the cnames, nslookup logincdn.msauth.net works 100% of the time for clients.
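The screenshot of the hack is not preserved, so the following is only a guess at its shape: a host override that pins the name straight to an address record so clients never see the CNAME chain. The sketch renders the Unbound `local-data` line such an override would correspond to. The helper name `host_override` is hypothetical, and the address is simply the one from the nslookup output earlier in the thread; a pinned address goes stale whenever the CDN moves.

```python
import ipaddress


def host_override(name, address):
    """Render an Unbound local-data line pinning `name` to a single address."""
    ip = ipaddress.ip_address(address)  # raises ValueError on a malformed address
    rrtype = "A" if ip.version == 4 else "AAAA"
    fqdn = name if name.endswith(".") else name + "."
    return f'local-data: "{fqdn} {rrtype} {ip}"'


# Assumption: the address seen in the nslookup output above is still current.
print(host_override("logincdn.msauth.net", "192.229.221.185"))
# local-data: "logincdn.msauth.net. A 192.229.221.185"
```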

          That being said, I believe this should be properly investigated.

          • bmeeksB
            bmeeks
            last edited by bmeeks

            You may be seeing this issue in unbound that someone else here recently posted a link to: https://github.com/NLnetLabs/unbound/issues/132 (or perhaps an artifact of it).

            • G
              General Platypus
              last edited by General Platypus

              I'm seeing this in the log:

              [Screenshot]

              Particularly:

              [712:0] debug: return error response SERVFAIL
              [712:0] debug: request has exceeded the maximum number of query restarts with 9

              When I perform a lookup on this domain.
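To make the log lines concrete, here is a toy model of CNAME chasing with a restart cap. It is not Unbound's code: the cap of 8 is inferred only from the "with 9" in the log above, the names `resolve`, `make_chain`, and `ServFail` are invented, and the real resolver may count restarts for other reasons too.

```python
class ServFail(Exception):
    """Raised when the restart cap is exceeded, mirroring a SERVFAIL reply."""


def resolve(records, name, max_restarts=8):
    """Follow CNAMEs in `records` until an A record is reached.

    `records` maps a name to ("CNAME", target) or ("A", address).
    Returns (address, number_of_restarts).
    """
    restarts = 0
    while True:
        rrtype, value = records[name]
        if rrtype == "A":
            return value, restarts
        restarts += 1  # every CNAME hop restarts the query for the target
        if restarts > max_restarts:
            raise ServFail(
                f"exceeded the maximum number of query restarts with {restarts}"
            )
        name = value


def make_chain(cname_count, address="192.0.2.10"):
    """Build a synthetic chain hop0 -> hop1 -> ... ending in one A record."""
    records = {
        f"hop{i}.example.": ("CNAME", f"hop{i + 1}.example.")
        for i in range(cname_count)
    }
    records[f"hop{cname_count}.example."] = ("A", address)
    return records


print(resolve(make_chain(4), "hop0.example."))  # ('192.0.2.10', 4)
```

In this model a chain of 9 CNAMEs raises `ServFail` with the same wording as the log, while a chain of 8 still resolves.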

              @bmeeks said in Unbound not advertising logincdn.msauth.net correctly to clients:

              You may be seeing this issue in unbound that someone else here recently posted a link to: https://github.com/NLnetLabs/unbound/issues/132 (or perhaps an artifact of it).

              I think you're onto something. Looks like it is indeed being caused by the cname chasing. Might be related to this: https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1608638.html

              • F
                follysuperscript
                last edited by

                I'm having a hard time getting my service setup to produce a log entry as useful as @General-Platypus's, but I believe I've got the same issue.

                It started a couple of days ago.
                nslookup logincdn.msauth.net results in "can't find logincdn.msauth.net: Server failed" at the client. The "Diagnostics > DNS Lookup" tool does resolve it, though "Diagnostics > Ping" also fails.

                I've tried doing a domain override to 8.8.8.8 but I can't get that to work either.

                Any ideas on how to implement an Unbound-based fix? Even if it's a temporary hack. I just want to log in to Outlook again.

                • GertjanG
                  Gertjan @follysuperscript
                  last edited by Gertjan

                  @follysuperscript said in Unbound not advertising logincdn.msauth.net correctly to clients:

                  but I believe I've got the same issue.

                  Then stop forwarding to 1.1.1.1 (or 8.8.8.8 - or some other forwarder).
                  Just use unbound as it was meant to be used : as a resolver.
                  And it works well :

                  C:\Users\Gauche>nslookup logincdn.msauth.net
                  Server:  pfsense.local.net
                  Address:  2001:470:beef:5c0:2::1
                  
                  Non-authoritative answer:
                  Name:    cs1227.wpc.alphacdn.net
                  Address:  192.229.221.185
                  Aliases:  logincdn.msauth.net
                            lgincdn.trafficmanager.net
                            lgincdnvzeuno.azureedge.net
                            lgincdnvzeuno.ec.azureedge.net
                  

                  Remember : if the forwarder says "dunno" then unbound can't make it better.

                  • L
                    lukasz.s
                    last edited by

                    Hi guys

                    I have encountered the same problem as you.
                    The DNS Resolver, used as a resolver and not as a forwarder, has a problem advertising logincdn.msauth.net to clients, with the result that clients can't open the login.live.com page correctly.

                    If I ask

                    dig @1.1.1.1 logincdn.msauth.net
                    
                    ; QUESTION SECTION:
                    ;logincdn.msauth.net.		IN	A
                    ;; ANSWER SECTION:
                    logincdn.msauth.net.	285	IN	CNAME	lgincdn.trafficmanager.net.
                    lgincdn.trafficmanager.net. 15	IN	CNAME	lgincdnvzeuno.azureedge.net.
                    lgincdnvzeuno.azureedge.net. 1785 IN	CNAME	lgincdnvzeuno.ec.azureedge.net.
                    lgincdnvzeuno.ec.azureedge.net.	3585 IN	CNAME	cs1227.wpc.alphacdn.net.
                    cs1227.wpc.alphacdn.net. 3585	IN	A	192.229.221.185
                    
                    ;; Query time: 53 msec
                    ;; SERVER: 1.1.1.1#53(1.1.1.1)
                    ;; WHEN: Thu Jun 09 14:44:29 CEST 2022
                    ;; MSG SIZE  rcvd: 204
                    
                    

                    I get a correct answer, but when I ask

                    dig @my_pfsense_dns_resolver_ip logincdn.msauth.net
                    
                    ;; QUESTION SECTION:
                    ;logincdn.msauth.net.		IN	A
                    
                    ;; Query time: 323 msec
                    ;; SERVER: 192.168.0.10#53(192.168.0.10)
                    ;; WHEN: Thu Jun 09 14:44:56 CEST 2022
                    ;; MSG SIZE  rcvd: 48
                    
                    

                    I get an empty response.
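When comparing pasted dig output like the two runs above, a small parser can make the difference explicit. This is a sketch only; the name `answer_records` is made up. An empty list just means no ANSWER SECTION was present, and the `status:` field in dig's header (trimmed from the paste above) is what would distinguish SERVFAIL from an empty NOERROR reply.

```python
def answer_records(dig_output):
    """Return the records in the ANSWER SECTION of dig text, split into fields."""
    records = []
    in_answer = False
    for line in dig_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(";; ANSWER SECTION"):
            in_answer = True
            continue
        if in_answer:
            # The section ends at a blank line or at the next ';' comment line.
            if not stripped or stripped.startswith(";"):
                break
            records.append(stripped.split())
    return records


sample = """;; QUESTION SECTION:
;logincdn.msauth.net.  IN  A

;; Query time: 323 msec
"""
print(answer_records(sample))  # []
```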

                    It's strange because other queries work OK.
                    I have double-checked any blockers, IDS, firewall and so on...

                    Is it a known bug or something?

                    pfSense version 22.01-RELEASE (amd64)
                    Netgate 7100

                    Regards

                    • johnpozJ
                      johnpoz LAYER 8 Global Moderator @lukasz.s
                      last edited by johnpoz

                      @lukasz-s said in Unbound not advertising logincdn.msauth.net correctly to clients:

                      logincdn.msauth.net

                      wow 8 freaking cnames - who and the F does their dns???

                      ;logincdn.msauth.net.           IN      A
                      
                      ;; ANSWER SECTION:
                      logincdn.msauth.net.    3600    IN      CNAME   lgincdn.trafficmanager.net.
                      lgincdn.trafficmanager.net. 3600 IN     CNAME   lgincdnmsftuswe2.azureedge.net.
                      lgincdnmsftuswe2.azureedge.net. 3600 IN CNAME   lgincdnmsftuswe2.afd.azureedge.net.
                      lgincdnmsftuswe2.afd.azureedge.net. 3600 IN CNAME firstparty-azurefd-prod.trafficmanager.net.
                      firstparty-azurefd-prod.trafficmanager.net. 3600 IN CNAME dual.part-0023.t-0009.t-msedge.net.
                      dual.part-0023.t-0009.t-msedge.net. 3600 IN CNAME global-entry-afdthirdparty-fallback.trafficmanager.net.
                      global-entry-afdthirdparty-fallback.trafficmanager.net. 3600 IN CNAME dual.part-0023.t-0009.fbs1-t-msedge.net.
                      dual.part-0023.t-0009.fbs1-t-msedge.net. 3600 IN CNAME part-0023.t-0009.fbs1-t-msedge.net.
                      part-0023.t-0009.fbs1-t-msedge.net. 3600 IN A   13.107.219.51
                      part-0023.t-0009.fbs1-t-msedge.net. 3600 IN A   13.107.227.51
                      
                      ;; Query time: 390 msec
                      

                      I haven't chased the whole chain, but the first link in the chain has a 5 min TTL as well - wow, that is going to cause some unnecessary queries, that is for sure.. I have unbound set to a min TTL of 3600 (1 hour).. because so many places use unrealistic TTL values that are so freaking low.

                      logincdn.msauth.net.    300     IN      CNAME   lgincdn.trafficmanager.net.
                      

                      2nd cname - 30 seconds - jfc people! no wonder it's problematic!

                      lgincdn.trafficmanager.net. 30  IN      CNAME   lgincdnmsftuswe2.azureedge.net.
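As a rough illustration of why the short TTLs matter, the sketch below takes the TTLs from the dig @1.1.1.1 output earlier in the thread and shows that the chain as a whole stays cached only as long as its shortest-lived link, and what a resolver-side minimum does to that. The function names are made up; the floor corresponds to the "Minimum TTL for RRsets and Messages" setting (Unbound's `cache-min-ttl`), and the values are remaining TTLs as seen from a public cache, not the authoritative ones.

```python
def effective_ttls(chain, cache_min_ttl=0):
    """Apply a resolver-side minimum TTL to each (name, ttl) pair."""
    return [(name, max(ttl, cache_min_ttl)) for name, ttl in chain]


def chain_lifetime(chain):
    """Seconds until the first link of the chain expires from the cache."""
    return min(ttl for _, ttl in chain)


# Remaining TTLs copied from the dig @1.1.1.1 answer quoted in this thread.
CHAIN = [
    ("logincdn.msauth.net.", 285),
    ("lgincdn.trafficmanager.net.", 15),
    ("lgincdnvzeuno.azureedge.net.", 1785),
    ("lgincdnvzeuno.ec.azureedge.net.", 3585),
    ("cs1227.wpc.alphacdn.net.", 3585),
]

print(chain_lifetime(CHAIN))                        # 15
print(chain_lifetime(effective_ttls(CHAIN, 3600)))  # 3600
```

With no floor, part of the chain has to be re-chased within seconds; with a one-hour floor the whole chain is answered from cache for an hour, at the cost of possibly serving an address the CDN has already moved away from.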
                      

                      An intelligent man is sometimes forced to be drunk to spend time with his fools
                      If you get confused: Listen to the Music Play
                      Please don't Chat/PM me for help, unless mod related
                      SG-4860 24.11 | Lab VMs 2.8, 24.11

                      • L
                        lukasz.s @johnpoz
                        last edited by

                        @johnpoz so do you suggest setting "Minimum TTL for RRsets and Messages" to more than the default 0?

                        ".... I have unbound set to min ttl of 3600 (1 hour).."

                        btw. today this domain resolves ok

                        • johnpozJ
                          johnpoz LAYER 8 Global Moderator @lukasz.s
                          last edited by

                          @lukasz-s here is the thing - back in the day, you really should never have messed with changing something's TTL.. But that was back when they used realistic TTLs; the only time you would lower them to something very short was when you were getting ready for a change..

                          You would lower the ttl the closer you got to the change, you would then change the IP of the record. After you were sure everything was working, and new IP was good you would then raise the ttl back up to something normal.

                          These days they love to set them to shit like 30 freaking seconds.. Or 5 minutes - why? They like to drive up the number of queries and do something with tracking, if you ask me..

                          I set my min to 1 hour, and I also serve 0.. Have not run into anything there that has caused me any issues in accessing anything..
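A small model of what "serve 0" could mean here, assuming it refers to the resolver's Serve Expired option (Unbound's `serve-expired`): an expired cache entry is still handed to the client, with a TTL of 0, while the resolver refreshes it. The function name and the tuple layout are invented for illustration, and newer Unbound versions may reply with a different TTL for stale data.

```python
def answer_from_cache(entry, now, serve_expired=False):
    """Return (address, ttl) from a cache entry, or None on a cache miss.

    `entry` is (address, expires_at), with times in seconds.
    """
    address, expires_at = entry
    remaining = int(expires_at - now)
    if remaining > 0:
        return address, remaining
    if serve_expired:
        # Stale data is served with TTL 0 so clients do not cache it further.
        return address, 0
    return None


entry = ("192.229.221.185", 1000)
print(answer_from_cache(entry, now=400))                       # ('192.229.221.185', 600)
print(answer_from_cache(entry, now=1200))                      # None
print(answer_from_cache(entry, now=1200, serve_expired=True))  # ('192.229.221.185', 0)
```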

                          In a sane world no I wouldn't suggest messing with the ttl - but these places are insane - 30 second freaking ttl..

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.