Netgate Discussion Forum

    Major DNS Bug 23.01 with Quad9 on SSL

    General pfSense Questions

      Cylosoft @nononono

      @nononono said in Major DNS Bug 23.01 with Quad9 on SSL:

      @steveits the only reliable fix has been disabling SSL - so far I have tested:

      • DNSSEC enabled, SSL/TLS disabled: runs without failure
      • DNSSEC enabled, SSL/TLS enabled: completely unusable
      • DNSSEC disabled, SSL/TLS enabled: intermittent crashes with nothing useful in the logs; restarting unbound gets it working again, but it periodically fails and stops serving any DNS responses
      • DNSSEC disabled, SSL/TLS disabled: runs without failure

      Running pfsense on a Netgate 7100

      We have been working through this on about a dozen pfSense boxes, and I've found the same thing. Quad9 says not to use DNSSEC, so only the last two configurations are viable.

      We switched half of them to Cloudflare DNS with TLS. The other half have TLS disabled but are still on Quad9. It looks like the switch to Cloudflare is going to be the temporary fix. Cloudflare has malware-filtering options that are pretty similar to Quad9's.

        SteveITS Galactic Empire @cmcdonald

        @cmcdonald If it helps, IIRC:

        • no issues for the first couple hours or so after install
        • suddenly random issues connecting to I think multiple sites on my phone
        • LinkedIn app wasn't loading data or images
        • went to my PC, linkedin.com wouldn't resolve
        • no issues since disabling DNSSEC on Feb. 18

        Obviously Quad9 thinks it's a potential issue since they recommend disabling it. But it's probably been enabled on this router since Plus 21.x sometime. I had a fairly early 2100. (for clarity, I did have to reinstall 22.05 due to the EFI bug, restore config, and updated to 23.01)

        @nononono Hmmm, I just checked and do have "Use SSL/TLS for outgoing DNS Queries to Forwarding Servers" enabled. If you have unbound crashing/stopping, that's not what I was experiencing then.

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

          p1erre @nononono

          @nononono as mentioned in the docs here, you have to disable DNSSEC for DNS over TLS in forwarding mode. I've had the same issue as you (with a different forwarding server), but disabling DNSSEC helped me. DNSSEC enabled with a forwarding server was working in 22.05 and earlier versions.

            Cylosoft @p1erre

            @p1erre said in Major DNS Bug 23.01 with Quad9 on SSL:

            @nononono as mentioned in the docs here, you have to disable DNSSEC for DNS over TLS in forwarding mode. I've had the same issue as you (with a different forwarding server), but disabling DNSSEC helped me. DNSSEC enabled with a forwarding server was working in 22.05 and earlier versions.

            I had hoped that was the fix also, but with Quad9 we saw on every device that at some point within 24 hours they would stop returning results. The DNS server was up and would respond; it just wouldn't look things up anymore.

            We are about 48 hours into half a dozen devices on Quad9 with TLS off and no issues. Another half dozen are having no issues with Cloudflare DNS and TLS on.

              Cylosoft @cmcdonald

              @cmcdonald said in Major DNS Bug 23.01 with Quad9 on SSL:

              I am working on replicating this in order to investigate. My usual setup is to just let Unbound do the recursive resolving for me without forwarding to another service.

              I hope you can sort it out. It's going to be tough to duplicate. It happens somewhere between 18 and 24 hours of uptime. The resolver responds, but it doesn't return results unless they are cached. Sometimes it would just start working again on its own. I haven't found a great way to test other than waiting for end users to complain.
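
Since the symptom described above is that cached names keep resolving while new lookups fail, one possible way to test without waiting for user complaints is to periodically send the resolver a query for a name it is unlikely to have cached. The following is only a rough sketch using the Python standard library. The helper names, the `example.com` base domain, and the resolver address in the usage note are assumptions for illustration, not anything from pfSense or unbound.

```python
import random
import socket
import string
import struct


def build_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def response_rcode(packet: bytes) -> int:
    """Return the RCODE (low 4 bits of the flags field) of a DNS response."""
    (flags,) = struct.unpack("!H", packet[2:4])
    return flags & 0x000F


def uncached_name(base: str = "example.com") -> str:
    """Prefix a random label so a cache hit is unlikely (hypothetical helper)."""
    label = "".join(random.choices(string.ascii_lowercase, k=12))
    return f"{label}.{base}"


def probe(resolver_ip: str, timeout: float = 3.0) -> str:
    """Send one query for a random name and classify the outcome."""
    packet = build_query(uncached_name(), random.randint(0, 0xFFFF))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (resolver_ip, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
    rcode = response_rcode(reply)
    # 0 = NOERROR and 3 = NXDOMAIN both mean the upstream lookup completed.
    return "ok" if rcode in (0, 3) else f"rcode={rcode}"
```

Calling something like `probe("192.168.1.1")` (substitute your firewall's LAN address) from a cron job or a loop and logging the result with a timestamp could help narrow down when the failures start. A random label will normally come back as NXDOMAIN, which still shows that the forwarder answered; a resolver with aggressive NSEC caching enabled might answer some of these from cache, so treat the result as a hint rather than proof.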

                bigsy @nononono

                @nononono Same issue here on a NG2100 running 23.01.

                Running fine with DNSSEC disabled & SSL/TLS enabled on 22.05 but I've had to disable SSL/TLS on 23.01 to avoid intermittent DNS failures with Quad9.

                This is on IPv4/IPv6 with the patch applied for redmine #13851.

                  jimp Rebel Alliance Developer Netgate

                  Check the output of sockstat | grep unbound when it works and when it doesn't.

                  I thought they fixed it but a while back unbound had an issue where it couldn't reuse SSL connections on the same open sockets so in some cases they kept piling up.

                  EDIT: This is what I was thinking of, but it's been fixed/closed for a couple years now: https://github.com/NLnetLabs/unbound/issues/47
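
To make the working/not-working comparison easier, the sockstat output can be summarized rather than compared by eye. This is a minimal sketch that assumes the usual FreeBSD sockstat column order (USER, COMMAND, PID, FD, PROTO, LOCAL ADDRESS, FOREIGN ADDRESS) and that DNS over TLS upstreams are reached on port 853. The function name and the sample text are made up for illustration; the sample is not output from a real system.

```python
from collections import Counter


def count_unbound_tls_sockets(sockstat_output: str) -> Counter:
    """Count unbound TCP sockets per upstream whose foreign port is 853."""
    counts = Counter()
    for line in sockstat_output.splitlines():
        fields = line.split()
        # Expected columns: USER COMMAND PID FD PROTO LOCAL FOREIGN
        if len(fields) < 7 or fields[1] != "unbound":
            continue
        proto, foreign = fields[4], fields[6]
        if proto.startswith("tcp") and foreign.endswith(":853"):
            counts[foreign] += 1
    return counts


# Illustrative sample only, not captured from a real system.
SAMPLE = """\
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
unbound  unbound    43210 3  udp4   *:53                  *:*
unbound  unbound    43210 4  tcp4   *:53                  *:*
unbound  unbound    43210 20 tcp4   203.0.113.5:51234     9.9.9.9:853
unbound  unbound    43210 21 tcp4   203.0.113.5:51240     9.9.9.9:853
root     sshd       900   3  tcp4   *:22                  *:*
"""

print(count_unbound_tls_sockets(SAMPLE))
```

Saving the output of sockstat | grep unbound to a file while DNS is healthy and again while it is failing, then running both through this function, would show whether connections to the forwarders are piling up.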

                  Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                  Need help fast? Netgate Global Support!

                  Do not Chat/PM for help!

                    Isotope1842

                    This bug is still a problem, for Cloudflare DNS users as well. DNS stopped working on all clients immediately after updating to 23.01 this morning (Mar 15, 2023).

                    After finding this thread, I UNchecked only the following setting in the DNS Resolver settings, and DNS started working again:

                    [Screenshot of the DNS Resolver setting; image not available]

                    This is clearly a bug because the current pfSense documentation itself advises that this setting be checked:

                    https://docs.netgate.com/pfsense/en/latest/recipes/dns-over-tls.html

                    [Screenshot of the linked documentation page; image not available]

                      jimp Rebel Alliance Developer Netgate

                      That document is about configuring that specific feature, it's not "advising" that setting be checked in a general fashion for everyone.

                      It's working for many people and breaking for a few, but it's still not clear why.

                        Isotope1842 @jimp

                        @jimp Bullet number four of the previously included screenshot explicitly directs the user to check that setting. The pfSense documentation is actually more than advising. It is directing.

                          jimp Rebel Alliance Developer Netgate @Isotope1842

                          @isotope1842 said in Major DNS Bug 23.01 with Quad9 on SSL:

                          @jimp Bullet number four of the previously included screenshot explicitly directs the user to check that setting. The pfSense documentation is actually more than advising. It is directing.

                          The page you quoted is a document about configuring DNS over TLS -- it's saying to check that if you want DNS over TLS. That's what that entire document is for.

                          It's not a part of the general setup or DNS resolver docs and so on. It's a recipe for users who are interested in that feature and want to know how to set it up.

                          There is nothing saying users should be following all of those recipes, they're there for reference for things users may want to do.

                            Isotope1842 @jimp

                            @jimp The context in which the instructions are provided is exactly for users who want to use an upstream DNS over TLS provider. The original poster here was reporting a bug in using Quad9 and I reported the same bug when using Cloudflare.

                            See the top of the same linked documentation:

                            "Pick a DNS over TLS upstream provider, such as a private upstream DNS server or a public service like Cloudflare, Quad9, or Google public DNS."

                            Following the instructions prior to 23.01 worked. Immediately after upgrading to 23.01, DNS fails until unchecking a single setting.

                              jimp Rebel Alliance Developer Netgate

                              Yes, and? They still work for that and for many providers and users who want to enable that feature.

                              But you're trying to imply this is something the docs have told everyone they should be doing which isn't true. They don't advise everyone to do it, just people who are interested in that feature.

                              But none of this is helpful. We still need more information about how and why it's failing. We have yet to be able to reproduce this in a lab environment, and there are plenty of us running DNS over TLS without problems.

                                Isotope1842 @jimp

                                @jimp Steps to reproduce the problem:

                                1. Install pfSense 22.05 on a Netgate device.
                                2. Configure DNS over TLS with Cloudflare.
                                3. Upgrade the Netgate device to pfSense 23.01.
                                4. Observe broken DNS for downstream clients.

                                  jimp Rebel Alliance Developer Netgate @Isotope1842

                                  @isotope1842 said in Major DNS Bug 23.01 with Quad9 on SSL:

                                  @jimp Steps to reproduce the problem:

                                  1. Install pfSense 22.05 on a Netgate device.
                                  2. Configure DNS over TLS with Cloudflare.
                                  3. Upgrade the Netgate device to pfSense 23.01.
                                  4. Observe broken DNS for downstream clients.

                                  It is not that simple.

                                  I have multiple lab VMs using DNS over TLS to Cloudflare and Quad9 that successfully resolve and have no problems.

                                    stephenw10 Netgate Administrator

                                    Mmm, I assume you are seeing the same intermittent behaviour as other users? It's not failing for every query with that configuration?

                                      Isotope1842 @jimp

                                      @jimp I just re-checked that single setting. DNS appears to continue to work. Curious to see whether it starts to fail again at some point.

                                        jimp Rebel Alliance Developer Netgate @Isotope1842

                                        @isotope1842 said in Major DNS Bug 23.01 with Quad9 on SSL:

                                        @jimp I just re-checked that single setting. DNS appears to continue to work. Curious to see whether it starts to fail again at some point.

                                        Since you hit it once, it's likely to fail again at some point, but nobody has yet been able to pinpoint exactly when or why it happens.

                                        I've been periodically checking my lab systems and they all just keep resolving no matter what I do. But they are lab systems so the load is considerably lower than it would be in a live environment.

                                          SteveITS Galactic Empire @Isotope1842

                                          @isotope1842 There are a few threads on this topic, or variations thereof, and in another one someone posted their problem seemed likely to happen when opening a group/folder of bookmarks/favorites at once...implying a higher number of simultaneous requests might trigger it.
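
If a burst of simultaneous requests really is the trigger, something like the following might approximate the "open a folder of bookmarks" case from a client machine. It is only a sketch: `burst_lookup` and `resolve` are hypothetical helper names, the lookups go through whatever resolver the client is configured to use, and the client OS may answer some names from its own cache.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(hostname: str) -> bool:
    """Return True if the system resolver returns at least one address."""
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except socket.gaierror:
        return False


def burst_lookup(hostnames, lookup=resolve, workers=32):
    """Resolve all hostnames concurrently; return the names that failed."""
    hostnames = list(hostnames)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the input names.
        results = list(pool.map(lookup, hostnames))
    return [name for name, ok in zip(hostnames, results) if not ok]
```

Calling `burst_lookup` with a list of a few dozen hostnames your clients actually visit, and repeating it every so often, would show whether failures cluster around bursts. The `lookup` parameter is there so a different resolver function can be swapped in.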

                                          I was also unable to replicate my issue by simply (re)checking the DNSSEC option, but I left it off as recommended.

                                            nononono

                                            After experimenting a lot more, it looks like this might be an issue with Quad9's DNS over TLS service limiting responses more than anything.

                                            While there is nothing helpful in the pfSense logs, Quad9 just appears to stop replying and then start responding again, almost as if Quad9 is imposing a limit on requests. But of course pfSense must play some role, as this never occurred before 23.01.

                                            This is a very high-traffic network, so maybe others who see the same thing manage high-traffic networks as well. Either way, the only long-term solution has been doing DNS over TLS through Cloudflare.

                                            As another point: DHCP lease registration is definitely not fixed in 23.01 as claimed. Unbound still restarts too often to consider enabling it, so at most I am still only registering the clients that need it via static mapping.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.