Netgate Discussion Forum

    Frequent DNS timeouts

    pfBlockerNG
    86 Posts 11 Posters 36.1k Views
      JonH @Gertjan

      @gertjan said in Frequent DNS timeouts:

      So : take a first step : list all the MAC addresses, and give them all names that you understand.

      I'm missing something here.

      I have the same issues as OP & @thundergate. I'd call it 'hangs', not timeouts.
      I've read & read thread after thread. I don't have dnssec set. I don't have dhcp leases registered. My resolver simply hangs for around 5-6 minutes and then starts working.

      My dns is Lo, 9.9.9.9, 149.112.112.112
      I've tried switching it to 8.8.8.8 and 1.1.1.1, and openDNS. None of them work any better.

      My current solution is a cron job in pfSense to restart unbound every 30 min. This is NOT what I want to do, it is a convenience for me. However, it still will hang once in a great while in between my 30 min restarts.

      I want to stress that my problem is a hang, not a restart. The upthread suggestion to use grep to search for restart isn't worthwhile in a log that rolls over in an hour or two. I've checked in the past and don't recall a restart in the log.

      I've tried jacking up the log level in Unbound but that has not given me any hints because I'm not savvy in reading all this stuff even with google helping with some of it. If I go up above level 3 it stops logging (I guess it needs to start in a different mode?). What I see is it starts getting a lot of servfail (not sure, but believe this is logged for dnsbl entries also).

      This 'problem' with unbound seems to be experienced by very few users on this forum, but it is experienced, and so far I have not seen a solution for a user who is correctly using the resolver & TLS and is not using DNSSEC. They are also using python mode and pfBlockerNG. I did see one reference that pfBlocker had a patch, but the package manager still has the same version I'm using, so I don't know any more about that or whether it is for this issue.

      I would like a better explanation of the @Gertjan quote at the head of this reply because I don't know or misunderstand what he is saying and how to accomplish it. Is this referring to, on Apple, "clientID"? My arp table has a few "host names" (mostly IoT devices) and I do not see a way to get Apple devices showing this info (probably because Apple goes to great effort to hide itself). Or is it simply a list of MAC addresses that can be referred to in order to ID entries?

      Before going to v23.01 and switching to the latest pfBlockerNG I did not have these issues.

        SteveITS Galactic Empire @JonH

        @jonh I think the context of Gertjan's message was to help that poster to assign static IPs, in order to avoid DHCP registrations, which does restart Unbound.

        There are many threads, as you said. In another thread someone suggested/speculated Quad9 was rate limiting or just not answering if there were a lot of DoT requests. At home I have had DoT with Quad9 on 23.01 enabled for a few weeks now, but volume is not high. I haven't tried turning DoT on at other places, but we've had no reports of DNS issues with it off. I did have to turn off DNSSEC on 23.01, which Quad9 themselves say will cause problems when forwarding. Others have posted that disabling DNSSEC didn't help but disabling DoT did.

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

          JonH @SteveITS

          @steveits said in Frequent DNS timeouts:

          At home I have had DoT with Quad9 on 23.01 enabled for a few weeks now, but volume is not high.

          That's interesting. My DoT subnet has only 3 states set with 12MiB with an uptime of < 2 days.
          I'm not sure that is a good indicator but it does seem maybe it is high. But again, this was not happening prior to my upgrade. As for Quad9 rate limiting, if that is the reason then it implies the other 3 DNS/TLS servers I tried also may be rate limiting. I have not seen that in my logs when I was looking for a common cause for this problem.

          I want to be able to access my DoT via Apple's Homekit while away from home.

          Here is a snippet from resolver.log for a typical 'hang'

          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 p39-imap.mail.me.com.akadns.net. AAAA IN SERVFAIL 0.000000 0 49
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving p39-imap.mail.me.com.akadns.net. A IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 p39-imap.mail.me.com.akadns.net. A IN SERVFAIL 0.000000 0 49
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving jimap.imap.mail.yahoo.com. AAAA IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 jimap.imap.mail.yahoo.com. AAAA IN SERVFAIL 0.000000 0 43
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving jimap.imap.mail.yahoo.com. A IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 jimap.imap.mail.yahoo.com. A IN SERVFAIL 0.000000 0 43
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving www.chevybolt.org. HTTPS IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 www.chevybolt.org. HTTPS IN SERVFAIL 0.000000 0 35
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving www.chevybolt.org. A IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 www.chevybolt.org. A IN SERVFAIL 0.000000 0 35
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving config.htplayground.com. HTTPS IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 config.htplayground.com. HTTPS IN SERVFAIL 0.000000 0 41
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: resolving config.htplayground.com. A IN
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 config.htplayground.com. A IN SERVFAIL 0.000000 0 41
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 p39-imap.mail.me.com.akadns.net. AAAA IN SERVFAIL 0.000000 1 49
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 jimap.imap.mail.yahoo.com. AAAA IN SERVFAIL 0.000000 1 43
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 p39-imap.mail.me.com.akadns.net. A IN SERVFAIL 0.000000 1 49
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.143 jimap.imap.mail.yahoo.com. A IN SERVFAIL 0.000000 1 43
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 www.chevybolt.org. HTTPS IN SERVFAIL 0.000000 1 35
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 www.chevybolt.org. A IN SERVFAIL 0.000000 1 35
          Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 config.htplayground.com. HTTPS IN SERVFAIL 0.000000 1 41
          

          This particular hang lasted 2.5 minutes before I manually restarted unbound.
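
To put numbers on a hang like this one, the reply lines in resolver.log can be tallied per minute and per client. The sketch below assumes the line layout shown in the snippet above (client IP, name, type, class, rcode, then further fields); the regex and the function name `servfail_by_minute` are illustrative and not part of pfSense or Unbound.

```python
import re
from collections import Counter

# Matches reply-log lines such as:
#   Mar 17 11:06:59 pfSense unbound[77573]: [77573:1] info: 192.168.10.7 www.chevybolt.org. A IN SERVFAIL 0.000000 0 35
# "resolving ..." lines do not match because they carry no client IP.
REPLY_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+unbound\[\d+\]:\s+"
    r"\[\d+:\d+\]\s+info:\s+(?P<client>\d+\.\d+\.\d+\.\d+)\s+(?P<name>\S+)\s+"
    r"(?P<qtype>\S+)\s+IN\s+(?P<rcode>[A-Z]+)\s"
)


def servfail_by_minute(lines):
    """Count SERVFAIL replies keyed by (minute, client IP)."""
    counts = Counter()
    for line in lines:
        match = REPLY_RE.match(line.strip())
        if match and match.group("rcode") == "SERVFAIL":
            counts[(match.group("minute"), match.group("client"))] += 1
    return counts
```

Feeding it the lines of the resolver log (for example `servfail_by_minute(open(path))`, with `path` pointing at a saved copy of the log) gives a table that shows when the SERVFAIL burst starts and which clients are affected, which is easier to compare across hangs than raw log text.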

          Here is the final sequence during a restart of unbound:

          Mar 17 11:07:21 pfSense unbound[77573]: [77573:0] info: [pfBlockerNG]: pfb_unbound.py script exiting
          Mar 17 11:08:12 pfSense unbound[35852]: [35852:0] notice: init module 0: python
          Mar 17 11:08:12 pfSense unbound[35852]: [35852:0] info: [pfBlockerNG]: pfb_unbound.py script loaded
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: [pfBlockerNG]: init_standard script loaded
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] notice: init module 1: iterator
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: start of service (unbound 1.17.1).
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving ocsp2.apple.com. HTTPS IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving ocsp2.apple.com. A IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving amp-api-edge.apps.apple.com. HTTPS IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving ocsp2.apple.com. AAAA IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving amp-api-edge.apps.apple.com. AAAA IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving amp-api-edge.apps.apple.com. A IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: response for amp-api-edge.apps.apple.com. A IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: reply from <.> 149.112.112.112#853
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: query response was CNAME
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving amp-api-edge.apps.apple.com. A IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: response for ocsp2.apple.com. AAAA IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: reply from <.> 9.9.9.9#853
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: query response was CNAME
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: resolving ocsp2.apple.com. AAAA IN
          Mar 17 11:08:14 pfSense unbound[35852]: [35852:0] info: response for ocsp2.apple.com. HTTPS IN
          

          And after the restart I'm back up and running.
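
The restart sequence above also shows how long the resolver is unavailable during a restart: the time between the "script exiting" line and the next "start of service" line. A minimal sketch for measuring that gap follows. It assumes pfBlockerNG python mode is active (otherwise the exit marker is not logged) and that the log year is 2023, since syslog lines carry no year; `restart_gaps` is a hypothetical helper name.

```python
from datetime import datetime

EXIT_MARK = "pfb_unbound.py script exiting"   # logged by pfBlockerNG python mode
START_MARK = "start of service"               # logged by unbound once it is ready


def _stamp(line, year):
    # The first 15 characters of a syslog line are e.g. "Mar 17 11:07:21".
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def restart_gaps(lines, year=2023):
    """Return (exit_time, start_time, seconds_down) for each restart found."""
    gaps = []
    exited_at = None
    for raw in lines:
        line = raw.strip()
        if EXIT_MARK in line:
            exited_at = _stamp(line, year)
        elif START_MARK in line and exited_at is not None:
            started_at = _stamp(line, year)
            gaps.append((exited_at, started_at,
                         (started_at - exited_at).total_seconds()))
            exited_at = None
    return gaps
```

For the sequence above this reports one restart with 53 seconds between exit (11:07:21) and start of service (11:08:14). Note that it only measures restarts; a hang without a restart leaves no such markers.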

            SteveITS Galactic Empire @JonH

            @jonh said in Frequent DNS timeouts:

            As for Quad9 rate limiting, if that is the reason then it implies the other 3 DNS/TLS servers I tried also may be rate limiting

            It occurs to me that if (if) it is rate limiting on the remote end, restarting Unbound probably wouldn't fix it. However if connections are being held open for some reason (e.g. rate limiting? bug?) and Unbound stops connecting out, or gets connection refusals, that could explain both the "self recover" and "restart to recover" behavior...?

              JonH @SteveITS

              @steveits said in Frequent DNS timeouts:

              It occurs to me that if (if) it is rate limiting on the remote end, restarting Unbound probably wouldn't fix it.

              That's a good point.

              I forgot to mention that I am using an SG5100, it should be robust enough for my little home use.

              I also realize there are many folks here with different skill levels, and I probably am at the lower end of that range. There is always the possibility that I have a mis-configuration problem although it is essentially the same as it has been through many releases of pfSense. The major change seems to be (IMO) the implementation of Python.

              I'll look through some of my older log captures and see if I can find a log that shows the beginning of these 'hangs' I and others have experienced. Maybe the gurus here will see something obvious.

              In the meantime I'll just keep running that cron job and restarting unbound every 30 min.

                w0w @JonH

                @jonh
                I have two different firewalls based on different hardware, not Netgate. I have run into this problem several times. I don't know exactly what the reason was, because it stopped doing it after some time and maybe some settings changes. I have pfblockerNG-devel 3.2.0_3 and 23.01 with current patches added via system patches package, also DNS block enabled in pfBlocker and I use forwarding to google and cloudflare TLS
                Here are my settings for resolver:
                c8c020b5-cafe-489f-ba97-540deee733fd-image.png
                bbed2b59-a8a1-46ad-b691-e75598e68d14-image.png
                5b811e7b-2f6d-4044-b728-79b3b070d301-image.png
                6f9d7d32-1c6d-4997-a3e2-52739fabed67-image.png

                  JonH @w0w

                  @w0w said in Frequent DNS timeouts:

                  I have pfblockerNG-devel 3.2.0_3 and 23.01 with current patches added via system patches package, also DNS block enabled in pfBlocker and I use forwarding to google and cloudflare TLS

                  Thank you for this info.

                  Previously I used pfB-devel but when 23.01 was released it was stated that since it matched devel I should go to the non-devel package, which I did. I tried to check the current version of -devel but oddly enough, at time of this writing, there are ZERO packages on 'available' packages. Installed package list populates correctly. I'll check on this later.

                  There are a few differences in my DNS settings: about half of the settings you posted for the 'advanced settings' are different from mine. Mine are all set to the defaults and I've not changed them except for the 'log level', which I set to 3 last night trying to find something, anything, to give a clue on this issue. I will research your settings for more detail. I have a smaller msg cache and tcp buffers and maybe I can benefit from bumping those up.

                  On the DNS 'GeneralSettings' page, you have 'outgoing' interfaces set to WAN, I have it set to 'all', which is the default.

                  Under custom options you have the path to pfblockerNG 'dnsbl config', which was once required in an earlier version, but I believe that requirement was removed for python mode. I'll look into that again; it is possible I'm wrong. I do have "server: log-replies: yes", which is a recommendation for quad9.

                  Again, thanks so much for posting your setup.

                    oopohj5Oo8shieZe1ree

                    I disabled DHCP Registration in the DNS Resolver settings and after several days of testing I'm happy to report that all of my DNS issues are gone!

                    I really appreciate all the help from everyone here, especially @SteveITS and @johnpoz.

                      nedyah700 Rebel Alliance

                      Okay this has been driving me nuts. For years I have never noticed DNS issues but ever since upgrading to 23.01 I am constantly getting DNS timeouts because unbound is restarting. Looking at my logs unbound is restarting ~ every 5 minutes. It does appear to happen right after DHCP for clients with hostnames. But why all of a sudden is this a noticeable issue? In previous releases I saw the same restart cadence but never experienced DNS timeouts. Is the only solution really to disable DNS registration??

                        johnpoz LAYER 8 Global Moderator @nedyah700

                        @nedyah700 said in Frequent DNS timeouts:

                        disable DNS registration??

                        Is that really such a bad thing? What dhcp clients are you resolving via name - how many clients?

                        Here is the thing, if unbound is restarting - even if you don't notice an issue with resolving.. it clears its cache every time it restarts.

                        A simple workaround to the problem is just to set up reservations - so your devices always get the same IP.. Unless you have hundreds of clients.. Or you have lots of clients that come and go onto your network without any clue to what they are - then why would you want/need to resolve them?

                        Sure in a perfect world, unbound wouldn't restart and it could register your dhcp clients - maybe someday that will be an option. But it's been a known, long-standing issue that dhcp registrations restart unbound. Many users may never notice - they have a handful of clients, they have a long lease time - unbound only restarts now and then during a day..

                        But if your dns is restarting every 5 minutes - that is going to be problematic for sure. Be it you wanting to query something during the restart, or just that losing all of its cache every 5 minutes is not very efficient..

                        While it might seem daunting to setup reservations - its a one time thing, do a few at a time when you have a chance, etc. All of the like 40+ some devices on my network have reservations.. the only thing I don't have reservations for is like guest devices - which I could care less about resolving their names..

                        An intelligent man is sometimes forced to be drunk to spend time with his fools
                        If you get confused: Listen to the Music Play
                        Please don't Chat/PM me for help, unless mod related
                        SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                          nedyah700 Rebel Alliance @johnpoz

                          @johnpoz
                          I just don't understand why, with 0 configuration changes, this upgrade made the impact so much more severe. Multiple times a day I am getting DNS resolution timeouts lasting one to two minutes. Prior to upgrading the restarts had no notable impact.

                            johnpoz LAYER 8 Global Moderator @nedyah700

                            @nedyah700 unbound has restarted on dhcp registrations since forever.. Can tell you that for sure..

                            Timeouts lasting a few minutes shouldn't happen unless you're getting a flood of renews like all in a row or something.. Maybe before your registrations were more spread out and didn't come in groups.

                              nedyah700 Rebel Alliance @johnpoz

                              @johnpoz Agree, and I've seen it in my logs like this since day 1 with pfSense. But all of a sudden now it's actually causing experienced issues with users. Clearly I am not alone judging by all the various posts here on the forums.

                                JonH @nedyah700

                                @nedyah700 I've had the same problems except my unbound service was not restarting, it was hanging and if I did nothing it would eventually get going again. I was manually restarting it rather than waiting it out. Now I rarely have that 2 min delay and have not observed it hanging. I set the logging up to level 3 and noticed a lot of "debug: outnettcp got tcp error -1" errors when it was hung.

                                I am using pfBlockerNG and under DNSBL I have DNS set to "unbound python mode". I have my dhcp set to a limited pool range and have some clients with static IP's outside the pool range.

                                The changes I made, and I don't know which one or combo that helped me but here are some things I changed:

                                1). In System->General setup I changed the default "use local, fall back to remote DNS" to "Use local, ignore remote"

                                2). In DNS Resolver I previously had all interfaces selected under "outgoing network interfaces". I changed that to select WAN only.

                                3). Under Resolver -> Advanced I changed the 'outgoing' and the 'incoming' TCP Buffers from the default 10 to 20. When I changed this I was still experiencing the problem but now I have not observed the problem. I have no idea if changing this setting is applicable to the problem, I only know that after changing this and rebooting pfSense, my switch, and my AP everything is better.

                                  nedyah700 Rebel Alliance @JonH

                                  @jonh said in Frequent DNS timeouts:

                                  pfBlockerNG

                                  Thanks! I'll give some of these a try. I am using pfBlockerNG but not DNSBL.

                                    johnpoz LAYER 8 Global Moderator @nedyah700

                                    @nedyah700 are you forwarding or doing a normal resolve, which is default? If you're forwarding, are you forwarding over tcp? ie dot?

                                    "use local, fall back to remote DNS" to "Use local, ignore remote"

                                    This setting has zero to do with anything - this is what pfsense would do when it needed to resolve something. Ie look to see if there was an update, checking for packages, etc. Or you click to resolve an IP in your firewall log, etc.

                                    That setting has nothing to do with clients asking unbound, or unbound resolving or forwarding.

                                    firewalldns.jpg

                                    I have it set to ignore - because I don't have any remote dns, I only resolve.. I could have just left it at default, but was like why - there is no remote dns set, and even if there was I sure wouldn't want pfsense using them ;)

                                    If I recall correctly that setting came to be when they added dot and such, and you were adding the forwarders into the general settings.. You were not sure before if pfsense would ask unbound, which would use dot to talk to forwarders you had set. Or if pfsense used them it would just ask them via normal dns.. This setting allows you to ignore the forwarders you might have setup for dot use, because while unbound will use dot to talk to them. Pfsense would only just query them over normal 53..

                                    This has nothing to do with unbound restarting, or clients on your network asking unbound for dns.. This is what pfsense will do for its own dns needs.

                                      thundergate

                                      For me those Unbound restarts do still exist.

                                      I do not have any forwarded DNS. Only using direct Unbound with the system.

                                      DHCP registration is turned off.

                                      Only pfblockerNG in python mode.

                                      And my DNS Resolver log is full of entries.... Don't really know what is causing these issues?!

                                      SCR-20230326-ppem.png

                                        johnpoz LAYER 8 Global Moderator @thundergate

                                        @thundergate yeah unbound would be pretty much useless if it's restarting that often.. Something is wrong - can you up the verbosity level so you might be able to see more info.. Or it looks like you filtered that output, what else is in the log?

                                        You sure you have dhcp registrations off? That sure looks like what I had posted in this or some other dns related thread where my wifes phone was constantly asking for dhcp, mine doesn't restart unbound because dhcp registrations are off..

                                        Do you have dhcp stuff in its log that might match up - maybe the setting didn't take and for some reason its still restarting on dhcp
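
One way to check whether DHCP activity lines up with the restarts is to pull timestamps out of both logs and look for a DHCP event shortly before each "start of service" line. This is only a sketch: the DHCP log format is not shown in this thread (only screenshots), so the `DHCPACK` marker, the 10-second window, the year 2023, and the helper names `stamps` and `dhcp_before_restart` are all assumptions to adjust.

```python
from datetime import datetime, timedelta


def stamps(lines, marker, year=2023):
    """Return the timestamps of all syslog lines that contain `marker`."""
    found = []
    for raw in lines:
        line = raw.strip()
        if marker in line:
            # First 15 characters are e.g. "Mar 26 10:00:05"; syslog has no year.
            found.append(datetime.strptime(f"{year} {line[:15]}",
                                           "%Y %b %d %H:%M:%S"))
    return found


def dhcp_before_restart(restarts, dhcp_events, window_seconds=10):
    """Map each restart time to the DHCP events seen in the window before it."""
    window = timedelta(seconds=window_seconds)
    return {
        restart: [e for e in dhcp_events
                  if timedelta(0) <= restart - e <= window]
        for restart in restarts
    }
```

Usage would be along the lines of `dhcp_before_restart(stamps(resolver_lines, "start of service"), stamps(dhcp_lines, "DHCPACK"))`. If most restarts have a DHCP event just before them, the registration setting is probably still in effect; if the lists come back empty, something else is triggering the restarts.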

                                          thundergate @johnpoz

                                          @johnpoz said in Frequent DNS timeouts:

                                          You sure you have dhcp registrations off? That sure looks like what I had posted in this or some other dns related thread where my wifes phone was constantly asking for dhcp, mine doesn't restart unbound because dhcp registrations are off..

                                          Thx. Yes. See screenshot. Even disabling static DHCP doesn't help.

                                          Also disabled python mode - and still all the unbound restarts.

                                          Activated Level 2 Logging and will have a look into it.

                                          SCR-20230326-qwnz.png

                                          SCR-20230326-qxck.png

                                            thundergate @johnpoz

                                            @johnpoz said in Frequent DNS timeouts:

                                            Do you have dhcp stuff in its log that might match up

                                            Within DHCP I do have a lot of those messages (see screenshot):

                                            SCR-20230326-qylr.png

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.