Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Unbound restarting 20-30 times a day without any reason

    Scheduled Pinned Locked Moved DHCP and DNS
    18 Posts 4 Posters 1.2k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • P
      Piktor10
      last edited by Piktor10

      System data:
      pfSense 2.7.2-RELEASE (amd64), unbound 1.19.1
      Average memory status:
      Mem: 299M Active, 738M Inact, 426M Wired, 215M Buf, 2439M Free
      Swap: 4096M Total, 4096M Free
      __

      My unbound is restarting in intervalls between 2min to 260min, 20 to 30 times a day!

      Checking the log, nothing looks obvious, looks like the restart is done under some control:
      The stop and start message appears everytime in the logfile. Its always a "service stopped" message followed by "start of service", so nothing pointing to a crash.

      Jun  4 00:33:26 XXXX unbound[96489]: [96489:0] info: service stopped (unbound 1.19.1).
      Jun  4 00:33:43 XXXX unbound[64466]: [64466:0] info: start of service (unbound 1.19.1).
      

      The section between service stopped and start of service looks like this:

      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: service stopped (unbound 1.19.1).
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 0: 5324 queries, 4863 answers from cache, 461 recursions, 1688 prefetc
      h, 0 rejected by ip ratelimiting
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 0: requestlist max 7 avg 0.377385 exceeded 0 jostled 0
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: average recursion processing time 0.103678 sec
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: histogram of recursion processing times
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: [25%]=0.0328249 median[50%]=0.0590507 [75%]=0.119681
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: lower(secs) upper(secs) recursions
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.000000    0.000001 30
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.000512    0.001024 1
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.001024    0.002048 2
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.002048    0.004096 5
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.004096    0.008192 3
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.008192    0.016384 3
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.016384    0.032768 71
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.032768    0.065536 144
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.065536    0.131072 105
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.131072    0.262144 61
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.262144    0.524288 27
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.524288    1.000000 6
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    1.000000    2.000000 3
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 1: 3572 queries, 3274 answers from cache, 298 recursions, 1166 prefetc
      h, 0 rejected by ip ratelimiting
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 1: requestlist max 6 avg 0.30123 exceeded 0 jostled 0
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: average recursion processing time 0.127478 sec
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: histogram of recursion processing times
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: [25%]=0.0366778 median[50%]=0.0644189 [75%]=0.138044
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: lower(secs) upper(secs) recursions
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.000000    0.000001 19
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.004096    0.008192 1
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.016384    0.032768 44
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.032768    0.065536 88
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.065536    0.131072 69
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.131072    0.262144 47
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.262144    0.524288 20
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.524288    1.000000 6
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    1.000000    2.000000 3
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    2.000000    4.000000 1
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 2: 1478 queries, 1340 answers from cache, 138 recursions, 490 prefetch
      , 0 rejected by ip ratelimiting
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 2: requestlist max 6 avg 0.16242 exceeded 0 jostled 0
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: average recursion processing time 0.152079 sec
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: histogram of recursion processing times
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: [25%]=0.0340964 median[50%]=0.0646504 [75%]=0.160563
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: lower(secs) upper(secs) recursions
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.000000    0.000001 16
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.016384    0.032768 17
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.032768    0.065536 37
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.065536    0.131072 29
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.131072    0.262144 20
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.262144    0.524288 13
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.524288    1.000000 1
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    1.000000    2.000000 5
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 3: 4884 queries, 4395 answers from cache, 489 recursions, 1517 prefetc
      h, 0 rejected by ip ratelimiting
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: server stats for thread 3: requestlist max 5 avg 0.352443 exceeded 0 jostled 0
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: average recursion processing time 0.117536 sec
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: histogram of recursion processing times
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: [25%]=0.03404 median[50%]=0.0589213 [75%]=0.119998
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: lower(secs) upper(secs) recursions
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.000000    0.000001 21
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.000512    0.001024 2
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.002048    0.004096 1
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.004096    0.008192 3
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.008192    0.016384 6
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.016384    0.032768 83
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.032768    0.065536 161
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.065536    0.131072 108
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.131072    0.262144 62
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.262144    0.524288 28
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    0.524288    1.000000 7
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    1.000000    2.000000 4
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info:    2.000000    4.000000 3
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] notice: Restart of unbound 1.19.1.
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] info: [pfBlockerNG]: pfb_unbound.py script exiting
      Jun  4 20:46:19 XXXX unbound[64466]: [64466:0] notice: init module 0: python
      Jun  4 20:46:20 XXXX unbound[64466]: [64466:0] info: [pfBlockerNG]: pfb_unbound.py script loaded
      Jun  4 20:46:34 XXXX unbound[64466]: [64466:0] info: [pfBlockerNG]: init_standard script loaded
      Jun  4 20:46:34 XXXX unbound[64466]: [64466:0] notice: init module 1: iterator
      

      reporting results from 3-5 threads.

      Every 15 minutes I get the message:

      merge_config: configuration reload
      

      followed by all my firewall named tables

      That is all, I see in the logfiles. No reason, why unbound is restarting so often, nothing about what is causing it.

      Has someone any idea?

      Furthermore, how can I configure, that this configuration reload is done only every 6 or 12 hours?

      GertjanG 1 Reply Last reply Reply Quote 0
      • bmeeksB
        bmeeks
        last edited by bmeeks

        Do you have automatic DNS registration of DHCP leases enabled?

        If so, that results in unbound being given a restart command each time a DHCP client renews its lease. Since most clients renew at the half-way point of their lease duration, that can result in a lot of unbound restarts on a moderately populated network.

        This is a known limitation of the current unbound implementation on pfSense. Work is underway to remedy that with later versions of the Kea DHCP replacement (but it's not ready yet).

        Not sure where that second message about "merge_config: configuration reload" is coming from.

        P 1 Reply Last reply Reply Quote 1
        • P
          Piktor10 @bmeeks
          last edited by

          @bmeeks Thanks for that tip! Yes I have DHCP client DNS registration enabled. Will disable this, so that hint is very valuable!
          Will watch, how things are look like after that.

          1 Reply Last reply Reply Quote 0
          • GertjanG
            Gertjan @Piktor10
            last edited by Gertjan

            Added to what is mentioned above :

            @Piktor10 said in Unbound restarting 20-30 times a day without any reason:

            That is all, I see in the logfiles. No reason, why unbound is restarting so often, nothing about what is causing it.

            Has someone any idea?

            Knowing what you know now :
            read again :

            1f1f07c9-0554-4260-a1f7-8065124fced2-image.png

            and this is exactly what you see happening.
            This is the reason.

            By default, this option in not checked.
            'Then, some one came along, and checked it ..."

            Take note ; the future replacement of the ISC DHCP, KEA, will also have this functionality, and the DHCP host name will get communicated to unbound (DNS° without a DNS restart.
            Until that moment : for all your important 'server type' devices, devices you have to name by host name, us a static DHCP MAC lease.

            And Check this one :

            4d681529-b5b1-4ab7-a176-683ee5c5d4cb-image.png

            as this option will not restart unbound, as they are ... static ^^

            No "help me" PM's please. Use the forum, the community will thank you.
            Edit : and where are the logs ??

            1 Reply Last reply Reply Quote 0
            • P
              Piktor10
              last edited by Piktor10

              "For me reloading and flushing a resolution cache" was not a synonym for restarting the resolver!
              Even, when I was reading that after unchecking, I have not understand this as a warning, that the whole resolver restarts! The only possible delay I saw in this warning, that the resolver can't answer the next queries from its cache, as that was flushed!

              GertjanG 1 Reply Last reply Reply Quote 0
              • GertjanG
                Gertjan @Piktor10
                last edited by

                @Piktor10 said in Unbound restarting 20-30 times a day without any reason:

                "For me reloading and flushing a resolution cache" was not a synonym for restarting the resolver!
                Even, when I was reading that after unchecking, I have not understand this as a warning, that the whole resolver restarts!

                Actually, unbound is send a reload signal.
                A couple of years ago, it was discoverer (open source !) that unbound, upon receiving a 'reload' actually ... restarts.

                No "help me" PM's please. Use the forum, the community will thank you.
                Edit : and where are the logs ??

                1 Reply Last reply Reply Quote 0
                • P
                  Piktor10
                  last edited by

                  @Gertjan Thats, what I figured out now as well. Seeing in the dhcpd.log the HUPs beeing sent.
                  So I would recommend, to change the wording in the warning to the current behaviour, and write restart instead of reload.

                  Next question: How that shall be fixed by using kea instead of dhcpd? Unbound still needs to get this data, without restarting.

                  I am also now further, why this happens to my pfSense since some weeks, even when that configuration was in place for years. My network got recently at least 4 mobile devices using new mac addresses for connection, once they were connected anywhere else. They don't remember mac addresses for certain WLANs or router macs. Because of this, I had to throw away static associations for those mobile devices. This is in timely coincidence, to the growing number of restarts.

                  GertjanG bmeeksB 2 Replies Last reply Reply Quote 0
                  • GertjanG
                    Gertjan @Piktor10
                    last edited by

                    @Piktor10 said in Unbound restarting 20-30 times a day without any reason:

                    Next question: How that shall be fixed by using kea instead of dhcpd? Unbound still needs to get this data, without restarting.

                    Believe it or not, after 10+ years of waiting, this is exactly one of the reasons KEA was introduced ! (IMHO)
                    A couple of weeks ago, a Netgate software développer acknowledged here on this forum that he started the implantation of just this very issue.
                    Kea that signals unbound the presence of a (new) lease without restarting unbound.

                    @Piktor10 said in Unbound restarting 20-30 times a day without any reason:

                    My network got recently at least 4 mobile devices using new mac addresses for connection, once they were connected anywhere else. They don't remember mac addresses for certain WLANs or router macs. Because of this, I had to throw away static associations for those mobile devices. This is in timely coincidence, to the growing number of restarts.

                    Not entirely pfSense's fault.
                    When, you, the pfSense admin, decides to add a DHCP Static MAC lease, it would be wise to check if the MAC in question is a randomized MAC, or not. The owner of the device should be willing to shut down this randomization on your network. So, see it as a win-win situation, check that randomisation is off.

                    No "help me" PM's please. Use the forum, the community will thank you.
                    Edit : and where are the logs ??

                    1 Reply Last reply Reply Quote 0
                    • bmeeksB
                      bmeeks @Piktor10
                      last edited by

                      @Piktor10 said in Unbound restarting 20-30 times a day without any reason:

                      Next question: How that shall be fixed by using kea instead of dhcpd? Unbound still needs to get this data, without restarting.

                      unbound has a control channel app that can update some parameters without restarting the daemon. But that control channel app needs to be called from within the DHCP server code. Currently the ISC DHCP daemon is not coded to do that. The new Kea DHCP daemon can be modified to do that, and that's what pfSense intends to do. Since the current ISC DHCP daemon will eventually be obsoleted, there is no point to expending developer resources to modify it. Better to put the effort into the replacement product - Kea.

                      GertjanG M 2 Replies Last reply Reply Quote 0
                      • GertjanG
                        Gertjan @bmeeks
                        last edited by

                        @bmeeks

                        Some one, and yes, even me, could have taken this https://github.com/NoThrowForwardIt/pfsense-tools/blob/master/pfPorts/dhcpleases/files/dhcpleases.c as a starting point, and rewrite it so it takes in account the "unbound-control" parameters, so things could haven been done 'properly' ages ago.

                        The actual solution : check if the ISC DHCPd 'dhcpleases' file was changed, and if so, shoot at unbound.
                        It did pass the KIS test 😊

                        No "help me" PM's please. Use the forum, the community will thank you.
                        Edit : and where are the logs ??

                        1 Reply Last reply Reply Quote 0
                        • M
                          michmoor LAYER 8 Rebel Alliance @bmeeks
                          last edited by

                          @bmeeks
                          Curious. I typically run pfSense in small-ish networks is it worth it keeping DHCP Registration enabled? I haven't noticed a performance hit but also the only benefit is knowing what IP is what when i nslookup.

                          Firewall: NetGate,Palo Alto-VM,Juniper SRX
                          Routing: Juniper, Arista, Cisco
                          Switching: Juniper, Arista, Cisco
                          Wireless: Unifi, Aruba IAP
                          JNCIP,CCNP Enterprise

                          bmeeksB 1 Reply Last reply Reply Quote 0
                          • bmeeksB
                            bmeeks @michmoor
                            last edited by bmeeks

                            @michmoor said in Unbound restarting 20-30 times a day without any reason:

                            @bmeeks
                            Curious. I typically run pfSense in small-ish networks is it worth it keeping DHCP Registration enabled? I haven't noticed a performance hit but also the only benefit is knowing what IP is what when i nslookup.

                            The decision point would be the number of DHCP clients and thus the number of DHCP lease renewals per interval (say per hour or per day). Each lease renewal results in an unbound restart with the current design when "Register DHCP leases in the DNS Resolver" is enabled. Each restart means DNS is unavailable until the unbound daemon gets back up and running. With a plain vanilla install that can be fairly quick, but if you use DNSBL, then the restart time can be much longer. Even still, enough restarts per time interval can still be very disruptive on a network as you have no DNS for an appreciable interval of time.

                            I, too, like to have my local DHCP clients registered in DNS. Because I have a Windows AD domain internally, I just use that for both DHCP and DNS. My unbound setup only resolves for pfSense itself and has a domain override for my AD domain that points back to my AD DNS servers so pfSense can resolve my local LAN hosts.

                            This problem with restarting unbound is not a good design choice in my view, but it is what it is. I have a pfSense install on an SG-6100 at a local church, and I had to disable the DHCP lease registration option there otherwise DNS became essentially unusable due to the frequent restarts initiated by lease allocations and renewals on the network.

                            M 1 Reply Last reply Reply Quote 0
                            • M
                              michmoor LAYER 8 Rebel Alliance @bmeeks
                              last edited by

                              @bmeeks said in Unbound restarting 20-30 times a day without any reason:

                              This problem with restarting unbound is not a good design choice in my view, but it is what it is. I have a pfSense install on an SG-6100 at a local church, and I had to disable the DHCP lease registration option there otherwise DNS became essentially unusable due to the frequent restarts initiated by lease allocations and renewals on the network.

                              how many devices on the network?

                              Firewall: NetGate,Palo Alto-VM,Juniper SRX
                              Routing: Juniper, Arista, Cisco
                              Switching: Juniper, Arista, Cisco
                              Wireless: Unifi, Aruba IAP
                              JNCIP,CCNP Enterprise

                              bmeeksB 1 Reply Last reply Reply Quote 0
                              • bmeeksB
                                bmeeks @michmoor
                                last edited by bmeeks

                                @michmoor said in Unbound restarting 20-30 times a day without any reason:

                                how many devices on the network?

                                There is no universal precise number. It will depend primarily on the DHCP lease time. And then the number of devices comes into play. If you have a DHCP lease time of 1 hour with 1 device on the network, unbound would be seeing a restart once every 30 minutes (assuming the single device renewed its lease at the 50% point of the lease duration). Double the number of devices and the restarts could double in that 30 minute window. Depends on how far apart each lease start time was. Say you have 10 devices getting DHCP addresses and they all start up at staggered times. Then each of the 10 would be renewing its lease at a slightly different minute during the 1 hour lease duration. That could result in lots of unbound restarts over the hour.

                                Play different scenarios in your head and you can quickly see how this can become more and more of a problem. Of course 1 hour is a ridiculously low lease time, but 4 or 6 hours may not be unheard of for say a Guest Wireless network or some other network where user devices frequently come and go and you don't have a huge scope available.

                                Run the DHCP lease time up to days and the problem gets smaller but still does not disappear. If at any instance in time unbound is restarting when a client requests a DNS lookup, that lookup is going to fail.

                                M 1 Reply Last reply Reply Quote 0
                                • M
                                  michmoor LAYER 8 Rebel Alliance @bmeeks
                                  last edited by

                                  @bmeeks
                                  alright alright...i suppose i have to give myself extra work and spin up a separate DNS server at home... 😊

                                  But seriously, i do like having all my pieces in one place , one firewall. All depends on the use case.

                                  Firewall: NetGate,Palo Alto-VM,Juniper SRX
                                  Routing: Juniper, Arista, Cisco
                                  Switching: Juniper, Arista, Cisco
                                  Wireless: Unifi, Aruba IAP
                                  JNCIP,CCNP Enterprise

                                  GertjanG 1 Reply Last reply Reply Quote 0
                                  • GertjanG
                                    Gertjan @michmoor
                                    last edited by

                                    @michmoor said in Unbound restarting 20-30 times a day without any reason:

                                    i have to give myself extra work and spin up a separate DNS server at home...

                                    The situation existed for ... many years.
                                    In a couple of weeks, months, with the next version of pfSense, this will be addressed.

                                    I've been using static MAC leases for my devices that I wanted to know by 'name' like printers, NAS etc.

                                    Register DHCP leases in the DNS Resolver" has been set to default value : NOT set.

                                    Never had any DHCP/DNS issues.

                                    No "help me" PM's please. Use the forum, the community will thank you.
                                    Edit : and where are the logs ??

                                    M 1 Reply Last reply Reply Quote 0
                                    • M
                                      michmoor LAYER 8 Rebel Alliance @Gertjan
                                      last edited by

                                      @Gertjan
                                      Oh i was joking when i wrote that. Im not going to do that. haha.

                                      Just to show how often the HUP process was bouncing on my system until i turned off DNS registration. The picture below is expanded over 5 days but rest assured it was restarting every few minutes. Small network so no impact but i can see it being a problem with several 100 endpoints.

                                      7a2f4bea-fc72-400c-b4db-d385384ec381-image.png

                                      Firewall: NetGate,Palo Alto-VM,Juniper SRX
                                      Routing: Juniper, Arista, Cisco
                                      Switching: Juniper, Arista, Cisco
                                      Wireless: Unifi, Aruba IAP
                                      JNCIP,CCNP Enterprise

                                      GertjanG 1 Reply Last reply Reply Quote 0
                                      • GertjanG
                                        Gertjan @michmoor
                                        last edited by

                                        @michmoor said in Unbound restarting 20-30 times a day without any reason:

                                        but i can see it being a problem with several 100 endpoints.

                                        Things will get even better when you 'some' Wifi connected devices that roam among several APs.
                                        DHCP will fire away every x seconds .... and as much unbound restarts.

                                        Quickly, unbound is more busy 'restarting' as actually doing 'DNS' for you.

                                        No "help me" PM's please. Use the forum, the community will thank you.
                                        Edit : and where are the logs ??

                                        1 Reply Last reply Reply Quote 1
                                        • First post
                                          Last post
                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.