Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    [Solved] Unbound fails on restart after pfBlockerNG updates

    Scheduled Pinned Locked Moved pfBlockerNG
    15 Posts 5 Posters 5.9k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • B
      bole5
      last edited by

      Hi,

      I have the latest version of pfBlockerNG running on the latest stable pfSense. In my setup the cron job (pfBlockerNG update) runs daily at 01:15. I noticed that every morning pfSense refuses to resolve queries. So I went to pfSense and noticed that unbound was not running. In addition I could find below error in the logs.

      Feb 18 01:15:33 unbound 95052:0 fatal error: Could not read config file: /unbound.conf
      Feb 18 01:15:33 unbound 95052:0 notice: Restart of unbound 1.5.9.

      Variable $pfb['dnsbldir'] in /usr/local/pkg/pfblockerng/pfblockerng.inc is correctly defined as: '/var/unbound'…

      1 Reply Last reply Reply Quote 0
      • BBcan177B
        BBcan177 Moderator
        last edited by

        That error if from unbound itself. Are you using the service watchdog program to restart the resolver? If so, that could cause issues.

        Similar thread here:
        https://forum.pfsense.org/index.php?topic=123554.0

        I don't think it's a package related issue, but something with Unbound itself.

        "Experience is something you don't get until just after you need it."

        Website: http://pfBlockerNG.com
        Twitter: @BBcan177  #pfBlockerNG
        Reddit: https://www.reddit.com/r/pfBlockerNG/new/

        1 Reply Last reply Reply Quote 0
        • B
          bole5
          last edited by

          I am not using to restarting resolver. It is pfBlockerNG that does the reload which seems to fail. Below are the logs of the cron job in pfBlockerNG. Note the: "Reloading Unbound…. Not completed."
          The weird thing is that the reload starts (and fails) and: 01:15:33 but DNSBL Update Competed message comes a minute later at 01:16:48.

          If I restart the process manually by stopping/starting unbound, everything works fine.

          pfBlockerNG Cron Logs

          
          ===[  DNSBL Process  ]================================================
          
          [ EasyList ]		 Downloading update .. 200 OK.
            Whitelist: 10fbb07a4b0.se|32d1d3b9c.se|82d914.se|adaction.se|adrotate.se|ccebba93.se|contexlink.se|d1110e4.se|emediate.se|
            ----------------------------------------------------------------------
            Orig.    Unique     # Dups     # White    # Alexa    Final                
            ----------------------------------------------------------------------
            7312     6970       0          9          0          6961                 
            ----------------------------------------------------------------------
          
          [ EasyPrivacy ]		 Downloading update [ 02/18/17 01:15:27 ] .. 200 OK.
            Whitelist: clicksen.se|easyresearch.se|getstatistics.se|ip2location.com|optimizely.com|publish-int.se|research-int.se|vastpaketet.se|webstat.se|webtraffic.se|
            ----------------------------------------------------------------------
            Orig.    Unique     # Dups     # White    # Alexa    Final                
            ----------------------------------------------------------------------
            2759     2738       21         10         0          2707                 
            ----------------------------------------------------------------------
          
          [ Malware_Corpus_Tracker ] Downloading update [ 02/18/17 01:15:28 ] .. 200 OK
            Remote timestamp missing .
            ----------------------------------------------------------------------
            Orig.    Unique     # Dups     # White    # Alexa    Final                
            ----------------------------------------------------------------------
            140      120        0          0          0          120                  
            ----------------------------------------------------------------------
          
          [ MalwareDomainList ]	 exists. [ 02/18/17 01:15:32 ]
          ------------------------------------------
          Assembling database... completed
          Validating database... completed
          Reloading Unbound.... Not completed.
          DNSBL update [ 10941 | PASSED  ]... completed [ 02/18/17 01:16:48 ]
          ------------------------------------------
          
          

          System logs:

          Feb 18 01:17:45	php		[pfBlockerNG] No changes to Firewall rules, skipping Filter Reload
          Feb 18 01:15:25	php		[pfBlockerNG] Starting cron process.
          
          1 Reply Last reply Reply Quote 0
          • BBcan177B
            BBcan177 Moderator
            last edited by

            As per the link I posted above.
            If you run this command from the pfSense shell, does it complete properly?

            @BBcan177:

            Try the following command from the shell:

            /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
            

            and also try this command:

            /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
            

            Are you using DHCP Registration in Unbound, or IPv6? Maybe post some screenshots of the unbound settings.

            There is some issue in the pfSense unbound code that is causing these corner-case issues…

            "Experience is something you don't get until just after you need it."

            Website: http://pfBlockerNG.com
            Twitter: @BBcan177  #pfBlockerNG
            Reddit: https://www.reddit.com/r/pfBlockerNG/new/

            1 Reply Last reply Reply Quote 0
            • B
              bole5
              last edited by

              You are right. It is something with unbound and my setup.
              If I run the "status" command it runs normally

              version: 1.5.9
              verbosity: 1
              threads: 4
              modules: 2 [ validator iterator ]
              uptime: 48303 seconds
              options: control(ssl)
              unbound (pid 47565) is running...
              

              Then I run "reload" and it returns "ok" immediately. If I run another "reload" shortly thereafter I get:

              [1487451259] unbound-control[22739:0] error: connect: Operation timed out for 127.0.0.1
              

              and in the unbound logs I see: ```
              fatal error: Could not read config file: /unbound.conf

              
              I will modify pfblockerng.inc file as to stop/start service as suggested by the thread you referenced.
              1 Reply Last reply Reply Quote 0
              • BBcan177B
                BBcan177 Moderator
                last edited by

                Could you check a few things out for me please?

                Does this folder exist in your box?    ls -lah /var/tmp
                and does this file exist?                    ls -lah /var/tmp/unbound_cache

                So basically it runs thru these steps to reload the new DNSBL changes into Unbound…
                If Unbound is running:

                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf dump_cache > /var/tmp/unbound_cache
                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf load_cache < /var/tmp/unbound_cache
                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
                

                Its really strange that it gives that error on multiple reload attempts…

                Could you check to see if the /var/unbound/unbound.conf file is changed after each  "Reload"?
                or PM me a copy of the unbound.conf before each reload attempt…

                Another thought is that the dump/load cache might have some issue... But can't tell until you run these commands and see if you can find another clue...

                And is you pause for a second or two, do you still get the error?

                Thanks!

                "Experience is something you don't get until just after you need it."

                Website: http://pfBlockerNG.com
                Twitter: @BBcan177  #pfBlockerNG
                Reddit: https://www.reddit.com/r/pfBlockerNG/new/

                1 Reply Last reply Reply Quote 0
                • B
                  bole5
                  last edited by

                  Could you check a few things out for me please?

                  @BBcan177:

                  Does this folder exist in your box?    ls -lah /var/tmp
                  and does this file exist?                    ls -lah /var/tmp/unbound_cache

                  Yes. The file is there with correct timestamp:

                  ls -al /var/tmp/unbound_cache
                  -rw-r--r--  1 root  wheel  509698 Feb 19 01:15 /var/tmp/unbound_cache
                  

                  @BBcan177:

                  So basically it runs thru these steps to reload the new DNSBL changes into Unbound…
                  If Unbound is running:

                  /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf dump_cache > /var/tmp/unbound_cache
                  /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
                  /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf load_cache < /var/tmp/unbound_cache
                  /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
                  

                  Running commands in sequence gives following:

                  [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf dump_cache > /var/tmp/unbound_cache
                  [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
                  ok
                  [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf load_cache < /var/tmp/unbound_cache
                  error: SSL handshake failed
                  [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
                  [1487493495] unbound-control[69667:0] error: connect: Operation timed out for 127.0.0.1
                  

                  The unbound crashes on "reload" command, so the "load_cache" and "status" fail. During the reload I could see the: "fatal error: Could not read config file: /unbound.conf" in resolver log file.

                  @BBcan177:

                  Its really strange that it gives that error on multiple reload attempts…

                  Based on the latest tests, my explanation is that it crashes on the first reload and the second time when I run the reload it has a long timeout before it returns error.

                  @BBcan177:

                  Could you check to see if the /var/unbound/unbound.conf file is changed after each  "Reload"?
                  or PM me a copy of the unbound.conf before each reload attempt…

                  Strangely enough the timestamp on the unbound.conf is changed, so the file is touched on reload. I copied the conf file before reload and run diff which gave me no changes.

                  @BBcan177:

                  Another thought is that the dump/load cache might have some issue… But can't tell until you run these commands and see if you can find another clue...

                  Based on the above, there is something in my setup that causes unbound to crash on "reload" command. I modified the pfblockerng.inc file and replaced:

                  
                                          exec("{$chroot_cmd} reload 2>&1", $result, $retval);
                                          pfb_logger('.', 1);
                  

                  with:

                  
                                          exec("{$chroot_cmd} stop");
                                          sleep(1);
                                          pfb_logger('.', 1);
                                          mwexec("/usr/local/sbin/unbound -c {$g['unbound_chroot_path']}/unbound.conf 2>&1", $result, $retval);
                                          pfb_logger('.', 1);
                  

                  This procedure worked and didn't cause unbound to crash.

                  Since this is clearly caused by unbound, may I suggest that you add a flag in pfBlockerNG with option to replace reload with stop/start for people experiencing this problem?

                  1 Reply Last reply Reply Quote 0
                  • B
                    bole5
                    last edited by

                    My problem is now solved after following advice from BBcan177.
                    It turns out that all includes in unbound.conf options need to point to the same folder: /var/unbound.

                    I had following in my configuration:

                    My domain overrides

                    server:include: /usr/local/etc/unbound/se-servers.conf

                    All I needed to do is to change the /usr/local/etc/unbound to /var/unbound, and unbound stopped crashing on "reload".

                    My domain overrides

                    server:include: /var/unbound//se-servers.conf

                    Thank you very much BBcan177 for this great package!

                    1 Reply Last reply Reply Quote 0
                    • R
                      renegade
                      last edited by

                      Sorry, but i don't understand the solution.

                      My Logs are full of:

                      Mar 1 17:56:37 	unbound 	30737:0 	info: start of service (unbound 1.6.0).
                      Mar 1 17:56:33 	unbound 	30737:0 	error: cannot chdir to directory: (No such file or directory)
                      Mar 1 17:56:33 	unbound 	30737:0 	info: service stopped (unbound 1.6.0).
                      Mar 1 17:50:18 	unbound 	30737:0 	info: start of service (unbound 1.6.0).
                      Mar 1 17:50:14 	unbound 	30737:0 	error: cannot chdir to directory: (No such file or directory)
                      Mar 1 17:50:14 	unbound 	30737:0 	info: service stopped (unbound 1.6.0).
                      Mar 1 17:46:30 	unbound 	30737:0 	info: start of service (unbound 1.6.0).
                      Mar 1 17:46:25 	unbound 	30737:0 	error: cannot chdir to directory: (No such file or directory)
                      Mar 1 17:46:25 	unbound 	30737:0 	info: service stopped (unbound 1.6.0). 
                      

                      and very often in the morning i cannot resolve any dns, because the unbound did not startup.

                      1 Reply Last reply Reply Quote 0
                      • B
                        bole5
                        last edited by

                        @renegade: In my particular case I had my own domain overrides imported under DNS Resolver/Custom options. The file imported was located in /usr/local/etc/unbound/ folder. The unbound was crashing each time I would issue reload command and solution was to move my files to /var/unbound folder.

                        Please bear in mind that the error message I was getting was different than yours:

                        fatal error: Could not read config file: /unbound.conf
                        

                        You can try running commands listed by BBcan177 and in his post: https://forum.pfsense.org/index.php?topic=125773.msg694881#msg694881 and see if it provides more clues. After running commands check unbound log and dmesg.

                        You can also force restart of unblocker which worked for me (before my problem was solved). For details about that read the end my post where I describe what needs to be modified in pfblockerng.inc file: https://forum.pfsense.org/index.php?topic=125773.msg694915#msg694915

                        Good luck!

                        1 Reply Last reply Reply Quote 0
                        • K
                          kpa
                          last edited by

                          Unbound is chrooted under /var/unbound in pfSense and that means all configuration files and files referenced by the configuration files must be placed under /var/unbound.

                          1 Reply Last reply Reply Quote 0
                          • R
                            renegade
                            last edited by

                            tried the commands but there was no error.
                            this morning i had to start unbound again manually. after the 24h reconnect.

                            1 Reply Last reply Reply Quote 0
                            • RonpfSR
                              RonpfS
                              last edited by

                              Should be fixed at some point https://redmine.pfsense.org/issues/7326

                              2.4.5-RELEASE-p1 (amd64)
                              Intel Core2 Quad CPU Q8400 @ 2.66GHz 8GB
                              Backup 0.5_5, Bandwidthd 0.7.4_4, Cron 0.3.7_5, pfBlockerNG-devel 3.0.0_16, Status_Traffic_Totals 2.3.1_1, System_Patches 1.2_5

                              1 Reply Last reply Reply Quote 0
                              • R
                                renegade
                                last edited by

                                I removed many feeds from DNSBL and it seems to work stable… until 2.3.3p1 came. the last two mornings same effect.  I could not resolve any ip adress until i restarted bind manually by logging in and restarting it.

                                1 Reply Last reply Reply Quote 0
                                • BBcan177B
                                  BBcan177 Moderator
                                  last edited by

                                  I have posted a patch in redmine for this issue:

                                  https://redmine.pfsense.org/issues/7326

                                  "Experience is something you don't get until just after you need it."

                                  Website: http://pfBlockerNG.com
                                  Twitter: @BBcan177  #pfBlockerNG
                                  Reddit: https://www.reddit.com/r/pfBlockerNG/new/

                                  1 Reply Last reply Reply Quote 0
                                  • First post
                                    Last post
                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.