• Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Search
  • Register
  • Login
Netgate Discussion Forum
  • Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Search
  • Register
  • Login

[Solved] Unbound fails on restart after pfBlockerNG updates

Scheduled Pinned Locked Moved pfBlockerNG
15 Posts 5 Posters 5.9k Views
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • B
    bole5
    last edited by Feb 19, 2017, 3:10 PM Feb 18, 2017, 7:39 AM

    Hi,

    I have the latest version of pfBlockerNG running on the latest stable pfSense. In my setup the cron job (pfBlockerNG update) runs daily at 01:15. I noticed that every morning pfSense refuses to resolve queries. So I went to pfSense and noticed that unbound was not running. In addition I could find below error in the logs.

    Feb 18 01:15:33 unbound 95052:0 fatal error: Could not read config file: /unbound.conf
    Feb 18 01:15:33 unbound 95052:0 notice: Restart of unbound 1.5.9.

    Variable $pfb['dnsbldir'] in /usr/local/pkg/pfblockerng/pfblockerng.inc is correctly defined as: '/var/unbound'…

    1 Reply Last reply Reply Quote 0
    • B
      BBcan177 Moderator
      last edited by Feb 18, 2017, 1:43 PM

      That error if from unbound itself. Are you using the service watchdog program to restart the resolver? If so, that could cause issues.

      Similar thread here:
      https://forum.pfsense.org/index.php?topic=123554.0

      I don't think it's a package related issue, but something with Unbound itself.

      "Experience is something you don't get until just after you need it."

      Website: http://pfBlockerNG.com
      Twitter: @BBcan177  #pfBlockerNG
      Reddit: https://www.reddit.com/r/pfBlockerNG/new/

      1 Reply Last reply Reply Quote 0
      • B
        bole5
        last edited by Feb 18, 2017, 2:30 PM

        I am not using to restarting resolver. It is pfBlockerNG that does the reload which seems to fail. Below are the logs of the cron job in pfBlockerNG. Note the: "Reloading Unbound…. Not completed."
        The weird thing is that the reload starts (and fails) and: 01:15:33 but DNSBL Update Competed message comes a minute later at 01:16:48.

        If I restart the process manually by stopping/starting unbound, everything works fine.

        pfBlockerNG Cron Logs

        
        ===[  DNSBL Process  ]================================================
        
        [ EasyList ]		 Downloading update .. 200 OK.
          Whitelist: 10fbb07a4b0.se|32d1d3b9c.se|82d914.se|adaction.se|adrotate.se|ccebba93.se|contexlink.se|d1110e4.se|emediate.se|
          ----------------------------------------------------------------------
          Orig.    Unique     # Dups     # White    # Alexa    Final                
          ----------------------------------------------------------------------
          7312     6970       0          9          0          6961                 
          ----------------------------------------------------------------------
        
        [ EasyPrivacy ]		 Downloading update [ 02/18/17 01:15:27 ] .. 200 OK.
          Whitelist: clicksen.se|easyresearch.se|getstatistics.se|ip2location.com|optimizely.com|publish-int.se|research-int.se|vastpaketet.se|webstat.se|webtraffic.se|
          ----------------------------------------------------------------------
          Orig.    Unique     # Dups     # White    # Alexa    Final                
          ----------------------------------------------------------------------
          2759     2738       21         10         0          2707                 
          ----------------------------------------------------------------------
        
        [ Malware_Corpus_Tracker ] Downloading update [ 02/18/17 01:15:28 ] .. 200 OK
          Remote timestamp missing .
          ----------------------------------------------------------------------
          Orig.    Unique     # Dups     # White    # Alexa    Final                
          ----------------------------------------------------------------------
          140      120        0          0          0          120                  
          ----------------------------------------------------------------------
        
        [ MalwareDomainList ]	 exists. [ 02/18/17 01:15:32 ]
        ------------------------------------------
        Assembling database... completed
        Validating database... completed
        Reloading Unbound.... Not completed.
        DNSBL update [ 10941 | PASSED  ]... completed [ 02/18/17 01:16:48 ]
        ------------------------------------------
        
        

        System logs:

        Feb 18 01:17:45	php		[pfBlockerNG] No changes to Firewall rules, skipping Filter Reload
        Feb 18 01:15:25	php		[pfBlockerNG] Starting cron process.
        
        1 Reply Last reply Reply Quote 0
        • B
          BBcan177 Moderator
          last edited by Feb 18, 2017, 4:00 PM

          As per the link I posted above.
          If you run this command from the pfSense shell, does it complete properly?

          @BBcan177:

          Try the following command from the shell:

          /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
          

          and also try this command:

          /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
          

          Are you using DHCP Registration in Unbound, or IPv6? Maybe post some screenshots of the unbound settings.

          There is some issue in the pfSense unbound code that is causing these corner-case issues…

          "Experience is something you don't get until just after you need it."

          Website: http://pfBlockerNG.com
          Twitter: @BBcan177  #pfBlockerNG
          Reddit: https://www.reddit.com/r/pfBlockerNG/new/

          1 Reply Last reply Reply Quote 0
          • B
            bole5
            last edited by Feb 18, 2017, 10:17 PM

            You are right. It is something with unbound and my setup.
            If I run the "status" command it runs normally

            version: 1.5.9
            verbosity: 1
            threads: 4
            modules: 2 [ validator iterator ]
            uptime: 48303 seconds
            options: control(ssl)
            unbound (pid 47565) is running...
            

            Then I run "reload" and it returns "ok" immediately. If I run another "reload" shortly thereafter I get:

            [1487451259] unbound-control[22739:0] error: connect: Operation timed out for 127.0.0.1
            

            and in the unbound logs I see: ```
            fatal error: Could not read config file: /unbound.conf

            
            I will modify pfblockerng.inc file as to stop/start service as suggested by the thread you referenced.
            1 Reply Last reply Reply Quote 0
            • B
              BBcan177 Moderator
              last edited by Feb 18, 2017, 11:43 PM

              Could you check a few things out for me please?

              Does this folder exist in your box?    ls -lah /var/tmp
              and does this file exist?                    ls -lah /var/tmp/unbound_cache

              So basically it runs thru these steps to reload the new DNSBL changes into Unbound…
              If Unbound is running:

              /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf dump_cache > /var/tmp/unbound_cache
              /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
              /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf load_cache < /var/tmp/unbound_cache
              /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
              

              Its really strange that it gives that error on multiple reload attempts…

              Could you check to see if the /var/unbound/unbound.conf file is changed after each  "Reload"?
              or PM me a copy of the unbound.conf before each reload attempt…

              Another thought is that the dump/load cache might have some issue... But can't tell until you run these commands and see if you can find another clue...

              And is you pause for a second or two, do you still get the error?

              Thanks!

              "Experience is something you don't get until just after you need it."

              Website: http://pfBlockerNG.com
              Twitter: @BBcan177  #pfBlockerNG
              Reddit: https://www.reddit.com/r/pfBlockerNG/new/

              1 Reply Last reply Reply Quote 0
              • B
                bole5
                last edited by Feb 19, 2017, 8:54 AM

                Could you check a few things out for me please?

                @BBcan177:

                Does this folder exist in your box?    ls -lah /var/tmp
                and does this file exist?                    ls -lah /var/tmp/unbound_cache

                Yes. The file is there with correct timestamp:

                ls -al /var/tmp/unbound_cache
                -rw-r--r--  1 root  wheel  509698 Feb 19 01:15 /var/tmp/unbound_cache
                

                @BBcan177:

                So basically it runs thru these steps to reload the new DNSBL changes into Unbound…
                If Unbound is running:

                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf dump_cache > /var/tmp/unbound_cache
                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf load_cache < /var/tmp/unbound_cache
                /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
                

                Running commands in sequence gives following:

                [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf dump_cache > /var/tmp/unbound_cache
                [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf reload
                ok
                [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf load_cache < /var/tmp/unbound_cache
                error: SSL handshake failed
                [2.3.2-RELEASE][root@fortet.localdomain]/var/log: /usr/sbin/chroot -u unbound -g unbound / /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
                [1487493495] unbound-control[69667:0] error: connect: Operation timed out for 127.0.0.1
                

                The unbound crashes on "reload" command, so the "load_cache" and "status" fail. During the reload I could see the: "fatal error: Could not read config file: /unbound.conf" in resolver log file.

                @BBcan177:

                Its really strange that it gives that error on multiple reload attempts…

                Based on the latest tests, my explanation is that it crashes on the first reload and the second time when I run the reload it has a long timeout before it returns error.

                @BBcan177:

                Could you check to see if the /var/unbound/unbound.conf file is changed after each  "Reload"?
                or PM me a copy of the unbound.conf before each reload attempt…

                Strangely enough the timestamp on the unbound.conf is changed, so the file is touched on reload. I copied the conf file before reload and run diff which gave me no changes.

                @BBcan177:

                Another thought is that the dump/load cache might have some issue… But can't tell until you run these commands and see if you can find another clue...

                Based on the above, there is something in my setup that causes unbound to crash on "reload" command. I modified the pfblockerng.inc file and replaced:

                
                                        exec("{$chroot_cmd} reload 2>&1", $result, $retval);
                                        pfb_logger('.', 1);
                

                with:

                
                                        exec("{$chroot_cmd} stop");
                                        sleep(1);
                                        pfb_logger('.', 1);
                                        mwexec("/usr/local/sbin/unbound -c {$g['unbound_chroot_path']}/unbound.conf 2>&1", $result, $retval);
                                        pfb_logger('.', 1);
                

                This procedure worked and didn't cause unbound to crash.

                Since this is clearly caused by unbound, may I suggest that you add a flag in pfBlockerNG with option to replace reload with stop/start for people experiencing this problem?

                1 Reply Last reply Reply Quote 0
                • B
                  bole5
                  last edited by Feb 19, 2017, 3:08 PM

                  My problem is now solved after following advice from BBcan177.
                  It turns out that all includes in unbound.conf options need to point to the same folder: /var/unbound.

                  I had following in my configuration:

                  My domain overrides

                  server:include: /usr/local/etc/unbound/se-servers.conf

                  All I needed to do is to change the /usr/local/etc/unbound to /var/unbound, and unbound stopped crashing on "reload".

                  My domain overrides

                  server:include: /var/unbound//se-servers.conf

                  Thank you very much BBcan177 for this great package!

                  1 Reply Last reply Reply Quote 0
                  • R
                    renegade
                    last edited by Mar 1, 2017, 5:15 PM

                    Sorry, but i don't understand the solution.

                    My Logs are full of:

                    Mar 1 17:56:37 	unbound 	30737:0 	info: start of service (unbound 1.6.0).
                    Mar 1 17:56:33 	unbound 	30737:0 	error: cannot chdir to directory: (No such file or directory)
                    Mar 1 17:56:33 	unbound 	30737:0 	info: service stopped (unbound 1.6.0).
                    Mar 1 17:50:18 	unbound 	30737:0 	info: start of service (unbound 1.6.0).
                    Mar 1 17:50:14 	unbound 	30737:0 	error: cannot chdir to directory: (No such file or directory)
                    Mar 1 17:50:14 	unbound 	30737:0 	info: service stopped (unbound 1.6.0).
                    Mar 1 17:46:30 	unbound 	30737:0 	info: start of service (unbound 1.6.0).
                    Mar 1 17:46:25 	unbound 	30737:0 	error: cannot chdir to directory: (No such file or directory)
                    Mar 1 17:46:25 	unbound 	30737:0 	info: service stopped (unbound 1.6.0). 
                    

                    and very often in the morning i cannot resolve any dns, because the unbound did not startup.

                    1 Reply Last reply Reply Quote 0
                    • B
                      bole5
                      last edited by Mar 1, 2017, 8:47 PM

                      @renegade: In my particular case I had my own domain overrides imported under DNS Resolver/Custom options. The file imported was located in /usr/local/etc/unbound/ folder. The unbound was crashing each time I would issue reload command and solution was to move my files to /var/unbound folder.

                      Please bear in mind that the error message I was getting was different than yours:

                      fatal error: Could not read config file: /unbound.conf
                      

                      You can try running commands listed by BBcan177 and in his post: https://forum.pfsense.org/index.php?topic=125773.msg694881#msg694881 and see if it provides more clues. After running commands check unbound log and dmesg.

                      You can also force restart of unblocker which worked for me (before my problem was solved). For details about that read the end my post where I describe what needs to be modified in pfblockerng.inc file: https://forum.pfsense.org/index.php?topic=125773.msg694915#msg694915

                      Good luck!

                      1 Reply Last reply Reply Quote 0
                      • K
                        kpa
                        last edited by Mar 1, 2017, 9:48 PM

                        Unbound is chrooted under /var/unbound in pfSense and that means all configuration files and files referenced by the configuration files must be placed under /var/unbound.

                        1 Reply Last reply Reply Quote 0
                        • R
                          renegade
                          last edited by Mar 3, 2017, 6:08 AM

                          tried the commands but there was no error.
                          this morning i had to start unbound again manually. after the 24h reconnect.

                          1 Reply Last reply Reply Quote 0
                          • RonpfSR
                            RonpfS
                            last edited by Mar 3, 2017, 6:21 AM

                            Should be fixed at some point https://redmine.pfsense.org/issues/7326

                            2.4.5-RELEASE-p1 (amd64)
                            Intel Core2 Quad CPU Q8400 @ 2.66GHz 8GB
                            Backup 0.5_5, Bandwidthd 0.7.4_4, Cron 0.3.7_5, pfBlockerNG-devel 3.0.0_16, Status_Traffic_Totals 2.3.1_1, System_Patches 1.2_5

                            1 Reply Last reply Reply Quote 0
                            • R
                              renegade
                              last edited by Mar 12, 2017, 10:28 AM

                              I removed many feeds from DNSBL and it seems to work stable… until 2.3.3p1 came. the last two mornings same effect.  I could not resolve any ip adress until i restarted bind manually by logging in and restarting it.

                              1 Reply Last reply Reply Quote 0
                              • B
                                BBcan177 Moderator
                                last edited by Mar 14, 2017, 6:51 PM

                                I have posted a patch in redmine for this issue:

                                https://redmine.pfsense.org/issues/7326

                                "Experience is something you don't get until just after you need it."

                                Website: http://pfBlockerNG.com
                                Twitter: @BBcan177  #pfBlockerNG
                                Reddit: https://www.reddit.com/r/pfBlockerNG/new/

                                1 Reply Last reply Reply Quote 0
                                • First post
                                  Last post
                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.
                                  This community forum collects and processes your personal information.
                                  consent.not_received