Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs

    Scheduled Pinned Locked Moved pfBlockerNG
    21 Posts 3 Posters 1.3k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • GertjanG
      Gertjan @Klaus2314
      last edited by Gertjan

      @Klaus2314

      That's not "/var/unbound/" as you've listed many files and folders that don't belong there.

      edit : ok, you've figured that out.

      @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

      watchdog active for unbound

      Ah, ok, you've the 'create random issues on my system' tool installed.
      My advise : don't use it.
      It a developer tool.
      You not developing unbound, you are using it. When 'pfSense' decides that it has to rstart unbound because pfBlocker (for example) was realoaded with new DNSBL, then let pfSense restart unbound.

      On my pfSense, unbound can run for days, weeks or longer. It is restarted at lest ones a week because my pfBlocker reloads (updates, if any) DNSBL and a consequence is : unbound is restarted.

      Your situation : you use unbound, pfBlockerbl so unbound can get restarted more often.
      Because you use pfBlockerbl and probably not the python mode ( is that so ? ) the restart can take a long time. One, two or more minutes to start again has been seen before, deping on the number of DNSBL entries.
      Restarting it at that moment, while it was already restarting, will create ... well, it won't do any good, that's for sure.
      That's te scenario you've created : the watchdog package checks every minute, for every selected service, if they are running. If it does,'t it will start it.
      But what if it was already in the startup phase ? ( noop, the watchdog package isn't smart ).

      That's why python mode was invented : unbound restarts way faster = no more, or far smaller, DNS outages.
      For myself : that's why I sync my DNSBL every week, not xx times a day.
      Result - for me - rock solid 'DNS'.

      ( and no DHCP DNS registration, as that option also restarts unbound on every incomonig lease (renewal) )

      edit :

      b563422f-0014-427b-8fe0-64730f4e9fb6-image.png

      😰

      Try this :
      In the GUI :
      stop unbound
      stop pfblocker

      Then : on the console or SSH :
      delete these two files.

      Now, in the GUI, start unbound and start pfblockerng.

      "PHP" can not create a 10 Mbytes file xx time per second. That's just impossible.

      edit 2 :

      c84ac42f-754b-4f5a-92a5-08e8579f4c4b-image.png

      68 Mbytes ... 😢
      If you don't use the python mode, your DNS is worthless as unbound can't read this file 'in a snap' during startup. It will take some serious time to do so.
      Consider making this file waaay smaller == way less DNSBL.

      I've a 4100, and, true, it has an Intel processor, but it's a small one.
      4 Mbytes is already a noticeable load on the unbound startup.

      No "help me" PM's please. Use the forum, the community will thank you.
      Edit : and where are the logs ??

      J 1 Reply Last reply Reply Quote 0
      • J
        jrey @Klaus2314
        last edited by

        @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

        I have watchdog active for unbound could that interfere with the pfblocker in terms of timing?

        can watchdog interfere - yes but only if unbound is failing and/or in the process of doing a "normal" restart.
        Keep in mind the it is generally recommended that watchdog is not running. Depending on your configuration pfblocker will restart unbound when it needs to (for example, a DNSBL list has changed).
        Personally I don't even have watchdog installed and as the DNSBL iists I use only change every 3 or 4 days even though I check daily for updates, unbound only restarts when an updated lists is available. The restart only takes a few seconds here, you'd never notice.

        But there is the problem, the file is rather large, not that size is an issue for a DB unless you are out of disk space.
        "Error: database disk image is malformed"

        in this case you could simply delete the two files when you are in that directory

        cd /var/unbound
        rm pfb_py_resolver.sqlite
        rm pfb_py_resolver.sqlite.invalid 
        

        alternative is you could just move the .sqlite version to a backup and the system will create a new one -- once that happens, you could then delete the backup and invalid versions. to do that you would

        cd /var/unbound
        mv pfb_py_resolver.sqlite pfb_py_resolver.sqlite.bak
        (wait for the new file to be created check with ls -l until it appears, won't take long, then)
        rm pfb_py_resolver.sqlite.bak
        rm pfb_py_resolver.sqlite.invalid 
        

        Did you manually copy or mv the file to .invalid - that would not be there normally.
        looks like based on the time stamps the system would be trying to write to the one dated Sep 3, but that today Sep 18 the invalid copy was created -- that may be when the error causing the corruption started? 14:25 (check logs)

        K 1 Reply Last reply Reply Quote 0
        • K
          Klaus2314 @jrey
          last edited by Klaus2314

          @jrey Thank's a lot for your advice! I'll dig into this. I have watchdog on because unbound (and somertimes kea dhcp) stopped working (I suspect after pfblocker updatding at night).
          I'll try again without.

          GertjanG 1 Reply Last reply Reply Quote 0
          • GertjanG
            Gertjan @Klaus2314
            last edited by

            @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

            stopped working

            I've edited my post above with some observations.
            You will find some very possible reasons why.

            @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

            and somertimes kea dhcp

            Be gentile to yourself : switch back to ISC for the moment.

            No "help me" PM's please. Use the forum, the community will thank you.
            Edit : and where are the logs ??

            K 1 Reply Last reply Reply Quote 0
            • K
              Klaus2314 @Gertjan
              last edited by

              @Gertjan ok, thanks again. I can't delete the files. permission denied.
              Doesn't sudo work on BSD? It says unknown command.

              Again, sorry for my novice ignorance :-D

              J 1 Reply Last reply Reply Quote 0
              • J
                jrey @Gertjan
                last edited by

                @Gertjan said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                It will take some serious time to do so.

                you'd think that will default settings a zone file that big the system would be throwing memory errors. That's a lot of zones. of course there is a limit as to how many it loads and ignores the rest. Either way though the list is too big.

                @Gertjan said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                switch back to ISC for the moment.

                Also true.

                GertjanG 1 Reply Last reply Reply Quote 0
                • J
                  jrey @Klaus2314
                  last edited by

                  @Klaus2314

                  does your ssh prompt say you are root or admin?

                  if you ssh in with a "normal" user account you won't have permission.

                  ssh bob@... logs in won't work (bob doesn't have permission)
                  ssh admin@... or ssh root@... logs in will work.

                  1 Reply Last reply Reply Quote 0
                  • GertjanG
                    Gertjan @jrey
                    last edited by Gertjan

                    @jrey said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                    default settings a zone

                    The default pfb_py_zone.txt file size is 'zero' as pfBlockerng has no DNSBL pre set or pre loaded.

                    Then the admin comes along, and selects a whole bunch of these :

                    826bf80b-be28-42ab-abc3-a4c4f0a98fb3-image.png

                    and then .... major issues happens. The system can't cope anymore.

                    Btw : unbound doesn't only 'allocate' memory to load the file, and stores it into that memort check.
                    Noop, it reads the fiie, interpretes every line, and starts to fill structures 'per line', making each line part of a big array with DNSBL to be matched on every host name request ...
                    This is done using arm or intel instrcution, true, but even so : it's very time consumming.
                    How long can be seen here :

                    bc18caea-ad04-42e0-96f6-a39e2626fadf-image.png

                    so with 119414 entries, it took 15:57-15:22 = already 35 seconds.

                    Oh, I get it : @Klaus2314 thought that unbound was 'broken' ....
                    Klaus2314 : check for yourself, ask for apfBlockerng reload here and see what the log tells you :

                    9e2bcb71-77ba-400d-860f-265a419b946a-image.png

                    and hit the blue run button.

                    No "help me" PM's please. Use the forum, the community will thank you.
                    Edit : and where are the logs ??

                    K J 2 Replies Last reply Reply Quote 0
                    • K
                      Klaus2314 @Gertjan
                      last edited by

                      @Gertjan OK, I have now managed to delete those files, de-installed watchdog and followed your other advice. I will see how it goes now and report back.
                      Thanks again for your patience!

                      1 Reply Last reply Reply Quote 0
                      • J
                        jrey @Gertjan
                        last edited by

                        @Gertjan said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                        so with 119414 entries, it took 15:57-15:22 = already 35 seconds.

                        What version are you running?, and this is on a 4100 I think you said?

                        Below is from a crap little test box (my production box a 2100) and has much the same with regards to timing, but has a larger "list" than shown here

                         Original    Matches    Removed    Final     
                         ----------------------------------------
                         166351      61104      67425      98926     
                         -----------------------------------------
                        TLD finalize... completed [ 09/15/24 04:15:10 ]
                        
                        Saving DNSBL statistics... completed
                        Reloading Unbound Resolver (DNSBL python)
                        Stopping Unbound Resolver.
                        Unbound stopped in 2 sec.
                        Additional mounts (DNSBL python):
                          No changes required.
                        Starting Unbound Resolver... completed [ 09/15/24 04:15:12 ]
                        DNSBL update [ 98926 | PASSED  ]... completed
                        

                        Certainly no 35 seconds..
                        Cache you say ?

                        I've never really seen a benefit to "Resolver cache restored" that takes a lot of time.
                        restoring cache, number of entries in zone are generally not related.

                        Cache works just fine... and any 2 hour sample here will run cache from about 35% to 60% cache hit depending who's doing what. The past 2 hours looks kind of like this.

                        Screen Shot 2024-09-18 at 9.50.49 AM.png

                        GertjanG 1 Reply Last reply Reply Quote 0
                        • GertjanG
                          Gertjan @jrey
                          last edited by Gertjan

                          @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                          Certainly no 35 seconds..

                          My pfb_py_zone.txt has 119 414 entries.
                          Yours has 98 926 entries, somewhat less.

                          before, you showed this :

                          11c6041c-764a-459d-804b-b3c090411d98-image.png

                          That's waaaay bigger. Like 68 Mbytes big.

                          Your pfSEnse has probably bit more 'hose power' as mine.

                          @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                          I've never really seen a benefit to "Resolver cache restored" that takes a lot of time.

                          The unbound cache - there are command line commands to dump it - isn't that big.
                          Its just useful to load it back in again. It's a cache after all.

                          My cache : same info, shown differently :

                          cf81d05a-620d-4298-a72c-c1b7da32968c-image.png

                          gray = served out of cache.
                          Other colors : resolving was needed.

                          No "help me" PM's please. Use the forum, the community will thank you.
                          Edit : and where are the logs ??

                          J 1 Reply Last reply Reply Quote 0
                          • J
                            jrey @Gertjan
                            last edited by

                            @Gertjan

                            You're confused, that file size you are quoting back is from the OP not me.

                            I was asking about your 35 second time from your system compared to my unbound and that section of the log where you show and calculate the 35 seconds.
                            a) yes on my little test box the "list" is a little smaller (process time 2 seconds) I said that.
                            b) I also said on my 2100 the list is a little larger than yours but the the timing of the unbound restart in that section of the log is similar to my little test box ~ 2-4 seconds

                            The only difference I can see is that you are restoring the unbound cache, that is where the time difference appears to be from. And I went on to say that I've never seen a benefit to doing that and my cache hit rate is just fine.

                            All I've said is I've never seen a benefit to restoring resolver cache...and therefore taking the time hit doing so. The cache works just fine here.

                            GertjanG 1 Reply Last reply Reply Quote 0
                            • GertjanG
                              Gertjan @jrey
                              last edited by

                              @jrey said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                              You're confused,

                              Noop. The @Klaus2314 didn't made into my post - I've edited.

                              No "help me" PM's please. Use the forum, the community will thank you.
                              Edit : and where are the logs ??

                              1 Reply Last reply Reply Quote 0
                              • K
                                Klaus2314
                                last edited by

                                Hi all,

                                since I deleted the files and de-installed watchdog no more errors occurred in the last 24 hours (which included a few cron jobs by pfblocker) so things seem to be fine again.

                                Oh, and yes, I have been running the python mode before (and still am) on DNSBL.

                                thanks again for helping.

                                1 Reply Last reply Reply Quote 2
                                • First post
                                  Last post
                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.