Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs

    Scheduled Pinned Locked Moved pfBlockerNG
    21 Posts 3 Posters 1.4k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • K
      Klaus2314
      last edited by Klaus2314

      Hi,

      I couldn't find a thread about this but recently I've been getting hundreds of these errors logged.
      I removed the package and re-installed but no change. I have re-booted the FW but no change.
      The system has been running for years and has been regularly updated without issues. Also pf-blocker seems to be running fine otherwise nothing in the pfblocker error logs. These only appear in system logs.

      Running
      2.7.2-RELEASE (amd64)
      pfblocker ng 3.2.0_8

      All recommended patches applied.
      Any idea what could be causing this?

      Thanks!

      43ac152a-ed6a-4a1c-80b9-5a83bbd79535-image.png

      J GertjanG 2 Replies Last reply Reply Quote 0
      • J
        jrey @Klaus2314
        last edited by

        @Klaus2314

        interesting.

        you might want to check file permissions?

        cd /var/unbound
        ls -l
        

        this is the database it is using
        pfb_py_resolver.sqlite

        what happens if you try to access the database / table directly do you get any errors?

        /var/unbound: sqlite3 pfb_py_resolver.sqlite
        SQLite version 3.43.1 2023-09-11 12:01:27
        Enter ".help" for usage hints.
        sqlite> .schema
        CREATE TABLE resolver (row integer, totalqueries integer, queries integer);
        CREATE TABLE lastclear ( row INTEGER, lastipclear TEXT, lastdnsblclear TEXT );
        sqlite> select * from lastclear;
        0|May 22 11:29:48|May 22 11:55:07
        sqlite> .exit
        

        so in the above you are typing the following and checking results as above.

        cd /var/unbound
        sqlite3 pfb_py_resolver.sqlite
         .schema
        select * from lastclear;
         .exit
        
        K 1 Reply Last reply Reply Quote 0
        • GertjanG
          Gertjan @Klaus2314
          last edited by Gertjan

          @Klaus2314

          pfBlocker needed to 'do something' with one of its SQL3 database files, and 5 times per second ( ! ) if found the file 'invalid' so recreated it.
          Sure enough, the file recreated was also invalid.
          And so on.

          The database is called 'lastevent', and its in the "/var/unbound/pfb_py_resolver.sqlite" file.

          [24.03-RELEASE][root@pfSense.bhf.tld]/root: sqlite3
          SQLite version 3.44.0 2023-11-01 11:23:50
          Enter ".help" for usage hints.
          Connected to a transient in-memory database.
          Use ".open FILENAME" to reopen on a persistent database.
          sqlite> .open /var/unbound/pfb_py_resolver.sqlite
          sqlite> .dump
          PRAGMA foreign_keys=OFF;
          BEGIN TRANSACTION;
          CREATE TABLE resolver (row integer, totalqueries integer, queries integer);
          INSERT INTO resolver VALUES(0,345511,0);
          CREATE TABLE lastclear ( row INTEGER, lastipclear TEXT, lastdnsblclear TEXT );
          INSERT INTO lastclear VALUES(0,'Sep 15 00:00:00','Sep 15 00:00:00');
          CREATE TABLE lastevent ( row INTEGER, groupname TEXT, entry TEXT, details TEXT );
          INSERT INTO lastevent VALUES(0,'DNSBL_Regex','incoming.telemetry.mozilla.org192.168.1.6','incoming.telemetry.mozilla.org,192.168.1.6,-|PRI|HTTP/2.0|-,Python_HTTPS,DNSBL_Regex,incoming.telemetry.mozilla.org,test_rgx2');
          COMMIT;
          sqlite> .quit
          

          Quiet small actually.

          Anyway, you know now what it is.
          The thing is, pfBlockerng, PHP actually, can't create this 20Kbytes file in the /var/unbound/pfb_py_resolver.sqlite folder.
          That's ... not good.

          I did discover that that file is update very often ... mine was regenerated every couple of seconds for sure.

          So ... disk full ? Or the partition that hosts "/var/unbound/" is full ?
          The drive/partition was mounted cleanly during the last reboot ?
          Or does you pfblockerng has so much activity that the number of "last events" are arriving faster as that PHP can write out the disk ?

          No "help me" PM's please. Use the forum, the community will thank you.
          Edit : and where are the logs ??

          1 Reply Last reply Reply Quote 0
          • K
            Klaus2314 @jrey
            last edited by

            @jrey said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

            this is the database it is using
            pfb_py_resolver.sqlite

            Thank you!
            forgot to mention: I have watchdog active for unbound could that interfere with the pfblocker in terms of timing?

            Also: the drive is far from full. tons of space left.

            This doesn't seem to exist in that folder:

            ac850e3c-ca41-412f-8098-3e0def40957b-image.png

            K GertjanG J 3 Replies Last reply Reply Quote 0
            • K
              Klaus2314 @Klaus2314
              last edited by

              @Klaus2314 Sorry, I just realised that no matter what directory I try to change to using "cd" in the command line window of the pfsense gui it will always stay in the root folder. Odd...

              J 1 Reply Last reply Reply Quote 0
              • J
                jrey @Klaus2314
                last edited by

                @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                in the command line window of the pfsense gui

                you need to ssh in to run this -- sorry I didn't specifically mention that

                K 1 Reply Last reply Reply Quote 0
                • K
                  Klaus2314 @jrey
                  last edited by

                  @jrey Yeah, no worries. sorry for my ignorance. Got this

                  0bf1e268-7c7d-422a-b3e9-d147500e7e42-image.png

                  and

                  2ca2f67b-1117-4e7b-bb46-1d17f0fee526-image.png

                  1 Reply Last reply Reply Quote 0
                  • GertjanG
                    Gertjan @Klaus2314
                    last edited by Gertjan

                    @Klaus2314

                    That's not "/var/unbound/" as you've listed many files and folders that don't belong there.

                    edit : ok, you've figured that out.

                    @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                    watchdog active for unbound

                    Ah, ok, you've the 'create random issues on my system' tool installed.
                    My advise : don't use it.
                    It a developer tool.
                    You not developing unbound, you are using it. When 'pfSense' decides that it has to rstart unbound because pfBlocker (for example) was realoaded with new DNSBL, then let pfSense restart unbound.

                    On my pfSense, unbound can run for days, weeks or longer. It is restarted at lest ones a week because my pfBlocker reloads (updates, if any) DNSBL and a consequence is : unbound is restarted.

                    Your situation : you use unbound, pfBlockerbl so unbound can get restarted more often.
                    Because you use pfBlockerbl and probably not the python mode ( is that so ? ) the restart can take a long time. One, two or more minutes to start again has been seen before, deping on the number of DNSBL entries.
                    Restarting it at that moment, while it was already restarting, will create ... well, it won't do any good, that's for sure.
                    That's te scenario you've created : the watchdog package checks every minute, for every selected service, if they are running. If it does,'t it will start it.
                    But what if it was already in the startup phase ? ( noop, the watchdog package isn't smart ).

                    That's why python mode was invented : unbound restarts way faster = no more, or far smaller, DNS outages.
                    For myself : that's why I sync my DNSBL every week, not xx times a day.
                    Result - for me - rock solid 'DNS'.

                    ( and no DHCP DNS registration, as that option also restarts unbound on every incomonig lease (renewal) )

                    edit :

                    b563422f-0014-427b-8fe0-64730f4e9fb6-image.png

                    😰

                    Try this :
                    In the GUI :
                    stop unbound
                    stop pfblocker

                    Then : on the console or SSH :
                    delete these two files.

                    Now, in the GUI, start unbound and start pfblockerng.

                    "PHP" can not create a 10 Mbytes file xx time per second. That's just impossible.

                    edit 2 :

                    c84ac42f-754b-4f5a-92a5-08e8579f4c4b-image.png

                    68 Mbytes ... 😢
                    If you don't use the python mode, your DNS is worthless as unbound can't read this file 'in a snap' during startup. It will take some serious time to do so.
                    Consider making this file waaay smaller == way less DNSBL.

                    I've a 4100, and, true, it has an Intel processor, but it's a small one.
                    4 Mbytes is already a noticeable load on the unbound startup.

                    No "help me" PM's please. Use the forum, the community will thank you.
                    Edit : and where are the logs ??

                    J 1 Reply Last reply Reply Quote 0
                    • J
                      jrey @Klaus2314
                      last edited by

                      @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                      I have watchdog active for unbound could that interfere with the pfblocker in terms of timing?

                      can watchdog interfere - yes but only if unbound is failing and/or in the process of doing a "normal" restart.
                      Keep in mind the it is generally recommended that watchdog is not running. Depending on your configuration pfblocker will restart unbound when it needs to (for example, a DNSBL list has changed).
                      Personally I don't even have watchdog installed and as the DNSBL iists I use only change every 3 or 4 days even though I check daily for updates, unbound only restarts when an updated lists is available. The restart only takes a few seconds here, you'd never notice.

                      But there is the problem, the file is rather large, not that size is an issue for a DB unless you are out of disk space.
                      "Error: database disk image is malformed"

                      in this case you could simply delete the two files when you are in that directory

                      cd /var/unbound
                      rm pfb_py_resolver.sqlite
                      rm pfb_py_resolver.sqlite.invalid 
                      

                      alternative is you could just move the .sqlite version to a backup and the system will create a new one -- once that happens, you could then delete the backup and invalid versions. to do that you would

                      cd /var/unbound
                      mv pfb_py_resolver.sqlite pfb_py_resolver.sqlite.bak
                      (wait for the new file to be created check with ls -l until it appears, won't take long, then)
                      rm pfb_py_resolver.sqlite.bak
                      rm pfb_py_resolver.sqlite.invalid 
                      

                      Did you manually copy or mv the file to .invalid - that would not be there normally.
                      looks like based on the time stamps the system would be trying to write to the one dated Sep 3, but that today Sep 18 the invalid copy was created -- that may be when the error causing the corruption started? 14:25 (check logs)

                      K 1 Reply Last reply Reply Quote 0
                      • K
                        Klaus2314 @jrey
                        last edited by Klaus2314

                        @jrey Thank's a lot for your advice! I'll dig into this. I have watchdog on because unbound (and somertimes kea dhcp) stopped working (I suspect after pfblocker updatding at night).
                        I'll try again without.

                        GertjanG 1 Reply Last reply Reply Quote 0
                        • GertjanG
                          Gertjan @Klaus2314
                          last edited by

                          @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                          stopped working

                          I've edited my post above with some observations.
                          You will find some very possible reasons why.

                          @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                          and somertimes kea dhcp

                          Be gentile to yourself : switch back to ISC for the moment.

                          No "help me" PM's please. Use the forum, the community will thank you.
                          Edit : and where are the logs ??

                          K 1 Reply Last reply Reply Quote 0
                          • K
                            Klaus2314 @Gertjan
                            last edited by

                            @Gertjan ok, thanks again. I can't delete the files. permission denied.
                            Doesn't sudo work on BSD? It says unknown command.

                            Again, sorry for my novice ignorance :-D

                            J 1 Reply Last reply Reply Quote 0
                            • J
                              jrey @Gertjan
                              last edited by

                              @Gertjan said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                              It will take some serious time to do so.

                              you'd think that will default settings a zone file that big the system would be throwing memory errors. That's a lot of zones. of course there is a limit as to how many it loads and ignores the rest. Either way though the list is too big.

                              @Gertjan said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                              switch back to ISC for the moment.

                              Also true.

                              GertjanG 1 Reply Last reply Reply Quote 0
                              • J
                                jrey @Klaus2314
                                last edited by

                                @Klaus2314

                                does your ssh prompt say you are root or admin?

                                if you ssh in with a "normal" user account you won't have permission.

                                ssh bob@... logs in won't work (bob doesn't have permission)
                                ssh admin@... or ssh root@... logs in will work.

                                1 Reply Last reply Reply Quote 0
                                • GertjanG
                                  Gertjan @jrey
                                  last edited by Gertjan

                                  @jrey said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                                  default settings a zone

                                  The default pfb_py_zone.txt file size is 'zero' as pfBlockerng has no DNSBL pre set or pre loaded.

                                  Then the admin comes along, and selects a whole bunch of these :

                                  826bf80b-be28-42ab-abc3-a4c4f0a98fb3-image.png

                                  and then .... major issues happens. The system can't cope anymore.

                                  Btw : unbound doesn't only 'allocate' memory to load the file, and stores it into that memort check.
                                  Noop, it reads the fiie, interpretes every line, and starts to fill structures 'per line', making each line part of a big array with DNSBL to be matched on every host name request ...
                                  This is done using arm or intel instrcution, true, but even so : it's very time consumming.
                                  How long can be seen here :

                                  bc18caea-ad04-42e0-96f6-a39e2626fadf-image.png

                                  so with 119414 entries, it took 15:57-15:22 = already 35 seconds.

                                  Oh, I get it : @Klaus2314 thought that unbound was 'broken' ....
                                  Klaus2314 : check for yourself, ask for apfBlockerng reload here and see what the log tells you :

                                  9e2bcb71-77ba-400d-860f-265a419b946a-image.png

                                  and hit the blue run button.

                                  No "help me" PM's please. Use the forum, the community will thank you.
                                  Edit : and where are the logs ??

                                  K J 2 Replies Last reply Reply Quote 0
                                  • K
                                    Klaus2314 @Gertjan
                                    last edited by

                                    @Gertjan OK, I have now managed to delete those files, de-installed watchdog and followed your other advice. I will see how it goes now and report back.
                                    Thanks again for your patience!

                                    1 Reply Last reply Reply Quote 0
                                    • J
                                      jrey @Gertjan
                                      last edited by

                                      @Gertjan said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                                      so with 119414 entries, it took 15:57-15:22 = already 35 seconds.

                                      What version are you running?, and this is on a 4100 I think you said?

                                      Below is from a crap little test box (my production box a 2100) and has much the same with regards to timing, but has a larger "list" than shown here

                                       Original    Matches    Removed    Final     
                                       ----------------------------------------
                                       166351      61104      67425      98926     
                                       -----------------------------------------
                                      TLD finalize... completed [ 09/15/24 04:15:10 ]
                                      
                                      Saving DNSBL statistics... completed
                                      Reloading Unbound Resolver (DNSBL python)
                                      Stopping Unbound Resolver.
                                      Unbound stopped in 2 sec.
                                      Additional mounts (DNSBL python):
                                        No changes required.
                                      Starting Unbound Resolver... completed [ 09/15/24 04:15:12 ]
                                      DNSBL update [ 98926 | PASSED  ]... completed
                                      

                                      Certainly no 35 seconds..
                                      Cache you say ?

                                      I've never really seen a benefit to "Resolver cache restored" that takes a lot of time.
                                      restoring cache, number of entries in zone are generally not related.

                                      Cache works just fine... and any 2 hour sample here will run cache from about 35% to 60% cache hit depending who's doing what. The past 2 hours looks kind of like this.

                                      Screen Shot 2024-09-18 at 9.50.49 AM.png

                                      GertjanG 1 Reply Last reply Reply Quote 0
                                      • GertjanG
                                        Gertjan @jrey
                                        last edited by Gertjan

                                        @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                                        Certainly no 35 seconds..

                                        My pfb_py_zone.txt has 119 414 entries.
                                        Yours has 98 926 entries, somewhat less.

                                        before, you showed this :

                                        11c6041c-764a-459d-804b-b3c090411d98-image.png

                                        That's waaaay bigger. Like 68 Mbytes big.

                                        Your pfSEnse has probably bit more 'hose power' as mine.

                                        @Klaus2314 said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                                        I've never really seen a benefit to "Resolver cache restored" that takes a lot of time.

                                        The unbound cache - there are command line commands to dump it - isn't that big.
                                        Its just useful to load it back in again. It's a cache after all.

                                        My cache : same info, shown differently :

                                        cf81d05a-620d-4298-a72c-c1b7da32968c-image.png

                                        gray = served out of cache.
                                        Other colors : resolving was needed.

                                        No "help me" PM's please. Use the forum, the community will thank you.
                                        Edit : and where are the logs ??

                                        J 1 Reply Last reply Reply Quote 0
                                        • J
                                          jrey @Gertjan
                                          last edited by

                                          @Gertjan

                                          You're confused, that file size you are quoting back is from the OP not me.

                                          I was asking about your 35 second time from your system compared to my unbound and that section of the log where you show and calculate the 35 seconds.
                                          a) yes on my little test box the "list" is a little smaller (process time 2 seconds) I said that.
                                          b) I also said on my 2100 the list is a little larger than yours but the the timing of the unbound restart in that section of the log is similar to my little test box ~ 2-4 seconds

                                          The only difference I can see is that you are restoring the unbound cache, that is where the time difference appears to be from. And I went on to say that I've never seen a benefit to doing that and my cache hit rate is just fine.

                                          All I've said is I've never seen a benefit to restoring resolver cache...and therefore taking the time hit doing so. The cache works just fine here.

                                          GertjanG 1 Reply Last reply Reply Quote 0
                                          • GertjanG
                                            Gertjan @jrey
                                            last edited by

                                            @jrey said in DNSBL SQLite3 database [ lastevent ] corrupt messages in system logs:

                                            You're confused,

                                            Noop. The @Klaus2314 didn't made into my post - I've edited.

                                            No "help me" PM's please. Use the forum, the community will thank you.
                                            Edit : and where are the logs ??

                                            1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post
                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.