V 3.2.0 with pfsense 23.01 RC 20230202
-
I am also facing severe problems with pfsense 23.01-RC and pfblocker 3.2.0.
For me, the pfBlocker cron job also hangs on the HTTP 200 message.
In fact, it doesn't hang completely; it just takes a really long time to process a blocklist and continue with the next one. Previously (on pfsense 22.05 and pfblocker 3.1.0), processing one block list took about 20 seconds on my system, but on pfsense 23.01-RC and pfblocker 3.2.0, the same list takes over 30 minutes. Then I thought I would just let the cron job run until it finished, even if it took hours, as it can run in the background.
The next morning I saw that unbound had been killed because of out-of-memory errors.
According to the logs, the whole cron job took about 16 hours instead of the usual less than 10 minutes. After the OOM errors, I couldn't get unbound to restart on 23.01, so I reverted to 22.05.
Edit1: I had 8GB RAM in my system (on 22.05 it used between 3 and 5GB), and I have now even upgraded to 16GB, which has not helped with my OOM errors with unbound either.
Note: I do NOT have Wildcard Blocking (TLD) enabled.
-
@greenflash Have you applied the patch above by jimp? I also did not (and do not) have wildcard TLD blocking enabled, but still had the slow reload problem. After the patch, feeds get processed in a timely manner.
-
@pfsjap Not yet, but I will try that.
But I highly doubt that it will fix my out-of-memory errors. The out-of-memory errors could be the same as those described here: https://forum.netgate.com/topic/177559/23-01-r-20230202-1645-2500-mb-laundry
-
Ran into the same problem with the memory usage.
Not sure if the other issues I had with 23.01 came from pfBlockerNG, but IPv6 had some weird issues which I couldn't resolve.
For now I rolled back to 22.05.
-
@pfsjap I am now back at 23.01-RC1 and have applied the patch from jimp successfully.
This has solved my issues with the long reloading (at least to a certain point).
The patch has indeed reduced the reload time from ~16 hours (23.01 without the patch) to around 2.5 hours (147 minutes exactly).
While this is a significant improvement, the reload is still far slower than on 22.05, where a full reload normally takes less than 20 minutes. But the much worse issue for me still persists:
After the full reload of all the block lists, unbound gets restarted, but then gets killed again because it has used all of the available memory. So for me the memory issues could be tracked down to unbound, see:
[23.01-RC][fabian@***]/home/fabian: top
last pid: 44352;  load averages: 1.71, 0.87, 0.80  up 0+03:01:07  12:01:55
69 processes: 2 running, 62 sleeping, 5 waiting
CPU: 24.3% user, 0.0% nice, 2.8% system, 0.4% interrupt, 72.5% idle
Mem: 9845M Active, 1156K Inact, 3833M Laundry, 1662M Wired, 337M Free
ARC: 1121M Total, 713M MFU, 384M MRU, 1312K Anon, 15M Header, 8207K Other
     1035M Compressed, 6602M Uncompressed, 6.38:1 Ratio
Swap: 2048M Total, 1251M Used, 796M Free, 61% Inuse, 153M Out

  PID USERNAME  THR PRI NICE  SIZE   RES STATE   C   TIME    WCPU COMMAND
39855 unbound     1 135    0   14G   13G CPU1    1   1:58  99.88% unbound
  916 root        1  68    0  170M   27M accept  0   1:08   0.62% php-fpm
 2584 root        8  20    0  968M  187M nanslp  3  13:46   0.50% suricata
14741 root        9  20    0  980M  208M nanslp  3  13:48   0.48% suricata
32505 root        1  20    0   31M 4240K kqread  2   0:01   0.06% nginx
82817 fabian      1  20    0   14M 3020K CPU0    0   0:00   0.04% top
(top just before unbound gets killed because of oom)
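For readers who want to see how little headroom that leaves, here is a minimal Python sketch that adds up the categories on the "Mem:" line of the top output. It is only an illustration: the helper name parse_mem_line is made up for this example, and it assumes top's K/M/G suffixes are binary units.

```python
import re

# "Mem:" line copied from the top output in the post.
MEM_LINE = "Mem: 9845M Active, 1156K Inact, 3833M Laundry, 1662M Wired, 337M Free"

# Assumption: top's suffixes are binary units; convert everything to MiB.
UNIT_TO_MIB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_mem_line(line):
    """Return {category: MiB} for a top 'Mem:' line (hypothetical helper)."""
    fields = re.findall(r"(\d+)([KMG])\s+(\w+)", line)
    return {name: int(value) * UNIT_TO_MIB[unit] for value, unit, name in fields}


mem = parse_mem_line(MEM_LINE)
total_mib = sum(mem.values())
free_share = mem["Free"] / total_mib

print(f"accounted for: {total_mib:.0f} MiB")
print(f"free: {free_share:.1%}")
```

With the figures above, the categories add up to roughly the 16GB installed and only about 2% of it is free, which fits the picture of unbound (13G resident) having consumed nearly everything.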
-
@greenflash Is your DNSBL Mode set to Unbound Mode? Changing to Unbound Python Mode may help.
I have 1GB RAM on Netgate 1100 with 260MB set to RAM disk. Not that many feeds, though.
-
@pfsjap currently it is already set to Unbound Python Mode. I do have 16GB of RAM installed, on 22.05 the whole system never used more than 5 GB at maximum.
I have not changed any of my block list feeds since the upgrade. Therefore there has to be some kind of issue with memory management (or even a memory leak) regarding pfblocker and unbound on 23.01.
-
@greenflash Ok, hopefully you'll get help with this.
-
@greenflash I worked on Unbound quite a bit over the past month tracking down memory-related issues.
What does your DNSBL setup look like?
-
@cmcdonald said in V 3.2.0 with pfsense 23.01 RC 20230202:
What does your DNSBL setup look like?
Do you mean this settings page?
spoiler
Or this one:
spoiler
-
@tcw said in V 3.2.0 with pfsense 23.01 RC 20230202:
No change. Confirmed the patch applied. Updated to 23.01.r.20230202.1645 from 23.01.r.20230202.0019 yesterday and confirmed successful pfBlockerNG force reload all, before and after the update, and before and after applying the patch, with success as long as Wildcard Blocking (TLD) is unselected.
The "TLD finalize.." step seemed to take just a couple of seconds on 22.05 with my hardware, so I don't believe it's an issue of my not waiting long enough (especially now since the patch seems to have corrected a typo to enforce timeout in 15 seconds).
Let me know how else I may be able to help.
Finally got time to upgrade to 23.01-RC and can confirm that with Wildcard Blocking (TLD) feature enabled the update/reload process hangs on "TLD finalize..."
There's a Redmine ticket for this issue: https://redmine.pfsense.org/issues/13884
-
@jimp's patch just got applied to an updated pfBlockerNG v. 3.2.0_1. (Thanks!) That appears to have been the only change.
-
Using the latest pfSense RC :
23.01-RC (amd64) built on Wed Feb 08 06:11:39 UTC 2023
TLD Whitelist selected.
I'm here:
UPDATE PROCESS START [ v3.2.0_1 ] [ 02/8/23 11:13:01 ]
===[ DNSBL Process ]================================================
Loading DNSBL Statistics... completed
Missing DNSBL stats and/or Unbound DNSBL files - Rebuilding
Loading DNSBL SafeSearch... enabled
Loading DNSBL Whitelist... completed
Blacklist database(s) ... exists.
[ StevenBlack_ADs ] Downloading update .. 200 OK.
Whitelist: 15.taboola.com|aax-eu.amazon-adsystem.com|adsafeprotected.com|am-match.taboola.com| ..... snipped
Orig.    Unique   # Dups   # White   # TOP1M   Final
----------------------------------------------------------------------
177888   177888   0        97        0         177791
----------------------------------------------------------------------
------------------------------------------------------------------------
Assembling DNSBL database...... completed [ 02/8/23 11:13:13 ]
TLD:
TLD analysis.. completed [ 02/8/23 11:13:17 ]
TLD finalize..
and I understand why :
The /tmp/dnsbl_tld_remove file - the list with TLDs to remove is 37000+ lines.
The /var/unbound/pfb_py_data.txt.raw file is 133608 lines.
[edit]
From what I make of this : each of the 37000+ lines is checked (grepped) with every line in the 133608 file.
So, 37000 times 133608 'greps' to be executed.
That's huge ... And I have only one DNSBL feed, with "133608" DNSBL entries.
[end edit]
I copied both files to /root/ and repeated the command 'on the command line'.
This command is great for maxing out one core at 100%, and it will take minutes if not hours to complete. pfblockerng-devel does this with PHP handling the return (output). That will make things even worse.
49 degrees and rising. Off to the kitchen, looking for some eggs.
I guess not using (unchecking) Wildcard Blocking (TLD) is the best option right now.
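To put a number on that estimate, here is a small Python sketch. The first part only multiplies the two line counts given above. The second part is a hypothetical illustration of why a hash-set lookup scales better than checking every pattern against every line; it assumes exact whole-line matching on toy data, which is not necessarily what the TLD step needs, and it is not pfBlockerNG's code or the later fix.

```python
# Line counts reported in the post above.
REMOVE_LINES = 37_000    # /tmp/dnsbl_tld_remove ("37000+")
DATA_LINES = 133_608     # /var/unbound/pfb_py_data.txt.raw

# Worst case if every pattern is compared against every data line.
pairwise_checks = REMOVE_LINES * DATA_LINES


def filter_with_set(data_lines, remove_lines):
    """Drop every data line that also appears in remove_lines.

    Assumption: exact, whole-line matching. Building the set is one pass
    over remove_lines; each data line then costs a single hash lookup.
    """
    remove = set(remove_lines)
    return [line for line in data_lines if line not in remove]


# Toy data for illustration only, not taken from the real files.
data = ["ads.example.com", "tracker.example.net", "cdn.example.org"]
to_remove = ["tracker.example.net", "unused.example.com"]
kept = filter_with_set(data, to_remove)

print(f"{pairwise_checks:,} pairwise checks in the worst case")
print(kept)
```

The pairwise approach grows with the product of the two file sizes, while the set-based one grows with their sum, which is why the same inputs can go from hours to well under a second when exact matching is enough.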
-
@gertjan said in V 3.2.0 with pfsense 23.01 RC 20230202:
I guess not using TLD Whitelisting is the best option right now.
I'm not using TLD Whitelist
My DNSBL Mode is set to "Unbound python mode" and as pfBlockerNG states: "TLD Whitelist is not utilized for Unbound python mode! Use DNSBL Whitelist instead."
The main problem is when Wildcard Blocking (TLD) is enabled.
-
@emikaadeo
You're right. That's the one:
-
I believe mine is now having the same issue with my upgrade to 23.01-Final. I manually did a reload and it's at 20 minutes, stuck on "TLD finalize."
I did have an error: on its first boot I got a banner about this extensive error: https://pastebin.com/aj8q4Mjw Other than that, it seems to work fine and appears to be passing traffic across 2 VLANs and 1 WAN.
-
This happened to me today as well and likewise disabling Wildcard Blocking (TLD) worked around it. grep was stuck at 100% CPU utilization for several minutes otherwise.
-
Any fix for this? Except disabling Wildcard TLD blocking
-
@opit-gmbh said in V 3.2.0 with pfsense 23.01 RC 20230202:
Any fix for this? Except disabling Wildcard TLD blocking
Not yet: https://www.patreon.com/posts/pfblockerng-v3-2-78781333
-
@steveits @jmontleon @OpIT-GmbH
It is now fixed with 3.2.0_3 version :)
https://forum.netgate.com/post/1088962