Disk usage slowly increasing
-
I have an SG-3100, and every day I see my disk usage gradually increasing both using the UI widget on the main page and by doing a
df -h
on the command line. I have set up daily e-mails to monitor the situation, and my disk fills up about every couple of weeks if I take no action. I've been trying to figure out where the disk space is being occupied using
du -h * | sort -n
in the various directories, to no avail. The weird behavior occurs, though, when simply navigating the directory tree in the shell. Today I had 63% (4G) used; I logged in, changed directory to /var, and immediately the usage went down to 29% (1.9G) used. I had at one point narrowed it down to the /.sujournal file growing, but I haven't been able to catch that a second time to confirm it.
The only thing that changed in my setup recently is the addition of pfblocker-ng-devel. The logs are only a few KB in size, so it's not that. I have no idea where else this space could be getting used up, and suspect it must be some weird file-system thing if it just frees automatically when logging in to the system and navigating around the directory tree.
If I take no action, eventually the system stops routing, and the logs get filled up with disk space errors. Simply rebooting the system also causes the disk space to return to about 29% usage.
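A side note on the du invocation above: sort -n does not understand the K/M/G suffixes that du -h prints, so the ordering can be misleading. A minimal sketch of an alternative, assuming a POSIX shell; the function name largest_dirs is made up for illustration:

```sh
# Hypothetical helper: list the largest directories under a starting point.
# -x keeps du on a single filesystem; -k prints sizes in KiB so that
# sort -n can order them correctly.
largest_dirs() {
    dir=${1:-.}
    count=${2:-10}
    du -xk "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, largest_dirs /var 15 would print the 15 largest directories below /var, biggest first. It only sees space that belongs to files reachable in the tree, so it may well show nothing unusual in a case like this one.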
-
@anthonyoteri Sounds exactly like the issue I’m seeing after upgrading to 21.05 (using pfBlockerNG).
See my thread here: https://forum.netgate.com/topic/164786/upgraded-to-21-05-and-now-filesystem-is-filling-with
Can you confirm that you will also regain the disk space by stopping pfBlockerNG or rebooting the appliance?
Seems it’s an issue with the python integration to unbound DNS resolver.
-
@keyser Yes, I can confirm that rebooting the appliance fixes this. It cleared this morning, so I can't test stopping pfblocker-ng until I start to see the usage increase again.
And yes, I'm using pfblocker-ng-devel with the python integration to the DNS resolver.
-
Okay Cool. We can just hope that more will face this issue so it becomes rather obvious what’s wrong.
Perhaps then @BBcan177 at some point has time to look into the issue. He is “GOD” when it comes to pfBlockerNG (being the developer), and we owe him a great deal of gratitude for creating such a GREAT package for pfSense.
-
I can confirm that changing pfBlockerNG 3.0.16 from “Unbound Python mode” to “Unbound mode” removes the continuous write I/O from unbound to disk. Everything still works as expected, but there is much less write I/O done on the system.
The memory usage on my SG-2100 also went down from 15% to 10% by making this change.
I will report back tomorrow on whether the disk filling issue really is stopped by this workaround.
I will also report on whether the memory use stays permanently lower.
-
I can now also confirm the filling filesystem issue is gone once pfBlockerNG is changed to "Unbound Mode" instead of python mode.
So this will serve as a workaround until the issue with Python mode filling the filesystem is solved.
NOTE: It seems my pfBlockerNG stopped logging DNSBL hits once I changed to Unbound mode.
The counters in the widget no longer increase, and no hits are registered in the DNSBL report.
But DNSBL is still active and working.
-
@keyser said in Disk usage slowly increasing:
NOTE: It seems my pfBlockerNG stopped logging DNSBL hits once I changed to Unbound mode.
The counters in the widget no longer increase, and no hits are registered in the DNSBL report.
But DNSBL is still active and working.
That's one of the reasons the 'python mode' exists: it can interact with the DNS traffic, and log.
Btw: the python module logs here: /var/log/pfblockerng/
Their max size (in lines) can be set with:
-
@gertjan Yes, I know, but then something is wrong in the script that does the logging, because the filesystem fills until pfSense crashes regardless of the limit set there. And the log file sizes do not show several GB of storage in use.
-
These are my files :
-rw-------  1 unbound  unbound  14534320 Jul  2 15:10 dns_reply.log
-rw-------  1 unbound  unbound   2713595 Jul  2 15:10 dnsbl.log
-rw-r--r--  1 root     wheel           0 Dec  9  2020 dnsbl_error.log
-rw-------  1 root     wheel       53752 Jul  2 12:00 dnsbl_parsed_error.log
-rw-------  1 root     wheel       19383 Jul  2 12:00 extras.log
-rw-------  1 root     wheel     3325212 Jul  2 15:10 ip_block.log
-rw-r--r--  1 root     wheel         119 Jun 10 02:00 maxmind_ver
-rw-------  1 root     wheel     1728760 Jul  2 12:16 pfblockerng.log
-rw-r--r--  1 unbound  unbound         0 May 21 12:20 py_error.log
-rw-------  1 unbound  unbound  14606143 Jul  2 15:10 unified.log
The number of lines in the file "dns_reply.log" is
wc -l dns_reply.log
157894 dns_reply.log
That's not 20 000 lines but 157 894 lines!
Changing the limit from 20 000 to 40 000 didn't change the file size.
But a force reload of pfBlockerNG-devel does the job.
Now I see this :
wc -l dns_reply.log
40000 dns_reply.log
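The same check can be run over every log in the directory at once. This is only a sketch: the function name logs_over_limit is hypothetical, and the limit passed in is whatever value is configured in pfBlockerNG (40 000 in the example here). Reading the root-owned logs needs matching privileges.

```sh
# Hypothetical helper: print "<lines> <file>" for each *.log file in a
# directory whose line count exceeds the given limit.
logs_over_limit() {
    dir=$1
    limit=$2
    for f in "$dir"/*.log; do
        [ -f "$f" ] || continue
        # wc -l pads its output on some systems; strip the spaces.
        lines=$(wc -l < "$f" | tr -d ' ')
        if [ "$lines" -gt "$limit" ]; then
            printf '%s %s\n' "$lines" "$f"
        fi
    done
}
```

Running logs_over_limit /var/log/pfblockerng 40000 before and after a force reload would show whether the configured limit is actually being applied.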
Did you lose your disk space to the files in /var/log/pfblockerng?
Or elsewhere?
Are any other packages installed?
-
@gertjan Interesting…
I just noticed that pfBlockerNG, when in Unbound Python mode, creates several extra filesystems that are mounted in different places under /var.
When in regular Unbound mode, those filesystems are not there, and the log is just placed in /var/log/pfBlockerNG in the root filesystem.
Anyhow, your dns_reply file does report a “big” size, so you can see that’s the “culprit” that’s using up filesystem space. I could not locate such a file that used up all the space.
But I’m a *nix/bsd novice so this may be a dumb question:
Does a “du” command only parse the current filesystem you are in, and not others mounted in subfolders? Because then I might have missed big log files since they were located in one of the new filesystems.
No files reported that kind of disk space usage to “du -h” on my system when running it in the “/” filesystem.
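On the du question: by default du does descend into filesystems mounted below the starting directory; it only stays on one filesystem when given -x. A minimal sketch for checking this on a given box, assuming a POSIX shell; the function names mount_point_of and du_compare are made up for illustration:

```sh
# Hypothetical helper: print the mount point of the filesystem holding a path.
# df -P uses the POSIX output format, where the mount point is the sixth
# field (this assumes the device name contains no spaces).
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Hypothetical helper: compare du totals (in KiB) with and without
# crossing into other mounted filesystems.
du_compare() {
    printf 'all filesystems: %s\n' "$(du -sk "$1" 2>/dev/null | awk '{ print $1 }')"
    printf 'this filesystem: %s\n' "$(du -xsk "$1" 2>/dev/null | awk '{ print $1 }')"
}
```

If du_compare /var prints two clearly different totals, the difference sits on filesystems mounted under /var, and mount_point_of can then tell which mount a suspicious directory belongs to.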
-
I have an SG-1100, upgraded from 2.4.1p1 to 21.05 about 2 weeks ago. Only extension is pfBlocker-NG-devel, and changed it from unbound to python mode at the same time as the upgrade.
Same symptom here, since above changes, disk usage slowly increasing ~3% per day. Rebooted 2 days ago, mainly because RAM usage had also climbed to 92%.
Have started collecting more logging to try to narrow down the problem/s, and will do so for a few more days, then switch back to unbound mode to compare.
-
@anthonys Thank you for reporting back. Kind of nice to know it’s a more widespread issue across different hardware.
Makes me wonder if this is a general issue with 21.05 and pfB-NG 3.0.0.16 and people just haven’t upgraded yet since no more reports are coming in?
-
@keyser an interim update.
After reboot, disk usage was 22%, after running 7 days 28%,
then executed the command
du -s /var/unbound
which returned the reported disk usage to 22%.
Continuing work to locate which sub-directory in /var/unbound is responsible.
But given that executing du repairs the disk usage (df -hi /), my first reaction is: a kernel bug?
-
Quick update, have narrowed down to /var/unbound/usr/local/lib/python3.8
Shell Output - df -hi /;du -s /var/unbound/usr/local/lib/python3.8; df -hi /
Filesystem                     Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/ufsid/5cdd3c209fe36da7    7.0G    2.0G    4.5G    31%     27k  935k    3%   /
274800  /var/unbound/usr/local/lib/python3.8
Filesystem                     Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/ufsid/5cdd3c209fe36da7    7.0G    1.4G    5.1G    22%     27k  935k    3%   /
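For anyone who wants to repeat the measurement, the before/after comparison can be wrapped in a small script. This is a sketch under assumptions: the function names used_kb and du_effect are hypothetical, and the default directory is simply the one from the output above.

```sh
# Hypothetical helper: used space in KiB on the filesystem holding a path,
# taken from the third field of POSIX-format df output.
used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Hypothetical helper: record usage, walk a directory with du, record usage
# again, and print how much space the walk appears to have released.
du_effect() {
    fs=${1:-/}
    dir=${2:-/var/unbound/usr/local/lib/python3.8}
    before=$(used_kb "$fs")
    du -sk "$dir" >/dev/null 2>&1
    after=$(used_kb "$fs")
    printf 'before=%s after=%s freed=%s\n' "$before" "$after" "$((before - after))"
}
```

Calling du_effect / with different sub-directories is one way to narrow down which path triggers the drop; a freed value near zero means walking that directory made no difference.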