SG-2220 Full HDD Problems

pfCents

I noticed a few days ago that the firewall wasn't handing out new addresses and when I logged in I saw that several services were not started. I figured a reboot was worth a try and when I did things worked for a few minutes but the services were still acting a bit quirky (not starting). When I finally checked the bottom of the dashboard I saw the following for disk usage:

Disk usage
/ (ufs): 105% of 1.8G
/var/run (ufs in RAM): 3% of 3.4M

So I searched for a similar problem and found the main page for full disks but I don't know where to go from here:

$ df -hi
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/ufsid/55f05ec197022192 1.8G 1.7G -89M 105% 19k 223k 8% /
devfs 1.0K 1.0K 0B 100% 0 0 100% /dev
/dev/md0 3.4M 112K 3.0M 3% 35 987 3% /var/run
devfs 1.0K 1.0K 0B 100% 0 0 100% /var/dhcpd/dev
fdescfs 1.0K 1.0K 0B 100% 11 58k 0% /dev/fd

To me, it looks like the first item is the issue but I don't know what I'm supposed to do about it.

Things I've already tried include reducing the log size for Snort by enabling the automatic log management with all the max sizes set to default, which comes to barely over 2MB, clearing all logs for all services (including Snort) and the firewall, and countless reboots.

EDIT:
After checking the logging area to clear logs again, I noticed that the log size was blank. I set it to 511488, saved, and hit reset logs. All the logs were cleared (again), but the following is still printed on the page:

Disk space currently used by log files: 1.2G. Remaining disk space for log files: -72M.

Is there some other way to clear logs that I'm unaware of?

EDIT 2:
Found out it was a bunch of pflog.bad.[random hex] files that were way larger than any other. Cleared them out with an rm and looks like that finally got my HDD back down to ~35%.
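
For anyone hitting the same thing, here is a minimal sketch of one way to track down files like these. It is not the exact commands used above; `largest_files` is a made-up helper name, and it only relies on standard `find`, `du`, `sort` and `head`.

```shell
#!/bin/sh
# largest_files [DIR] [COUNT]  (hypothetical helper, not a pfSense command)
# Print the COUNT largest regular files under DIR, biggest first.
# Each output line is "<KiB used on disk> <path>".
largest_files() {
    dir=${1:-/var/log}
    count=${2:-10}
    # -xdev keeps find on a single filesystem; du -k reports the space
    # each file occupies on disk in KiB.
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null |
        sort -rn |
        head -n "$count"
}
```

Running `largest_files /var/log 15` from a shell on the firewall lists the fifteen biggest files, which should make any oversized pflog.bad.* files stand out.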

Guest

You are running snort off the eMMC install? Won't that degrade it fast?

jimp

I wouldn't be worried about the longevity of the eMMC but it definitely doesn't have enough space to effectively handle large packages like snort with much grace. An M.2 disk would be much better if you want to run packages like that.

pfCents

I believe there is a much more serious thing going on here, possibly a bug with the software.

So I thought the problem was resolved by deleting those mysterious files, but then yesterday none of the nodes in the network were able to reach the Internet. They were being served IPs and I could still log into the firewall's configuration website. That's when things really got weird: my dashboard was completely different. Snort and the service monitor were missing, but traffic/bandwidth monitors were there instead. I also verified that I still had a WAN IP (I'll come back to this in a minute), but when I checked services, Snort and a couple of others (I should have written this down, in hindsight) were completely missing. Also, on the dashboard my memory was 108% used now.

I checked my installed packages to see if maybe Snort was acting up and blocking everyone for some reason. Come to find out, Snort (and every other package) was completely gone, as if I had uninstalled them. So I reinstalled Snort, thinking maybe it just needed to reload my settings, but it did not; it was as if I was installing it for the first time.

Next I went over to the logging management page and saw that I was again in negative space. Then I went back to the trouble area and checked my log file sizes:

$ ls -al /var/log
total 2551544
drwxr-xr-x 4 root wheel 1024 Sep 29 02:57 .
drwxr-xr-x 28 root wheel 512 Sep 10 20:42 ..
-rw------- 1 root wheel 1024000 Sep 29 05:04 dhcpd.log
-rw-r--r-- 1 root wheel 8294 Sep 29 04:08 dmesg.boot
-rw------- 1 root wheel 1024000 Sep 29 05:06 filter.log
-rw------- 1 root wheel 1024000 Sep 29 04:09 gateways.log
-rw------- 1 root wheel 27158 Sep 9 11:31 installer.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 ipsec.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 l2tps.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 lighttpd.log
drwxr-xr-x 2 root wheel 512 Jul 14 20:02 ntp
-rw------- 1 root wheel 1024000 Sep 29 04:09 ntpd.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 openvpn.log
-rw------- 1 root wheel 1115288 Sep 29 05:06 pflog
-rw------- 1 root wheel 135200460 Sep 29 04:07 pflog.bad.004650c8
-rw------- 1 root wheel 109641426 Sep 29 04:07 pflog.bad.041fa52f
-rw------- 1 root wheel 50953858 Sep 29 04:07 pflog.bad.15941f7d
-rw------- 1 root wheel 109805392 Sep 29 04:07 pflog.bad.272d3f0f
-rw------- 1 root wheel 156199060 Sep 29 04:07 pflog.bad.29009618
-rw------- 1 root wheel 131857533 Sep 29 04:07 pflog.bad.42d2be83
-rw------- 1 root wheel 10192268 Sep 29 04:07 pflog.bad.62912fe5
-rw------- 1 root wheel 20506904 Sep 29 04:07 pflog.bad.63c03df9
-rw------- 1 root wheel 285900368 Sep 29 04:07 pflog.bad.69bf7eb2
-rw------- 1 root wheel 57179566 Sep 29 04:07 pflog.bad.a51a6664
-rw------- 1 root wheel 122453124 Sep 29 04:07 pflog.bad.aac65a97
-rw------- 1 root wheel 95092018 Sep 29 04:07 pflog.bad.cae7490b
-rw------- 1 root wheel 1024000 Sep 27 20:00 poes.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 portalauth.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 ppp.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 pptps.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 relayd.log
-rw------- 1 root wheel 1024000 Sep 29 03:11 resolver.log
-rw------- 1 root wheel 1024000 Sep 27 21:16 routing.log
drwxr-xr-x 4 root wheel 512 Sep 18 18:22 snort
-rw------- 1 root wheel 10240 Sep 10 21:04 spamd.log
-rw------- 1 root wheel 1024000 Sep 29 04:48 system.log
-rw------- 1 root wheel 16079 Sep 29 04:09 userlog
-rw-r--r-- 1 root wheel 197 Sep 29 04:09 utx.lastlogin
-rw------- 1 root wheel 2233 Sep 29 04:09 utx.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 vpn.log
-rw------- 1 root wheel 1024000 Sep 27 20:00 wireless.log
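
To put a number on the pflog.bad.* files in a listing like the one above, the size column can be added up. This is only a rough sketch: `pflog_bad_total` is a made-up helper name, and it assumes the usual `ls -l` layout where the fifth column is the size in bytes and the file name is the last field.

```shell
#!/bin/sh
# pflog_bad_total [DIR]  (hypothetical helper, not a pfSense command)
# Print the combined size in bytes of the pflog.bad.* files in DIR.
pflog_bad_total() {
    dir=${1:-/var/log}
    # Keep only lines whose last field starts with "pflog.bad." and sum
    # column 5 (size in bytes); prints 0 when nothing matches.
    ls -al "$dir" |
        awk '$NF ~ /^pflog\.bad\./ { total += $5 } END { printf "%.0f\n", total }'
}
```

Adding up the twelve pflog.bad.* entries in the listing above by hand gives 1,284,981,977 bytes, or roughly 1.2 GiB of a 1.8G disk.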

You'll notice that those pflog.bad.[hex] files of absurd size were back. This time I cleared out all but one, and tried to read the file to see if I could further troubleshoot what is going on here:

$ clog /var/log/pflog.bad.cae7490b
V6F ll4igb03vʚ;E8

V
Voll4igb03vʚ;E8S@
V+
VYll4igb03vʚ;E8
V
V?]ll4igb03vʚ;E8
VT
V[
Ve
VCg
VXk
V(x
VBz
V
V
V4R
V?\

I don't know what any of that means. There was a lot more than that; this is just a snippet, but it looked basically the same all the way through.

At this point I thought it would be best to perform a factory reset and make this post asking for help/insight as to what to do next. Interestingly, after a factory reset, the pflog.bad file I read with clog above was still present.

At this point my best guess as to what is going on is that my WAN situation is less than ideal.

I live in a rural area and my only choices for Internet access are LTE/3G and satellite. I've opted for LTE/3G but run into issues every now and again. There is only one tower out here (12 miles away) and when upgrades/changes/maintenance/weather happen I'll drop out multiple times a day. Previously this wasn't much of an issue because my Cradlepoint would handle it. Recently, though, I got tired of being double NAT'd (a limitation of the cheapest Cradlepoint is no IP-Passthrough), so I bought a PocketPORT2, which can handle IPT for my USB modem. Unfortunately it's a lot more finicky with dropped connections and requires me to power the device off and on to reconnect.

I've noticed some correlation in timing: when I have a few days with no drops, my pfSense memory usage stays normal, and when I have a lot of drops, I get these pflog.bad files overfilling memory.

What can I do to prevent this issue from happening again? I'd rather not have to go and rm /var/log/pflog.bad* every single time my WAN drops.

jimp

Looks like that's all from spamd. The firewall itself doesn't touch pflog, but spamd does. Kill it with fire.

pfCents

That would make sense; towards the end that was the only package I had alongside Snort. Thanks for the help, but could you please elaborate on how you determined that? If only to help me understand a bit more.

jimp

I knew pflog wasn't used by the base system so I grepped through the whole package repo to see what if anything touched it, and the only thing that does is spamd, and you had a spamd.log so it must have been installed there.
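
The search described above can be sketched roughly like this. It is an illustration, not the exact command that was run: `files_mentioning` is a made-up helper name, and the checkout path in the usage note is an assumption.

```shell
#!/bin/sh
# files_mentioning PATTERN DIR  (hypothetical helper)
# List every file under DIR whose contents contain the literal string PATTERN.
files_mentioning() {
    # -r: recurse into subdirectories
    # -l: print only the names of matching files
    # -F: treat PATTERN as a fixed string, not a regular expression
    grep -rlF -- "$1" "$2"
}
```

With a local clone of the package repository in, say, ~/src/pfsense-packages (an assumed location), `files_mentioning pflog ~/src/pfsense-packages` prints the files that reference pflog.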

pfCents

That would make sense as to why I couldn't find any information about it, thanks.

doktornotor

~~Please file a bug about the spamd thing before it gets forgotten… This is insane. Found ~3 GiB worth of crap here after a couple of test installs.~~

https://redmine.pfsense.org/issues/5231
https://github.com/pfsense/pfsense-packages/pull/1086