Kernel: (dhcpd) /var: filesystem full

AaronO

I am running pfSense 2.0 on a Hakamu 1 GHz embedded platform with over 170 VLANs. After adding the latest round of 40 VLANs, the /var filesystem started filling up, which in turn causes strange connectivity issues. There is a DHCP range on each VLAN interface, and I suspect this is what fills /var/dhcpd/dev. The specific message in the system log is:

kernel: pid 6437 (dhcpd), uid 1002 inumber 5973 on /var: filesystem full

Here is what df -H returns:

Filesystem           Size    Used   Avail  Capacity  Mounted on
/dev/ufs/pfsense1    464M    150M    277M       35%  /
devfs                1.0k    1.0k      0B      100%  /dev
/dev/md0              40M    432k     37M        1%  /tmp
/dev/md1              61M     61M   -4.8M      109%  /var
/dev/ufs/cf           52M    4.1M     43M        9%  /cf
devfs                1.0k    1.0k      0B      100%  /var/dhcpd/dev

Any advice would be appreciated. I am not running any additional packages on the system. My only guess at this point is that pfSense can't support 170+ DHCP ranges. Any thoughts?

wallabybob

My system runs two VLANs and I see

df

Filesystem   1K-blocks    Used   Avail  Capacity  Mounted on
/dev/ad0s1a     690006  425574  209232       67%  /
devfs                1       1       0      100%  /dev
/dev/md0          3694      50    3350        1%  /var/run
devfs                1       1       0      100%  /var/dhcpd/dev

so I doubt the number of VLANs is causing your /var/dhcpd/dev to fill up.

Certainly your /var is full. Perhaps the output of du /var from your system might give some clues. Here is the output from my system:

du /var

2 /var/account
2 /var/at/jobs
2 /var/at/spool
6 /var/at
2 /var/audit
2 /var/backups
4 /var/crash
2 /var/cron/tabs
4 /var/cron
2 /var/db/entropy
2 /var/db/freebsd-update
2 /var/db/ipf
30 /var/db/pkg/jpeg-6b_4
30 /var/db/pkg/gd-2.0.35,1
28 /var/db/pkg/png-1.4.5_1
30 /var/db/pkg/libiconv-1.11_1
32 /var/db/pkg/libosip-3.1.0
30 /var/db/pkg/siproxd-0.7.0_1
28 /var/db/pkg/vnstat-1.6_3
28 /var/db/pkg/expat-2.0.1_1
28 /var/db/pkg/ca_root_nss-3.12.4
428 /var/db/pkg/python26-2.6.5
470 /var/db/pkg/perl-5.10.1_1
28 /var/db/pkg/p5-Net-SMTP-SSL-1.01
28 /var/db/pkg/p5-Error-0.17016
58 /var/db/pkg/curl-7.20.1
28 /var/db/pkg/cvsps-2.1
110 /var/db/pkg/git-1.7.1.1
28 /var/db/pkg/pkg-config-0.25_1
32 /var/db/pkg/libosip-3.3.0
32 /var/db/pkg/siproxd-0.8.0
30 /var/db/pkg/jpeg-8_3
32 /var/db/pkg/freetype2-2.4.4
30 /var/db/pkg/libiconv-1.13.1_1
30 /var/db/pkg/gd-2.0.35_7,1
26 /var/db/pkg/bandwidthd-2.0.1_4
42 /var/db/pkg/libpcap-1.1.1
60 /var/db/pkg/gettext-0.18.1.1
120 /var/db/pkg/postgresql-client-8.4.7
26 /var/db/pkg/pfflowd-0.8
1904 /var/db/pkg
2 /var/db/ports
2 /var/db/portsnap
9968 /var/db/rrd
2 /var/db/pingstatus
2 /var/db/pingmsstatus
22 /var/db/vnstat
2 /var/db/cpelements
12530 /var/db
2 /var/empty
2 /var/games
2 /var/heimdal
1926 /var/log
2 /var/mail
2 /var/msgs
2 /var/named
2 /var/preserve
2 /var/run/.snap
2 /var/run/hostapd
50 /var/run
2 /var/rwho
2 /var/spool/lock
2 /var/spool/lpd
2 /var/spool/mqueue
2 /var/spool/opielocks
2 /var/spool/output/lpd
4 /var/spool/output
14 /var/spool
60 /var/tmp/vi.recover
64 /var/tmp
2 /var/yp
2 /var/etc/openvpn_csc
2 /var/etc/mpd-vpn
2 /var/etc/openvpn
2 /var/etc/openvpn-csc
2 /var/etc/l2tp-vpn
2 /var/etc/pppoe-vpn
112 /var/etc
1 /var/dhcpd/dev/fd
1 /var/dhcpd/dev/usb
1 /var/dhcpd/dev/pts
2 /var/dhcpd/dev
8 /var/dhcpd/etc
1810 /var/dhcpd/usr/local/sbin
1812 /var/dhcpd/usr/local
1814 /var/dhcpd/usr
10 /var/dhcpd/var/db
6 /var/dhcpd/var/run
18 /var/dhcpd/var
1154 /var/dhcpd/lib
2 /var/dhcpd/run
3000 /var/dhcpd
40 /var/installer_logs
6 /var/siproxd
17782 /var

jimp

devfs always shows 100%, so that isn't a problem, but your /var is quite full, as your output shows.

If you have that many VLANs and DHCP is active on all of them, it's possible that the DHCP leases database is using all that space.

If you want to increase the size of /var, which should be fairly safe if you have enough memory in that box, edit /etc/rc.embedded and put a larger value on this line:

varsize="60m"

Then reboot the box.

You might try raising that to at least 128m.
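
If you would rather script the change than edit by hand, something like the following could work. This is only a sketch: set_varsize is a hypothetical helper, and it assumes the line appears exactly in the form varsize="60m" at the start of a line. On an embedded install the root filesystem may be mounted read-only, in which case it would have to be remounted read-write first.

```shell
# Change the varsize value in a file, keeping a backup of the original.
# set_varsize is a hypothetical helper; it assumes the line looks like
# varsize="60m" (digits followed by "m", in double quotes, unindented).
set_varsize() {
    file="$1"        # e.g. /etc/rc.embedded
    new_size="$2"    # e.g. 128m
    cp "$file" "$file.bak" || return 1
    # Read from the backup and rewrite the original in place.
    sed "s/^varsize=\"[0-9]*m\"/varsize=\"$new_size\"/" "$file.bak" > "$file"
}
# Example: set_varsize /etc/rc.embedded 128m   (then reboot the box)
```

The backup copy ends up next to the original with a .bak suffix, so the old value can be restored if the box does not have enough memory for the larger /var.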

AaronO

Thanks for the responses, guys. I hope someone else will also find this thread useful. Out of desperation last night I started poking around in /var and tracked the problem down to the RRD directory. My only guess is that RRD was graphing all 170+ interfaces I have set up for the various VLANs. I cleared the RRD data, disabled the service, and rebooted, and /var usage is back down to 10%.
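
For anyone checking the same thing on their own box, a rough sketch of how the RRD usage could be measured. The path /var/db/rrd is taken from the du listing earlier in the thread; the function name rrd_usage is made up, and the .rrd file extension is an assumption:

```shell
# Report total size (KB) and number of .rrd files in a directory.
# rrd_usage is a hypothetical helper name; the default path comes from
# the du output earlier in this thread.
rrd_usage() {
    dir="${1:-/var/db/rrd}"
    # Total size of the directory in 1 KB blocks
    total_kb=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')
    # Count regular files ending in .rrd (tr strips wc's padding)
    files=$(find "$dir" -type f -name '*.rrd' 2>/dev/null | wc -l | tr -d ' ')
    echo "${total_kb:-0} KB in $files RRD files under $dir"
}
# Example: rrd_usage /var/db/rrd
```

Comparing that total against the size of /var from df gives a quick idea of whether the graphs are what is filling the filesystem.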

jimp

I could see RRD data files getting that large for that kind of deployment. If you want to graph that kind of data, you can enable the SNMP service and then use an external poller such as Cacti or Zabbix to graph.