Kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7)
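A quick way to see how close a box is to this limit is to compare the current number of open files with the configured maximum. The sketch below assumes a FreeBSD-based system where the `kern.openfiles` and `kern.maxfiles` sysctls are available, and assumes /bin/sh rather than the default tcsh root shell; `pct_used` is a hypothetical helper name, not something shipped with pfSense.

```shell
# pct_used OPEN LIMIT: print usage as a percentage of the limit.
# (hypothetical helper; the arithmetic is done in awk for portability)
pct_used() {
    awk -v o="$1" -v m="$2" \
        'BEGIN { printf "%d of %d open files (%.1f%% of kern.maxfiles)\n", o, m, (o / m) * 100 }'
}

# On the firewall itself (assumes both sysctls exist):
#   pct_used "$(sysctl -n kern.openfiles)" "$(sysctl -n kern.maxfiles)"
```

If the percentage keeps climbing between checks, something is probably leaking file handles; raising kern.maxfiles would only delay the failure.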
-
Same here on an old P4 that was updated from 2.0.x to 2.1 alpha->beta->release.
Every 2 weeks I need to restart the service. -
I'm joining in. Since upgrading to 2.1 Release, I have this problem too.
resolver log: dnsmasq[61060]: failed to load names from /etc/hosts: Too many open files in system
system log: kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7). -
In my case it appears to have been caused by a hard drive that was starting to fail … it died completely on Monday ;)
-
I'm on 2.1-Release and had to reboot my system today because I lost DNS resolving and that log message had appeared thousands of times.
I tried fstat and got output similar to this: […]
root filterdns 28615 5663 / 14743744 -rw-r--r-- 613 r
root filterdns 28615 5664 / 14743744 -rw-r--r-- 613 r
root filterdns 28615 5665 / 14743744 -rw-r--r-- 613 r
root filterdns 28615 5666 / 14743744 -rw-r--r-- 613 r
root filterdns 28615 5667 / 14743744 -rw-r--r-- 613 r
root filterdns 28615 5668 / 14743744 -rw-r--r-- 613 r
root filterdns 28615 5669 / 14743744 -rw-r--r-- 613 r -
+1 for me…
EDIT:
I solved the prob by writing the 2.1.2 *.img to another CF & restoring the config... -
I've seen this "max files" problem every couple of months probably since January-ish of this year.
Past related posts (0) (1) indicate there is a problem, but don't offer a solution and are quite dated.
I have a PC Engines ALIX 2D13 that I've had for a few years now.
The pfSense Store sells these (2), so I'd figure they would be well supported.
Today dnsmasq was no longer processing DHCP requests. I could have statically assigned an IP address and tried to access the unit, but instead I consoled into it and found system.log indicating the board was low on memory. This unit had only been running for approximately 47 days since my upgrade to version 2.1.2. I ended up rebooting it and also opted to upgrade to version 2.1.3 afterwards.
In the past when dnsmasq wouldn't process DNS queries, I found filterdns had quite a few files open (output from fstat). I don't have a total count of open files from the time periods when there were problems.
I whipped up the following one-liner for the future. Maybe it's useful to somebody else as well.
# filterdns open files
fstat | awk '/filterdns/{i++} END{printf("%d files open by filterdns\n", i)}'
# all open files per process plus a total
fstat | awk '\!/CMD/{print $2} END{printf("* Total files open: %d", NR)}' | sort | uniq -c | sort -n
I have a few aliases for my pf rules which have host names and one persistent IPSec tunnel.
The unit has been up for <60 minutes now and filterdns has 20 open files, but other processes have many more open files.
If pfSense developers (or anyone else) have any proactive tips as to what else I might look for or try, please reply! Thanks.
(0) https://forum.pfsense.org/index.php?topic=29885.0
(1) https://forums.freebsd.org/viewtopic.php?&t=1553
(2) http://store.pfsense.org/vk-2d13-black/ -
Hello
I had the same problem a couple of months ago. In my case pfBlocker was the cause (had a rule with whole world blocked except Europe) and eventually it crashed with a lot of filterdns files opened.
-
Thanks for the reply Cristacul.
pfBlocker is not the cause for my problem since I'm not using it.
It is my understanding that firewall rules with (or other features that utilize) domain names are the only ones that utilize filterdns.
Might there be something else at work here causing my problem? -
Same problem and not having pfBlocker.
I have pfSense 2.1.4 64-bit :( -
Here's an extension to my previous one-liner commands.
This one is more helpful to get the big picture (beyond what filterdns is doing).
# spit out open files with a count per command and order them
fstat | awk '\!/CMD/{print $2}' | sort | uniq -c | sort -n
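For anyone who prefers a single awk pass, the same per-command count can be sketched as a small function. This is only an illustration: it assumes /bin/sh (not the default tcsh root shell, where functions are unavailable) and that column 2 of fstat output is the command name, as in the output posted earlier in this thread. `summarize_fstat` is a hypothetical name.

```shell
# summarize_fstat: read fstat output on stdin and print "count command"
# lines sorted by count, followed by a grand total.
summarize_fstat() {
    awk '
        $2 == "CMD" { next }              # skip the header row
        { count[$2]++; total++ }          # column 2 is the command name
        END {
            for (cmd in count) printf "%d %s\n", count[cmd], cmd
            printf "%d TOTAL\n", total    # total excludes the header row
        }
    ' | sort -n
}

# On the firewall: fstat | summarize_fstat
```

Unlike the NR-based total in the earlier one-liner, the total here does not count the header line.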
-
Same problem here. Clean embedded install; here is what I got in the logs:
dnsmasq[37691]: failed to read /etc/resolv.conf: Too many open files in system
kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7).
The system was running for only around a week. I rebooted quickly as I needed to restore functionality. Approx. an hour after reboot, the stats are as follows:
...
47 sh
186 php
214 filterdns
226 ipfw-classifyd
Will monitor progress and post an update later today.
-
After 5 hours of uptime, the stats are as follows:
712 filterdns
752 ipfw-classifyd
Which leads me to believe that this is going to grow until it just cannot, which leads me to a more important question - what can I do to mitigate this?
I am really not running that complex a setup - 2xWAN with rules to direct traffic to each + VPN client connection going out. Not that many hostnames either - below 10.
-
The numbers seem to be growing, now at:
...
1072 filterdns
1106 ipfw-classifyd
I think this could be related to this: https://forum.pfsense.org/index.php?topic=42991.0
The number of open files seems to increase after every filter reload, which now happens every 15m. Although I do not have any schedules set, the filter still gets reloaded every 15m.
In any case - it seems to me that ipfw-classifyd/filterdns do not respond correctly to the HUP signal being sent to them and re-create, for the new config, any temp files they had for the previous one, which is not a sustainable approach.
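To confirm that the growth lines up with filter reloads, the counts can be sampled at a fixed interval and compared against the reload times in the system log. A minimal sketch, assuming /bin/sh; the function names, the 300-second interval and the log path are all arbitrary choices for illustration.

```shell
# count_cmd NAME: count fstat lines on stdin whose command column equals NAME.
count_cmd() {
    awk -v name="$1" '$2 == name { n++ } END { print n + 0 }'
}

# sample_loop: print a timestamped count for both daemons every 5 minutes.
# (hypothetical helper; interval chosen arbitrarily)
sample_loop() {
    while :; do
        printf '%s filterdns=%s ipfw-classifyd=%s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(fstat | count_cmd filterdns)" \
            "$(fstat | count_cmd ipfw-classifyd)"
        sleep 300
    done
}

# On the firewall (log path is an assumption):
#   sample_loop >> /tmp/openfiles.log &
```

A step in the counts shortly after each reload entry would support the theory that handles are not released on reload.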
-
@petr:
While this doesn't necessarily help you, I got filterdns to stop consuming file handles when I removed the domain name from my IPSec VPN tunnel configuration. I expect the increasing number of file handles open by filterdns was a result of Racoon (IPSec daemon) rekeying and whatnot.
Of course that thread <0> on the pfsense forum tells of the other problems I'm noticing.
<0> https://forum.pfsense.org/index.php?topic=81121
-
Thank you for the suggestion! Sadly, I am not running IPSec, so I have nothing to switch off.
The filterdns is now at 1118 open files after 1 day, 20 hours.
-
Found a workaround - at least I think so, the number of open files has not grown for a few days.
To cut a long story short, I found that the number of open files grows in correlation with gateway down alarms in my logs. This led me to conclude that an unstable connection on one of the VPNs caused frequent alarms and subsequent reloads. As I was not using the alarms to do anything useful, I simply disabled them for that connection - and voila, the open file count stopped growing.
However, I still believe that there is a problem - in my opinion, having an alarm every 10m should not be something that would destabilise the router, or lock it up as it does for me!
What do you think guys?
-
Here are my numbers on a production 2.1.5 system that has been up for 7 days:
2 kernel
2 md0
2 md1
3 init
6 awk
7 login
7 rrdtool
8 apinger
8 cron
9 fstat
9 sudo
14 check_reload_status
14 devd
14 logger
15 dnsmasq
15 sh
16 inetd
16 lighttpd
16 sshlockout_pf
16 tcpdump
24 tcsh
25 dhcpd
33 sshd
36 minicron
41 syslogd
54 ntpd
249 openvpn
1535 php
1861 filterdns
But a 2.2 system that I just updated/rebooted looks like:
[2.2-BETA][root@apu22.localdomain]/root(1): fstat | awk '\!/CMD/{print $2}' | sort | uniq -c | sort -n
2 kernel
2 md0
2 md1
3 init
4 getty
7 awk
7 rrdtool
7 uniq
8 apinger
8 fstat
8 login
8 sshlockout_pf
9 tcsh
12 sleep
14 cron
14 dnsmasq
14 sort
15 filterlog
16 check_reload_status
17 devd
17 inetd
17 openvpn
21 dhcpd
22 lighttpd
22 ntpd
22 sshd
36 dhclient
36 minicron
39 php-fpm
42 syslogd
47 sh
So the "php" and "filterdns" on the 2.1.5 production system have something wrong - there is no way they should be sitting with so many open file handles.
A problem like this will cause intermittent system problems after some random days/weeks/months. -
Exactly my concern!
I think the problem shows up when filterdns (and also the layer7 daemon for me) gets restarted - could be a gateway alarm, a refresh of rules, etc. They do not seem to release the old files and just allocate new file handles.
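One way to test this is to save a per-command count before and after a restart (for example, around a manual filter reload) and list only the commands whose count went up. A rough sketch, assuming /bin/sh and that each snapshot file holds "count command" lines as produced by the `sort | uniq -c` pipeline posted earlier; `growth` and the file names are hypothetical.

```shell
# growth OLD NEW: print "command +delta" for every command whose open-file
# count is higher in NEW than in OLD. Commands missing from OLD count as 0.
growth() {
    awk 'NR == FNR { old[$2] = $1; next }      # first file: remember old counts
         ($1 - old[$2]) > 0 { printf "%s +%d\n", $2, $1 - old[$2] }' "$1" "$2"
}

# On the firewall (file names are assumptions; the backslash before ! in the
# original one-liner is only needed under tcsh, so it is dropped here):
#   fstat | awk '!/CMD/{print $2}' | sort | uniq -c > /tmp/before.txt
#   ... trigger a filter reload ...
#   fstat | awk '!/CMD/{print $2}' | sort | uniq -c > /tmp/after.txt
#   growth /tmp/before.txt /tmp/after.txt
```

If filterdns shows a positive delta after every reload and never drops back, that points to handles not being released.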
-
I raised a bug report: https://redmine.pfsense.org/issues/3951
That way it does not get forgotten.