Kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7)



  • Hi,

    my pfSense stopped resolving DNS today.

    Syslog shows:

    Jun 12 07:16:19 	dnsmasq[19277]: failed to read /etc/resolv.conf: Too many open files in system
    Jun 12 07:16:19 	dnsmasq[19277]: failed to read /etc/resolv.conf: Too many open files in system
    Jun 12 07:16:19 	kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7).
    Jun 12 07:16:21 	dnsmasq[19277]: failed to read /etc/resolv.conf: Too many open files in system
    Jun 12 07:16:21 	dnsmasq[19277]: failed to read /etc/resolv.conf: Too many open files in system
    Jun 12 07:16:21 	kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7).
    

    Here is an old topic, but it does not really explain why the system wants to open so many files:
    http://forum.pfsense.org/index.php/topic,29885.msg154831/topicseen.html#msg154831

    The only package I have installed is the "OpenVPN Client Export Utility", on pfSense 2.0.3 amd64.


  • Rebel Alliance Developer Netgate

    If/when that happens again, look at the output of "fstat" in the shell and see what has all of the files open.

    That isn't normal to see, especially if you don't have any other packages installed.
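
    Before digging through fstat, it can help to see how close the box is to the limit. The following is a minimal sketch, assuming the FreeBSD sysctl names kern.openfiles and kern.maxfiles; pct_used is a hypothetical helper name, and the numbers in the demo call are made up.

```shell
#!/bin/sh
# Print the percentage of the system-wide file limit that is in use.
# Arguments: $1 = files currently open, $2 = the kern.maxfiles limit.
pct_used() {
    awk -v cur="$1" -v lim="$2" 'BEGIN { printf("%d\n", cur * 100 / lim) }'
}

# On the firewall itself (assumed FreeBSD sysctl names):
#   pct_used "$(sysctl -n kern.openfiles)" "$(sysctl -n kern.maxfiles)"

# Illustrative call with made-up numbers:
pct_used 9000 12000
```

    A value that keeps climbing between checks points to a leak rather than a one-off spike.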



  • @jimp:

    If/when that happens again, look at the output of "fstat" in the shell and see what has all of the files open.

    I will look next time; this morning there was no time to investigate, so I rebooted the system.  :-[



  • @jimp:

    If/when that happens again, look at the output of "fstat" in the shell and see what has all of the files open.

    Today again:

    
    [...]
    root     filterdns  28615 5663 /        14743744 -rw-r--r--     613  r
    root     filterdns  28615 5664 /        14743744 -rw-r--r--     613  r
    root     filterdns  28615 5665 /        14743744 -rw-r--r--     613  r
    root     filterdns  28615 5666 /        14743744 -rw-r--r--     613  r
    root     filterdns  28615 5667 /        14743744 -rw-r--r--     613  r
    root     filterdns  28615 5668 /        14743744 -rw-r--r--     613  r
    root     filterdns  28615 5669 /        14743744 -rw-r--r--     613  r
    
    

    What is "filterdns" and how can I fix it?
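
    The listing above shows one PID holding the same inode on many descriptors. A rough sketch for summarising that pattern is below; it assumes the usual fstat column order (USER CMD PID FD MOUNT INUM ...), and repeated_inodes is a hypothetical helper name.

```shell
#!/bin/sh
# Count how many descriptors each (command, PID, mount, inode) tuple holds.
# Reads fstat-style lines on stdin and skips rows whose 6th field is not a
# number (the header line, sockets, pipes).
repeated_inodes() {
    awk '$6 ~ /^[0-9]+$/ { n[$2 " " $3 " " $5 " " $6]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}

# Demo on three of the lines posted above.
# On the firewall, run instead: fstat | repeated_inodes
repeated_inodes <<'EOF'
root     filterdns  28615 5663 /        14743744 -rw-r--r--     613  r
root     filterdns  28615 5664 /        14743744 -rw-r--r--     613  r
root     filterdns  28615 5665 /        14743744 -rw-r--r--     613  r
EOF
```

    If a single inode dominates, something like "find / -xdev -inum 14743744" should reveal which file it is, assuming the file still exists on that filesystem.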


  • Rebel Alliance Developer Netgate

    filterdns translates hostnames in aliases into IP addresses so they can be used in pf tables.

    I don't recall ever seeing it go that nuts, though.

    Do you have a lot of hostnames in your aliases?



  • @jimp:

    Do you have a lot of hostnames in your aliases?

    No, I had only about 20 hostnames in my aliases.
    I deleted them all and replaced them with IP addresses.
    Interestingly, the last hostname alias I deleted is still present in /var/etc/filterdns.conf

    Could it be a problem if some hostnames cannot be resolved?


  • Rebel Alliance Developer Netgate

    No that shouldn't be a problem.

    The last hostname sticking in filterdns.conf is a known issue that has been fixed (along with many other filterdns issues) in 2.1.



  • Same thing just happened to us. We're on 2.0.3, so maybe we need to update? The output of fstat was saturated with 2 different apps:

    root    filterdns  40614  666 /        15544735 -rw-r--r--      0  r
    ...

    and also

    root    ipfw-classifyd 27444 1264 /        18514540 -rw-r--r--      34  r
    ...

    A reboot got us back.



  • Since upgrading to 2.1-RELEASE, from a snapshot in June (ish), I am seeing this problem. Only have two hostnames in aliases.



  • @firegrass:

    Since upgrading to 2.1-RELEASE, from a snapshot in June (ish), I am seeing this problem. Only have two hostnames in aliases.

    Absolutely same situation with open files and filterdns :( Any ideas?



  • Same here on an old P4 that got updated from 2.0.x to 2.1 alpha->beta->release.
    Every 2 weeks I need to restart the service.



  • I'm joining in. Since upgrading to 2.1 Release, I have this problem too.

    resolver log: dnsmasq[61060]: failed to load names from /etc/hosts: Too many open files in system
    system log: kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7).



  • in my case it appears to have been caused by a harddrive that was starting to fail … it died completely on monday ;)



  • I'm on 2.1-RELEASE and had to reboot my system today because I lost DNS resolution and that log message appeared thousands of times.
    I ran fstat and got output similar to this:

    […]
    root    filterdns  28615 5663 /        14743744 -rw-r--r--    613  r
    root    filterdns  28615 5664 /        14743744 -rw-r--r--    613  r
    root    filterdns  28615 5665 /        14743744 -rw-r--r--    613  r
    root    filterdns  28615 5666 /        14743744 -rw-r--r--    613  r
    root    filterdns  28615 5667 /        14743744 -rw-r--r--    613  r
    root    filterdns  28615 5668 /        14743744 -rw-r--r--    613  r
    root    filterdns  28615 5669 /        14743744 -rw-r--r--    613  r



  • +1 for me…

    EDIT:
    I solved the problem by writing the 2.1.2 *.img to another CF card and restoring the config.



  • I've seen this "max files" problem every couple of months probably since January-ish of this year.
    Past related posts (0) (1) indicate there is a problem, but don't offer a solution and are quite dated.

    I have a PC Engines ALIX 2D13 that I've had for a few years now.
    The pfSense Store sells these (2) so I'd figure these would be well supported.

    Today dnsmasq was no longer processing DHCP requests.  I could have statically assigned an IP address and tried to access the unit, but instead I consoled into it and found system.log indicating the board was low on memory.  This unit had only been running for approximately 47 days since my upgrade to version 2.1.2.  I rebooted it and also opted to upgrade to version 2.1.3 afterwards.

    In the past when dnsmasq wouldn't process DNS queries, I found filterdns had quite a few files open (output from fstat).  I don't have a total count of open files from the time periods when there were problems.

    I whipped up the following one-liner for the future.  Maybe it's useful to somebody else as well.

    
    # filterdns open files
    fstat | awk '/filterdns/{i++} END{printf("%d files open by filterdns\n", i)}'
    
    # all open files per process plus a total
    # (the backslash before ! is for tcsh history expansion; drop it under sh)
    fstat | awk '\!/CMD/{n++; print $2} END{printf("* Total files open: %d\n", n)}' | sort | uniq -c | sort -n
    
    

    I have a few aliases for my pf rules which have host names and one persistent IPSec tunnel.

    The unit has been up for <60 minutes now and filterdns has 20 files open, but other processes have many more.

    If pfSense developers (or anyone else) have any proactive tips as to what else I might look for or try, please reply!  Thanks.

    (0) https://forum.pfsense.org/index.php?topic=29885.0
    (1) https://forums.freebsd.org/viewtopic.php?&t=1553
    (2) http://store.pfsense.org/vk-2d13-black/
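
    Since no counts were kept from the earlier incidents, a periodic snapshot could fill that gap. The sketch below is one way to do it, not a tested pfSense recipe; snapshot_counts is a hypothetical helper name, and the log path in the comment is an assumption.

```shell
#!/bin/sh
# Emit one "timestamp command count" line per command from fstat-style input.
# $1 is the timestamp string to stamp each line with.
snapshot_counts() {
    awk -v ts="$1" '$2 != "CMD" { n[$2]++ }
                    END { for (c in n) print ts, c, n[c] }' | sort
}

# On the firewall, for example from cron (hypothetical log path):
#   fstat | snapshot_counts "$(date +%Y-%m-%dT%H:%M:%S)" >> /var/log/openfiles.log

# Demo on lines shaped like the fstat output posted earlier in the thread:
snapshot_counts "$(date +%Y-%m-%dT%H:%M:%S)" <<'EOF'
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     filterdns  28615 5663 /        14743744 -rw-r--r--     613  r
root     filterdns  28615 5664 /        14743744 -rw-r--r--     613  r
root     ipfw-classifyd 27444 1264 /        18514540 -rw-r--r--      34  r
EOF
```

    With a history like this, the next incident can be compared against the days before it instead of a single post-reboot reading.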



  • Hello

    I had the same problem a couple of months ago. In my case pfBlocker was the cause (I had a rule blocking the whole world except Europe), and eventually it crashed with a lot of files held open by filterdns.



  • Thanks for the reply Cristacul.

    pfBlocker is not the cause for my problem since I'm not using it.

    It is my understanding that firewall rules with (or other features that utilize) domain names are the only ones that utilize filterdns.
    Might there be something else at work here causing my problem?



  • Same problem here, and I don't have pfBlocker installed.
    I'm running pfSense 2.1.4 64-bit  :(



  • @silvertip257:

    I whipped up the following one-liner for the future.  Maybe it's useful to somebody else as well.

    
    # filterdns open files
    fstat | awk '/filterdns/{i++} END{printf("%d files open by filterdns\n", i)}'
    
    # all open files per process plus a total
    # (the backslash before ! is for tcsh history expansion; drop it under sh)
    fstat | awk '\!/CMD/{n++; print $2} END{printf("* Total files open: %d\n", n)}' | sort | uniq -c | sort -n
    
    

    Here's an extension to my previous one-liner commands.
    This one is more helpful to get the big picture (beyond what filterdns is doing).

    
    # spit out open files with a count per command and order them
    
    fstat | awk '\!/CMD/{print $2}' | sort | uniq -c | sort -n
    
    


  • Same problem here. Clean embedded install; here is what I see in the logs.

    dnsmasq[37691]: failed to read /etc/resolv.conf: Too many open files in system
    kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7).
    

    The system had been running for only around a week. I rebooted quickly as I needed to restore functionality. Approx. an hour after reboot, the stats are as follows:

    ...
       47 sh
     186 php
     214 filterdns
     226 ipfw-classifyd
    

    Will monitor progress and post an update later today.



  • After 5 hours of uptime, the stats are as follows:

     712 filterdns
     752 ipfw-classifyd
    

    This leads me to believe the count will keep growing until it hits the limit, which raises a more important question: what can I do to mitigate this?

    I am not running a particularly complex setup: 2x WAN with rules to direct traffic to each, plus a VPN client connection going out. Not many hostnames either, fewer than 10.



  • The numbers seem to be growing, now at:

    
    ...
    1072 filterdns
    1106 ipfw-classifyd
    

    I think this could be related to this: https://forum.pfsense.org/index.php?topic=42991.0

    The number of open files seems to increase after every filter reload, which now happens every 15 minutes. I do not have any schedules set, yet the filter still gets reloaded every 15 minutes.

    In any case, it seems to me that ipfw-classifyd and filterdns do not respond correctly to the HUP signal sent to them: they appear to re-create for the new config whatever temp files they held for the previous one, which is not a sustainable approach.
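
    As a rough cross-check of the reload theory, the filterdns counts posted earlier (214 at about 1 hour, 712 at 5 hours) can be turned into a growth rate. This is only back-of-the-envelope arithmetic: the uptimes are approximate, four reloads per hour is assumed from the 15-minute interval, and growth is a hypothetical helper name.

```shell
#!/bin/sh
# Estimate descriptor growth from two samples.
# Arguments: t1 n1 t2 n2 reloads_per_hour
#   t1, t2 = uptime in hours at each sample; n1, n2 = open-file counts.
growth() {
    awk -v t1="$1" -v n1="$2" -v t2="$3" -v n2="$4" -v rph="$5" 'BEGIN {
        per_hour = (n2 - n1) / (t2 - t1)
        printf("%.1f files/hour, %.1f files/reload\n", per_hour, per_hour / rph)
    }'
}

# filterdns samples from the posts above, assuming 4 reloads per hour:
growth 1 214 5 712 4
# prints "124.5 files/hour, 31.1 files/reload"
```

    A roughly constant number of files per reload would fit the idea that each reload leaks a fixed set of handles.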



  • @petr:
    While this doesn't necessarily help you, I got filterdns to stop consuming file handles when I removed the domain name from my IPSec VPN tunnel configuration.

    I expect the increasing number of file handles open by filterdns was a result of Racoon (IPSec daemon) rekeying and what not.

    Of course that thread <0> on the pfsense forum tells of the other problems I'm noticing.

    <0> https://forum.pfsense.org/index.php?topic=81121



  • @silvertip257:

    Thank you for the suggestion! Sadly, I am not running IPSec, so I have nothing to switch off.

    filterdns is now at 1118 open files after 1 day, 20 hours.



  • Found a workaround, or at least I think so: the number of open files has not grown for a few days.

    To cut a long story short, I found that the number of open files grows in correlation with gateway-down alarms in my logs. This led me to conclude that an unstable connection on one of the VPNs caused frequent alarms and subsequent reloads. As I was not using the alarms for anything useful, I simply disabled them for that connection, and voilà, the open file count stopped growing.

    However, I still believe there is a problem. In my opinion, an alarm every 10 minutes should not destabilise the router, let alone lock it up as it does for me!

    What do you think, guys?



  • Here are my numbers on a production 2.1.5 system that has been up for 7 days:

       2 kernel
       2 md0
       2 md1
       3 init
       6 awk
       7 login
       7 rrdtool
       8 apinger
       8 cron
       9 fstat
       9 sudo
      14 check_reload_status
      14 devd
      14 logger
      15 dnsmasq
      15 sh
      16 inetd
      16 lighttpd
      16 sshlockout_pf
      16 tcpdump
      24 tcsh
      25 dhcpd
      33 sshd
      36 minicron
      41 syslogd
      54 ntpd
     249 openvpn
    1535 php
    1861 filterdns
    
    

    But a 2.2 system that I just updated/rebooted looks like:

    [2.2-BETA][root@apu22.localdomain]/root(1): fstat | awk '\!/CMD/{print $2}' | sort | uniq -c | sort -n
       2 kernel
       2 md0
       2 md1
       3 init
       4 getty
       7 awk
       7 rrdtool
       7 uniq
       8 apinger
       8 fstat
       8 login
       8 sshlockout_pf
       9 tcsh
      12 sleep
      14 cron
      14 dnsmasq
      14 sort
      15 filterlog
      16 check_reload_status
      17 devd
      17 inetd
      17 openvpn
      21 dhcpd
      22 lighttpd
      22 ntpd
      22 sshd
      36 dhclient
      36 minicron
      39 php-fpm
      42 syslogd
      47 sh
    
    

    So something is wrong with "php" and "filterdns" on the 2.1.5 production system: there is no way they should be sitting with so many open file handles.
    A problem like this will cause intermittent system problems after some random number of days, weeks, or months.
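
    To pick out the outliers in a listing like the ones above, the "count command" output can be filtered against a threshold. This is a minimal sketch; over_threshold is a hypothetical helper name, and the cutoff of 1000 is an arbitrary choice for illustration, not a recommended value.

```shell
#!/bin/sh
# Read "count command" lines (the output of `uniq -c`) on stdin and print
# the commands whose count is at or above the threshold given as $1.
over_threshold() {
    awk -v limit="$1" '$1 ~ /^[0-9]+$/ && $1 + 0 >= limit + 0 { print $2, $1 }'
}

# Demo on a few lines from the 2.1.5 listing above.
# On the firewall, feed it the output of the one-liner instead.
over_threshold 1000 <<'EOF'
      54 ntpd
     249 openvpn
    1535 php
    1861 filterdns
EOF
```

    On the 2.1.5 numbers this singles out php and filterdns, the same two processes called out above.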



  • Exactly my concern!

    I think the problem shows up when filterdns (and also the layer7 daemon, in my case) gets restarted, whether by a gateway alarm, a refresh of rules, etc. They do not seem to release the old files and just allocate new file handles.



  • I raised a bug report: https://redmine.pfsense.org/issues/3951
    That way it does not get forgotten.