Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    2.0.1 Nano Alix Memory Usage, Unstable System, Config saved but not applying

    Scheduled Pinned Locked Moved General pfSense Questions
    3 Posts 1 Posters 2.5k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • S Offline
      stompro
      last edited by

      Hello, I've been seeing problems in several sites that are running pfSense 2.0.1 Nanobsd on Alix 2d13's (256MB ram).

      The usual symptom is that the captive portal stops working for new users, to restore functionality the firewall needs to be rebooted.

      In one case today all config changes have stopped applying.  I can make a change, click apply changes, but the changes never actually get applied to the system.  The changes are saved, but not applied.  If I look at /tmp/rules.debug it doesn't get updated and if I run pfctl -sa |less I can see that the changes have not been applied.

      I did notice that there were a large number of php processes just sitting around when I first looked at the box.  I tried killing them to see if one of them has the system stuck, and now I have 5 zombie processes.  Now there are 16 php processes just sitting around.

      When I look at the filter reload status, it is just stuck at one line "Creating NAT rule VNC…".

      When I look at the process list I see.

        PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
      32169 root        1  76   20 39260K 17940K lockf    0:01  0.00% php
      19144 root        1  76   20 39260K 17900K lockf    0:01  0.00% php
      50355 root        1  76   20 39260K 17896K lockf    0:01  0.00% php
       2772 root        1  76    0 34140K 16080K accept   0:06  0.00% php
       2540 root        1  76    0 34140K 16012K accept   0:03  0.00% php
       2222 root        1  76    0 34140K 15220K accept   0:03  0.00% php
      
      

      Does this mean that there is some sort of lock contention going on, 3 processes are all trying to grab locks, so I'm in a deadlock?

      My full process list looks like

      USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
      root      10 89.0  0.0     0     8  ??  RL   28Feb12 277714:04.60 [idle]
      root    2823  4.0  5.2 33116 12788  ??  S     2:40PM   0:22.60 /usr/local/bin/php
      root      11  2.0  0.0     0   104  ??  WL   28Feb12 3050:44.41 [intr]
      root       0  0.0  0.0     0    56  ??  DLs  28Feb12 871:49.73 [kernel]
      root       1  0.0  0.0  1888    84  ??  ILs  28Feb12   0:06.85 /sbin/init --
      root       2  0.0  0.0     0     8  ??  DL   28Feb12  20:15.29 [g_event]
      root       3  0.0  0.0     0     8  ??  DL   28Feb12   8:00.53 [g_up]
      root       4  0.0  0.0     0     8  ??  DL   28Feb12   7:56.95 [g_down]
      root       5  0.0  0.0     0     8  ??  DL   28Feb12   0:00.00 [crypto]
      root       6  0.0  0.0     0     8  ??  DL   28Feb12   0:00.00 [crypto returns]
      root       7  0.0  0.0     0     8  ??  DL   28Feb12  19:27.58 [pfpurge]
      root       8  0.0  0.0     0     8  ??  DL   28Feb12   0:00.00 [xpt_thrd]
      root       9  0.0  0.0     0     8  ??  DL   28Feb12  33:28.61 [pagedaemon]
      root      12  0.0  0.0     0     8  ??  DL   28Feb12   0:00.00 [ng_queue]
      root      13  0.0  0.0     0     8  ??  DL   28Feb12 417:45.44 [yarrow]
      root      14  0.0  0.0     0    64  ??  DL   28Feb12   5:27.94 [usb]
      root      15  0.0  0.0     0     8  ??  DL   28Feb12   0:00.94 [vmdaemon]
      root      16  0.0  0.0     0     8  ??  DL   28Feb12   0:00.52 [pagezero]
      root      17  0.0  0.0     0     8  ??  DL   28Feb12   0:46.74 [idlepoll]
      root      18  0.0  0.0     0     8  ??  DL   28Feb12   2:34.03 [bufdaemon]
      root      19  0.0  0.0     0     8  ??  DL   28Feb12   2:23.98 [vnlru]
      root      20  0.0  0.0     0     8  ??  DL   28Feb12  32:50.15 [syncer]
      root      21  0.0  0.0     0     8  ??  DL   28Feb12   2:37.69 [softdepflush]
      root      29  0.0  0.0     0     8  ??  DL   28Feb12  35:54.14 [md0]
      root      35  0.0  0.0     0     8  ??  DL   28Feb12 125:34.09 [md1]
      root     267  0.0  0.2  3408   476  ??  INs  28Feb12   0:00.06 /usr/local/sbin/check_reload_status
      root     272  0.0  0.0  3408     0  ??  IWN  -         0:00.00 check_reload_status: Monitoring daemon of check_reload_s
      root     288  0.0  0.1  1888   268  ??  Is   28Feb12   0:00.01 /sbin/devd
      root    1863  0.0  3.0 32092  7324  ??  Is    2:40PM   0:00.27 /usr/local/bin/php
      root    1933  0.0  5.3 33116 13044  ??  I     2:40PM   0:00.83 /usr/local/bin/php
      root    2222  0.0  6.2 34140 15220  ??  I     2:40PM   0:03.10 /usr/local/bin/php
      root    2539  0.0  6.1 34140 15008  ??  I     2:40PM   0:02.89 /usr/local/bin/php
      root    2540  0.0  6.5 34140 16012  ??  I     2:40PM   0:03.41 /usr/local/bin/php
      root    2772  0.0  6.6 34140 16080  ??  I     2:40PM   0:06.24 /usr/local/bin/php
      root    8528  0.0  0.5  3656  1144  ??  SN    2:41PM   0:02.35 /bin/sh /var/db/rrd/updaterrd.sh
      root    8546  0.0  0.7  7336  1672  ??  Ss   28Feb12  13:13.89 /usr/sbin/bsnmpd -c /var/etc/snmpd.conf -p /var/run/snmp
      root   12657  0.0  0.2  1564   468  ??  SN    3:14PM   0:00.00 sleep 60
      root   16710  0.0  0.4  7412  1020  ??  Is    7Sep12   0:23.69 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.le
      nobody 17307  0.0  0.9  5556  2176  ??  S     7Sep12  15:52.22 /usr/local/sbin/dnsmasq --local-ttl 1 --all-servers --dn
      root   17537  0.0  0.2  3404   436  ??  Is   28Feb12   1:23.92 /usr/sbin/cron -s
      root   18854  0.0  1.1  7992  2812  ??  Ss    2:10PM   0:02.34 sshd: admin@pts/0 (sshd)
      root   19144  0.0  7.3 39260 17900  ??  IN    2:44PM   0:01.05 /usr/local/bin/php -f /etc/rc.filter_configure_sync
      root   20237  0.0  0.1  3316   140  ??  Is   28Feb12   0:30.70 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr
      root   20525  0.0  0.1  3316   208  ??  Is   28Feb12   0:02.14 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pi
      root   20879  0.0  0.1  3316   196  ??  Is   28Feb12   0:00.09 /usr/local/bin/minicron 86400 /var/run/update_alias_url_
      root   20943  0.0  0.4  4948  1100  ??  Is   28Feb12   8:59.26 /usr/sbin/syslogd -c -c -l /var/dhcpd/var/run/log -f /va
      root   22238  0.0  0.6  5116  1372  ??  Ss   28Feb12  99:21.51 /usr/local/sbin/openvpn --config /var/etc/openvpn/client
      root   22652  0.0  2.4 10008  5892  ??  SN   31Aug12   1:22.85 /usr/sbin/tcpdump -s 256 -v -l -n -e -ttt -i pflog0
      root   22711  0.0  0.2  3316   448  ??  IN   31Aug12   0:35.32 logger -t pf -p local0.info
      root   25226  0.0  1.0 12108  2376  ??  Is    2:44PM   0:00.14 /usr/local/sbin/filterdns -p /tmp/filterdns-cpah.pid -i
      dhcpd  26565  0.0  0.9  8436  2296  ??  Ss    7Sep12   3:17.26 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /
      root   26684  0.0  0.6  5272  1404  ??  Is   28Feb12   0:00.01 /usr/sbin/sshd
      root   27154  0.0  0.9  5576  2108  ??  S     2:44PM   0:00.08 /usr/local/sbin/lighttpd -f /var/etc/lighty-CaptivePorta
      root   27550  0.0  0.3  3532   620  ??  Is   28Feb12   0:00.42 /usr/local/sbin/sshlockout_pf 15
      root   27602  0.0  3.0 32092  7368  ??  Is    2:44PM   0:00.27 /usr/local/bin/php
      root   28089  0.0  0.3  3316   768  ??  Is    2:44PM   0:00.01 /usr/local/bin/minicron 60 /var/run/cp_prunedb.pid /etc/
      root   28358  0.0  3.0 32092  7368  ??  Is    2:44PM   0:00.27 /usr/local/bin/php
      root   30798  0.0  0.1  3436   152  ??  Is   28Feb12   0:00.03 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.con
      root   31219  0.0  3.0 32092  7380  ??  I     2:44PM   0:00.00 /usr/local/bin/php
      root   31476  0.0  3.0 32092  7380  ??  I     2:44PM   0:00.00 /usr/local/bin/php
      root   31821  0.0  3.0 32092  7380  ??  I     2:44PM   0:00.00 /usr/local/bin/php
      root   31922  0.0  3.0 32092  7380  ??  I     2:44PM   0:00.00 /usr/local/bin/php
      root   32169  0.0  7.3 39260 17940  ??  IN    2:44PM   0:01.05 /usr/local/bin/php -f /etc/rc.filter_configure_sync
      root   34062  0.0  0.4  4496  1068  ??  S     2:43PM   0:00.03 /usr/local/bin/rrdtool -
      root   46065  0.0  0.3  3316   652  ??  Ss   28Feb12 110:20.73 /usr/local/sbin/apinger -c /var/etc/apinger.conf
      root   50355  0.0  7.3 39260 17896  ??  IN    2:41PM   0:01.05 /usr/local/bin/php -f /etc/rc.filter_configure_sync
      root   54262  0.0  1.4  7612  3480  ??  S    28Feb12  30:40.66 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigura
      root   54517  0.0  0.0     0     0  ??  Z    28Feb12  66:53.76 <defunct>root   55563  0.0  0.0     0     0  ??  Z    28Feb12 239:57.08 <defunct>root   55928  0.0  0.0     0     0  ??  Z    28Feb12  20:47.48 <defunct>root   56079  0.0  0.2  3316   524  ??  Ss   28Feb12   0:32.77 ntpd: [priv] (ntpd)
      root   56281  0.0  0.0     0     0  ??  Z    28Feb12  20:59.64 <defunct>root   56618  0.0  0.0     0     0  ??  Z    28Feb12  26:27.41 <defunct>root   37159  0.0  0.0  3684     0  u0  IWs  -         0:00.00 login [pam] (login)
      root   37712  0.0  0.0  3656     0  u0  IW   -         0:00.00 -sh (sh)
      root   39851  0.0  0.1  3656   220  u0  I+   28Feb12   0:00.01 /bin/sh /etc/rc.initial
      _ntp   55519  0.0  0.2  3316   556  u0- I    28Feb12   2:40.48 ntpd: ntp engine (ntpd)
      root   13484  0.0  0.4  3456   892   0  R+    3:14PM   0:00.01 ps aux
      root   19426  0.0  0.4  3656   976   0  Is    2:10PM   0:00.02 /bin/sh /etc/rc.initial
      root   25765  0.0  0.9  4696  2144   0  S     2:10PM   0:00.13 /bin/tcsh</defunct></defunct></defunct></defunct></defunct> 
      

      My system log looks like this.

      Sep 27 14:17:37 cr-firewall kernel: pid 45333 (php), uid 0, was killed: out of swap space
      Sep 27 17:08:25 cr-firewall kernel: pid 62341 (php), uid 0, was killed: out of swap space
      Sep 27 17:09:26 cr-firewall php: : The command '/sbin/ipfw table 1 delete 192.168.1.208' returned exit code '71', the output was 'ipfw: setsockopt(IP_FW_TABLE_DEL): No such process'
      Sep 27 17:09:26 cr-firewall php: : The command '/sbin/ipfw table 2 delete 192.168.1.208' returned exit code '71', the output was 'ipfw: setsockopt(IP_FW_TABLE_DEL): No such process'
      Sep 27 17:09:26 cr-firewall php: : The command '/sbin/ipfw pipe 20794 delete' returned exit code '1', the output was 'ipfw: rule 1: setsockopt(IP_DUMMYNET_DEL): Invalid argument'
      Sep 27 17:09:27 cr-firewall php: : The command '/sbin/ipfw pipe 20795 delete' returned exit code '1', the output was 'ipfw: rule 1: setsockopt(IP_DUMMYNET_DEL): Invalid argument'
      Sep 27 21:13:27 cr-firewall kernel: pid 54687 (php), uid 0, was killed: out of swap space
      Sep 28 02:23:39 cr-firewall kernel: pid 30905 (php), uid 0, was killed: out of swap space
      Sep 28 03:34:27 cr-firewall filterdns: host_dns: failed looking up "ils.larl.org": hostname nor servname provided, or not known
      Sep 28 03:34:27 cr-firewall filterdns: host_dns: failed looking up "ils.larl.org": hostname nor servname provided, or not known
      Sep 28 03:34:27 cr-firewall filterdns: host_dns: failed looking up "ils.larl.org": hostname nor servname provided, or not known
      Sep 28 03:34:27 cr-firewall filterdns: host_dns: failed looking up "ils.larl.org": hostname nor servname provided, or not known
      Sep 28 04:10:02 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 04:10:02 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 04:18:01 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 04:18:01 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 04:25:01 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 04:25:01 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 10:07:59 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 10:07:59 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 11:57:07 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 11:57:07 cr-firewall php: /index.php: Successful webConfigurator login for user 'admin' from 216.239.28.98
      Sep 28 16:58:35 cr-firewall check_reload_status: Syncing firewall
      Sep 28 16:59:10 cr-firewall check_reload_status: Syncing firewall
      Sep 28 16:59:43 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:00:11 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:00:32 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:00:50 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:00:50 cr-firewall check_reload_status: Reloading filter
      Sep 28 12:00:50 cr-firewall php: /firewall_shaper.php: Creating rrd update script
      Sep 28 12:00:57 cr-firewall kernel: pid 14302 (php), uid 0, was killed: out of swap space
      Sep 28 17:01:16 cr-firewall check_reload_status: Syncing firewall
      Sep 28 12:01:24 cr-firewall apinger: Error while feeding rrdtool: Broken pipe
      Sep 28 17:01:37 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:01:45 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:01:45 cr-firewall check_reload_status: Reloading filter
      Sep 28 12:01:53 cr-firewall kernel: pid 19436 (php), uid 0, was killed: out of swap space
      Sep 28 12:02:02 cr-firewall kernel: pid 19636 (php), uid 0, was killed: out of swap space
      Sep 28 12:04:39 cr-firewall kernel: pid 21966 (php), uid 0, was killed: out of swap space
      Sep 28 17:04:52 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:04:57 cr-firewall check_reload_status: Reloading filter
      Sep 28 12:16:12 cr-firewall kernel: pid 33523 (php), uid 0, was killed: out of swap space
      Sep 28 17:17:52 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:17:57 cr-firewall check_reload_status: Syncing firewall
      Sep 28 17:17:57 cr-firewall check_reload_status: Reloading filter
      Sep 28 12:17:59 cr-firewall php: /firewall_shaper.php: Creating rrd update script
      Sep 28 12:18:05 cr-firewall kernel: pid 8111 (php), uid 0, was killed: out of swap space
      Sep 28 12:18:24 cr-firewall apinger: Error while feeding rrdtool: Broken pipe
      Sep 28 12:20:17 cr-firewall kernel: pid 8159 (php), uid 0, was killed: out of swap space
      Sep 28 18:57:39 cr-firewall check_reload_status: Syncing firewall
      Sep 28 18:57:45 cr-firewall check_reload_status: Syncing firewall
      Sep 28 18:57:45 cr-firewall check_reload_status: Reloading filter
      Sep 28 13:57:47 cr-firewall php: /firewall_shaper.php: Creating rrd update script
      Sep 28 13:57:53 cr-firewall kernel: pid 57576 (php), uid 0, was killed: out of swap space
      Sep 28 13:58:25 cr-firewall apinger: Error while feeding rrdtool: Broken pipe
      Sep 28 19:08:33 cr-firewall check_reload_status: Syncing firewall
      Sep 28 19:08:57 cr-firewall check_reload_status: Syncing firewall
      Sep 28 19:08:57 cr-firewall check_reload_status: Reloading filter
      Sep 28 14:08:58 cr-firewall php: /firewall_shaper.php: Creating rrd update script
      Sep 28 14:09:05 cr-firewall kernel: pid 32043 (php), uid 0, was killed: out of swap space
      Sep 28 14:09:25 cr-firewall apinger: Error while feeding rrdtool: Broken pipe
      Sep 28 14:10:24 cr-firewall sshd[18854]: Accepted keyboard-interactive/pam for admin from 216.239.28.98 port 45647 ssh2
      Sep 28 19:33:04 cr-firewall check_reload_status: Syncing firewall
      Sep 28 19:33:16 cr-firewall check_reload_status: Syncing firewall
      Sep 28 19:33:16 cr-firewall check_reload_status: Reloading filter
      Sep 28 14:33:17 cr-firewall php: /firewall_shaper.php: Creating rrd update script
      Sep 28 14:33:26 cr-firewall apinger: Error while feeding rrdtool: Broken pipe
      Sep 28 14:35:19 cr-firewall kernel: pid 21821 (php), uid 0, was killed: out of swap space
      Sep 28 14:36:20 cr-firewall php: : The command '/sbin/ipfw table 1 delete 192.168.1.199' returned exit code '71', the output was 'ipfw: setsockopt(IP_FW_TABLE_DEL): No such process'
      Sep 28 14:36:20 cr-firewall php: : The command '/sbin/ipfw table 2 delete 192.168.1.199' returned exit code '71', the output was 'ipfw: setsockopt(IP_FW_TABLE_DEL): No such process'
      Sep 28 14:36:20 cr-firewall php: : The command '/sbin/ipfw pipe 20014 delete' returned exit code '1', the output was 'ipfw: rule 1: setsockopt(IP_DUMMYNET_DEL): Invalid argument'
      Sep 28 14:36:20 cr-firewall php: : The command '/sbin/ipfw pipe 20015 delete' returned exit code '1', the output was 'ipfw: rule 1: setsockopt(IP_DUMMYNET_DEL): Invalid argument'
      Sep 28 19:41:28 cr-firewall check_reload_status: Syncing firewall
      Sep 28 19:41:37 cr-firewall check_reload_status: Syncing firewall
      Sep 28 19:41:37 cr-firewall check_reload_status: Reloading filter
      Sep 28 14:41:37 cr-firewall php: /firewall_shaper.php: Creating rrd update script
      Sep 28 14:42:26 cr-firewall apinger: Error while feeding rrdtool: Broken pipe
      Sep 28 19:43:54 cr-firewall check_reload_status: Syncing firewall
      Sep 28 14:43:55 cr-firewall kernel: IP firewall unloaded
      Sep 28 19:43:55 cr-firewall check_reload_status: Reloading filter
      Sep 28 19:44:10 cr-firewall check_reload_status: Syncing firewall
      Sep 28 14:44:11 cr-firewall kernel: ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding enabled, default to accept, logging disabled
      Sep 28 19:44:16 cr-firewall check_reload_status: Reloading filter
      
      

      My theory right now is that the system gets low on memory, then starts killing processes randomly, or however it decided which process to kill, and a process that holds a lock gets killed, which deadlocks the system until it is rebooted.

      Does anyone have any tips for figuring out what is causing the memory usage to run out of control?
      Thanks
      Josh

      Hardware used: Alix 2D13 X 10, APU2D4 X 10, SG-2200 X 10, SG-2440 X 4

      1 Reply Last reply Reply Quote 0
      • S Offline
        stompro
        last edited by

        This redmine ticket may be related to my problem.

        http://redmine.pfsense.org/issues/2475

        According to the submitter, the captive portal page limits don't actually work because a lighttpd module isn't included with pfsense.  So it could be that the captive portal is getting dos'd once in a while, which causes memory to be used up, which triggers system instability.

        Josh

        Hardware used: Alix 2D13 X 10, APU2D4 X 10, SG-2200 X 10, SG-2440 X 4

        1 Reply Last reply Reply Quote 0
        • S Offline
          stompro
          last edited by

          The Mod_evasive per-ip connection to the captive portal issue should be fixed in 2.0.3, which should keep the Captive portal from causing OOM events.  I've upgraded several sites and the memory usage does seem to be lower now.
          Josh

          Hardware used: Alix 2D13 X 10, APU2D4 X 10, SG-2200 X 10, SG-2440 X 4

          1 Reply Last reply Reply Quote 0
          • First post
            Last post
          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.