Update to 2.1 Nano, 'file system full'.


  • Netgate Administrator

    Hmm, I've just updated my home box from 2.0.3 to 2.1. It's NanoBSD 32-bit 1GB. The box has 512MB of RAM, so double the Alix, but it didn't go nearly as smoothly as I'd hoped, or as I'd found on other boxes with the RC. I too used a manual update since the servers are currently offline. The result was very similar: no access to the GUI and no internet access. Connecting a serial console showed many 'file system full' errors on /tmp. Rebooting got me back to a working router, but at least one package was only partially installed. Reinstalling it resulted in more 'file system full' errors, on /var this time, but it did reinstall correctly. All seems well now.
    Seems odd. The machine has the lowest RAM of any I've run 2.1 (beta, RC, etc.) on. I'm wondering if some adjusting of the /tmp and /var sizes might be in order.

    Steve

    After 1st boot:
    
    /tmp: write failed, filesystem is full
    
    Fatal error: PHP Startup: apc_fcntl_create: open(/tmp/.apc.Se4VZI, O_RDWR|O_CREAT, 0666) failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_lock failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_unlock failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_create: open(/tmp/.apc.eS9TAT, O_RDWR|O_CREAT, 0666) failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_create: open(/tmp/.apc.KfhW3r, O_RDWR|O_CREAT, 0666) failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_lock failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_unlock failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_create: open(/tmp/.apc.Pd8tLM, O_RDWR|O_CREAT, 0666) failed: in Unknown on line 0
    
    Fatal error: PHP Startup: apc_fcntl_create: open(/tmp/.apc.27D9gR, O_RDWR|O_CREAT, 0666) failed: in Unknown on line 0
    
    Fatal error: Unknown: apc_fcntl_rdlock failed: in Unknown on line 0
    
    2nd boot:
    Generating RRD graphs...ERROR: Dangling Comment
    ERROR: No <v> tag found
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    


  • I have seen this as well. I used the auto-upgrade functionality built into pfSense. I had to power cycle the box in order to get my interfaces back up. Now I get a bunch of division by zero errors in my GUI and I am not able to SSH in: after I enter the username and password, PuTTY closes. It also appears that none of the log files are being read into the GUI (none of the tabs show any information).

    Version 2.1-RELEASE (i386)
    built on Wed Sep 11 18:16:22 EDT 2013
    FreeBSD 8.3-RELEASE-p11
    Platform nanobsd (2g)

    These are some of the errors I get on the index page where the CPU usage and memory usage bars are:

    Warning: array_combine(): Both parameters should have an equal number of elements in /usr/local/www/includes/functions.inc.php on line 123
    Warning: array_combine(): Both parameters should have an equal number of elements in /usr/local/www/includes/functions.inc.php on line 125
    Warning: array_sum() expects parameter 1 to be array, boolean given in /usr/local/www/includes/functions.inc.php on line 127
    Warning: array_sum() expects parameter 1 to be array, boolean given in /usr/local/www/includes/functions.inc.php on line 128
    Warning: Division by zero in /usr/local/www/includes/functions.inc.php on line 220
    Warning: Division by zero in /usr/local/www/includes/functions.inc.php on line 175
    Warning: Division by zero in /usr/local/www/includes/functions.inc.php on line 155 0%


  • Netgate Administrator

    You have low memory errors? Out of swap? File system full?

    My own box is now back up and running with no long term problems (or none I've found yet). Your error looks more serious.

    Steve



  • Have the same error with PC Alix and 2GB CF Card after autoupdate :(



  • You can run out of disk space because the RRD upgrade process does not clean up /tmp.

    Adding IPv6 to the RRD graphs basically doubles the information, and the process was keeping two copies of the XML dump on disk. So more than 4 times the data ends up on /tmp, which by default is 40MB.

    This pull request fixes the lack of cleanup.
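
    To put rough numbers on that, here is a minimal sketch of the arithmetic, assuming the factors described in this post (IPv6 roughly doubles each dump, and two copies are kept at once). The function names and the sample dump size are made up for illustration; this is not part of pfSense.

```shell
# peak_tmp_kb OLD_DUMP_KB
#   OLD_DUMP_KB is a hypothetical size, in KB, of the XML dump(s) that the
#   pre-2.1 RRD files would produce. The result is a lower bound on peak
#   /tmp usage: x2 for the added IPv6 data, x2 for the second copy on disk.
peak_tmp_kb() {
    echo $(( $1 * 2 * 2 ))
}

# fits_in_tmp OLD_DUMP_KB TMP_MB
#   Succeeds (exit status 0) when the estimated peak fits into a /tmp
#   RAM disk of TMP_MB megabytes, and fails otherwise.
fits_in_tmp() {
    [ "$(peak_tmp_kb "$1")" -le $(( $2 * 1024 )) ]
}
```

    For example, "fits_in_tmp 12000 40" fails: about 12MB of old dump data becomes roughly 48MB at peak, which does not fit in a 40MB /tmp.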



  • Warning: array_combine(): Both parameters should have an equal number of elements in /usr/local/www/includes/functions.inc.php on line 123 
    Warning: array_combine(): Both parameters should have an equal number of elements in /usr/local/www/includes/functions.inc.php on line 125 
    Warning: array_sum() expects parameter 1 to be array, boolean given in /usr/local/www/includes/functions.inc.php on line 127 
    Warning: array_sum() expects parameter 1 to be array, boolean given in /usr/local/www/includes/functions.inc.php on line 128 
    Warning: Division by zero in /usr/local/www/includes/functions.inc.php on line 220
    Warning: Division by zero in /usr/local/www/includes/functions.inc.php on line 175
    Warning: Division by zero in /usr/local/www/includes/functions.inc.php on line 155 0%
    

    Those can all result from system calls made by the PHP code that have failed, presumably due to something like a lack of memory to fork the system processes needed to run the commands that gather the data.
    @firegood: How much memory is in your system? Is this still happening? Does the system log have any "killed" or other messages that look like lack-of-memory problems?



  • Yes, it's certainly still going on. Through the webGUI none of the "system log" tabs show any information; they are all empty. SSH closes the connection after I enter a valid username and password, and WinSCP will not connect either. I am not able to reboot it through the GUI either: it reports that it is stopping the package services but never actually reboots. I am going to have to console in and see if I can retrieve the logs through the shell. Any idea which logs I should look at in particular? I only know enough about the command line to squeak by.

    I believe it has 512MB of RAM (certainly no less) and it is not a WRAP or Alix board, but rather a Mini-ITX motherboard with a P4 processor, running on the CF card via an ATA adapter.

    I have tried:
    Hard power off/on
    Attempting to manually flash the latest upgrade package through the GUI: I get a message that the file is corrupt.
    Config restore: didn't help any.



  • I haven't had these problems, but what I wonder is how close to full were these drives before the upgrades?


  • Banned

    Sounds like about time to reimage and restore the config… Still don't see any info on what packages you installed there.



  • I was going to pick up a new CF tomorrow and re-flash it.

    Packages:
    Avahi
    Cron
    Iperf
    OpenVPN Client Export Utility
    pfflowd


  • Banned

    Hmmm… the avahi thing is pretty huge when starting. Also tons of dependencies there. Try to nuke and see if it helps.



  • It sounds like /var is getting filled by something. A few simple commands:

    [2.1-RELEASE][root@myrouter]/home/phil.davis(2): df
    Filesystem        512-blocks   Used   Avail Capacity  Mounted on
    /dev/ufs/pfsense0    1859358 510953 1199657    30%    /
    devfs                      2      2       0   100%    /dev
    /dev/ufs/cf           101055   7293   85678     8%    /cf
    /dev/md0               78812   1424   71084     2%    /tmp
    /dev/md1              118492  40848   68168    37%    /var
    devfs                      2      2       0   100%    /var/dhcpd/dev
    

    See which has Avail down to 0.
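
    If you would rather not eyeball the table, a small filter can pick out the full filesystems. This is only a sketch: low_space is a made-up helper name, and it assumes the plain df column layout shown above (Avail in the 4th column, mount point last, no spaces in mount points).

```shell
# low_space [MIN_AVAIL_BLOCKS]
#   Reads plain `df` output on stdin and prints "mountpoint avail" for each
#   filesystem whose Avail column is at or below MIN_AVAIL_BLOCKS (default 0).
#   devfs entries are skipped: they always report 100% and are not storage.
low_space() {
    awk -v min="${1:-0}" '
        NR == 1       { next }              # skip the header line
        $1 == "devfs" { next }              # pseudo-filesystem, ignore
        $4 + 0 <= min + 0 { print $NF, $4 } # mount point and Avail blocks
    '
}
```

    Run it as "df | low_space" to list filesystems with nothing left, or "df | low_space 2000" to also catch those that are nearly full.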

    [2.1-RELEASE][root@myrouter]/home/phil.davis(3): cd /var/log
    [2.1-RELEASE][root@myrouter]/var/log(4): ls
    dhcpd.log      gateways.log   lastlog        ntpd.log       portalauth.log relayd.log     system.log     wireless.log
    dmesg.boot     ipsec.log      lighttpd.log   openvpn.log    ppp.log        resolver.log   userlog
    filter.log     l2tps.log      ntp            poes.log       pptps.log      routing.log    vpn.log
    

    There should be a bunch of log files in /var/log

    [2.1-RELEASE][root@myrouter]/var/log(15): clog system.log | grep ROUTING
    Sep 15 12:20:57 myrouter php: rc.bootup: ROUTING: setting default route to 10.49.94.250
    

    Most of the logs are circular logs. Use "clog" to display the content, and then "grep" for interesting strings…
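
    To search all of the logs in one go, something along these lines should work. It is a sketch under a few assumptions: scan_clogs and the CLOG_CMD variable are names invented here, and it assumes every *.log file in the directory is a circular log that clog can dump (set CLOG_CMD=cat if yours are plain text).

```shell
# scan_clogs PATTERN [LOGDIR]
#   Dumps every *.log file in LOGDIR (default /var/log) and prints the
#   lines matching PATTERN (extended regex, case-insensitive), each
#   prefixed with the name of the log file it came from.
scan_clogs() {
    pattern=$1
    logdir=${2:-/var/log}
    for f in "$logdir"/*.log; do
        [ -f "$f" ] || continue                 # no matching files at all
        "${CLOG_CMD:-clog}" "$f" 2>/dev/null |
            grep -i -E -e "$pattern" |
            awk -v name="$(basename "$f")" '{ print name ": " $0 }'
    done
}
```

    For the problems in this thread, "scan_clogs 'filesystem (is )?full|killed'" would be a reasonable first search.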


  • Banned

    Pretty much convinced it's the Avahi thing messing this up. Cf. http://forum.pfsense.org/index.php?topic=61289.0


  • Netgate Administrator

    It wasn't until my box didn't come back up some time after hitting the upgrade button that I realised I'd not tried it on any box with <1GB of RAM. My box didn't/doesn't have any large packages running. The package that failed to install was lcdproc-dev which uses almost no resources under normal conditions.
    Perhaps 60MB just isn't enough for /var? Have the ram disk sizes changed between 2.0.3 and 2.1?
    Anyway it seems like it was the actual package re-installation process that was causing problems, once I was past that I've had no further problems.
    For reference, after the initial reboot, when the box had stopped doing anything, I couldn't reboot even from the console menu. I had to drop to the command line and issue 'reboot now'. Even then it errored a bit but did reboot.

    Steve


  • Banned

    @stephenw10:

    Perhaps 60MB just isn't enough for /var? Have the ram disk sizes changed between 2.0.3 and 2.1?

    This is what I have managed to load on a poor Alix testbox, with default 60MB /var, snapshots updated very frequently without any such issues.

    (Note: the gwled/blinkled stuff is only used when needed, since it's a CPU hog on this poor box). So… definitely not a generic issue there.

    As for avahi, this one should most definitely be unavailable on nanobsd. And looking at the footprint and resulting attack surface, I frankly don't think anyone wants to run such a thing on their firewall, at all.

    P.S. Trying to repeat the above setup is NOT suggested for production boxes, LOL...  :P


  • Netgate Administrator

    You haven't even loaded Squid or Snort, not even trying!  :P

    Yes, I agree though, it's not a matter of how many packages you have installed but of actually installing them. Downloading and extracting the package and its dependencies (now all in the pbi?) uses a lot of space. Somehow during the upgrade there is less space available and the file system ran out in my case. Since the sizes of /var and /tmp are fixed (unless you've changed them) I can't see how the RAM size has much effect.  :-\

    Steve



  • That's just not smart!



  • I've tried again with no packages installed and got the same error. Now I'm flashing 2.1 by hand :(


  • Netgate Administrator

    Which error specifically are you referring to?
    Even if none of us have an answer at the moment it helps to at least document the problems you're having and what you've tried to correct it.

    Steve



  • I am having a similar issue upgrading from 2.0.3 to 2.1 on an Alix 2D3 with a 4GB card and nano image:

    Both procedures below end up with the following errors:

    
    Generating RRD graphs...done.
    Updating configuration...............Generating RRD graphs...done.
    ...Update RRD database wan-traffic.rrd.
    Update RRD database wan-packets.rrd.
    Update RRD database lan-traffic.rrd.
    Update RRD database lan-packets.rrd.
    Update RRD database ipsec-traffic.rrd.
    Update RRD database ipsec-packets.rrd.
    Generating RRD graphs...done.
    pid 786 (rrdtool), uid 0 inumber 2050 on /var: filesystem full
    
    /var: write failed, filesystem is full
    ....done.
    

    At reboot I get:

    
    Enter an option: 
    pfSense is now shutting down ...
    
    /var: write failed, filesystem is full
    
    /var: write failed, filesystem is full
    Sep 16 18:26:51 lighttpd[33063]: (server.c.1546) server stopped by UID = 0 PID = 1 
    
    

    Procedures (how I got the errors):

    1. Auto upgrade using webgui (2.0.3 -> 2.1)

    2a. Backup 2.0.3 config
    2b. Wipe SD card; flash 2.1
    2c. Restore 2.0.3 config

    So it seems that I have no good upgrade path from 2.0.3 to 2.1 unless I'm missing something. Also, following a 2.1 upgrade, I get the following at subsequent boots:

    
    Generating RRD graphs...ERROR: No  tag found
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    
    ERROR: Incompatible file version, detected version . This is not supported by the version 0003 restore tool.
    
    done.
    
    

    Many thanks for any help with this.


  • Netgate Administrator

    I assume you mean CF card not SD?

    I had those RRD incompatible version errors. They didn't seem to cause much of a problem. I think I lost some traffic data from one WAN. I only had it on one boot though.

    Steve


  • Netgate Administrator

    Hmm, may have spoken too soon:

    
    Sep 18 00:00:05	kernel: pid 60237 (rrdtool), uid 0 inumber 4091 on /var: filesystem full
    Sep 18 00:00:04	kernel: pid 59098 (rrdtool), uid 0 inumber 4084 on /var: filesystem full
    
    

    Time to increase /var perhaps.

    Steve



  • Ohhhhhhhh how the mighty have fallen…  ;D

    Give it 10 minutes of tinkering and you will be all aces again  ;)



  • I got 'file system full' too, but my hard disk had 29G free  ???


  • Netgate Administrator

    Hmm, odd. You hadn't chosen to use a ramdisk for /var?

    Steve



  • No flash drives, no ram disks here:

    [root@xxx]/root: mount
    /dev/ad0s1a on / (ufs, local)
    devfs on /dev (devfs, local)
    /dev/md0 on /var/run (ufs, local)
    devfs on /var/dhcpd/dev (devfs, local)

    [root@xxx]/root: df -h
    Filesystem     Size    Used   Avail Capacity  Mounted on
    /dev/ad0s1a     35G    2.8G     29G     9%    /
    devfs          1.0k    1.0k      0B   100%    /dev
    /dev/md0       3.6M     48k    3.3M     1%    /var/run
    devfs          1.0k    1.0k      0B   100%    /var/dhcpd/dev
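
    A quick way to answer the ramdisk question from mount output like the above is to list whatever sits on a memory disk. This is a sketch only: ramdisk_mounts is a made-up helper name, and it assumes memory disks show up as /dev/mdN, as they do in the listing here.

```shell
# ramdisk_mounts
#   Reads `mount` output on stdin and prints the mount point of every
#   filesystem backed by a memory disk device (/dev/md0, /dev/md1, ...).
ramdisk_mounts() {
    awk '$1 ~ /^\/dev\/md[0-9]+$/ && $2 == "on" { print $3 }'
}
```

    Fed the mount output above ("mount | ramdisk_mounts"), it prints only /var/run, so neither /var nor /tmp is a RAM disk on this box.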



  • Has anyone found an answer to this? I'm in the same boat as NetVicious, except that I'm running the firewall off a 236GB HDD with 4GB of RAM. I'm too scared to reboot, as my whole network uses this firewall: it's currently my internet connection and bridges 3 networks together. I can't run the web GUI; I get "Fatal error: Unknown: apc_fcntl_rdlock failed: in Unknown on line 0". The firewall monitor is also saying something about the HDD's inodes. Is the hard drive formatted incorrectly? Not sure.
    Has anyone fixed it?


  • Netgate Administrator

    I fixed the errors I was seeing by expanding /var from 60 to 80MB. Your problem looks bigger than that though.

    Steve

    Edit: typo



  • On my Alix I did not find a solution. I finally reflashed my compact flash card with a fresh pfSense 2.1. I've described this besides other experiences in this thread: http://forum.pfsense.org/index.php/topic,68531.0.html.

    As my WLAN card remains dead after upgrading I am currently forced to stay with pfSense 2.0.3.



  • @stephenw10:

    I fixed the errors I was seeing by expanding /var from 60 to 80MB. Your problem looks bigger than that though.

    Steve

    Edit: typo

    Can you tell me how? I have the same problem. It arises when I configure IGMP Proxy on a clean install of 2.1.
    When I delete the two gateways, everything runs normally.

    Matthias


  • Netgate Administrator

    There's an option to do it in System: Advanced: Miscellaneous:
    You have to reboot to see the change.

    Steve

