Netgate Discussion Forum

    Backup before upgrade fills hard drive, can't find kernel

    Problems Installing or Upgrading pfSense Software
      lexrc

      After I attempted an upgrade from 2.2 to 2.2.1, the system was left unable to boot. I had a 12GB drive with several GB allocated to squid cache. The auto-backup failed, leaving the hard drive full. As the logs below show, the upgrade appears to have continued after the backup failed.

      [2.2-RELEASE][root@pfSense.localdomain]/tmp/hdrescue/cf/conf: df -h /dev/ada*
      df: /dev/ada0: Invalid argument
      df: /dev/ada0s1: Invalid argument
      df: /dev/ada0s1b: Invalid argument
      Filesystem      Size    Used  Avail Capacity  Mounted on
      /dev/ada0s1a    12G    11G  -241M  102%    /tmp/hdrescue
      [2.2-RELEASE][root@pfSense.localdomain]/tmp/hdrescue/cf/conf:

      --------------------------------------------------------------------------------

      [2.2-RELEASE][root@pfSense.localdomain]/tmp/hdrescue/cf/conf: less firmware_update_misc_log.txt

      tar: Failed to set default locale

      bzcat: Compressed file ends unexpectedly;
              perhaps it is corrupted?  Possible reason follows.
      bzcat: No such file or directory
              Input file = /tmp/chflags.dist.usr.bz2, output file = (stdout)

      It is possible that the compressed file(s) have become corrupted.
      You can use the -tvv option to test integrity of such files.

      You can use the `bzip2recover' program to attempt to recover
      data from undamaged sections of corrupted files.

      shutdown: [pid 61121]
      firmware_update_misc_log.txt (END)

      --------------------------------------------------------------------------------
      [2.2-RELEASE][root@pfSense.localdomain]/tmp/hdrescue/cf/conf: less upgrade_log.txt

      pfSenseupgrade upgrade starting

      Sat Mar 21 02:54:56 EDT 2015

      -rw-r--r--  1 root  wheel    82M Mar 21 02:17 /root/latest.tgz

      MD5 (/root/latest.tgz) = 0dffafc9f3f815bc0cdba775b62ccdaf

      /dev/ada0s1a on / (ufs, local)
      devfs on /dev (devfs, local)
      /dev/md0 on /var/run (ufs, local)
      devfs on /var/dhcpd/dev (devfs, local)
      fdescfs on /dev/fd (fdescfs)
      /dev/md10 on /var/tmp/havpRAM (ufs, local, soft-updates)

      last pid: 17437;  load averages:  0.56,  0.58,  0.52  up 8+15:37:36    02:54:57
      72 processes:  1 running, 67 sleeping, 4 zombie

      Mem: 167M Active, 1472M Inact, 289M Wired, 11M Cache, 211M Buf, 21M Free
      Swap: 4096M Total, 2470M Used, 1626M Free, 60% Inuse

      PID USERNAME  THR PRI NICE  SIZE    RES STATE    TIME    WCPU COMMAND
      42616 proxy      17  20    0  3173M  676M uwait  20:50  0.00% squid
      3668 nobody      1  20    0 23164K  4624K select  5:51  0.00% darkstat
      19251 root        1  20    0 12464K  1840K select  2:53  0.00% apinger
      23730 root        1  20    0 54892K  6500K kqread  2:12  0.00% lighttpd
      14439 root        1  20    0 16812K  1864K bpf      1:17  0.00% filterlog
      68981 root        1  20    0 14676K  2020K select  0:37  0.00% syslogd
        245 root        1  20    0  219M  8992K kqread  0:29  0.00% php-fpm
      73747 _pflogd    1  20    0 14752K  1972K bpf      0:29  0.00% pflogd
      76524 _pflogd    1  20    0 14752K  1968K bpf      0:28  0.00% pflogd
      90146 _spamd      1  20    0 23004K  3632K bpf      0:19  0.00% spamlogd
      91262 _spamd      1  20    0 23004K  3632K bpf      0:18  0.00% spamlogd
      72923 root        1  24    0 17144K  704K wait    0:11  0.00% sh
      19494 root        1  20    0 28332K  2016K piperd  0:07  0.00% rrdtool
      42954 root        1  20    0 16672K  1192K nanslp  0:02  0.00% cron
      3978 nobody      1  20    0 19068K  1220K sbwait  0:01  0.00% darkstat
        262 root        1  40  20 19032K  1028K kqread  0:01  0.00% check_reload_status
        276 root        1  20    0 13164K  364K select  0:00  0.00% devd
      67782 root      17  20    0  207M  7084K uwait    0:00  0.00% charon

      bzip2: I/O or other error, bailing out.  Possible reason follows.
      bzip2: No space left on device
              Input file = (stdin), output file = (stdout)
      tar: Failed to set default locale
      x ./tmp/pre_upgrade_command
      Firmware upgrade in progress…
      Content-type: text/html

      Installing /root/latest.tgz.
      tar: Failed to set default locale
      ./usr/local/lib/ipsec/libstrongswan.so.0: Write failed
      ./usr/local/lib/ipsec/libhydra.so.0: Write to restore size failed
      ./usr/local/lib/ipsec/libcharon.so.0: Write to restore size failed
      ./usr/local/lib/ipsec/libcharon.so: Write to restore size failed
      ./usr/local/lib/ipsec/libhydra.so: Write to restore size failed
      ./usr/local/lib/ipsec/libradius.so: Write to restore size failed
      ./usr/local/lib/ipsec/libradius.so.0: Write to restore size failed
      ./usr/local/lib/ipsec/libsimaka.so: Write to restore size failed
      ./usr/local/lib/ipsec/libsimaka.so.0: Write to restore size failed
      ./usr/local/lib/ipsec/libstrongswan.so: Write to restore size failed
      ./usr/local/lib/ipsec/libtls.so: Write to restore size failed
      ./usr/local/lib/ipsec/libtls.so.0: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-addrblock.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-aes.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-attr.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-blowfish.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-cmac.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-constraints.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-curl.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-des.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-dnskey.so: Write to restore size failed
      ./usr/local/lib/ipsec/plugins/libstrongswan-eap-aka-3gpp2.so: Write to restore size failed

        doktornotor Banned

        And what solution are you expecting? Get a bigger drive. Or don't back up junk like squid cache.

          lexrc

          I'm expecting the upgrade to not kill the box. This was my first upgrade of pfSense. It had only been running for a week or so, so I just rebuilt the VM and gave it a bigger drive. This gave me the opportunity to perform a cleaner install the second time around.

          Obviously I should have checkpointed the VM before the upgrade, lesson learned, but when doing the auto-upgrade there was a checkbox saying something along the lines of "perform a full backup prior to upgrade". This seemed like the safest option, BUT instead it killed the system. If I had opted out of the backup I would have been fine.

          The backup failed because the drive was full, yet the update proceeded to kill the installation without doing any sanity checks.

          This post was more of a warning to others and should probably be a bug report but I didn't want to file one without posting here first in case it was a known issue.

            lexrc

            Also, just pointing out that the only reason I gave the VM any additional space was to store the squid cache. As a newbie, that seems like a reasonable thing to do.

              lexrc

              Submitted a bug report. https://redmine.pfsense.org/issues/4549

                kejianshi

                Yeah - a full HD is quite a bug.

                Next time, back up the configuration and save it on a desktop somewhere (don't use the backup-while-upgrading button).

                Then make a new VM, install a fresh pfSense and restore the config.

                  lexrc

                  Filling up the hard drive during a backup isn't the bug. The bug is trying to extract the update while the drive is full and wiping out the kernel leaving the machine unable to boot. The drive had 40% free space when I started the upgrade.

                  I've never seen another OS start an upgrade without verifying capacity. Windows, Ubuntu, Android, iOS and macOS all do sanity checks. It's not asking that much. I'm even willing to work on the bug myself and submit a patch.
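
A capacity check of that kind can be very small. The following is only an illustrative sketch in Python, not pfSense code (the real upgrade scripts are shell). The function name `has_room_for_upgrade` and the `expansion_factor` default are assumptions made up for this example.

```python
import os
import shutil


def has_room_for_upgrade(archive_path, target_dir, expansion_factor=2.0):
    """Return True if target_dir's filesystem can hold the unpacked archive.

    expansion_factor is a rough guess at how much a compressed archive
    grows when extracted. It is an assumption for this sketch, not a
    value taken from pfSense.
    """
    needed = os.path.getsize(archive_path) * expansion_factor
    free = shutil.disk_usage(target_dir).free
    return free >= needed
```

For this failure the check would have to run after the backup step rather than before it, because the backup itself is what used up the free space.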

                  Dismissing this out of hand doesn't help. Remember, I used the UI to kick off the upgrade and chose what appeared to be the conservative path by creating a backup first. This left the machine in an unusable state. That is completely unacceptable.

                    kejianshi

                    If you have a lot of access to the VM, always make a backup of the config on your desktop somewhere so that if you hit a snag you are not buggered.

                      lexrc

                      Yeah, I'll definitely be checkpointing in the future.  I've looked through the code and found a couple of things:

                      • Doing an upgrade on a nano install does check to see that the image is not larger than the partition (but does not check free space) around line 208. https://github.com/pfsense/pfsense/blob/RELENG_2_2/etc/rc.firmware

                      • The pre-upgrade command does not check for free space and the last thing it does is remove all kernels on line 50. https://github.com/pfsense/pfsense/blob/RELENG_2_2/tmp/pre_upgrade_command

                      So what happens is the backup fills the drive, all kernels are deleted, then the update archive is extracted. The extraction eventually fills the space that was freed by removing the original kernels before the new kernel can be extracted.

                      Fixing this is complicated by the fact that nano installs may not have free space to extract the entire archive, they rely on overwriting files, so checking for free space to fit the entire archive may mean upgrades are blocked when they could possibly complete.

                      I think the easiest way to work around this may be to modify pre_upgrade_command to extract the new kernel into /tmp, verify it is good, then remove old kernels and mv the new /tmp/kernel to /boot/kernel/.  Then if you run out of space extracting the upgrade, you may have mismatched files but you will at least have a kernel file that can boot so you can manually free space and re-extract the archive.
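
To make the ordering concrete, here is a rough sketch of the stage-verify-swap idea. It is written in Python purely for illustration (pre_upgrade_command is a shell script), and the function name, the archive member name and all paths are hypothetical. It also differs slightly from the remove-then-mv wording above: it renames the staged file over the old kernel in one step, which assumes the staging directory is on the same filesystem as the kernel (the mount list in the log shows no separate /tmp mount). The only verification it does is a size comparison.

```python
import os
import shutil
import tarfile
import tempfile


def stage_then_swap_kernel(archive_path, member_name, live_kernel_path, staging_dir):
    """Extract the new kernel to a staging area, check it, then swap it in.

    The old kernel is only replaced after the new one has been written in
    full, so running out of space while staging leaves it untouched.
    """
    with tarfile.open(archive_path) as archive:
        member = archive.getmember(member_name)  # KeyError if not in archive
        source = archive.extractfile(member)
        if source is None:
            raise ValueError(f"{member_name} is not a regular file")
        fd, staged_path = tempfile.mkstemp(dir=staging_dir)
        try:
            with os.fdopen(fd, "wb") as staged:
                shutil.copyfileobj(source, staged)  # OSError if the disk fills
            # Minimal check: the staged size must match the archive's record.
            if os.path.getsize(staged_path) != member.size:
                raise OSError("staged kernel is truncated")
            # Rename over the old kernel; requires the same filesystem.
            os.replace(staged_path, live_kernel_path)
        finally:
            # Clean up the staged copy if the swap did not happen.
            if os.path.exists(staged_path):
                os.remove(staged_path)
```

If staging fails for any reason, the exception propagates before `os.replace` runs, so the box still has its old bootable kernel.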

                      Thoughts?

                        doktornotor Banned

                        Verifying free space on nanobsd would be utterly pointless. The entire slice is overwritten.
