Preparing for Recovery from a failed install/upgrade



  • TL;DR: I want an easy way to create a backup image (excluding one very big directory) and to restore that image (without the big directory) in case of a drive or upgrade failure.

    –-

    pfSense is such a wonderful piece of software that it has become absolutely critical for the operation of my home network.

    Besides being my border firewall, it handles DHCP and routing for multiple VLANs from a managed switch that connects everything on my network.  In short, if an upgrade goes south (or a drive fails), I'm in deep s**t, as I won't have easy access to the internet to get things back up.

    I also have a few custom scripts/cron jobs for monitoring, and several packages that could get screwed that will be a pain to restore as well.

    I would ideally like to have a way to image the critical elements of my pfSense boot disk to a USB drive.

    As things sit at the moment I need to exclude /var/db/ntopng from the backup because it's about 68G.

    du -d 1 -h /
    4.0K    /.snap
    1.1M    /bin
    58M    /boot
    4.0K    /dev
    7.4M    /etc
    9.3M    /lib
    168K    /libexec
    4.0K    /media
    8.0K    /mnt
    4.0K    /proc
    7.6M    /rescue
    501M    /root
    4.7M    /sbin
    536K    /tmp
    849M    /usr
    69G    /var
    9.7M    /cf
    12K    /conf.default
    23M    /home
    71G    /

    du -d 1 -h /var/db/
    4.0K    /var/db/entropy
    4.0K    /var/db/freebsd-update
    4.0K    /var/db/hyperv
    4.0K    /var/db/ipf
    4.0K    /var/db/ntp
    5.9M    /var/db/pkg
    4.0K    /var/db/ports
    4.0K    /var/db/portsnap
    22M    /var/db/rrd
    4.0K    /var/db/pingstatus
    4.0K    /var/db/pingmsstatus
    8.0K    /var/db/suricata
    576K    /var/db/aliastables
    8.0K    /var/db/sudo
    140K    /var/db/fontconfig
    4.0K    /var/db/nut
    68G    /var/db/ntopng
    200K    /var/db/vnstat
    341M    /var/db/pfblockerng
    44K    /var/db/snort
    4.0K    /var/db/redis
    69G    /var/db/

    I would think that imaging 3.5G to a USB drive should go fairly quickly, as would a restore.  (If I lose the ntopng data on restore, no big deal, but I'd prefer not to have to flush it just to do a backup.)

    In an ideal world, I would be able to create the image using a script that runs on the pfSense box while the firewall is running, but if that's not possible, I'm OK with that.  A bootable USB to do the imaging would be fine.

    I'd like to store the backed up image on the bootable USB drive with software necessary to restore the image.

    So if things go south, I just plug a keyboard and monitor into the firewall, boot from the USB drive, restore the image from the USB drive, reboot, and things are right back to where they were.

    ZFS snapshots might be partial solution to this problem if the overhead of ZFS isn't too high, but I'd really like to image before risking an upgrade to 2.4

    I'm not sure if there is an easy way to create a bootable FreeBSD USB with either a second partition (or even just an empty directory) to use for the backup.  A lot of bootable drives don't make that easy.

    Would I run into trouble doing a tar/gzip of / (excluding the /proc and /var/db/ntopng) on a running pfSense?
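
    For reference, a minimal sketch of the tar approach, assuming the tar on the box accepts --exclude (both bsdtar and GNU tar do). The function name backup_tree and the paths in the usage line are made up for illustration, and this has not been tested on a pfSense install. Files that change while tar is running may be archived in an inconsistent state.

```shell
#!/bin/sh
# Sketch only: archive a tree while skipping selected subdirectories.
# backup_tree and its arguments are hypothetical names for illustration.

# usage: backup_tree SOURCE_ROOT ARCHIVE [RELATIVE_DIR_TO_SKIP ...]
backup_tree() {
    src=$1
    archive=$2
    shift 2

    # Rewrite each remaining argument into an --exclude option.
    # Skipped paths are relative to SOURCE_ROOT, e.g. "var/db/ntopng".
    n=$#
    while [ "$n" -gt 0 ]; do
        skip=$1
        shift
        set -- "$@" "--exclude=./$skip"
        n=$((n - 1))
    done

    # -C changes into SOURCE_ROOT so member names are relative ("./etc/...").
    tar -czf "$archive" "$@" -C "$src" .
}
```

    Hypothetical usage with a USB stick mounted at /media/usb (media is skipped so the archive does not try to include itself): backup_tree / /media/usb/root.tgz proc dev media var/db/ntopng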

    What about rsync?  Would it be easy to add rsync to /rescue?

    Thanks in advance. Any comments or suggestions on the best way to accomplish this would be much appreciated.


  • Moderator

    Why isn't reinstalling and restoring your backed-up config.xml a valid way for you to do this? Keep a pfSense installation image available offline (or on a USB key) and make a daily backup of your config.xml. Then reinstall and restore from your config file; packages are reinstalled automatically and your configuration is preserved, too. All you'd lose is RRD graphs, statistics, etc.

    So why do an image backup?



  • @JeGr:

    And why is Re-installing and restoring your backed-up config.xml no valid way for you to do this? Keep an image of pfSense Installation offline available (or a USB key) and make a daily backup of your config.xml. Then re-install and use your config file to restore, packages are reinstalled automatically and config is saved, too. All you'd lose is RRD graphs  and statistics etc.

    So why do an image backup?

    Thanks for the reply… the main reason is certainty.

    I've seen posts of people having trouble restoring configuration files, so I would prefer to have an extra level of backup that I can guarantee I can restore.

    I can test an image restore procedure once, and know that as long as the hardware doesn't change, and the media/hardware doesn't fail I am guaranteed that I can restore and I won't have very much downtime.  I have no way of easily testing that my config will restore without actually breaking the system.
    Downloading all the packages takes longer, and if something goes wrong I've got a nasty mess to deal with.  If there is a cheap (USB drives are cheap) and relatively easy way to accomplish this goal it just seems to make sense.

    Am I missing something?

    Maybe ZFS snapshots will partially eliminate this issue, IF the overhead of ZFS doesn't create a problem.  Is a new install going to expect to duplicate everything on the disk, or just the changes?  If it's not just the changes, then /var/db/ntopng is going to be a problem.

    Comments/suggestions?



  • I think trying to image the drive might be more of an uncertainty than using the config.xml.

    I actually keep a dedicated USB key stuck in my pfSense box, and anytime I upgrade pfSense via the GUI, I recreate the installation key.

    Every night, I have a script that runs from a 'nix VM that:
        1. Grabs a copy of the current config.xml and names it with the current date.
        2. Stores the above in two locations, one on my FreeNAS and one on my primary workstation.
        3. Removes any XML files older than x days (can specify in my script).
        4. My XML also contains my RRD, so I don't even lose that.  File size is ~3.3MB presently.
        5. Sends me an email notifying me of success or failure of the backup, and the file size.
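
    The copy, store and prune steps above can be sketched roughly as follows. This is a simplified illustration that assumes the config.xml has already been fetched to a local file; store_config and the directory names are hypothetical.

```shell
#!/bin/sh
# Sketch only: store a date-stamped copy of a config file in several
# directories, then prune old copies. Names and paths are hypothetical.

# usage: store_config SOURCE_FILE KEEP_DAYS DEST_DIR [DEST_DIR ...]
store_config() {
    src=$1
    keep_days=$2
    shift 2
    stamp=$(date +%Y.%m.%d)

    for dir in "$@"; do
        # Copy first; stop before pruning if the new copy could not be written.
        cp "$src" "$dir/config-$stamp.xml" || return 1
        # Remove copies last modified more than KEEP_DAYS days ago.
        find "$dir" -name 'config-*.xml' -mtime +"$keep_days" -exec rm {} \;
    done
}
```

    Hypothetical usage: store_config /tmp/config.xml 30 /mnt/freenas/backup/pfsense /mnt/workstation/pfsense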

    With the USB key in the pfSense, I can be up and running from a failed machine in about 6 minutes.

    Unfortunately, I've tested it twice in the last month.  Once was with a failed USB drive, and once was today, with a failed 2.4 upgrade.  The upside is that recovery worked in both instances, much more smoothly than I had any right to hope for.



  • You made me paranoid.

    Just realized I hadn't tested my script after today's update to 2.4

    So, I did that.  It works.  Shew.

    Email Output:

    vm-NixServices's backups finished at Fri Oct 13 14:08:17 CDT 2017.

    Backup Process Took: 0 hours 0 minutes 47 seconds.

    7 jobs finished successfully and 0 failed.
        Scripts Backup was successful.
        pfSense Backup was successful. (Filesize:3070 KB.)
        Plex Backup was successful.
        FreeNAS Backup was successful.
        Plex GoDisk was successful.
        Flat Backup was successful.
        Media Backup was successful.

    / 7.0G used, 13G available (36% in use).
    /home 48M used, 38G available (1% in use).
    /mnt/freenas/media 7.9T used, 15T available (35% in use).
    /mnt/x299prime 8.3T used, 6.4T available (57% in use).
    /mnt/godisk 6.1T used, 1.3T available (84% in use).

    System has been up 2 days, 23 hours, 59 minutes.

    0 packages can be updated.
    0 updates are security updates.

    Log attachement set to Off



  • @Craash:

    With the USB key in the pfSense, I can be up and running from a failed machine in about 6 minutes.

    Unfortunately, I've tested it twice in the last month.  Once was with a failed USB drive, and one was today, with a failed 2.4 upgrade.    The upside is recovery worked in both instances much smoother that I had any right to hope for.

    …what could I offer in trade for you to post that script?  I have some really excellent coffee beans, some experimental homebrew that's still in the bottling stages, and a stray cat that I occasionally leave some food out for (could probably box it up cf xkcd.com/325).


  • Netgate

    I've seen posts of people having trouble restoring configuration files

    That is the supported restoration procedure. If everyone tried to take a drive image you would see posts from people having trouble with that, too.


  • Rebel Alliance Developer Netgate

    Honestly, doing a full disk or filesystem image is both overkill and more error prone than relying on the built-in methods.

    Keep regular backups of config.xml. Using ACB (AutoConfigBackup) also helps you here if you have a Gold subscription, or you can script your own in some cases.

    If you only have a failed upgrade and the disk contents are accessible but inconsistent/not working, then you can easily reinstall and recover from that using "Recover config.xml" in the 2.4 installer, and there are also ways to load the config from USB during or just after install.

    All of those would have a firewall back up and running, practically guaranteed, in just a few minutes.



  • @SIGUSRpi:

    @Craash:

    With the USB key in the pfSense, I can be up and running from a failed machine in about 6 minutes.

    Unfortunately, I've tested it twice in the last month.  Once was with a failed USB drive, and one was today, with a failed 2.4 upgrade.    The upside is recovery worked in both instances much smoother that I had any right to hope for.

    …what could I offer in trade for you to post that script?  I have some really excellent coffee beans, some experimental homebrew that's still in the bottling stages, and a stray cat that I occasionally leave some food out for (could probably box it up cf xkcd.com/325).

    Well, I had to strip it down a lot, because it actually runs about 7 different backups on my end, but here it is.  Let me know if I missed anything.

    The user variables end with FILESIZE_CHECK, with the exception of line 82, which is:

    # Delete config files older than 30 days
    find /mnt/freenas/backup/pfsense/ -name '*.xml' -mtime +30 -exec rm {} \;

    Edit the path to your pfSense backups and the number of days you wish to keep them.

    #!/bin/bash
    
    # rSync Backup v2.1 by me @ kaacee.com
    #####################################################
    # Full Featured rSync Script with:
    #   Email Reporting
    #	Filesystem Report
    #	Uptime Reporting
    #	Updates available
    #
    # Requirements:
    # 	sendEmail On Debian, apt-get install sendemail
    #
    # History:
    # v1.0 - 20170215 - 1st release.
    # v1.1 - 20170130 - Improved compatibility.
    #
    # 
    # PS: Feel free to distribute but kindly retain 
    # the credits
    #####################################################
    
    # ------------------------------------------------- #
    # -------------- User Configuration --------------- #
    
    # Send email report to:
    stTo="me@email.com"
    
    # Email from address:
    stFrom="me@email.com"
    
    # Include Log in Email:
    stIncludeLog=No	#Yes or No
    
    # Email server to use:
    # This script uses sendemail (NOT sendmail). 
    # To install: apt install sendemail
    stServer="FILL ME IN"
    
    # This is the ip or DNS name of your pfSense Device
    BACKUP_HOST=pfsense.local.corp
    # This should be a pfSense user with ONLY "WebCfg - Diagnostics: Backup & Restore" permission
    BACKUP_USER=backup
    # Password for this user
    BACKUP_PASSWORD=password
    # Backup Location, can be a local or mounted path
    BACKUP_LOCATION=/mnt/freenas/backup/pfsense/pfSense.local.corp-`date +%Y.%m.%d`.xml
    FILESIZE_CHECK=2048
    
    # Get CSRF token
    wget -qO- --keep-session-cookies --save-cookies cookies.txt \
      --no-check-certificate https://${BACKUP_HOST}/diag_backup.php \
      | grep "name='__csrf_magic'" | sed 's/.*value="\(.*\)".*/\1/' > csrf.txt
    
    # Log into pfSense
    wget -qO- --keep-session-cookies --load-cookies cookies.txt \
      --save-cookies cookies.txt --no-check-certificate \
      --post-data "login=Login&usernamefld=${BACKUP_USER}&passwordfld=${BACKUP_PASSWORD}&__csrf_magic=$(cat csrf.txt)" \
      https://${BACKUP_HOST}/diag_backup.php  | grep "name='__csrf_magic'" \
      | sed 's/.*value="\(.*\)".*/\1/' > csrf2.txt
    
    # Save configuration file
    wget --keep-session-cookies --load-cookies cookies.txt --no-check-certificate \
      --post-data "download=download&__csrf_magic=$(head -n 1 csrf2.txt)" \
      "https://${BACKUP_HOST}/diag_backup.php" -O "${BACKUP_LOCATION}"
    
    # Verify: the saved file must be at least FILESIZE_CHECK KB
    strBackupName="pfSense Backup"   # label for this job in the email report
    FILESIZE=$(( $(wc -c <"${BACKUP_LOCATION}") / 1024 ))
    if [ "$FILESIZE" -ge "$FILESIZE_CHECK" ]; then
        strResults="$strResults     $strBackupName was successful. (Filesize:$FILESIZE KB.)"$'\n'
    else
        strResults="$strResults     $strBackupName failed. (Filesize:$FILESIZE KB.)"$'\n'
    fi
    
    # Clean up
    rm cookies.txt
    rm csrf.txt
    rm csrf2.txt
    unset BACKUP_USER BACKUP_PASSWORD BACKUP_LOCATION FILESIZE_CHECK
    
    # Delete config files older than 30 days
    find /mnt/freenas/backup/pfsense/ -name '*.xml' -mtime +30 -exec rm {} \;
    
    # Calculate Email Subject and format  NOTHING TO EDIT HERE
    strEmailSubject="${BACKUP_HOST}'s Backup Complete"
    
    # Start Sending Email  NOTHING TO EDIT HERE
    /usr/bin/sendEmail -s "$stServer" -f "$stFrom" -t "$stTo" -u "$strEmailSubject" -m<< EOF
    
    $(hostname)'s backups finished at $(date).  
    
    $strResults
    
    $strFileSystemReport
    
    System has been $(uptime -p).
    
    $(/etc/update-motd.d/90-updates-available)
    
    EOF
    
    # End Sending Email
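
    One possible addition, as a sketch: the size check only helps if FILESIZE_CHECK suits the size of your own config, and a failed login may leave an HTML page in the backup file instead of the configuration. A content check is less dependent on size. The helper name is_pfsense_config is made up, and it assumes the root element of config.xml is <pfsense>, as in stock pfSense config files.

```shell
#!/bin/sh
# Sketch only: content check for a downloaded backup file.
# is_pfsense_config is a hypothetical helper, not part of the script above.

# Succeeds only if FILE is non-empty and contains both the opening and
# closing <pfsense> root element (assumed root element of config.xml).
is_pfsense_config() {
    file=$1
    [ -s "$file" ] || return 1
    grep -q '<pfsense>' "$file" || return 1
    grep -q '</pfsense>' "$file" || return 1
}
```

    It could be called right after the download, before BACKUP_LOCATION is unset, for example: is_pfsense_config "${BACKUP_LOCATION}" || echo "backup is not a config.xml"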
    


  • Since I only have one system to backup, I went a simpler route.

    1. Format USB stick as FAT and insert into firewall
    2. Create mount point: mkdir /media/usb
    3. Copy following shell script to firewall

    #!/bin/sh
    DATE=`date +%Y%m%d`
    
    # mkdir /media/usb
    
    mount_msdosfs /dev/da0s1 /media/usb
    # Only copy and prune if the USB stick is actually mounted
    if mount | grep -q /dev/da0s1; then
    	cp /cf/conf/config.xml /media/usb/config_$DATE.xml
    	find /media/usb/ -name 'config_*.xml' -mtime +180 -exec rm {} \;
    	umount /media/usb
    else
    	# An empty else branch is a syntax error in sh, so report the failure
    	echo "backup.sh: /dev/da0s1 is not mounted, no backup written" >&2
    	exit 1
    fi
    
    # install cron package and add cron job
    # 0 4 * * Sun /bin/sh /root/backup.sh > /dev/null
    

    4. Install package cron
    5. Add to cron

    Presto!  Weekly backups, retained for 6 months



  • @jimp:

    Honestly, doing a full disk or filesystem image is both overkill and more error prone than relying on the built-in methods.

    Keep regular backups of config.xml, using ACB also helps you here if you have a Gold subscription, or you can script your own in some cases.

    If you only have a failed upgrade and the disk contents are accessible but inconsistent/not working, then you can easily reinstall and recover from that using "Recover config.xml" in the 2.4 installer, and there are also ways to load the config from USB during or just after install.

    All of those would have a firewall back up and running, practically guaranteed, in just a few minutes.

    Yes, that works fine; I did that too. But I have a few custom files for my DNS resolver, so after I reinstall and restore the config I still need to copy certain files to the correct directory, otherwise Unbound etc. will not start.


  • Rebel Alliance Developer Netgate

    @Music:

    yes that works fine i also did that. But i have a few custom files for my DNS resolver. So when reinstall and then do the config i still need to copy certain files to the correct directory otherwise Unbound etc will not start.

    Then you can utilize the "Backup" package to easily get an archive of those files to restore later.

    Also, if you use a glob to specify your files in the advanced options of the resolver (e.g. "custom*.conf"), then unbound won't fail if it can't find the files.

    Or you could get even trickier and use the System Patches package to store them in config.xml and create them for you, but that's not quite so easy to maintain.



  • @Music:

    @jimp:

    Honestly, doing a full disk or filesystem image is both overkill and more error prone than relying on the built-in methods.

    Keep regular backups of config.xml, using ACB also helps you here if you have a Gold subscription, or you can script your own in some cases.

    If you only have a failed upgrade and the disk contents are accessible but inconsistent/not working, then you can easily reinstall and recover from that using "Recover config.xml" in the 2.4 installer, and there are also ways to load the config from USB during or just after install.

    All of those would have a firewall back up and running, practically guaranteed, in just a few minutes.

    yes that works fine i also did that. But i have a few custom files for my DNS resolver. So when reinstall and then do the config i still need to copy certain files to the correct directory otherwise Unbound etc will not start.

    Add SCP to your script and then you can copy whatever file from whatever location you want.
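
    A minimal sketch of what that could look like, assuming SSH access to the firewall is enabled and key-based login is set up for the user. The function name fetch_files, the user, and the file paths are placeholders.

```shell
#!/bin/sh
# Sketch only. Host, user, and file paths are placeholders.
# SCP can be overridden (e.g. SCP="scp -i /path/to/key"); default is plain scp.
SCP=${SCP:-scp}

# usage: fetch_files USER@HOST DEST_DIR REMOTE_PATH [REMOTE_PATH ...]
fetch_files() {
    remote=$1
    dest=$2
    shift 2
    for path in "$@"; do
        # -p preserves modification times and modes of the copied file.
        # $SCP is left unquoted on purpose so it may carry extra options.
        $SCP -p "$remote:$path" "$dest/" || return 1
    done
}
```

    Hypothetical usage: fetch_files backup@pfsense.local.corp /mnt/freenas/backup/pfsense /var/unbound/custom1.conf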



  • @jimp:

    @Music:

    yes that works fine i also did that. But i have a few custom files for my DNS resolver. So when reinstall and then do the config i still need to copy certain files to the correct directory otherwise Unbound etc will not start.

    Then you can utilize the "Backup" package to easily get an archive of those files to restore later.

    Also if you put use a glob to specify your files in the advanced options of the resolver (e.g. "custom*.conf") then unbound won't fail if it can't find the files.

    Or you could get even trickier and use the System Patches package to store them in config.xml create them for you, but that's not quite so easy to maintain.

    OK, thank you, I will look into that.