Netgate Discussion Forum

    Fun with zfs , snapshots and rollback

    • bingo600
      last edited by bingo600

      NOTE: If you have pfSense Plus, I recommend using the excellent new GUI snapshot feature under System --> Boot Environment.

      I have been reading this thread:
      https://forum.netgate.com/topic/95148/pc-engines-apu2-experiences/577?_=1627030893766

      That sparked my interest in trying ZFS snapshots.

      Initially I went with the "Checkpoint" method from the above thread, but following the instructions there didn't work for me.

      I always got:

      # zpool import zroot
      cannot import 'zroot': pool may be in use from other system
      use '-f' to import anyway
      # zpool import -f zroot
      cannot mount '/zroot': failed to create mountpoint
      

      I decided to go with snapshots instead, and I think I have found a "clean way" to roll back a snapshot: boot from a USB install stick (here 2.4.5-p1) and select the rescue shell.

      Prereqs:
      pfSense must be installed on a ZFS filesystem
      The user must be familiar with ssh and console/shell commands

      Begin:
      ssh as admin into your pfSense Box (Select 8 in the menu to get a shell)

      Pre snapshot: Find the zfs pool name (here it's zroot)

      zfs list
      NAME                 USED  AVAIL  REFER  MOUNTPOINT
      zroot                541M  54.7G    88K  /zroot
      zroot/ROOT           525M  54.7G    88K  none
      zroot/ROOT/default   525M  54.7G   525M  /
      zroot/tmp            184K  54.7G   184K  /tmp
      zroot/var           10.2M  54.7G  10.2M  /var
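
If you'd rather not eyeball the pool name, something like the sketch below could pull it out of the listing. It is only a sketch: root_pool is a made-up helper name, and it assumes the scripted output of zfs list (-H: tab-separated, no header) is piped in.

```shell
#!/bin/sh
# root_pool (hypothetical helper): reads "name<TAB>mountpoint" lines on stdin
# and prints the pool that holds the dataset mounted at "/".
root_pool() {
    awk -F '\t' '$2 == "/" { split($1, parts, "/"); print parts[1]; exit }'
}

# Assumed usage on the pfSense box:
#   zfs list -H -o name,mountpoint | root_pool
```

The pool name is simply the first path component of whichever dataset is mounted at /, so this should not care whether the pool is called zroot or something else.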
      

      Turn on snapshot listing for the pool name found above

      zpool set listsnapshots=on zroot
      

      1: Make a recursive snapshot (the part after the @ is the snapshot name, which you choose yourself)

      zfs snapshot -r zroot@2.4.5-p1
      

      1.a: Make sure snapshots have been taken for all "partitions" (datasets), and note the snapshot name after the @

      zfs list              
      NAME                          USED  AVAIL  REFER  MOUNTPOINT
      zroot                         541M  54.7G    88K  /zroot
      zroot@2.4.5-p1                   0      -    88K  -
      zroot/ROOT                    525M  54.7G    88K  none
      zroot/ROOT@2.4.5-p1              0      -    88K  -
      zroot/ROOT/default            525M  54.7G   525M  /
      zroot/ROOT/default@2.4.5-p1      0      -   525M  -
      zroot/tmp                     184K  54.7G   184K  /tmp
      zroot/tmp@2.4.5-p1               0      -   184K  -
      zroot/var                    10.2M  54.7G  10.2M  /var
      zroot/var@2.4.5-p1               0      -  10.2M  -
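
With more datasets the visual check gets tedious. A rough sketch of an automated check follows; missing_snapshots is a hypothetical helper name, and it assumes a plain list of dataset and snapshot names (one per line) on stdin.

```shell
#!/bin/sh
# missing_snapshots SNAP (hypothetical helper): reads dataset and snapshot
# names on stdin and prints every dataset that has no "@SNAP" snapshot.
missing_snapshots() {
    snap=$1
    awk -v snap="$snap" '
        index($0, "@") == 0 { datasets[$0] = 1; next }   # plain dataset name
        {
            split($0, parts, "@")                        # dataset@snapshot
            if (parts[2] == snap) have[parts[1]] = 1
        }
        END {
            for (d in datasets) if (!(d in have)) print d
        }
    ' | LC_ALL=C sort
}

# Assumed usage on the pfSense box:
#   zfs list -H -o name -t filesystem,snapshot -r zroot | missing_snapshots 2.4.5-p1
```

No output means every dataset in the listing has the snapshot.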
      

      You have now made a snapshot named 2.4.5-p1, and can revert to it if needed.

      2: Roll back to the snapshot (boot from the install USB and select Recovery)

      2.a: Make a "ZFS Root" mountpoint in the tmpfs (the USB disk seems to be read-only)

      mkdir /tmp/mnt
      

      2.b: Import the "ZFS Root"

      zpool import -f -o altroot=/tmp/mnt zroot
      

      2.c: Roll back the interesting partitions (we can discard the tmp partition) and reboot after the rollback.

      zfs rollback zroot/var@2.4.5-p1
      zfs rollback zroot/ROOT/default@2.4.5-p1
      zfs rollback zroot/ROOT@2.4.5-p1
      zfs rollback zroot@2.4.5-p1
      shutdown -r now
      

      3: pfSense should now boot up and be in exactly the state it was in when the snapshot was taken.
      Done

      Cleanup snapshot if desired.
      Destroy/Delete (recursive) the snapshot

      zfs destroy -r zroot@2.4.5-p1
      

      !!! Rollback on a running remote system !!!

      This is DANGEROUS and NOT RECOMMENDED. It might fail, and you will need to power-cycle (reset) the system afterwards, as the shutdown doesn't seem to get executed.
      This method "pulls the carpet" from under all disk partitions except tmp, and causes numerous core dumps during and after the rollback.
      Only use it if you are desperate.

      That said, in my tests (6..8 remote SSH rollbacks) the system came up functioning after a hard remote power off/on.
      You need to be able to do a power off/on (reset), either physically or via some "smart power device - APU ??". Remember that your pfSense network and VPNs will be down.

      1: The rollback commands have to be executed from a shell script. Do NOT try to paste them.
      NB: The pfSense box will drop the SSH connection, and probably all network access will be lost.

      Roll back the interesting partitions (we can discard the tmp partition) and reboot after the rollback.

      1.a: Log in via ssh as admin
      Put the commands below in a shell script file, e.g. x.sh

      #!/bin/sh
      # Roll back every dataset except tmp, then reboot
      zfs rollback zroot/var@2.4.5-p1
      zfs rollback zroot/ROOT/default@2.4.5-p1
      zfs rollback zroot/ROOT@2.4.5-p1
      zfs rollback zroot@2.4.5-p1
      shutdown -r now
      

      1.b: Make x.sh executable

      chmod +x x.sh
      

      1.c: Execute the commands in the newly created shell script x.sh

      ./x.sh
      

      1.d: Now wait 60 seconds, then power the system off and on
      Cross your fingers and pray it comes up on the old snapshot
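
Steps 1.a and 1.b could also be done in one go, which avoids typos in the pool or snapshot name. This is a sketch only: write_rollback_script and its three arguments are names I made up, and it just writes the file; it does not run it.

```shell
#!/bin/sh
# write_rollback_script FILE POOL SNAP (hypothetical helper):
# writes the rollback script for the zroot-style layout and makes it executable.
write_rollback_script() {
    out=$1     # file to create, e.g. x.sh
    pool=$2    # pool name, e.g. zroot
    snap=$3    # snapshot name, e.g. 2.4.5-p1
    {
        printf '#!/bin/sh\n'
        printf 'zfs rollback %s/var@%s\n' "$pool" "$snap"
        printf 'zfs rollback %s/ROOT/default@%s\n' "$pool" "$snap"
        printf 'zfs rollback %s/ROOT@%s\n' "$pool" "$snap"
        printf 'zfs rollback %s@%s\n' "$pool" "$snap"
        printf 'shutdown -r now\n'
    } > "$out"
    chmod +x "$out"
}

# Assumed usage:
#   write_rollback_script x.sh zroot 2.4.5-p1
```

Read the generated file before running it; all the warnings above still apply.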

      Here is a console snippet from during the rollback.
      pfSense is NOT happy, but it has booted the rolled-back system every time after a reset in my tests (8..10 by now, I think).

      login: pid 35748 (openvpn), jid 0, uid 0: exited on signal 11 (core dumped)
      ovpnc1: link state changed to DOWN
      pid 83413 (ladvd), jid 0, uid 142: exited on signal 11
      pid 49695 (cat), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 762 (devd), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 49710 (sshg-parser), jid 0, uid 0: exited on signal 11
      pid 50002 (sshg-blocker), jid 0, uid 0: exited on signal 11
      pid 83262 (ladvd), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 50370 (sh), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 88734 (zabbix_agentd), jid 0, uid 122: exited on signal 11
      pid 24383 (ntpd), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 87613 (zabbix_agentd), jid 0, uid 122: exited on signal 11
      pid 50091 (sh), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 7911 (pcscd), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 37589 (filterlog), jid 0, uid 0: exited on signal 11 (core dumped)
      pflog0: promiscuous mode disabled
      pid 88170 (zabbix_agentd), jid 0, uid 122: exited on signal 11
      pid 90915 (dpinger), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 55087 (bsnmpd), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 49586 (sh), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 90281 (dpinger), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 337 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 3318 (minicron), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 3062 (minicron), jid 0, uid 0: exited on signal 11 (core dumped)
      pid 14668 (getty), jid 0, uid 0: exited on signal 11 (core dumped)
      

      Right now this works for me.
      But I might try out the "Checkpoint" method using the alternate mount at a later point.

      Hope this helps, and explains why ZFS is such a nice thing to use.

      I'm a Linux guy, so don't ask me any BSD-related questions 😊
      And if you don't know ssh and vi, don't even try this.

      Edit: All tests were made on 2.4.5-p1 (snapshot + rollback base) with 2.5.2 as the "upgrade".

      /Bingo

      If you find my answer useful - Please give the post a 👍 - "thumbs up"

      pfSense+ 23.05.1 (ZFS)

      QOTOM-Q355G4 Quad Lan.
      CPU  : Core i5 5250U, Ram : 8GB Kingston DDR3LV 1600
      LAN  : 4 x Intel 211, Disk  : 240G SAMSUNG MZ7L3240HCHQ SSD

      • bingo600 @bingo600
        last edited by bingo600

        @bingo600 said in Fun with zfs , snapshots and rollback:

        zpool set listsnapshots=on zroot

        I just made a snapshot on version 2.5.2.

        Netgate changed the ZFS root name from zroot to pfSense in the new 2.5.2 CE version, and made some other ZFS changes.

        So I decided to do a full reinstall of my "boxes", booting from a 2.5.2 USB stick and reinstalling from scratch.

        This is the new layout on my boxes

        root: zfs list
        NAME                   USED  AVAIL  REFER  MOUNTPOINT
        pfSense               1.02G   222G    96K  /pfSense
        pfSense/ROOT           800M   222G    96K  none
        pfSense/ROOT/default   800M   222G   800M  /
        pfSense/cf            5.58M   222G    96K  /cf
        pfSense/cf/conf       5.48M   222G  5.48M  /cf/conf
        pfSense/home           212K   222G   212K  /home
        pfSense/tmp            476K   222G   476K  /tmp
        pfSense/var            228M   222G  3.37M  /var
        pfSense/var/cache      120K   222G   120K  /var/cache
        pfSense/var/db         223M   222G   223M  /var/db
        pfSense/var/empty       96K   222G    96K  /var/empty
        pfSense/var/log        880K   222G   880K  /var/log
        pfSense/var/tmp        136K   222G   136K  /var/tmp
        

        I just ran the above commands with the new ZFS root, named pfSense

        zfs list
        
        zpool set listsnapshots=on pfSense
        
        zfs snapshot -r pfSense@2.5.2
        

        Here's the layout after the snapshot.

        /root: zfs list
        NAME                         USED  AVAIL  REFER  MOUNTPOINT
        pfSense                     1.02G   222G    96K  /pfSense
        pfSense@2.5.2                   0      -    96K  -
        pfSense/ROOT                 800M   222G    96K  none
        pfSense/ROOT@2.5.2              0      -    96K  -
        pfSense/ROOT/default         800M   222G   800M  /
        pfSense/ROOT/default@2.5.2      0      -   800M  -
        pfSense/cf                  5.58M   222G    96K  /cf
        pfSense/cf@2.5.2                0      -    96K  -
        pfSense/cf/conf             5.48M   222G  5.48M  /cf/conf
        pfSense/cf/conf@2.5.2           0      -  5.48M  -
        pfSense/home                 212K   222G   212K  /home
        pfSense/home@2.5.2              0      -   212K  -
        pfSense/tmp                  476K   222G   476K  /tmp
        pfSense/tmp@2.5.2               0      -   476K  -
        pfSense/var                  230M   222G  3.37M  /var
        pfSense/var@2.5.2               0      -  3.37M  -
        pfSense/var/cache            120K   222G   120K  /var/cache
        pfSense/var/cache@2.5.2         0      -   120K  -
        pfSense/var/db               225M   222G   223M  /var/db
        pfSense/var/db@2.5.2        1.78M      -   223M  -
        pfSense/var/empty             96K   222G    96K  /var/empty
        pfSense/var/empty@2.5.2         0      -    96K  -
        pfSense/var/log              952K   222G   880K  /var/log
        pfSense/var/log@2.5.2         72K      -   880K  -
        pfSense/var/tmp              136K   222G   136K  /var/tmp
        pfSense/var/tmp@2.5.2           0      -   136K  -
        

        I haven't played with restore etc. yet, but I expect it to behave as above.
        We will probably have to take the new partitions introduced in 2.5.2 into account.
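
With this many datasets, hand-typing one rollback line per dataset is error-prone. Below is an untested-on-a-real-box sketch that generates the lines from a dataset list. rollback_cmds is a hypothetical helper name, and skipping only the pool's top-level tmp dataset is my assumption, carried over from the zroot procedure.

```shell
#!/bin/sh
# rollback_cmds SNAP (hypothetical helper): reads dataset names on stdin and
# prints one "zfs rollback" line per dataset, children before parents.
# Assumption: only the top-level <pool>/tmp dataset is skipped.
rollback_cmds() {
    snap=$1
    grep -v -e '^[^/]*/tmp$' | LC_ALL=C sort -r | while IFS= read -r ds; do
        printf 'zfs rollback %s@%s\n' "$ds" "$snap"
    done
}

# Assumed usage on the pfSense box (prints the commands, does not run them):
#   zfs list -H -o name -r pfSense | rollback_cmds 2.5.2
```

It only prints the commands, so you can review them before putting them in a script. Keep in mind that a plain zfs rollback only goes back to the most recent snapshot of a dataset.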

        /Bingo

