Replace failed drive with ZFS doesn't work like a true RAID-1 system



  • ... maybe I'm doing something wrong ...

    For years we've used pfSense and the GEOM mirror when it was available. On the 2.4 branch we started using ZFS mirroring which looked great.

    Yesterday I attempted to replace a failed drive in the ZFS mirror. My steps were:

    Starting with a powered down pfSense machine:

    1. Physically remove the failed drive
    2. Install the replacement drive
    3. Power up the machine and wait for it to boot from the still functional drive
    4. Reach a console and run the "zpool replace" command to resilver the ZFS mirror (a few sanity checks I run first are sketched just below):
    zpool replace zroot /dev/ada1p3 /dev/ada1
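
    For completeness, here's roughly what I check from the console before running the replace. This is just my own habit, not an official procedure, and the device names (ada0/ada1) come from my box:

    # confirm the replacement disk was detected and what name it got
    camcontrol devlist
    # see which partitions (if any) exist on each disk
    gpart show
    # see which device the pool considers failed or missing
    zpool status zroot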
    

    That all worked well enough ...

    zpool status
      pool: zroot
     state: ONLINE
      scan: resilvered 849M in 0h1m with 0 errors on Wed Jan  8 08:41:30 2020
    config:
    
            NAME        STATE     READ WRITE CKSUM
            zroot       ONLINE       0     0     0
              mirror-0  ONLINE       0     0     0
                ada0p3  ONLINE       0     0     0
                ada1    ONLINE       0     0     0
    
    errors: No known data errors
    

    except when I check the partitions on the Good (ada0) drive compared with the New (ada1) drive.

     gpart show
    =>       40  488397088  ada0  GPT  (233G)
             40       1024     1  freebsd-boot  (512K)
           1064        984        - free -  (492K)
           2048    4194304     2  freebsd-swap  (2.0G)
        4196352  484200448     3  freebsd-zfs  (231G)
      488396800        328        - free -  (164K)
    
    =>       34  488397101  ada1  GPT  (233G)
             34    1023966        - free -  (500M)
        1024000     204800     2  efi  (100M)
        1228800  487168335        - free -  (232G)
    
    

    My expectation was that resilvering the ZFS mirror would leave me with two identical disks, like GEOM mirroring does.

    That's not the case.

    Now, I've read the sticky post about manually setting up partitions and then performing a resilver --- and perhaps that's the correct way to replace a failed drive in a ZFS mirrored system.

    I'm looking for thoughts on what I may be doing wrong. Or perhaps I should reset my expectations: having a mirrored ZFS system doesn't mean I can swap drives the way I used to with GEOM, and there's a lot more work to replace a failed drive starting in pfSense 2.4.

    And, I'm curious to know how the ZFS mirror is set up during the initial install of pfSense. Perhaps knowing those steps would reveal a clear answer to how we should be replacing failed drives in v2.4 with ZFS mirroring.
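
    For what it's worth, based on the gpart output from my good drive, my guess is the installer partitions each disk and then mirrors only the third (freebsd-zfs) partition. Something roughly like this, reconstructed from the layout rather than from the actual install script (labels and exact options are surely different):

    # guessed per-disk layout (ada0 shown); sizes taken from my gpart output
    gpart create -s gpt ada0
    gpart add -a 4k -s 512k -t freebsd-boot ada0   # p1: boot code
    gpart add -a 1m -s 2g -t freebsd-swap ada0     # p2: swap, not mirrored by ZFS
    gpart add -a 1m -t freebsd-zfs ada0            # p3: the rest, for ZFS
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
    # ...repeat on ada1, then create a pool that mirrors only the p3 partitions
    zpool create zroot mirror ada0p3 ada1p3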

    Thanks in advance, Jason


  • Rebel Alliance Developer Netgate

    @cfapress said in Replace failed drive with ZFS doesn't work like a true RAID-1 system:

    /dev/ada1p3 /dev/ada1

    You replaced one slice with an entire disk, so it did what you asked (which wasn't right). I think you could have just run zpool replace and it would have figured out that it should redo the missing disk.
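
    In other words, the difference is just what you point the last argument at. Something like:

    # what was run: rebuild onto the whole disk (no partition table, boot, or swap)
    zpool replace zroot /dev/ada1p3 /dev/ada1

    # what was intended: rebuild onto a matching partition on the new disk
    # (this assumes ada1 has already been partitioned like ada0)
    zpool replace zroot /dev/ada1p3 /dev/ada1p3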



  • I did try 'zpool replace' but it wouldn't accept that command without the name of the pool, old device and new device.

    zpool replace
    missing pool name argument
    usage:
            replace [-f] <pool> <device> [new-device]
    

    I also noted the ZFS pool is only mirroring a partition on the Good (ada0) drive. It's not mirroring the full disk as I was expecting (hoping, really, since that's what GEOM did).

    Is your suggestion to follow the instructions in the forum's sticky thread regarding ZFS when replacing a failed drive - manually partition and then resilver pointing at the explicit partition to be used?

    https://forum.netgate.com/topic/112490/how-to-2-4-0-zfs-install-ram-disk-hot-spare-snapshot-resilver-root-drive
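
    If I'm reading that thread right, the idea would be something like the following. This is untested on my part and assumes the partition table can simply be copied over from the good drive:

    # copy the good drive's partition table onto the new drive
    gpart backup ada0 | gpart restore -F ada1
    # write the boot code to the new drive
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
    # resilver onto the explicit partition
    zpool replace zroot ada1p3 ada1p3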

    My experience with ZFS is limited to FreeNAS. They have a GUI that takes care of resilvering boot drives and perhaps they perform some partitioning prior to resilvering too.

    I'm having trouble locating information regarding ZFS and replacing failed drives with pfSense. The documentation is light in terms of ZFS information:
    https://docs.netgate.com/pfsense/en/latest/book/install/perform-install.html


  • Rebel Alliance Developer Netgate

    I haven't had to replace one yet myself (or simulated a failure in a VM), but I thought you should be able to just say zpool replace zroot ada1 in your case and it would do the right thing. The "new device" should be optional, since the device name may not change if you put the replacement in exactly the same place.
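
    Roughly, per zpool(8), and assuming the pool member is the partition (ada1p3 here) and the replacement partition already exists under the same name:

    # new_device defaults to the old device when omitted, which is handy
    # when the replacement ends up in exactly the same place
    zpool replace zroot ada1p3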

    The docs for ZFS are still sparse (if any exist) because it's new and considered experimental.



  • After additional trials I can confirm that replacing a ZFS mirrored drive is non-trivial.

    My pfSense boxes are old decommissioned Windows workstations. Too old to be useful for Windows, but perfect for pfSense routers.

    Here are the steps I took to replace the ZFS mirrored drive.

    1. The new drive must be completely clean. If it was previously partitioned, you need to clear that away before installing it in the pfSense machine.

    In my case I'm using old HDDs from decommissioned workstations. They have Windows partitions, in particular an EFI partition that I couldn't remove with "gpart". I had to use the Windows "diskpart" tool to unprotect that partition and then destroy it before continuing to Step 2 below.

    2. Plan to manually add all the partitions to the new drive. Use "gpart" to view the partitions on the Good drive and then partition the New drive identically. An example, taken from a prior forum post (a consolidated sketch for my exact layout follows below):
    # gpart create -s gpt da2
    # gpart add -a 4k -s 512k -t freebsd-boot -l gptboot2 da2 ###This creates p1, you are using 4k alignment, size is 512k, type is freebsd-boot, label is gptboot2, you are partitioning drive da2
    # gpart add -b 2048 -s 8384512 -t freebsd-zfs -l zfs2 da2 ###This creates p2, starting at block 2048 with a size of 8384512 blocks, type is freebsd-zfs, label is zfs2
    # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da2 ###This writes the bootcode to p1 of your hot spare
    
    3. Once partitioned, you must "zpool remove" the old dead drive's partition from the pool. This breaks the ZFS mirror.

    Example:

    zpool remove zroot /dev/ada1p3
    
    1. Then "zpool attach" the new good drive. The creates a new ZFS mirror.

    Example:

    zpool attach zroot /dev/ada0p3 /dev/ada1p3
    

    It's important to note that the ZFS mirror is online but it's only mirroring a portion of the hard drive. The boot and swap partitions are not mirrored. It's not a true RAID-1 situation as I expected, like with GEOM.
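
    Putting steps 1 through 4 together for my exact layout, here's the full sequence as I'd sketch it. Treat it as a sketch only: the sizes come from my "gpart show" output, the device names assume the good drive is ada0 and the new drive is ada1, and depending on your ZFS version "zpool detach", or a direct "zpool replace" against the partition as shown here, may stand in for the remove/attach pair above. Check everything against gpart(8) and zpool(8) before touching a production box.

    # 1. wipe any leftover partitioning on the new drive
    #    (if gpart refuses, clear it elsewhere first, as I had to with diskpart)
    gpart destroy -F ada1

    # 2. partition ada1 to match ada0
    gpart create -s gpt ada1
    gpart add -a 4k -s 512k -t freebsd-boot ada1
    gpart add -a 1m -s 2g -t freebsd-swap ada1
    gpart add -a 1m -t freebsd-zfs ada1
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

    # 3 and 4. with the failed member showing UNAVAIL, a single replace
    # drops it and resilvers onto the new partition
    zpool replace zroot ada1p3 ada1p3

    # verify
    zpool status zroot
    gpart show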

    And, I suggest you read about how to use zpool before altering any production systems. Everything I've shown above came from testing with spare hard drives in an old desktop computer.
    REFERENCE: https://www.freebsd.org/cgi/man.cgi?zpool(8)



  • Try thinking of it this way: just buy a used IBM-branded RAID card on eBay (or LSI, Dell, Adaptec, MSI, SiliconImage), preferably PCI-X 2.0 or 2.1, install two HDDs (I suggest the bullet-proof Ultrastar 7K3000 3TB 3.5-inch enterprise SATA drive, model HUA723030ALA640), configure them as a mirror, and sleep well for another 5 years. :)

    You didn't say exactly where the failure happened (in an appliance or a desktop), so I'm assuming you have a desktop system.

    Better to spend $50 on a card and sleep well for 5 years than to spend hours with a failed drive, dancing around ZFS.

