Replace failed drive with ZFS doesn't work like a true RAID-1 system
-
... maybe I'm doing something wrong ...
For years we've used pfSense and the GEOM mirror when it was available. On the 2.4 branch we started using ZFS mirroring which looked great.
Yesterday I attempted to replace a failed drive in the ZFS mirror. My steps were:
Starting with a powered down pfSense machine:
- Physically remove the failed drive
- Install the replacement drive
- Power up the machine and wait for it to boot from the still functional drive
- Reach a console and run the "zpool replace" command to resilver the ZFS array:
zpool replace zroot /dev/ada1p3 /dev/ada1
That all worked well enough ...
zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 849M in 0h1m with 0 errors on Wed Jan 8 08:41:30 2020
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1    ONLINE       0     0     0

errors: No known data errors
except when I check the partitions on the Good (ada0) drive compared with the New (ada1) drive.
gpart show
=>       40  488397088  ada0  GPT  (233G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048    4194304     2  freebsd-swap  (2.0G)
    4196352  484200448     3  freebsd-zfs  (231G)
  488396800        328        - free -  (164K)

=>        34  488397101  ada1  GPT  (233G)
          34    1023966        - free -  (500M)
     1024000     204800     2  efi  (100M)
     1228800  487168335        - free -  (232G)

My expectation was that resilvering the ZFS mirror would give me two identical disks, like with GEOM mirroring.
That's not the case.
Now, I've read the sticky post about manually setting up partitions and then performing a resilver --- and perhaps that's the correct way to replace a failed drive in a ZFS mirrored system.
I'm looking for thoughts on what I may be doing wrong. Or perhaps I should reset my expectations: having a mirrored ZFS system may not mean I can swap drives the way we used to with GEOM, and replacing a failed drive takes a lot more work starting in pfSense 2.4.
And, I'm curious to know how the ZFS mirror is set up during the initial install of pfSense. Perhaps knowing those steps would reveal a clear answer to how we should be replacing failed drives in v2.4 with ZFS mirroring.
Thanks in advance, Jason
-
@cfapress said in Replace failed drive with ZFS doesn't work like a true RAID-1 system:
/dev/ada1p3 /dev/ada1
You replaced one partition (ada1p3) with an entire disk (ada1), so it did what you asked (which wasn't right). I think you could have just run
zpool replace
and it would have figured out that it should redo the missing disk.
-
I did try 'zpool replace' but it wouldn't accept that command without the name of the pool, old device and new device.
zpool replace
missing pool name argument
usage:
        replace [-f] <pool> <device> [new-device]
I also noted the ZFS pool is only mirroring a partition on the Good (ada0) drive. It's not mirroring the full disk as I was expecting (hoping since that's what GEOM did).
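One way to see at a glance whether a pool member is a partition or a whole disk is to filter the "zpool status" output. This is only a minimal sketch, not a pfSense tool: the function name whole_disk_members is made up for illustration, and it assumes device names of the form adaN or daN as seen in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status` text on stdin and prints pool
# members that are whole disks (e.g. ada1) rather than partitions (e.g. ada0p3).
# Assumes adaN/daN device naming; other naming schemes would need another pattern.
whole_disk_members() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
         $1 ~ /^(ada|da)[0-9]+$/ { print $1 }'
}

# On a live system this would be used as:
#   zpool status zroot | whole_disk_members
```

Fed the status output from the first post, it should print only ada1, which is the member that was added as a whole disk.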
Is your suggestion to follow the instructions in the forum's sticky thread regarding ZFS when replacing a failed drive - manually partition and then resilver pointing at the explicit partition to be used?
https://forum.netgate.com/topic/112490/how-to-2-4-0-zfs-install-ram-disk-hot-spare-snapshot-resilver-root-drive
My experience with ZFS is limited to FreeNAS. They have a GUI that takes care of resilvering boot drives and perhaps they perform some partitioning prior to resilvering too.
I'm having trouble locating information regarding ZFS and replacing failed drives with pfSense. The documentation is light in terms of ZFS information:
https://docs.netgate.com/pfsense/en/latest/book/install/perform-install.html
-
I haven't had to replace one yet myself (or simulated a failure in a VM) but I thought you should be able to just say
zpool replace ada1
in your case and it would do the right thing. The "new device" should be optional, since the name may not change if you replace the disk exactly the same. The docs are sparse (if any exist) for ZFS because it's still new and considered experimental.
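Going by the usage line quoted earlier in the thread (replace [-f] <pool> <device> [new-device]), the pool name is always required and only the new device is optional. The sketch below just prints the two forms so they can be compared; replace_cmd is a made-up helper name, ada2p3 is a hypothetical device, and nothing is executed.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints a `zpool replace` command line that
# follows the usage `replace [-f] <pool> <device> [new-device]`.
# It only echoes the command; it does not touch any pool.
replace_cmd() {
    pool=$1
    old=$2
    new=${3:-}
    if [ -n "$new" ]; then
        echo "zpool replace $pool $old $new"
    else
        # new-device omitted: the replacement appears under the same device name
        echo "zpool replace $pool $old"
    fi
}

replace_cmd zroot ada1p3          # replacement at the same name
replace_cmd zroot ada1p3 ada2p3   # hypothetical: replacement at a different name
```

In this thread the failed pool member was the partition ada1p3, not the disk ada1, so the new disk would presumably need a matching p3 partition before either form is run.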
-
After additional trials I can confirm that replacing a ZFS mirrored drive is non-trivial.
My pfSense boxes are old decommissioned Windows workstations. Too old to be useful for Windows but perfect for pfSense routers.
Here are the steps I took to replace the ZFS mirrored drive.
- The new drive must be completely clean. If it was previously partitioned, you need to clear that away before installing it in the pfSense machine.
In my case I'm using old HDDs from decommissioned workstations. They have Windows partitions, in particular an EFI partition that's impossible to replace with "gpart". I had to use the Windows "diskpart" tool to unprotect that partition and then destroy it before continuing to Step 2 below.
- Plan to manually add all the partitions to the new drive. You should use "gpart" to view the partitions on the Good drive and then partition the New drive identically. Here is an example, taken from a prior forum post:
# gpart create -s gpt da2
# gpart add -a 4k -s 512k -t freebsd-boot -l gptboot2 da2
### This creates p1: you are using 4k alignment, size is 512k, type is freebsd-boot, label is gptboot2, you are partitioning drive da2
# gpart add -b 2048 -s 8384512 -t freebsd-zfs -l zfs2 da2
### This creates p2: you are beginning at block 2048 and the size is 8384512 blocks
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da2
### This writes the bootcode to p1 of your hot spare
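As an alternative to retyping each "gpart add" line, FreeBSD's gpart also has "backup" and "restore" subcommands that can copy a partition table from one disk to another. The sketch below is an assumption about how that could be applied here, not what was done in this thread. clone_plan is a made-up name, and it only prints the commands so they can be reviewed before anything is run as root.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints commands that would copy the partition
# table of a good disk onto a new disk. Nothing is executed.
clone_plan() {
    good=$1
    new=$2
    # `gpart backup` dumps the table as text; `gpart restore -F` destroys any
    # existing table on the target before recreating the partitions there.
    echo "gpart backup $good | gpart restore -F $new"
    # Show both tables so the layouts can be compared by eye.
    echo "gpart show $good $new"
}

clone_plan ada0 ada1
```

This copies only the partition layout. Boot code and the contents of the ZFS partition still have to be written separately.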
- Once partitioned you must "zpool detach" the old dead drive partition. This breaks the ZFS mirror.
Example:
zpool detach zroot /dev/ada1p3
- Then "zpool attach" the new good drive. This creates a new ZFS mirror.
Example:
zpool attach zroot /dev/ada0p3 /dev/ada1p3
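The last two steps can be written out as a dry run for review first. This is a sketch under assumptions: remirror_plan is a made-up name, the device names are the ones from this thread, and it uses "zpool detach", which zpool(8) describes as the subcommand that detaches a device from a mirror. It prints the commands and executes nothing.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands to drop a dead mirror
# member and attach a freshly partitioned replacement. Nothing is executed.
remirror_plan() {
    pool=$1
    good_part=$2   # surviving member, e.g. ada0p3
    dead_part=$3   # failed member as listed by `zpool status`
    new_part=$4    # freebsd-zfs partition on the new disk
    echo "zpool detach $pool $dead_part"
    echo "zpool attach $pool $good_part $new_part"
    # Resilvering starts after the attach; status shows its progress.
    echo "zpool status $pool"
}

remirror_plan zroot ada0p3 ada1p3 ada1p3
```

Keeping it as printed text fits the advice to read up on zpool before touching a production system.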
It's important to note that the ZFS mirror is only mirroring a portion of the hard drive. The boot and swap partitions are not mirrored. It's not a true RAID-1 situation as I expected, like with GEOM.
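Since the boot partition is outside the mirror, the boot code has to be written to the new drive separately. A small sketch of how the right partition index could be looked up from "gpart show" text: part_index and bootcode_cmd are made-up names, the sample table is the ada0 layout from the first post, and the /boot/pmbr and /boot/gptzfsboot paths are taken from the gpart example above. Only a command line is printed.

```shell
#!/bin/sh
# Hypothetical helpers. part_index reads `gpart show <disk>` text on stdin
# and prints the index of the first partition of the requested type.
part_index() {
    awk -v want="$1" '$4 == want { print $3; exit }'
}

# Prints (does not run) the bootcode command for a partition index and disk.
bootcode_cmd() {
    echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i $1 $2"
}

# Sample input: the ada0 table from the first post.
sample='=>       40  488397088  ada0  GPT  (233G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048    4194304     2  freebsd-swap  (2.0G)
    4196352  484200448     3  freebsd-zfs  (231G)
  488396800        328        - free -  (164K)'

idx=$(printf '%s\n' "$sample" | part_index freebsd-boot)
bootcode_cmd "$idx" ada1
```

On a live system the sample text would be replaced by the real "gpart show ada0" output. The swap partition on the new drive is likewise separate from the mirror.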
And, I suggest you read about how to use zpool before altering any production systems. Everything I've shown above came from testing with spare hard drives in an old desktop computer.
REFERENCE: https://www.freebsd.org/cgi/man.cgi?zpool(8)
-
Try thinking of it this way: just buy a used IBM-branded RAID card on eBay (or LSI, Dell, Adaptec, MSI, SiliconImage), preferably PCI-X 2 or 2.1, install two HDDs (I suggest the bullet-proof Ultrastar 7K3000 3TB 3.5-inch enterprise SATA hard drive, model HUA723030ALA640), configure them as a mirror, and sleep well for another 5 years. :)
You didn't say exactly where the failure happened (in an appliance or a desktop), so I assume you have a desktop system.
Better to spend $50 on a card and sleep well for 5 years than to spend hours with a failed drive, dancing around ZFS.