pfSense 2.4 ZFS File System
-
@kpa:
I wouldn't use spares on a boot pool. It's not worth the effort, and you might run into complications just like this one because at boot time the ZFS boot code wants to probe every device in the pool. If you still want to use spares on a boot pool, the spare must be partitioned properly beforehand and it has to have the ZFS boot blocks just like the other disks, in case it is selected as the boot device.
Hot spares are basically a feature for very large data pools with serious availability concerns when a disk breaks and has to be replaced. A firewall/router is hardly such a use case.
I'm definitely not trying to use spares as a normal boot pool solution. I want the hot spare(s) to be properly configured to boot ahead of time so that if a boot drive fails and the hot spare is placed into the pool, the system will still be able to boot if it has to.
As I understood it these commands are partitioning the spare and installing the boot blocks to it?
# gpart create -s gpt adaX
# gpart add -a 4k -s 512k -t freebsd-boot -l gptbootX adaX
# gpart add -b 2048 -s 8384512 -t freebsd-zfs -l zfsX adaX
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 adaX
Obviously, as you pointed out it doesn't seem to be working. I'll try partitioning with a stop at the end of p2 as you suggested and see if that works.
I'm hoping (and assuming) that it is just something that I am messing up, not a feature that just doesn't exist/work.
The use case in my mind is if you set up a system somewhere that you won't have frequent access to if you need to replace a bad disk. This way zfs just resilvers the bad disk and in the event that the system needs to reboot for whatever reason, it does and everything works just fine with a fresh disk in the pool until you can get around to replacing it.
About the only way I could see this making sense is if you have the above constraints on accessing the system AND
1. Need an exceptionally reliable firewall
2. Are on a tight budget and using cheap install media like thumb drives
3. Will literally never physically touch the hardware again and want a system that just works in a closet for a VERY long time
Definitely fringe cases, and if it is something that just doesn't work (at all or well) with ZFS then it should be avoided, but if it's something that you can do with a few simple commands and it works then it would be useful. Primarily for tight budgets that want to install on thumb drives.
EDIT: Adjusting the partitions to exactly match the rest of the pool still doesn't boot.
Two more things I'm thinking:
1. Possibly reinstall bootcode to all devices so that they match up with the new ada numbering?
i.e., remove ada0, and the hot spare, which was ada4, is now ada3; ada1=ada0, ada2=ada1, ada3=ada2.
So redo bootcode & labels (although I would think once bootcode is in p1 it doesn't matter what ada it is? and idk how much labels matter for booting?) so that EITHER:
The NEW ada3=ada3, ada0=ada0, ada1=ada1, ada2=ada2
OR
The new ada3=ada0, or whatever place it took in the pool
2. gpart list shows "mode", or what appears to be permissions: r0w0e0 for all p1's, while the p2's are different on the spare. Possibly changing this to match the rest, but I don't know why it would matter since it's booting from p1?
If anyone has thoughts on what I'm messing up to get this to work I'd appreciate them!
-
Yes, if the spare was ada3 or anything that is on the SATA bus. But your spare shows up as 'da4' in your earlier post, so you have to adjust your commands for da4. Also, the spare cannot be just 'da4'; it has to be the freebsd-zfs partition on it, which is 'da4p2' after partitioning. If you read the Sun ZFS documentation you probably thought that the spare would be there as a whole disk and the system would automatically sync the partitions on it as well. This is not the case on FreeBSD.
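For anyone following along, the advice above can be sketched as a small dry-run script. It only prints the commands rather than running them. The pool name zroot, the disk da4 and the label index 4 are placeholders (assumptions, not taken from the poster's system), and the partition offsets are copied from the commands posted earlier in the thread, so they would need to match the existing pool members.

```shell
#!/bin/sh
# Dry run: print the commands that would prepare a disk as a bootable
# hot spare. Nothing is executed against the disk or the pool.
# "zroot", "da4" and the index "4" below are hypothetical placeholders.
spare_cmds() {
    disk=$1   # e.g. da4 (USB/SCSI) or ada4 (SATA)
    pool=$2   # name of the boot pool
    idx=$3    # number used in the GPT labels
    echo "gpart create -s gpt $disk"
    echo "gpart add -a 4k -s 512k -t freebsd-boot -l gptboot$idx $disk"
    echo "gpart add -b 2048 -s 8384512 -t freebsd-zfs -l zfs$idx $disk"
    echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
    # The spare is the freebsd-zfs partition (p2), not the whole disk.
    echo "zpool add $pool spare ${disk}p2"
}

spare_cmds da4 zroot 4
```

Reviewing the printed commands before pasting them into a root shell makes it harder to partition the wrong device by accident.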
-
OK thanks, I'll give that a shot. I really appreciate your help! I did read Sun's documentation and eventually figured that out about the partitions (and some other differences); that's not the way it's set up anymore.
And ignore the earlier post, that's my actual pfsense install. I'm troubleshooting all of this on a VM before I do anything with my actual system. The real box doesn't use a hot spare like this.
On the VM all of the drives are adaX.
-
@kpa:
When using a single drive how do you tell ZFS to keep 2 copies?
You have to do this right after the pool creation before any datasets are created or files are written on it:
zfs set copies=2 zpool
This property is a dataset property but gets inherited by any child datasets so it applies to them as well.
Thanks for the answer.
Setting this at the command line - is it permanent or does it need to be put into a file?
-
ZFS properties are stored in the pool metadata, so they are permanent.
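One way to confirm this after a reboot is to read the property back with zfs get and look at the source column. The sketch below is only an illustration: the helper name summarise_copies and the pool name zroot are made up for the example, and the sample input stands in for real output so it can run on a machine without a pool.

```shell
#!/bin/sh
# Summarise `zfs get -H -o name,value,source copies ...` output from stdin.
# With -H the fields are tab-separated and no header line is printed.
summarise_copies() {
    tab=$(printf '\t')
    while IFS="$tab" read -r name value source; do
        echo "$name keeps $value copies (source: $source)"
    done
}

# On a live system (pool name "zroot" is a placeholder):
#   zfs get -r -H -o name,value,source copies zroot | summarise_copies
# Sample input so the sketch runs without a pool:
printf 'zroot\t2\tlocal\nzroot/var\t2\tinherited from zroot\n' | summarise_copies
```

A source of "local" means the value was set on that dataset, and "inherited from ..." shows a child dataset picking it up from its parent.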
-
Thanks again.
Do you believe there is any benefit in setting copies to 2? I've been reading a bit about it and from what I've read it has an impact on speed.
Also, when the page I was reading tested its ability to stop corruption, it reduced the amount of corruption but didn't eliminate it. So it left me under the impression that it would have some merit where important data or photos are stored, but not so much on a firewall.
Would I be correct?
-
Any kind of redundancy has at least a small effect on write speeds. It's unavoidable because the data has to be duplicated somehow, be it a straight second copy or some kind of parity system like you have on raid-z. Two copies is not a bad idea on a single disk if you can't use a two-disk mirror for some reason. It can save your bacon, because disks don't usually blow up completely just like that but slowly start to develop a bad sector here and a bad sector there, and with two copies of the same data it's very unlikely that you lose both copies at the same time.
-
Well I tried modifying the labels on all of the partitions to match their ada#, and reinstalled the bootcode to each p1. Nothing changed though, still same boot error.
gpart modify -l gptboot0 -i 1 ada0
gpart modify -l zfs0 -i 2 ada0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
# and so on for the rest of the drives
Any more ideas? It seems like this should be doable?
-
Finally figured it out. Detach the bad drive from the pool and the hot spare becomes a permanent part of the pool.
I detached the bad disk after resilvering was complete, the hot spare became part of the pool, and the system reboots as if nothing changed. I then went on to remove two more disks that were part of the original pool and it still boots great with the hot spare and one disk from the original pool.
zpool detach poolname baddiskname
https://blogs.oracle.com/eschrock/entry/zfs_hot_spares
If you want a hot spare replacement to become permanent, you can zpool detach the original device, at which point the spare will be removed from the hot spare list of any active pools.
So at the end of the day it looks like it could potentially be useful for replacing a boot disk remotely, so long as you SSH in and detach the bad disk. Or you could write some sort of script that would detach the bad disk after resilvering is complete. But that's beyond me.
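A rough sketch of what such a script might look like is below. It is not a tested solution. It assumes that zpool status reports "resilver in progress" while the spare is still being filled, and the pool name zroot and device ada0p2 are placeholders. By default it only prints the detach command. Setting RUN to an empty string would make it run the command for real.

```shell
#!/bin/sh
# Return success if the given `zpool status` text shows a running resilver.
resilver_running() {
    case $1 in
        *"resilver in progress"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Detach the failed disk once no resilver is running.
# RUN defaults to "echo" (print only); export RUN= to really detach.
detach_when_done() {
    pool=$1     # pool name, e.g. zroot (placeholder)
    bad=$2      # failed device, e.g. ada0p2 (placeholder)
    status=$3   # text of `zpool status <pool>`
    if resilver_running "$status"; then
        echo "resilver still running on $pool; not detaching $bad"
        return 1
    fi
    ${RUN-echo} zpool detach "$pool" "$bad"
}

# On a live system: detach_when_done zroot ada0p2 "$(zpool status zroot)"
# Sample status line so the sketch runs without a pool:
detach_when_done zroot ada0p2 "  scan: resilvered 1.20G with 0 errors"
```

A real version would probably also want to check that the resilver finished without errors and that the spare is actually in use before detaching anything. It could be run from cron until the detach succeeds.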
-
I typed up a basic ZFS How To here if anyone's interested.