How to boot from the ZFS mirror when one disk has failed?
-
Yesterday my mini PC running pfSense CE v2.7 failed to boot after a planned reboot (it had been behaving strangely lately). Connecting HDMI let me quickly find out that one of the SSDs had failed, but somehow the other SSD couldn't finish the boot process. Halfway through, it hit this error:
    ...
    Configuring crash dumps...
    Using /dev/ada0p3 for dump device.
    Can't open '/dev/gpt/efiboot0'
    /dev/gpt/efiboot0: UNEXPECTED INCONSISTENCY: RUN fsck_msdosfs MANUALLY.
    THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:
    msdosfs: /dev/gpt/efiboot0 (/boot/efi)
    Preen mode recommended running a check that will be performed now now.
    Warning: Trying to recover filesystem from inconsistency...
    ERROR: Impossible to mount filesystem., use interactive shell to attempt to recover it
    #
Taking out the bad disk gave the same output.
At this stage, if I ran `zfs pool` I could still see all the ZFS pools, and I did manage to recover the config from `/conf/config.xml`, after manually mounting `/pfSense/ROOT/default/cf`, which was not auto-mounted.
Many filesystems were read-only and I couldn't mount all ZFS datasets as the `export.lock` file was read-only, too (probably the reason why the `/pfSense/ROOT/default/cf` dataset wasn't mounted).
`/dev/gpt/efiboot0` was from the bad drive, so as a Linux person with no FreeBSD experience I wanted to edit `/etc/fstab` and change the `esp` partition to `/dev/gpt/efiboot1`, hoping it would boot. For editors, I've only used `nano` in my life (which was not present in pfSense), so I tried `vi`. When trying to save after editing, I was told the file was read-only, then I couldn't manage to leave with `:q`, `:wq`, `:wq!`, `ESC` or `Ctrl + C`, and the screen became very messy. BGM Hotel California intensified.
The family was waiting for the internet to come back, so I gave up recovering the ZFS mirror and re-installed a single-disk version with the recovered `config.xml`, so we're good for now.
May I ask what the steps are to boot from the good drive(s) when we use a ZFS mirror, when pfSense normally booted from the failed drive?
For example, in Linux the EFI partition cannot be ZFS, and has to be synchronized manually across the mirrors. Is it the same with pfSense / FreeBSD?
Does pfSense synchronize the "EFI" and "boot" mountpoints? If not, is there any way to make sure that when any one of the disks fails, one can still boot from the other good disk(s) temporarily until we can resilver the mirror?
Thanks a lot!
-
Probably this unfortunately: https://redmine.pfsense.org/issues/15083
Steve
-
@stephenw10 Thank you very much, that looks like it. I'll give it a try soon, after I receive my new disk and re-create the mirror. Thanks again!
-
Hmm, actually it may not be that issue. Some testing here shows it's incorrectly trying to mount the efi partition using a missing label. But it doesn't need to mount that at all.
Simply commenting out that line from the fstab should allow it to boot.
You can use the included Easy Editor, `ee`, to avoid vi-induced insanity.
Steve
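For anyone who would rather not open an editor at all, the "comment out that line" step can be sketched with `sed`. This is only a sketch under assumptions: it works on a scratch copy (`/tmp/fstab.example` is a made-up path, not the real `/etc/fstab`), the sample contents are the two fstab lines quoted later in this thread, and on the real box the root filesystem may first need to be made writable, which is not shown here.

```sh
# Work on a scratch copy so the sketch is safe to run anywhere.
# FSTAB is a hypothetical path; on the real box it would be /etc/fstab.
FSTAB="${FSTAB:-/tmp/fstab.example}"

# Sample contents, taken from the fstab lines quoted in this thread.
cat > "$FSTAB" <<'EOF'
/dev/gpt/efiboot0 /boot/efi msdosfs rw 2 2
/dev/ada0p3 none swap sw 0 0
EOF

# Prefix every not-yet-commented line that mounts /boot/efi with '#'.
# Writing to a new file avoids the BSD/GNU differences of 'sed -i'.
sed '/^[^#].*[[:space:]]\/boot\/efi[[:space:]]/s/^/#/' "$FSTAB" > "$FSTAB.new"
mv "$FSTAB.new" "$FSTAB"

# Show the result for a visual check before rebooting.
cat "$FSTAB"
```

After this runs, the `/boot/efi` line in the scratch file starts with `#` and the swap line is left alone. Matching on the mount point rather than the label means it works whichever `efibootN` label the line happens to use.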
-
@stephenw10 Thank you sir, it worked beautifully!
Following the guide, I found `/dev/gpt/efiboot1` was indeed empty, so if it were the remaining `esp` (EFI) partition, there would be nothing to boot from.
So I copied everything, as shown in the guide, from `/boot/efi/` (i.e. `/dev/gpt/efiboot0`) to `/dev/gpt/efiboot1`.
I then used `ee` to edit `/etc/fstab`, changing:
    /dev/gpt/efiboot0 /boot/efi msdosfs rw 2 2
    /dev/ada0p3 none swap sw 0 0
to:
    #/dev/gpt/efiboot0 /boot/efi msdosfs rw 2 2
    #/dev/ada0p3 none swap sw 0 0
    # when the above disk failed, comment the 2 lines above
    # and uncomment the 2 lines below, to boot from the remaining disk
    /dev/gpt/efiboot1 /boot/efi msdosfs rw 2 2
    /dev/ada1p3 none swap sw 0 0
Here, `/etc/fstab` tells the system to boot from `/dev/gpt/efiboot1` (mounted as `/boot/efi`), and also to use `/dev/ada1p3` as the new swap.
I then rebooted (weeks after editing the lines above) and voilà, it booted normally.
Also, `ee` looks very similar to `nano`, and I managed to quit very easily.
Thank you very much for the help!
It would be awesome if pfSense could have some kind of script to format and sync the `esp` partition during or post installation so the system can be more resilient.
edit: sorry, I missed the part where you said commenting out that line is sufficient. I'll test it on another day. Thanks!
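In case it helps someone else, the copy from `efiboot0` to `efiboot1` described above could look roughly like the sketch below. It is not a transcript of the linked guide: the device labels are the ones from this thread, while the temporary mount point `/mnt/efiboot1`, the `run` helper and the `DRY_RUN` switch are made up for illustration. By default it only prints the commands, since `newfs_msdos` wipes the target partition.

```sh
#!/bin/sh
# Sketch only. Labels come from this thread; everything else is an assumption.
SRC_DIR="/boot/efi"            # where the working ESP (efiboot0) is mounted
DST_DEV="/dev/gpt/efiboot1"    # ESP on the second disk
DST_MNT="/mnt/efiboot1"        # hypothetical temporary mount point
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands (default)

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run newfs_msdos "$DST_DEV"                   # DESTROYS whatever is on DST_DEV
run mkdir -p "$DST_MNT"
run mount -t msdosfs "$DST_DEV" "$DST_MNT"
run cp -R "$SRC_DIR/" "$DST_MNT/"            # trailing slash: copy the contents
run umount "$DST_MNT"
```

Running it as-is lists the five commands without touching anything; setting `DRY_RUN=0` on the actual FreeBSD/pfSense box would execute them, so double-check the labels against your own disks first.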
-
Great. Yes there are a bunch of improvements there coming in 24.03.