6100 MAX NVMe failed
-
A few weeks ago my 6100 MAX with the 128GB factory NVMe had a catastrophic failure after running perfectly for almost 4 years. There was no warning and no indication of why it failed. It was such a stressful weekend.
The device would not even boot from a USB Drive.
I reached out to support and was essentially told that the device was bricked, with no real guidance to try anything besides booting from USB. I was told I needed to replace the entire device. It's a shame that Netgate support doesn't even bother to suggest trying to replace the NVMe just because they don't sell replacements.
On a whim I decided to remove the NVMe and see if it would boot off the eMMC, and to my surprise it did, which indicates that something went terribly wrong with the factory NVMe.
That weekend I was able to locate a local eBay reseller that happened to have a couple of used 256GB M-keyed NVMe drives, and I set them up as a mirror, just to see if I could, and it worked.
I had to do a bare metal restore and then use the ACB (Auto Config Backup) service to restore my last configuration. Fortunately I had my Device ID and encryption key, so I could locate and restore the backup.
Since these NVMe drives were used, I wasn't comfortable keeping the system running on them, so I found a compatible new NVMe on Amazon: KingSpec 256GB M.2 NVMe SSD, 2242 PCIe, for about $40 each.
I made a backup of the config.xml and copied it onto the USB drive I used to reinstall. Then I replaced the NVMe drives, restored the system, and got everything running stable.
@stephenw10 on this forum was a huge help in getting me back up and running. Thank You.
I've also set up a cron job to copy the config.xml file to my local NAS so I have an offline copy available if I ever need it in the future.
ssh-keygen -b 4096 -C "your_email@example.com"
**No passphrase**
Copy the public key to the admin user's profile on the NAS.
This allows me to run the cron job without a password:
/usr/bin/scp /cf/conf/config.xml admin@192.168.2.20:/share/BACKUP/pfsense/
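A minimal sketch of how that copy could be wrapped in a script for cron, assuming the NAS address and destination path from the command above. The function names `backup_name` and `run_backup`, the dated file name, and the key path `/root/.ssh/id_rsa` are illustrative assumptions, not the exact setup described here.

```shell
#!/bin/sh
# Sketch: copy pfSense's config.xml to a NAS under a dated file name.
# Host, user, and destination come from the scp command above;
# everything else (function names, key path) is a hypothetical choice.

SRC="/cf/conf/config.xml"
DEST="admin@192.168.2.20:/share/BACKUP/pfsense"
KEY="/root/.ssh/id_rsa"   # assumed location of the passphrase-less key

# Build a dated file name from a date string, e.g. config-2024-01-31.xml
backup_name() {
    printf 'config-%s.xml' "$1"
}

# Copy the config. BatchMode makes scp fail instead of prompting,
# which is what you want when nobody is watching a cron job.
run_backup() {
    /usr/bin/scp -q -o BatchMode=yes -i "$KEY" \
        "$SRC" "$DEST/$(backup_name "$(date +%Y-%m-%d)")"
}

# Only perform the copy when invoked as: <script> run
if [ "${1:-}" = "run" ]; then
    run_backup || echo "config backup failed" >&2
fi
```

A daily cron entry along the lines of `0 2 * * * /root/backup_config.sh run` (the script path is hypothetical) would then keep one dated copy per day instead of overwriting a single file.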
Hopefully this will run for at least another 4 years if not longer.
And I hope this helps someone that might have a similar issue come up.
-
-
Very nice work. Great write-up.
Only part I don't follow is:
On a whim I decided to remove the NVMe and see if it would boot off the eMMC and to my surprise it did. Which indicates that something went terribly wrong with the factory eMMC.
Wouldn't that merely confirm catastrophic failure of the storage drive you removed? What do you mean when you say "factory eMMC"?
The pictured KingSpec M.2 NVME SSD is a SATA drive, not a PCIe device.
EDIT: Ohh, I think I get it. Your 6100 MAX shipped with an eMMC storage device from the factory—which failed. And the resolution was a storage device replacement/overhaul with these mirrored SATA SSDs. Got it.
Still important to note the difference between the SATA and PCIe SSD interfaces, since the 6100 MAX presumably does not have NVMe PCIe slots onboard.
-
@tinfoilmatt Thanks for catching that typo. The eMMC didn't fail, it was the NVMe, and I've fixed the post. I was able to boot off the factory (soldered) eMMC once I removed the SSD.
The KingSpec is a PCIe SSD, as the Netgate 6100 doesn't have a SATA interface, which made it incredibly hard to find a replacement. If it were SATA, there would have been so many more options available.
-
@tariqali The pictured drive has an M.2 B+M keyed interface, which is typically only found with SATA drives. PCIe NVMe drives are usually M keyed only. (Relying on image search results and prior knowledge, but I could be wrong.)
So if that drive is in fact a PCIe, not a SATA drive (as KingSpec advertises), then I've never seen that before.
What storage driver is being used for the new mirrored array? da, nvme, or something else?
-
@tariqali Hey, as already said (but a thank you can't be said too much): THANK YOU. So far everything is alright here (not quite 2 years of usage), but it helps me sleep a little better, having bookmarked your post! Very much appreciated!!
-
@tinfoilmatt Correct, it is an M.2 B+M keyed interface, but this one is PCIe. Here's the Amazon listing: https://a.co/d/fDxaY4m
Not sure about the mirror driver, I just selected the mirror setup when I did the reinstall using the USB from Netgate, the installer recognized two drives and gave me the option. I am using ZFS if that makes a difference?
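One way to check would be to look at the device names behind the pool, for example via `sysctl kern.disks` (the output of `zpool status` may show them too, though it can list labels instead). On FreeBSD, which pfSense is built on, the name prefix reflects the driver. A rough sketch, where `classify_disk` is a hypothetical helper and the mapping covers only the common prefixes:

```shell
#!/bin/sh
# Sketch: map a FreeBSD disk device name to the driver family behind it.
# classify_disk is a hypothetical helper name, not a system command.

classify_disk() {
    case "$1" in
        nvd[0-9]*|nda[0-9]*) echo "nvme" ;;      # NVMe: nvd(4) or CAM nda(4)
        ada[0-9]*)           echo "sata" ;;      # ATA/SATA via CAM ada(4)
        mmcsd[0-9]*)         echo "emmc" ;;      # eMMC/SD via mmcsd(4)
        da[0-9]*)            echo "scsi-usb" ;;  # SCSI/USB storage via da(4)
        *)                   echo "unknown" ;;
    esac
}

# List every disk the kernel reports together with its guessed type.
# kern.disks is a FreeBSD sysctl, so this part is skipped elsewhere.
if [ "$(uname -s)" = "FreeBSD" ]; then
    for d in $(sysctl -n kern.disks); do
        printf '%s\t%s\n' "$d" "$(classify_disk "$d")"
    done
fi
```

If both mirror members come back as nvme, the drives are attached over PCIe rather than SATA.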
-
@the-other The thing that will save you a lot of headache is keeping an unencrypted backup of your config.xml somewhere, so restoring your system becomes a lot easier.
You can use ACB, but you need your Device ID and encryption key. It's a little bit of a hassle, but it's an option.
That's the reason I now have a cron job keeping a daily copy of the config.xml on my NAS.
-
@tariqali
Yeah, well, thanks to KeePassXC I have both saved. Besides, I actually push a config backup to my PC regularly and after bigger changes. That is also saved to my NAS and is unencrypted (pure hobby usage).