Another Netgate with storage failure, 6 in total so far
-
It is NVMe in the 4100/6100/8200. It's not hard to fit if you have any experience assembling PCs.
-
@dstaylor I feel your pain. The storage on your 6100 should not wear out so quickly!
Netgate staff say that it is not possible/supported/recommended for an end-user to install an SSD in the 4100 and 6100, so there is no official documentation.
"It's a time of purchase upgrade option."
Why does Netgate misleadingly advertise the slots when they have no use and the user is not supposed to touch them? Well, that is a real puzzler!
So if your Netgate device is 31 to 365 days old, you are SOL. Fortunately, this limitation does not apply to you since your device is out of warranty.
The actual SSD installation process is easy:
- Remove the rubber feet and then remove all 8 torx screws (2 front, 2 back, 4 under the feet).
- Remove the plastic filler panel that was held in by the screws.
- Gently separate the top and bottom half of your 6100.
- Install the M.2 NVMe drive into the slot.
- Carefully put the top and bottom halves together - pay attention to the LED lights and the 3 plastic "shrouds" that bend around the circuit board.
- Reinstall the screws and attach the feet.
The instructions from the 4200 are pretty similar and cover the software re-installation process.
This video of opening a 6100 should be helpful.
-
@jimp @stephenw10 @kphillips @marcosm @cmcdonald Would any of you care to comment on this thread? With 4.3k views here and over 60k views on Reddit, I hope that @jwt's comments do not represent Netgate's official and only response to the issues that have been raised.
-
@andrew_cb Some good points have been raised along with actionable suggestions to mitigate the issue. Thanks for the constructive feedback - the issue has our attention.
-
@andrew_cb I installed a KingSpec NVMe 256GB drive from Amazon this morning. I followed the guidance you provided and the procedure to wipe the MMC drive in the documents.
It's back up and running now. The only thing that seems to be different is I no longer have the slowly flashing diamond that shows "boot complete/ready". I do get the flashing square during boot, and the orange circle when I put it in standby. I checked that none of the "light pipes" were bent, so that is a head-scratcher. Now when it's running all the LEDs are dark. I took away the memory disk I had implemented to try and prolong the mmc drive because it filled up during my reinstall and package reload.
Best I can tell it's all good except for the LED issue.
-
Hmm, odd. Try running:
pfSense-led.sh ready
-
@stephenw10 I'm running that command from where? And what output am I expecting?
-
@dstaylor said in Another Netgate with storage failure, 6 in total so far:
@stephenw10 I'm running that command from where? And what output am I expecting?
I found it down in /usr/local/sbin. I ran the command but there is no change.
-
Hmm, so just no LEDs? You can try setting any of the other states:
pfSense-led.sh
usage:
  pfSense-LED booting
  pfSense-LED ready
  pfSense-LED update [1|0]
  pfSense-LED updating
It should update the LEDs accordingly. 'ready' is the normal state after booting.
-
@stephenw10 I pulled the box apart again. It was a keyboard/chair interface issue; the "flap" around the lighthood had folded over a little bit and was blocking the LED.
Now the LEDs are working as expected.
-
Please do follow the advice to zero out your eMMC when installing an SSD. I did not, and it worked fine until it didn't. (I'm not sure that advice even existed at the time, and I don't recall where I got the pointers to install my own SSD in the first place. Other than the special requirements on the type of SSD supported, I don't remember the physical install being particularly difficult.) I appreciate the help of staff here to guide me out of that predicament.
All of this discussion does make me wonder where I put my original pfsense media. It would seem to be a really bad idea not to have a recovery image in case it's needed (gulp).
-
@dnavas said in Another Netgate with storage failure, 6 in total so far:
bad idea not to have a recovery image
You can get the Netgate Installer from the store (free). It will download the latest version when you run it. Actually, I think they added, or were talking about adding, a version selection.
https://docs.netgate.com/pfsense/en/latest/install/index.html
-
@marcosm I am glad Netgate is committed to addressing the issues raised in this thread. When do you think Netgate will be able to share a high-level overview of the planned changes so the community can give feedback? It could be a new thread to help keep the discussion focused.
Feel free to contact me publicly or privately if there is anything I can assist with.
-
Add me to the list. I have a number of 4100 base models installed in a different time zone that, had I known, I would have upgraded at purchase a couple of years ago. The first of them just died on me. Fortunately I had a replacement on my shelf that I overnighted with a backup config. However, the remainder, which were similarly configured, are now likely at high mortality risk, which led me to this thread for options.
For reference booting the dead 4100 and attempting to install a freshly downloaded copy from the store finds no valid storage devices.
If Netgate sold an appropriately overpriced nvme with B key on the store for the 4100 I would buy them just to sleep comfortably again.
In the meantime I can only turn off the packages I naively installed and cross my fingers.
-
You can also enable ram disks to reduce drive writes significantly.
-
@stephenw10 Thanks for the reminder, back in the day ram disks were the norm (and I may be deluding myself recalling also the default config) on my Alix boxes. I got lazy assuming these boxes would be preconfigured for longevity.
-
@stephenw10
4100 have only 4GB of RAM? In some use cases with such a small amount of memory, it's already barely enough for the device to function. And if you add a RAM disk... You'll have to significantly cut down on resource usage and configure it so that there are very few writes in general. My hardware is not Netgate; my SSD is a Samsung Pro SATA 256GB and my RAM size is 16GB.
Over almost five years, I have accumulated about 32TB of writes (~20GB per day). Recently, I started using RAM disks, reducing writes to around 1GB or less per day. Enabling RAM disks was not straightforward—it required a lot of trial and error.
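For anyone who wants to sanity-check write figures like these, here is a rough sketch in Python. The function names are made up for illustration, and the 4.4 years is only my assumption for "almost five years"; decimal units (1 TB = 1000 GB) are assumed throughout.

```python
def avg_daily_writes_gb(total_tb: float, years: float) -> float:
    """Average GB written per day, assuming 1 TB = 1000 GB and 365-day years."""
    return total_tb * 1000 / (years * 365)


def years_to_write(total_tb: float, gb_per_day: float) -> float:
    """Years needed to write total_tb at a steady rate of gb_per_day."""
    return total_tb * 1000 / gb_per_day / 365


# 32 TB over an assumed 4.4 years works out to roughly 20 GB per day.
print(round(avg_daily_writes_gb(32, 4.4), 1))

# At about 1 GB per day, the same 32 TB would take far longer to accumulate.
print(round(years_to_write(32, 1.0), 1))
```

The second call is only there to show how much the RAM disk change stretches the same amount of writes; it says nothing about the rated endurance of any particular drive.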
In the end, I optimized log writing and analyzed the temp folder to understand what was writing to the disk and why. However, even now, /tmp is set to 1024, and /var to 8192 due to stability concerns.
I think the main problem is that eMMC wear can easily go unnoticed. And perhaps there should have been a preset in the Plus version that, where possible, checks the wear status and notifies the user well in advance of a critical state in every possible noticeable way, through all kinds of alerts. I’m not sure if it’s already too late for this or not...
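On the wear-check idea: eMMC 5.0 and later devices report a life-time estimate in the extended CSD register (DEVICE_LIFE_TIME_EST_TYP_A/B), where values 0x01 to 0x0A each cover a 10% band of used life and 0x0B means the estimate has been exceeded. A minimal sketch of how a notifier could interpret that raw value is below. Reading the value from the device is not shown, and the function names and the 0x08 alert threshold are my own assumptions, not anything pfSense ships.

```python
def describe_life_estimate(value: int) -> str:
    """Turn a raw eMMC life-time estimate value into a readable string."""
    if value == 0x00:
        return "not defined"
    if 0x01 <= value <= 0x0A:
        # Each step covers a 10% band of the device's estimated life.
        return f"{(value - 1) * 10}-{value * 10}% of estimated life used"
    if value == 0x0B:
        return "exceeded estimated life"
    return "reserved value"


def should_alert(value: int, threshold: int = 0x08) -> bool:
    """Hypothetical policy: alert once the raw value reaches the threshold."""
    return value >= threshold


for raw in (0x01, 0x08, 0x0B):
    print(hex(raw), describe_life_estimate(raw), should_alert(raw))
```

The threshold is deliberately early (70-80% used) so there is time to order and fit a replacement drive before the device stops booting.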
-
@w0w said in Another Netgate with storage failure, 6 in total so far:
4100 have only 4GB of RAM? In some use cases with such a small amount of memory, it's already barely enough for the device to function. And if you add a RAM disk... You'll have to significantly cut down on resource usage and configure it so that there are very few writes in general.
Well, that depends on quite a lot of things. My edge device here is a 3100. That has 2GB RAM and I run RAM disks on that at 80/160MB without issue. That's running pfBlocker and Snort.
But obviously I didn't just enable all the lists and signatures.
-
@w0w Recent versions of pfSense don’t allocate the RAM disk space until it’s used, so it’s more flexible.
-
@stephenw10 curious I have an 1100.
Been running into OOM situations with pfblocker and some services like snmp crashing.
I am somewhat concerned about the writes but if I move to RAM disk do my issues go away?