Netgate 4100 SMART: "Unable to detect device type"
-
I have a Netgate 4100 from 2023 that seems to be having issues. I realized today that it randomly switched to a non-default "boot environment" that had an old configuration in it... this concerns me for obvious reasons.
As part of investigating I wanted to make sure the onboard storage is OK, so I went to Diagnostics -> S.M.A.R.T. but all entries just give me this:
smartctl 7.4 2023-08-01 r5530 [FreeBSD 15.0-CURRENT amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/: Unable to detect device type
Please specify device type with the -d option.

Use smartctl -h to get a usage summary
Is this expected?
-
@courtalj Uhh, that sounds a bit like the eMMC (storage) has been worn out. More often than not they do not wear themselves to death and simply fail; instead they switch to read-only but still accept write commands without actually writing them.
Since pfSense runs from memory, that means it could have “died” months ago, and a reboot causes it to boot with the configuration your box had at that time. You can verify that this is what happened by making a lot of changes or installing a package, and then checking whether a reboot puts you back where you are now.
If it is “dead” then I’m sorry to say your only option is to install an M.2 SSD into the box, and then hope you can get it to reliably boot from that without hanging for extended periods of time trying to figure out the boot storage environment. Sometimes you can get lucky and be allowed to write a little to the eMMC again - ideally you would install the SSD, and then see whether the installer can get the eMMC to accept an “erase eMMC”, so you no longer have to worry about it attempting to boot from eMMC again.
-
@courtalj if it’s an eMMC I don’t think they show SMART.
You can check eMMC life via https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-lifetime.html#emmc.
Re your issue though, were you restarting at the time? And is the good/default BE still present? I had that on another model, I think a 2100. There’s detection at boot to see if the boot was successful and if not it reverts. I was kind of wondering if there’s a bug there somewhere. At the time I didn’t realize what happened until a bit after I restored a backup, but IIRC I could just use the “default” BE again.
-
@SteveITS @keyser Thanks! eMMC seems OK, though more worn than I would expect after only two years:
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x02
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x04
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
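For anyone reading those raw values: here is a minimal sketch of how they can be decoded. It assumes the standard JEDEC eMMC 5.x encoding of these fields (each life-time step is 10% of estimated life used, and 0x0B means the estimate has been exceeded). The function names are made up for illustration and are not part of pfSense or any tool.

```python
import re

# Assumed JEDEC eMMC 5.x meanings for EXT_CSD_PRE_EOL_INFO.
PRE_EOL = {
    0x00: "not defined",
    0x01: "normal",
    0x02: "warning (about 80% of reserved blocks consumed)",
    0x03: "urgent",
}


def decode_life_time(value: int) -> str:
    """Decode a DEVICE_LIFE_TIME_EST_TYP_A/B value into a usage range."""
    if value == 0x00:
        return "not defined"
    if 0x01 <= value <= 0x0A:
        return f"{(value - 1) * 10}-{value * 10}% of estimated life used"
    if value == 0x0B:
        return "exceeded maximum estimated life"
    return "reserved"


def decode_report(text: str) -> dict:
    """Map each EXT_CSD field name found in the text to a description."""
    pattern = re.compile(r"\[(EXT_CSD_[A-Z_]+)\]:\s*(0x[0-9a-fA-F]+)")
    result = {}
    for name, raw in pattern.findall(text):
        value = int(raw, 16)
        if name == "EXT_CSD_PRE_EOL_INFO":
            result[name] = PRE_EOL.get(value, "reserved")
        elif name.startswith("EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_"):
            result[name] = decode_life_time(value)
    return result


if __name__ == "__main__":
    # The values posted above, pasted in as plain text.
    sample = (
        "eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x02 "
        "eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x04 "
        "eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01"
    )
    for field, meaning in decode_report(sample).items():
        print(f"{field}: {meaning}")
```

Under those assumptions, 0x02 and 0x04 would mean roughly 10-20% and 30-40% of estimated life used for the two memory types, with a normal pre-EOL state.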
I'm not sure exactly when it reverted - the configurations that were wiped out were VPN settings that I only use when I travel so it could have gone unnoticed for quite a while. When I figured out the whole "boot environments" thing, I was able to select the "default" boot environment which had up-to-date configurations and it did work.
I think I will look into installing an M.2 regardless...
-
@courtalj You can of course. There are a few ways to reduce writing, and one ZFS change coming in 25.03.
https://forum.netgate.com/topic/195879/netgate-2100-life-expectancy/8