eMMC Write endurance
-
@dugeem said in eMMC Write endurance:
Yes it is disappointing that Netgate chooses not to provide detailed endurance specifications for the eMMC components used in their hardware. Although to be fair this is not uncommon.
Anyway I have access to 3 different models of Netgate hardware containing eMMC. Based on FreeBSD dmesg output (dmesg | grep mmc) and a bit of research:
- SG-1000 - Kingston M627 4GB MLC eMMC
  Endurance - not published
- SG-1100 - Sandisk iNAND 7250 (DG4008) 8GB MLC/SLC eMMC
  Listed endurance 20TBW. P/E cycles - MLC 3k; SLC 30k
- XG-7100 - Kingston M525 32GB MLC eMMC
  P/E cycles - MLC 3k; pSLC 30k
  (Later variants of this component may be TLC based with similar P/E cycles)
NB these are only samples - it is entirely possible that eMMC components may change across different production runs.
Checking eMMC Wear
- Install mmc-utils package - pkg install mmc-utils
  (package only available in pfSense Plus 21.05 / CE 2.5 & later)
- Check eMMC life time estimation
  mmc extcsd read /dev/mmcsd0rpmb | egrep -i ^emmc
- Life Time Estimation A is SLC NAND (or pseudo SLC) - multiply by 10 to get upper bound of % life used - eg 0x08 is 80%
- Life Time Estimation B is MLC NAND - multiply by 10 to get upper bound of % life used
- Pre EOL status - 0x01 = Normal, 0x02 = Warning (80%+), 0x03 = Urgent (90%+)
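As a rough sketch of the decoding described in the list above, the hex values can be turned into percentage bands as follows. This is not part of mmc-utils and the function names are made up for illustration. Treating 0x00 as "not reported" and 0x0B as "past the estimated life" follows my reading of the JEDEC eMMC 5.x field definitions, so take those two cases as assumptions.

```python
# Decode the eMMC wear fields printed by `mmc extcsd read`.
# Assumption: each step of the life time estimate covers a 10% band of
# device life used, as described in the list above.

PRE_EOL = {0x01: "Normal", 0x02: "Warning", 0x03: "Urgent"}


def life_used_range(value):
    """Return (low, high) percent of life used, or None if not reported."""
    if value == 0x00:
        return None  # assumed: field not defined by the device
    if 0x01 <= value <= 0x0A:
        return ((value - 1) * 10, value * 10)
    if value == 0x0B:
        return (100, None)  # assumed: past the estimated maximum life
    raise ValueError(f"unexpected life time estimate: {value:#04x}")


def pre_eol_status(value):
    """Map the Pre EOL field to the labels used in the list above."""
    return PRE_EOL.get(value, "Unknown")


if __name__ == "__main__":
    for raw in ("0x01", "0x08", "0x0a"):
        print(raw, life_used_range(int(raw, 16)))
```

So 0x08 decodes to the 70-80% band, which is where the "upper bound of 80%" rule of thumb comes from.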
An example is my lab SG-1000 which has SLC wear @ 100% and MLC wear @ 80%. Bit of a surprise as it is only used as a VPN client test router with no extra packages installed. Recently I enabled RAM disk for /var & /tmp to try to squeeze a few more months of life out of it.
My SG-1100 fleet is still running 2.4.5p1 so I'll start finding out in the next few months (as they are upgraded) how they're going.
Other issues
Generally partition alignment is considered important for SSD/eMMC devices. Based on the 3 samples above:
- SG-1000 - aligned to 4MB - good
- SG-1100 - unaligned. Maybe a quirk of the EspressoBin board? Or perhaps later pfSense Plus recovery images correct this?
- XG-7100 - aligned to 32kB - good
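To check alignment yourself, the arithmetic is simply whether the partition's byte offset is a multiple of the alignment you care about. A minimal sketch, assuming the start sector is read from the first column of gpart show and that sectors are 512 bytes. The start sectors below are made-up examples, not values taken from the three units above.

```python
def largest_alignment(start_sector, sector_size=512):
    """Largest power-of-two byte boundary that the partition start sits on."""
    offset = start_sector * sector_size
    if offset <= 0:
        raise ValueError("start sector must be positive")
    return offset & -offset  # isolates the lowest set bit of the offset


def is_aligned(start_sector, alignment_bytes, sector_size=512):
    """True if the partition's byte offset is a multiple of alignment_bytes."""
    return (start_sector * sector_size) % alignment_bytes == 0


if __name__ == "__main__":
    four_mib = 4 * 1024 * 1024
    for start in (8192, 64, 63):  # hypothetical start sectors
        print(start, largest_alignment(start), is_aligned(start, four_mib))
```

A start sector of 8192 sits on a 4MB boundary, 64 only on a 32kB boundary, and the classic 63 on nothing larger than a single 512 byte sector.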
Also to be noted is that eMMC drives generally support TRIM, but in all cases it was disabled. Again there may be reasons for this (eg TRIM on some older drives is problematic). Having said that TRIM is generally considered useful and so perhaps Netgate could revisit this?
Finally the ZFS elephant is now in the room :-) Only for 64 bit systems but since the SG-1100 only has 1GB RAM that may be optimistic. XG-7100 has plenty of RAM so we will evaluate it. In theory the possibility of ZFS integrity & snapshots is very interesting but not if it wears the eMMC faster than UFS. Hopefully Netgate may publish some more technical information on how their ZFS implementation is tuned for pfSense.
Thank you for the absolutely excellent post :-)
Since we are VERY likely at the limit of the write endurance on these eMMCs, and I'd rather be safe than sorry, I have ordered a 512GB SSD for both my SG-2100 and 6100.
That way I can allow myself to enable all the logging and monitoring I want (i.e. what the box can handle when it comes to the SG-2100).
So this post is now a reminder to people looking - and perhaps for Netgate to answer officially - in the future :-)
-
@dugeem said in eMMC Write endurance:
Check eMMC life time estimation
mmc extcsd read /dev/mmcsd0rpmb | egrep -i ^emmc
This tool came up in a previous thread talking about writes to eMMC and SSDs, etc. and ZFS writing all the time..
I have a sg4860, and I don't see any such mmcs* in my /dev dir - I see da0 and 0p1, 0p2 and 0p3 but I don't see how to use this tool to check the eMMC wear. I do believe my sg4860 is supposed to have a 4GB eMMC; I sure didn't put any SSD into it.
I see this in my dmesg
da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
da0: <Generic Ultra HS-COMBO 1.98> Removable Direct Access SCSI device
da0: Serial Number 000000225001
da0: 40.000MB/s transfers
da0: 29184MB (59768832 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>
I should probably open this up, because clearly I have more than a 4GB eMMC: it shows a 23GB disk size. That also means I could replace it, I would think, which seems good. Now if I could just check how much wear it has on it ;)
-
@rcoleman-netgate said in updated to 22.01 - SG1100 high CPU usage '/sbin/pfctl -vvsr':
Most users that are [using Snort] on an eMMC have stated they were unaware of the recommendation to use an SSD/HDD for that task (and since the 1100 lacks that ability) they had a configuration that destroyed their eMMC in weeks, or months.
So that's disappointing. As pointed out in that thread, https://www.netgate.com/supported-pfsense-plus-packages lists several as "requires SSD" including NtopNG, and others as SSD recommended. I was unaware of that list.
I checked one of our older (Oct 2017) 3100s that isn't using IDS and I get:
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x05
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x00
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
So that's 50% life used and 0% life used?
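For anyone collecting these readings from several routers, here is a small sketch that pulls the three fields out of pasted output. It assumes the output looks like the lines quoted in this post (field name in square brackets, then a hex value). The function name is made up for illustration.

```python
import re

# Matches "[EXT_CSD_SOMETHING]: 0xNN" anywhere in the text, so it still
# works if the lines were run together when pasted.
FIELD_RE = re.compile(r"\[(EXT_CSD_[A-Z_]+)\]:\s*(0x[0-9a-fA-F]+)")


def parse_extcsd(text):
    """Return {field_name: integer value} for every EXT_CSD field found."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(text)}


SAMPLE = """\
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x05
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x00
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
"""

if __name__ == "__main__":
    for name, value in parse_extcsd(SAMPLE).items():
        print(f"{name}: {value:#04x}")
```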
A more recent (Oct 2020) 2100 using Snort but with very few alerts shows:
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x01
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x01
A client's 3100 from Nov 2017 that is using Suricata and OpenVPN shows:
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x04
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x00
For reference, per https://forum.netgate.com/topic/170081/gui-services-in-the-system-log-are-filled-with-nginx-messages/8 the web server logging can be turned off between pfSense updates by editing /etc/inc/system.inc.
After installing the mmc-utils package, if the path hasn't been rescanned, use /usr/local/sbin/mmc.
Edit: we also normally turn off a lot of logging like the four "Log firewall default blocks" options.
-
@johnpoz said in eMMC Write endurance:
I have a sg4860, and I don't see any such mmcs* in my /dev dir - I see da0 and 0p1, 0p2 and 0p3 but I don't see how to use this tool to check the eMMC wear. I do believe my sg4860 is supposed to have a 4GB eMMC; I sure didn't put any SSD into it.
Weird - that looks like a USB attached drive. According to the Netgate SG-4860 specs they started fitting larger 32GB eMMC drives from the end of 2015.
Certainly older eMMC components (pre eMMC v5.0 IIRC) may not support the lifetime estimates.
-
FYI I've opened a Redmine feature request to add mmc-utils to base images:
https://redmine.pfsense.org/issues/12860
Hopefully also a simple GUI wrapper for lifetime estimates & EOL info.
-
@dugeem said in eMMC Write endurance:
Weird - that looks like a USB attached drive
Exactly that. In the RCC-VE platform devices the eMMC is USB attached and mmc-utils cannot read it directly.
Some of the 1100 drives cannot be read either due to the eMMC version.
Steve
-
@keyser said in New Netgate Appliance for IPS/IDS:
@steveits said in New Netgate Appliance for IPS/IDS:
The other 3100 (40%) is 3 days 7 hours uptime and:
device     r/s   w/s   kr/s   kw/s  ms/r  ms/w  ms/o  ms/t  qlen  %b
flash/sp     0     0    0.0    0.0     7     0     0     7     0   0
mmcsd0       0     0    0.5   29.1     2     7     0     7     0   0
mmcsd0bo     0     0    0.0    0.0     0     0     0     0     0   0
mmcsd0bo     0     0    0.0    0.0     0     0     0     0     0   0
md0          0     0    0.0    0.0     0     0     0     0     0   0
Probably would be better to wait a few weeks and do the math. :)
Yes, a long uptime would be much better. The numbers posted for this box are more in line with the 11-12TB write endurance I guesstimated for the 8GB eMMC.
Also, I forgot we recently enabled the RAM disk feature on that router so the “iostat -x” numbers I quoted here are with the RAM disk active. I have been doing that when upgrading routers to 22.01.
I'll try to remember to check our 3100 in a few weeks.
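For anyone wanting to "do the math" from these figures, here is a rough sketch. It assumes the kw/s column is a since-boot average in units of 1024 bytes, that the rate stays constant, and that host writes are the whole story (write amplification inside the eMMC is ignored, so the real figure will be worse). The 11-12TB endurance is the guesstimate from this thread, not a published spec.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def tb_written_per_year(kw_per_s):
    """Convert an average write rate in KB/s into decimal TB per year."""
    bytes_per_year = kw_per_s * 1024 * SECONDS_PER_YEAR
    return bytes_per_year / 1e12


def years_to_endurance(kw_per_s, endurance_tbw):
    """Years until the given endurance is reached at a constant rate."""
    return endurance_tbw / tb_written_per_year(kw_per_s)


if __name__ == "__main__":
    rate = 29.1  # kw/s for mmcsd0 from the iostat -x output quoted above
    print(f"{tb_written_per_year(rate):.2f} TB written per year")
    for tbw in (11, 12):  # guesstimated endurance range from this thread
        print(f"{tbw} TBW lasts about {years_to_endurance(rate, tbw):.1f} years")
```

At 29.1 kw/s that works out to a bit under 1TB per year, so a longer uptime sample matters more than the exact endurance figure.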
-
@steveits said in eMMC Write endurance:
For reference, https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-lifetime.html
Minor note: the above instructions include a step with the csh builtin command rehash:
pkg install -y mmc-utils; rehash
but using the GUI Diagnostics->Command prompt is a sh, not csh, hence:
Shell Output - rehash
sh: rehash: not found
Nevertheless, for others perhaps with similar setup, the first device I checked, an 18 month old sg-1100 (pfblockerng-dev the only package) reports:
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x01
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x04
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
Would appear it has maybe 3 years of life remaining.
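A rough way to arrive at a figure like that is a straight-line extrapolation from the device's age and the reported wear band. This is only a sketch: it assumes wear continues at the same average rate, and the function name is made up for illustration.

```python
def remaining_months(age_months, percent_used):
    """Linear extrapolation: months left if wear continues at the same rate."""
    if not 0 < percent_used <= 100:
        raise ValueError("percent_used must be in (0, 100]")
    return age_months * (100 - percent_used) / percent_used


if __name__ == "__main__":
    age = 18  # months in service, as for the sg-1100 above
    # 0x04 on Life Time Estimation B corresponds to the 30-40% used band
    for used in (30, 40):
        print(f"{used}% used: about {remaining_months(age, used):.0f} months left")
```

For an 18 month old unit in the 30-40% band this gives roughly 27 to 42 months, which brackets the "maybe 3 years" estimate.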
-
Also per the above posts and https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-writes.html "...if there is enough RAM to spare, using RAM disks will drastically reduce disk writes over time."
One note on the RAM Disk feature, that doc page says it will preallocate the RAM. However with the RAM disk now using tmpfs, it only allocates RAM as files are written to it.
-
@steveits said in eMMC Write endurance:
Also per the above posts and https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-writes.html "...if there is enough RAM to spare, using RAM disks will drastically reduce disk writes over time."
One note on the RAM Disk feature, that doc page says it will preallocate the RAM. However with the RAM disk now using tmpfs, it only allocates RAM as files are written to it.
Hi Steve
I must admit I have not tested the ramdisk feature thoroughly - probably mostly because I lack detailed understanding of the “collateral dataloss” it will cause.
I get that you can have RRD data, DHCP leases and system logs flush to disk periodically, but since I use pfBlockerNG and NTopNG for historical logs and trend analysis, I have never bothered really testing RamDisk.
I assume it is still true that they will lose all logs and trend data at every reboot if you use RamDisk?
-
@keyser said in eMMC Write endurance:
lose all logs and trend data at every reboot if you use RamDisk
I missed your question, sorry. Yes and no... on the System/Advanced/Miscellaneous page the "Periodic RAM Disk Data Backups" section covers how often that info is written to disk. Per the Netgate doc on RAM disks, "Data for both is saved during a proper shutdown or reboot, and also periodically if configured." By "both" I think it means RRD and DHCP (mentioned in the previous sentence)? Possibly /tmp and /var but I suspect it would be up to a package to copy their own files...? Not really sure, there. I just logged into a backup router to generate a system log entry, rebooted, and the log entry for my login was still there, along with a few others for Suricata and pfBlocker processes stopping.
So an unexpected power off is the main risk. Also, RAM disks should be easier on UFS drives in terms of file system corruption during power loss.
@steveits said in eMMC Write endurance:
remember to check our 3100 in a few weeks
After 14 Days 10 Hours uptime, the 3100 with the RAM disk active and without IDS:
iostat -x
extended device statistics
device     r/s   w/s   kr/s   kw/s  ms/r  ms/w  ms/o  ms/t  qlen  %b
flash/sp     0     0    0.0    0.0     7     0     0     7     0   0
mmcsd0       0     0    0.1    4.8     1     4     0     4     0   0
mmcsd0bo     0     0    0.0    0.0     0     0     0     0     0   0
mmcsd0bo     0     0    0.0    0.0     0     0     0     0     0   0
I had also found the "Ignore denied clients" option in DHCP server which reduced log writing somewhat.
-
@steveits Thanks Steve.
Well your 3100 will last a lifetime with that - almost non-existent - write intensity to the eMMC. No doubt the RAM disk has a profound impact on this issue.
I’ll see if I can find the time to investigate and test RAMdisk further.
-
So just to add another data point to this conversation... My Netgate 4100 is 10 days less than 1 year old and I just ran the check on my eMMC drive...
I'm not impressed.
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x08
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x09
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
Showing between 70-90% of its expected life is gone with just under a year of usage... This thing will be a brick before I know it.
-
@nkull You should put an SSD in it now before you brick it at an inconvenient time.
-
@nkull Write usage depends a lot on logging and, well, usage. Are you using any of the "SSD/HDD recommended" packages on https://www.netgate.com/supported-pfsense-plus-packages? We haven't had such issues but we make a point of disabling logging of default block rules, and do not have a lot of Suricata logging, so writing is limited.
We also frequently use RAM disks now that they aren't preallocated from RAM. That may help you in the short term.
-
@nkull What packages are you running?
-
@SteveITS I do use Suricata - just now realized that it was an SSD recommended package... I'll have to look at the RAM disk thing, but I'm also going to see if I can figure out getting an SSD installed so that maybe this thing isn't just a really expensive paperweight in a couple more months. Seems lame it ships with such crap storage. Yeah, I know there is an option for more, but maybe more robust storage should be standard if the unit can't handle a couple of packages running on it. I mean, what's the point if you just use it like a home Linksys router? Sure don't remember seeing anything obvious ahead of time saying that running without an SSD would kill the unit in a bit over a year.
-
@nkull Yeah, turn off logging for Suricata on an eMMC -- if you want the storage to last.
-
@rcoleman-netgate Yeah, little late for me to see that... Hopefully it's not too hard to swap in some good storage.