NVME Wear Rate on 6100
-
Haven't had my 6100 for 2 years, yet SMART is telling me my drive is 69% worn already. It's also written 186TB. I'm not running Suricata. I have vnstat and rrd-summary installed.
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 54 Celsius
Available Spare: 100%
Available Spare Threshold: 1%
Percentage Used: 69%
Data Units Read: 503,711 [257 GB]
Data Units Written: 364,300,734 [186 TB]
Host Read Commands: 12,095,957
Host Write Commands: 3,300,796,255
Controller Busy Time: 59,272
Power Cycles: 45
Power On Hours: 6,553
Unsafe Shutdowns: 13
Media and Data Integrity Errors: 0
Error Information Log Entries: 196
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 78 Celsius
Temperature Sensor 2: 54 Celsius
Temperature Sensor 3: 54 Celsius
Temperature Sensor 4: 54 Celsius
Thermal Temp. 1 Transition Count: 2
Thermal Temp. 1 Total Time: 29945
This is surely ridiculous for a drive not even 2 years old?
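For anyone wanting to sanity-check numbers like these, here is a rough Python sketch of the arithmetic. It relies on the NVMe convention that one "data unit" is 1,000 blocks of 512 bytes. The function names are made up for illustration, and the projection assumes wear keeps growing linearly with power-on hours, which is only an approximation.

```python
# One NVMe SMART "data unit" is 1,000 x 512-byte blocks = 512,000 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000


def data_units_to_tb(units: int) -> float:
    """Convert SMART data units to decimal terabytes (10^12 bytes)."""
    return units * BYTES_PER_DATA_UNIT / 1e12


def avg_write_rate_mb_s(units_written: int, power_on_hours: int) -> float:
    """Average write rate in MB/s over the reported power-on hours."""
    seconds = power_on_hours * 3600
    return units_written * BYTES_PER_DATA_UNIT / seconds / 1e6


def hours_until_worn(percent_used: float, power_on_hours: int) -> float:
    """Power-on hours left until 'Percentage Used' hits 100%.

    Assumption: wear continues at the same linear rate as so far.
    """
    total_hours_at_100 = power_on_hours * 100 / percent_used
    return total_hours_at_100 - power_on_hours


if __name__ == "__main__":
    # Values copied from the SMART output in the first post.
    written, hours, used = 364_300_734, 6_553, 69
    print(f"written: {data_units_to_tb(written):.1f} TB")
    print(f"average: {avg_write_rate_mb_s(written, hours):.1f} MB/s")
    print(f"hours left at this rate: {hours_until_worn(used, hours):.0f}")
```

With the figures above this comes out to roughly 186.5 TB written, an average of about 7.9 MB/s over the reported power-on hours, and a little under 3,000 power-on hours before the indicator reaches 100% if nothing changes.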
-
@ahxcjay Search the forum or read this looong thread:
https://forum.netgate.com/topic/195990/another-netgate-with-storage-failure-6-in-total-so-far/38?_=1744872979707
This is unfortunately the new “standard” of SSD/eMMC wear because of the switch to the ZFS filesystem, combined with a firewall that, for example, logs a lot of traffic on rules or has packages installed.
Be glad you caught it in time. You can now reduce your logging/disk writes, or enable a RAM disk, if you want to avoid replacing the SSD in another year or two.
-
@keyser Thank you.
Yes, looks like syslog is generating a tonne of writes on my side. I've turned off default logging for deny rules and moved things to a RAM disk for the time being.
This is insane though that we're buying equipment with very finite storage lifetime.
-
@ahxcjay Curious, but what's writing so much?
I have had my 6100 Max for 3 years. I intentionally log every rule (compliance) along with pfBlocker logging. There was a brief stint where I had to have ntop-ng running for a week.
No suricata/snort.
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 59 Celsius
Available Spare: 100%
Available Spare Threshold: 1%
Percentage Used: 31%
Data Units Read: 35,772 [18.3 GB]
Data Units Written: 28,839,860 [14.7 TB]
Host Read Commands: 597,968
Host Write Commands: 929,740,937
Controller Busy Time: 11,409
Power Cycles: 37
Power On Hours: 6,553
Unsafe Shutdowns: 30
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 83 Celsius
Temperature Sensor 2: 59 Celsius
Temperature Sensor 3: 59 Celsius
Temperature Sensor 4: 60 Celsius
Thermal Temp. 1 Transition Count: 1
-
@michmoor Your R&W stats are way below mine, yet you're at 31%. It's a really sucky situation.
I wish I knew what was writing so much. I run NextDNS, maybe it's that?
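To put a number on that comparison, a quick sketch in Python: terabytes written per percentage point of reported wear for each drive. The helper name is hypothetical, and "Percentage Used" is a vendor-defined estimate, so two drives of different model or capacity will not necessarily be comparable this way.

```python
# One NVMe SMART "data unit" is 1,000 x 512-byte blocks = 512,000 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000


def tb_per_percent(units_written: int, percent_used: float) -> float:
    """Decimal terabytes written per 1% of reported wear."""
    tb_written = units_written * BYTES_PER_DATA_UNIT / 1e12
    return tb_written / percent_used


if __name__ == "__main__":
    # (Data Units Written, Percentage Used) from the two SMART outputs above.
    drives = {
        "ahxcjay": (364_300_734, 69),
        "michmoor": (28_839_860, 31),
    }
    for name, (units, used) in drives.items():
        print(f"{name}: {tb_per_percent(units, used):.2f} TB per 1% wear")
```

On the posted figures this gives roughly 2.70 TB per 1% for the first drive and 0.48 TB per 1% for the second, so the second drive is reporting wear several times faster per byte written by the host.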
-
@ahxcjay yep it is
-
@michmoor Currently looking for a decent B+M key SSD to replace it with. I have a remote pfSense instance running on Protectli hardware. About to replace that with a Samsung Evo 870 when I'm on-site. The current WD one it has isn't in the smartdb. The wear indicator is '101' which is clearly wrong.
-
@ahxcjay If it makes you feel any better, my production box's M.2 NVMe drive is currently sitting at 133%. You got time. ;)
-
@tinfoilmatt hah! Appreciate it! I've mounted RAM disks for now. I wouldn't mind replacing the NVMe, it's just that finding a decent B&M keyed one now is so difficult.
-
@ahxcjay said in NVME Wear Rate on 6100:
It's also written 186TB.
I had the same issue:
https://forum.netgate.com/topic/189820/how-do-i-find-out-what-write-continuously-on-my-pfsense-ssd
-
@keyser said in NVME Wear Rate on 6100:
https://forum.netgate.com/topic/195990/another-netgate-with-storage-failure-6-in-total-so-far/38?_=1744872979707
I agree. That thread contains an excellent analysis of the problem described here, together with solutions, so it is probably a better place to continue this conversation.
Starting near the end is easiest https://forum.netgate.com/topic/195990/another-netgate-with-storage-failure-6-in-total-so-far/