Laptop Hard Drive + Load Cycle Count
-
I encountered a scenario on my home firewall the other day, and I want to see if it's more widespread. I recently replaced the HDD in my firewall after the old one failed, and the replacement is a laptop HDD.
I kept hearing the drive make a noise that sounded like a head reset, so I installed smartmontools:
# pkg_add -r smartmontools
# rehash
When I ran smartctl, I found that the Load Cycle Count was unusually high, over 1700 after only a few hours of uptime.
# smartctl -A /dev/ad2
smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.1-RC1 i386] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
[...]
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 1723
[...]
Finding that, I installed ataidle and turned APM off for the drive:
# pkg_add -r ataidle
# rehash
# ataidle -P 0 /dev/ad2
APM set to 0 APM Disabled
And then the Load Cycle Count stopped increasing. Part of me wonders if that's what killed the old drive. I've heard of Ubuntu and others doing this to laptop drives but hadn't seen it myself on FreeBSD before.
So if anyone else running with a laptop drive could install and run smartmontools, I'd appreciate knowing if your Load Cycle Count is unusually high, as mine was. If this is a widespread issue, I'll speed up my plans to make packages for smartmontools and ataidle.
EDIT: I checked the old HDD after getting it to connect via a USB-to-IDE cable, and its Load_Cycle_Count was rather high: 263719.
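For anyone who wants to report how quickly the count is climbing rather than just the total, here is a minimal sketch. It assumes the raw value is the 10th whitespace-separated column of the `smartctl -A` output, as in the listing above; the function names, the device path, and the 600-second default interval are placeholders I made up for illustration.

```shell
#!/bin/sh
# Sketch: estimate how fast Load_Cycle_Count (SMART attribute 193) rises.
# Assumption: the raw value is column 10 of the `smartctl -A` attribute row.

# Print the raw Load_Cycle_Count from `smartctl -A` text read on stdin.
lcc_from_output() {
    awk '$1 == 193 && $2 == "Load_Cycle_Count" { print $10; exit }'
}

# Print cycles per hour (integer division).
# $1 = first count, $2 = second count, $3 = elapsed seconds
lcc_rate_per_hour() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}

# Hypothetical helper: take two samples and print the estimated rate.
# $1 = device (e.g. /dev/ad2), $2 = seconds between samples (default 600)
sample_lcc() {
    dev=$1
    interval=${2:-600}
    first=$(smartctl -A "$dev" | lcc_from_output)
    sleep "$interval"
    second=$(smartctl -A "$dev" | lcc_from_output)
    echo "Load_Cycle_Count: $first -> $second," \
         "about $(lcc_rate_per_hour "$first" "$second" "$interval") per hour"
}
```

Something like `sample_lcc /dev/ad2 600`, run as root once before and once after disabling APM, should make the difference obvious.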
-
Novice FreeBSD guy here. I'm using a laptop HD in a production machine. Is it safe to install and run smartmontools on a production machine?
Thanks
-
Novice FreeBSD guy here. I'm using a laptop HD in a production machine. Is it safe to install and run smartmontools on a production machine?
Yes, it should be safe. It can help, actually, in that you can monitor the SMART data from a drive and get an idea of its overall health. You can even run online/offline tests while the system is active to test for bad sectors.
SMART isn't the only indication of failure, of course, but it's usually a good predictor if it starts flagging issues.
-
Thank you. I'll see if I can get this done tomorrow.
My laptop drive is a new install. It has been online for about 3 weeks, in a VIA C7 mobo.
-
From my home pfSense box. The drive has been in there for quite a while: 6 months, maybe longer.
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 302
According to ataidle:
# ataidle /dev/ad0
Model: HTS726060M9AT00
Serial: MRH425M4J6PR1B
Firmware Rev: MH4OA68A
ATA revision: ATA-6
LBA 48: yes
Geometry: 16383 cyls, 16 heads, 63 spt
Capacity: 55GB
SMART Supported: yes
SMART Enabled: yes
APM Supported: yes
APM Enabled: yes
AAM Supported: yes
AAM Enabled: no
APM Value: 16512
-
I ran smartmontools on my home pfSense box. It looks laughable. I have it running on an old crappy laptop; I figured I'd run it here before I run it on my production machine.
# smartctl -A /dev/ad0
smartctl version 5.38 [i386-portbld-freebsd7.2] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 062 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 040 Pre-fail Offline - 0
3 Spin_Up_Time 0x0007 129 129 033 Pre-fail Always - 1
4 Start_Stop_Count 0x0012 098 098 000 Old_age Always - 4221
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 040 Pre-fail Offline - 0
9 Power_On_Hours 0x0012 062 062 000 Old_age Always - 16992
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 1218
191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 72
193 Load_Cycle_Count 0x0012 063 063 000 Old_age Always - 378566
194 Temperature_Celsius 0x0002 107 107 000 Old_age Always - 51 (Lifetime Min/Max 14/58)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 230
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
-
I get this error when trying to download adaidle
pkg_add -r adaidle
Error: FTP Unable to get ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-7.2-release/Latest/adaidle.tbz: File unavailable (e.g., file not found, no access)
pkg_add: unable to fetch 'ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-7.2-release/Latest/adaidle.tbz' by URL
-
Try this instead:
pkg_add -r ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-7-stable/Latest/ataidle.tbz
-
I get this.
Error: FTP Unable to get ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-7-stable/Latest/adaidle.tbz: File unavailable (e.g., file not found, no access)
pkg_add: unable to fetch 'ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-7-stable/Latest/adaidle.tbz' by URL
By the way, in about 2 hours I will be at the production box. What should the numbers look like when I run smartctl?
-
The tool is called ataidle instead of adaidle :)
Here is mine on a laptop hd that is probably more than 6 years old:
# smartctl -A /dev/ad0 | grep Load_Cycle
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 483217
#
After turning off power management with -P, ataidle reports:
# ataidle /dev/ad0
Model: ST960821A
Serial: 3LF0G30K
Firmware Rev: 3.00
ATA revision: ATA-6
LBA 48: no
Geometry: 16383 cyls, 16 heads, 63 spt
Capacity: 55GB
SMART Supported: yes
SMART Enabled: yes
APM Supported: yes
APM Enabled: no
AAM Supported: no
AAM Enabled: no
#
Load_Cycle_Count stopped increasing after turning off APM.
-
Hmm, mine was only 1847.
-
@kpa:
The tool is called ataidle instead of adaidle :)
Nice catch on the typo. :-)
(I fixed them)
@kpa:
Here is mine on a laptop hd that is probably more than 6 years old:
# smartctl -A /dev/ad0 | grep Load_Cycle
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 483217
#
Load_Cycle_Count stopped increasing after turning off APM.
That is rather high. Good to know that it stopped just like mine did. Did you happen to notice how fast it was increasing?
-
By the way, In about 2 hours I will be at the production box. What should the numbers look like when I run smartctl?
It depends on the drive and system, really. Some Seagates spit out nonsense numbers for certain values and can't be trusted (like error rates, iirc) but are accurate for others.
If it's a new drive, most of the values should be 0 or near 0. Errors should be 0, reallocations should be 0. The Power_On_Hours will of course increase over time. There are a couple counts that will go up every time you reboot or power cycle.
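As a rough way to check the "should be 0" values without reading the whole table, here is a small sketch. The choice of attributes (5, 197, and 198) is my own assumption for illustration, the function name is made up, and as noted above some drives report nonsense raw values, so treat a warning as a prompt to look closer and not as a verdict.

```shell
#!/bin/sh
# Sketch: warn when attributes that normally stay at 0 have a non-zero raw value.
# Assumptions: attribute IDs 5, 197 and 198 are picked for illustration, and the
# raw value is column 10 of the `smartctl -A` attribute rows.

# Reads `smartctl -A` text on stdin; exits 1 if any watched raw value is non-zero.
check_smart_zeroes() {
    awk '
        ($1 == 5 || $1 == 197 || $1 == 198) && $10 != 0 {
            printf "WARN: %s raw value is %s\n", $2, $10
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage would be along the lines of `smartctl -A /dev/ad0 | check_smart_zeroes`, with the device name adjusted to match your system.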
-
That is rather high. Good to know that it stopped just like mine did. Did you happen to notice how fast it was increasing?
The drive is very old and was in constant use in a laptop before I "rescued" it for my pfSense box, so I have no idea what the number was when I started using it on my system. I forgot to check what the APM setting was before setting APM to zero, but after setting it to 2, Load_Cycle_Count increased to 483228 in just a few minutes.
-
Sounds about like what I had expected.
I'm working on getting smartmontools and ataidle into 2.0. Someone had already written a GUI for smartmontools, but I'll need to whip something up for ataidle.
-
Here's my production box readout.
It's been up for a few weeks or so. The load cycle count seems low, but look at the Raw_Read_Error_Rate. That seems high.
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 100 046 Pre-fail Always - 68771
2 Throughput_Performance 0x0005 100 100 030 Pre-fail Offline - 12255232
3 Spin_Up_Time 0x0003 100 100 025 Pre-fail Always - 1
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 37
5 Reallocated_Sector_Ct 0x0033 100 100 024 Pre-fail Always - 8589934592000
7 Seek_Error_Rate 0x000f 100 100 047 Pre-fail Always - 520
8 Seek_Time_Performance 0x0005 100 100 019 Pre-fail Offline - 0
9 Power_On_Seconds 0x0032 099 099 000 Old_age Always - 679h+26m+26s
10 Spin_Retry_Count 0x0013 100 100 020 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 37
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 26
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 96
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 42 (Lifetime Min/Max 25/45)
195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 32
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 459931648
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x000f 100 100 060 Pre-fail Always - 384
203 Run_Out_Cancel 0x0002 100 100 000 Old_age Always - 1529023102969
240 Head_Flying_Hours 0x003e 200 200 000 Old_age Always -
-
Looking at that, I'd guess it's a Seagate drive, and the numbers are bogus for those values.
-
Looking at that, I'd guess it's a Seagate drive, and the numbers are bogus for those values.
It's a Fujitsu
-
I haven't seen many of those. They may also throw out bogus numbers, but if they are real, they're worrisome :-)
-
crap!