HDD Crashing or Something Else?
-
The bad thing is that this is with a different motherboard (new port) and a new SATA cable.
-
I'd be inclined to believe that both drives are good…
However, if testing the Intel drive is a bother, you can send it to me FEDEX and I will test it for you for 5 or 10 years and let you know the results.
On a more serious note, I keep my drive health info in a flat-file so that I can check the current values against older values to see if there are very noticeable changes. Interesting though - I've never had a bad SATA cable.
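For anyone who wants to try the flat-file approach, here is a rough sketch of how it could work. It assumes the report text comes from something like `smartctl -A` (ten whitespace-separated columns with the raw value last), and it stores one `name<TAB>raw` pair per line. The function names, the file layout, and the sample row are all made up for illustration; this is not the script I actually use.

```python
from pathlib import Path

# Hypothetical sample in the usual `smartctl -A` table layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       33
"""


def parse_smart_attributes(report: str) -> dict[str, int]:
    """Map attribute name -> raw value from attribute-table text."""
    attrs = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # skip raw values that are not plain integers
    return attrs


def load_snapshot(path: Path) -> dict[str, int]:
    """Read a previously saved flat file; an absent file means no history yet."""
    if not path.exists():
        return {}
    snapshot = {}
    for line in path.read_text().splitlines():
        name, _, raw = line.partition("\t")
        if raw:
            snapshot[name] = int(raw)
    return snapshot


def save_snapshot(path: Path, attrs: dict[str, int]) -> None:
    """Write one 'name<TAB>raw' line per attribute, sorted by name."""
    path.write_text("".join(f"{n}\t{v}\n" for n, v in sorted(attrs.items())))


def changed_attributes(old: dict[str, int], new: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {name: (old_raw, new_raw)} for attributes whose raw value moved."""
    return {n: (old[n], v) for n, v in new.items() if n in old and old[n] != v}
```

Run it once a day: parse the fresh report, compare it against `load_snapshot(...)`, print whatever `changed_attributes` returns, then save the new snapshot. A counter such as UDMA_CRC_Error_Count staying flat between runs is the thing to look for.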
-
The only time I have ever heard of a bad HDD cable (PATA or SATA) was when PC Magazine or someone like that intentionally damaged a cable to see if 5 repair places would properly diagnose it.
At this point it is way too early to tell if changing the cable helped, and looking at the SMART report there are a lot of errors that shouldn't be there. I find it highly unlikely the cable is bad, but it's something easy to swap out and change. It may just be a coincidence and the problem hasn't shown itself yet.
A side note: when the errors start occurring I am unable to log in to pfSense via the browser. It pulls up the page but spits out garbage along the top of the screen and doesn't allow a login.
-
Wouldn't a nice kernel panic be so much more fun than this intermittent failure stuff?
-
That value at 181 looked extremely bad to me but this seems to indicate it's not that bad:
http://forum.crucial.com/t5/Solid-State-Drives-SSD/SSD-SMART-quot-Non-4k-Aligned-Access-quot/td-p/43012
Some attributes seem to be manufacturer specific though. I'd see if you can get a definitive list from Micron (?).
Steve
-
Yeah - I also saw that some manufacturers throw some weirdness for the 181 value. I'm not sure why they do that.
-
Checked SMART this morning. The 181 value is 214751576065, up from 206161510401 yesterday. Larger but not exponentially.
System shows no odd symptoms this morning.
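One possible way to read those two numbers: a SMART raw value is a 48-bit field, and vendor-specific attributes often pack several smaller counters into it. Assuming (and this is only an assumption, not something confirmed for this drive) that attribute 181 holds three 16-bit words, the values split up like this. The helper name `split_words` is just for illustration.

```python
def split_words(raw: int) -> tuple[int, int, int]:
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


yesterday = split_words(206161510401)
today = split_words(214751576065)

print(yesterday)  # (48, 47, 1)
print(today)      # (50, 49, 1)
```

Read that way, two of the words went up by 2 overnight and the third did not move, which is a much smaller change than the difference between the two big decimal numbers suggests.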
-
Yeah - I might be inclined to ignore that field for your SSD.
No increase in UDMA_CRC_Error_Count?
-
UDMA_CRC_Error_Count is the same at 33.
I will continue to monitor the SMART reports daily and report any unusual behavior from the machine.
-
Here is an explanation of the attributes for your SSD:
http://code.google.com/p/hddguardian/wiki/Micron_SMART_attributes
202 - Average lifetime used, seems relevant and yours is still at 0% so no worries there. ;)
Steve
-
Reporting in on this issue. No crashes since the last one reported in this thread. Total uptime now is 44 days. SMART status looks nominal, with UDMA_CRC_Error_Count still at 33.
The only odd thing at this point is that on the local console of the machine I am getting "fpudna in kernel mode!" Googling this for pfSense only brings up 2 hits, and in general it seems to bring up issues from 2006-2007 in FreeBSD and Linux. A brief look at some posts seems to indicate it isn't a problem. I still thought it was worth mentioning. If anyone has any idea what that is or means, by all means educate me. ;D Otherwise everything seems normal.
It would appear to me at this point that it had something to do with the first SATA cable I used. Studying the original cable doesn't reveal any obvious problems, but nothing else has changed so I am assuming that was the cause of all my issues. Hopefully this can remind people when troubleshooting not to overlook cables.
-
Looks like that error is related to the VIA PadLock driver on 64-bit systems. Can you disable PadLock? Or switch to 32-bit?
Steve
-
I could switch to a 32-bit system, but the error doesn't seem to be causing any issues that I have been able to detect. I do have a couple of IPsec tunnels to remote offices, and I assume PadLock helps with the encryption for those. However, if the error indicates that the accelerator feature is not working, then there isn't much point to it. CPU usage typically never exceeds 1-2%, so I may not even need the hardware crypto. Network performance has been impressive with the VIA dual core, with us typically dealing with ~100 Mbps up/down. Granted, I don't have a huge number of users, around 30, but being able to do what I need with a fanless design and tons of processor to spare is pretty neat.
I am still running 2.03 and haven't upgraded to 2.1 yet. Perhaps the upgrade will address the fpudna issue. If not, I don't think I am going to worry about it unless there is a pressing reason to.
I will report any other HDD issues/errors if they happen. I thank everyone for the help and assistance regarding the matter.