ada0 randomly disappears?
-
ZFS setup with two identical 60GB SATA drives, and randomly ada0 just vanishes, nothing anywhere.
Reboot, same issue.
Shut down and restart and ada0 is there again until it decides to poof again.
No SMART errors on the short or long test.
Thoughts... bad SATA port?
bad drive?
I'm stupid?
2.5.2 release (current as of this post)
openvpn_client_export and service_watchdog are the only packages installed.
SMART test for ada0 as follows:
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-STABLE amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: SandForce Driven SSDs
Device Model: KINGSTON SV300S37A60G
Serial Number: 50026B725A080AD0
LU WWN Device Id: 5 0026b7 25a080ad0
Firmware Version: 603ABBF0
User Capacity: 60,022,480,896 bytes [60.0 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
TRIM Command: Available
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Aug 15 21:24:56 2021 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM level is: 254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x02) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7d) SMART execute Offline immediate.
No Auto Offline data collection support.
Abort Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 48) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0025) SCT Status supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate -O--CK 120 120 050 - 0/0
5 Retired_Block_Count PO--CK 100 100 003 - 0
9 Power_On_Hours_and_Msec -O--CK 080 080 000 - 18331h+48m+14.200s
12 Power_Cycle_Count -O--CK 100 100 000 - 184
171 Program_Fail_Count -O-R-- 100 100 000 - 0
172 Erase_Fail_Count -O--CK 100 100 000 - 0
174 Unexpect_Power_Loss_Ct ----CK 000 000 000 - 178
177 Wear_Range_Delta ------ 000 000 000 - 1
181 Program_Fail_Count -O-R-- 100 100 000 - 0
182 Erase_Fail_Count -O--CK 100 100 000 - 0
187 Reported_Uncorrect -O--C- 100 100 000 - 0
189 Airflow_Temperature_Cel ------ 030 052 000 - 30 (Min/Max 19/52)
194 Temperature_Celsius -O---K 030 052 000 - 30 (Min/Max 19/52)
195 ECC_Uncorr_Error_Count --SRC- 120 120 000 - 0/0
196 Reallocated_Event_Count PO--CK 100 100 003 - 0
201 Unc_Soft_Read_Err_Rate --SRC- 120 120 000 - 0/0
204 Soft_ECC_Correct_Rate --SRC- 120 120 000 - 0/0
230 Life_Curve_Status PO--C- 100 100 000 - 100
231 SSD_Life_Left ------ 094 094 011 - 1
233 SandForce_Internal -O--CK 000 000 000 - 15704
234 SandForce_Internal -O--CK 000 000 000 - 17637
241 Lifetime_Writes_GiB -O--CK 000 000 000 - 17637
242 Lifetime_Reads_GiB -O--CK 000 000 000 - 244
244 Unknown_Attribute ------ 091 091 010 - 18612508
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x04 GPL,SL R/O 16 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xb7 GPL,SL VS 16 Device vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log not supported
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Extended offline Completed without error 00% 18331 -
2 Short offline Completed without error 00% 12862 -
3 Short offline Completed without error 00% 12784 -
4 Short offline Completed without error 00% 12651 -
5 Short offline Completed without error 00% 12424 -
6 Short offline Completed without error 00% 12421 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 0 (0x0000)
Device State: Active (0)
Current Temperature: 30 Celsius
Power Cycle Min/Max Temperature: 19/52 Celsius
Lifetime Min/Max Temperature: 19/67 Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 10 minutes
Min/Max recommended Temperature: 0/70 Celsius
Min/Max Temperature Limit: 0/85 Celsius
Temperature History Size (Index): 478 (64)
Index Estimated Time Temperature Celsius
65 2021-08-12 13:50 29 **********
... ..( 8 skipped). .. **********
74 2021-08-12 15:20 29 **********
75 2021-08-12 15:30 30 ***********
... ..( 4 skipped). .. ***********
80 2021-08-12 16:20 30 ***********
81 2021-08-12 16:30 31 ************
... ..( 47 skipped). .. ************
129 2021-08-13 00:30 31 ************
130 2021-08-13 00:40 30 ***********
... ..( 21 skipped). .. ***********
152 2021-08-13 04:20 30 ***********
153 2021-08-13 04:30 29 **********
154 2021-08-13 04:40 30 ***********
155 2021-08-13 04:50 29 **********
... ..( 41 skipped). .. **********
197 2021-08-13 11:50 29 **********
198 2021-08-13 12:00 30 ***********
199 2021-08-13 12:10 30 ***********
200 2021-08-13 12:20 29 **********
201 2021-08-13 12:30 29 **********
202 2021-08-13 12:40 29 **********
203 2021-08-13 12:50 30 ***********
... ..( 14 skipped). .. ***********
218 2021-08-13 15:20 30 ***********
219 2021-08-13 15:30 31 ************
... ..( 13 skipped). .. ************
233 2021-08-13 17:50 31 ************
234 2021-08-13 18:00 ? -
235 2021-08-13 18:10 ? -
236 2021-08-13 18:20 34 ***************
237 2021-08-13 18:30 33 **************
238 2021-08-13 18:40 32 *************
239 2021-08-13 18:50 32 *************
240 2021-08-13 19:00 31 ************
241 2021-08-13 19:10 32 *************
242 2021-08-13 19:20 31 ************
... ..( 8 skipped). .. ************
251 2021-08-13 20:50 31 ************
252 2021-08-13 21:00 32 *************
253 2021-08-13 21:10 32 *************
254 2021-08-13 21:20 31 ************
255 2021-08-13 21:30 32 *************
... ..( 16 skipped). .. *************
272 2021-08-14 00:20 32 *************
273 2021-08-14 00:30 31 ************
... ..( 4 skipped). .. ************
278 2021-08-14 01:20 31 ************
279 2021-08-14 01:30 32 *************
... ..( 5 skipped). .. *************
285 2021-08-14 02:30 32 *************
286 2021-08-14 02:40 31 ************
... ..( 11 skipped). .. ************
298 2021-08-14 04:40 31 ************
299 2021-08-14 04:50 30 ***********
... ..( 35 skipped). .. ***********
335 2021-08-14 10:50 30 ***********
336 2021-08-14 11:00 31 ************
337 2021-08-14 11:10 30 ***********
338 2021-08-14 11:20 31 ************
... ..( 8 skipped). .. ************
347 2021-08-14 12:50 31 ************
348 2021-08-14 13:00 30 ***********
349 2021-08-14 13:10 31 ************
... ..( 16 skipped). .. ************
366 2021-08-14 16:00 31 ************
367 2021-08-14 16:10 32 *************
... ..( 10 skipped). .. *************
378 2021-08-14 18:00 32 *************
379 2021-08-14 18:10 33 **************
380 2021-08-14 18:20 33 **************
381 2021-08-14 18:30 32 *************
382 2021-08-14 18:40 32 *************
383 2021-08-14 18:50 32 *************
384 2021-08-14 19:00 33 **************
385 2021-08-14 19:10 33 **************
386 2021-08-14 19:20 32 *************
... ..( 2 skipped). .. *************
389 2021-08-14 19:50 32 *************
390 2021-08-14 20:00 33 **************
391 2021-08-14 20:10 32 *************
... ..( 11 skipped). .. *************
403 2021-08-14 22:10 32 *************
404 2021-08-14 22:20 31 ************
405 2021-08-14 22:30 31 ************
406 2021-08-14 22:40 31 ************
407 2021-08-14 22:50 32 *************
408 2021-08-14 23:00 31 ************
... ..( 11 skipped). .. ************
420 2021-08-15 01:00 31 ************
421 2021-08-15 01:10 30 ***********
... ..( 42 skipped). .. ***********
464 2021-08-15 08:20 30 ***********
465 2021-08-15 08:30 31 ************
466 2021-08-15 08:40 30 ***********
... ..( 3 skipped). .. ***********
470 2021-08-15 09:20 30 ***********
471 2021-08-15 09:30 31 ************
472 2021-08-15 09:40 31 ************
473 2021-08-15 09:50 30 ***********
474 2021-08-15 10:00 31 ************
475 2021-08-15 10:10 30 ***********
... ..( 47 skipped). .. ***********
45 2021-08-15 18:10 30 ***********
46 2021-08-15 18:20 31 ************
47 2021-08-15 18:30 30 ***********
... ..( 11 skipped). .. ***********
59 2021-08-15 20:30 30 ***********
60 2021-08-15 20:40 31 ************
61 2021-08-15 20:50 32 *************
62 2021-08-15 21:00 31 ************
63 2021-08-15 21:10 30 ***********
64 2021-08-15 21:20 30 ***********
SCT Error Recovery Control command not supported
Device Statistics (GP Log 0x04)
Page Offset Size Value Flags Description
0x01 ===== = = === == General Statistics (rev 2) ==
0x01 0x008 4 184 --- Lifetime Power-On Resets
0x01 0x010 4 18331 --- Power-on Hours
0x01 0x018 6 36988064433 --- Logical Sectors Written
0x01 0x028 6 512520108 --- Logical Sectors Read
0x04 ===== = = === == General Errors Statistics (rev 1) ==
0x04 0x008 4 0 --- Number of Reported Uncorrectable Errors
0x04 0x010 4 0 --- Resets Between Cmd Acceptance and Completion
0x05 ===== = = === == Temperature Statistics (rev 1) ==
0x05 0x008 1 30 --- Current Temperature
0x05 0x010 1 30 --- Average Short Term Temperature
0x05 0x018 1 37 --- Average Long Term Temperature
0x05 0x020 1 63 --- Highest Temperature
0x05 0x028 1 19 --- Lowest Temperature
0x05 0x030 1 48 --- Highest Average Short Term Temperature
0x05 0x038 1 22 --- Lowest Average Short Term Temperature
0x05 0x040 1 40 --- Highest Average Long Term Temperature
0x05 0x048 1 27 --- Lowest Average Long Term Temperature
0x05 0x050 4 0 --- Time in Over-Temperature
0x05 0x058 1 70 --- Specified Maximum Operating Temperature
0x05 0x060 4 0 --- Time in Under-Temperature
0x05 0x068 1 0 --- Specified Minimum Operating Temperature
0x06 ===== = = === == Transport Statistics (rev 1) ==
0x06 0x008 4 155 --- Number of Hardware Resets
0x06 0x010 4 1011 --- Number of ASR Events
0x06 0x018 4 0 --- Number of Interface CRC Errors
0x07 ===== = = === == Solid State Device Statistics (rev 1) ==
0x07 0x008 1 9 --- Percentage Used Endurance Indicator
|||_ C monitored condition met
||__ D supports DSN
|___ N normalized value
Pending Defects log (GP Log 0x0c) not supported
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 3 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 4 Device-to-host register FISes sent due to a COMRESET
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0010 2 0 R_ERR response for host-to-device data FIS, non-CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x0013 2 0 R_ERR response for host-to-device non-data FIS, non-CRC
0x0002 2 0 R_ERR response for data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
Thanks in advance!
-
possible power saving setting I'm missing?
-
SSDs that disappear and then miraculously re-appear if you power cycle them are usually on the way out. I would look at replacing it.
Steve
-
@stephenw10
Yeah, figured as much, just wanted to be sure before I ordered a new break-the-bank $20 60GB SSD, heh.
Thanks Steve!
Well that and the fact I've NEVER had a drive in my entire technical life just play David Blaine on me like that with NO SMART errors.
-
Yeah I've never seen it with a spinning drive but with SSDs I've seen it a few times now.
Usually, with only one drive, you find services start to fail as they can't access it, until there are just errors. Rebooting leaves it unable to boot, but a full power cycle will boot back up like nothing is wrong. It has always failed again though when I have seen that.
It is probably worth re-seating the SATA cables because I have also seen weirdness with loose cables in the past. It doesn't usually recover by power cycling in that case though.
Steve
-
Yeah, cables have actually been changed. I know nothing lasts forever and sometimes things are bad right out of the box. However, this is a first for me.
AND exactly as you described it.
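-
A rough way to check the cable question from a smartctl dump like the one above: the CRC-type rows in the SATA Phy Event Counters log are the ones usually associated with cabling or connector trouble, and in the output posted here they are all 0. Below is a minimal Python sketch, assuming the `smartctl -x` text has already been saved into a string. The function names `parse_phy_counters` and `crc_related` are made up for illustration and are not part of smartmontools, and a clean result here does not prove the drive or the port is healthy.

```python
import re

# Matches SATA Phy Event Counter rows such as:
#   0x0009 2 3 Transition from drive PhyRdy to drive PhyNRdy
# (4-digit hex ID, size, value, description)
PHY_ROW = re.compile(r"^(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+)$")


def parse_phy_counters(text):
    """Return {description: value} for rows after the Phy Event Counters header."""
    counters = {}
    in_section = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("SATA Phy Event Counters"):
            in_section = True
            continue
        if not in_section:
            continue
        m = PHY_ROW.match(line)
        if m:
            counters[m.group(4)] = int(m.group(3))
    return counters


def crc_related(counters):
    """Keep only non-zero counters that mention CRC, skipping the 'non-CRC' rows."""
    result = {}
    for desc, value in counters.items():
        lowered = desc.lower()
        if "crc" in lowered and "non-crc" not in lowered and value > 0:
            result[desc] = value
    return result


# A few rows copied from the smartctl output in the first post.
sample = """SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0009 2 3 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 4 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
"""

counters = parse_phy_counters(sample)
print(crc_related(counters))  # prints {} because every CRC counter is 0
```

An empty result lines up with what was said above: nothing in the counters points at the cable, which fits with the cables having been swapped already.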