Netgate Discussion Forum

    HDD dying although SMART says ok?

    • D
      D0X
      last edited by

      As of this morning, my system log is cluttered with the following:

      Sep 7 07:00:48	kernel		vnode_pager_putpages: residual I/O 4096 at 24
      Sep 7 07:00:48	kernel		vnode_pager_putpages: I/O error 5
      Sep 7 07:00:48	kernel		g_vfs_done():ufsid/5799c06539f8a71c[READ(offset=359244398592, length=32768)]error = 5
      Sep 7 07:00:48	kernel		(ada0:ata0:0:0:0): Error 5, Retries exhausted
      Sep 7 07:00:48	kernel		(ada0:ata0:0:0:0): RES: 51 40 9a 51 d2 29 29 00 00 00 00
      Sep 7 07:00:48	kernel		(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
      Sep 7 07:00:48	kernel		(ada0:ata0:0:0:0): CAM status: ATA Status Error
      Sep 7 07:00:48	kernel		(ada0:ata0:0:0:0): READ_DMA48. ACB: 25 00 8f 51 d2 40 29 00 00 00 40 00
      Sep 7 07:00:43	kernel		(ada0:ata0:0:0:0): Retrying command
      Sep 7 07:00:43	kernel		(ada0:ata0:0:0:0): RES: 51 40 99 51 d2 29 29 00 00 00 00
      Sep 7 07:00:43	kernel		(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
      Sep 7 07:00:43	kernel		(ada0:ata0:0:0:0): CAM status: ATA Status Error
      Sep 7 07:00:43	kernel		(ada0:ata0:0:0:0): READ_DMA48. ACB: 25 00 8f 51 d2 40 29 00 00 00 40 00
      Sep 7 07:00:39	kernel		(ada0:ata0:0:0:0): Retrying command
      Sep 7 07:00:39	kernel		(ada0:ata0:0:0:0): RES: 51 40 98 51 d2 29 29 00 00 00 00
      Sep 7 07:00:39	kernel		(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
      Sep 7 07:00:39	kernel		(ada0:ata0:0:0:0): CAM status: ATA Status Error
      Sep 7 07:00:39	kernel		(ada0:ata0:0:0:0): READ_DMA48. ACB: 25 00 8f 51 d2 40 29 00 00 00 40 00
      Sep 7 07:00:35	kernel		(ada0:ata0:0:0:0): Retrying command
      Sep 7 07:00:35	kernel		(ada0:ata0:0:0:0): RES: 51 40 98 51 d2 29 29 00 00 00 00
      Sep 7 07:00:35	kernel		(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
      Sep 7 07:00:35	kernel		(ada0:ata0:0:0:0): CAM status: ATA Status Error
      Sep 7 07:00:35	kernel		(ada0:ata0:0:0:0): READ_DMA48. ACB: 25 00 8f 51 d2 40 29 00 00 00 40 00
      Sep 7 07:00:32	kernel		(ada0:ata0:0:0:0): Retrying command
      Sep 7 07:00:32	kernel		(ada0:ata0:0:0:0): RES: 51 40 98 51 d2 29 29 00 00 00 00
      Sep 7 07:00:32	kernel		(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
      Sep 7 07:00:32	kernel		(ada0:ata0:0:0:0): CAM status: ATA Status Error
      Sep 7 07:00:32	kernel		(ada0:ata0:0:0:0): READ_DMA48. ACB: 25 00 8f 51 d2 40 29 00 00 00 40 00
      Sep 7 07:00:27	kernel		(ada0:ata0:0:0:0): Retrying command
      Sep 7 07:00:27	kernel		(ada0:ata0:0:0:0): RES: 51 40 96 51 d2 29 29 00 00 00 00
      Sep 7 07:00:27	kernel		(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
      Sep 7 07:00:27	kernel		(ada0:ata0:0:0:0): CAM status: ATA Status Error
      Sep 7 07:00:27	kernel		(ada0:ata0:0:0:0): READ_DMA48. ACB: 25 00 8f 51 d2 40 29 00 00 00 08 00
      

      This repeats itself in the first minute of every hour, until it states 'Retries exhausted'. This system has been running for over 7 years without any trouble; the HDD (a Seagate Momentus 5400.6) has been in there for about 3 years.

      SMART still says the drive is healthy, but I can't see any other reason for these entries. RAM is OK, cables swapped. Nothing else would cause this, right?
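
For readers who want to check what the `RES:` bytes in the log above actually say, here is a minimal Python sketch that decodes them. The bit names are the ones the kernel itself prints (`DRDY SERV ERR`, `UNC`); the byte order of the `RES:` field is my understanding of FreeBSD's CAM output and is an assumption, not something stated in this thread. `decode_res` and `flag_names` are hypothetical helper names.

```python
# Bit names as printed by the FreeBSD kernel in lines such as
# "ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )".
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "SERV",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "ILI"}


def flag_names(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


def decode_res(res_hex):
    """Decode the bytes that follow 'RES:' in a CAM ATA error line.

    Assumed byte order (not stated in the thread): status, error,
    lba_low, lba_mid, lba_high, device, lba_low_exp, lba_mid_exp,
    lba_high_exp, sector_count, sector_count_exp.
    """
    b = [int(x, 16) for x in res_hex.split()]
    if len(b) != 11:
        raise ValueError("expected 11 hex bytes after 'RES:'")
    # 48-bit LBA: the three low bytes, then the three 'exp' bytes.
    lba = (b[2] | (b[3] << 8) | (b[4] << 16)
           | (b[6] << 24) | (b[7] << 32) | (b[8] << 40))
    return {
        "status": flag_names(b[0], STATUS_BITS),
        "error": flag_names(b[1], ERROR_BITS),
        "lba": lba,
    }
```

Calling `decode_res("51 40 9a 51 d2 29 29 00 00 00 00")` on the 07:00:48 line yields status `['DRDY', 'SERV', 'ERR']` and error `['UNC']`, matching the kernel's own text, plus an LBA of `0x29d2519a` under the assumed layout. UNC means the drive reported an uncorrectable data error for that read, which points at the disk surface rather than at RAM or cabling.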

      • W
        whosmatt
        last edited by

        In my experience SMART is hit or miss. I'd trust the syslog messages first. Make a config backup ASAP. Then swap that drive for a known good one (an SSD if you can swing it), do a fresh install, and restore your config.

        • JailerJ
          Jailer
          last edited by

          What is the output of
          ```
          smartctl -a /dev/ada0
          ```
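
When reading that output, the attribute table is usually more telling than the overall health line. Below is a rough Python sketch that pulls out three attributes people commonly check first when a disk returns unreadable sectors. It assumes the usual ten-column table printed by `smartctl -A` and the default smartmontools attribute names; the choice of watched attributes and the function names are mine, not from this thread.

```python
import subprocess

# Attributes commonly checked first for unreadable sectors (my selection).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def parse_attributes(text):
    """Map attribute name -> raw value string from `smartctl -A` output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split(None, 9)
        # Attribute rows have 10 columns and start with a numeric ID.
        if len(fields) == 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9].strip()
    return attrs


def suspicious(attrs):
    """Return watched attributes whose raw value starts with a nonzero integer."""
    hits = {}
    for name in WATCHED:
        if name not in attrs:
            continue
        first = attrs[name].split()[0]
        if first.isdigit() and int(first) > 0:
            hits[name] = int(first)
    return hits


def read_attributes(device="/dev/ada0"):
    """Run smartctl (needs root and smartmontools installed) and parse it."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_attributes(result.stdout)
```

Typical use would be `suspicious(read_attributes("/dev/ada0"))`. An empty result does not clear the drive; it only means these particular counters have not moved yet.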

          • jimpJ
            jimp Rebel Alliance Developer Netgate
            last edited by

            Given the errors you're seeing, odds are high that there is actually a problem with the drive.

            In all the years I've been dealing with SMART, two things have been evident:

            1. SMART is prone to false negatives: just because SMART says a drive is OK doesn't mean it is, especially when it comes to physical defects of various kinds or serious controller problems.

            2. If SMART says a drive has a problem, it has a problem.

            So you can trust that if SMART finds a problem, it's definitely a problem, but if SMART says it's OK, you have more work to do.

            Same with software RAM tests like memtest86.
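
The asymmetric rule above (a SMART failure is conclusive, a SMART pass is not) can be sketched as a small triage function that also looks at the kernel log. This is only an illustration in Python: the function names, the regular expression, and the wording of the verdicts are my own assumptions, and the log format is taken from the lines quoted in the first post.

```python
import re

# Matches the error part of lines like
# "ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )".
UNC_RE = re.compile(r"error: [0-9a-f]{2} \([^)]*\bUNC\b")


def count_log_evidence(lines):
    """Count uncorrectable-read reports and exhausted retries in log lines."""
    unc = sum(1 for line in lines if UNC_RE.search(line))
    exhausted = sum(1 for line in lines if "Retries exhausted" in line)
    return unc, exhausted


def triage(smart_passed, lines):
    """Apply the rule: trust a SMART failure, keep digging after a pass."""
    unc, exhausted = count_log_evidence(lines)
    if not smart_passed:
        return "replace: SMART reports a failure"
    if unc or exhausted:
        return "suspect: SMART passed but the kernel logged read errors"
    return "no evidence: SMART passed and the log is clean, keep monitoring"
```

Fed the log from the first post with `smart_passed=True`, this returns the "suspect" verdict, which is the situation described in this thread.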

            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.