Netgate Discussion Forum

    Notifications for ZFS status

    General pfSense Questions
    • ohmantics

      I'm aware of https://redmine.pfsense.org/issues/9226, but I wanted to raise this because I just recovered from both SSDs in a mirror dying on the same day. There wasn't a peep from the standard pfSense software, nor from that script, which I had configured (and which had previously alerted me to a capacity issue caused by excessive retention from ntopng).

      Not sure what's gone on with the hardware, but one drive's controller doesn't respond and the other drive can't be imported. I've made some recovery attempts with UFS Explorer, but it couldn't see the newest boot environment at all, so I had to revert to a backup and recreate the few config changes I'd made after it.

      The things that script checks for are a great start. Drives going offline or SMART status changes should trigger alerts.
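
A rough sketch of what a SMART check could look like, assuming smartmontools 7.0 or later (which can emit JSON via `smartctl -H -j <device>`). The function name `smart_verdict` is made up for illustration. A drive whose controller no longer responds may produce no `smart_status` field at all, so that case is reported as "unknown" rather than as a pass.

```python
import json


def smart_verdict(report_json: str) -> str:
    """Classify the JSON text printed by `smartctl -H -j <device>`.

    Returns "passed", "failed", or "unknown". "unknown" covers output that
    is not valid JSON or has no smart_status.passed field, which is what an
    unresponsive drive is assumed to look like here.
    """
    try:
        report = json.loads(report_json)
    except json.JSONDecodeError:
        return "unknown"
    if not isinstance(report, dict):
        return "unknown"
    status = report.get("smart_status")
    if not isinstance(status, dict) or "passed" not in status:
        return "unknown"
    return "passed" if status["passed"] else "failed"
```

Anything other than "passed" would be worth an alert, including "unknown".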

      Booting from the pfSense Installer image made me wish for a few things like smartmontools to be in that image to aid with recovery.

      • stephenw10 Netgate Administrator

        If you have SMART enabled and it fails I'd expect to see some sort of alert. Were you able to see anything logged before it failed?

        • ohmantics @stephenw10

          @stephenw10 No emails were emitted; it just stopped routing and didn't come back when rebooted.

          If a drive went offline and the pool changed to DEGRADED, I'd only have learned about it from that script after the next successful run of its weekly cron job.

          Something needs to detect the pool transitioning away from ONLINE or a drive going offline.
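
To make the idea concrete, here is a minimal sketch of a health check that doesn't need a scrub. It assumes `zpool list -H -o name,health` is available and prints one whitespace-separated `name health` pair per line. The function names are invented for this example and are not part of pfSense or the script mentioned above.

```python
import subprocess


def parse_pool_health(listing: str) -> dict[str, str]:
    """Turn `zpool list -H -o name,health` output into {pool: health}."""
    pools = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            pools[fields[0]] = fields[1]
    return pools


def unhealthy_pools(pools: dict[str, str]) -> dict[str, str]:
    """Keep only pools whose health is anything other than ONLINE."""
    return {name: health for name, health in pools.items() if health != "ONLINE"}


def read_pool_health() -> dict[str, str]:
    """Ask ZFS for the current health of every imported pool."""
    result = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        capture_output=True, text=True, check=True,
    )
    return parse_pool_health(result.stdout)
```

Note that a pool which can't be imported won't appear in this output at all, so a missing pool has to be treated as a failure too.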

          • stephenw10 Netgate Administrator

            Hmm, do the drives now fail a SMART check? I guess one doesn't respond at all....

            This seems like something that has been raised before. Unsurprisingly. I thought there was an open ticket for it.....

            • stephenw10 Netgate Administrator

              Oh I'm thinking of the bug you linked. 🤦

              Here this didn't help because weekly tests were not frequent enough. Some active service that triggered on status change would be better, I agree. However, that script looks pretty light apart from the scrub every time it runs. You could run it far more frequently if you removed that.
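
As a rough illustration of a frequent check that only alerts on a change, the sketch below compares the current pool health against the result of the previous run. It is an assumption-laden example, not how the existing script works: the function names and the state file are hypothetical, and `current` is expected to be a `{pool: health}` mapping gathered some other way (for example from `zpool list -H -o name,health`).

```python
import json
from pathlib import Path


def health_transitions(previous: dict, current: dict) -> list:
    """Describe every pool whose health differs from the previous run."""
    changes = []
    for pool in sorted(set(previous) | set(current)):
        old = previous.get(pool)
        # A pool that was known before but is no longer listed is MISSING.
        new = current.get(pool, "MISSING")
        if old is None:
            # First sighting: only worth reporting if it is already unhealthy.
            if new != "ONLINE":
                changes.append(f"{pool}: first seen as {new}")
        elif old != new:
            changes.append(f"{pool}: {old} -> {new}")
    return changes


def check_once(current: dict, state_file: Path) -> list:
    """Compare against the saved state, then save the new state."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    changes = health_transitions(previous, current)
    state_file.write_text(json.dumps(current))
    return changes
```

Run from cron every few minutes, this would send one message per transition instead of repeating the same alert on every run. It is still polling, though, so it narrows the detection window rather than closing it.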

              • ohmantics @stephenw10

                @stephenw10 said in Notifications for ZFS status:

                Here this didn't help because weekly tests were not frequent enough. Some active service that triggered on status change would be better, I agree. However, that script looks pretty light apart from the scrub every time it runs. You could run it far more frequently if you removed that.

                It would be ideal for the base OS to be watching for status change. I just wanted to put that thought out there -- polling consumes extra power, isn't ever frequent enough, etc.

                I'm not sure whether this would have actually helped in my situation, as I don't know if there was enough time between the two drive failures to rebuild the pool.

                • ohmantics @ohmantics

                  While I'm thinking about ZFS-related things: I have just enough modifications to pfSense requiring backup that a built-in ZFS send/recv solution would be a good addition.

                  If you're wondering what those non-packaged mods are:

                  • https://github.com/clara-j/pfsense_zfs_check
                  • wpa_supplicant setup for AT&T fiber
                  • https://github.com/LeonStraathof/pfsense-speedtest-widget

                  I think all of these could be handled if there were a designated directory that was included in backups and/or packed into config.xml.

                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.