Netgate Discussion Forum

    Pfsense hangs after replacing hdd from zfs pool

    General pfSense Questions

    • ashima

      Hello,

      pfSense 2.4.5, ZFS mirror with two hard drives (ada0 and ada1).

      The 2nd hard drive (ada1) failed, so I added a new hard drive.
      I detached the 2nd hard drive and added the new hard drive to the pool.
      After resilvering, I copied the boot code to the new hard drive. Rebooted and the system boots normally. So far so good.

      I then manually removed the 1st (original) drive and tried booting from the new replacement drive. The boot sequence starts, but after "Firewall Configuration done" the system waits forever. Pressing any key simply reboots the system. Putting the 1st drive back boots the system normally.

      I tried replacing the new drive with another drive, adding two more new drives, and upgrading the system to 2.5.2, but no luck: the system boots completely only if the 1st drive is attached.

      Any Pointers ?

      Ashima

      • DaddyGo @ashima

        @ashima said in Pfsense hangs after replacing hdd from zfs pool:

        Any Pointers ?

        Hi,

        I saved (Firefox favourites :)) this when we changed drives in our TrueNAS system, also FreeBSD + ZFS RAID.

        It can help you too:

        https://forums.freebsd.org/threads/replacing-a-failed-drive-in-an-encrypted-zfs-raidz-array-with-both-boot-and-root-pools.65199/

        +++edit:

        minus encrypt 😉

        Cats bury it so they can't see it!
        (You know what I mean if you have a cat)

        • ashima

          Thanks @DaddyGo. I tried the following commands:

          gpart create -s gpt ada1
          gpart add -s 409600 -t efi -l zfsefi ada1
          gpart add -a 4k -s 512k -t freebsd-boot -l gptboot2 ada1
          gpart add -b 411648 -s 4194304 -t freebsd-swap -l zfswap ada1
          gpart add -b 4605952 -s 307974144 -t freebsd-zfs -l zfs2 ada1

          (gpart show reports the same partition layout on ada1 as on ada0)

          zpool attach -f zroot ada0p4 ada1p4

          After 100% resilvering

          gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 ada1

          zpool status --- shows both the drives online.

          Similar steps are shown in the link provided by @DaddyGo
          minus the encryption.

          But the system still starts booting fine and then stops after "Configuring Firewall rules" (after all my WAN interfaces are detected). I fail to understand the issue.

          Any clue?

          Regards,
          Ashima
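
The sector numbers in the gpart commands above can be sanity-checked with a little arithmetic. This is only a sketch: it assumes 512-byte logical sectors (worth confirming with `diskinfo -v ada1`), and the helper names `sectors_to_mib` and `next_free_sector` are made up for illustration.

```shell
#!/bin/sh
# Sketch: check the sector arithmetic behind the gpart commands in the post.
# ASSUMPTION: 512-byte logical sectors; verify with `diskinfo -v ada1`.
SECTOR_BYTES=512

# Convert a sector count to MiB (integer division).
sectors_to_mib() {
    echo $(( $1 * SECTOR_BYTES / 1048576 ))
}

# First sector after a partition that starts at $1 and spans $2 sectors.
next_free_sector() {
    echo $(( $1 + $2 ))
}

echo "efi:  $(sectors_to_mib 409600) MiB"
echo "swap: $(sectors_to_mib 4194304) MiB"
echo "zfs:  $(sectors_to_mib 307974144) MiB"
echo "first sector after swap: $(next_free_sector 411648 4194304)"
```

Under that assumption the swap partition ends exactly where the freebsd-zfs partition begins (sector 4605952), so the layout itself is internally consistent.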

          • DaddyGo @ashima

            @ashima

            Hmmmm,

            Then I have to say that something went wrong when rebuilding the ZFS RAID onto the replaced drive...
            (ZFS RAID normally does this without problems, but there is no way to know for sure)

            since you write that if you boot from the working (original) drive, everything works fine.

            What I'd do is take a full backup, build a ZFS RAID from scratch with two new drives, and restore the saved pfSense config onto it...

            Otherwise you're wasting your time, because I don't think there's much more to try.... 😉 And at least you'll have a new ZFS RAID system on fresh hardware.

            BTW:

            I'm guessing that if you bought the RAID members (HDD / SSD) as a pair, the other one is likely to have similar problems soon...

            • ashima

              @DaddyGo

              Yes, that seems to be the only workaround. I shall try that tomorrow morning (it's 50 past midnight here in India).

              I have this system (now upgraded to 2.5.2) with captive portal and freeradius installed. I have around 250 users created in freeradius, and I was trying to avoid creating these IDs again.

              So what would be the best way to back up these users and restore them on the new system?

              I have a few other packages like openvpn client, shellcmd, sudo, cron and mailreports installed, but these are not much work. I can configure them again if required.

              Thanks for all the input.

              • dotdash @ashima

                @ashima
                If the system was running ZFS under 2.4.x and was updated to 2.5.2, it may be safer to back up and re-install. From my recollection, there were many changes related to ZFS in recent versions. This may cause problems with older ZFS installs.

                • stephenw10 (Netgate Administrator)

                  I'm not aware of anything that would cause a problem specifically but I would still re-install clean coming from 2.4.X if you can.

                  Steve

                  • ashima

                    @dotdash Well, I was facing the issue with 2.4.5, so that's when I decided to upgrade to 2.5.2. But the problem persists.

                    Another thing I noticed: after running "zpool scrub", my original hard drive showed 13 checksum errors, so I just ran "zpool clear" and the errors disappeared.

                    What's surprising is that the boot sequence stops midway at
                    "Configuring Firewall...done" and pressing any key reboots the system. I guess "Generating RRD graph ..." doesn't happen.

                    As suggested by all of you, I am planning to do a fresh installation on the two new drives. Is there any way I can transfer the users (freeradius) to the new system?

                    BTW, I have plenty of pfSense sites which I upgraded from 2.4.5 to 2.5.2 remotely, and all of them are on ZFS. No issues so far. Is there anything I need to take care of for my remaining sites, or should I stick to 2.4.5 for the rest of them?

                    Do share your feedback on restoring freeradius users to a new system.

                    Regards,
                    Ashima
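
The scrub and checksum check described above could be scripted roughly as follows. This is a sketch, not a definitive tool: the helper name `cksum_errors` is hypothetical, and it assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout of `zpool status`. Note that the subcommand which resets the error counters is `zpool clear`.

```shell
#!/bin/sh
# Sketch: list pool members whose CKSUM column is non-zero.
# Reads `zpool status` text on stdin.
# ASSUMPTION: columns are NAME STATE READ WRITE CKSUM, and counts are
# plain integers (very large counts may be abbreviated and are skipped).
cksum_errors() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
         $5 ~ /^[0-9]+$/ && $5 > 0 { print $1, $5 }'
}

# Typical use on the firewall (pool name zroot, as in the thread):
#   zpool scrub zroot                    # start a scrub
#   zpool status zroot | cksum_errors    # review once the scrub finishes
#   zpool clear zroot                    # reset the counters after review
```

Clearing the counters only resets the statistics; checksum errors on the one remaining original disk are worth treating as a warning sign rather than as resolved.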

                    • stephenw10 (Netgate Administrator)

                      The Freeradius user data is all stored in the main config file. When you restore that, all package data should also be restored.

                      Steve
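
Because the package data lives in the main config file, keeping a copy of that one file before reinstalling covers the freeradius users too. A minimal sketch, assuming the usual pfSense location `/cf/conf/config.xml`; the function name `backup_config`, the `/root` destination, and the file naming are illustrative choices, not anything the thread prescribes.

```shell
#!/bin/sh
# Sketch: keep a timestamped copy of the pfSense configuration file.
# ASSUMPTIONS: /cf/conf/config.xml is the usual pfSense location; the
# destination directory and file name pattern are illustrative only.
backup_config() {
    src=${1:-/cf/conf/config.xml}
    dest_dir=${2:-/root}
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_dir/config-$stamp.xml"
    # -p preserves mode and timestamps; print the path only on success.
    cp -p "$src" "$dest" && echo "$dest"
}

# Example: backup_config                      # uses the defaults above
# Example: backup_config /cf/conf/config.xml /mnt/usb
```

The same file can also be downloaded from the GUI under Diagnostics > Backup & Restore, which is usually the simpler route.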

                      • ashima

                        Thanks @stephenw10.

                        So I booted my device with the original drive and took a Cloud Backup (thanks to the pfSense team for providing this facility).

                        I hooked up two new drives, installed a fresh copy, downloaded all the packages and restored the configuration from the backup. OMG, the device was just like the original: all my settings and my user IDs were there. But then I couldn't log in through the captive portal page (using freeradius2 for authentication). I guess it's not authenticating or there's some SSL issue. I couldn't get enough time to solve it. I'll start a new thread if I fail to do so.
                        Meanwhile, if someone has any pointers for the freeradius2 issue, please do reply. That'll save me a lot of work on Monday morning.

                        Regards,
                        Ashima

                        • DaddyGo @ashima

                          @ashima said in Pfsense hangs after replacing hdd from zfs pool:

                          some SSL issue.

                          That would be my bet 😉

                          And in that case you are not in as much trouble as you think.

                          Good luck with it!

                          • ashima

                            @DaddyGo Thanks for the response.

                            So if it's an SSL issue...
                            I need to create a new freeradius certificate and point the freeradius2 configuration (the EAP tab, I guess) to the new certificate.

                            Is that right?

                            • stephenw10 (Netgate Administrator)

                              It depends how/why it's failing. If that config works on the old device I would expect it to work here too. The MAC addresses will be different so clients may see it as a new network. Perhaps they are simply using the wrong stored logins?

                              First make sure you can authenticate against Freeradius from Diagnostics > Authentication.

                              Steve

                              • ashima

                                @stephenw10 ... it finally worked.

                                Created a new CA and certificates for Freeradius.
                                Created a new CA and certificates for Captive Portal.

                                What actually worked in the end:
                                System > User Manager > Authentication Servers: selected the RADIUS server and saved it again.

                                And everything started working. Keeping it under testing (fingers crossed).

                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.