Netgate Discussion Forum

    Fresh load, minimal tweaks, idle then catastrophe

    General pfSense Questions
    • ultraSilence

      So far, this has happened twice. The second time was with different drives. I'm hoping if I tell enough, someone will key up and tell me where a possible weak link is.

      I took a spare 1U chassis (Supermicro X9DRL-7F, dual E5-2603 v2, 32 GB RAM) and loaded pfSense. I have an additional 1 Gb Ethernet adapter with 2 ports, because I like to party.

      It seems like a bit overkill, but the system was handy.

      I put the drives in a RAID 1 array using the built-in SAS controller. I know the BSD world loves ZFS and ZFS hates hardware RAID, but this was a simple load and I pretended ZFS didn't exist for either attempt.

      I set up the machine, things look good, I add a VPN client to get familiar with routing subnets over different vpn connections. I set it aside and do other things while it burns in. After a few weeks, the machine reboots and goes into endless boot loops.

      I have seen mentions of filesystem corruption and instructions to boot into single user mode to repair the filesystem. Unfortunately I could not get that far; it would crash before reaching that point. I couldn't catch much from the IPMI interface before it rebooted.

      After the first time this happened and I gave up, I simply reloaded the system with new drives and it worked fine... until last night.

      There very well could be something up with the hardware, but it seems a bit weird that I can reload the system and see no symptoms until a certain amount of time has passed. Are there known issues with certain configurations that act as time bombs like this?

      Unless someone has some good ideas, the next thing I will do is disable the internal SAS controller and wire into the SATA ports. I'll probably build a spare and keep it on standby until this one croaks again.

      • heper @ultraSilence

        @ultrasilence said in Fresh load, minimal tweaks, idle then catastrophe:

        After a few weeks, the machine reboots and goes into endless boot loops.

        why did it reboot ?

        • stephenw10 (Netgate Administrator)

          Yeah, impossible to say without seeing a crash report of some kind.

          But, yes, hardware raid controllers can suck under pfSense and are best avoided if possible.

          Steve

          • ultraSilence @heper

            @heper said in Fresh load, minimal tweaks, idle then catastrophe:

            @ultrasilence said in Fresh load, minimal tweaks, idle then catastrophe:

            After a few weeks, the machine reboots and goes into endless boot loops.

            why did it reboot ?

            That unfortunately is an answer I do not have. It only had one PC routing through it, and that PC was offline.

            I'm resisting the urge to put on an OS that I am more familiar with.

            • ultraSilence @stephenw10

              @stephenw10 said in Fresh load, minimal tweaks, idle then catastrophe:

              Yeah, impossible to say without seeing a crash report of some kind.

              But, yes, hardware raid controllers can suck under pfSense and are best avoided if possible.

              Steve

              The heat sink on the LSI 2208 is pretty hot to the touch, so I have disabled it and will move things over to the SATA ports.

              I don't often encounter hardware failures that will destroy operating systems.

              Time for round 3. If it happens again, are there any "paint by numbers" guides for retrieving logs externally?

              • stephenw10 (Netgate Administrator)

                The best thing you can do is hook up a serial console and log its output to something locally.
                If it is a drive or drive controller failure, the firewall may not be able to record that event itself, but it will spew a load of errors to the console.
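As a rough illustration of the serial-logging idea, here is a minimal Python sketch that copies console output to a timestamped log on a second machine. It is only a sketch under assumptions: `log_console` is a hypothetical helper name, and the stream is anything with a blocking `readline()` that returns bytes, for example a port opened with the third-party pyserial package. The device path and the 115200 baud rate in the comment are assumptions to check against your own setup.

```python
import io
import time
from typing import BinaryIO, Callable, TextIO


def log_console(stream: BinaryIO, out: TextIO,
                clock: Callable[[], float] = time.time) -> int:
    """Copy console output to `out`, one timestamped line at a time.

    Stops when `stream.readline()` returns b"" (EOF) and returns the
    number of lines written.
    """
    count = 0
    for raw in iter(stream.readline, b""):
        # Console output during a crash may contain garbage bytes,
        # so replace anything that is not valid UTF-8.
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        out.write(f"{stamp} {text}\n")
        out.flush()  # flush every line so nothing is lost if the logger is killed
        count += 1
    return count


# Possible use with pyserial (assumed installed; path and speed are assumptions):
#   import serial
#   port = serial.Serial("/dev/ttyUSB0", 115200, timeout=None)  # blocking reads
#   with open("console.log", "a", encoding="utf-8") as fh:
#       log_console(port, fh)
```

Flushing after every line matters here, because the interesting output is whatever arrives just before the firewall reboots.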

                The next best thing is to set up log exporting via syslog:
                https://docs.netgate.com/pfsense/en/latest/monitoring/logs/remote.html
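If no syslog server is running yet, a throwaway receiver is enough to catch the last messages before a crash. The following is a minimal sketch, not a replacement for a real syslog daemon: it assumes the firewall sends plain UDP syslog, `make_receiver` and `SyslogHandler` are hypothetical names, and port 5514 is an arbitrary unprivileged port chosen for illustration (the standard syslog port 514 normally needs root).

```python
import socketserver
import time


class SyslogHandler(socketserver.BaseRequestHandler):
    """Append every received datagram to the server's log file."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        message = data.decode("utf-8", errors="replace").rstrip("\x00\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        line = f"{stamp} {self.client_address[0]} {message}\n"
        # Reopen and close per message so each line reaches disk promptly.
        with open(self.server.log_path, "a", encoding="utf-8") as fh:
            fh.write(line)


def make_receiver(log_path, host="0.0.0.0", port=5514):
    """Build a UDP server that logs incoming syslog datagrams to log_path."""
    server = socketserver.UDPServer((host, port), SyslogHandler)
    server.log_path = log_path  # read by SyslogHandler.handle
    return server


# To run it on the logging machine:
#   make_receiver("pfsense-remote.log").serve_forever()
```

The remote log server setting on the firewall would then point at this machine's address and the chosen port. Messages are stored as received, including the syslog priority prefix, so nothing is lost to parsing.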

                Steve

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.