Netgate Discussion Forum

    Random crashes

    General pfSense Questions
    11 Posts 3 Posters 1.8k Views
    • MaxPF

      I installed 2.3.4 and then upgraded to 2.3.4-p1 on a Shuttle DS68U:

      Celeron 3855U
      8GB DDR3
      Intel i211 (igb) and i219m (em) NICs
      128GB NVMe drive

      Everything is working, but I started experiencing random crashes and reboots on average once a day.

      There is nothing in /var/crash and other logs don't show any error. Other than that I am running the same packages I was running on my previous hardware (an old Dell laptop) which never crashed:

      Acme
      Avahi
      ntopng
      openvpn-client-export
      pfBlockerNG
      Service Watchdog
      Status traffic Totals
      System patches

      The 128GB NVMe M.2 drive initially caused the automatic installer to fail, but I managed to work around that and complete the installation. Could NVMe compatibility be causing the instability? I am tempted to remove the NVMe drive and try a fresh install on a regular HD using the same configuration.

      • MaxPF

        I removed the NVMe drive, installed a standard 2.5" HD, reinstalled 2.3.4, updated it to 2.3.4-p1 and installed the same packages. After restoring the config file everything worked as expected, but this morning I checked the uptime and it looks like it rebooted itself again around 4:40 am.

        It always happens within 36 hours, and again I can't see anything in the logs :(

        I'm going to remove packages one at a time and see what happens. I removed ntopng and manually rebooted. The next step will be disabling the traffic shaper, and after that an overnight memtest on the box.

        • MaxPF

          After more troubleshooting and keeping a serial console always connected I was able to determine that the crashes are caused by the HD getting detached for some unknown reason:

          ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
          ada0: <ST9160310AS DE06> s/n 5GV6HR23D detached
          (ada0:ahcich0:0:0:0): Periph destroyed
          /: got error 6 while accessing filesystem
          

          I could try a different HD, but I think the current one is good as it was working just fine on my previous pfSense box (laptop). Also, I don't know for sure if the same thing was happening with the nvme drive since I did not have a console connected at the time.

          What I find interesting is that it always happens between 24 and 28 hours of uptime. Once 24 hours have passed I know that a crash is imminent.
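
For anyone capturing serial console output to a file, the detach sequence quoted above can be picked out automatically. The following is a minimal Python sketch, not a definitive tool: the regular expressions are modeled only on the three console lines in this post, the function name `find_disk_detach_events` is hypothetical, and other drivers or FreeBSD versions may word these messages differently.

```python
import re

# Patterns modeled on the console excerpt in this thread (assumption:
# other devices and kernel versions may print different wording).
DETACH_RE = re.compile(r"^\s*(?P<dev>[a-z]+\d+): .*\bdetached\s*$")
DESTROYED_RE = re.compile(r"^\s*\((?P<dev>[a-z]+\d+):[^)]*\): Periph destroyed\s*$")
FS_ERROR_RE = re.compile(
    r"^\s*(?P<mount>/\S*): got error (?P<errno>\d+) while accessing filesystem"
)

PATTERNS = (
    ("detached", DETACH_RE),
    ("destroyed", DESTROYED_RE),
    ("fs_error", FS_ERROR_RE),
)


def find_disk_detach_events(lines):
    """Return (line_number, kind, details) for each line that looks like
    part of a disk-detach sequence. Line numbers start at 1."""
    events = []
    for lineno, line in enumerate(lines, start=1):
        for kind, pattern in PATTERNS:
            match = pattern.match(line)
            if match:
                events.append((lineno, kind, match.groupdict()))
                break  # at most one event kind per line
    return events
```

Feeding it the lines of a saved console log (for example `open(path).read().splitlines()`) returns one tuple per matching line, which makes it easier to line up detach events with timestamps or uptime when the box reboots unattended.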

          • MaxPF

            Just happened again, after about 26 hours, with the same storage detached issue.
            Is there anything at the OS level that would cause the storage to get detached maybe after inactivity?
            This is also a Skylake-based CPU/chipset. Are there any known compatibility issues?

            I'll try a different HD, but I'm not confident it will change anything. After that I might consider giving 2.4 a try… I'm really puzzled, because when it works it works very well... So frustrating

            • MaxPF

              Unfortunately I have to report that things have not improved for me. I tried two standard HDs, switched to 2.4, and tried the NVMe drive in EFI mode, which the installer now supports. I replaced the RAM with the specific brand and model on the DS68U compatibility list, even though the existing RAM passed memtest86 with flying colors.

              Nothing! Every day, between 24 and 31 hours (a new record) of uptime, the box just spontaneously reboots. It can be in the middle of the night or any time during the day, but never before reaching at least 24 hours of uptime. When it runs, it works great: performance is good, load is low, temps are stable around 34C, and SMART reports all good. I am at a loss…

              I will have to schedule a daily reboot at night with cron. At this point I don't know what else to do.

              • buttabean

                Did you ever figure this out? I converted my old Skylake-based server into a router and I'm having the same issue. I swapped out the SSD, thinking that was the cause. I then changed the BIOS SATA controller to IDE mode and it seemed stable for a long time (a full month without the drive dismounting). It seems totally random; sometimes I can't go a few hours without having an issue. Tomorrow I'm going to try a fresh install on a USB drive instead.

                • stephenw10 (Netgate Administrator)

                  You see a crash report?

                  Anything on the console?

                  If it reboots at random with nothing logged it's almost certainly hardware.

                  Steve

                  • buttabean

                    @stephenw10

                    It's the "Periph destroyed" error, with the SSD dismounting. I've been running a USB drive install since this morning without any issues so far.

                    • stephenw10 (Netgate Administrator)

                      Be sure you don't have swap (or at least are not swapping) and have moved /var and /tmp to RAM if running from flash.

                      Also check the root is mounted noatime.

                      [2.4.4-RELEASE][admin@fw1.stevew.lan]/root: mount -p
                      /dev/diskid/DISK-9E18E959s2a /			ufs	rw,noatime 	1 1
                      devfs			/dev			devfs	rw		0 0
                      /dev/diskid/DISK-9E18E959s1 /boot/u-boot		msdosfs	rw,noatime 	0 0
                      /dev/md0		/tmp			ufs	rw		2 2
                      /dev/md1		/var			ufs	rw		2 2
                      devfs			/var/dhcpd/dev		devfs	rw		0 0
                      

                      Steve
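
The noatime check described in this post can also be scripted. The sketch below is a hedged illustration in Python rather than anything pfSense ships: it takes the text printed by `mount -p` (as shown above) and lists UFS mount points whose options lack `noatime`. The function names and the `skip_memory_disks` parameter are hypothetical, and treating any device under `/dev/md` as a RAM disk is an assumption based on the output quoted here. It does not check swap, since swap devices do not appear in `mount -p` output.

```python
def parse_mount_p(text):
    """Parse fstab-style text (as printed by `mount -p`) into dicts."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab-style entry
        device, mountpoint, fstype, options = fields[:4]
        entries.append({
            "device": device,
            "mountpoint": mountpoint,
            "fstype": fstype,
            "options": options.split(","),
        })
    return entries


def mounts_missing_noatime(text, fstypes=("ufs",), skip_memory_disks=True):
    """Return mount points of the given filesystem types that are mounted
    without noatime. Devices under /dev/md are assumed to be RAM disks
    and are ignored unless skip_memory_disks is False."""
    missing = []
    for entry in parse_mount_p(text):
        if entry["fstype"] not in fstypes:
            continue
        if skip_memory_disks and entry["device"].startswith("/dev/md"):
            continue
        if "noatime" not in entry["options"]:
            missing.append(entry["mountpoint"])
    return missing
```

Given the listing above, where root is mounted `rw,noatime`, the function returns an empty list; a root mounted plain `rw` would be reported as `/`.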

                      • buttabean

                        @stephenw10 said in Random crashes:

                        noatime

                        /dev/gptid/ba785815-9ce5-11e9-8bc0-90e2ba09f08c /			ufs	rw		1 1
                        devfs			/dev			devfs	rw		0 0
                        /dev/md0		/tmp			ufs	rw		2 2
                        /dev/md1		/var			ufs	rw		2 2
                        devfs			/var/dhcpd/dev		devfs	rw		0 0
                        

                        It seems swap is enabled. I was following this tutorial on disabling swap: https://forum.netgate.com/topic/107375/howto-remove-swap-post-install-and-resize/2

                        /dev/gptid/ba785815-9ce5-11e9-8bc0-90e2ba09f08c	/	ufs	rw	1	1
                        #/dev/gptid/ba7d42de-9ce5-11e9-8bc0-90e2ba09f08c	none	swap	sw	0	0
                        
                        

                        Should I change the "ba785815-9ce5-11e9-8bc0-90e2ba09f08c" to something else before rebooting? I'm pretty sure it's good, but I wanted to double-check before rebooting.

                        • stephenw10 (Netgate Administrator)

                          Hmm, I've never tried that. I would back up the config, reinstall, and remove the swap during the install.

                          You should edit the fstab to set it to mount root noatime though if it's not already.

                          Steve
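
As a rough illustration of the fstab change suggested here, this Python sketch adds `noatime` to the options of the UFS root entry in fstab-formatted text. It is only a sketch under stated assumptions: the function name `add_noatime_to_root` is hypothetical, it works on a string rather than touching `/etc/fstab`, and it rewrites the matched line with single tabs between fields. Keep a backup of the real file before applying any edit by hand.

```python
def add_noatime_to_root(fstab_text):
    """Return fstab_text with noatime appended to the options of the
    UFS entry mounted at /. Comments and other entries are unchanged."""
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        is_entry = bool(stripped) and not stripped.startswith("#")
        if is_entry and len(fields) >= 4 and fields[1] == "/" and fields[2] == "ufs":
            options = fields[3].split(",")
            if "noatime" not in options:
                options.append("noatime")
                fields[3] = ",".join(options)
                # Rebuild the entry with tab-separated fields.
                line = "\t".join(fields)
        out.append(line)
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    if fstab_text.endswith("\n"):
        result += "\n"
    return result
```

Applied to a root line whose options field is just `rw`, the result has `rw,noatime` in that field, while a commented-out swap line passes through untouched. Running it a second time makes no further change.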

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.