Netgate Discussion Forum

    pfSense on a SuperMicro Atom Server Randomly Rebooting

    Problems Installing or Upgrading pfSense Software
    25 Posts 7 Posters 9.9k Views
      tim.mcmanus

      Can you ssh into the box?

      If you can, go into the shell and type in clog /var/log/system.log and post the logs from just prior to the reboot and following it.

      I wrote a post with more info regarding grabbing logs here.  Posting log info usually gets the problem identified very quickly.

        tjgertge

        Thanks for the info guys.  I will work on getting 2.1 installed in the next day or two and see what happens from there.  I'll know relatively quickly if it is going to work or not, and will post what I find from there.

        Just out of curiosity... any idea when 2.1 is going to move to stable?

          tjgertge

          Well it looks like I am in the same boat with 2.1.  Here's the syslog right before and after the reboot.  Sure doesn't look like anything is getting logged.

          Feb 21 19:56:16 atlas check_reload_status: Syncing firewall
          Feb 21 19:56:49 atlas check_reload_status: Syncing firewall
          Feb 21 19:56:53 atlas php: /snort/snort_alerts.php: Checking for and disabling any rules dependent upon disabled preprocessors for WAN…
          Feb 21 19:57:33 atlas check_reload_status: Syncing firewall
          Feb 21 19:57:37 atlas php: /snort/snort_alerts.php: Checking for and disabling any rules dependent upon disabled preprocessors for WAN...
          Feb 21 20:02:02 atlas syslogd: kernel boot file is /boot/kernel/kernel
          Feb 21 20:02:02 atlas kernel: Copyright (c) 1992-2012 The FreeBSD Project.
          Feb 21 20:02:02 atlas kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
          Feb 21 20:02:02 atlas kernel: The Regents of the University of California. All rights reserved.
          Feb 21 20:02:02 atlas kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
          Feb 21 20:02:02 atlas kernel: FreeBSD 8.3-RELEASE-p6 #1: Thu Feb 21 11:33:28 EST 2013
          Feb 21 20:02:02 atlas kernel: root@snapshots-8_3-amd64.builders.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8 amd64
          Feb 21 20:02:02 atlas kernel: Timecounter "i8254" frequency 1193182 Hz quality 0
          Feb 21 20:02:02 atlas kernel: CPU: Intel(R) Atom(TM) CPU D525   @ 1.80GHz (1807.21-MHz K8-class CPU)
          Feb 21 20:02:02 atlas kernel: Origin = "GenuineIntel"  Id = 0x106ca  Family = 6  Model = 1c  Stepping = 10
          Feb 21 20:02:02 atlas kernel: Features=0xbfebfbff <fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,htt,tm,pbe>
          Feb 21 20:02:02 atlas kernel: Features2=0x40e31d <sse3,dtes64,mon,ds_cpl,tm2,ssse3,cx16,xtpr,pdcm,movbe>
          Feb 21 20:02:02 atlas kernel: AMD Features=0x20100800 <syscall,nx,lm>
          Feb 21 20:02:02 atlas kernel: AMD Features2=0x1 <lahf>
          Feb 21 20:02:02 atlas kernel: TSC: P-state invariant
          Feb 21 20:02:02 atlas kernel: real memory  = 8589934592 (8192 MB)
          Feb 21 20:02:02 atlas kernel: avail memory = 8244371456 (7862 MB)
          Feb 21 20:02:02 atlas kernel: ACPI APIC Table: <022112 APIC1550>
          Feb 21 20:02:02 atlas kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
          Feb 21 20:02:02 atlas kernel: FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 HTT threads
          Feb 21 20:02:02 atlas kernel: cpu0 (BSP): APIC ID:  0
          Feb 21 20:02:02 atlas kernel: cpu1 (AP/HT): APIC ID:  1
          Feb 21 20:02:02 atlas kernel: cpu2 (AP): APIC ID:  2
          Feb 21 20:02:02 atlas kernel: cpu3 (AP/HT): APIC ID:  3

          Crash happened between 19:57 and 20:02
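
For anyone trying to pin down a silent reboot window like the one above, here is a minimal sketch in Python that scans saved syslog text for the `syslogd: kernel boot file is` line and reports the timestamp of the last entry before it. The function names and the `year` parameter are illustrative assumptions (syslog lines of this style do not record the year), and it expects the log to have been dumped to plain text first, for example with the `clog` command mentioned earlier in the thread.

```python
from datetime import datetime

# Marker taken from the log excerpt in this thread: the first line
# syslogd writes after the system comes back up.
BOOT_MARKER = "syslogd: kernel boot file is"


def parse_stamp(line, year=2013):
    """Parse the leading 'Feb 21 19:57:37' stamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_reboot_gaps(lines, year=2013):
    """Return (last_entry_before_boot, boot_entry) timestamp pairs."""
    gaps = []
    previous = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            stamp = parse_stamp(line, year)
        except ValueError:
            # Skip anything that does not start with a syslog timestamp.
            continue
        if BOOT_MARKER in line and previous is not None:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps


if __name__ == "__main__":
    sample = [
        "Feb 21 19:57:37 atlas php: /snort/snort_alerts.php: Checking ...",
        "Feb 21 20:02:02 atlas syslogd: kernel boot file is /boot/kernel/kernel",
    ]
    for before, after in find_reboot_gaps(sample):
        print(f"No log entries between {before} and {after} ({after - before})")
```

If nothing at all is logged in the window before the boot marker, as here, that points away from an orderly shutdown or a logged panic, which is consistent with the hardware suspicion raised in the replies.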

            cmb

            I'm on a customer network at a hotel running that exact same hardware right now with 80-some active users. That platform is widely used with factory defaults. Still no crash report from the sounds of it? Definitely, without question, a hardware problem of some sort if you're still not getting a crash report.

              tjgertge

              Still no crash report, and still nothing logged in the BIOS log.

              I've run Untangle, and most recently ClearOS, on this box for about 5 months with no crashes.  I'd just rather run pfSense.

              So you think it's something faulty?  I suppose I could see if SuperMicro would be willing to replace the system board.

              I've tried each drive individually, so I don't think it's the drives.  And you said earlier you didn't think it was RAM, because there was no crash report.

              In your similar setups are you using dual hard drives in a RAID array? 
              The only thing I have changed from defaults in BIOS is the IDE/SATA Config.  I have it set as follows:
              Configure Sata#1 as: RAID
              ICH Raid CodeBase: Adaptec

              If memory serves me right, I tried the CodeBase as Intel, and it wouldn't even see the raid volume.

                tjgertge

                Ok, thought I'd provide an update…

                So I stumbled upon SuperMicro's supported OSes page.  Supposedly FreeBSD is supported, but not the onboard RAID.
                http://www.supermicro.com/support/resources/OS/Atom.cfm

                So I installed pfSense setting up a gmirror.  That didn't seem to solve it.

                So I started wondering if it had something to do with ACPI.  Looking in BIOS it was set to ACPI version 2.0.  I switched it to 3.0.  It's been up for about 24 hours now, so I'm cautiously optimistic now. It would never make it a full 24 hours before.

                  tjgertge

                  Spoke too soon, rebooted overnight.  Back to the drawing board.

                    tim.mcmanus

                    Can you bypass the RAID on the motherboard and directly connect to an IDE/SATA port?

                      tjgertge

                      Turning RAID on and off is just a BIOS setting, no jumpers or anything on the board for it.  SATA ports are the same, there aren't special ones for the RAID.  According to SuperMicro, AHCI mode is supported, which is what I have it on now.  I'm accomplishing the RAID with gmirror now.

                      I just swapped the RAM out for a different brand that I happened to have, so I'm going to give this a go now and see what happens.  I went from 8GB of Crucial RAM to 8GB of Hynix RAM that I had left over from a RAM upgrade on my laptop.

                      I just can't think of what would be physically wrong with the board to only give me grief in FreeBSD but work fine under the Linux-based systems I ran before.  But if the RAM doesn't do it, I think my only other option is to see if SuperMicro will send me another board.  I just don't know if they will.

                        tjgertge

                        Well it's not the RAM.  It rebooted in less than three hours this time.

                        I've submitted an RMA to SuperMicro, hopefully they'll send a replacement.

                          tjgertge

                          SuperMicro is shipping a new system board.  Hope to have it in a day or two.

                            tjgertge

                            New system board is in, so we'll soon see if this is the answer.

                            Interesting side note.  It's been up for about 3 hours now.  The CPU is running about 10 degrees cooler than on the other board.  It was never close to overheating.  I just thought it was noteworthy that the new one is running cooler.

                              cheddarlump

                              Running 2 of those same boxes here with 2.0.2 amd64 on them.

                              Never had any issues..  I did read however that a lot of people had temperature issues with them, and to fix them they taped the vents on the FRONT (?!) closed so it forced air in from the back, across the passive CPU cooler, and out through the power supply.  Seemed weird, but they say that the temps drop more than 10 degrees C when they do that..

                              Mine run in the mid 50's, and the chips are specced up to 100 degrees C, so I'm gonna leave mine alone for now.

                              They're surprisingly fast squid boxes when paired with an intel or samsung SSD.. :)  Nice low budget really fast router.

                                tjgertge

                                The CPU temp on the new system board is sitting at 55-56 degrees Celsius.  On the old one it was always 65-70 degrees Celsius.  So it never got to the point of overheating, but the difference tells me that there was definitely something going on.

                                It's been up 1.5 days now without rebooting, so I'm optimistic.

                                  tjgertge

                                  Four days and counting.  I think it's solved.

                                    bryanj0207

                                    Curious…how did you end up accomplishing RAID on the replacement board?  I understood where you switched to gmirror; did you revert to the onboard RAID again with the new board?

                                      tjgertge

                                      @bryanj0207:

                                      Curious…how did you end up accomplishing RAID on the replacement board?  I understood where you switched to gmirror; did you revert to the onboard RAID again with the new board?

                                      I did revert to the onboard RAID on the new board and it is working perfectly. I have it set to RAID mode in the BIOS and the code base is set to Adaptec.

                                        serj999

                                        We have 2 boxes with this Supermicro board running pfSense 2.0.1. For one year there were no problems at all: no restarts, very stable. Since this morning we have had the same problem with both boxes: sudden reboots and weird HDD problems (the system says the disk is full when it isn't, while fsck, SMART, and the vendor tools all say the HDDs are fine). RAM is OK. CPU temp is 38 degrees Celsius on one box and 21 on the other. It looks like something with the cable or the SATA controller. Each box runs with one HDD, no RAID.

                                        For 3 months I had the same issue with other boxes (exactly the same hardware configuration) at my customer's datacenter. Everything ran well for about 6 months, and then suddenly both boxes had weird problems, reboots and so on. That problem was solved by changing the SATA cables and swapping the HDDs from WD Scorpio Black 160 GB 2.5" to Seagate. They have now been running for 4 months with no problems.

                                        And today these problems showed up in our infrastructure too :( There was an IPMI update on the Supermicro website; it is now installed and we are waiting to see if it helps. Box #2 has been OK for 5 hours since the update; box #1 was updated 1 hour ago with no crashes so far.

                                        I'll keep you updated if anything goes wrong, but for me now: no more of these boards with pfSense.

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.