Netgate Discussion Forum

    pfSense Not Responding - What to do before power cycle?

    General pfSense Questions
    • mevans336

      My pfSense box randomly loses all network connectivity: the web interface becomes unreachable and SSH stops responding. The box is in a remote data center. Before I have the data center power cycle it, should I have them connect to the console and see if there is any output on the screen? And once the box has been power cycled, what can I do to determine the cause of the lockups?

      • dotdash

        If possible, you want to see if there is anything on the screen. If you can reboot from the console, that is better than power-cycling the box. Afterwards, I would check the error logs, historical traffic, state table, CPU usage, etc.

        • mevans336

          My provider consoled the box and here is what they said:

          The console showed an error message in Hex, and the words "BTX Halted". It appeared to be a few lines after boot, as the pfsense bootloader was still visible towards the top of the screen.

          That sounds like the box rebooted itself and then had an error at boot. I've looked at the logs in /var/log but they all seem to have reset themselves after reboot. Other than syslog, is there anything I can do to determine the cause of the reboot or why it may have halted?
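Since the local logs do not survive the reboot, one way to keep them is to collect them on another machine. The following is a minimal sketch of a UDP syslog collector, assuming the firewall has been configured to forward its syslog messages to the host running it. The file name, listen address, and port are hypothetical placeholders (standard syslog uses UDP port 514, which usually requires elevated privileges to bind).

```python
import socketserver
from datetime import datetime

LOG_PATH = "pfsense-remote.log"  # hypothetical output file
LISTEN_ADDR = ("0.0.0.0", 5514)  # hypothetical; standard syslog is UDP 514


def format_record(payload: bytes, sender_ip: str, received_at: datetime) -> str:
    """Build one log line: receive time, sender address, raw syslog text."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{received_at.isoformat(timespec='seconds')} {sender_ip} {text}\n"


class SyslogHandler(socketserver.BaseRequestHandler):
    """Appends every received datagram to LOG_PATH."""

    def handle(self) -> None:
        # For UDP servers, self.request is a (data, socket) pair.
        payload = self.request[0]
        line = format_record(payload, self.client_address[0], datetime.now())
        with open(LOG_PATH, "a", encoding="utf-8") as log_file:
            log_file.write(line)


def serve_forever() -> None:
    """Listen for syslog datagrams until interrupted."""
    with socketserver.UDPServer(LISTEN_ADDR, SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve_forever()` starts the collector. The last lines written before an outage are then still available after the firewall reboots, stamped with the collector's own clock.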

          • cmb

            some info here:
            http://doc.pfsense.org/index.php/Unexpected_Reboot_Troubleshooting

            • mevans336

              @cmb:

              some info here:
              http://doc.pfsense.org/index.php/Unexpected_Reboot_Troubleshooting

              Thanks for that link. Installing the developer's kernel will be problematic at best, but I know my data center has an IP KVM they can attach. I'll see what I can do.

              In the meantime, I remember that when I built this box a week or so ago, everything was fine when I had 2GB of RAM installed. I dropped 4GB in (the 2GB DIMM heatsinks were too tall for the 1U chassis cover to close) and upon reboot, I received some sort of ACPI crash. I disabled ACPI by using hint.acpi.0.disabled="1" in /etc/loader.conf and everything seemed fine. I pounded on the box for 48 hours or so with iperf in a loop with no issues. Does pfSense 1.2.3-RC3 have any issues with 4GB of RAM and the SMP kernel that could cause crashes?

              I'm also not ruling out that the RAM could be faulty, although it has worked for several months in another machine here at the house. I would like to rule everything else out before I pay the data center guys to swap the RAM.

              • Supermule

                I am running 4gb ram on a Xseries box from IBM with RC3 release. No problems at all.

                • mevans336

                  @Supermule:

                  I am running 4gb ram on a Xseries box from IBM with RC3 release. No problems at all.

                  Thanks Supermule.

                  The data center hooked up a KVM for me, but unfortunately I couldn't recreate the behavior, and the motherboard's hardware event log didn't show any events such as overtemp or RAM errors; just the "no keyboard connected" event after the spontaneous reboot at 3AM on 10/5. I decided, however, to change several BIOS settings that I think may have been at fault. The Supermicro board comes with several new power-saving features enabled that I think may be incompatible with FreeBSD, with my processor (E5200), or with some combination of the two. So I turned the majority of them off, re-enabled ACPI, and so far things have stayed up with no crashing. I'll document them here in case anyone runs into a similar issue:

                  • Intel Thermal Management 2
                  • C1 Enhanced Mode
                  • Enhanced Intel Speedstep
                  • Memory Remapping
                  • High Precision Event Timer

                  The first three options allow not only CPU clock modulation, but CPU voltage modulation based on load and temperature as well. Memory remapping on my Supermicro X7SBL-LN2 (Intel 3200 chipset) board moves the 4GB address space to the 5GB range. Finally, the High Precision Event Timer is a replacement for the 8254 Programmable Interval Timer, but it seems to benefit only desktop operating systems that play multimedia.

                  I have a remote server monitoring the box via ping, so if it crashes or network connectivity drops off, I'll know immediately. My fingers are crossed, but so far so good.
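A ping monitor like the one described could be sketched as follows. This is only an illustration, not the poster's actual setup: the host address, probe interval, and failure threshold are hypothetical, and the `-W` timeout flag is interpreted in seconds by Linux iputils `ping` but differently on other platforms.

```python
import subprocess
import time
from datetime import datetime
from typing import Callable, Optional

HOST = "192.0.2.1"  # hypothetical firewall address (documentation range)


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo via the system ping; True if a reply came back."""
    # -c 1 sends a single packet. -W is the reply timeout in seconds on
    # Linux iputils ping; adjust for other platforms.
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class OutageTracker:
    """Turns a stream of probe results into DOWN/UP state-change events."""

    def __init__(self, failures_before_down: int = 3) -> None:
        self.failures_before_down = failures_before_down
        self.consecutive_failures = 0
        self.down = False

    def update(self, reachable: bool) -> Optional[str]:
        """Record one probe result; return "DOWN" or "UP" on a state change."""
        if reachable:
            self.consecutive_failures = 0
            if self.down:
                self.down = False
                return "UP"
            return None
        self.consecutive_failures += 1
        if not self.down and self.consecutive_failures >= self.failures_before_down:
            self.down = True
            return "DOWN"
        return None


def monitor(
    host: str,
    probe: Callable[[str], bool] = ping_once,
    interval_s: float = 10.0,
    max_probes: Optional[int] = None,
) -> None:
    """Probe the host repeatedly and print a timestamped line on each change."""
    tracker = OutageTracker()
    sent = 0
    while max_probes is None or sent < max_probes:
        event = tracker.update(probe(host))
        if event:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host} {event}")
        sent += 1
        time.sleep(interval_s)
```

Calling `monitor(HOST)` runs until interrupted. Requiring several consecutive failures before reporting DOWN avoids alerting on a single lost packet, and the timestamps give a time window to compare against the motherboard's event log.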

                  • mevans336

                    Well, apparently the changes noted above didn't help: the server dropped offline an hour ago, and it looks like my data center will have to power cycle it to get it back online. If I install the developer's kernel and get the panic information, is that likely to tell me the cause, given that the system BIOS isn't recording anything in the DMI event logs?

                    All this work has a real cost ($135/hour for remote hands at the datacenter) associated with it, not to mention the cost of renting their IP KVM.

                    Edit:

                    The server had rebooted and was hung right after the memory count at POST, so this has got to be a hardware issue. Eff'ing A.

                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.