Sudden hangs with pfSense 1.2
-
Hello people,
I have stumbled into yet another problem with my pfSense installation.
First, I'll describe the hardware: IBM x330 chassis, dual Xeon CPU, 2GB RAM, hardware RAID-1 with two 36GB SCSI disks.
It's an old server, but it works. At least, it ran a Windows Server 2003-based website for several months with no lockups or reboots. Anyway, the server is hosted at a local ISP. I first thought it was a cooling problem, but I'm now sure it's not, since the server hasn't changed its physical location in the colocation farm.
The problem is that every now and then, for no reason apparent to me (nothing in the logs or on the screen), the machine freezes and needs a hard reboot. The hangs can come every hour or two, every two days, or after a whole week of uptime… It's very random. I have not found anything in the logs that points to the root cause.
After the reboot, everything comes back to life, but with a few filesystem problems that need fsck to correct.
I have no idea what to do, or how to debug this problem!!!
Please assist. Thank you in advance.
-
Try upgrading the BIOS. I have that unit as well and it runs just fine. I run it with only one CPU, though. Current uptime is 70 days (since the last upgrade). Which kernel are you using?
-
Problems like that are almost always down to hardware or BIOS problems. Hardware faults could be almost anything: a borderline power supply, a failing fan, or motherboard, CPU or memory problems. That it used to work means either that Windows never hit the fault (luck) or that the hardware only started failing at around the point of the upgrade.
I'd agree with hoba though, try upgrading the BIOS first.
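Since nothing shows up in the logs, one way to at least pin down when each freeze starts is a heartbeat. This is only a rough sketch, assuming a hypothetical cron job on the firewall that appends a timestamp to a file once a minute; after a hard reboot, the gaps in that file show when the box went silent. The function name `find_gaps` and its parameters are made up for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval=timedelta(minutes=1), tolerance=2.0):
    """Return (last_seen, next_seen) pairs where the heartbeat went silent.

    A gap is reported when two consecutive heartbeats are more than
    tolerance * expected_interval apart. Both parameters are assumptions
    that should match however often the heartbeat is actually written.
    """
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


if __name__ == "__main__":
    # Made-up heartbeats: one per minute, then eight minutes of silence.
    base = datetime(2008, 1, 1, 12, 0)
    beats = [base + timedelta(minutes=m) for m in (0, 1, 2, 10, 11)]
    for last_seen, next_seen in find_gaps(beats):
        print("silent from", last_seen, "until", next_seen)
```

Lining those silent periods up against traffic graphs or scheduled jobs might show whether the hangs follow load or are truly random.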
-
Sounds a bit like faulty RAM.
Or you could check if the fans are dusty.
A friend once had a computer in which we replaced just about everything without finding the cause of a problem that sounds almost exactly like the one you describe.
In the end we just had to clean the CPU fan… Also, a PSU with capacitors nearing the end of their life could lead to the problems you describe.
-
Does anyone know of a good boot CD that will run tests on the hardware? I also had a lockup the other day, and I can't find any info in the logs about what happened.
-
http://www.ultimatebootcd.com/
It's a CD that contains just about every program you'll ever need to diagnose your computer :)
-
OK, I've got some new and pretty shocking news, even to me!
Last night I went back to the server room with a new server. This one is a DL/360 with pretty much different hardware: RAID card, disks, NICs (they're [fxp*] like in the former server, but nonetheless new), memory, CPU and even the casing. I installed pfSense on it and restored the XML backup using the Web UI. Then I switched servers and put this one into production.
What do you think happened? You're damn right; 40 minutes later this new server died with the same symptoms: it just froze for no apparent reason!
I've been loyal to FreeBSD since 1997, but I decided to try another firewall OS. The deadlocks kept occurring every 30-60 minutes while I formatted the original, old server and installed Endian Firewall on it. Later on, I manually transferred the Virtual IPs (called Aliases in Endian), forwards, firewall rules and other settings into the new firewall's Web UI. Then I swapped the two machines, putting the old one, the one running Endian, back into production.
It has been 12 hours since. No deadlocks, no reboots, no freezes.
The problem is definitely in pfSense. Something, probably in the configuration, is causing it to deadlock. And it's happening on two different sets of hardware, with different RAID cards, NICs, memory and CPUs.
Now, what do you say about this!?
-
I would guess the devs would like to see your config XML backup if you still have it.
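Before posting a config backup anywhere public, it's probably worth blanking out the secrets first. Here is a minimal sketch of how that could be done, assuming the backup is plain XML and that secrets live in elements with the tag names listed in `SENSITIVE_TAGS`. Those tag names are guesses for illustration only, not a verified list of what a pfSense backup contains, so check your own file before trusting the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names to blank out; inspect your own backup for others.
SENSITIVE_TAGS = {"password", "pre-shared-key"}


def redact_config(xml_text, sensitive_tags=SENSITIVE_TAGS):
    """Return the XML with the text of every sensitive element replaced."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Only touch elements that actually carry text.
        if elem.tag in sensitive_tags and elem.text:
            elem.text = "REDACTED"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up fragment, not a real backup.
    sample = (
        "<pfsense><system>"
        "<hostname>fw</hostname><password>secret</password>"
        "</system></pfsense>"
    )
    print(redact_config(sample))
```

This only catches secrets stored as element text under the listed tags; public IPs, hostnames and anything stored under other names would still need a manual pass.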
-
Of course I have it. Developers, anyone?
-
Hello again,
I'm coming back after a few months, during which I ran Endian Firewall on the same hardware without even the slightest problem. The firewall was up and running with Endian (the same hardware crashed continuously with pfSense 1.2) until I physically turned the server off.
The good news is: I've installed pfSense 1.2.1-RC2 on the same hardware, and for a few days now I've seen no problems! So I'm happy to be back on pfSense :D
-
I am happy to hear that. Welcome back :-)