All 3 pfSense machines shut down over the weekend
-
Hello Everyone,
I hope I can find some help here, since this is now happening for the third week in a row
and I'm quite fed up with hearing that all our firewalls crashed again over the weekend.
The problem is as follows:
Around Friday night, when no one (including me) is in the office anymore, the firewalls
mysteriously crash. Two of them are synced using CARP. They handle our general
Internet traffic and routing.
We also have a separate firewall that is connected to an entirely different ISP connection
and is completely separated from the main network.
All 3 of them have been crashing for 3 weeks now. Every time we get back on Monday, the 3 firewalls
are shut down.
What could that be? I'm quite clueless about what causes this problem,
especially since it's "only" these 3 firewalls crashing. We have another 2 at another location
and they work fine.. :/
The pfSense version I'm using is 1.2.3-RELEASE on all 3 machines.
Kind Regards,
thunder -
What exactly do you mean by "shut down"?
Are they switched off, or have they crashed?
This sounds to me like a situation where the cleaning staff need a power socket for the vacuum cleaner on Friday evening…
(We once lost a week's worth of simulations to such an incident...) -
No worries, all 3 of them are in a closed room that the cleaning staff have no access to.
The firewalls end up switched off at the hardware level, so I have to press the power button
to make them boot and get back to their duties again. -
You are probably losing power to the room. Do you have a device you can leave on that will not power on by itself, such as another computer or TV or…?
Plug in something else, like a clock or television, that will prove the power went out.
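If the spare device is another computer, a software version of the "clock" idea could look like the sketch below. It is only an illustration: the file name, the 60-second interval and the function names are my own assumptions, and it only helps if that computer sits on the same circuit as the firewalls and not behind the UPS.

```python
import time
from datetime import datetime, timedelta


def find_gaps(stamps, max_gap=timedelta(minutes=2)):
    """Return (last_seen, next_seen) pairs where two consecutive
    heartbeats are further apart than max_gap."""
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_gap]


def read_stamps(path):
    """Load the ISO timestamps written by heartbeat(), one per line."""
    with open(path) as fh:
        return [datetime.fromisoformat(line.strip()) for line in fh if line.strip()]


def heartbeat(path="heartbeat.log", interval=60):
    """Append the current time to `path` every `interval` seconds, forever.
    Hypothetical file name and interval; adjust to taste."""
    while True:
        with open(path, "a") as fh:
            fh.write(datetime.now().isoformat(timespec="seconds") + "\n")
        time.sleep(interval)
```

Start `heartbeat()` on Friday. On Monday, either the last line of the file shows when the box lost power, or, if it came back up and the script was restarted, `find_gaps(read_stamps("heartbeat.log"))` shows the window in which it was off.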
-
The other possibility is that the machines are overheating and shutting down as a result. If these devices are in a closed room, are you providing adequate cooling and air circulation?
-
They're in a server room with an air conditioner that constantly keeps the temperature at 20°C.
The machines are 4 HE (4U rack) server machines with adequate tower coolers.
Also, there are other machines in the room that run perfectly over the weekend, so there is
no power loss either. -
Are the other unaffected machines the same specs as the pfsense boxes? It could be a minor brown-out (voltage line sag) that these HEs are susceptible to and the other machines aren't.
I've encountered this issue before where I had to go in and salvage a situation where a client's internet connection kept going down every half-hour.
As it turned out, he had the modem/router on the same power line as a 1.5 HP water pump that kicked in every half-hour, causing a brown-out. I made him buy a true online UPS, which solved the issue (moving the equipment would have meant tearing down $20,000 worth of renovations, thanks to an idiot of an interior designer who decided they could run network cables concealed behind a 3 m high mirror wall and between 5 levels without consulting any IT guys first). -
Assuming they have consoles, I also assume the displays are blank, due to the boxes being off? Question: can you set up a syslog server on the LAN and have the units send logging info there? Might give you clues as to what happens just before "The End" (as well as an idea of whether they are going bye-bye at the same time or not.)
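If there is no syslog daemon handy on the LAN, a bare-bones receiver can be sketched in a few lines of Python. This is a rough sketch, not a replacement for a real syslog server: the log file name and function names are made up for illustration, and I am assuming the firewalls send plain UDP syslog to port 514 (binding to that port normally needs root).

```python
import socket
from datetime import datetime


def format_record(received_at, source_ip, payload):
    """Build one log line: local receive time, sender IP, raw syslog text."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{received_at.isoformat(timespec='seconds')} {source_ip} {text}"


def serve(log_path="firewalls.log", host="0.0.0.0", port=514):
    """Listen for UDP syslog datagrams and append them to log_path.
    Runs until interrupted. File name and port are assumptions."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    with open(log_path, "a", encoding="utf-8") as log:
        while True:
            payload, (source_ip, _port) = sock.recvfrom(8192)
            log.write(format_record(datetime.now(), source_ip, payload) + "\n")
            log.flush()  # make sure the last lines survive on disk
```

Because each line carries the receiver's own timestamp and the sender's IP, the last entry per firewall shows roughly when each one went quiet, and whether all three stopped at the same moment.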
-
Heyas, sorry for the late reply, I've got quite a lot of work going on here and there.. and basically everywhere..
However, I have noticed that we've had a lot of spam coming in on the SMTP port;
maybe a flood of those packets being successfully NAT'ed caused this shutdown?
I mean, it wouldn't explain why the other firewall, which basically blocks everything on NAT,
shuts down as well, but it was a general idea of mine..
However, I've quickly set up a remote logging server, so maybe I can catch a hint of where
the problems come from.
@dreamslacker: We haven't made any changes since the time these firewalls were working fine.
It's just been the last few weeks. We have our own BBU, and there are ONLY servers and
said air conditioner in this room. A few servers have similar specs, some even have
higher power needs and lower efficiency, but all of them run just fine. -
Heavy load will not cause the firewall to shut down. At worst, the boxes will become unresponsive under heavy load, but that requires a pretty dedicated effort. Whatever is causing your boxes to shut down, it's something physical.
-
Unfortunately, this isn't about power consumption on the machine end. It is about the power supply units used. If the other servers are not of the same make and similar age (to account for capacitors degrading with age), they won't be an accurate indication of whether the power-down is related to sensitivity to brown-outs.
By BBU, do you mean UPS? If so, are they true online units? If not, they might not kick in fast enough for some of the PSUs that are more sensitive.
A quick-and-dirty test for now would be to set the power options in each server's BIOS so that the server powers on when power is restored after a power failure.
If the servers are deliberately powered off (by button, console or WebGUI), they will not automatically power themselves on after the cut. If they are dropping because of a brown-out, then you should find the machines running when you come in after the weekend.
Note that if you have NUT (Network UPS Tools) forcing the machines into shutdown from a UPS, they won't automatically power on again.
Hopefully, this will isolate the problem. Finding them in a powered-off state means that someone or a script is deliberately powering down the machines. If they have come back up and are powered on, you either have a line sag issue or thermal issues of some sort (have you verified that the servers are generally dust-free and that the fans are working fine?)
-
Heyas again,
Yes, not a BBU of course, I'm talking about a UPS. Usually when power is cut, they don't power on by themselves without pressing a button or anything,
because it's not true server hardware.. The machines themselves are around 6 months old now. The air conditioner in the room generally filters all the dust
in the room and gets maintained on a regular basis. I will probably just try moving all the machines to a different power outlet.
Hope this helps :) Thanks again!
-
Hi, I don't mean that as a server feature. It is a basic setting in the BIOS (CMOS). Under Power Options, you should find a setting that says:
Power on after power failure - Options: On, Soft Off, Last State.
This is definitely available on consumer boards. Set this to "On". If the machines are powered down due to power line issues, they will automatically boot up when the power comes back on.
However, if the machines are powered off manually, by pressing and holding the power button or via a shutdown command (script or manually entered), they won't come back online.
This should hopefully help you isolate whether this is a power line issue or a script problem (or sabotage, for that matter).
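To spell out how to read the result on Monday morning, here is the same reasoning as a tiny Python sketch. The function and constant names are just for illustration, and it only covers the cases discussed in this thread.

```python
INCONCLUSIVE = "inconclusive: BIOS was not set to power on after power failure"
POWER_OR_THERMAL = "power line problem (sag/outage) or a thermal issue"
DELIBERATE = "deliberate shutdown (button, script, shutdown command or UPS software)"


def interpret_weekend_test(auto_power_on_enabled, running_on_monday):
    """Map what you find on Monday to the likely class of cause."""
    if not auto_power_on_enabled:
        # Without the BIOS setting, "off" looks the same in every scenario.
        return INCONCLUSIVE
    if running_on_monday:
        # The box went down and came back by itself once power returned.
        return POWER_OR_THERMAL
    # Power-on-after-failure was set, yet the box stayed off.
    return DELIBERATE


print(interpret_weekend_test(True, False))
```

A box that is running on Monday only counts as "came back" if its uptime shows it actually rebooted over the weekend.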