weekly crash newsyslog
-
I am getting to a point where I am at a loss.
pfSense now crashes at least once a week, usually without leaving a crash dump, and needs a hard reset each time. I've had a screen and keyboard connected, but the screen showed no information and the keyboard didn't respond once the box had stopped working.
My pfSense version is 2.6.0.
My current setup includes a remote log server that records the last thing pfSense sends before each crash, which has always been: (root) CMD (/usr/sbin/newsyslog)
Reading up on other reports, the suggested fix of increasing the log rotation size to 10240000 has not really helped. Neither has changing the compression to zstd.
What should be the next step in fixing this? Would switching to the Plus version have any benefits?
Hope someone can help.
-
The first thing to try is hitting
Ctrl+T
at the console. That can often respond when nothing else does and should show you what process it's waiting for.
Is it completely locked up? Does the Num Lock or Caps Lock LED work on the keyboard, for example?
If not, then it could be a hardware issue. Could it be overheating?
Steve
-
@stephenw10 thanks for your suggestion.
On the hardware side I've replaced both the RAM and the mSATA drive as a test, without success.
The unit is a 6-port Chinese box equivalent to the Protectli Vault units, with an i3-7100, and it is passively cooled by design. It used to be stacked on top of a switch, but it is now mounted on the side wall of my meter room with a USB fan on medium cooling it actively. Temperatures in the box have been better since then; they were always below 45 °C before and now peak at 34 °C. That should be okay.
I will hook up the monitor and keyboard again to test.
-
Ok, well 45C at the CPU core would not worry me at all. Some other component might have been overheating but a fan should prevent that.
-
I have been away on a short holiday and got back last Friday. The firewall crashed on the Wednesday before I got back, and I didn't have a screen connected yet. Only the same syslog notification showed up on the log server.
I connected a screen and keyboard and restarted the firewall, and it crashed again today (Monday), i.e. after about four days of uptime. The screen still showed the pfSense menu, with nothing added. There was no reaction to Ctrl+T, and nothing from Num Lock etc. either. The log server shows the same output, plus about 25 seconds with 3 block logs.
Any further thoughts?
Can I script (cron job?) a daily clearing of the logs, just to avoid the log rotation and see if that helps? I don't need the history locally anyway, since it goes to my log server.
-
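On the daily log-clearing question, here is a minimal sketch of what such a script could look like. It assumes a Python 3 interpreter is available on the firewall, that the local logs are plain-text files matching `*.log` under `/var/log`, and that truncating them in place is acceptable because a remote log server already holds the history. The function name `truncate_logs` and the file pattern are illustrative, not something pfSense provides.

```python
#!/usr/bin/env python3
"""Empty local log files in place (illustrative sketch, not a pfSense tool)."""
import glob
import os
import sys


def truncate_logs(log_dir, pattern="*.log"):
    """Truncate every regular file in log_dir matching pattern.

    Returns the list of paths that were emptied.
    """
    emptied = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        if not os.path.isfile(path):
            continue
        # Opening with "w" cuts the file to zero bytes without deleting it,
        # so the file itself (and its inode) stays in place.
        with open(path, "w"):
            pass
        emptied.append(path)
    return emptied


# Runs only when a directory is passed explicitly, to avoid emptying
# logs by accident. Example argument (an assumption): /var/log
if __name__ == "__main__" and len(sys.argv) == 2:
    for name in truncate_logs(sys.argv[1]):
        print("truncated", name)
```

Scheduled once a day from cron, this would keep the files well below any rotation size. It does not stop newsyslog from being run, so it only tests whether the rotation and compression work is involved in the lock-ups.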
@tomracing What kind of NICs are in there? BSD doesn't normally lock up unless there's a hardware issue.
-
Yup, feels like a hardware issue if it locked solid like that with no output at all.
You might try disabling everything you can in the BIOS. De-rate the TDP, underclock it etc if you can.
Steve
-
Intermediate update:
I turned off log archiving, since the logs go to a log server anyway, before trying to adjust the BIOS settings.
Currently at 13 days 15 hours of uptime.
-
What exactly did you set to 'turn off log archiving'? Disable rotation? Compression?
That can use a lot of CPU, I wonder if that was triggering overheating.
Steve
-
Ah yes, indeed: I turned off compression, to see if that step was causing the lock-ups.
-
Hmm, that would be interesting if that was the cause but it would also imply that any significant CPU usage might trigger it.
-
@stephenw10 Interesting idea, but wouldn't booting count as high CPU load as well, which would prevent the box from even starting?
First I'm going to see if it makes it a month without crashing; then I will see if I can simulate some processor load from the terminal.
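One way the load simulation could be done from a shell is sketched below. It assumes a Python 3 interpreter is available on the box; the names `load_all_cores` and `BURN_CODE` are made up for this example. It starts one busy-looping child process per logical CPU for a fixed number of seconds, so the load stops by itself even if the console becomes sluggish.

```python
#!/usr/bin/env python3
"""Generate CPU load for a bounded time (illustrative sketch)."""
import os
import subprocess
import sys

# Code run by each child process: spin until the deadline passes.
BURN_CODE = (
    "import sys, time\n"
    "end = time.monotonic() + float(sys.argv[1])\n"
    "while time.monotonic() < end:\n"
    "    pass\n"
)


def load_all_cores(seconds, workers=None):
    """Start busy-loop children, wait for them, return their exit codes.

    workers defaults to the number of logical CPUs reported by the OS.
    """
    workers = workers or os.cpu_count() or 1
    procs = [
        subprocess.Popen([sys.executable, "-c", BURN_CODE, str(seconds)])
        for _ in range(workers)
    ]
    return [p.wait() for p in procs]


# Runs only when a duration in seconds is given as the single argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    codes = load_all_cores(float(sys.argv[1]))
    print(len(codes), "workers finished")
```

Starting with a short run (for example 60 seconds) while watching the temperature, then lengthening it, would show whether sustained CPU load alone can trigger the lock-up.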
-
Yes, you would think so. If it got stuck in a loop compressing the logs, though, it could have high CPU usage for a long while.