pfSense reboots randomly, but always exactly on the whole hour. Help needed.
I'm puzzled by the weird behaviour my pfSense box is showing.
I've got a brand new APU 1C4 with a 16 GB mSATA flash card. I've installed pfSense 2.1.5 and added some packages (arpwatch, iftop, LADVD, OpenVPN Client Export Utility, pfBlocker, Postfix Forwarder, Service Watchdog, snort) that are still on the firewall.
I also had other packages (Avahi, mailreport, ntop) installed, but they have since been removed.
My initial configuration used RAM for /tmp with the option to copy RRD files enabled, but I've disabled this altogether.
Network-wise, I've bridged interfaces re0 and re1 together. Snort is monitoring the traffic passing the bridge to create a protected DMZ with multiple servers behind it.
After initially running fine for more than a week, the system started to reboot randomly. When it reboots, it's always exactly at the whole hour, without any exceptions. This makes me believe that the problem is software related and not hardware. The interval between reboots varies from 2 hours up to almost 5 days with exactly the same configuration. There is no other pattern that I can discover.
The 'good' news is that the network is operational again within 3 minutes.
I've checked the time of the reboots against the entries in /etc/crontab and noticed a couple of things:
- There is still an entry to copy the RRD files enabled. Removing this is impossible because any change in the GUI will put this entry back.
- There is still an entry /usr/local/bin/mail_reports_generate.php which seems a leftover as well that I can't remove permanently. The cron time format of this entry is wrong btw. It lacks one column in the time selector.
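To illustrate the missing column, here is a minimal Python sketch. It assumes the /etc/crontab layout of five time fields followed by a user name and the command, and it only recognises numeric time fields (no month or weekday names). The helper name `time_field_count` is made up for this example.

```python
import re

# Characters that may appear in a numeric cron time field:
# digits, '*', step '/', list ',' and range '-'.
TIME_FIELD = re.compile(r"^[\d*/,\-]+$")


def time_field_count(line):
    """Count the leading whitespace-separated tokens that look like time fields."""
    count = 0
    for token in line.split():
        if not TIME_FIELD.match(token):
            break  # first non-time token (here: the user name) ends the time selector
        count += 1
    return count


# Two entries copied from my /etc/crontab.
entries = [
    "0 */1 * * * root /etc/rc.backup_dhcpleases.sh",
    "0 * * * root /usr/local/bin/mail_reports_generate.php 0 &",
]

for entry in entries:
    n = time_field_count(entry)
    status = "ok" if n == 5 else "malformed"
    print(n, status, entry)
```

With these two entries, the rc.backup_dhcpleases.sh line has five time fields and the mail_reports_generate.php line has only four.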
The exact content of the crontab entries that are scheduled at or around the whole hour is as follows:
pfSense specific crontab entries
Created: October 16, 2014, 11:03 pm
*/60 * * * * root /usr/bin/nice -n20 /usr/local/sbin/expiretable -v -t 3600 sshlockout
*/60 * * * * root /usr/bin/nice -n20 /usr/local/sbin/expiretable -v -t 3600 virusprot
0 */1 * * * root /etc/rc.backup_dhcpleases.sh
0 * * * root /usr/local/bin/mail_reports_generate.php 0 &
0 * * * * root /usr/local/bin/php -q /usr/local/www/pfblocker.php cron
2 */1 * * * root /usr/bin/nice -n20 /sbin/pfctl -q -t snort2c -T expire 43200
Is it possible for any of these commands to cause the system to panic with a general protection fault? If so, which one?
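To make the timing of these entries explicit, here is a rough Python sketch that expands a cron minute field into the minutes of the hour it matches. It is a simplified reading of the usual cron syntax (`*`, single values, ranges, lists and `/step`), not the actual cron parser, and `minutes_matched` is a name made up for this example.

```python
def minutes_matched(field):
    """Expand a cron minute field into the sorted list of matching minutes (0-59)."""
    minutes = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = 0, 59
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = end = int(base)
        # A step larger than the range leaves only the first value.
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


# Minute fields used by the entries listed above.
for field in ["*/60", "0", "2"]:
    print(field, minutes_matched(field))
```

Under this reading, `*/60` matches minute 0 only, so the two expiretable jobs, rc.backup_dhcpleases.sh and the pfBlocker cron job all start at the whole hour, and only the snort2c expiry runs at minute 2.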
How can I permanently remove the entries that are leftovers from previously installed packages?
Are there any tools to analyse the crash dumps in more detail? Uploading them doesn't seem to work since I always get an error when attempting to do so.
Any help is appreciated,
Thanks,
Marco