pfSense seems to reboot only on Weekends
-
Hello Everyone,
I have an oddball problem that I could use some help with. I purchased two Netgate SG-5100s at the same time for two different departments in the same building. Both firewalls are connected to the same fiber CPE, with a separate /30 block for each. On one firewall, I see the gateway drop and the unit reboots itself. The second firewall stays online with no report of a gateway issue. I have confirmed there were no power issues at the location when the reboots occurred. The odd part is that it happens every weekend around the same time. I have gone through the logs and the troubleshooting tools in the WebGUI without finding the root cause. Are there any tips via the CLI that I can try to isolate the issue?
We are running 2.4.4-RELEASE-p3 on American Megatrends V1.10_5 BIOS.
System logs:
Sep 8 12:56:35 kernel done.
Sep 8 12:56:35 check_reload_status updating dyndns WANGW
Sep 8 12:56:35 check_reload_status Restarting ipsec tunnels
Sep 8 12:56:35 check_reload_status Restarting OpenVPN tunnels/interfaces
Sep 8 12:56:35 check_reload_status Reloading filter
Sep 8 12:56:35 check_reload_status Linkup starting igb1
Sep 8 12:56:35 kernel igb1: link state changed to UP
Sep 8 12:56:35 kernel igb1.20: link state changed to UP
Sep 8 12:56:35 kernel igb1.10: link state changed to UP
Sep 8 12:56:35 kernel igb1.90: link state changed to UP
Sep 8 12:56:35 check_reload_status Linkup starting igb1.20
Sep 8 12:56:35 check_reload_status Linkup starting igb1.10
Sep 8 12:56:35 check_reload_status Linkup starting igb1.90
Sep 8 12:56:36 kernel igb0: link state changed to UP
Sep 8 12:56:36 check_reload_status Linkup starting igb0
Sep 8 12:56:36 php-cgi rc.bootup: sync unbound done.
Sep 8 12:56:36 kernel done.
Sep 8 12:56:37 kernel done.
Sep 8 12:56:37 php-cgi rc.bootup: NTPD is starting up.
Sep 8 12:56:38 check_reload_status Updating all dyndns
Sep 8 12:56:38 kernel done.
Sep 8 12:56:38 kernel .done.
Sep 8 12:56:43 php-cgi rc.bootup: Creating rrd update script
Sep 8 12:56:43 php-cgi rc.bootup: The command '/usr/sbin/powerd -b 'hadp' -a 'hadp' -n 'hadp'' returned exit code '69', the output was 'powerd: no cpufreq(4) support -- aborting: No such file or directory'
Sep 8 12:56:43 kernel done.
Sep 8 12:56:43 root /etc/rc.d/hostid: WARNING: hostid: unable to figure out a UUID from DMI data, generating a new one
Sep 8 12:56:45 syslogd exiting on signal 15
Sep 8 12:56:45 syslogd kernel boot file is /boot/kernel/kernel
Sep 8 12:56:45 kernel done.
Sep 8 12:56:46 php-fpm 351 /rc.start_packages: Restarting/Starting all packages.
Sep 8 12:56:45 kernel igb1: promiscuous mode enabled
Sep 8 12:56:45 kernel igb1.10: promiscuous mode enabled
Sep 8 12:56:45 kernel igb1.20: promiscuous mode enabled
Sep 8 12:56:46 kernel igb1.90: promiscuous mode enabled
Sep 8 12:56:46 kernel igb0: promiscuous mode enabled
Sep 8 12:56:53 rc.gateway_alarm 72088 >>> Gateway alarm: WANGW (Addr:216.12.57.101 Alarm:0 RTT:6.425ms RTTsd:2.836ms Loss:12%)
Sep 8 12:56:53 check_reload_status updating dyndns WANGW
Sep 8 12:56:53 check_reload_status Restarting ipsec tunnels
Sep 8 12:56:53 check_reload_status Restarting OpenVPN tunnels/interfaces
Sep 8 12:56:53 check_reload_status Reloading filter
Sep 8 12:56:55 php-fpm 352 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Sep 8 12:56:55 php-fpm 352 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WANGW.
Sep 8 12:57:06 ntopng [HTTPserver.cpp:924] ERROR: [HTTP] set_ports_option: cannot bind to 3000s: Address already in use
Sep 8 12:57:06 ntopng [mongoose.c:4584] ERROR: set_ports_option: cannot bind to 3000s: No error: 0
Sep 8 12:57:06 ntopng [HTTPserver.cpp:1104] ERROR: Unable to start HTTP server (IPv4) on ports 3000s
Sep 8 12:57:06 ntopng [HTTPserver.cpp:1110] ERROR: Either port in use or another ntopng instance is running (using the same port)
Sep 8 12:57:06 login login on ttyv0 as root
Sep 8 12:57:06 login login on ttyu0 as root
Gateway Logs (Sanitized)
GW Address = Gateway IP Address
FW Address = WAN Firewall Address
All Times EDT
Aug 3 14:37:03 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 3 14:37:05 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 3 14:37:24 dpinger WANGW GW Address: Clear latency 8008us stddev 10257us loss 12%
Aug 4 12:56:13 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 4 12:56:15 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 4 12:56:35 dpinger WANGW GW Address: Clear latency 15988us stddev 46596us loss 12%
Aug 10 14:38:10 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 10 14:38:12 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 10 14:38:31 dpinger WANGW GW Address: Clear latency 13917us stddev 41555us loss 12%
Aug 11 12:56:07 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 11 12:56:09 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 11 12:56:28 dpinger WANGW GW Address: Clear latency 6409us stddev 2671us loss 12%
Aug 17 14:37:58 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 17 14:38:00 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 17 14:38:24 dpinger WANGW GW Address: Clear latency 8095us stddev 14801us loss 14%
Aug 18 12:56:08 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 18 12:56:10 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 18 12:56:29 dpinger WANGW GW Address: Clear latency 8119us stddev 14342us loss 12%
Aug 24 14:38:10 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 24 14:38:12 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 24 14:38:31 dpinger WANGW GW Address: Clear latency 15760us stddev 45661us loss 12%
Aug 25 12:56:23 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 25 12:56:25 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 25 12:56:44 dpinger WANGW GW Address: Clear latency 16704us stddev 46661us loss 12%
Aug 31 14:38:46 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Aug 31 14:38:48 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Aug 31 14:39:07 dpinger WANGW GW Address: Clear latency 14875us stddev 37912us loss 12%
Sep 1 12:56:41 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Sep 1 12:56:43 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Sep 1 12:57:05 dpinger WANGW GW Address: Clear latency 6227us stddev 2577us loss 13%
Sep 7 14:38:36 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Sep 7 14:38:38 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Sep 7 14:38:57 dpinger WANGW GW Address: Clear latency 9382us stddev 13236us loss 12%
Sep 8 12:56:32 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr GW Address bind_addr FW Address identifier "WANGW "
Sep 8 12:56:34 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%
Sep 8 12:56:53 dpinger WANGW GW Address: Clear latency 6425us stddev 2836us loss 12%
Thanks in advance for helping!
Brandon.
-
That is odd. Are there no other log messages around the time of the reboot? Anything from before the reboot?
Since you can somewhat predict the occurrence, could you leave a system connected to the console port logging the output from there to see what happens? It may have some better info.
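If the terminal program on the logging machine doesn't timestamp what it captures, a small wrapper can add that, so the last console line before the reboot can be lined up against the dpinger alarm times. This is only a rough sketch: `stamp_lines` and `log_console` are made-up names, and it assumes the console output is already reaching the script as a line stream (for example piped in on stdin from whatever serial terminal program you use).

```python
import sys
from datetime import datetime


def stamp_lines(lines, now=datetime.now):
    """Yield each console line prefixed with the local time it was read."""
    for line in lines:
        # Serial consoles usually end lines with \r\n; strip both.
        yield f"{now().isoformat(timespec='seconds')} {line.rstrip()}"


def log_console(stream, path):
    """Append timestamped lines from `stream` to the file at `path`."""
    # buffering=1 is line buffering, so each line reaches the file promptly.
    with open(path, "a", buffering=1) as out:
        for stamped in stamp_lines(stream):
            out.write(stamped + "\n")


# Hypothetical usage, with the console output piped into this script:
#     log_console(sys.stdin, "console.log")
```

The timestamps come from the logging machine's clock, so it is worth making sure that clock agrees with the firewall's before the weekend.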
From the general symptoms, it sounds like a power/environmental issue of some kind. Perhaps it's enough to trip the 5100 but not a UPS, though at least for me, my 5100s don't seem to be fazed by small power blips even without a UPS.
If you leave the console connected and there is nothing logged before the reboot, then it would almost have to be power or hardware. A hardware issue would almost certainly be more random/less predictable, though.
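To see how predictable the timing really is, the dpinger alarm lines from the gateway log can be grouped by weekday. A minimal sketch in Python follows. The log lines carry no year, so 2019 is an assumption on my part, and `alarms_by_weekday` is just an illustrative name.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: the syslog lines have no year; 2019 is a guess.
ASSUMED_YEAR = 2019

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Matches only the "Alarm" lines, not the dpinger config or "Clear" lines.
ALARM_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+dpinger\s+.*:\s+Alarm\b"
)


def alarms_by_weekday(lines, year=ASSUMED_YEAR):
    """Return {weekday name: [HH:MM:SS, ...]} for every dpinger alarm line."""
    grouped = defaultdict(list)
    for line in lines:
        m = ALARM_RE.match(line)
        if not m:
            continue
        month, day, clock = m.groups()
        weekday = datetime(year, MONTHS[month], int(day)).weekday()
        grouped[DAYS[weekday]].append(clock)
    return dict(grouped)


sample = [
    "Sep 7 14:38:38 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%",
    "Sep 7 14:38:57 dpinger WANGW GW Address: Clear latency 9382us stddev 13236us loss 12%",
    "Sep 8 12:56:34 dpinger WANGW GW Address: Alarm latency 0us stddev 0us loss 100%",
]
print(alarms_by_weekday(sample))
```

If the year really is 2019, the alarms in the posted gateway log land on Saturdays at about 14:38 and on Sundays at about 12:56, every weekend from Aug 3 through Sep 8. A pattern that regular looks more like something scheduled than like a random hardware fault.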