Multiple issues, firewall freezes and whole network goes down.
-
@stephenw10 The UPS is monitored over SNMP, so my guess is that the fq_codel errors froze the network and the firewall lost comms with the UPS as a result, hence the "ups stale" errors.
The 500ms across a VPN is normal as the internet connection on the other end of the tunnel is very bad.
-
Ah then you should definitely tune the gateway alert settings so it doesn't cause an alarm at that latency. It's causing a lot of unnecessary scripts to run.
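As a rough way to reason about which samples would trip an alarm, here is a minimal sketch. The 500 ms and 20% values are what I believe the default high (alarm) thresholds to be, and the strict comparison is also an assumption, so check the gateway's advanced settings before relying on either. `would_alarm` is a made-up name, not a pfSense function.

```python
# Hypothetical helper, not part of pfSense. The default values below are
# assumptions: check your own gateway's advanced settings.
ASSUMED_HIGH_LATENCY_MS = 500.0
ASSUMED_HIGH_LOSS_PCT = 20.0


def would_alarm(rtt_ms, loss_pct,
                high_latency_ms=ASSUMED_HIGH_LATENCY_MS,
                high_loss_pct=ASSUMED_HIGH_LOSS_PCT):
    """Return True if a sample exceeds either alarm threshold.

    Strict "greater than" is assumed here; the real monitor may differ.
    """
    return rtt_ms > high_latency_ms or loss_pct > high_loss_pct


# A link that normally sits a little above 500 ms with modest loss:
print(would_alarm(510.0, 5.0))                          # True with the assumed defaults
print(would_alarm(510.0, 5.0, high_latency_ms=1000.0))  # False once the threshold is raised
```

The point is only that a tunnel whose normal latency hovers around the threshold will flip in and out of alarm constantly, and each flip runs the reload scripts.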
-
@stephenw10 Got it. Btw, thanks for bearing with me until now (:. But is this somehow related to the freezes and crashes?
-
It could be, if it's somehow ending up with scripts looping until it uses all available resources of some kind. The monitoring graphs should show that though, as I mentioned.
-
@stephenw10 I have checked the graphs, but nothing seems wrong with the values. I also had another freeze. There is nothing useful in the logs other than some gateway alarms.
I have noticed a common issue across all these crashes/freezes though. As soon as something goes wrong with my WAN, there is a big chance that the firewall also freezes/crashes. I cannot make sense of it though.
So far,
I have removed all the watchdog items
tweaked the gateway values a little bit
and after this crash I have removed the HE Tunnel since it was useless.
Sep 8 17:36:24 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:36:25 FIREWALL php-fpm[34190]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use VPNAC_WG.
Sep 8 17:36:27 FIREWALL rc.gateway_alarm[40780]: >>> Gateway alarm: WANV6_TUNNELV6 (Addr:2001:470:1f1a:46d::1 Alarm:1 RTT:510.886ms RTTsd:361.034ms Loss:11%)
Sep 8 17:36:27 FIREWALL check_reload_status[635]: updating dyndns WANV6_TUNNELV6
Sep 8 17:36:27 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:36:27 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:36:27 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:36:28 FIREWALL php-fpm[34190]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WANV6_TUNNELV6.
Sep 8 17:36:30 FIREWALL check_reload_status[635]: updating dyndns OVPN_S2S_VPNV4
Sep 8 17:36:30 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:36:30 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:36:30 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:36:30 FIREWALL rc.gateway_alarm[58334]: >>> Gateway alarm: OVPN_S2S_VPNV4 (Addr:10.25.25.2 Alarm:1 RTT:707.150ms RTTsd:342.738ms Loss:11%)
Sep 8 17:36:31 FIREWALL php-fpm[40188]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use OVPN_S2S_VPNV4.
Sep 8 17:36:47 FIREWALL rc.gateway_alarm[20939]: >>> Gateway alarm: VPNAC_WG (Addr:10.11.0.1 Alarm:1 RTT:574.422ms RTTsd:376.129ms Loss:21%)
Sep 8 17:36:47 FIREWALL check_reload_status[635]: updating dyndns VPNAC_WG
Sep 8 17:36:47 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:36:47 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:36:47 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:36:48 FIREWALL rc.gateway_alarm[24045]: >>> Gateway alarm: WAN_PPPOE (Addr:10.98.238.224 Alarm:1 RTT:500.131ms RTTsd:368.558ms Loss:16%)
Sep 8 17:36:48 FIREWALL check_reload_status[635]: updating dyndns WAN_PPPOE
Sep 8 17:36:48 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:36:48 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:36:48 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:36:48 FIREWALL php-fpm[40188]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use VPNAC_WG.
Sep 8 17:36:49 FIREWALL php-fpm[36133]: /rc.dyndns.update: phpDynDNS (@.mydomain.org): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 8 17:36:49 FIREWALL php-fpm[40188]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_PPPOE.
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS: updatedns() starting
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _checkIP() starting.
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): "redacted ip" extracted from local system.
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS (931952): running get_failover_interface for wan. found pppoe0
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _detectChange() starting.
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _checkIP() starting.
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): "redacted ip" extracted from local system.
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic Dns (931952): Current WAN IP: "redacted ip" Cached IP: "redacted ip"
Sep 8 17:36:50 FIREWALL php-fpm[36133]: /rc.dyndns.update: phpDynDNS (931952): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 8 17:36:51 FIREWALL rc.gateway_alarm[79606]: >>> Gateway alarm: WAN_PPPOE (Addr:10.98.238.224 Alarm:1 RTT:518.989ms RTTsd:369.829ms Loss:21%)
Sep 8 17:36:51 FIREWALL check_reload_status[635]: updating dyndns WAN_PPPOE
Sep 8 17:36:51 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:36:51 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:36:51 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:36:51 FIREWALL rc.gateway_alarm[80565]: >>> Gateway alarm: WANV6_TUNNELV6 (Addr:2001:470:1f1a:46d::1 Alarm:1 RTT:578.630ms RTTsd:391.247ms Loss:22%)
Sep 8 17:36:51 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:36:51 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:36:52 FIREWALL php-fpm[639]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WANV6_TUNNELV6.
Sep 8 17:36:52 FIREWALL php-fpm[36133]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_PPPOE.
Sep 8 17:37:02 FIREWALL rc.gateway_alarm[4317]: >>> Gateway alarm: OVPN_S2S_VPNV4 (Addr:10.25.25.2 Alarm:1 RTT:735.071ms RTTsd:303.770ms Loss:37%)
Sep 8 17:37:02 FIREWALL check_reload_status[635]: updating dyndns OVPN_S2S_VPNV4
Sep 8 17:37:02 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:02 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:02 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:37:03 FIREWALL php-fpm[639]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use OVPN_S2S_VPNV4.
Sep 8 17:37:07 FIREWALL rc.gateway_alarm[60969]: >>> Gateway alarm: MNG_DHCP (Addr:192.168.2.1 Alarm:1 RTT:24.549ms RTTsd:124.820ms Loss:21%)
Sep 8 17:37:07 FIREWALL check_reload_status[635]: updating dyndns MNG_DHCP
Sep 8 17:37:07 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:07 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:08 FIREWALL php-fpm[40188]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use MNG_DHCP.
Sep 8 17:37:12 FIREWALL rc.gateway_alarm[27585]: >>> Gateway alarm: MODEM_DHCP (Addr:192.168.0.1 Alarm:1 RTT:23.050ms RTTsd:124.997ms Loss:22%)
Sep 8 17:37:12 FIREWALL check_reload_status[635]: updating dyndns MODEM_DHCP
Sep 8 17:37:12 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:12 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:12 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:37:13 FIREWALL php-fpm[592]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use MODEM_DHCP.
Sep 8 17:37:16 FIREWALL upsd[6640]: Data for UPS [UPS] is stale - check driver
Sep 8 17:37:19 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:19 FIREWALL upsmon[19346]: Communications with UPS UPS lost
Sep 8 17:37:22 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for device.description
Sep 8 17:37:24 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:29 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:31 FIREWALL rc.gateway_alarm[47891]: >>> Gateway alarm: WAN_PPPOE (Addr:10.98.238.224 Alarm:1 RTT:383.257ms RTTsd:388.087ms Loss:81%)
Sep 8 17:37:31 FIREWALL check_reload_status[635]: updating dyndns WAN_PPPOE
Sep 8 17:37:31 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:31 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:31 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:37:32 FIREWALL php-fpm[36133]: /rc.dyndns.update: phpDynDNS (@.mydomain.org): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 8 17:37:32 FIREWALL php-fpm[639]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_PPPOE.
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS: updatedns() starting
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _checkIP() starting.
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): "redacted ip" extracted from local system.
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS (931952): running get_failover_interface for wan. found pppoe0
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _detectChange() starting.
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _checkIP() starting.
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): "redacted ip" extracted from local system.
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: Dynamic Dns (931952): Current WAN IP: "redacted ip" Cached IP: "redacted ip"
Sep 8 17:37:33 FIREWALL php-fpm[36133]: /rc.dyndns.update: phpDynDNS (931952): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 8 17:37:34 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:38 FIREWALL rc.gateway_alarm[62940]: >>> Gateway alarm: WANV6_TUNNELV6 (Addr:2001:470:1f1a:46d::1 Alarm:1 RTT:266.937ms RTTsd:155.977ms Loss:93%)
Sep 8 17:37:38 FIREWALL check_reload_status[635]: updating dyndns WANV6_TUNNELV6
Sep 8 17:37:38 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:38 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:38 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:37:39 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:39 FIREWALL php-fpm[639]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WANV6_TUNNELV6.
Sep 8 17:37:40 FIREWALL rc.gateway_alarm[79115]: >>> Gateway alarm: VPNAC_WG (Addr:10.11.0.1 Alarm:1 RTT:225.111ms RTTsd:55.236ms Loss:96%)
Sep 8 17:37:40 FIREWALL check_reload_status[635]: updating dyndns VPNAC_WG
Sep 8 17:37:40 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:40 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:40 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:37:40 FIREWALL rc.gateway_alarm[81903]: >>> Gateway alarm: OVPN_S2S_VPNV4 (Addr:10.25.25.2 Alarm:1 RTT:436.319ms RTTsd:43.501ms Loss:97%)
Sep 8 17:37:40 FIREWALL check_reload_status[635]: updating dyndns OVPN_S2S_VPNV4
Sep 8 17:37:40 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:40 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:41 FIREWALL php-fpm[90111]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use VPNAC_WG.
Sep 8 17:37:41 FIREWALL rc.gateway_alarm[96096]: >>> Gateway alarm: WAN_PPPOE (Addr:10.98.238.224 Alarm:1 RTT:806.992ms RTTsd:750.675ms Loss:98%)
Sep 8 17:37:41 FIREWALL php-fpm[639]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use OVPN_S2S_VPNV4.
Sep 8 17:37:44 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:44 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for device.contact
Sep 8 17:37:49 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:51 FIREWALL rc.gateway_alarm[12424]: >>> Gateway alarm: WAN_PPPOE (Addr:10.98.238.224 Alarm:1 RTT:0ms RTTsd:0ms Loss:100%)
Sep 8 17:37:51 FIREWALL check_reload_status[635]: updating dyndns WAN_PPPOE
Sep 8 17:37:51 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 8 17:37:51 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 8 17:37:51 FIREWALL check_reload_status[635]: Reloading filter
Sep 8 17:37:52 FIREWALL php-fpm[90111]: /rc.dyndns.update: phpDynDNS (@.mydomain.org): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 8 17:37:52 FIREWALL php-fpm[36133]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_PPPOE.
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS: updatedns() starting
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _checkIP() starting.
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): "redacted ip" extracted from local system.
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS (931952): running get_failover_interface for wan. found pppoe0
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _detectChange() starting.
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): _checkIP() starting.
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic DNS he-net-tunnelbroker (931952): "redacted ip" extracted from local system.
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: Dynamic Dns (931952): Current WAN IP: "redacted ip" Cached IP: "redacted ip"
Sep 8 17:37:53 FIREWALL php-fpm[90111]: /rc.dyndns.update: phpDynDNS (931952): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 8 17:37:54 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:37:59 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:04 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:06 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for device.location
Sep 8 17:38:09 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:14 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:19 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:24 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:28 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.voltage
Sep 8 17:38:29 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:34 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:39 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:44 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:47 FIREWALL php-cgi[43129]: notify_monitor.php: Could not send the message to mymail@gmail.com -- Error: Failed to connect to ssl://smtp.gmail.com:465 [SMTP: Failed to connect socket: php_network_getaddresses: getaddrinfo for smtp.gmail.com failed: Address family for hostname not supported (code: -1, response: )]
Sep 8 17:38:49 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:50 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.voltage.maximum
Sep 8 17:38:54 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:38:59 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:04 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:09 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:13 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.voltage.minimum
Sep 8 17:39:14 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:19 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:24 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:29 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:34 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:35 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.frequency
Sep 8 17:39:39 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:44 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:49 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:54 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:39:57 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.transfer.low
Sep 8 17:39:59 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:04 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:09 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:14 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:19 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.transfer.high
Sep 8 17:40:19 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:24 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:29 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:34 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:39 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:41 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for input.transfer.reason
Sep 8 17:40:44 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:49 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:54 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:40:59 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:03 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.power
Sep 8 17:41:04 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:09 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:15 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:20 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:25 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:25 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.realpower
Sep 8 17:41:30 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:35 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:40 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:45 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:48 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.status
Sep 8 17:41:50 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:41:55 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:00 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:05 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:10 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.status
Sep 8 17:42:10 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:15 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:15 FIREWALL upsmon[19346]: UPS UPS is unavailable
Sep 8 17:42:20 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:25 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:30 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:32 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.status
Sep 8 17:42:35 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:40 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:45 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:50 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:42:54 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.status
Sep 8 17:42:55 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:00 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:05 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:10 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:15 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:16 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.temperature
Sep 8 17:43:20 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:25 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:30 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:35 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:38 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for ups.load
Sep 8 17:43:40 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:44 FIREWALL php-cgi[63749]: notify_monitor.php: Could not send the message to mymail@gmail.com -- Error: Failed to connect to ssl://smtp.gmail.com:465 [SMTP: Failed to connect socket: php_network_getaddresses: getaddrinfo for smtp.gmail.com failed: Address family for hostname not supported (code: -1, response: )]
Sep 8 17:43:45 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:50 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:43:55 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:00 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:00 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for battery.charge
Sep 8 17:44:05 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:10 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:15 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:20 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:23 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for battery.runtime
Sep 8 17:44:25 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:30 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:35 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:40 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:45 FIREWALL snmp-ups[18117]: [UPS] snmp_ups_walk: data stale for battery.runtime.low
Sep 8 17:44:45 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
Sep 8 17:44:50 FIREWALL upsmon[19346]: Poll UPS [UPS] failed - Data stale
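To get a feel for which gateways fire most often in an excerpt like the one above, a small script along these lines might help. It is only a sketch: the regular expression is written against the `rc.gateway_alarm` lines exactly as they appear in this log, and `count_alarms` is a made-up helper name.

```python
import re
from collections import Counter

# Pattern written against the rc.gateway_alarm lines shown in this thread.
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<name>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>\d+) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)


def count_alarms(log_text):
    """Count 'Alarm:1' events per gateway name in a block of log text."""
    counts = Counter()
    for match in ALARM_RE.finditer(log_text):
        if match.group("alarm") == "1":
            counts[match.group("name")] += 1
    return counts


sample = """
Sep 8 17:36:27 FIREWALL rc.gateway_alarm[40780]: >>> Gateway alarm: WANV6_TUNNELV6 (Addr:2001:470:1f1a:46d::1 Alarm:1 RTT:510.886ms RTTsd:361.034ms Loss:11%)
Sep 8 17:36:30 FIREWALL rc.gateway_alarm[58334]: >>> Gateway alarm: OVPN_S2S_VPNV4 (Addr:10.25.25.2 Alarm:1 RTT:707.150ms RTTsd:342.738ms Loss:11%)
Sep 8 17:36:51 FIREWALL rc.gateway_alarm[80565]: >>> Gateway alarm: WANV6_TUNNELV6 (Addr:2001:470:1f1a:46d::1 Alarm:1 RTT:578.630ms RTTsd:391.247ms Loss:22%)
"""

for name, hits in count_alarms(sample).most_common():
    print(name, hits)
```

Because it scans the text with `finditer`, it does not matter whether the log is pasted one entry per line or run together.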
-
A short time later I got another crash, this time with a crash report.
Dump header from device: /dev/nda0p2
Architecture: amd64
Architecture Version: 4
Dump Length: 617472
Blocksize: 512
Compression: none
Dumptime: 2024-09-08 20:08:22 +0300
Hostname: FIREWALL.mydomain.org
Magic: FreeBSD Text Dump
Version String: FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024
  root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/j
Panic String: page fault
Dump Parity: 44932402
Bounds: 0
Dump Status: good
Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 08
fault virtual address = 0x1c
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f246e2
stack pointer = 0x28:0xfffffe00e1f3bae0
frame pointer = 0x28:0xfffffe00e1f3bb70
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2 (clock (6))
rdi: 0000000000000000 rsi: 0000000000000000 rdx: fffffe00e1f3bcf8
rcx: 0000000000000000 r8: 0000000000000528 r9: 0000000000000000
rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00e1f3bb70
r10: 000000000000300f r11: 0000000000015069 r12: 0000000000000000
r13: 0000000000000528 r14: fffff8027dfb5000 r15: 0000000000000034
trap number = 12
panic: page fault
cpuid = 6
time = 1725815302
KDB: enter: panic
panic.txt 0600 0 0 12 14667355006 7145 ustar root wheel
page fault
version.txt 0600 0 0 457 14667355006 7635 ustar root wheel
FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024
root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBSD-src-plus-RELENG_24_03/amd64.amd64/sys/pfSense
full crash dump here
I see a lot of "Disabled multicast promiscuous mode" outputs here.
textdump.tar.0
Right now, my ISP is working on the cables in the neighborhood and I am having frequent WAN downtime, but for some reason this is crashing the firewall.
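As a sanity check when lining the panic up against the system log, the `time = 1725815302` field in the trap output can be converted to local time. A minimal sketch, assuming that field is a Unix epoch in seconds and taking the +0300 offset from the Dumptime line (`panic_time_local` is a made-up helper name):

```python
from datetime import datetime, timedelta, timezone


def panic_time_local(epoch_seconds, utc_offset_hours):
    """Convert a Unix epoch (seconds) to a datetime at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz)


# "time = 1725815302" from the panic message, "+0300" from the Dumptime line.
print(panic_time_local(1725815302, 3).isoformat())  # 2024-09-08T20:08:22+03:00
```

That matches the Dumptime in the header, so the panic happened at 20:08:22 local time.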
-
Ok that crash is this: https://redmine.pfsense.org/issues/15684
Try setting the workaround suggested there: https://redmine.pfsense.org/issues/15684#note-12
The logs show all gateways going down including what looks like an internal gateway?
Sep 8 17:37:07 FIREWALL rc.gateway_alarm[60969]: >>> Gateway alarm: MNG_DHCP (Addr:192.168.2.1 Alarm:1 RTT:24.549ms RTTsd:124.820ms Loss:21%)
Are all those gateways using the same NIC(s)?
-
@stephenw10 I have set the workaround, though I had to add it manually under System Tunables since it was not there by default.
There are 5 gateways with corresponding Interfaces
and the interfaces below
-
Ok so 4 of those gateways are all using igb1 but the MNG gateway uses igb0. So you would not expect to see all 5 throwing packet loss unless they go through the same switch maybe?
-
@stephenw10 Yep, igb1 goes to the modem port and igb0 goes to a different switch. MNG is a management network on a separate switch, with a DHCP server, that is not connected to the internet. It has all the IPMI and critical management connections. The purpose is to provide an environment where, even if pfSense crashes, the management interface stays up so I can reach the pfSense UI (if possible) and IPMI.
-
Hmm, what hardware is this?
Not much can cause two NICs to stop passing traffic like that. Especially igb NICs.
-
@stephenw10 It is a Supermicro SuperServer 5019D-4C-FN8TP with 32GB ECC RAM and an add-on AOC-S25G-I2S-O PCIe SFP28 25 Gbps card.
-
@stephenw10 To make it clear, the firewall just freezes. Even when I connect directly to the console, no input is registered. Until a reboot, it is just stuck at something.
-
Hmm, so all 4 of those ports are on-board.
Does it not respond even to ctl+t?
-
@stephenw10 no, it does not respond to anything. I did not try ctrl + t but ctrl + c, ctrl + alt + del, enter, space, backspace, nothing works
-
@Laxarus
We have the same hardware but not the 25 Gbps card. Please check over the IPMI interface for some PCIe, ... errors; we had a faulty Broadcom card some months ago.
-
Sometimes ctl+t is the only thing that will produce a response.
-
@stephenw10 I will try ctrl + t if the same thing happens again (hoping not). I will also try to troubleshoot the WAN when I go back (right now I only have remote access).
There is only one constant in all these situations: when the WAN goes down, there is a big chance of the firewall crashing or freezing.
And the two bugs that you have mentioned are contributing to this somehow when the WAN goes down. Hopefully, the next release of pfSense will take care of these bugs.
Thanks for bearing with me until now; I really appreciate it.
@slu thanks for the suggestion. I have checked the maintenance and health logs on the IPMI but there is nothing noteworthy there. It all seems normal.
-
So, I had the same issue again this morning and I still have no idea why this is happening. @stephenw10 I have tried ctrl + t and there was no response to that either.
Any advice on debugging this is very much appreciated.
Full log here, the freeze happened around Sep 16 07:00
system.log.0
-
You need to tune the OVPN_S2S_VPNV4 gateway. It's throwing alarms repeatedly. It's clearly a pretty bad route because the alarms are legitimate for the default settings. However, reloading the firewall each time it fires is not helping anything. You might just disable the monitoring or the monitoring action on that gateway.
But that shouldn't cause it to stop responding. The actual failure appears to happen here:
Sep 16 07:18:45 FIREWALL rc.gateway_alarm[63113]: >>> Gateway alarm: VPNAC_WG (Addr:10.11.0.1 Alarm:1 RTT:91.226ms RTTsd:79.944ms Loss:21%)
Sep 16 07:18:45 FIREWALL check_reload_status[635]: updating dyndns VPNAC_WG
Sep 16 07:18:45 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 16 07:18:45 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 16 07:18:45 FIREWALL check_reload_status[635]: Reloading filter
Sep 16 07:18:45 FIREWALL rc.gateway_alarm[65772]: >>> Gateway alarm: WAN_PPPOE (Addr:10.98.238.224 Alarm:1 RTT:5.947ms RTTsd:11.776ms Loss:21%)
Sep 16 07:18:45 FIREWALL check_reload_status[635]: updating dyndns WAN_PPPOE
Sep 16 07:18:45 FIREWALL check_reload_status[635]: Restarting IPsec tunnels
Sep 16 07:18:45 FIREWALL check_reload_status[635]: Restarting OpenVPN tunnels/interfaces
Sep 16 07:18:45 FIREWALL check_reload_status[635]: Reloading filter
Sep 16 07:18:46 FIREWALL php-fpm[20435]: /rc.openvpn: The command '/sbin/route -n6 get 'default' 2>/dev/null | /usr/bin/egrep 'flags: <.*PROTO.*>'' returned exit code '1', the output was ''
Sep 16 07:18:46 FIREWALL php-fpm[20435]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use VPNAC_WG.
Sep 16 07:18:46 FIREWALL php-fpm[20435]: /rc.openvpn: The command '/sbin/route -n6 get 'default' 2>/dev/null | /usr/bin/egrep 'flags: <.*PROTO.*>'' returned exit code '1', the output was ''
Sep 16 07:18:46 FIREWALL php-fpm[20435]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_PPPOE.
Sep 16 07:18:46 FIREWALL php-fpm[51827]: /rc.dyndns.update: phpDynDNS (@.mydomain.org): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Sep 16 07:18:50 FIREWALL ppp[53627]: [wan_link0] LCP: no reply to 1 echo request(s)
Sep 16 07:19:00 FIREWALL ppp[53627]: [wan_link0] LCP: no reply to 2 echo request(s)
Sep 16 07:19:05 FIREWALL rc.gateway_alarm[23895]: >>> Gateway alarm: MNG_DHCP (Addr:192.168.2.1 Alarm:1 RTT:4.611ms RTTsd:15.937ms Loss:22%)
That is where all gateways start to indicate failures and the PPPoE link goes down. Effectively no traffic is passing from that point.
But there are no lower level errors; the NICs do not show loss of link, for example.
The firewall is still logging and running scripts, so it doesn't appear to be down, at least until the end of that log.
When did you try to connect? How did you connect?
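To pin down when logging actually stopped, it can help to look for the largest silence between consecutive entries in system.log. The sketch below is one way to do that. It assumes BSD syslog style timestamps with no year (as in the excerpts in this thread), so the year has to be passed in, and it assumes an English locale for month names. `largest_gap` is a made-up helper name.

```python
import re
from datetime import datetime

# Timestamps such as "Sep 16 07:18:45" (month, day, clock; no year).
STAMP_RE = re.compile(r"\b([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2})\b")


def largest_gap(log_text, year):
    """Return (gap, before, after) for the longest pause between entries.

    Returns None if fewer than two timestamps are found.
    """
    stamps = [
        datetime.strptime(f"{year} {mon} {day} {clock}", "%Y %b %d %H:%M:%S")
        for mon, day, clock in STAMP_RE.findall(log_text)
    ]
    if len(stamps) < 2:
        return None
    pairs = zip(stamps, stamps[1:])
    return max(((b - a, a, b) for a, b in pairs), key=lambda item: item[0])


sample = (
    "Sep 16 07:18:50 FIREWALL ppp[53627]: [wan_link0] LCP: no reply to 1 echo request(s) "
    "Sep 16 07:19:00 FIREWALL ppp[53627]: [wan_link0] LCP: no reply to 2 echo request(s) "
    "Sep 16 07:19:05 FIREWALL rc.gateway_alarm[23895]: >>> Gateway alarm: MNG_DHCP"
)
gap, before, after = largest_gap(sample, 2024)
print(gap, before.time(), after.time())
```

Run against the full log, a gap of minutes or hours that ends at the boot messages would mark the point where the box really stopped, as opposed to the point where traffic stopped passing.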