Troubleshooting WAN outage
-
@itsbry The Gateways log shows the dpinger alerts (and status updates, like when it starts).
-
@SteveITS
Hi Steve.. thanks for the reply!
I do see the dpinger entries... I am not sure exactly how to interpret them. Do these entries indicate a failure on both interfaces (WAN_DHCP and Cisco3650)? Below is a section that repeated until reboot:
2025-09-01 10:39:01.940024-04:00 dpinger 42829 WAN_DHCP 9.9.9.9: sendto error: 65
2025-09-01 10:39:00.916386-04:00 dpinger 43384 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 192.168.10.1 bind_addr 192.168.10.2 identifier "Cisco3650 "
2025-09-01 10:39:00.914271-04:00 dpinger 42829 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 9.9.9.9 bind_addr 64.179.196.79 identifier "WAN_DHCP "
2025-09-01 10:39:00.888785-04:00 dpinger 17634 exiting on signal 15
2025-09-01 10:39:00.881734-04:00 dpinger 17970 exiting on signal 15
2025-09-01 10:38:59.559298-04:00 dpinger 17970 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 192.168.10.1 bind_addr 192.168.10.2 identifier "Cisco3650 "
2025-09-01 10:38:59.557205-04:00 dpinger 17634 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 9.9.9.9 bind_addr 64.179.196.79 identifier "WAN_DHCP "
2025-09-01 10:38:59.532071-04:00 dpinger 51993 exiting on signal 15
2025-09-01 10:38:43.668833-04:00 dpinger 51993 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 192.168.10.1 bind_addr 192.168.10.2 identifier "Cisco3650 "
2025-09-01 10:38:43.654222-04:00 dpinger 36936 exiting on signal 15
2025-09-01 10:38:43.647320-04:00 dpinger 37539 exiting on signal 15
2025-09-01 10:38:43.602620-04:00 dpinger 36936 WAN_DHCP 9.9.9.9: sendto error: 65
2025-09-01 10:38:43.100796-04:00 dpinger 36936 WAN_DHCP 9.9.9.9: sendto error: 65
2025-09-01 10:38:41.600583-04:00 dpinger 37539 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 192.168.10.1 bind_addr 192.168.10.2 identifier "Cisco3650 "
2025-09-01 10:38:41.597546-04:00 dpinger 36936 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 9.9.9.9 bind_addr 64.179.196.79 identifier "WAN_DHCP "
2025-09-01 10:38:41.571964-04:00 dpinger 49679 exiting on signal 15
2025-09-01 10:38:41.564964-04:00 dpinger 50057 exiting on signal 15
-
@itsbry Error 65 is:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/gateway-errors.html#sendto-error-65
Re signal 15, see:
https://forum.netgate.com/topic/174601/dpinger-exiting-on-signal-15/6
...check the main log at 2025-09-01 10:38:43 and see if it logs anything about the interface?
The "send_interval 500ms loss_interval 2000ms ...." messages are when dpinger starts. Seems like you have two WANs so it monitors both.
FWIW you can disable the gateway monitoring action so nothing happens when an interface has high packet loss etc., but that doesn't help if the interface is actually disconnecting or something.
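To make those startup lines easier to read, here is a minimal Python sketch that splits one into its settings. The function name `parse_dpinger_start` is only illustrative and is not part of pfSense or dpinger; the sample line is copied from the log above.

```python
import re

# One dpinger startup line copied from the Gateways log above.
LINE = ('2025-09-01 10:39:00.914271-04:00 dpinger 42829 send_interval 500ms '
        'loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 '
        'alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms '
        'dest_addr 9.9.9.9 bind_addr 64.179.196.79 identifier "WAN_DHCP "')

def parse_dpinger_start(line):
    """Return the settings from a dpinger startup line as a dict, or None."""
    if " dpinger " not in line or "send_interval" not in line:
        return None  # e.g. "exiting on signal 15" or "sendto error" lines
    settings_text = line[line.index("send_interval"):]
    # Each setting is "<key> <value>"; only the identifier value is quoted.
    pairs = re.findall(r'(\w+) ("[^"]*"|\S+)', settings_text)
    return {key: value.strip('" ') for key, value in pairs}

settings = parse_dpinger_start(LINE)
for key in ("identifier", "dest_addr", "bind_addr", "loss_alarm", "latency_alarm"):
    print(f"{key:14} {settings[key]}")
```

For the WAN_DHCP line this shows the monitor address 9.9.9.9 being pinged from 64.179.196.79, with alarms at 20% loss or 500ms latency. A new startup line right after an "exiting on signal 15" line just means dpinger was restarted.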
-
Yeah I imagine the Cisco3650 is actually the downstream LAN side gateway. In which case it should probably be set to always up. Make sure it's not actually a gateway on LAN directly or pfSense will treat it as a WAN and NAT out of it by default. Unless you have disabled that.
But, yes, check the main system log for interface link state events.
-
@stephenw10 @SteveITS
Thanks to you both... yes, the Cisco3650 is on the LAN side (I'll check that against your recommendations). Here's what I see at 10:38:43 (clearly the WAN link is down):
2025-09-01 10:38:43.683920-04:00 check_reload_status 514 Reloading filter
2025-09-01 10:38:43.683873-04:00 check_reload_status 514 Restarting OpenVPN tunnels/interfaces
2025-09-01 10:38:43.683819-04:00 check_reload_status 514 Restarting IPsec tunnels
2025-09-01 10:38:43.683649-04:00 check_reload_status 514 updating dyndns WAN_DHCP
2025-09-01 10:38:43.681845-04:00 rc.gateway_alarm 54536 >>> Gateway alarm: WAN_DHCP (Addr:64.179.196.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
-
Ok, so I see this:
So I did this for Cisco3650:
-
@itsbry Cisco3650 shows as "LAN" interface, so why does it have a gateway set?
-
@SteveITS Novice user in a rush, probably (me).
I'm remote at the moment so I hesitate to pull that from the GW list... unless there's no way it will interrupt the network? FWIW, I have this:
-
It's a downstream gateway on the LAN, that's fine when you have a router there.
The thing it should not be is set on the LAN directly. So in Interfaces > LAN, no gateway should be set. That's what pfSense uses to determine which interfaces are "WANs".
Either way it's clearly not a problem currently and not the cause of a WAN outage, so no need to change that yet. It would be good to do so eventually to clean up the logs and make diagnosing other issues easier.
But, yes, dpinger shows it stops seeing ping responses from 9.9.9.9. That could be for a number of reasons though. Do you see the WAN interface lose link or lose its IP address in the system log when that happens?
-
@stephenw10 In Int>LAN, I have this (no gateway set?):
I do see the WAN int lose its link and there is no IP address. The thing is, I cannot tell when the link comes back up. Typically I give it 5-10 minutes and reboot the pfSense box... that has been the only way to recover.
I'm not exactly sure where this loop starts but I see this repeatedly, after the WAN link goes down:
2025-09-01 10:40:15.367970-04:00 php-fpm 12530 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 64.179.196.79 - Restarting packages.
2025-09-01 10:40:13.310548-04:00 php-fpm 12530 /rc.newwanip: Creating rrd update script
2025-09-01 10:40:13.305551-04:00 php-fpm 12530 /rc.newwanip: Resyncing OpenVPN instances for interface WAN.
2025-09-01 10:40:12.638767-04:00 rtsold 18399 <cap_rssend> sendmsg on re1: Permission denied
2025-09-01 10:40:12.524909-04:00 php-fpm 28462 /rc.openvpn: Default gateway setting as default.
2025-09-01 10:40:12.524697-04:00 php-fpm 33300 /rc.dyndns.update: Dynamic DNS () There was an error trying to determine the public IP for interface - wan (re1 ).
2025-09-01 10:40:12.520797-04:00 php-fpm 28462 /rc.openvpn: Gateway, NONE AVAILABLE
2025-09-01 10:40:12.301200-04:00 php-fpm 12530 /rc.newwanip: Dynamic DNS () There was an error trying to determine the public IP for interface - wan (re1 ).
2025-09-01 10:40:11.493599-04:00 check_reload_status 514 Reloading filter
2025-09-01 10:40:11.493551-04:00 check_reload_status 514 Restarting OpenVPN tunnels/interfaces
2025-09-01 10:40:11.493489-04:00 check_reload_status 514 Restarting IPsec tunnels
2025-09-01 10:40:11.493314-04:00 check_reload_status 514 updating dyndns WAN_DHCP
2025-09-01 10:40:11.491568-04:00 rc.gateway_alarm 61603 >>> Gateway alarm: WAN_DHCP (Addr:64.179.196.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
2025-09-01 10:40:10.347307-04:00 php-fpm 33300 /rc.dyndns.update: phpDynDNS (): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
2025-09-01 10:40:09.402183-04:00 php-fpm 49776 /rc.linkup: DEVD Ethernet detached event for wan
2025-09-01 10:40:09.402157-04:00 php-fpm 49776 /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp, 6: dhcp6)
2025-09-01 10:40:09.401931-04:00 check_reload_status 514 Reloading filter
2025-09-01 10:40:09.352329-04:00 php-fpm 81212 /rc.linkup: Removing static route for monitor 9.9.9.9 and adding a new route through 64.179.196.1
2025-09-01 10:40:09.333435-04:00 check_reload_status 514 updating dyndns wan
2025-09-01 10:40:08.637548-04:00 rtsold 18399 <cap_rssend> sendmsg on re1: Permission denied
2025-09-01 10:40:08.292761-04:00 check_reload_status 514 Restarting IPsec tunnels
2025-09-01 10:40:08.280665-04:00 php-fpm 81212 /rc.linkup: Gateway, NONE AVAILABLE
2025-09-01 10:40:07.958792-04:00 php-fpm 12530 /rc.newwanip: IP Address has changed, killing all states (ip_change_kill_states is set).
2025-09-01 10:40:07.942944-04:00 php-fpm 12530 /rc.newwanip: Default gateway setting Interface WAN_DHCP Gateway as default.
2025-09-01 10:40:07.937430-04:00 php-fpm 12530 /rc.newwanip: Gateway, NONE AVAILABLE
2025-09-01 10:40:07.378995-04:00 php-fpm 12530 /rc.newwanip: Removing static route for monitor 9.9.9.9 and adding a new route through 64.179.196.1
2025-09-01 10:40:07.362946-04:00 php-fpm 12530 /rc.newwanip: GW States: Killing states for down gateway: WAN_DHCP, 64.179.196.1
2025-09-01 10:40:07.258360-04:00 php-fpm 12530 /rc.newwanip: rc.newwanip: on (IP address: 64.179.196.79) (interface: WAN[wan]) (real interface: re1).
2025-09-01 10:40:07.258182-04:00 php-fpm 12530 /rc.newwanip: rc.newwanip: Info: starting on re1.
2025-09-01 10:40:06.266385-04:00 php-fpm 81212 /rc.linkup: Starting rtsold process on wan(re1)
2025-09-01 10:40:06.266249-04:00 php-fpm 81212 /rc.linkup: Starting DHCP6 client for interfaces re1
2025-09-01 10:40:06.254208-04:00 php-fpm 81212 /rc.linkup: Accept router advertisements on interface re1
2025-09-01 10:40:06.254131-04:00 php-fpm 81212 /rc.linkup: calling interface_dhcpv6_configure.
2025-09-01 10:40:06.253422-04:00 check_reload_status 514 rc.newwanip starting re1
2025-09-01 10:40:05.448607-04:00 kernel - re1: link state changed to UP
2025-09-01 10:40:05.361220-04:00 check_reload_status 514 Linkup starting re1
2025-09-01 10:40:01.428551-04:00 kernel - re1: link state changed to DOWN
2025-09-01 10:40:01.423296-04:00 check_reload_status 514 Linkup starting re1
2025-09-01 10:40:01.402732-04:00 php-fpm 81212 /rc.linkup: HOTPLUG: Configuring interface wan
2025-09-01 10:40:01.402723-04:00 php-fpm 81212 /rc.linkup: DEVD Ethernet attached event for wan
2025-09-01 10:40:01.402695-04:00 php-fpm 81212 /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp, 6: dhcp6)
2025-09-01 10:40:01.402592-04:00 check_reload_status 514 Reloading filter
2025-09-01 10:39:56.689768-04:00 rtsold 89908 <cap_rssend> sendmsg on re1: Permission denied
2025-09-01 10:39:56.285686-04:00 php-fpm 70474 /rc.start_packages: Restarting/Starting all packages.
2025-09-01 10:39:55.280147-04:00 check_reload_status 514 Reloading filter
2025-09-01 10:39:55.280042-04:00 check_reload_status 514 Starting packages
2025-09-01 10:39:55.279937-04:00 php-fpm 90685 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 64.179.196.79 - Restarting packages.
-
@itsbry said in Troubleshooting WAN outage:
In Int>LAN, I have this (no gateway set?):
Ok that's normal then for LAN. Convention is to use the .1 IP for the router but any valid IP will work. So it was just a bit confusing to read.
2025-09-01 10:40:05.448607-04:00 kernel - re1: link state changed to UP
2025-09-01 10:40:05.361220-04:00 check_reload_status 514 Linkup starting re1
2025-09-01 10:40:01.428551-04:00 kernel - re1: link state changed to DOWN
2025-09-01 10:40:01.423296-04:00 check_reload_status 514 Linkup starting re1
2025-09-01 10:40:01.402732-04:00 php-fpm 81212 /rc.linkup: HOTPLUG: Configuring interface wan
WAN lost link for a few seconds. Maybe the ISP modem rebooted? Though 4 seconds is pretty quick for that. I'd try changing patch cables, or put a switch in between pfSense and the ISP modem.
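As a rough check of that "4 seconds" figure, the gap between the kernel DOWN and UP lines quoted above can be computed directly. This is only a sketch: the function name `link_outages` is made up for illustration, and it assumes the log lines start with a timestamp in the format shown in this thread.

```python
from datetime import datetime

# Three lines copied from the system log quoted above (listed newest first).
LOG = """\
2025-09-01 10:40:05.448607-04:00 kernel - re1: link state changed to UP
2025-09-01 10:40:05.361220-04:00 check_reload_status 514 Linkup starting re1
2025-09-01 10:40:01.428551-04:00 kernel - re1: link state changed to DOWN
"""

def link_outages(log_text, iface):
    """Return (down_time, seconds_down) for each DOWN -> UP pair on iface."""
    marker = f"{iface}: link state changed to "
    events = []
    for line in log_text.splitlines():
        if marker not in line:
            continue
        date_part, time_part = line.split()[:2]
        stamp = datetime.fromisoformat(f"{date_part} {time_part}")
        state = line.split(marker, 1)[1].strip()
        events.append((stamp, state))
    events.sort()  # the log is newest first, so put events in time order
    outages, down_at = [], None
    for stamp, state in events:
        if state == "DOWN":
            down_at = stamp
        elif state == "UP" and down_at is not None:
            outages.append((down_at, (stamp - down_at).total_seconds()))
            down_at = None
    return outages

for down_at, seconds in link_outages(LOG, "re1"):
    print(f"re1 down at {down_at.time()} for {seconds:.2f} s")
```

For the lines quoted here the link was down for about 4.02 seconds. Running the same function over the full system log would list every flap on re1, which may help show whether the drops line up with anything on the modem side.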
-
OK great, that LAN side gateway is fine then. You need it for the downstream switch/router and it's correctly added separately from the interface. The only thing to consider there is disabling monitoring on it since it should always be up. Edit: Looks like you already did, so that's no longer an issue.
But if you have more than one gateway, make sure that the default IPv4 gateway is set to WAN_DHCP. Otherwise, if WAN goes down for some reason, pfSense will switch to a different gateway when set to automatic. And it will not switch back, which could explain why you have to reboot to restore connectivity. Though I don't see that in the log.
-
@stephenw10 @SteveITS Thank you both for the help... I've made the default gateway change and we'll give it a few days to test. I'll try to remember to post the results of the next outage (hopefully we recover automatically!)