Gateway monitoring Error 64 fixed by reboot—what’s the cause?
-
I have a Netgate 1100 which has hundreds of these entries in the gateway log (Status → System Logs → System → Gateways):
Apr 15 04:55:00 newsyslog 54814 logfile turned over due to size>500K
Apr 15 04:55:00 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:00 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:01 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:01 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:02 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:02 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:03 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:04 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
Apr 15 04:55:04 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
These cease only after a reboot. As I see it, this means the WAN gateway is not really down, but some other process is keeping dpinger from reaching it. How can I get to the bottom of this? My client has to reboot her Netgate 1100 every other day. Maybe this is a clue: the ISP is Metronet.
-
Is that the expected gateway IP address? It could be a temporary address, such as one handed out by the modem, if the WAN goes down and is brought back up.
Sendto error 64 implies the gateway is not responding to ARP, so the problem is something low-level.
Steve
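For reference, here is a small sketch of what the number in "sendto error: 64" stands for. pfSense is FreeBSD-based, and on FreeBSD errno 64 is EHOSTDOWN, which is typically what you get when ARP resolution for the next hop fails. The table is hardcoded because errno numbering differs between operating systems; the function name and the choice of which codes to include are just for illustration.

```python
# Subset of FreeBSD errno values (pfSense runs on FreeBSD).
# Hardcoded on purpose: the same numbers mean different things on Linux.
FREEBSD_ERRNO = {
    50: ("ENETDOWN", "Network is down"),
    51: ("ENETUNREACH", "Network is unreachable"),
    55: ("ENOBUFS", "No buffer space available"),
    64: ("EHOSTDOWN", "Host is down"),
    65: ("EHOSTUNREACH", "No route to host"),
}


def describe_sendto_error(code: int) -> str:
    """Translate the numeric code from a dpinger 'sendto error' log line."""
    name, text = FREEBSD_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return f"sendto error {code}: {name} ({text})"


print(describe_sendto_error(64))
```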
-
@stephenw10: As far as I can tell, 100.x.x.1 is the gateway address assigned by the Metronet DHCP server. It does change from time to time, as Metronet does double-NAT for its residential customers.
Also, it’s not temporary, as the several hundred of those entries span at least 35 min. What I am showing is the beginning of the log file after it turned over. What happened before that and how long that lasted, I don’t know.
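To put numbers on how many of these entries there are and how long they span, a rough sketch like the following could summarize a saved copy of the gateway log. It assumes the lines look exactly like the ones pasted above (timestamp, process, PID, gateway name, address, message); the raw log file on disk may be formatted differently. The log has no year, so the `year` argument is a made-up default, and `summarize` is just an illustrative name.

```python
import re
from datetime import datetime

# Matches lines such as:
# Apr 15 04:55:00 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) dpinger \d+ "
    r"(?P<gateway>\S+) (?P<addr>\S+): sendto error: (?P<code>\d+)$"
)


def summarize(lines, year=2024):
    """Count dpinger sendto errors per (gateway, code) with first/last time.

    `year` is an assumption: syslog timestamps in this format omit it.
    """
    summary = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip newsyslog and any other non-matching entries
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        key = (m["gateway"], int(m["code"]))
        entry = summary.setdefault(key, {"count": 0, "first": when, "last": when})
        entry["count"] += 1
        entry["first"] = min(entry["first"], when)
        entry["last"] = max(entry["last"], when)
    return summary


sample = [
    "Apr 15 04:55:00 newsyslog 54814 logfile turned over due to size>500K",
    "Apr 15 04:55:00 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64",
    "Apr 15 04:55:01 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64",
    "Apr 15 04:55:04 dpinger 32921 WAN_DHCP 100.x.x.1: sendto error: 64",
]

for (gateway, code), e in summarize(sample).items():
    print(gateway, code, e["count"], e["last"] - e["first"])
```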
Again, the fact that a reboot fixes this immediately tells me that some pfSense process essentially causes a denial of service on the WAN interface.
-
Then I would run a pcap when it's in that state and see what's happening on the interface. The error implies pfSense is sending ARP requests and the gateway never replies.
Does it reconnect if the WAN cable is unplugged and reconnected?
Does it eventually reconnect if they do nothing?
-
@stephenw10: It has on occasion recovered after quite some time without a reboot.
What would the unplugging and reconnecting of the WAN cable accomplish? How long should it remain disconnected? I will suggest that to her.
As for the pcap, any suggestions as to which parameters to set and which filters to use?
-
Re-linking the WAN triggers a bunch of scripts. Among other things, it restarts the DHCP client, which starts by sending a broadcast to any DHCP server rather than talking only to that gateway.
I would start by running the pcap without any filter on WAN. If you see anything coming back in at all that gives us a clue.
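Once you have a capture, one thing to look for is whether the ARP requests for the gateway ever get a reply. Below is a rough sketch that counts requests and replies per target IP. It assumes the capture has been rendered as text in the style of `tcpdump -n` output ("ARP, Request who-has ... tell ..." and "ARP, Reply ... is-at ..."). The addresses in the sample are placeholders, not taken from the real capture, and `arp_summary` is just an illustrative name.

```python
import re
from collections import Counter

# Patterns for ARP lines as printed by tcpdump with -n (no name resolution).
REQUEST_RE = re.compile(r"ARP, Request who-has (\S+)(?: \(\S+\))? tell (\S+),")
REPLY_RE = re.compile(r"ARP, Reply (\S+) is-at ([0-9A-Fa-f:]+)")


def arp_summary(lines):
    """Return {target_ip: (requests_seen, replies_seen)} for each requested IP."""
    requests, replies = Counter(), Counter()
    for line in lines:
        if (m := REQUEST_RE.search(line)):
            requests[m.group(1)] += 1
        elif (m := REPLY_RE.search(line)):
            replies[m.group(1)] += 1
    return {ip: (n, replies[ip]) for ip, n in requests.items()}


# Placeholder sample: three unanswered requests and one answered request.
sample = [
    "04:55:00.100000 ARP, Request who-has 192.0.2.1 tell 192.0.2.57, length 28",
    "04:55:01.100000 ARP, Request who-has 192.0.2.1 tell 192.0.2.57, length 28",
    "04:55:02.100000 ARP, Request who-has 192.0.2.1 tell 192.0.2.57, length 28",
    "04:55:03.100000 ARP, Request who-has 192.0.2.254 tell 192.0.2.57, length 28",
    "04:55:03.101000 ARP, Reply 192.0.2.254 is-at 00:11:22:33:44:55, length 46",
]

for ip, (asked, answered) in arp_summary(sample).items():
    print(f"{ip}: {asked} requests, {answered} replies")
```

A gateway address with many requests and zero replies would match the "not responding to ARP" theory. If replies do show up, the problem is more likely somewhere else.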