WAN connectivity consistently dropped every 20 minutes [Solved]
I'm entirely new to pfSense and the forums so please excuse me if this is in the wrong section.
Yesterday, I installed pfSense 2.3.2 in a virtual machine (bhyve on HardenedBSD) using tap interfaces over bridges from physical igb interfaces. The virtual machine has 2GB RAM dedicated to it and 4 CPU cores. I live in an apartment where there's a fiber box in the basement, and a TP cable running into my living room. This plugs into my router and I get an address over DHCP.
The install is fairly basic. I'm not running any OpenVPN or ipsec tunnels and I'm not using dyndns. I am not using tftp or xinetd in any way. The only thing I really added was a port forward with a corresponding filter.
The problem is that every 20 minutes (almost precisely 20 or 21 minutes) the WAN gateway goes down. The interface stays up and keeps its IP address, but all connectivity is lost. This forces me to reboot. I don't think it comes back up on its own, but I haven't let it sit for more than 5 minutes or so.
Upon inspecting the system log, I find the following (most recent first):
Sep 3 12:18:18 xinetd 81664 Started working: 1 available service
Sep 3 12:18:18 xinetd 81664 xinetd Version 2.3.15 started with libwrap loadavg options compiled in.
Sep 3 12:18:17 check_reload_status Reloading filter
Sep 3 12:18:17 check_reload_status Restarting OpenVPN tunnels/interfaces
Sep 3 12:18:17 check_reload_status Restarting ipsec tunnels
Sep 3 12:18:17 check_reload_status updating dyndns WAN_DHCP
xinetd was started here because I had manually disabled the service while debugging. Here's another example of what the log says at the exact moment connectivity is dropped (most recent first):
Sep 3 10:02:26 xinetd 13232 Reconfigured: new=0 old=1 dropped=0 (services)
Sep 3 10:02:26 xinetd 13232 readjusting service 6969-udp
Sep 3 10:02:26 xinetd 13232 Swapping defaults
Sep 3 10:02:26 xinetd 13232 Starting reconfiguration
Sep 3 10:02:25 check_reload_status Reloading filter
Sep 3 10:02:25 check_reload_status Restarting OpenVPN tunnels/interfaces
Sep 3 10:02:25 check_reload_status Restarting ipsec tunnels
Sep 3 10:02:25 check_reload_status updating dyndns WAN_DHCP
While this is happening, dpinger in the "Gateways" log tab will tell me something like this:
send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr xx.xx.xx.xx bind_addr xx.xx.xx.xx identifier "WAN_DHCP "
WAN_DHCP xx.xx.xx.xx: Alarm latency 1325us stddev 2001us loss 21%
…over and over, with the occasional "WAN_DHCP xx.xx.xx.xx: sendto error: 64".
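For anyone reading these alarm lines later: the startup line lists the thresholds (latency_alarm 500ms, loss_alarm 20%), so it is possible to check which one a given alarm tripped. Below is a minimal Python sketch, assuming dpinger raises an alarm when a value exceeds its threshold. The helper name explain_alarm and the regular expression are illustrative only and not part of dpinger or pfSense.

```python
import re

# Thresholds copied from the dpinger startup line quoted above.
LATENCY_ALARM_US = 500 * 1000  # latency_alarm 500ms, in microseconds
LOSS_ALARM_PCT = 20            # loss_alarm 20%

# Pattern for the alarm line format shown in the Gateways log above.
ALARM_RE = re.compile(
    r"Alarm latency (?P<latency>\d+)us stddev (?P<stddev>\d+)us loss (?P<loss>\d+)%"
)


def explain_alarm(line):
    """Return the names of the thresholds exceeded in a dpinger alarm line."""
    match = ALARM_RE.search(line)
    if match is None:
        return []
    reasons = []
    # Assumption: an alarm fires when the measured value is above the threshold.
    if int(match.group("latency")) > LATENCY_ALARM_US:
        reasons.append("latency")
    if int(match.group("loss")) > LOSS_ALARM_PCT:
        reasons.append("loss")
    return reasons


sample = "WAN_DHCP xx.xx.xx.xx: Alarm latency 1325us stddev 2001us loss 21%"
print(explain_alarm(sample))
```

For the line quoted above, only the loss threshold is exceeded (21% against 20%); the 1325us latency is far below the 500ms limit.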
Here's what I've tried and what that did:
- Disable WAN gateway monitoring to consider it always up - this did not prevent it from going down.
- Set the WAN IPv4 explicitly as the default gateway as my ISP doesn't even provide IPv6 - this did not prevent it from going down.
- Manually stopping xinetd using "service xinetd stop" - this made it restart the next time check_reload_status ran.
- Spoof the MAC address of my previous router from pfSense - this broke WAN connectivity instantly. Probably because it's actually changing the MAC address of the virtual interface.
- Spoof the MAC address of my previous router on the host OS igb0 interface (the one that goes into the wall) - this changed nothing at all, and I got the same IP address over DHCP from upstream.
- Update to 2.3.3.a.20160902.1612 (latest) - this did nothing apart from giving me a really cool looking traffic graph.
I am kind of running out of ideas on how to debug this error and would greatly appreciate any pointers or solutions.
I got a reply in my corresponding Reddit thread (https://www.reddit.com/r/PFSENSE/comments/50yczo/wan_connectivity_consistently_dropped_every_20/) that explained everything. Turns out it's kinda hard to renew the DHCP lease when the firewall is blocking it. Oops!
Hope I'm not reviving too old a thread, but this has all the ingredients.
Same errors in Gateway log:
dpinger WAN_DHCP 126.96.36.199: sendto error: 65
General log (starting around this time):
check_reload_status updating dyndns WAN_DHCP
In the firewall log I noticed:
Block 192.168.100.1 port 67 (my modem) to port 68 192.168.100.20 (I assume this is the ISP DHCP server)
I rebooted pfsense and created a new firewall rule for the WAN interface to pass:
Source 192.168.100.1 IPv4 UDP
Destination 192.168.100.20 port 68
I just wanted to confirm if this is the correct remediation.
Edit: I spoke too soon
DHCP timed out again and I got this in the firewall log:
Oct 11 15:33:07 WAN Block ULA networks from WAN block fc00::/7 (12000) 192.168.100.1:67 192.168.100.20:68 UDP
Well, my rule was wrong (it matched source port 68 to destination port 68).
I changed it to: Source any IPv4 UDP port 67, destination 192.168.100.20 port 68
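To illustrate why the first rule never matched, here is a small Python sketch that compares both rules against the blocked packet from the firewall log (192.168.100.1:67 to 192.168.100.20:68, UDP). The Packet and PassRule classes are a hypothetical simplification for illustration; this is not how pf evaluates rules internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class PassRule:
    proto: str
    src_ip: Optional[str]    # None means "any"
    src_port: Optional[int]  # None means "any"
    dst_ip: Optional[str]
    dst_port: Optional[int]

    def matches(self, pkt):
        """True if every non-wildcard field equals the packet's field."""
        pairs = [
            (self.src_ip, pkt.src_ip),
            (self.src_port, pkt.src_port),
            (self.dst_ip, pkt.dst_ip),
            (self.dst_port, pkt.dst_port),
        ]
        return self.proto == pkt.proto and all(
            want is None or want == got for want, got in pairs
        )


# The blocked DHCP reply as it appears in the firewall log.
reply = Packet("udp", "192.168.100.1", 67, "192.168.100.20", 68)

# First attempt: source port 68 to destination port 68.
first_rule = PassRule("udp", "192.168.100.1", 68, "192.168.100.20", 68)
# Corrected rule: any source, source port 67, destination port 68.
fixed_rule = PassRule("udp", None, 67, "192.168.100.20", 68)

print(first_rule.matches(reply), fixed_rule.matches(reply))
```

The DHCP server side sends from port 67 and the client listens on port 68, so a rule that expects source port 68 cannot match the reply.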
The firewall log got me looking at my WAN interface connection. I have Block Private networks checked. Perhaps this is why the WAN interface is blocking the modem from sending traffic.
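A quick way to check that suspicion is to test whether the modem's address falls into a private range. The sketch below assumes the option covers at least the RFC 1918 ranges; the helper name is_rfc1918 is made up for illustration, and the check does not prove that this option is what actually dropped the packet (the log entry above is labelled with a different rule).

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr):
    """Return True if addr lies inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


# The modem address seen in the firewall log.
print(is_rfc1918("192.168.100.1"))
```

Since 192.168.100.1 is inside 192.168.0.0/16, a WAN-side block on private sources would cover the modem's DHCP replies.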