WAN connection randomly drops?

ExecutableFix

It doesn't show an 'IP address changed' log line, but the address definitely changes to 0.0.0.0 when the connection is down. I'm passing an HP NC364T quad-port NIC through to the VM. The em0 WAN port is directly connected to the bridged ISP router.

flyboy320

I'm a newbie at this as well, so take this with a grain of salt.

I was having the same issues as you, WAN would drop randomly 1-2 times a day (no good when gaming). I connect through my modem with PPPoE and not DHCP, which I think is what the issue was. I tried everything to try and fix it, but nothing worked (new computer, new NIC, new cables, fresh install of pfSense, etc.). As a last resort I tried 2.5 and it has been rock solid for over a week with no disconnects.

Like I said before take what I say knowing I'm new at this, but connecting to my ISP through PPPoE just wasn't working on 2.4, but 2.5 seems to have solved the issue for me.

ExecutableFix

@flyboy320 My ISP doesn't support PPPoE, so that's not going to work.

stephenw10

Well, 2.5 has fixes for numerous other things, so it might work for you too, but it would be better to narrow down what is actually failing when that happens.

ExecutableFix

It just happened again

Feb 23 19:53:44	php-fpm	                85374	/rc.start_packages: Restarting/Starting all packages.
Feb 23 19:53:43	check_reload_status		Starting packages
Feb 23 19:53:43	php-fpm	                72364	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 185.47.x.x -> 185.47.x.x - Restarting packages.
Feb 23 19:53:41	php-fpm	                72364	/rc.newwanip: Creating rrd update script
Feb 23 19:53:41	php-fpm	                72364	/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb 23 19:53:39	check_reload_status		Reloading filter
Feb 23 19:53:39	check_reload_status		updating dyndns wan
Feb 23 19:53:38	php-fpm	                72364	/rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1582484018] unbound[83768:0] error: bind: address already in use [1582484018] unbound[83768:0] fatal error: could not open ports'
Feb 23 19:53:36	php-fpm	                72364	/rc.newwanip: Gateway, none 'available' for inet6, use the first one configured. ''
Feb 23 19:53:36	php-fpm	                72364	/rc.newwanip: rc.newwanip: on (IP address: 185.47.x.x) (interface: WAN[wan]) (real interface: em0).
Feb 23 19:53:36	php-fpm	                72364	/rc.newwanip: rc.newwanip: Info: starting on em0.
Feb 23 19:53:35	check_reload_status		Restarting ipsec tunnels
Feb 23 19:53:35	php-fpm	                43417	/rc.linkup: Gateway, none 'available' for inet6, use the first one configured. ''
Feb 23 19:53:35	php-fpm	                43417	/rc.linkup: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Feb 23 19:53:35	check_reload_status		rc.newwanip starting em0
Feb 23 19:52:55	php-fpm	                43417	/rc.linkup: HOTPLUG: Configuring interface wan
Feb 23 19:52:55	php-fpm	                43417	/rc.linkup: DEVD Ethernet attached event for wan
Feb 23 19:52:54	kernel		                em0: link state changed to UP
Feb 23 19:52:54	check_reload_status		Linkup starting em0
Feb 23 19:52:33	php-fpm	                43417	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Feb 23 19:52:33	php-fpm	                43417	/rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Feb 23 19:52:32	check_reload_status		Reloading filter
Feb 23 19:52:32	check_reload_status		Restarting OpenVPN tunnels/interfaces
Feb 23 19:52:32	check_reload_status		Restarting ipsec tunnels
Feb 23 19:52:32	check_reload_status		updating dyndns WAN_DHCP
Feb 23 19:52:32	rc.gateway_alarm	14180	>>> Gateway alarm: WAN_DHCP (Addr:185.47.x.x Alarm:1 RTT:1.386ms RTTsd:.291ms Loss:22%)
Feb 23 19:52:18	check_reload_status		Reloading filter
Feb 23 19:52:17	php-fpm	                43417	/rc.linkup: DEVD Ethernet detached event for wan
Feb 23 19:52:16	kernel		                em0: link state changed to DOWN
Feb 23 19:52:16	check_reload_status		Linkup starting em0

I'm really not sure what's going on and why this is happening

stephenw10

Feb 23 19:52:54	kernel		                em0: link state changed to UP
Feb 23 19:52:54	check_reload_status		Linkup starting em0
...
Feb 23 19:52:17	php-fpm	                43417	/rc.linkup: DEVD Ethernet detached event for wan
Feb 23 19:52:16	kernel		                em0: link state changed to DOWN
Feb 23 19:52:16	check_reload_status		Linkup starting em0

This implies the NIC lost link. Since it's a VM that's unlikely unless it is a physical NIC passed through, is it?

What is em0 connected to?

Steve

ExecutableFix

It is a physical NIC passed through, and em0 is the WAN connection from the bridged router.

stephenw10

And it's connected directly to the router? You might try putting a switch in between.

If it really is losing link, the switch will probably stop em0 from seeing the link go down, but you will still lose connectivity.

It loses link for 38s. Is the upstream router rebooting?
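For anyone wanting to pull that figure out of a longer log, here is a minimal Python sketch that pairs the kernel's DOWN and UP link events and reports how long each drop lasted. It assumes the tab-separated system log format pasted above; the function name `link_outages` and the default year are my own assumptions (the log lines carry no year).

```python
import re
from datetime import datetime

# Matches kernel link events such as:
#   "Feb 23 19:52:16 kernel em0: link state changed to DOWN"
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel\s+"
    r"(?P<nic>\w+): link state changed to (?P<state>UP|DOWN)"
)

def link_outages(lines, year=2020):
    """Return (nic, down_time, seconds_down) tuples in chronological order.

    `year` is an assumption: the log format does not include one.
    """
    events = []
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["nic"], m["state"]))
    events.sort()  # the log above is newest-first, so put it in time order

    down_since = {}  # nic -> time the link went down
    outages = []
    for ts, nic, state in events:
        if state == "DOWN":
            down_since[nic] = ts
        elif nic in down_since:
            start = down_since.pop(nic)
            outages.append((nic, start, (ts - start).total_seconds()))
    return outages

# The two kernel lines from the log above.
sample = [
    "Feb 23 19:52:54\tkernel\t\t                em0: link state changed to UP",
    "Feb 23 19:52:16\tkernel\t\t                em0: link state changed to DOWN",
]
for nic, start, secs in link_outages(sample):
    print(f"{nic} lost link at {start:%H:%M:%S} for {secs:.0f}s")
```

Run against the two kernel lines quoted above, it prints `em0 lost link at 19:52:16 for 38s`.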

Steve

ExecutableFix

Not sure what good a switch would do when the connection just randomly drops?

I don't think the bridged router is restarting, because this never happened before I switched to pfSense. At this point, though, I'm not sure whether it's pfSense's fault or the ISP router config. Would there be any way to 'debug' this problem?

A Former User

I don't want to be 'that guy', but do you happen to have a cable tester? Or are you using a 'known good cable'?

stephenw10

Putting a switch in between should mean the link would not drop between em0 and the switch. You would see that change in the logs. If it does not drop then it looks like a problem with the router, or at least with the connection between that and the NIC. If it still drops then it's some issue with pfSense on the VM setup.

Steve

ExecutableFix

@sparkyMcpenguin I've tested the cables with a cable tester and ran a speedtest, and both came out fine. The cables shouldn't be the reason for a random drop.

@stephenw10 Hmm, I could give that a try and see if that works. If it's not pfSense's fault then I'll have to contact my ISP. Thanks for the tip!

ExecutableFix

Another update: Even with the switch attached it still dropped packets, but not the link. I'm pretty sure this is an ISP issue so I'll get in touch with them. I'll post another update once that's done.

ExecutableFix

It is most definitely something wrong with pfSense. The same thing is happening to a VM somewhere across the country.

Here's the log of that VM:

Feb 23 00:14:51 	php-fpm 	        60158 	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Feb 23 00:14:51 	php-fpm 	        60158 	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. 'WAN_DHCP6'
Feb 23 00:14:51 	php-fpm 	        60158 	/rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Feb 23 00:14:50 	check_reload_status 		Reloading filter
Feb 23 00:14:50 	check_reload_status 		Restarting OpenVPN tunnels/interfaces
Feb 23 00:14:50 	check_reload_status 		Restarting ipsec tunnels
Feb 23 00:14:50 	check_reload_status 		updating dyndns WAN_DHCP
Feb 23 00:14:50 	rc.gateway_alarm 	96155 	>>> Gateway alarm: WAN_DHCP (Addr:145.44.x.1 Alarm:0 RTT:5.100ms RTTsd:3.103ms Loss:17%)
Feb 23 00:11:48 	php-fpm 	        27479 	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Feb 23 00:11:48 	php-fpm 	        27479 	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. 'WAN_DHCP6'
Feb 23 00:11:48 	php-fpm 	        27479 	/rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Feb 23 00:11:47 	check_reload_status 		Reloading filter
Feb 23 00:11:47 	check_reload_status 		Restarting OpenVPN tunnels/interfaces
Feb 23 00:11:47 	check_reload_status 		Restarting ipsec tunnels
Feb 23 00:11:47 	check_reload_status 		updating dyndns WAN_DHCP
Feb 23 00:11:47 	rc.gateway_alarm 	8754 	>>> Gateway alarm: WAN_DHCP (Addr:145.44.x.1 Alarm:1 RTT:5.367ms RTTsd:3.278ms Loss:21%)

And here's my log (after putting a switch between the bridged router and pfSense):

Feb 25 03:46:01	check_reload_status		Reloading filter
Feb 25 03:46:01	check_reload_status		Restarting OpenVPN tunnels/interfaces
Feb 25 03:46:01	check_reload_status		Restarting ipsec tunnels
Feb 25 03:46:01	check_reload_status		updating dyndns WAN_DHCP
Feb 25 03:46:00	rc.gateway_alarm	43954	>>> Gateway alarm: WAN_DHCP (Addr:185.x.x.1 Alarm:0 RTT:1.397ms RTTsd:.280ms Loss:5%)
Feb 25 03:43:59	php-fpm	                43417	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Feb 25 03:43:59	php-fpm	                43417	/rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Feb 25 03:43:58	check_reload_status		Reloading filter
Feb 25 03:43:58	check_reload_status		Restarting OpenVPN tunnels/interfaces
Feb 25 03:43:58	check_reload_status		Restarting ipsec tunnels
Feb 25 03:43:58	check_reload_status		updating dyndns WAN_DHCP
Feb 25 03:43:58	rc.gateway_alarm	71623	>>> Gateway alarm: WAN_DHCP (Addr:185.x.x.1 Alarm:1 RTT:1.477ms RTTsd:.290ms Loss:22%)

The interval is different for the two VMs, though: mine is settling at somewhere between 1.5 and 2 days, while on the other VM there were 5 days in between. It's starting to look like some sort of DHCP lease setting is configured wrong. Maybe you can give some more insight, @stephenw10?

stephenw10

Neither of those logs indicates any issue other than that there was more than 20% packet loss on the WAN to the IP being monitored. Everything else there is exactly what I would expect to happen when there is that much packet loss.
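To see how long each of those alarm periods lasted, a rough Python sketch along these lines can pair each raised alarm with the line that clears it. It assumes that `Alarm:1` means the alarm was raised and `Alarm:0` means it cleared, and that the lines look like the `rc.gateway_alarm` entries pasted above; `alarm_windows` and the default year are hypothetical names and values of mine.

```python
import re
from datetime import datetime

# Matches lines such as:
#   "Feb 25 03:43:58 rc.gateway_alarm 71623 >>> Gateway alarm: WAN_DHCP
#    (Addr:185.x.x.1 Alarm:1 RTT:1.477ms RTTsd:.290ms Loss:22%)"
ALARM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+rc\.gateway_alarm\s+\d+\s+"
    r">>> Gateway alarm: (?P<gw>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>[01]) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)

def alarm_windows(lines, year=2020):
    """Return (gateway, seconds_in_alarm, loss_percent_when_raised) tuples.

    Assumes Alarm:1 raises and Alarm:0 clears; `year` is an assumption
    because the log lines do not include one.
    """
    events = []
    for line in lines:
        m = ALARM_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["gw"], m["alarm"] == "1", int(m["loss"])))
    events.sort()  # the log is newest-first, so put it in time order

    raised = {}  # gateway -> (time raised, loss at that moment)
    windows = []
    for ts, gw, is_raised, loss in events:
        if is_raised:
            raised[gw] = (ts, loss)
        elif gw in raised:
            start, start_loss = raised.pop(gw)
            windows.append((gw, (ts - start).total_seconds(), start_loss))
    return windows

# The two alarm lines from the Feb 25 log above.
sample = [
    "Feb 25 03:46:00\trc.gateway_alarm\t43954\t>>> Gateway alarm: WAN_DHCP "
    "(Addr:185.x.x.1 Alarm:0 RTT:1.397ms RTTsd:.280ms Loss:5%)",
    "Feb 25 03:43:58\trc.gateway_alarm\t71623\t>>> Gateway alarm: WAN_DHCP "
    "(Addr:185.x.x.1 Alarm:1 RTT:1.477ms RTTsd:.290ms Loss:22%)",
]
for gw, secs, loss in alarm_windows(sample):
    print(f"{gw}: in alarm for {secs:.0f}s, raised at {loss}% loss")
```

On the Feb 25 lines this reports a window of 122 seconds that started at 22% loss, which fits the "more than 20%" reading.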

Make sure you are monitoring an IP that actually responds to ping reliably. The ISP gateway may not always do that.
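One way to check whether a candidate monitor IP answers reliably is to ping it from another machine and read the loss figure from the summary line. The sketch below is only an illustration under a few assumptions: the system `ping` accepts `-c <count>`, its summary contains the text "% packet loss", and `parse_loss` and `probe` are hypothetical helper names of mine.

```python
import re
import subprocess

# Matches the loss figure in ping's summary line, for example
# "20.0% packet loss" or "0% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")

def parse_loss(ping_output):
    """Return the packet-loss percentage from ping's summary, or None."""
    m = LOSS_RE.search(ping_output)
    return float(m.group(1)) if m else None

def probe(host, count=10):
    """Ping `host` `count` times; return the loss percentage, or None if unparsable."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return parse_loss(result.stdout)

# Illustrative summary line, made up for this example (not from the thread).
example = "10 packets transmitted, 8 packets received, 20.0% packet loss"
print(parse_loss(example))
```

Calling `probe()` with the address you plan to monitor, repeated over a day or so, should show whether that host drops pings even while the connection is otherwise fine.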

If that's the only gateway you can disable 'gateway monitoring action' on it whilst still monitoring. That will avoid most of what you see in the logs there but it shouldn't be causing a problem.

Make sure you have a default IPv4 gateway set in System > Routing > Gateways, rather than leaving it on automatic, to avoid switching to a bad gateway.

Steve

ExecutableFix

@stephenw10 There's only one gateway configured in the routing section, which is my IPv4 DHCP gateway. I've already tried setting the monitoring IP to the Google DNS, but the same alarms still appear. At the time those alarms appear, the internet is in fact down. Is there a setting I'm overlooking here?

stephenw10

No that's fine then. If the internet is actually down the alarms should trigger.

What exactly is the problem here?

ExecutableFix

@stephenw10 It's the fact that the internet is down every 1.5 days when that literally never happened before I switched to pfSense. It's nice that the pfSense alarms trigger, but that's not the problem. It seems like pfSense disconnects the internet every 1.5 days for no apparent reason. It looks like the same thing is happening to the random VM server, which makes it seem like it's a pfSense bug or a setting that's configured incorrectly.

stephenw10

If you disable the gateway monitoring action it will not actually do anything other than set an alarm.

Check the quality monitoring graphs in Status > Monitoring. Are you seeing packet loss consistently or just spikes before it goes down?
How were you monitoring the connection before you had pfSense?

Steve

ExecutableFix

@stephenw10 This is the graph showing the packet loss at the moment it goes down.

There are no other spikes (traffic, packets, etc.) to be found. It's just plain packet loss at a random time. I wasn't really monitoring the connection before pfSense, but I never experienced internet outages this often. The only outages that happened were more like 1 hour+ long, and those only happened rarely.