Graph shows high packet loss but can't find the problem

alexharrington

Hi there

We've been using pfSense for coming up to a week now in a school balancing 3 Internet connections - one leased fibre line and two FTTC BT Infinity lines via Zen Internet.

It's been super - so much better than the unit we had before. However, yesterday around 12pm one of the two FTTC lines started showing 60% packet loss, and it now constantly flips between up and down in the Gateways section of the dashboard. I've checked the cabling and it all appears OK. I've restarted the OpenReach modem too.

If I manually ping the gateway address for that PPPoE connection using Diagnostics -> Ping and selecting the appropriate OPT interface, I can't get it to show me any dropped packets at all. I'm not sure if there's actually a problem here, or if there's something not right in the graphing? Graph attached showing the sudden jump up.

Many thanks

Alex

Attachment: status_rrd_graph_img.png (graph thumbnail)

alexharrington

Should have said we're using 2.0.1-RELEASE amd64.

In the system log I have loads of these lines:
Dec 12 11:26:16 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:16 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:16 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:16 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:11 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:11 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:11 php: : MONITOR: GW_OPT1 is down, removing from routing group
Dec 12 11:26:11 php: : MONITOR: GW_OPT1 is down, removing from routing group
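
To see how often the gateway is being flagged, lines like these could be tallied per timestamp. This is a minimal Python sketch, assuming the exact line format quoted above; the function name `count_down_events` is hypothetical and not part of pfSense.

```python
import re
from collections import Counter

# Pattern assumes the log line format quoted above; other pfSense
# versions may format these messages differently.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}) php: : MONITOR: "
    r"(?P<gw>\S+) is down, removing from routing group$"
)

def count_down_events(lines):
    """Return a Counter keyed by (timestamp, gateway) for 'is down' lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("ts"), match.group("gw"))] += 1
    return counts

# The eight lines from the system log excerpt above.
sample = (
    ["Dec 12 11:26:16 php: : MONITOR: GW_OPT1 is down, removing from routing group"] * 4
    + ["Dec 12 11:26:11 php: : MONITOR: GW_OPT1 is down, removing from routing group"] * 4
)

for (ts, gw), n in sorted(count_down_events(sample).items()):
    print(ts, gw, n)
```

For the excerpt above this reports four entries at each of the two timestamps, five seconds apart.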

alexharrington

If I manually ping from the Diagnostics -> Ping screen using the appropriate interface, I get this. I've tried numerous times but can't get a single one to fail.

64 bytes from 62.3.84.23: icmp_seq=0 ttl=255 time=15.384 ms
64 bytes from 62.3.84.23: icmp_seq=1 ttl=255 time=15.197 ms
64 bytes from 62.3.84.23: icmp_seq=2 ttl=255 time=15.439 ms
64 bytes from 62.3.84.23: icmp_seq=3 ttl=255 time=15.126 ms
64 bytes from 62.3.84.23: icmp_seq=4 ttl=255 time=15.222 ms
64 bytes from 62.3.84.23: icmp_seq=5 ttl=255 time=14.821 ms
64 bytes from 62.3.84.23: icmp_seq=6 ttl=255 time=15.151 ms
64 bytes from 62.3.84.23: icmp_seq=7 ttl=255 time=15.175 ms
64 bytes from 62.3.84.23: icmp_seq=8 ttl=255 time=15.605 ms
64 bytes from 62.3.84.23: icmp_seq=9 ttl=255 time=15.114 ms

--- 62.3.84.23 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 14.821/15.223/15.605/0.202 ms
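
As a rough sanity check on why this looks inconsistent with the graph: if the link really were dropping 60% of packets, ten replies in a row would be very unlikely. The sketch below assumes each ping is lost independently with the same probability, which is a simplification (real loss is often bursty); the function name `prob_all_received` is made up for illustration.

```python
def prob_all_received(loss_rate, n_pings):
    """Chance that every one of n_pings gets a reply, assuming each
    packet is lost independently with probability loss_rate."""
    return (1.0 - loss_rate) ** n_pings

# 60% loss as shown on the graph, 10 pings as in the run above.
p = prob_all_received(0.60, 10)
print(f"{p:.6f}")  # 0.000105, i.e. roughly 1 chance in 10,000
```

Getting a clean run by luck that often, over numerous attempts, is not plausible under this assumption, so the manual pings and the graph really do disagree.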

alexharrington

… and as suddenly as it started it's fixed itself :/

Not really sure what happened but it seems OK for now.

Alex

cmb

Is your monitor IP different from your gateway IP, I'm guessing? That's the only explanation I can think of for being able to ping your gateway IP while the quality graph shows loss. That's assuming you were picking that WAN as the interface in Diagnostics -> Ping. If you picked a different WAN, the ping would route out to the Internet via that WAN and could well reach the gateway with no problem via that path (the problem was most likely between your location and your ISP's router, assuming a type of service where the router isn't physically at your location).

alexharrington

Nope, monitor IP is blank so it's using the gateway address.

I made sure I selected the correct WAN interface when sending the pings and even tried from the command line using the -I option to specify the source address and couldn't get a dropped packet at all.

Very strange. It hasn't reoccurred yet so fingers crossed.