100% packet loss every 15-20 minutes

flat4

@colorock Who's the ISP?

Curious to know as I have had some issues with my fiber.

ColoRock

@flat4 What state are you in?

stephenw10

Yeah, you shouldn't have to send ARP requests like that. It shouldn't matter if their ARP entry expires; they should just send an ARP query for your WAN IP and pfSense responds.
The most likely scenario is something else is sending ARPs with your IP at a slightly shorter interval than pfSense.

michmoor

@stephenw10 Thinking the same. Run a packet capture on your WAN and look out for ARPs. Start the investigation there to see who else is sending ARPs with your IP address in the sender IP field.

ColoRock

@michmoor @stephenw10

So, something else on my network might also be sending ARP requests to the gateway? I tested overnight previously with my LAN physically disconnected from the router, and pfSense still logged the outages. Could it be the ONT device? Not sure how I could monitor what the ONT is up to since it’s the ISP’s device. Thanks!

flat4

@colorock
Oklahoma

ColoRock

@flat4 I’m in Colorado. My ISP only serves the town I live in.

stephenw10

If your ISP is any good you would hope they are not sending broadcast traffic from customers to all other customers. But they might and that would prove the issue which would be great to take to them so they can investigate.
Run a packet capture on WAN in promisc mode and filter by both protocol:ARP and your WAN IP. I would only expect to see your own ARP queries to the gateway at ~20min intervals. Or 10min if you set that sysctl. If you see something else sending queries from the same IP that's an issue.
However, I doubt you will, because clients should be isolated and pfSense will log errors if it sees something else using its IP address. I assume you have not seen errors like that in the system log.
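
For anyone who wants to automate that check, here is a minimal sketch, assuming you have already pulled the raw ARP payloads (the 28 bytes after the Ethernet header) out of the capture. The function names, the example addresses, and the idea of passing in your own WAN MAC are all illustrative assumptions, not anything pfSense provides. It flags any ARP packet that carries your WAN IP as sender but comes from a different MAC.

```python
import socket
import struct

# Layout of an IPv4-over-Ethernet ARP payload (28 bytes):
# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"


def parse_arp(payload: bytes) -> dict:
    """Parse the 28-byte ARP payload that follows the Ethernet header."""
    _htype, _ptype, _hlen, _plen, oper, sha, spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:28]
    )
    return {
        "oper": oper,  # 1 = request, 2 = reply
        "sender_mac": sha.hex(":"),
        "sender_ip": socket.inet_ntoa(spa),
        "target_ip": socket.inet_ntoa(tpa),
    }


def foreign_senders(arp_payloads, wan_ip: str, wan_mac: str) -> list:
    """Return ARP packets that claim wan_ip as sender but use another MAC."""
    hits = []
    for raw in arp_payloads:
        pkt = parse_arp(raw)
        if pkt["sender_ip"] == wan_ip and pkt["sender_mac"] != wan_mac.lower():
            hits.append(pkt)
    return hits
```

An empty result from `foreign_senders(...)` matches the expected case where only pfSense itself is sending ARPs from the WAN IP.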

Steve

ColoRock

@stephenw10 Not seeing that in the system log. Captured all arp packets on the wan interface for a couple hours (with arp overridden at 600 seconds). Just saw the arp packets I’d expect at the interval I’d expect.
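
A quick way to double-check the "interval I'd expect" part is to feed the capture timestamps into something like the sketch below. It is only an illustration: the function name, the 600 second default (matching the override mentioned above), and the 60 second tolerance are assumptions chosen for the example.

```python
def irregular_gaps(timestamps, expected=600.0, tolerance=60.0):
    """Return (start_time, gap) pairs where consecutive ARP requests
    are not roughly `expected` seconds apart."""
    ordered = sorted(timestamps)
    bad = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        # Anything outside expected +/- tolerance is worth a closer look
        if abs(gap - expected) > tolerance:
            bad.append((prev, gap))
    return bad
```

Timestamps are plain seconds (for example, epoch times exported from the capture). An empty list means every gap fell inside the tolerance window.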

This article sounds a lot like what I’m experiencing.

“many ISPs perform insecure probing to either identify unused IP addresses or to manage blocks of static IP addresses for their customers”

“the ARP requests the ISP sends to the (router) to publish its own ARP cache are coming from an address outside the (router’s) WAN interface and gateway subnet”

The article then says more secure routers will “recognized this behavior as a potential security risk and drop these packets”

When I ran my packet capture, I had the fix in place to prevent the outage. I’ll comment that out and capture again to see if there is an “arp probe”? coming from the ISP shortly before the drop. If that is the case, the article goes on to explain how to allow this probing from the ISP (though on a different brand of router).
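
To spot that kind of traffic in a capture, a rough sketch like the one below could sort each ARP request by its sender IP. The function name and labels are made up for illustration, and the subnet is whatever your WAN interface is configured with. The all-zeros case is the standard ARP probe form from RFC 5227; the "out-of-subnet" case is the behavior the article describes.

```python
import ipaddress


def classify_arp_sender(sender_ip: str, wan_network: str) -> str:
    """Label an ARP request by where its sender IP sits relative to the WAN subnet."""
    addr = ipaddress.ip_address(sender_ip)
    # strict=False lets you pass the interface address itself, e.g. "203.0.113.10/24"
    net = ipaddress.ip_network(wan_network, strict=False)
    if addr == ipaddress.ip_address("0.0.0.0"):
        return "probe"  # RFC 5227 ARP probe: sender IP is all zeros
    if addr in net:
        return "in-subnet"
    return "out-of-subnet"
```

Any "probe" or "out-of-subnet" result arriving shortly before a drop would be the thing to show the ISP.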

stephenw10

Mmm, that would be interesting if that's what's happening. Should be easy enough to prove with a packet capture if it is.

Steve

ColoRock

I don’t see anything in packet capture that would indicate my issue is the same as described in the article. Still interesting how similar the issue sounds.

I think I just have an ISP that requires an ARP query at more frequent intervals than the pfSense default of 1200 seconds. Setting the interval to 600 seconds keeps the WAN super stable, and I don’t see anything weird in packet capture, so I’ll leave it at that for now.

If this was common (doesn’t appear to be) I’d expect this ARP interval setting to be in the pfSense GUI.

This local ISP has about 500 customers. Many are probably leasing a “preferred” router. But, probably a matter of time before others experience this.

Thanks @stephenw10 and @michmoor !

stephenw10

Well, go with that if it solves it, but it shouldn't be required. Even if pfSense was set with a static ARP so it never queried the gateway, that still shouldn't be a problem. The gateway should query pfSense as soon as its ARP table entry expires.

Steve

ColoRock

@stephenw10 Yeah, I never saw an ARP query initiated by the ISP over several hours of capturing all ARP traffic on the WAN port.