Netgate Discussion Forum

    100% packet loss every 15-20 minutes

    General pfSense Questions
    16 Posts 4 Posters 2.2k Views
    • C
      ColoRock

      While configuring my first pfSense router, I noticed the WAN would drop entirely every 15-20 minutes, then come back up after 120-30 seconds. Then the cycle starts over again.

      Eventually I noticed a pattern with the gateway ARP record. The internet only drops out when the gateway ARP record has between 200 and 30 seconds remaining (the exact moment seems random within that window). Once it drops, it stays completely out until the gateway ARP record expires, and it comes back immediately when the new gateway ARP record is created.
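For anyone wanting to watch the same countdown: on FreeBSD-based systems such as pfSense, `arp -an` prints the time left on each dynamic entry. Below is a minimal sketch that pulls out the remaining seconds for one address. The function name `arp_seconds_left`, the placeholder gateway address 203.0.113.1, the interface name `igb0`, and the exact `expires in N seconds` output layout are assumptions for illustration, not details from this thread.

```shell
# Print the seconds remaining on the ARP entry for the IP given as $1.
# Reads `arp -an` output on stdin; assumes FreeBSD-style lines such as:
#   ? (203.0.113.1) at 00:aa:bb:cc:dd:ee on igb0 expires in 187 seconds [ethernet]
arp_seconds_left() {
    awk -v ip="($1)" '
        $2 == ip {
            # Find the "expires in <N>" triple and print N.
            for (i = 1; i < NF; i++)
                if ($i == "expires" && $(i + 1) == "in")
                    print $(i + 2)
        }'
}
```

Running `arp -an | arp_seconds_left 203.0.113.1` every few seconds next to a continuous ping should show whether the drop really lines up with the last few minutes of the entry's lifetime.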

      I reflashed the router with the minimum customizations to make a connection and it still occurred.

      I added a cronjob that deletes the gateway ARP record every 10 minutes, and the connection has been stable for weeks.

      */10 * * * * arp -d ##.###.#.###
      (#’s are my gateway IP)

      My ISP says it’s the fault of pfSense. I don’t completely agree, since I can’t find anyone else having this issue. Small, local ISP. Gig fiber. No modem I can log in to. Just an ONT box that I plug my router into directly. Connection is DHCP.

      While the cronjob is fine, I prefer to get to the root of a problem. Any ideas why this happens?

      • stephenw10S
        stephenw10 Netgate Administrator

        Yeah, sounds like the ISP gateway is, for some reason, losing the ARP record until pfSense sends an ARP query for it. There was another similar thread recently, but much more extreme: that user has to send ARP queries every 10s or so to keep the gateway happy!

        Is your ISP giving you any reasons why it's pfSense's fault?

        Instead of the cron job you can try reducing the ARP timeout to 10mins:

        sysctl net.link.ether.inet.max_age=600
        

        If that works you can add it as a system tunable.
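A minimal sketch of checking the value before and after the change, for anyone following along. The variable `ARP_MAX_AGE` and the two function names are made up for illustration; the OID `net.link.ether.inet.max_age` is the one quoted above. A value set with `sysctl` from the shell lasts only until the next reboot, which is why it needs to go under System > Advanced > System Tunables to persist.

```shell
# Target ARP cache lifetime in seconds (10 minutes, as suggested above).
ARP_MAX_AGE=600

# Print the lifetime currently in effect (1200 is the default mentioned in this thread).
show_arp_max_age() {
    sysctl -n net.link.ether.inet.max_age
}

# Apply the new lifetime to the running system (needs root, not persistent).
set_arp_max_age() {
    sysctl net.link.ether.inet.max_age="$ARP_MAX_AGE"
}
```

Call `show_arp_max_age`, then `set_arp_max_age`, then `show_arp_max_age` again from a root shell to confirm the change took effect.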

        Steve

        • C
          ColoRock

          My understanding of ARP is fairly limited, but it seems like an ISP shouldn’t disconnect a client for not sending ARP requests until a long time (several hours or more?) has passed. What is the advantage to the ISP of keeping it shorter?

          The ISP said they use “default arp timeouts, which is recommended in most use cases” but don’t say what those timeouts are. They also said “it sounds like the device is losing knowledge of the gateway and then stalls until it sends an arp request”. But, to me it seems like it’s the other way around (like they “forgot” about my device).

          They say they haven’t seen this issue before so it must be my issue. As stated before, small local ISP, so their sample size isn’t that big…

          Your suggested fix was my first thought, but I couldn't find the setting. I’ll try it out. Is the 1200 second default higher than most routers?

          I don’t mind these workarounds. Just trying to save my neighbors the headache if they ever use a similar router with this ISP and to post solutions here for anyone else that might need them. Thanks!

          • F
            flat4 @ColoRock

            @colorock Who's the ISP?

            Curious to know as I have had some issues with my fiber.

            • C
              ColoRock @flat4

              @flat4 What state are you in?

              • stephenw10S
                stephenw10 Netgate Administrator

                Yeah, you shouldn't have to send ARP requests like that. It shouldn't matter if their ARP entry expires; they should just send an ARP query for your WAN IP, and pfSense responds.
                The most likely scenario is something else is sending ARPs with your IP at a slightly shorter interval than pfSense.

                • M
                  michmoor LAYER 8 Rebel Alliance @stephenw10

                  @stephenw10 Thinking the same. Run a packet capture on your WAN and look for ARPs. Start the investigation from there to see who else is sending ARPs with your IP address in the sender IP field.

                  Firewall: NetGate,Palo Alto-VM,Juniper SRX
                  Routing: Juniper, Arista, Cisco
                  Switching: Juniper, Arista, Cisco
                  Wireless: Unifi, Aruba IAP
                  JNCIP,CCNP Enterprise

                  • C
                    ColoRock @michmoor

                    @michmoor @stephenw10

                    So, something else on my network might also be sending ARP requests to the gateway? I tested overnight previously with my LAN physically disconnected from the router, and pfSense still logged the outages. Could it be the ONT device? Not sure how I could monitor what the ONT is up to since it’s the ISP’s device. Thanks!

                    • F
                      flat4 @ColoRock

                      @colorock
                      Oklahoma

                      • C
                        ColoRock @flat4

                        @flat4 I’m in Colorado. My ISP only serves the town I live in.

                        • stephenw10S
                          stephenw10 Netgate Administrator

                          If your ISP is any good, you would hope they are not sending broadcast traffic from customers to all other customers. But they might, and that would prove the issue, which would be great to take to them so they can investigate.
                          Run a packet capture on WAN in promiscuous mode and filter by both protocol (ARP) and your WAN IP. I would only expect to see your own ARP queries to the gateway at ~20min intervals, or 10min if you set that sysctl. If you see something else sending queries from the same IP, that's an issue.
                          However, I doubt you will, because clients should be isolated and pfSense will log errors if it sees something else using its IP address. I assume you have not seen errors like that in the system log.
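The capture described above can also be run from a shell. What follows is a rough sketch, not a tested procedure for this setup. The interface name `igb0`, the addresses 203.0.113.10 (WAN IP) and 203.0.113.1 (gateway), the MAC addresses, and both function names are placeholders. `tcpdump` captures promiscuously unless `-p` is given, `-e` prints the Ethernet source MAC, and `-n` disables name resolution. The filter function assumes the usual `tcpdump -e -n` line layout (timestamp, source MAC, `>`, destination MAC, then `... Request who-has X tell Y, ...`) and only looks at ARP requests.

```shell
# Capture ARP traffic that mentions our WAN IP. $1 = interface, $2 = WAN IP.
capture_wan_arp() {
    tcpdump -n -e -i "$1" arp and host "$2"
}

# From `tcpdump -e -n` lines on stdin, print ARP requests that claim our IP
# as sender ("tell <ip>") but were sent from a MAC other than ours.
# $1 = our WAN IP, $2 = our WAN MAC.
find_foreign_arp() {
    awk -v ip="$1" -v mac="$2" '
        {
            for (i = 1; i < NF; i++) {
                sender = $(i + 1)
                sub(/,$/, "", sender)   # tcpdump prints "tell 203.0.113.10,"
                if ($i == "tell" && sender == ip && tolower($2) != tolower(mac)) {
                    print
                    break
                }
            }
        }'
}
```

Something like `capture_wan_arp igb0 203.0.113.10 | find_foreign_arp 203.0.113.10 00:11:22:33:44:55` should stay silent on a healthy link; any line it prints is a request using your IP from someone else's hardware address.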

                          Steve

                          • C
                            ColoRock @stephenw10

                            @stephenw10 Not seeing that in the system log. Captured all ARP packets on the WAN interface for a couple of hours (with the ARP timeout overridden to 600 seconds). Just saw the ARP packets I’d expect at the interval I’d expect.

                            This article sounds a lot like what I’m experiencing.

                            “many ISPs perform insecure probing to either identify unused IP addresses or to manage blocks of static IP addresses for their customers”

                            “the ARP requests the ISP sends to the (router) to publish its own ARP cache are coming from an address outside the (router’s) WAN interface and gateway subnet”

                            The article then says more secure routers will “recognized this behavior as a potential security risk and drop these packets”

                            When I ran my packet capture, I had the fix in place to prevent the outage. I’ll comment that out and capture again to see if there is an “arp probe”? coming from the ISP shortly before the drop. If that is the case, the article goes on to explain how to allow this probing from the ISP (though on a different brand of router).

                            • stephenw10S
                              stephenw10 Netgate Administrator

                              Mmm, that would be interesting if that's what's happening. Should be easy enough to prove with a packet capture if it is.

                              Steve

                              • C
                                ColoRock

                                I don’t see anything in packet capture that would indicate my issue is the same as described in the article. Still interesting how similar the issue sounds.

                                I think I just have an ISP that requires an ARP query at more frequent intervals than the pfSense default of 1200 seconds. Setting the interval to 600 seconds keeps the WAN super stable, and I don’t see anything weird in packet capture, so I’ll leave it at that for now.

                                If this was common (doesn’t appear to be) I’d expect this ARP interval setting to be in the pfSense GUI.

                                This local ISP has about 500 customers. Many are probably leasing a “preferred” router. But, probably a matter of time before others experience this.

                                Thanks @stephenw10 and @michmoor !

                                • stephenw10S
                                  stephenw10 Netgate Administrator

                                  Well, go with that if it solves it, but it shouldn't be required. Even if pfSense was set with a static ARP so it never queried the gateway, that still shouldn't be a problem. The gateway should query pfSense as soon as its ARP table entry expires.

                                  Steve
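One way to check whether the gateway ever does that is to count ARP requests asking for the WAN IP that did not come from pfSense itself. A minimal sketch, under the same assumptions as any `tcpdump -e -n` text parsing: the function name `count_inbound_requests`, the addresses 203.0.113.10 and 203.0.113.1, and the MAC addresses are placeholders, and the line layout is assumed to be timestamp, source MAC, `>`, destination MAC, then `... Request who-has X tell Y, ...`.

```shell
# Count ARP requests on stdin that ask "who-has <our WAN IP>" and were sent
# from a MAC other than ours. $1 = our WAN IP, $2 = our WAN MAC.
count_inbound_requests() {
    awk -v ip="$1" -v mac="$2" '
        tolower($2) != tolower(mac) {
            for (i = 1; i < NF; i++)
                if ($i == "who-has" && $(i + 1) == ip) { n++; break }
        }
        END { print n + 0 }'
}
```

Feeding it a few hours of saved capture output (for example from `tcpdump -n -e -r capture.pcap arp`, where `capture.pcap` is a hypothetical file name) and getting `0` would suggest the gateway never asks for the WAN address on its own.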

                                  • C
                                    ColoRock @stephenw10

                                    @stephenw10 Yeah, I never saw an ARP query initiated by the ISP over several hours of capturing all ARP traffic on the WAN port.

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.