Netgate Discussion Forum

    runaway delay average and std. dev. on WAN

    General pfSense Questions
    29 Posts 2 Posters 1.3k Views
    • stephenw10 Netgate Administrator

      Well in part of that graph you are seeing ping latency >100ms. So if you ping, for example, 8.8.8.8 you will very clearly see that. If you only see it against the monitoring target that implies something other than just a delay in the route may be happening.
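A minimal sketch of the comparison described above, assuming you have already collected RTT samples (in ms) for the monitoring target and for another host such as 8.8.8.8. The function names and the 100 ms default threshold are illustrative only, and this is not a description of how the pfSense gateway monitor computes its own average and std. dev.

```python
import statistics


def summarize_rtts(rtts_ms):
    """Return (mean, population std. dev.) for a list of RTT samples in ms."""
    if not rtts_ms:
        raise ValueError("need at least one RTT sample")
    return statistics.mean(rtts_ms), statistics.pstdev(rtts_ms)


def only_monitor_is_slow(monitor_rtts, other_rtts, threshold_ms=100.0):
    """True if the monitor target averages above the threshold but the other host does not.

    That pattern would suggest something other than a plain delay in the route.
    """
    monitor_avg, _ = summarize_rtts(monitor_rtts)
    other_avg, _ = summarize_rtts(other_rtts)
    return monitor_avg > threshold_ms and other_avg <= threshold_ms
```

If both hosts show the same elevated average, the delay is more likely on the shared path than specific to the monitoring target.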

      • papaMURKS @stephenw10

        @stephenw10 yes there is definitely a delay from reaching external hosts. noticeable by pinging an ip directly, as well as by pinging a domain. (i use Unifi Wifiman which has a neat little UI for monitoring pings in real time to facebook, google, x, and i added 8.8.8.8 and 1.1.1.1 as well. those pings to my local gateway are normal, ~4ms)

        • stephenw10 Netgate Administrator

          Ok cool. Then I guess wait for it to grow to something clearly visible then try to reset it without rebooting the AT&T router. If nothing else resets it the issue pretty much has to be there.

          • papaMURKS @stephenw10

            @stephenw10 so, unexpected behavior...
            69c0d1b2-106e-482e-92c4-5b3d8fe78c47-image.png
            above graph is last 2 days...

            around 5pm on 8/11 my rtt ping statistics improved drastically (i.e., to expected levels and consistent with times immediately following a RG reboot) with NO (known) INTERVENTION BY ME

            in reviewing my logs, i see a HUGE amount of arpresolve logs in the times leading up to and following the good RTT pings:

            7d4b96cd-2b54-4ed6-b05d-05269c8d27f7-image.png

            For context, the log view in the above image only displays 500 lines, and the first line is from Aug 11 @ 16:45, so this created ~480 entries between 16:45 and 16:52...

            192.168.1.254 is the LAN address of the RG. I don't recall seeing this message before, and certainly not to this extent.

            • stephenw10 Netgate Administrator

              Hmm, seems like it might have rebooted? "Unable to allocate local link info" like that generally means pfSense doesn't have an IP in that subnet. So, for example, it lost its DHCP lease or the WAN went down.

              Though I'd expect to see some monitoring ping failures if that was the case.

              • papaMURKS @stephenw10

                @stephenw10

                Aug 11 16:47:17 	kernel 		arpresolve: can't allocate llinfo for 192.168.1.254 on mvneta2
                Aug 11 16:47:20 	php-fpm 	836 	/rc.newwanip: Removing static route for monitor [FIRST HOP] and adding a new route through [WAN GATEWAY]
                Aug 11 16:47:21 	php-fpm 	836 	/rc.newwanip: Gateway, NONE AVAILABLE
                Aug 11 16:47:22 	php-fpm 	836 	/rc.newwanip: Gateway, NONE AVAILABLE
                Aug 11 16:47:22 	php-fpm 	836 	/rc.newwanip: IP Address has changed, killing states on former IP Address 0.0.0.0. 
                

                also

                Aug 11 16:49:41 	php-fpm 	836 	/rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> [WAN IP] - Restarting packages. 
                

                does this look like the WAN went down and was recovered?

                EDIT: extra info, the RG renews the DHCP lease for the pfsense appliance every 24 hours
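A rough sketch for counting how often a message shows up in a time window, as was done by eye for the arpresolve entries. It assumes the log has been copied out of the GUI in the same column layout as the lines quoted above (month, day, time, process, then the rest); `count_in_window` is a hypothetical helper name, and it compares time of day only, ignoring the date.

```python
import re
from datetime import time

# Assumed layout, modelled on the lines quoted in this thread:
# "<Mon> <day> <HH:MM:SS> <process> <rest of line>"
LINE_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+(\d{2}):(\d{2}):(\d{2})\s+(\S+)\s+(.*)$"
)


def count_in_window(lines, needle, start, end):
    """Count lines containing `needle` whose time of day is within [start, end]."""
    count = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a log entry
        hour, minute, second, _process, rest = match.groups()
        stamp = time(int(hour), int(minute), int(second))
        if start <= stamp <= end and needle in rest:
            count += 1
    return count
```

Calling `count_in_window(lines, "arpresolve", time(16, 45), time(16, 52))` on the full export would give the number of arpresolve entries in that seven-minute window.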

                • stephenw10 Netgate Administrator

                  Does the DHCP log show anything for the dhclient at that time?

                  It should renew without any interruption, but clearly it lost its IP entirely at one point.

                  • papaMURKS @stephenw10

                    @stephenw10

                    well, my DHCP log is flooded with hundreds of dhcpd entries, so the log only goes back to the last hour. Most are DHCPREQUEST and DHCPACK entries for LAN devices and their MAC addresses, plus dhcp lease renew and ipv6 advertise address entries.

                    there are no entries for dhclient

                    • stephenw10 Netgate Administrator

                      You can filter that for the dhclient process:
                      Screenshot from 2024-08-14 15-48-12.png
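If the log has been exported to a text file, the same filtering can be sketched outside the GUI. This assumes the export keeps the GUI-style columns seen elsewhere in this thread (timestamp, process name, then PID and message); raw syslog lines may be formatted differently. `filter_by_process` is a hypothetical helper, not a pfSense function.

```python
import re

# Assumed layout: "<Mon> <day> <HH:MM:SS> <process> <pid and message>"
ENTRY_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+(?P<process>\S+)\s+.*$"
)


def filter_by_process(lines, process):
    """Yield only the log lines whose process column equals `process`."""
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match and match.group("process") == process:
            yield line.rstrip("\n")
```

For example, `list(filter_by_process(open(path), "dhclient"))` would drop all the dhcpd noise, where `path` is wherever you saved the export.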

                      • papaMURKS @stephenw10

                        @stephenw10 thanks! attached are the dhclient logs (forum flagged the pasted logs as spam...)
                        dhcplogs.txt

                        • stephenw10 Netgate Administrator

                          Hmm, well the only thing there is that at that point the logs show it pulled a private IP:

                          Aug 11 16:45:02 	dhclient 	34520 	bound to 192.168.1.64 -- renewal in 15 seconds.
                          

                          That is usually a sign that the modem lost its upstream connection and started handing out IPs itself. So if that did happen here, that implies the line issues were reset by that upstream link reset/resync.
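A small sketch for spotting this in a longer dhclient log, assuming the entries contain a "bound to <address>" phrase like the line quoted above. The helper names are made up for illustration; the check itself uses Python's standard `ipaddress` module.

```python
import ipaddress
import re

# Matches the "bound to <IPv4 address>" part of a dhclient log entry.
BOUND_RE = re.compile(r"bound to (\d{1,3}(?:\.\d{1,3}){3})")


def bound_address(line):
    """Return the IPv4 address dhclient bound to, or None if the line has none."""
    match = BOUND_RE.search(line)
    if not match:
        return None
    return ipaddress.ip_address(match.group(1))


def got_private_lease(line):
    """True if the line shows dhclient binding to a private (e.g. 192.168.x.x) address."""
    addr = bound_address(line)
    return addr is not None and addr.is_private
```

Running `got_private_lease` over every dhclient entry would show when the WAN picked up a private address instead of the public one normally passed through.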

                          • papaMURKS @stephenw10

                            @stephenw10 can you please clarify, do you mean it's a sign that the RG lost its upstream connection?

                            and by line issues, do you mean att -> my house ONT, ONT -> RG, or RG -> pfsense?

                            if the RG is handing out IPs itself, does that create a problem? (i believe it's possible for me to disable DHCP server in the RG if that could be the source of the issues...) it hasn't handed out any IPs except passing the WAN IP to pfsense.

                            d60538e2-48c5-4f47-b033-1d1ae1ac6fbf-image.png

                            • stephenw10 Netgate Administrator

                              I mean something upstream of the AT&T router/gateway. Those usually only hand out private IPs themselves when they can't connect to the upstream server.

                              • papaMURKS @stephenw10

                                @stephenw10 thanks. sorry, do you consider upstream to be in the direction of the ONT or in the direction of my pfsense firewall?

                                • stephenw10 Netgate Administrator

                                  Yes, sorry, in the direction of the ONT.

                                  • papaMURKS

                                    @stephenw10 below is my gateway monitoring and i'm definitely experiencing client-side performance issues as of today (random stuff like Amazon loading, Twitch loading, Youtube thumbnails delayed load, etc.)

                                    pings to google and cloudflare DNS (8.8.8.8 and 1.1.1.1) as well as facebook, google, twitter all idle around 20+ ms (higher than normal) but often spike to 60+ ms.

                                    d20b4a2b-584f-4a44-bfb8-329c87338d0d-image.png

                                    Going to pull the ethernet from 3100 -> RG, monitor, and update the thread.

                                    • papaMURKS @papaMURKS

                                      @stephenw10

                                      pulling ethernet from:

                                      • 3100 -> RG: no effect
                                      • RG -> ONT: no effect

                                      pulling power from ONT: no effect
                                      restart RG via the web UI: appears to reset the issue

                                      also definitely had gateway monitoring alarms leading up to this morning.

                                      • stephenw10 Netgate Administrator

                                        Hmm, well that definitely seems like an issue in the RG then. 😕

                                        • papaMURKS @stephenw10

                                          @stephenw10 seems that way, however i replaced the RG earlier this year and that did not solve the issue.

                                          • stephenw10 Netgate Administrator

                                            Software issue in the RG firmware then maybe.

                                            Just to confirm you said rebooting the 3100 made no difference? Only rebooting the RG fixes the issue?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.