Netgate Discussion Forum

    WAN Gateway Latency

    General pfSense Questions
    wan gateway latency spike
    • I
      iterator23 @vdsadmin

      @vdsadmin

      Let me know how it goes with changing the equipment. I am also thinking about removing pfSense and putting in a UDM Pro to see if there is a difference.

      • stephenw10S
        stephenw10 Netgate Administrator

        That first log shows the latency triggering an alarm but then the alarm is cleared. That doesn't look like it would require a reboot.

        The second log shows the link completely down. Like it lost the dhcp lease on the WAN entirely.

        Those are very different. I assume they were from different sites?

        When it's down what do you see on the WAN?

        • V
          vdsadmin

          I appreciate the quick response.

          Same router. Different days. These are just samples of what appears in the Gateway logs at all locations. These Gateway log entries appear around the same time that the WAN link goes down permanently. The different monitor IPs are just me trying random things in an attempt to determine cause.

          Your last question is one of the more frustrating aspects of all this. I am offsite. The WAN goes down. Since I am offsite and the WAN is down there is no way for me to get in there and see what is going on. People need to work so they reboot the router. Since it is so random I can't sit at a site for days and wait for it to occur. I have considered setting up a WWAN link at a site to access the router during an outage but since these sites are simple networks I determined it would be a shorter path to victory if I just ripped it all out and replaced it all.

          If you have a better plan, I am all ears.

          • V
            vdsadmin

            Does any of this seem relevant?

            https://forum.netgate.com/topic/135647/help-netgate-router-is-receving-frequent-gateway-alarms-resetting-causing-lost-connections/22

            https://forum.netgate.com/topic/111733/interesting-case-of-wan-dropping-daily-dhcp-being-blocked-by-firewall/7

            Changing the port speed did not solve it for me but if this triggers any memories or helps with the discussion please let me know.

            • stephenw10S
              stephenw10 Netgate Administrator

              Can you upload the complete system log covering an event?

              https://nc.netgate.com/nextcloud/s/pPQWfi63ZeY5woM

              • V
                vdsadmin

                Sorry. Forgot to mention I am currently working through another theory. I have pieced together some random bits of information that may or may not be connected.

                1. All routers are running 24.11 and I changed all the routers over to Kea DHCP from ISC DHCP months ago.
                2. I have had a few locations running Netgate routers where Kea DHCP just stops for no reason I can tell.
                3. Around the same time the Spectrum WAN links go down there is an entry in the DHCP logs that shows a DHCP request to the Spectrum DHCP server.
                4. Some of the users have mentioned the strange behavior of some workstations going offline while others are still active. This could be explained by the DHCP server going offline.
                5. I did a ping test to that Spectrum DHCP server at the affected site and the result was tremendous latency.

                64 bytes from 142.254.150.237: icmp_seq=0 ttl=251 time=12.257 ms
                64 bytes from 142.254.150.237: icmp_seq=1 ttl=251 time=13.415 ms
                64 bytes from 142.254.150.237: icmp_seq=2 ttl=251 time=16.755 ms
                64 bytes from 142.254.150.237: icmp_seq=3 ttl=251 time=13.201 ms
                64 bytes from 142.254.150.237: icmp_seq=4 ttl=251 time=10329.483 ms
                64 bytes from 142.254.150.237: icmp_seq=5 ttl=251 time=329.781 ms
                64 bytes from 142.254.150.237: icmp_seq=6 ttl=251 time=260.305 ms
                64 bytes from 142.254.150.237: icmp_seq=7 ttl=251 time=5631.755 ms
                64 bytes from 142.254.150.237: icmp_seq=8 ttl=251 time=55.087 ms
                64 bytes from 142.254.150.237: icmp_seq=9 ttl=251 time=84.499 ms

                6. I did a ping test to the same Spectrum DHCP server from a Spectrum link where I am not experiencing these disconnect issues and I got expected latency.

                PING 142.254.150.237 (142.254.150.237): 56 data bytes
                64 bytes from 142.254.150.237: icmp_seq=0 ttl=255 time=8.200 ms
                64 bytes from 142.254.150.237: icmp_seq=1 ttl=255 time=9.421 ms
                64 bytes from 142.254.150.237: icmp_seq=2 ttl=255 time=8.494 ms
                64 bytes from 142.254.150.237: icmp_seq=3 ttl=255 time=8.081 ms
                64 bytes from 142.254.150.237: icmp_seq=4 ttl=255 time=8.991 ms
                64 bytes from 142.254.150.237: icmp_seq=5 ttl=255 time=8.176 ms
                64 bytes from 142.254.150.237: icmp_seq=6 ttl=255 time=7.587 ms
                64 bytes from 142.254.150.237: icmp_seq=7 ttl=255 time=9.137 ms
                64 bytes from 142.254.150.237: icmp_seq=8 ttl=255 time=8.731 ms
                64 bytes from 142.254.150.237: icmp_seq=9 ttl=255 time=8.165 ms
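The comparison between the two ping runs above can also be made mechanically. The following is a minimal Python sketch, not anything pfSense provides, that pulls round-trip times out of ping reply lines in the format pasted above and flags any reply above a cutoff. The names `parse_rtts` and `find_spikes` and the 500 ms default cutoff are assumptions made for illustration only.

```python
import re

# Matches the sequence number and RTT of a ping reply line such as
# "64 bytes from 142.254.150.237: icmp_seq=4 ttl=251 time=10329.483 ms".
RTT_PATTERN = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=([\d.]+)\s*ms")

# Replies from the affected site, copied from the post above.
AFFECTED_SITE = """
64 bytes from 142.254.150.237: icmp_seq=0 ttl=251 time=12.257 ms
64 bytes from 142.254.150.237: icmp_seq=1 ttl=251 time=13.415 ms
64 bytes from 142.254.150.237: icmp_seq=2 ttl=251 time=16.755 ms
64 bytes from 142.254.150.237: icmp_seq=3 ttl=251 time=13.201 ms
64 bytes from 142.254.150.237: icmp_seq=4 ttl=251 time=10329.483 ms
64 bytes from 142.254.150.237: icmp_seq=5 ttl=251 time=329.781 ms
64 bytes from 142.254.150.237: icmp_seq=6 ttl=251 time=260.305 ms
64 bytes from 142.254.150.237: icmp_seq=7 ttl=251 time=5631.755 ms
64 bytes from 142.254.150.237: icmp_seq=8 ttl=251 time=55.087 ms
64 bytes from 142.254.150.237: icmp_seq=9 ttl=251 time=84.499 ms
"""


def parse_rtts(ping_output):
    """Return a list of (icmp_seq, rtt_ms) tuples found in ping output."""
    return [(int(seq), float(rtt)) for seq, rtt in RTT_PATTERN.findall(ping_output)]


def find_spikes(samples, threshold_ms=500.0):
    """Return the samples whose RTT is above threshold_ms (illustrative cutoff)."""
    return [(seq, rtt) for seq, rtt in samples if rtt > threshold_ms]


if __name__ == "__main__":
    samples = parse_rtts(AFFECTED_SITE)
    for seq, rtt in find_spikes(samples):
        print(f"icmp_seq={seq} took {rtt:.1f} ms")
```

Run against the healthy link's output instead, the same functions report no spikes, since every reply there is under 10 ms.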

                This led me to believe that Kea is issuing a DHCP request on the WAN link, but the Spectrum DHCP server is not responding fast enough, so Kea is either marking the link as down or simply crashing and taking the WAN link down along the way.

                In the interest of the tried and true method of making system changes on slimly supported random speculation, I reverted 2 of the locations back to ISC DHCP. They have been stable for 2 days now. I wish that meant a lot, but it is too soon to tell. I am going to give it a week, and if it remains stable and there are no more disconnects I am going to blame Kea DHCP whether it is to blame or not.

                Best of luck to us all.

                • I
                  iterator23 @vdsadmin

                  @vdsadmin

                  Which UniFi switches and APs do you have at the sites?

                  I am running ISC; I never moved to Kea.

                  Not sure if this is related at all, but in your UniFi console under ports are you seeing TX drops? I am asking because I am running Gen 1 switches and see some TX drops across several of them, but not on the newer switches.

                  • V
                    vdsadmin @iterator23

                    @iterator23

                    At the most affected site I am running a single USW 16 PoE switch with a single Nano HD WAP.

                    On the average day I am not experiencing any drops on the switch. The Netgate 3100 is not reporting any errors on the interfaces.

                    • stephenw10S
                      stephenw10 Netgate Administrator

                      Kea is only the server side for internal clients. It has nothing to do with the dhclient requests on WAN.

                      • V
                        vdsadmin @stephenw10

                        @stephenw10

                        Understood. That is how I understand it as well. However, there are still entries in the DHCP log that reference the WAN DHCP server around the time of the WAN outage.

                        This is a current entry in the DHCP log from a Netgate 4100 and ix3 is the WAN Interface. This particular system is still running Kea and is on a stable Spectrum connection and is not experiencing any outages.

                        Mar 21 09:39:22 dhclient 44939 bound to xxx.xxx.xxx.xxx -- renewal in 43200 seconds.
                        Mar 21 09:39:22 dhclient 75840 Creating resolv.conf
                        Mar 21 09:39:22 dhclient 74947 RENEW
                        Mar 21 09:39:22 dhclient 44939 DHCPACK from 142.254.150.237
                        Mar 21 09:39:22 dhclient 44939 DHCPREQUEST on ix3 to 142.254.150.237 port 67

                        I have no idea if Kea is the culprit. I am just raising it up in case there is an outside chance this could be the issue.

                        • V
                          vdsadmin @stephenw10

                          @stephenw10

                          I will do that as soon as I am able.

                          • stephenw10S
                            stephenw10 Netgate Administrator

                            It could be Kea via some affected process but not directly.

                            If dhclient shows a failure to pull a new lease at release time then that's certainly a problem.
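One way to check for this once a log covering an outage is available is to scan the dhclient entries for a DHCPREQUEST that never gets a matching DHCPACK. The Python sketch below is only an illustration under stated assumptions: it expects lines in the column layout of the DHCP log excerpt posted earlier in the thread (timestamp, `dhclient`, PID, message), it expects them oldest first, and the names `parse_dhclient` and `unanswered_requests` are hypothetical helpers, not part of pfSense or dhclient.

```python
import re

# Assumed layout, as pasted from the GUI log view earlier in the thread:
# "Mar 21 09:39:22 dhclient 44939 DHCPREQUEST on ix3 to 142.254.150.237 port 67"
LINE_PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+dhclient\s+(?P<pid>\d+)\s+(?P<msg>.*)$"
)
REQUEST_PATTERN = re.compile(r"^DHCPREQUEST on (\S+) to (\S+) port \d+")
ACK_PATTERN = re.compile(r"^DHCPACK from (\S+)")

# The excerpt from the thread, newest entry first as it was posted.
SAMPLE_LOG = """
Mar 21 09:39:22 dhclient 44939 bound to xxx.xxx.xxx.xxx -- renewal in 43200 seconds.
Mar 21 09:39:22 dhclient 75840 Creating resolv.conf
Mar 21 09:39:22 dhclient 74947 RENEW
Mar 21 09:39:22 dhclient 44939 DHCPACK from 142.254.150.237
Mar 21 09:39:22 dhclient 44939 DHCPREQUEST on ix3 to 142.254.150.237 port 67
"""


def parse_dhclient(lines):
    """Return (timestamp, pid, message) tuples for every dhclient line."""
    events = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match:
            events.append((match["stamp"], int(match["pid"]), match["msg"]))
    return events


def unanswered_requests(events):
    """Return (timestamp, interface, server) for each DHCPREQUEST with no
    DHCPACK from the same server before the next request. Expects oldest first."""
    pending = None
    unanswered = []
    for stamp, _pid, msg in events:
        request = REQUEST_PATTERN.match(msg)
        ack = ACK_PATTERN.match(msg)
        if request:
            if pending is not None:
                unanswered.append(pending)
            pending = (stamp, request.group(1), request.group(2))
        elif ack and pending is not None and ack.group(1) == pending[2]:
            pending = None
    if pending is not None:
        unanswered.append(pending)
    return unanswered


if __name__ == "__main__":
    # Reverse the pasted excerpt so the oldest entry comes first.
    events = parse_dhclient(reversed(SAMPLE_LOG.splitlines()))
    print(unanswered_requests(events))
```

On the healthy excerpt the request is acknowledged, so nothing is reported. A log from an affected site that showed requests piling up without acknowledgements would point at the WAN side rather than at Kea.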

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.