Netgate Discussion Forum

    Random Massive Lag Spikes

    General pfSense Questions
    9 Posts 4 Posters 247 Views
    • C
      cortadalallo
      last edited by cortadalallo

      Fair warning: I'm an absolute noob when it comes to networking and pfSense in general, so I'm not sure how to navigate pfSense or debug issues with any level of sophistication.

      I have a client on my network which does a lot of downloading and, when turned on, causes massive lag spikes for packets moving into my pfSense box. Typically pings to my pfSense gateway address take around 0.3 ms to return; however, at random times pings take up to 200 ms and sometimes even longer. For example, see this paste.
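A minimal sketch of how saved ping output could be scanned for spikes like the ones described above. It assumes the common `time=<value> ms` reply format; the `find_spikes` name, the 50 ms threshold, and the sample replies are all hypothetical and are not taken from the paste.

```python
import re

# Matches the round-trip time in typical ping reply lines, e.g. "time=0.312 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")

def find_spikes(ping_output, threshold_ms=50.0):
    """Return (line_number, rtt_ms) for every reply slower than threshold_ms."""
    spikes = []
    for lineno, line in enumerate(ping_output.splitlines(), start=1):
        match = RTT_RE.search(line)
        if match and float(match.group(1)) > threshold_ms:
            spikes.append((lineno, float(match.group(1))))
    return spikes

# Hypothetical sample replies, made up for illustration only.
sample = """\
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.312 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=203.4 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.298 ms
"""
print(find_spikes(sample))
```

Comparing the timestamps of the slow replies with the firewall's system log is one way to see whether the spikes line up with other events.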

      So far the best lead I have is that the System Activity screen starts showing less CPU idle time, interrupt load seems to skyrocket, and a program running debug against the ruleset starts showing up and taking up massive amounts of CPU time. The interrupts seem to be the culprits here, but I'm not sure what's causing them or how to find that out. In addition, I couldn't figure out where or how the pfctl program was being executed, which is a bit suspicious. From what I understand it comes from dynamic rules being applied; however, I don't think I have any dynamic rules currently. Here's a pastebin I managed to capture with all of the aforementioned issues: see this paste for a top printout showing high random load.

      My specs are as follows:

      Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
      Current: 1992 MHz, Max: 1993 MHz
      
      Memory: 3924 MiB
      

      I've tried the following approaches to mitigate the issue to no avail:

      • Traffic shaping - I have a codel limiter working to keep traffic at a max of 85Mbit/s where my internet bandwidth is 100Mbit/s
      • GertjanG
        Gertjan @cortadalallo
        last edited by

        @cortadalallo

        The ping paste: how is the device you are pinging from connected to pfSense? Wifi? Wired?
        If it's wired, is this a classic 1 Gbit/sec connection?

        The top paste: your pfSense is basically doing 'nothing' with its 4 cores.
        That said, on my own 4100 I have 105 processes, and you have twice as many. That's ... a bit strange. Did you install and activate all pfSense packages?

        Btw, about pfctl: once in a while the firewall rules get reloaded. That's normal.

        What is the brand (type) of the NICs used?

        @cortadalallo said in Random Massive Lag Spikes:

        I have a codel limiter working to keep traffic at a max of 85Mbit/s where my internet bandwidth is 100Mbit/s

        If a LAN device is downloading or sending something 'big' and fills up the WAN connection, lower-priority packets might get dropped. ICMP (ping) is a lower-priority protocol.
        If your traffic shaping is set up using two queues, one reserved 'channel' for ICMP only and another for the rest of the traffic, the ping latency (buffer bloat) will be gone. That doesn't mean the system will be any faster, though.

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • K
          kprovost @cortadalallo
          last edited by

          @cortadalallo /sbin/pfctl -o basic -f /tmp/rules.debug does not 'run debug against the ruleset', it applies a new ruleset. Setting new rules does impact traffic, so that's probably the cause here.

          It is unusual for that to be happening regularly. It might mean that you have an interface that's flapping, or there might be something else triggering this. That's what you need to figure out.

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Yup that^.

            Check the system log to see what's triggering the ruleset reload.
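One rough way to do that check offline is sketched below. It assumes the system log has been copied out as tab-separated lines of the form timestamp, process, PID, message; the `reload_context` helper name and the short sample are illustrative assumptions, not something pfSense provides. The sketch groups entries by second and shows what else was logged whenever a "Reloading filter" entry appears.

```python
from collections import defaultdict

def reload_context(log_text):
    """Map each timestamp that has a 'Reloading filter' entry to the other
    (process, message) pairs logged during the same second."""
    by_time = defaultdict(list)
    for line in log_text.splitlines():
        # Split on tabs and drop empty columns (kernel lines have no PID).
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 3:
            continue
        timestamp, process, message = fields[0], fields[1], fields[-1]
        by_time[timestamp].append((process, message))

    context = {}
    for timestamp, entries in by_time.items():
        if any(msg == "Reloading filter" for _, msg in entries):
            context[timestamp] = [
                (proc, msg) for proc, msg in entries if msg != "Reloading filter"
            ]
    return context

# Illustrative sample in the assumed tab-separated layout.
sample = (
    "May 2 17:38:51\tcheck_reload_status\t429\tReloading filter\n"
    "May 2 17:38:51\tkernel\t\tigb0: link state changed to UP\n"
    "May 2 17:38:51\tcheck_reload_status\t429\tLinkup starting igb0\n"
    "May 2 17:38:47\tkernel\t\tigb0: link state changed to DOWN\n"
)
for timestamp, entries in reload_context(sample).items():
    print(timestamp, entries)
```

Entries that share a second with a reload are only candidates for the trigger; the log does not say which one caused it.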

            • C
              cortadalallo
              last edited by

              @kprovost said in Random Massive Lag Spikes:

              @cortadalallo /sbin/pfctl -o basic -f /tmp/rules.debug does not 'run debug against the ruleset', it applies a new ruleset. Setting new rules does impact traffic, so that's probably the cause here.

              It is unusual for that to be happening regularly. It might mean that you have an interface that's flapping, or there might be something else triggering this. That's what you need to figure out.

              This provided a good lead. I used the system logs to figure out that the interface is flapping, but I'm not sure why. I just replaced the cable on that interface, to no avail.

              From the logs it looks like check_reload_status might be a factor, but it looks like it's trying to bring the interface up, not down. That assumes the check_reload_status and kernel printouts are synchronous, though, when they're probably not. (The log below is newest entry first.)

              May 2 17:38:51	rc.gateway_alarm	12499	>>> Gateway alarm: WAN_DHCP6 (Addr:fe80::201:5cff:fe95:f846%igb0 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
              May 2 17:38:51	check_reload_status	429	Reloading filter
              May 2 17:38:51	check_reload_status	429	Restarting OpenVPN tunnels/interfaces
              May 2 17:38:51	check_reload_status	429	Restarting IPsec tunnels
              May 2 17:38:51	check_reload_status	429	updating dyndns WAN_DHCP
              May 2 17:38:51	rc.gateway_alarm	11618	>>> Gateway alarm: WAN_DHCP (Addr:68.112.120.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
              May 2 17:38:51	check_reload_status	429	Reloading filter
              May 2 17:38:51	kernel		                igb0: link state changed to UP
              May 2 17:38:51	check_reload_status	429	Linkup starting igb0
              May 2 17:38:48	php-fpm	                53068   /rc.linkup: DEVD Ethernet detached event for wan
              May 2 17:38:48	php-fpm	                53068   /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp, 6: dhcp6)
              May 2 17:38:47	kernel		                igb0: link state changed to DOWN
              May 2 17:38:47	check_reload_status	429	Linkup starting igb0
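A minimal sketch of how the kernel link-state lines in a log like this could be paired up to measure each flap. The `link_outages` name and the `year` parameter are assumptions for illustration (the log lines carry no year), and month names are parsed with `%b`, which expects an English locale.

```python
import re
from datetime import datetime

# Kernel link-state lines, e.g. "May 2 17:38:47  kernel  igb0: link state changed to DOWN".
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel\s+"
    r"(?P<iface>\w+): link state changed to (?P<state>UP|DOWN)"
)

def link_outages(log_text, year=2000):
    """Return (interface, down_time, seconds_down) for each DOWN/UP pair."""
    events = []
    for line in log_text.splitlines():
        m = LINK_RE.match(line.strip())
        if m:
            # The log has no year, so an assumed one is supplied for parsing.
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["iface"], m["state"]))

    # The log may be newest-first, so sort chronologically before pairing.
    events.sort(key=lambda e: e[0])

    down_since = {}
    outages = []
    for ts, iface, state in events:
        if state == "DOWN":
            down_since.setdefault(iface, ts)
        elif iface in down_since:
            start = down_since.pop(iface)
            outages.append((iface, start, (ts - start).total_seconds()))
    return outages

# The two kernel lines from the log above.
sample = (
    "May 2 17:38:51\tkernel\t\tigb0: link state changed to UP\n"
    "May 2 17:38:47\tkernel\t\tigb0: link state changed to DOWN\n"
)
for iface, start, seconds in link_outages(sample):
    print(f"{iface} lost link at {start:%H:%M:%S} for {seconds:.0f} s")
```

For the two kernel lines shown, this reports a single igb0 outage of about 4 seconds. Timestamps have one-second resolution, so short flaps are only approximate.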
              
              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                It can be hard to see what is cause and what is symptom there. But those log lines from the kernel show it's actually losing link.

                What is igb0 connected to? Can you try a different port?

                • C
                  cortadalallo @stephenw10
                  last edited by

                  @stephenw10

                  That interface is connected directly to my modem, so unfortunately I am unable to try a different port.

                  I could potentially try a different interface in pfSense or even do a ping/stability test directly against the modem with a different device.

                  • C
                    cortadalallo @cortadalallo
                    last edited by

                    Also, yeah, I just tried turning on the high-bandwidth-consuming client and it triggered 2 "flaps" within about 10 minutes. Is it possible that my ISP is just totally crapping itself whenever there's an uptick in traffic?

                    [Attached image: dd66545f-992b-4356-9a4d-152aade6b5ec-image.png]

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      It could just be the modem crapping out, yes.

                      Can you try a different port at the pfSense end?

                      Can you test putting a switch in between the pfSense WAN and the modem? That would prove which end is dropping the link.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.