pfSense 2.5.2 periodic HUGE lag spikes
-
I see the same on my Netgate 6100, which load-balances across 4 WLANs. The lag shows up on devices connected to the LAN interface. To those devices it looks like the internet connection is lagging, but in the pfSense monitor I don't see any movement on any of the 4 WLANs.
After the lag is over, all WLANs continue as if nothing happened.
I also found that the longer the 6100 is running, the more often these lags appear. If I reboot the 6100 I do not see any lag for a day or so.
-
@muenchris I've actually just rebuilt my pfSense box from factory defaults with Intel NICs and instantly got the same issue, but I may have found the fix: I had to apply traffic shaping rules.
I still need to test this over a few days, but it's potentially fixed.
Here's how I did it:
First, run a speedtest and note your best-case download/upload speeds. In my case it was 40Mbit/4Mbit.
Head to Firewall > Traffic Shaper > Limiters. Create a new limiter and name it something like "InternetDownload". Tick the "Enable" box and leave all the values as default except for the following:
- Bandwidth: The value you got from your speedtest
- Queue Management Algorithm: CoDel
- Scheduler: FQ_CODEL
- Queue length: 1000
- ECN: Enabled
After saving, click on the limiter you just created, scroll to the bottom and click "Add new Queue". Name it InternetDownloadQueue and fill out the following:
- Enabled: ticked
- Queue management algorithm: CoDel
- ECN: enabled
Save that and follow the same steps again, but this time for your upload (the floating rule further down refers to that queue as InternetUploadQueue). This time, knock about 10-20% off your measured upload speed when filling in the bandwidth.
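As a rough sketch of the arithmetic in the steps above: the download limiter takes the measured speed as-is, and the upload limiter takes the measured speed minus 10-20%. The function name `limiter_bandwidths` and the `up_headroom` parameter are made up for illustration, and "InternetUpload" is an assumed limiter name; nothing here talks to pfSense, it only computes the numbers you would type into the Bandwidth fields.

```python
def limiter_bandwidths(down_mbit, up_mbit, up_headroom=0.15):
    """Return (download, upload) limiter bandwidths in Mbit/s.

    Download is used as measured. Upload is reduced by `up_headroom`,
    a fraction between 0.10 and 0.20 as suggested in the post above.
    """
    if not 0.10 <= up_headroom <= 0.20:
        raise ValueError("up_headroom should be between 0.10 and 0.20")
    return down_mbit, round(up_mbit * (1 - up_headroom), 2)


if __name__ == "__main__":
    # Speedtest figures from the post: 40 Mbit/s down, 4 Mbit/s up.
    down, up = limiter_bandwidths(40, 4)
    print(f"InternetDownload: {down} Mbit/s, InternetUpload: {up} Mbit/s")
```

With the 40/4 example and the default 15% headroom, the upload limiter would be set to 3.4 Mbit/s. On a link whose speed varies a lot, a value closer to 20% is probably the safer starting point.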
Head over to Firewall > Rules > Floating and add a new floating rule with the following info:
- Action: Pass
- Quick: Enabled
- Direction: Out
- Address Family: IPv4 (usually)
- Protocol: Any
Advanced Options
- Gateway: Select your WAN gateway
- In/Out pipe: InternetUploadQueue / InternetDownloadQueue (Yes, it looks like it's in reverse order, it's not).
Hit save and reload your configuration. Test a game/ping while running a speed test and the latency shouldn't get knocked.
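One way to put a number on that test, as a sketch only: save the output of `ping` once while the line is idle and once while a speed test is running, then compare the median round-trip times. The helper names below (`parse_rtts`, `latency_increase`) are hypothetical, and the regular expression assumes the usual `time=12.3 ms` format that most ping implementations print.

```python
import re
import statistics

# Matches the "time=12.3 ms" (or "time<1 ms") field in typical ping output.
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output):
    """Extract round-trip times in milliseconds from ping output text."""
    return [float(value) for value in RTT_PATTERN.findall(ping_output)]


def latency_increase(idle_output, loaded_output):
    """Median RTT under load minus median RTT when idle, in milliseconds."""
    idle = parse_rtts(idle_output)
    loaded = parse_rtts(loaded_output)
    if not idle or not loaded:
        raise ValueError("no RTT samples found in ping output")
    return statistics.median(loaded) - statistics.median(idle)
```

If the shaper is doing its job, the difference should stay small even while the upload is saturated. A jump of hundreds of milliseconds would suggest the limiter bandwidth is still set too high for the link.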
-
@abtekk Interesting. I thought the traffic shaper would limit connections from a client to one WLAN gateway (like a smart queue in Unifi).
With my 4 WLANs I can download Steam or Xbox updates across all WLANs. Will this still work after the shaper is active?
-
@muenchris Yes, in fact you won't be shaping your download speed at all, you just still need the rule to make the firewall happy. The only thing you'll be shaping is your upload speed (by up to 20%), which is just to stop the overload that seems to cause this. Let me know how you get on.
-
@abtekk Thanks, I will try and test for a couple of days then let you know
-
@abtekk Quick question: I did not select any of my interfaces in the firewall rule. Do I have to select all interfaces that I use in my load-balancer gateway group or keep them all unselected?
-
@muenchris said in pfSense 2.5.2 periodic HUGE lag spikes:
If I reboot the 6100 I do not see any lag for a day or so
A day seems fast for my suggestion but check to see if it's running out of memory. pcscd has a memory leak.
OTOH if the Internet pipe is full then that will back up everything and traffic shaping can help that a lot.
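For the memory check suggested above, a minimal sketch: capture the output of `ps -axo rss,comm` from the pfSense shell at intervals and compare how much resident memory pcscd is using. The helper names are hypothetical, and the code assumes the two-column layout (RSS, then command name) that this `ps` invocation produces on FreeBSD, with RSS reported in 1024-byte units.

```python
def rss_kib(ps_output, process="pcscd"):
    """Sum the resident set size (KiB) of `process` in `ps -axo rss,comm` output."""
    total = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and anything that is not "<number> <command>".
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip() == process:
            total += int(parts[0])
    return total


def rss_growth_kib(samples, process="pcscd"):
    """RSS difference (KiB) between the last and the first captured sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to measure growth")
    return rss_kib(samples[-1], process) - rss_kib(samples[0], process)
```

A figure that keeps climbing between samples taken hours apart would point towards the leak. A flat figure suggests looking at the full-pipe explanation instead.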
-
@steveits My network setup is quite complicated. Behind the pfSense/6100 there is a Unifi UDM Pro that manages my main network that has 5 sub-networks (IoT, Cameras, Media, Gaming and Business). All firewall rules are managed by the UDM Pro - the pfSense is "only" load balancing. This results in my triple-NAT setup.
Also, my WLANs are all LTE routers with inconsistent internet speeds (they vary a lot by time of day and day of week).
The (isolated) IoT network in particular creates a lot of "mini-connections" to the devices' respective clouds.
If there is a memory leak in one of the pfSense services, it might be triggered by the IoT network (over 150 devices). Is there something (like a specific service) I can "flush/restart" periodically? Otherwise I would have to restart the complete Netgate every night.
-
According to this thread:
https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4
Codel has a bug. Use "taildrop" instead. You'll get same result
-
@magikmark Thanks, I will check it out.
-
@muenchris said in pfSense 2.5.2 periodic HUGE lag spikes:
something (like a specific service) I can "flush/restart" periodically
The pcscd service, as mentioned :) If you aren't using IPsec you can just stop it, though it will start again when pfSense boots. Otherwise, if you follow into that bug report, there is a patch to disable it properly. Not saying this is your issue, but it's generally an issue on all installs eventually.
-
@magikmark said in pfSense 2.5.2 periodic HUGE lag spikes:
Codel has a bug. Use "taildrop" instead. You'll get same result
First I've heard of that. I don't see anything recent in that thread detailing it. A lot of people are running that.
Is there a bug report?
Steve
-
The thread is quite long. Here is the exact post. I think the problem is not in CoDel itself but in pfSense:
https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/770
You get this error:
"config_aqm Unable to configure flowset, flowset busy!"
-
@magikmark said in pfSense 2.5.2 periodic HUGE lag spikes:
https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/770
Ah, OK. That's not a bug, it's a feature.
I've never hit that, but it looks like you would only ever hit it when trying to reconfigure an existing pipe that is actively in use.
Steve