CoDel - How to use

tuffcalc

I just got fiber to my home (pfSense WAN plugged into an optical network terminal/modem) with symmetrical gigabit service.

With the default pfSense install where there was no traffic shaping, I was receiving an F for BufferBloat and my speeds were only in the 600's Mbps on the DSLReports speed tests. I found this thread and started to play with the codelq.

So far, I have found that just enabling codelq without specifying any other values works great. It got me from an F rating to A for BufferBloat. I also played with the Queue Limit (target) by specifying 5 (pfSense defaults to 50). The value of 5 resulted in more packet drops but no performance gain. I also noticed that I get more BufferBloat on my downloads so I specified a bandwidth of 980Mbps on my LAN interface. This seems to give me the best results. In my speed tests, I see spikes above 1000Mbps and so I think my ISP fiber connection is faster than my gigabit LAN and that is why I needed to limit my LAN bandwidth to 980Mbps to get better overall performance.

I am just running codelq on the WAN and LAN with no sub-queues.


Based on my (many) tests - I think this is working properly for you because your ports are physically limited to 1Gbps. If your internet upload speed is lower than your port speed, you need (in my experience) to limit your bandwidth speed in the traffic shaper to 95% of your upload speed for codel to properly work.

Thanks for posting this!

Nullity

So, 2.3's CODELQ (and presumably the sub-discipline "CoDel Active queue" check-box version) has a CoDel with proper "interval" & "target" values (rather than reversed and/or wrong):

```
[2.3-RELEASE][admin@pfsense.wlan]/root: pfctl -vsq | grep codel
altq on pppoe1 codel( target 5 interval 100 ) bandwidth 640Kb tbrsize 1492
```

Hmm… anyone got some stats to share?

Harvy66

What kind of stats? I did a DSLReport speedtest after 2.3 and the bufferbloat part looks the same. Still A+.

kpa

In 2.3 it seems you have to enter a bandwidth value to enable CODELQ. Just to verify that I've understood the information correctly: I have a DSL connection with a theoretical maximum of 24/2 Mbps down/up that in practice delivers something like 20/1.4 Mbps down/up. Based on what I read here, I should set the WAN bandwidth to about 95% of that 1.4 Mbps, is that right?

Harvy66

Correct. Good that it forces you to fill out your bandwidth, because it's useless if you just forward the packets at max rate.

vesikk

I'm trying to enter 5.3 Mbps as the bandwidth value but I get a popup box saying "please enter a valid value. the two nearest values are 5 and 6" please help.

bodosom

@vesikk:

I'm trying to enter 5.3 Mbps …

Change the multiplier from Mbit/s to Kbit/s.

bodosom

I have a nominal 30/5 link that runs at ~ 40/6.
I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades for "bufferbloat" on DSLReports and see high echo RTT when the link is loaded.

What settings result in a DSLreports A grade?

My previous experience was with Linux and fq_codel which worked perfectly.

kpa

Test your upstream speed a few times without any shaping and set the upstream bandwidth in the CODELQ settings to about 90-95% of the average value you got from the tests.
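As a rough sketch of that arithmetic (the function name and the sample speed-test numbers are made up for illustration, and the 95% default is just the rule of thumb from this thread): average the unshaped results, take 90-95%, and express it in whole Kbit/s, since the GUI field only accepts integers.

```python
from statistics import mean


def shaper_bandwidth_kbit(measured_mbit, fraction=0.95):
    """Return a shaper bandwidth in whole Kbit/s.

    measured_mbit: unshaped speed-test results in Mbit/s (hypothetical input).
    fraction: share of the measured average to allow (0.90-0.95 per this thread).
    """
    if not measured_mbit:
        raise ValueError("need at least one measurement")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Convert Mbit/s to Kbit/s and round to an integer for the GUI field.
    return round(mean(measured_mbit) * fraction * 1000)


# Example with invented upload measurements of 5.6, 5.8 and 6.0 Mbit/s.
print(shaper_bandwidth_kbit([5.6, 5.8, 6.0], fraction=0.90), "Kbit/s")
```

Entering the result with the Kbit/s multiplier avoids the "nearest values" popup that fractional Mbit/s values trigger.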

Nullity

@bodosom:

I have a nominal 30/5 link that runs at ~ 40/6.
I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades for "bufferbloat" on DSLReports and see high echo RTT when the link is loaded.

What settings result in a DSLreports A grade?

My previous experience was with Linux and fq_codel which worked perfectly.

As the other poster said, 90-95% is a safe choice. You can go even higher if your connection is stable.
You might try doing your own tests, since they will likely be more accurate. Simply upload a file to an ftp (or any other service) and check the Quality graph in the Monitoring section of the pfSense GUI to see what your latency was during upload saturation.

(FYI, fq_codel/codel only helps upload.)

Harvy66

@bodosom:

I have a nominal 30/5 link that runs at ~ 40/6.
I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades for "bufferbloat" on DSLReports and see high echo RTT when the link is loaded.

What settings result in a DSLreports A grade?

My previous experience was with Linux and fq_codel which worked perfectly.

Settings on the WAN link only affect upload, not download. If your download is causing bloat, then you will need to also shape your LAN interface. DSLReports does give separate graphs for up and down, but a single grade that represents both.

vesikk

I have tried using codelq and it works, but as soon as I start an upload to Google Drive I get about 50% packet loss and then the gateways appear offline according to pfSense, even though ping stays low.

Harvy66

CoDel does not work well below 1Mb/s. If you have less than this, you may be better off trying FairQ. The problem is the 1500mtu is too large for such a slow link. At 1Mb/s, a single 1500 byte packet will take about 12ms, which quickly triggers codel's drop mode.
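The 12ms figure can be checked with a quick calculation. This is only a sketch: the function name is made up, and the 5ms target is taken from the `pfctl` output earlier in the thread.

```python
CODEL_TARGET_MS = 5  # "target 5" from the pfctl output earlier in the thread


def serialization_delay_ms(packet_bytes, link_bits_per_s):
    """Time in milliseconds to put one packet on the wire at the given rate."""
    return packet_bytes * 8 * 1000 / link_bits_per_s


delay = serialization_delay_ms(1500, 1_000_000)  # 1500-byte packet at 1Mb/s
print(delay)                     # 12.0
print(delay > CODEL_TARGET_MS)   # True: one packet alone exceeds the target
```

By the same formula, a 1500 byte packet only fits inside the 5ms target once the link is faster than about 2.4Mb/s.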

bodosom

@bodosom:

I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades for "bufferbloat" on DSLReports and see high echo RTT when the link is loaded.

What settings result in a DSLreports A grade?

It seems I was unclear. Ideally I'd like to get back to getting an A or B on the DSLR test which I suspect means having an average latency of 100-200 ms. in both directions. I can get a B by carefully tweaking the test parameters after setting the WAN speed to ~90% of my uplink speed. I guess I was hoping for something both simpler and better.

Derelict

OK, look.

pfSense does not have a problem with buffer bloat.

CoDel locally does practically nothing.

People are likely seeing buffer bloat improvement by simply setting the bandwidth lower than their expected upload so buffer bloat does not occur in the ISP gear.

CoDel is for transit, not for the edge.

If you were to start increasing queue length in local shapers AND enabling CoDel you might derive some benefit. But the only reason you would derive benefit from CoDel is because you increased the queue length so just don't.

Nullity

@bodosom:

@bodosom:

I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades for "bufferbloat" on DSLReports and see high echo RTT when the link is loaded.

What settings result in a DSLreports A grade?

It seems I was unclear. Ideally I'd like to get back to getting an A or B on the DSLR test which I suspect means having an average latency of 100-200 ms. in both directions. I can get a B by carefully tweaking the test parameters after setting the WAN speed to ~90% of my uplink speed. I guess I was hoping for something both simpler and better.

Like Harvy66 said, the grading score is a combination of your upload and download bufferbloat. You have not yet told us whether the majority of your bufferbloat is occurring during upload or download saturation. The test should show you specifically what your upload and download bufferbloat is in millisecond measurements.

Also, like I already mentioned, you will get a more accurate measurement if you look at the pfSense Quality graphs (or you could simply run a ping test yourself…). The dslreports test is a program within a browser, so it is not always accurate. dslreports says my latency is ~5x worse than it actually is...

We need more info from you. For example, what is your latency (in milliseconds) during idle, upload saturation, and download saturation, with codel & without?

@vesikk:

I have tried using codelq and it works, but as soon as I start an upload to Google Drive I get about 50% packet loss and then the gateways appear offline according to pfSense, even though ping stays low.

Where are you recording 50% packet loss? tcpdump? Queue drops? Somewhere else?

Nullity

@Derelict:

OK, look.

pfSense does not have a problem with buffer bloat.

CoDel locally does practically nothing.

People are likely seeing buffer bloat improvement by simply setting the bandwidth lower than their expected upload so buffer bloat does not occur in the ISP gear.

CoDel is for transit, not for the edge.

If you were to start increasing queue length in local shapers AND enabling CoDel you might derive some benefit. But the only reason you would derive benefit from CoDel is because you increased the queue length so just don't.

Hmm?

Codel is for any network node that receives more traffic than it can immediately send, which is usually at the edge.
Transit nodes (backbones) have almost no need for Codel, since they keep latency low by using the best possible option… always have excess bandwidth (~50%) available.
The Codel authors specifically advise that it is most useful at gateway/edge routers. They even spearheaded the creation of a firmware (CeroWRT) specifically for consumer routers.

Without Codel, my ping hit 600ms during upload saturation (50 packet default queue depth).
With Codel, during upload saturation, my queue is usually 1-4 packets deep with a ~50ms ping.

I wouldn't say pfSense has a bufferbloat problem (no more than everyone else), but Codel does make a big difference in worst-case latency, that's a fact.

Derelict

I would disagree. On a high-speed connection with 50 buffers CoDel will do practically nothing. The benefit is being derived from not overloading the upstream (where the bufferbloat occurs) with the bandwidth setting.

Harvy66

@Derelict:

I would disagree. On a high-speed connection with 50 buffers CoDel will do practically nothing. The benefit is being derived from not overloading the upstream (where the bufferbloat occurs) with the bandwidth setting.

Yes, if you have your buffers set to 50, which is the default. I had the issue with my 100/100 connection that a buffer size of 50 was too small and it dropped quite a few packets under sustained load. My ISP has a fully uncongested network and allows 1Gb microbursts from datacenters half-way across the USA. I know at least Google, and I think several others, configure their TCP stacks to transfer at full line rate for the first several packets. Their line rate is 10Gb-40Gb. My ISP does traffic shaping in their Cisco core router and it seems to have some anti-bufferbloat features built in, which not only gives me an A on DSLReports without any shaping on my end, but also allows short-lived bursts through. When capturing with Wireshark on my WAN interface, a single TCP stream from Google will give me back-to-back packets at 1Gb/s for the first 100ms-250ms before the Cisco router starts to traffic shape. I'm only connected to my ONT via 1Gb Ethernet, so who knows how fast the actual burst is.

Since I shaped my LAN to 100Mb/s, this 1Gb burst quickly fills my 50 packet queue and I get bursts of packet loss. Even worse, some of these TCP streams fully transfer in the 100ms-250ms window. Since I will ACK all of the packets, Google's TCP stack thinks I can handle the rate they're sending at. As devices make more requests to Google, I get more and more 1Gb bursts until my average utilization gets too high and the ISP's rate limiting starts to kick in.
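To put a number on "quickly", here is a back-of-the-envelope sketch. It assumes every packet is a full 1500 bytes and that the queue starts empty, neither of which is stated above, and the function name is invented for illustration.

```python
def queue_fill_time_ms(queue_packets, packet_bytes, arrival_bps, drain_bps):
    """Milliseconds until a queue overflows when traffic arrives faster than it drains.

    Returns None if the queue never fills (arrival rate <= drain rate).
    """
    if arrival_bps <= drain_bps:
        return None
    queue_bits = queue_packets * packet_bytes * 8
    # The queue grows at the difference between the arrival and drain rates.
    return queue_bits * 1000 / (arrival_bps - drain_bps)


# 50-packet queue, 1Gb/s burst arriving, shaped down to 100Mb/s.
fill_ms = queue_fill_time_ms(50, 1500, 1_000_000_000, 100_000_000)
print(f"{fill_ms:.2f} ms")
```

Under those assumptions the queue overflows in well under a millisecond, which is tiny compared with a burst lasting 100ms-250ms.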

Maybe you're thinking, oh, he's just getting flooded by an on-premises CDN. No. My traffic comes from a Level 3 to Google PoP 300 miles away. I even configured YouTube to pull from a European CDN in Germany. 130ms ping, less than 1ms of jitter, and still getting 1Gb bursts from them. I guess this is the price I pay for having an uncongested ISP.

Nullity

@Derelict:

I would disagree. On a high-speed connection with 50 buffers CoDel will do practically nothing. The benefit is being derived from not overloading the upstream (where the bufferbloat occurs) with the bandwidth setting.

Rate-limiting is a separate function (the function that avoids overloading upstream nodes) from CoDel. Technically, ALTQ rate-limiting is done by either the Token Bucket Regulator (interface rate-limiting) or the queueing discipline (HFSC, CBQ, FAIRQ).

CoDel, to keep the network buffers small, simply drops packets depending on the amount of time packets are queued.

Rate-limiting puts pfSense in control (avoids overloading upstream), then CoDel mitigates bufferbloat (by dropping packets). Two separate functions.
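To illustrate the "drops packets depending on the amount of time packets are queued" part, here is a heavily simplified sketch of a CoDel-style dequeue. It is not the ALTQ implementation: the class and attribute names are made up, timestamps are passed in by hand, and details of the real algorithm (such as how the drop count carries over between dropping episodes) are left out. The 5ms target and 100ms interval come from the `pfctl` output earlier in the thread.

```python
from collections import deque
from math import sqrt

TARGET = 0.005    # seconds of acceptable queueing delay ("target 5")
INTERVAL = 0.100  # seconds the delay must stay above target ("interval 100")


class SimpleCoDel:
    """Toy CoDel-style queue that decides drops at dequeue time."""

    def __init__(self):
        self.queue = deque()      # entries are (enqueue_time, packet)
        self.first_above = None   # deadline after which dropping may start
        self.dropping = False
        self.count = 0            # drops in the current dropping episode
        self.drop_next = 0.0      # time of the next scheduled drop

    def enqueue(self, packet, now):
        self.queue.append((now, packet))

    def _ok_to_drop(self, sojourn, now):
        if sojourn < TARGET:
            self.first_above = None   # delay is fine, reset the timer
            return False
        if self.first_above is None:
            self.first_above = now + INTERVAL  # start the grace period
            return False
        return now >= self.first_above

    def dequeue(self, now):
        while self.queue:
            enqueued_at, packet = self.queue.popleft()
            if not self._ok_to_drop(now - enqueued_at, now):
                self.dropping = False
                return packet
            if not self.dropping:
                # Delay stayed above target for a full interval: drop one packet.
                self.dropping = True
                self.count = 1
                self.drop_next = now + INTERVAL / sqrt(self.count)
                continue
            if now >= self.drop_next:
                # Still too slow: drop again, with drops spaced closer together.
                self.count += 1
                self.drop_next += INTERVAL / sqrt(self.count)
                continue
            return packet
        self.first_above = None
        self.dropping = False
        return None
```

Note that nothing in this sketch limits the sending rate. It only reacts to how long packets sat in the queue, which is why a rate limiter has to sit in front of it for the queue to form locally in the first place.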