CoDel - How to use

Nullity

@cmutwiwa:

This is a very informative thread. I've been searching this forum for general traffic shaping methods and tricks, and I've bookmarked a couple of threads including this one… I've yet to do my setup; I like doing the research first so I have an idea of what I'm dealing with.

I have a question tho': from what I read, CoDel is only meant to work (well) with high-speed connections, 2.5Mbit and above? I'm currently using a very slow ADSL connection, 1770kbps down and 550kbps up (that's what I get from speedtest, although the ISP sells it as 2Mbit down, 1Mbit up). Would CoDel work for me at these speeds?

I'm looking to upgrade to 10Mbit down, 2.5Mbit up soon, but I kindly need to know if CoDel would work for me with my current speeds.

Another question: am I right to assume that CoDel can work side by side with PRIQ and the Limiter? (I intend to use the Limiter to share bandwidth evenly on the LAN, as foxale08 has illustrated here: https://forum.pfsense.org/index.php?topic=63531.0)

Regards.

Answer to your first question: No (but it couldn't hurt to try… :))

If you check CoDel's official site you would see that there are a few known problems. Here is a quote from http://www.bufferbloat.net/projects/codel/wiki/Wiki/

At very low bandwidths (e.g. .5Mbps) on ADSL, we're having to play with the target; Kathie did not have to in her simulations. This is due to inevitable buffering in htb or in the device driver. We have a version under development that does bandwidth limiting without buffering an extra packet, called cake. It's looking good so far.

The quote mentions "playing with the target", but this level of configuration is only available in the Linux kernel's Codel/fq_codel implementation (please correct me if I am wrong). I am assuming they mean the target delay. Please refer to the following link for more information: http://man7.org/linux/man-pages/man8/tc-fq_codel.8.html

Regarding your second question: I have only used Codel as a stand-alone queue, never as part of another queueing algorithm like PRIQ or HFSC. I have no reason to think it wouldn't work, though. Make sure to reset the firewall states and test your queueing configuration to confirm that it is working as expected.

Harvy66

I use Codel in conjunction with HFSC on my 100/100 connection. In theory they're independent of each other. HFSC schedules which queue to dequeue and Codel decides which packets to dequeue or to drop.

In practice, I never see any packets dropped according to the queue statistics, but I also never see any measurable latency spikes.

I'm not sure of a good way to load test because my connection is "too" stable.

It is possible that if HFSC isn't "smooth" enough at interleaving the queues, a queue could wait longer than Codel wants, which could create packet loss when there isn't any real congestion, but HFSC seems like freaking magic. I assume bad things could happen with an older machine with 10ms timers, or with 10/40Gb interfaces that have fq_codel tuned for 0.5ms, which is recommended for those rates. But right now we have neither fq_codel nor a way to tweak the target delay via the interface.

sideout

Here is how I have mine setup for testing right now:

[Attached screenshots: QueueStatus1.JPG, TrafficShaper1.JPG through TrafficShaper6.JPG]

cmutwiwa

@Nullity:

Thanks, will give it a try.

I'm hoping it will work smoothly with PRIQ and the Limiter. I would achieve a lot!

ltctech

I have been experimenting with my home router, which is running DD-WRT, and enabled HTB along with FQ_CoDel. My latency under load has improved from 300ms+ to just under 20ms, as measured by pinging Google DNS. The connection is a Frontier FIOS 25M/25M. My throughput, as verified by speedtest.net, is good whether I am using a Seattle-based server or one in Atlanta. I thought that was great and looked into turning CoDel on in pfSense at work.

We have two pfSense firewalls, both in the Seattle area. One is sitting on a 50M/50M Comcast EDI circuit. The other is sitting on a collocated burstable gigabit circuit that we shape to 100M/100M. Both use a simple PRIQ shaper with a 500-packet queue limit; that's roughly a max queue delay of 114ms for 50M and 57ms for 100M. I have found that the 500-packet queue limit offers the best throughput with the fewest drops, and this has worked for us for a few years now.

For most general traffic I really don't care that it queues up and gets delayed, but less intensive flows should not get delayed. That is what I see CoDel achieving on my home router, and pfSense was able to do the same on our work connections. However, something weird is going on when I do speed tests from behind the pfSense boxes. When I choose a local Seattle server with an RTT of under 10ms, both connections come close to their shaped throughput. When I choose a server in Atlanta or Miami, where the RTT is around 80ms, the upload throughput is suddenly 25-50% lower, and it takes longer to get up to that throughput. Download throughput is never affected, even though CoDel is enabled on the LAN queues and I verified via ping that it is working. If I turn off CoDel AQM, the upload for high-RTT servers goes back to roughly the shaped throughput.

Can anyone explain why the longer RTT is causing issues for upload on pfSense using CoDel? Why does DD-WRT not experience this issue?
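For anyone wanting to check the queue-delay figures, here is a rough sketch of the arithmetic. It assumes every queued packet is a full 1500-byte frame (my assumption, not stated in the post); the function name is made up for illustration. The 114ms and 57ms figures appear to come out when a "megabit" is taken as 1024*1024 bits; with decimal megabits the same queue works out to 120ms and 60ms.

```python
def max_queue_delay_ms(queue_packets, packet_bytes, link_bits_per_s):
    """Worst-case time to drain a completely full queue, in milliseconds."""
    queue_bits = queue_packets * packet_bytes * 8
    return 1000.0 * queue_bits / link_bits_per_s


MBIT = 1_000_000     # decimal megabit
MIBIT = 1024 * 1024  # binary "megabit" (assumed to be what the post used)

for rate in (50, 100):
    decimal = max_queue_delay_ms(500, 1500, rate * MBIT)
    binary = max_queue_delay_ms(500, 1500, rate * MIBIT)
    print(f"{rate}M: {decimal:.0f} ms (decimal), {binary:.0f} ms (binary)")
```

Smaller average packet sizes would shorten these delays proportionally, so treat them as an upper bound for a 500-packet limit.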

Harvy66

Are you using the same computer when doing speed tests? I've seen large variations in speed tests between computers. Too bad pfSense doesn't have fq_codel yet, but there are some other awesome changes in the pipeline.

ltctech

@Harvy66:

I am using Windows Server 2008 R2 when testing the work connections. I am using Mac OS X 10.10.2 and Windows 8.1 Pro at home. I'll see if Windows 8.1 Pro makes a difference on the office connection tomorrow, though I doubt it.

I am thinking that the queue length has something to do with it on pfSense.

Nullity

@ltctech:

If stream fairness is your goal then FAIRQ might be a better choice. There is a picture somewhere showing graphs of fq_codel, codel, SFQ (stochastic fair queue) and the latency changes when each algorithm is dealing with dozens of simultaneous streams. SFQ was a close 2nd behind fq_codel for best latency. FAIRQ is very similar to SFQ (both give each stream a hash then iterate through them round-robin style).

Codel is lacking "fair queueing" (there are many papers on this topic) so it does poorly with multiple streams, unlike fq_codel.
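To illustrate the "one queue per stream, served round-robin" idea, here is a toy sketch. It is not how FAIRQ or SFQ are actually implemented: the flow id stands in for the hash bucket (so there are no hash collisions), all packets are assumed to have arrived before dequeuing starts, and the function name is made up.

```python
from collections import deque


def round_robin(packets):
    """packets: list of (flow_id, payload) in arrival order.

    Returns the dequeue order when each flow gets its own FIFO and the
    flows are served one packet per turn.
    """
    queues = {}  # flow_id -> deque; dicts keep insertion order
    for flow_id, payload in packets:
        queues.setdefault(flow_id, deque()).append(payload)

    order = []
    while queues:
        for flow_id in list(queues):  # copy keys so we can delete while looping
            q = queues[flow_id]
            order.append((flow_id, q.popleft()))
            if not q:
                del queues[flow_id]
    return order


arrivals = [("bulk", 1), ("bulk", 2), ("bulk", 3), ("ping", 1), ("bulk", 4)]
print(round_robin(arrivals))
```

In a single FIFO the ping packet would leave fourth, behind three bulk packets; with per-flow queues it leaves second.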

Harvy66

@ltctech:

There are a few things at play.

If your queue is too small, it will drop packets too aggressively. You can look at the queue statistics to find out if any drops are happening.
If your queue is too large and you are not using something like Codel with time-based dropping, your bandwidth can also be used less efficiently.
My personal most common reason for poor upload speeds is the TCP stack of the OS I'm using.

Windows 2008 R2 is the Win7 kernel, and Win7 defaults to some latency-sensitive TCP congestion control. This may not be the same for the server edition, but when I switched to using CTCP, my upload bandwidth to higher-latency targets increased substantially. All versions of Win8 default to CTCP.

And never assume two similar machines would get the same performance. I had two identical computers, exactly the same hardware, both with a fresh install of Win7, and one was over 50% faster than the other for speed tests. The only thing I could think of that would make the difference was heuristics. Win7 tries to be "smart" about certain things, which can cause it to get confused. A quick trip into the registry to change some settings and a reboot and both systems were getting identical speedtests.

So even freshly installed identical hardware can get large variations. ALWAYS test using the same machine. Or use an OS that doesn't suck. Freaking Windows.

ltctech

@Harvy66:

You are right about the TCP stack differences. Windows 2008 R2 copes much worse than Windows 8.1 or Windows 2012 R2. Though even in Windows 8.1, where I checked that CTCP is enabled, I am getting an average upload of 40-45Mbits where it should be 48-49Mbits. So I guess CoDel is just not worth it, at least the way it's implemented on pfSense. Maybe they'll fix it when they do FQ_CoDel.

Harvy66

pfSense only implements the original Codel, which has a large buffer length and a target latency of 5ms. This allows it to do well if lots of small or large packets come through at the same time. One of the big issues with buffer bloat is that if your buffer is too small you can drop small packets, but if your buffer is too large, then large packets cause too much backlog.

fq_codel extends this to include "fair" queuing, which breaks up data flows into hash buckets and does a mixture of prioritizing packets arriving into empty buckets and dequeuing backlogged buckets equally. Codel is still pretty much the best option for now. Set and forget.

Nullity

Some have said CoDel is not a traffic shaper. This is confusing because CoDel drops packets to keep the buffers in check. Dropped TCP packets result in a throttling effect.

Perhaps I am confusing a "traffic shaper" with a "traffic policer".
http://www.cisco.com/c/en/us/support/docs/quality-of-service-qos/qos-policing/19645-policevsshape.html

CoDel is one of the 2 though, right?

I am confused. :o

jimp

To oversimplify it quite a bit:

Shaping can delay sending traffic (as well as drop) to smooth out usage, whereas policing simply lops off anything over the max rate and chucks it in the bit bucket.

Shaping typically employs queues as well as the occasional drop, whereas policing just says "nope" and drops it hard if it crosses the high rate.

Policing is very harsh. If you have ever had to deal with a circuit that had traffic policing, you know that both ends MUST have the same policing set or it's a nightmare of dropped packets. I haven't personally seen a circuit with traffic policing in probably 10 yrs or so, thankfully.
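A toy model may make the difference concrete. This is only a sketch under simplifying assumptions: time is cut into ticks, the rate is a fixed number of packets per tick, and the policer has no burst allowance (real policers usually use a token bucket). The function names are invented for the example.

```python
def police(arrivals, rate):
    """Per tick: forward up to `rate` packets and drop the excess immediately."""
    sent = [min(a, rate) for a in arrivals]
    dropped = sum(a - s for a, s in zip(arrivals, sent))
    return sent, dropped


def shape(arrivals, rate, queue_limit):
    """Per tick: queue the arrivals (tail-drop past queue_limit),
    then send up to `rate` packets from the queue."""
    backlog, sent, dropped = 0, [], 0
    for a in arrivals:
        accepted = min(a, queue_limit - backlog)
        dropped += a - accepted
        backlog += accepted
        out = min(backlog, rate)
        backlog -= out
        sent.append(out)
    return sent, dropped, backlog


bursts = [5, 0, 0, 5, 0, 0]  # two bursts of 5 packets, rate is 2 per tick
print(police(bursts, rate=2))
print(shape(bursts, rate=2, queue_limit=10))
```

With these numbers the policer forwards 4 packets and drops 6, while the shaper delivers all 10 by spreading each burst over the following ticks. The shaper only drops when the queue itself overflows.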

Harvy66

@Nullity:

Traffic shapers do not drop packets; they dequeue packets from queues at specified rates. It's the queues that drop packets, but it's the traffic shaper's job to decide which queue to dequeue and when.

Nullity

@Harvy66:

I think I understand what you are saying, but the post above yours and the Cisco link both say that shapers drop packets. :o

jimp

Shaping can drop but only by way of it dropping out of a queue. It still had to be queued, possibly delayed, etc.

The only action of Policing is to drop, no queue.

Nullity

Perhaps it is my confusion between incoming and outgoing egress. CoDel throttles (shapes?) incoming egress TCP streams based on queueing delay, but this queueing delay is controlled by outgoing egress speeds, which are controlled by the traffic-shaper.

I should probably just head back to the books… :-X

:D

Edit: I am referring to WAN interface.

Harvy66

Codel is just a regular queue. Just like the default queue, it drops packets when it gets full. The difference is that the default queue does tail drops, and drops abruptly once full. Codel does head drops and defines "full" not as a number of packets but as how long a packet has been in the queue. Even then, it doesn't drop abruptly; it drops at an ever-increasing rate.
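The "ever increasing rate of drops" can be sketched from the published CoDel control law, where the gap to the next drop is interval / sqrt(count). This is a simplified illustration, not pfSense's code: it assumes the queueing delay stays above the target the whole time, uses the 100ms interval from the CoDel specification, and the function name is made up.

```python
from math import sqrt

INTERVAL = 0.100  # seconds; the default CoDel interval (assumed here)


def drop_schedule(first_drop, drops, interval=INTERVAL):
    """Times of successive head drops while delay stays above target.

    After the count-th drop, the next one is scheduled
    interval / sqrt(count) seconds later, so the gaps keep shrinking.
    """
    times, t = [], first_drop
    for count in range(1, drops + 1):
        times.append(t)
        t += interval / sqrt(count)
    return times


times = drop_schedule(0.0, 5)
gaps_ms = [round((b - a) * 1000, 1) for a, b in zip(times, times[1:])]
print(gaps_ms)
```

The gaps between drops shrink from 100ms to 50ms over the first five drops, which is the gradual ramp-up as opposed to dropping everything at once when a fixed-size queue fills.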

It is impossible to have a network interface without a queue, even if it's a queue of one. The whole point of a queue is to buffer packets. Codel does so in a way that reduces buffer bloat while allowing high throughput relative to the default fixed-size tail-drop that has been around for decades.

When writing multi-threaded code, you use queues a lot because synchronizing threads is expensive and you rarely have two threads that process data at the same rate. You need to buffer that data somewhere. Queues!

Nullity

I think I get it.

Part of my confusion stemmed from when I tested CoDel: it caused my upload/download to drop to ~75% of my maximum bitrate, and the throughput was unsteady. I never experienced this problem with "regular" queues. This caused me to assume that CoDel was doing something extra (shaping) to keep my queueing delay low. Without CoDel, I achieved the bitrate assigned to the interface.

I now realize that CoDel should not have acted that way. I will need to revisit CoDel and see if I get the same results again.

My real-world internet speeds are 6.34Mb/666Kb.

tuffcalc

@Nullity:

Did you set an upload bandwidth limit? Set it at 95% of 666Kb and have another run at it.