CoDel - How to use
-
If qlimit is 0, it defaults to 50, and codel gets the (initial?) target value from qlimit.
is qlimit the queue length, or something else entirely?
qlimit is the queue length, which becomes useless when codel is active, since codel dynamically controls queue length (AQM).
So when using codel the 'queue limit' setting seems to change the target instead… handy, but not very obvious..
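A tiny sketch (my own Python, purely illustrating the behaviour described above, not the actual pf/ALTQ code) of how the queue-limit field appears to be reused:

```python
# Illustration only: models the behaviour described above, where an unset
# qlimit falls back to 50 and, with codel enabled, the value is reused as
# the (initial) target rather than as a FIFO queue length.
def effective_codel_target(qlimit: int, default_qlimit: int = 50) -> int:
    if qlimit == 0:
        qlimit = default_qlimit   # qlimit of 0 defaults to 50
    return qlimit                 # reused as codel's target when codel is active

print(effective_codel_target(0))   # 50
print(effective_codel_target(5))   # 5
```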
Thanks! -
Yeah, it is pretty confusing but I'll take CoDel however I can get it. :)
Ermal ported it himself, iirc. Ahead of the curve, that guy! :)
I still dunno how to view or set codel's parameters when it is a sub-discipline though. Default or gtfo, I suppose…
-
The whole target qlimit thing applies to CoDel for both the scheduler and the child discipline?
Do you know if the interval changes? The interval is supposed to be 20x the target.
-
Yeah, it is pretty confusing but I'll take CoDel however I can get it. :)
Ermal ported it himself, iirc. Ahead of the curve, that guy! :)
I still dunno how to view or set codel's parameters when it is a sub-discipline though. Default or gtfo, I suppose…
I've just had a tinker and I can't find anything, but that certainly doesn't mean it's not there.
I've rarely used BSD; is there some /proc-type interface where the information comes from that can be queried directly? -
The whole target qlimit thing applies to CoDel for both the scheduler and the child discipline?
Do you know if the interval changes? The interval is supposed to be 20x the target.
iirc, the sub-discipline setup is purely configured by hard-coded defaults and has no user configurable/viewable params that I am aware of. Hopefully, there is a simple way for a user to view/set the params in that situation. ermal? ;)
interval is the only value required by codel, so I do not think it changes. Technically, the target should be set based on the interval value, not vice versa.
afaik, current codel implementations do not automagically set interval to live RTT.
"The CoDel building blocks are able to adapt to different or time-varying link rates, to be easily used with multiple queues, to have excellent utilization with low delay and to have a simple and efficient implementation. The only setting CoDel requires is its interval value, and as 100ms satisfies that definition for normal internet usage, CoDel can be parameter-free for consumer use."
See: https://tools.ietf.org/id/draft-nichols-tsvwg-codel-02.txt
I have tried to run a thought-experiment concerning how a 5ms interval should negatively affect codel's performance, but I cannot fully comprehend it. I need to set up a bufferbloat lab…
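As a back-of-the-envelope illustration of that thought experiment (my own numbers, just applying the draft's rule of thumb that target is roughly interval/20):

```python
# Compare the draft's default interval with the 5ms value discussed above.
# interval ~= a worst-case "normal" RTT; target ~= interval / 20.
for interval_ms in (100.0, 5.0):
    target_ms = interval_ms / 20.0
    print(f"interval={interval_ms:g}ms -> target={target_ms:g}ms; "
          f"dropping starts once sojourn time exceeds target for {interval_ms:g}ms")
```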
-
Yeah, it is pretty confusing but I'll take CoDel however I can get it. :)
Ermal ported it himself, iirc. Ahead of the curve, that guy! :)
I still dunno how to view or set codel's parameters when it is a sub-discipline though. Default or gtfo, I suppose…
I've just had a tinker and I can't find anything, but that certainly doesn't mean it's not there.
I've rarely used BSD; is there some /proc-type interface where the information comes from that can be queried directly?
iirc, the values could be gotten through some dev/proc interface, but it required an ioctl system call and could not be done via shell commands.
Though, I was confused then and now I've forgotten stuff, so I might be sense-making not-so-much.
-
Well, this is fun. It seems to actually perform worse with the 'correct' values in place.
With 50/5 I was seeing mostly <200ms response time with upstream saturated and a 'B' on dslreports bufferbloat test
With 5/100 I'm seeing mostly <300ms response time, with more between 200 and 300ms than before, and a 'C' on dslreports bufferbloat test -
That might explain why the CoDel people were saying they typically saw bufferbloat as low as 30ms, but I was seeing 0ms. PFSense may be more aggressive with the 5ms interval.
The interval is how often a single packet will be dropped until the packet's time in queue is below the target. If the target is 100ms with a 5ms interval, once you get 100ms of packets queued, CoDel will start dropping packets every 5ms and slowly increase the rate. It's not exactly how I say it, but close. They have some specific math that makes everything not quite as simple as described, but very similar.
The interval is supposed to be set to your "normal" RTT, and the target should be 1/20th that value. Most services I hit have sub-30ms pings. My interval should be, say, 45ms and my target 2.25ms.
If the interval is too high, CoDel will be too passive and have increasing bufferbloat, but if it's too low, it will be too aggressive and reduce throughput.
Maybe this is why PFSense's CoDel gives bad packet loss and throughput on slow connections. If the interval is 5ms, many packets will be dropped in a row.
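For what it's worth, here is a rough sketch of that "specific math" as given in the CoDel draft (pfSense's ALTQ port may differ in detail): while the queue delay stays above target, drops are spaced interval/sqrt(count) apart, so the drop rate ramps up gradually.

```python
import math

# Sketch of CoDel's drop-scheduling control law per the Nichols/Jacobson
# draft; not taken from pfSense's ALTQ source.
INTERVAL_MS = 100.0              # should approximate a "normal" worst-case RTT
TARGET_MS = INTERVAL_MS / 20.0   # acceptable standing queue delay (5ms here)

def next_drop_time(now_ms: float, drop_count: int) -> float:
    """While sojourn time stays above TARGET_MS, schedule the next drop
    interval / sqrt(count) after this one, so drops get closer together."""
    return now_ms + INTERVAL_MS / math.sqrt(drop_count)

# Successive gaps between drops: 100.0, ~70.7, ~57.7, 50.0 ms, ...
print([round(INTERVAL_MS / math.sqrt(n), 1) for n in range(1, 5)])
```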
-
Well, this is fun. It seems to actually perform worse with the 'correct' values in place.
With 50/5 I was seeing mostly <200ms response time with upstream saturated and a 'B' on dslreports bufferbloat test
With 5/100 I'm seeing mostly <300ms response time, with more between 200 and 300ms than before, and a 'C' on dslreports bufferbloat test
I think you may have another problem/misconfiguration. You should be seeing MUUUUCH better than 200ms. My ADSL connection goes from 600ms without any traffic-shaping, to 50ms with CoDel on upstream during a fully-saturating, single-stream upload test. My idle ping to first hop is ~10ms.
but lol…. I have been laughing that the fixed parameter values would actually cause a performance decrease...
-
Well, this is fun. It seems to actually perform worse with the 'correct' values in place.
With 50/5 I was seeing mostly <200ms response time with upstream saturated and a 'B' on dslreports bufferbloat test
With 5/100 I'm seeing mostly <300ms response time, with more between 200 and 300ms than before, and a 'C' on dslreports bufferbloat test
I think you may have another problem/misconfiguration. You should be seeing MUUUUCH better than 200ms. My ADSL connection goes from 600ms without any traffic-shaping, to 50ms with CoDel on upstream during a fully-saturating, single-stream upload test. My idle ping to first hop is ~10ms.
but lol…. I have been laughing that the fixed parameter values would actually cause a performance decrease...
You're absolutely right, my problem is my ISP and their crappy excuse for a router, which I can't easily replace because it also handles the phones.
My connection will easily hit 2000ms+ if someone is uploading, so <200ms is a massive improvement.
I'm also laughing a little at the results; based on your previous tests it's not a huge surprise, but an explanation would be nice!
-
Well, this is fun. It seems to actually perform worse with the 'correct' values in place.
With 50/5 I was seeing mostly <200ms response time with upstream saturated and a 'B' on dslreports bufferbloat test
With 5/100 I'm seeing mostly <300ms response time, with more between 200 and 300ms than before, and a 'C' on dslreports bufferbloat test
I think you may have another problem/misconfiguration. You should be seeing MUUUUCH better than 200ms. My ADSL connection goes from 600ms without any traffic-shaping, to 50ms with CoDel on upstream during a fully-saturating, single-stream upload test. My idle ping to first hop is ~10ms.
but lol…. I have been laughing that the fixed parameter values would actually cause a performance decrease...
You're absolutely right, my problem is my ISP and their crappy excuse for a router, which I can't easily replace because it also handles the phones.
My connection will easily hit 2000ms+ if someone is uploading, so <200ms is a massive improvement.
I'm also laughing a little at the results; based on your previous tests it's not a huge surprise, but an explanation would be nice!
You might test enabling net.inet.tcp.inflight.enable=1 in the System->Advanced->System Tunables tab.
TCP bandwidth delay product limiting can be enabled by setting the net.inet.tcp.inflight.enable sysctl(8) variable to 1. This instructs the system to attempt to calculate the bandwidth delay product for each connection and limit the amount of data queued to the network to just the amount required to maintain optimum throughput.
This feature is useful when serving data over modems, Gigabit Ethernet, high speed WAN links, or any other link with a high bandwidth delay product, especially when also using window scaling or when a large send window has been configured. When enabling this option, also set net.inet.tcp.inflight.debug to 0 to disable debugging. For production use, setting net.inet.tcp.inflight.min to at least 6144 may be beneficial. Setting high minimums may effectively disable bandwidth limiting, depending on the link. The limiting feature reduces the amount of data built up in intermediate route and switch packet queues and reduces the amount of data built up in the local host's interface queue. With fewer queued packets, interactive connections, especially over slow modems, will operate with lower Round Trip Times. This feature only effects server side data transmission such as uploading. It has no effect on data reception or downloading.
Adjusting net.inet.tcp.inflight.stab is not recommended. This parameter defaults to 20, representing 2 maximal packets added to the bandwidth delay product window calculation. The additional window is required to stabilize the algorithm and improve responsiveness to changing conditions, but it can also result in higher ping(8) times over slow links, though still much lower than without the inflight algorithm. In such cases, try reducing this parameter to 15, 10, or 5 and reducing net.inet.tcp.inflight.min to a value such as 3500 to get the desired effect. Reducing these parameters should be done as a last resort only.
https://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html
Seems like exactly the type of thing we would be interested in.
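For anyone wondering what that actually computes, here is a worked example of the bandwidth-delay product that the inflight code estimates per connection (my own numbers, not pfSense code):

```python
# Bandwidth-delay product: the amount of data that can usefully be "in
# flight" on a path; anything queued beyond this only adds delay.
def bdp_bytes(bandwidth_bps: float, rtt_ms: float) -> float:
    return bandwidth_bps / 8.0 * (rtt_ms / 1000.0)

# e.g. a 1Mb/s ADSL upstream with a 50ms RTT only needs ~6.25KB in flight,
# which is right around the suggested net.inet.tcp.inflight.min of 6144 bytes.
print(bdp_bytes(1_000_000, 50))   # 6250.0
```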
-
Well, this is fun. It seems to actually perform worse with the 'correct' values in place.
With 50/5 I was seeing mostly <200ms response time with upstream saturated and a 'B' on dslreports bufferbloat test
With 5/100 I'm seeing mostly <300ms response time, with more between 200 and 300ms than before, and a 'C' on dslreports bufferbloat test
I think you may have another problem/misconfiguration. You should be seeing MUUUUCH better than 200ms. My ADSL connection goes from 600ms without any traffic-shaping, to 50ms with CoDel on upstream during a fully-saturating, single-stream upload test. My idle ping to first hop is ~10ms.
but lol…. I have been laughing that the fixed parameter values would actually cause a performance decrease...
You're absolutely right, my problem is my ISP and their crappy excuse for a router, which I can't easily replace because it also handles the phones.
My connection will easily hit 2000ms+ if someone is uploading, so <200ms is a massive improvement.
I'm also laughing a little at the results; based on your previous tests it's not a huge surprise, but an explanation would be nice!
You might test enabling net.inet.tcp.inflight.enable=1 in the System->Advanced->System Tunables tab.
TCP bandwidth delay product limiting can be enabled by setting the net.inet.tcp.inflight.enable sysctl(8) variable to 1. This instructs the system to attempt to calculate the bandwidth delay product for each connection and limit the amount of data queued to the network to just the amount required to maintain optimum throughput.
This feature is useful when serving data over modems, Gigabit Ethernet, high speed WAN links, or any other link with a high bandwidth delay product, especially when also using window scaling or when a large send window has been configured. When enabling this option, also set net.inet.tcp.inflight.debug to 0 to disable debugging. For production use, setting net.inet.tcp.inflight.min to at least 6144 may be beneficial. Setting high minimums may effectively disable bandwidth limiting, depending on the link. The limiting feature reduces the amount of data built up in intermediate route and switch packet queues and reduces the amount of data built up in the local host's interface queue. With fewer queued packets, interactive connections, especially over slow modems, will operate with lower Round Trip Times. This feature only effects server side data transmission such as uploading. It has no effect on data reception or downloading.
Adjusting net.inet.tcp.inflight.stab is not recommended. This parameter defaults to 20, representing 2 maximal packets added to the bandwidth delay product window calculation. The additional window is required to stabilize the algorithm and improve responsiveness to changing conditions, but it can also result in higher ping(8) times over slow links, though still much lower than without the inflight algorithm. In such cases, try reducing this parameter to 15, 10, or 5 and reducing net.inet.tcp.inflight.min to a value such as 3500 to get the desired effect. Reducing these parameters should be done as a last resort only.
https://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html
Seems like exactly the type of thing we would be interested in.
Just for fun, I enabled it and disabled the traffic shaper; during the upload portion of a dslreports test, my ping hit 2700ms :)
With inflight and codel enabled it seems to be behaving fine, possibly slightly better than without, but I'll have to do more testing. -
Given that the 2.2.4 'correct' settings seem to give worse results than the 'incorrect' 2.2.3 ones (for me at least), it seems that we need the ability to tune both interval and target in order to make codel useful for everyone.
I'm guessing it would be complicated to add an extra field to the traffic shaper setup page, but since the queue limit field is currently being reused to set the target, could we add some logic to use it to set both target and interval? If the field contains a single integer, use it as a target; if it contains something else (t5i100? 5:100?), use it as target and interval.
It's a bit beyond my skill level, but it seems like it should be possible in theory, or is there a better way to do it?
Edit: can we just set the interval and derive the target from that, if it's easier?
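Something like this could handle the parsing side; the formats are just the hypothetical ones floated above ("t5i100" / "5:100"), not an existing pfSense option:

```python
import re

# Hypothetical parser for a combined target/interval field; a bare integer
# keeps today's behaviour (target only), "5:100" or "t5i100" gives both.
def parse_codel_field(value: str):
    """Return (target_ms, interval_ms); interval_ms is None if not given."""
    m = re.fullmatch(r"t(\d+)i(\d+)", value.strip())
    if m:
        return int(m.group(1)), int(m.group(2))
    if ":" in value:
        target, interval = value.split(":", 1)
        return int(target), int(interval)
    return int(value), None

print(parse_codel_field("50"))      # (50, None)
print(parse_codel_field("5:100"))   # (5, 100)
print(parse_codel_field("t5i100"))  # (5, 100)
```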
-
Given that the 2.2.4 'correct' settings seem to give worse results than the 'incorrect' 2.2.3 ones (for me at least), it seems that we need the ability to tune both interval and target in order to make codel useful for everyone.
I'm guessing it would be complicated to add an extra field to the traffic shaper setup page, but since the queue limit field is currently being reused to set the target, could we add some logic to use it to set both target and interval? If the field contains a single integer, use it as a target; if it contains something else (t5i100? 5:100?), use it as target and interval.
It's a bit beyond my skill level, but it seems like it should be possible in theory, or is there a better way to do it?
Edit: can we just set the interval and derive the target from that, if it's easier?
Those who created the codel algorithm are the ones who dictate that, and I presume they chose wisely. Target is dynamic anyway, I think.
Though, if you choose CoDel as the primary/parent scheduler, then you can choose whatever interval/target you want, via command-line, at least for temporary testing purposes.
Our problem is that we cannot customize CoDel's parameters when it is a sub-discipline, aka the "Codel Active Queue" checkbox.
Whether to expose the CoDel params in the GUI or not… if it is anything like the HFSC params, people will needlessly tweak them with unforeseen consequences simply because they are there. I dunno... maybe we can use the System Tunables tab and add custom params and values that way?
-
Like Nullity mentioned, the default values were chosen because they work best for the bulk of users. They are a one-size-fits-all choice that is not optimal for every user, but it is still better than FIFO.
The optimal interval should be the lowest value that covers the bulk of the RTTs of your flows. For me the default 100ms is my latency to London and way too large for many of the 10ms-20ms servers that I communicate with. Until they add the ability to change the interval, I can't complain that they're choosing the recommended defaults. But I will post some before-and-after results once I'm allowed to upgrade.
-
I just realized that I should mention that I am using HFSC with CoDel as a child discipline.
PFSense 2.2.3 - this is actually a fairly typical graph, so I didn't run it more than once
PFSense 2.2.4 - I ran this test a few times, they all pretty much showed the same
I see no real difference.
-
I just realized that I should mention that I am using HFSC with CoDel as a child discipline.
PFSense 2.2.3 - this is actually a fairly typical graph, so I didn't run it more than once
PFSense 2.2.4 - I ran this test a few times, they all pretty much showed the same
I see no real difference.
Did you make sure that the values were different? I thought we still did not know how to display the live values of CoDel's params when it is a sub-discipline.
Maybe we should attempt some real testing though. Local tests measuring small time-spans, probably kernel-timer-debugging level of granularity. That DSLreports test is great for introducing regular folks to bufferbloat, but it is not accurate. HTML5 within a browser just cannot perform well enough, especially with your ~10ms-latency connection as the thing under test.
I tried to set ALTQ to debugging state on pfSense then FreeBSD but I never got far. That is my best guess for a way to measure CoDel with enough accuracy to be useful. Any ideas?
-
I just realized that I should mention that I am using HFSC with CoDel as a child discipline.
PFSense 2.2.3 - this is actually a fairly typical graph, so I didn't run it more than once
Removed images
PFSense 2.2.4 - I ran this test a few times, they all pretty much showed the same
Removed images
I see no real difference.
Did you make sure that the values were different? I thought we still did not know how to display the live values of CoDel's params when it is a sub-discipline.
Maybe we should attempt some real testing though. Local tests measuring small time-spans, probably kernel-timer-debugging level of granularity. That DSLreports test is great for introducing regular folks to bufferbloat, but it is not accurate. HTML5 within a browser just cannot perform well enough, especially with your ~10ms-latency connection as the thing under test.
I tried to set ALTQ to debugging state on pfSense then FreeBSD but I never got far. That is my best guess for a way to measure CoDel with enough accuracy to be useful. Any ideas?
Like you said, last time I tried to check the values as a child discipline, it didn't show them. I figured if the defaults were changed for the scheduler, it may have affected these as well, but I have a hard time telling.
If I bypass PFSense and go straight to the Internet, I get very distinct bufferbloat on the DSLReports tests.
You can also see with this that if I change from 16 streams to 32 streams, it starts to tax the simple CoDel algorithm's ability to smooth out latency.
This is what my connection looks like when I bypass PFSense, which means no CoDel. Same 16 streams.
It's hard to see because of the large range, but the base idle ping is 17ms avg, the download is 27ms avg, and the upload is 44ms avg. That is distinctly different from what I get through PFSense with HFSC+CoDel, which was +2ms over idle instead of +10-27ms over idle. An order-of-magnitude difference.
As you can tell, my ISP has horrible bufferbloat /sarc 30ms, most horrible.
I'm not sure of the best way to measure the effects of CoDel as a simple test. You'd probably need something to load a rate-limited connection (not line rate), like iperf, then do pings. My expectation is there should be a measurable difference in both the avg ping and the std-dev of ping.
You may need to be careful doing this test on a LAN where the latency can be measured in microseconds. All TCP implementations send a minimum of 2 segments when streaming data. A 0.1ms ping at 1500-byte segment sizes puts a lower limit of 120Mb/s. All known TCP congestion algorithms will not back off below this speed at that latency. That's per stream. If you have 8 streams, that's 960Mb/s.
0.1ms is actually high latency for a LAN. I measure as low as 0.014ms using a high-resolution ping program, but my switch is rated for 2.3 microseconds. It really depends on the kernel's scheduling granularity.
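To make that arithmetic explicit (my own sketch; the 120Mb/s figure above corresponds to one full-size segment per RTT, and the 2-segment minimum gives twice that):

```python
# TCP cannot stream slower than min_cwnd * segment_size per RTT, so very low
# LAN latencies imply a huge per-stream throughput floor.
def tcp_floor_mbps(rtt_ms: float, mss_bytes: int = 1500, min_cwnd_segments: int = 1) -> float:
    return min_cwnd_segments * mss_bytes * 8 / (rtt_ms / 1000.0) / 1_000_000

print(tcp_floor_mbps(0.1))                        # 120.0 Mb/s with one segment per RTT
print(tcp_floor_mbps(0.1, min_cwnd_segments=2))   # 240.0 Mb/s with a 2-segment minimum
print(8 * tcp_floor_mbps(0.1))                    # 960.0 Mb/s for 8 such streams
```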
-
I'm not sure of the best way to measure the effects of CoDel as a simple test. You'd probably need something to load a rate-limited connection (not line rate), like iperf, then do pings. My expectation is there should be a measurable difference in both the avg ping and the std-dev of ping.
You may need to be careful doing this test on a LAN where the latency can be measured in microseconds. All TCP implementations send a minimum of 2 segments when streaming data. A 0.1ms ping at 1500-byte segment sizes puts a lower limit of 120Mb/s. All known TCP congestion algorithms will not back off below this speed at that latency. That's per stream. If you have 8 streams, that's 960Mb/s.
0.1ms is actually high latency for a LAN. I measure as low as 0.014ms using a high-resolution ping program, but my switch is rated for 2.3 microseconds. It really depends on the kernel's scheduling granularity.
Network measurements are not what we want. We only care about the time before packets are on the wire. All CoDel controls is the scheduling of the local buffers, so that is what we would want to measure, right?
CoDel has relatively zero influence over a packet's latency once the packet hits the wire.
-
I was thinking of a simple test that indirectly measures CoDel by the characteristics of the network. If you want a more direct measurement, you would need to implement some wrapper code that acts like the network stack and OS and simulates the network pushing packets through. More direct, but much more work.
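Very roughly, I imagine that harness would look something like this (a made-up toy, nothing to do with the real ALTQ/network stack): feed synthetic packets into a queue drained at a fixed bottleneck rate and record each packet's sojourn time, which is the quantity CoDel controls; a CoDel dropper would then be bolted onto the dequeue side.

```python
import random

# Toy single-queue simulation with made-up numbers; records how long each
# packet waits before it is "on the wire" (its sojourn time).
LINK_BPS = 2_000_000   # simulated bottleneck rate
PKT_BYTES = 1500

def mean_sojourn_ms(arrival_rate_pps: float, duration_s: float = 30.0) -> float:
    link_free_at = 0.0
    sojourns = []
    t = 0.0
    while t < duration_s:
        t += random.expovariate(arrival_rate_pps)   # Poisson arrivals
        start = max(t, link_free_at)                # wait for the link to free up
        link_free_at = start + PKT_BYTES * 8 / LINK_BPS
        sojourns.append(link_free_at - t)           # queueing + serialization delay
    return 1000 * sum(sojourns) / len(sojourns)

print(f"~50% load: {mean_sojourn_ms(80):.2f} ms")   # link capacity is ~167 pps
print(f"~90% load: {mean_sojourn_ms(150):.2f} ms")  # delay balloons as load rises
```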