FQ CoDel - Any plans to implement?

switchman

Now that you have “CoDel” support, are there any plans to implement “FQ-CoDel” support? It builds on CoDel. From the RFC:
_The FQ-CoDel algorithm is a combined packet scheduler and AQM developed as part of the bufferbloat-fighting community effort. It is based on a modified Deficit Round Robin (DRR) queue scheduler, with the CoDel AQM algorithm operating on each sub-queue.

FQ-CoDel mixes packets from multiple flows and reduces the impact of head of line blocking from bursty traffic. It provides isolation for low-rate traffic such as DNS, web, and videoconferencing traffic. It improves utilisation across the networking fabric, especially for bidirectional traffic, by keeping queue lengths short; and it can be implemented in a memory- and CPU-efficient fashion across a wide range of hardware._

RFC
https://tools.ietf.org/html/draft-hoeiland-joergensen-aqm-fq-codel-01

Technical description of FQ CoDel
http://www.bufferbloat.net/projects/codel/wiki/Technical_description_of_FQ_CoDel

Good study of CoDel, FQ-CoDel and PIE. It is oriented toward cable modems, but it is good info.
http://www.cablelabs.com/wp-content/uploads/2013/11/Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf

Harvy66

Is it just me, or does this attempt to do flow-based equality? This would be great at keeping a single flow from hogging all of the bandwidth.

Harvy66

It does have the added extra variable of hash size, but otherwise I wonder if this can be a drop-in replacement for CoDel?

Harvy66

I'm not quite sure what the difference is between FQ-CoDel and SFQ-CoDel, but according to that DOCSIS paper, SFQ-CoDel has many more corner cases and did not handle high congestion as well as plain CoDel. If I had to choose, I would just use CoDel.

The DOCSIS paper described SFQ as having 32 buckets and the IETF draft described FQ as having 1024, and even mentioned SFQ as something different. I loved the DOCSIS paper for its testing data, but I wish it had tested FQ and not SFQ. I'll see if I can find any similar testing data for FQ.

Harvy66

Seems FQ is quite different from SFQ. Better packet scheduling and better drop characteristics. Described as "Deploy fq_codel. It's just an across the board win"

https://indico.uknof.org.uk/getFile.py/access?contribId=3&resId=0&materialId=slides&confId=27

I find it interesting that they mention uTP as being "problematic".

I found this slide. I added a few notes. I never noticed how poorly TCP "shared" bandwidth over a congested link. One flow seems to take over.

switchman

The authors of the cable modem paper acknowledged that some of the issues they were seeing are a result of having only 32 queues. I think they were looking at it from a model perspective where there would typically be a limited number of flows, i.e. a home environment.

Here is another paper. Not bad.
http://akira.ruc.dk/~tohojo/bufferbloat/bufferbloat-final.pdf

Presentation. Look at slide 39. I think this explains the default of 1024 queues.

http://netseminar.stanford.edu/seminars/Inside_Codel_and_Fq_Codel.pdf

In the RFC, 1024 is set as the default. There is nothing saying it can't be user adjustable. You would just need a reboot to reinitialize the memory/queues.

Harvy66

I wonder what the penalty for large hash sizes would be. I don't mind giving up memory, but I don't want to increase cache thrashing.

switchman

Some good info here and the links on the page.

https://www.bufferbloat.net/projects/codel/wiki

and here

https://www.bufferbloat.net/projects/codel/wiki/Reconciling_codel_variants

I am running CoDel on my pfSense install and like it.

dtaht

To clear up some details:

SFQ and FQ_codel perform similarly with one TCP flow against a single measurement flow. However, as you add TCP flows, the "sparse flow optimization" or "new flow optimization" in FQ_codel kicks in and the measurement/voip/dns flow latency and jitter stay lower than SFQ (well, until you get a hash collision, and with 1024 vs 127 flows by default, hash collisions are rarer, and with codel kicking in, the cost of a collision is less).

The cable labs testing crippled sfq_codel by setting it to 32 flows, which we recommended highly against. ("Birthday problem") It also wasn't FQ_codel they were really testing - FQ_codel is DRR based, SFQ_codel is mostly packet based. They also did quite a few other things to make PIE come out on top, including changing the codel defaults dramatically, folding together all their data at all rates (at lower rates sfq_codel still won handsomely), weighting long fat flows higher than web traffic, not testing DNS traffic at all, and a ton of other faults discussed extensively (with some heat) on the aqm mailing list.
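To put rough numbers on the "Birthday problem" remark, here is a minimal sketch that computes the chance of at least two flows sharing a bucket. It assumes a perfectly uniform hash, which real flow hashes only approximate, and the function name and the flow count of 8 are purely illustrative.

```python
def collision_probability(flows: int, buckets: int) -> float:
    """Chance that at least two of `flows` flows hash to the same bucket.

    Assumes every flow is equally likely to land in any bucket.
    """
    if flows > buckets:
        return 1.0  # pigeonhole: more flows than buckets forces a collision
    p_no_collision = 1.0
    for i in range(flows):
        # The i-th flow must avoid the i buckets already taken.
        p_no_collision *= (buckets - i) / buckets
    return 1.0 - p_no_collision


if __name__ == "__main__":
    # 32 (the cable labs setting), 127 (SFQ default), 1024 (fq_codel default)
    for buckets in (32, 127, 1024):
        print(buckets, round(collision_probability(8, buckets), 3))
```

The probability falls quickly as the bucket count grows, which is the argument for 1024 over 32 even when only a handful of flows are active.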

As things stand today, pie still hasn't deployed, while fq_codel is in a ton of products, including openwrt and its derivatives, the linux kernel mainline, streamboost, ipfire, and a ton of QoS systems, on by default in openwrt, systemd, etc, etc.

And I would love a pfsense version. However a reason we did the internet draft was that all the fq_codel versions so far are GPL'd and we felt that someone else needed to produce a BSD licensed version from the spec, rather than from the code.

The codel portion of the code was explicitly BSD/GPL dual licensed, at Van's request.

I recently gave a talk on the subjects at nznog - see about 2:20 in on the friday morning session:

http://new.livestream.com/i-filmservices/NZNOG2015/videos/75358960

I will gladly review a BSD licensed version of fq_codel on request.

As for memory costs - 1024 queues (flows) costs 64k in overhead on 64 bit systems. You can have up to 64k queues if you want, set at instantiation time (no need to reboot). But there is some optimal ratio of flows to codel state variables that we really don't know - we've merely proven that 1024 queues is good up to GigE for most traffic.

What else on your list above…? uTP remains somewhat problematic against other fat flows, BUT not a problem in backing off compared to short flows in slow start, dns, etc. It turns out that torrent over TCP can be more aggressive than uTP. There's been some good work on looking at uTP vs AQM technologies but it still is not quite done yet.
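Taking the 64k figure at face value, the per-queue cost works out to 64 bytes. The sketch below does that arithmetic and extrapolates to larger tables. It assumes "64k" means 64 KiB and that overhead scales linearly with the queue count, neither of which is stated in the post, and the names are illustrative.

```python
DEFAULT_QUEUES = 1024
DEFAULT_OVERHEAD_BYTES = 64 * 1024  # "64k", read here as 64 KiB (assumption)


def table_overhead_bytes(queues: int) -> int:
    """Estimated overhead for `queues` flow queues, assuming linear scaling."""
    per_queue = DEFAULT_OVERHEAD_BYTES // DEFAULT_QUEUES  # 64 bytes per queue
    return queues * per_queue


if __name__ == "__main__":
    for queues in (1024, 4096, 65536):
        print(queues, table_overhead_bytes(queues) // 1024, "KiB")
```

Under those assumptions even the 64k-queue maximum stays in the low megabytes, so the open question is cache behaviour rather than raw memory.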

http://perso.telecom-paristech.fr/~drossi/paper/rossi14comnet-b.pdf

To get more questions answered, please come by the bloat mailing list at bufferbloat.net.

eri--

It surely is not on my primary list, though it should not be difficult to implement.
Though it sounds like an improvement in general.

I really am convinced that any scheduler will have its deficiencies depending on the scenario in question.
Choice is good to have, but I am not sure when it can come out.

cplmayo

From what I have been reading, fq_codel would be nice to have as an option. I have been running HFSC on my box for a while, and playing with floating rules and aliases to try and segment traffic into the proper queues has been problematic, but I had a setup that worked very well. Today I saved my config, dumped all my traffic shaping rules and switched from HFSC with codel to straight codel to see if I could get similar benefits without all of the configuration.

Testing now to see how I like it, but fq_codel seems as though it would be a much better choice. The lack of "knobs" with codel makes it a lot less convoluted. In my home environment making rules for every use case I have is a pain.

Just my two cents.

Harvy66

HFSC and Codel are two different things with overlapping properties.

HFSC manages the bandwidth of a queue and manages the distribution of bandwidth among queues

Codel manages the congestion of a queue by dropping packets at an increasing rate once packets have been waiting in the queue for longer than 5ms
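The "increasingly periodically" part can be sketched as a small state machine: once delay has stayed above the target for a full interval, drop a packet, then schedule each following drop sooner, at interval / sqrt(count). This is a simplified illustration, not the reference CoDel code. The 100 ms interval is CoDel's commonly cited default and is not stated in this thread, the class and method names are made up here, and details such as remembering the drop count between dropping episodes are left out.

```python
import math

TARGET = 0.005    # 5 ms of acceptable queueing delay
INTERVAL = 0.100  # 100 ms window (assumed default, not from this thread)


class CoDelState:
    """Simplified CoDel drop decision for one queue (illustrative only)."""

    def __init__(self) -> None:
        self.first_above_time = None  # deadline for delay to fall below target
        self.dropping = False
        self.count = 0                # drops in the current dropping episode
        self.drop_next = 0.0

    def should_drop(self, sojourn: float, now: float) -> bool:
        """Decide at dequeue time whether a packet that waited `sojourn` seconds is dropped."""
        if sojourn < TARGET:
            # Delay is fine again: leave the dropping state.
            self.first_above_time = None
            self.dropping = False
            return False
        if self.first_above_time is None:
            # First packet above target: start the clock, do not drop yet.
            self.first_above_time = now + INTERVAL
            return False
        if not self.dropping:
            if now >= self.first_above_time:
                # Delay stayed above target for a whole interval.
                self.dropping = True
                self.count = 1
                self.drop_next = now + INTERVAL / math.sqrt(self.count)
                return True
            return False
        if now >= self.drop_next:
            # Still congested: drop again, and sooner each time.
            self.count += 1
            self.drop_next += INTERVAL / math.sqrt(self.count)
            return True
        return False
```

Feeding this a steady 20 ms sojourn time gives a first drop after about 100 ms, with the gaps between drops shrinking from there until the delay falls back under 5 ms.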

fq_Codel extends codel and breaks up traffic into buckets based on their hashes, tries to keep each bucket with roughly the same amount of back-log, and new buckets get priority. Since buckets disappear once all of their packets have been dequeued, low bandwidth flows tend to not have backlogs, so they effectively get prioritized so long as they stay low bandwidth. High bandwidth flows tend to get bandwidth evenly distributed.
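The bucket behaviour described above can be sketched roughly as follows: hash each flow into a bucket, serve newly active buckets before old ones, and give each bucket a byte quantum per round. This is an illustrative sketch only. The per-bucket CoDel step is omitted entirely, the class and method names are invented for this example, and the 1514-byte quantum is an assumed value (one full Ethernet frame).

```python
from collections import deque

QUANTUM = 1514  # byte credit per round; assumed value, not from this thread


class FlowQueueScheduler:
    """Sketch of flow-queue scheduling with new/old lists (no CoDel step)."""

    def __init__(self, buckets: int = 1024) -> None:
        self.buckets = buckets
        self.queues = {}          # bucket index -> deque of (flow_id, size)
        self.deficit = {}         # bucket index -> remaining byte credit
        self.new_flows = deque()  # buckets that just became active
        self.old_flows = deque()  # buckets that have used up a quantum

    def enqueue(self, flow_id, size: int) -> None:
        b = hash(flow_id) % self.buckets
        if b not in self.queues:
            # A bucket with no state counts as a new flow and gets priority.
            self.queues[b] = deque()
            self.deficit[b] = QUANTUM
            self.new_flows.append(b)
        self.queues[b].append((flow_id, size))

    def dequeue(self):
        while self.new_flows or self.old_flows:
            active = self.new_flows if self.new_flows else self.old_flows
            b = active[0]
            if self.deficit[b] <= 0:
                # Credit used up: refill and move to the back of the old list.
                self.deficit[b] += QUANTUM
                active.popleft()
                self.old_flows.append(b)
                continue
            queue = self.queues[b]
            if not queue:
                active.popleft()
                if active is self.new_flows and self.old_flows:
                    # An emptied new flow takes one pass through the old list.
                    self.old_flows.append(b)
                else:
                    # The bucket disappears once it has drained.
                    del self.queues[b], self.deficit[b]
                continue
            flow_id, size = queue.popleft()
            self.deficit[b] -= size
            return flow_id, size
        return None


if __name__ == "__main__":
    sched = FlowQueueScheduler()
    for _ in range(5):
        sched.enqueue(1, 1500)   # bulk flow with a backlog
    sched.enqueue(2, 100)        # sparse flow, e.g. a DNS query
    while (pkt := sched.dequeue()) is not None:
        print(pkt)
```

In the demo the sparse flow's single packet is sent ahead of most of the bulk backlog, which is the effective prioritisation of low-bandwidth flows. Integer flow IDs are used because Python randomizes string hashes per process; a real implementation would hash the packet's address and port tuple.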

codel/fq_codel still won't save you from P2P attempting to monopolize your bandwidth, but it will keep latency low.

switchman

Any feedback if the pfSense team plans on implementing fq_Codel?

Harvy66

They have shown interest in it, but it is not high priority. There is a lot going on that's keeping them busy.

Probably better off placing a bounty. Kickstarter!

Harvy66

It does a combination of "fairness" and latency-based rate-increasing head drop. It's a great combination of features that maintains low latency during high utilization.

Nullity

Has anyone checked whether FAIRQ can offer what you want/need? FAIRQ is DragonFlyBSD's implementation of the respected SFQ algorithm. FAIRQ is that plus priorities, link-sharing, and a "hogs" param that I still have not figured out yet.

Using FAIRQ and statically setting your queues to a limit of approximately what CoDel was allowing, could you not practically achieve FQ + tiny (CoDel) buffer?

I want fq_codel too, but… seems like mommy and daddy do not love us enough. :'(

mcwtim

Just in case those on this sub forum don't check out the bounty forum.

https://forum.pfsense.org/index.php?topic=90942.0

I've pledged to it, let's step up to get this done.

foonus

@mcwtim:

Just in case those on this sub forum don't check out the bounty forum.

https://forum.pfsense.org/index.php?topic=90942.0

I've pledged to it, let's step up to get this done.

Is this finally happening in 3.0? Haven't seen any updates.

Nullity

@mcwtim:

Just in case those on this sub forum don't check out the bounty forum.

https://forum.pfsense.org/index.php?topic=90942.0

I've pledged to it, let's step up to get this done.

I suppose you want it implemented in ALTQ rather than the impending limiters/dummynet implementation?
https://forum.pfsense.org/index.php?topic=100427.0

Looking at the source code I posted, we may already have virtually the same thing with FAIRQ (or HFSC) + CoDel. Regardless of the code differences, I have seen no performance comparison between proper fq_codel and our FAIRQ/HFSC + CoDel. With no information to go on, who is to say our setup is sub-par?

We need to do some testing and/or code review before throwing money at things.

Nullity

@Nullity:

…
We need to do some testing and/or code review before throwing money at things.

Actually, I will do some testing this weekend.

I figure I will install IPFire (fq_codel) and run a dozen dslreports throughput tests to test bufferbloat, perhaps run some manual ping tests while fully saturating the upload, then try the same tests with pfSense (FAIRQ/HFSC + CoDel). Maybe some subjective web-browsing tests during multi-stream & single-stream upload saturation… anyone know how to test web-browsing more objectively?
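One way to make the ping part of such a comparison less subjective is to reduce each run to a few numbers. The sketch below summarises round-trip times collected while idle and while the upload is saturated. It assumes the RTTs have already been gathered by hand into lists of milliseconds, and the function names and sample values are hypothetical.

```python
import statistics


def latency_summary(rtts_ms):
    """Median, 95th percentile and maximum of a list of RTT samples (ms)."""
    cuts = statistics.quantiles(rtts_ms, n=20)  # 19 cut points; cuts[18] is p95
    return {
        "median": statistics.median(rtts_ms),
        "p95": cuts[18],
        "max": max(rtts_ms),
    }


def bloat_ms(idle_rtts_ms, loaded_rtts_ms):
    """Extra median latency under load compared with the idle baseline."""
    return latency_summary(loaded_rtts_ms)["median"] - latency_summary(idle_rtts_ms)["median"]


if __name__ == "__main__":
    idle = [10, 11, 12, 11, 10]    # hypothetical samples, not measurements
    loaded = [50, 60, 55, 52, 58]  # hypothetical samples, not measurements
    print(latency_summary(idle), latency_summary(loaded), bloat_ms(idle, loaded))
```

Running the same summary for each shaper setup gives directly comparable figures for added latency under load.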

Any info about what I should test would be appreciated. :)

I doubt I will test download saturation as CoDel is not really meant for that (minimal buffering). Maybe though...