Playing with fq_codel in 2.4
-
@zwck https://forum.netgate.com/post/772130
I posted that earlier for reference; I believe I am still using that configuration.
-
@mattund Perfect, I'll give that a try. Thank you so much.
-
@mattund
No, I did not create any Redmine issues.
-
@mattund said in Playing with fq_codel in 2.4:
Now, given you have your shaping set to 900/900, maybe you need to increase the slot size/queue length? That's a pretty fast connection.
Hey Matti,
In the meantime I switched to a floating rule setup, and the rule pretty much catches everything, similar to what you described. I also played with bucket size and queue length, to no avail: when I download or upload, the scheduler mostly uses fewer than 15 buckets, and the default is 256 or so. I also doubled the limit to 20480 and raised flows to 2048; there was no difference. Then I read the fq_codel manual, and the standard values are meant to cover basically everything from 1 Mbit to 1000 Mbit. Maybe the drop I see is just what I have.
Thanks for your help.
-
I am getting these errors in 2.4.4
kernel config_aqm Unable to configure flowset, flowset busy!
Anyone else seeing these?
-
@mattund said in Playing with fq_codel in 2.4:
@slowgrind said in Playing with fq_codel in 2.4:
So after applying the patch do you just fill in the settings under limiters?
Here's what I'm doing. This might be a little more than what you need, but I figure I would share my configuration in case others have a crazy Multi-WAN multi-LAN setup like I do. I've constructed a series of limiters, one for download and one for upload, each with its own associated queue (you can make the queue with the "+ Add new Queue" button on the bottom of a Limiter's settings page) :
(I have more for my second ISP following that naming scheme: lINTERFACEDownload/lINTERFACEUpload and qINTERFACEDownload/qINTERFACEUpload children)
I'm assigning FQ_CoDel to the scheduler on the parent limiter and leaving everything else alone. You can either edit the parameters, or leave them at default if you have a typical connection (FQ_CoDel is supposed to be "knobless" after all).
According to the following diagram, this is how the traffic will flow inside dummynet:
                 (flow_mask|sched_mask)  sched_mask
         +---------+   weight Wx  +-------------+
         |         |->-[flow]-->--|             |-+
    -->--| QUEUE x |   ...        |             | |
         |         |->-[flow]-->--| SCHEDuler N | |
         +---------+              |             | |
             ...                  |             +--[LINK N]-->--
         +---------+   weight Wy  |             | +--[LINK N]-->--
         |         |->-[flow]-->--|             | |
    -->--| QUEUE y |   ...        |             | |
         |         |->-[flow]-->--|             | |
         +---------+              +-------------+ |
                                    +-------------+
via: https://www.freebsd.org/cgi/man.cgi?query=ipfw&manpath=FreeBSD+9-current&format=html
Dissection: firewall traffic is assigned to a queue, which then generates flows defined by the mask, which pipe into the scheduler (set to FQ_CoDel), which then outputs to the pipe/link at the specified max bitrate.
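As a rough illustration of that chain at the dummynet level, the sketch below builds one link, one FQ_CoDel scheduler, and one child queue with ipfw. The object numbers (1) and the 900Mbit/s rate are assumptions chosen for illustration. pfSense builds its own limiter configuration from the Limiters page, so this is only meant to show how the pieces relate, not something to run by hand on a pfSense box.

```shell
# Sketch only: object numbers and the bandwidth are illustrative assumptions.

# 1. The link (pipe): enforces the maximum bitrate.
ipfw pipe 1 config bw 900Mbit/s

# 2. The scheduler attached to that pipe, set to FQ_CoDel with default parameters.
ipfw sched 1 config pipe 1 type fq_codel

# 3. A child queue feeding the scheduler; firewall rules assign traffic to it.
ipfw queue 1 config sched 1
```

A second pipe/sched/queue set with different numbers would be needed for the other direction, matching the one-limiter-per-direction layout described above.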
To assign your traffic to queues, you could do something like I did, which is to use floating rules. I have two WANs, and I need independent shaping and all that, so if you're on a single WAN it may be different for you/you may have better options.
How I set the rules up:
- Interface: WAN A or B interface
- Direction: out
- Address Family: IPv4 or IPv6; I had to do two rules, one for each IP version
- Gateway: Select the applicable IPv4 or IPv6 gateway consistent with how traffic should be routed on that IP stack
- In / Out pipe: qCHARTERUpload / qCHARTERDownload
I have some filtering rules in play here as you can see in my screenshot, but that's only since I'm testing some issues I mentioned previously. It's up to you if you want to match certain protocols/ports, etc.
Anybody know why, when I enable floating rules, both upload and download speeds get cut in half? As soon as I disable them, speeds are back to normal.
As far as I can see, both download and upload traffic are getting matched by the interface rules anyway, so what is the floating rule for?
-
Wrong numbers in the limiters? Maybe post some images so people can help you.
-
@mattund I have a question here, when you define your floating rule it states in the description the following:
"If creating a floating rule, if the direction is In then the same rules apply, if the direction is Out the selections are reversed, Out is for incoming and In is for outgoing." Are you doing this or not?
-
wow
-
I am delighted to see all the enthusiasm for fq_codel. For theoretical discussions about how it works please visit the cake or codel mailing lists at lists.bufferbloat.net.
I've been meaning to code review the bsd implementation for a while, I know it has a few limitations and differences from the linux version. In reviewing this thread and all its comments just now I have a few comments. Note I'm primarily an emailer, not a web forum person, but I'll try to pay some attention here, now that I know this thread exists, while you sort out teething pains and new bugs.
- Do try the simplest possible config first: one shaper + fq_codel. I've generally found that this eliminates the former need for a lot of rules. sqm-scripts has a few of the common rules (like deprioritizing ping) we use in the openwrt world.
- Does this OS allow for compensating for frame size and spacing (as in dsl/cable/etc)? Otherwise, if you try to get close to the line rate, you get bitten by that. A lot of our motivation for fq_codel and now cake was driven by https://www.bufferbloat.net/projects/bloat/wiki/Wondershaper_Must_Die/ which references the framing problem. And if you are going to shape, shape everything, including ping.
- Your NIC shouldn't matter.
- I don't know if the bulk-dropper in the linux version of fq_codel is in the bsd version. It helps on extreme overloads.
- It does sound like there is a memory reallocation bug in this version on a reconfigure?
- Anyone up for a sch_cake port? It's SQM on steroids. https://arxiv.org/pdf/1804.07617
I think that's most of my takeaway from this thread. Another plug: we created the flent.org tool and use it a lot to look hard at all sorts of networking problems. You'll find a lot of "rrul" tests in particular, and I'd love to see some of those against your various configurations.
Happy debloating!
--
dave taht
co-founder bufferbloat project
-
I also note that fq_pie was done by the same great folk that did fq_codel, and it benched out pretty good. Even though fq_codel is sort of my adopted baby, I'm pretty agnostic - I just want to beat bufferbloat across the entire internet before I kick the bucket. From what I could see of fq_pie - it looked good also! and I'd really like it if more people also gave it a shot on real world traffic and reported back. thx!
--
dave
-
@dtaht et al. thank you!
-
@dtaht is the man!
-
oh, vey. I'm not "the man". There's a ton of us. (including y'all now!). I'm merely often the most visible and I'm only that because Jim Gettys (who coined the term bufferbloat and convinced me it was such a serious problem that I've now spent 7 years of my life on it) is not much of a public speaker, and ESR told me I only had to be the public face for 5 years before the meme was established and I could retire. And further Vint Cerf, ESR and Jim ganged up on me and told me I had to produce a viable existence proof in one cheap embedded router (cerowrt) before enough people would believe us.
I (we!) kind of owe my career to those guys, so I did that... but cerowrt really burned me out on embedded development. I've been trying to cut way back on "being out there", but finally seeing some major deployment is both very exciting - not just here and on freebsd, but the fq_codel for wifi stuff we did seemingly is spreading like wildfire ( https://arxiv.org/pdf/1703.00064.pdf ) ...
and having unleashed a technology that can make the internet so much better, and after recommending full deployment of fq_codel in 2013 (https://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/ ) , it's also terrifying... I've spent a lot of time since trying to figure out new problems we've created ( https://github.com/tohojo/flent/issues/148 ) , and in particular ecn and rabid tcp advocates are the most problematic ( https://github.com/systemd/systemd/issues/9725 )
I try always to give full credit to those that created the codel theory (Kathie Nichols & Van Jacobson) and to Eric Dumazet (fq_codel). My portion of the credit was insisting over and over again, with data and proof - to some of the smartest people I've ever met - that AQM was not enough (VJ) and FQ was not enough (Luca Muscariello), and a few theory and implementation fixes along the way with that analysis intact...
We owe an enormous debt to Tom Herbert, who created BQL, and actually made 2 decades of dusty aqm/fq research actually work again in spite of some horrible ongoing practices in the academic community ( some history here: http://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf )
Toke Høiland-Jørgensen barged into my hotel room one day and insisted he wanted to make bufferbloat the subject of his PhD thesis, and he has been a truly fundamental driver of many of the improvements that have landed in wifi. He runs a great deal of the bufferbloat project now in his spare time. Toke's been the greatest writing partner I've ever had. And he also wrote the wonderful flent.org tool we've used to be able to analyze networks 100s of times faster than other researchers could. (do check it out!)
Pete Heist wrote irtt - the first good one way delay measurement tool I've ever used ( https://github.com/heistp/irtt ). RTT measurements suck if you can't figure out which side of the path is messed up.
There's people like Sebastian Moeller and Kevin Darbyshire-Bryant that actively maintain and educate those using the sqm-scripts for linux and openwrt and have been really fundamental in finding and fixing obscure framing bugs. John, Jow, and Felix of lede - who were utterly willing to take semi-broken code and test it out in openwrt first.
Ryan Mounce showed up out of the blue with a viable ack-filter for sch_cake (http://blog.cerowrt.org/post/ack_filtering/)
Somewhere along the way the whole project had to take on the fcc to keep open source wifi legal (https://www.computerworld.com/article/2993112/security/vint-cerf-and-260-experts-give-fcc-a-plan-to-secure-wi-fi-routers.html ) long enough for us to finish the fq_codel for wifi theory and implementation - and that was fixed by a random guy (Michal Kazior) showing up with the solution 2 years into us failing to come up with an answer. Given how hard that was elsewhere (example: http://blog.cerowrt.org/post/crypto_fq_bug/ ) I still get kind of teary when I look at the first plot of wifi behaving sanely at all mcs rates. And tearier still with the hundreds of signers of the document we collected that kept our wifi work legal, long enough.
Another random guy discovered he had bufferbloat, fixed it for himself, and was sufficiently POed to go on to write the dslreports bufferbloat tests ( http://www.dslreports.com/speedtest/results/bufferbloat?up ) from which we now have an accurate picture of the size of the problem worldwide. I periodically hit reload on that url, and over the last 6 months the number of those on the left hand side of the 30ms green line has grown markedly - now including all of you! (It's not just the fq_codel deployment but growing upload speeds)
The ietf standardization process for rfc8290 was brutal - not something I'll ever go through willingly again. But that crowd was helpful in its own way.
Corporate support here and there has been helpful too - Jason Livingood over at comcast funded some of the research, and the cablelabs study was extremely helpful. NLNet helped, so did google. But the kind of work we did would have been impossible in a modern corporate environment. Patent attorneys would have been all over us. I'm deeply grateful for the backing of OIN because attorneys scare me.
Jon Corbet of lwn and Bob Cringely were incredibly helpful. A multitude of other journalists.
Dave Reed - the creator of UDP - has been deeply involved.
John Nagle (author of RFC970 which we cite a lot) has popped out of retirement and writes pithy stuff on reddit.
And then there were other random folk (like, I think @Harvy66) that picked up on the ubnt fq_codel implementation and ran with it until it was so compellingly polished ubnt incorporated it in part of their standard product...
And Grenville Armitage's team that did the fq_codel and fq_pie implementation (http://caia.swin.edu.au/cv/jkua/preprint/jkua-icccn2017-chunklets-preprint-10may17.pdf ) that you are using....
They also did some great work tearing apart why most ethernet over powerline devices sucked so badly. I've run out of energy for cites....
Nowhere else in my life have I seen the power and proof of open source, and of annoyed engineers scratching an itch, the way I have in the bufferbloat project. And nearly everyone has contributed what they could in their spare time.
And now there's y'all. :)
My hope is that those that successfully deploy fq_codel for themselves will:
A) fix it for at least two friends, and ask them to fix it for two friends each. Fix a local coffee shop, help fix a local business
B) Push their ISPs to supply gear that works right out the box
It's great that we can fix it with another box, but a better default experience for everyone would be better. Another couple years maybe we won't all have to do inbound shaping.
C) Push chipmakers to add BQL and fq_codel to their DSL/cablemodems/other devices...
D) Worry, as I do... about what else is broken on the Internet.
This got kind of long, and no doubt I missed some people, so I'll turn it into a blog entry at some point. I'm not "the man". y'all are.
Happy debloating!
PS I'll try to get around to that code review in the next month or so.
PPS I note, in passing, while karmically rewarding - that this gig doesn't pay very well or often, and I just burned a few months of my life helping finish sch_cake (principal author Jonathan Morton and see: https://www.bufferbloat.net/projects/codel/wiki/CakeTechnical/ ) and another openwrt release... and I keep a tip jar here: https://www.patreon.com/dtaht - it doesn't even cover the cost of keeping the servers lit up presently.
I'd rather like to spend a bit of time on webrtc and on ecn in the next year, if I can.
-
I'm eagerly awaiting DOCSIS 3.1 and PIE.
-
well, if you get used to fq_codel you will find docsis-pie disappointing. However as an out of the box default for improving the uplink it is way better than current cablemodems behave.
Sadly, nothing has been done (that I know of) to kill the overbuffered cmts side of things, where the bulk of the bad experience lies. We're going to need inbound shaping for a decade more yet.
-
What is the consensus on using the "Weight" option with multiple child queues under each limiter to set priority?
i.e.
downloadHigh -> Weight = 60
downloadDefault -> Weight = 30
downloadLow -> Weight = 10
uploadHigh -> Weight = 60
uploadDefault -> Weight = 30
uploadLow -> Weight = 10
-
Hi @bjsmith - I have been using a setup similar to this for some time to guarantee bandwidth across different network segments. From the basic testing that I have done, it seems to be working fine.
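To make the 60/30/10 weights from the question concrete, here is a small sketch of the arithmetic, assuming the scheduler divides the link in proportion to weight when every child queue is busy (as a weighted fair scheduler would). Whether FQ_CoDel itself honours the weights is not established in this thread. The helper name weight_share and the 900 Mbit/s limiter rate are assumptions for illustration.

```shell
# weight_share LINK_RATE WEIGHT TOTAL_WEIGHT
# Prints the bandwidth a queue would get (same unit as LINK_RATE) when every
# queue is busy and the link is divided in proportion to weight.
weight_share() {
    echo $(( $1 * $2 / $3 ))
}

# Example with an assumed 900 Mbit/s limiter and the 60/30/10 weights:
for w in 60 30 10; do
    echo "weight $w -> $(weight_share 900 "$w" 100) Mbit/s"
done
```

When a queue is idle, a proportional scheduler hands its share to the busy ones, so these figures are floors under load rather than caps.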
Hi @dtaht - it's great to see you joining the discussion! I have read quite a few of your posts and writings, especially over at bufferbloat.net, which has been an invaluable resource. Thank you so much for all your contributions to this effort; it truly is appreciated.
I first ran into the bufferbloat problem a couple years ago when I had a 150/150 symmetric fiber connection and saw that the upload developed bloat while running the DSL Reports speed test (which at the time actually really surprised me). After a lot of research and trial and error, I initially addressed the issue with the ALTQ FAIRQ scheduler and Codel AQM in pfSense, but eventually just switched to fq_codel as support was added in later pfSense versions.
I still use fq_codel today although on a now faster 1Gbit symmetric fiber connection - even at high WAN speeds bufferbloat can still become an issue. These days my LAN also runs at 10Gbit and feeds into a slower 1Gbit WAN link - using fq_codel keeps everything well behaved. More recently I discovered the Google developed TCP BBR congestion control algorithm. From the basic testing that I have done, the algorithm works great alongside fq_codel, and has shown significant improvements on the upload of my WAN connection, especially over longer distances.
Flent has been an invaluable network analysis tool while performing these experiments and has become my go-to application for testing networking performance after making changes/optimizations/tweaks etc. I have used it for testing both WAN performance as well as LAN performance at 10Gbit across the pfSense firewall (for instance, with two TCP BBR enabled 10Gbit Linux hosts it has resulted in some really pretty charts).
I won't lie - all this optimization and tweaking can get a bit addictive, but having a reliable and stable connection makes it well worth it.
-
I'm glad that our efforts have been so useful for you! Going back to my longer post above, I ended up adding 50+ people to my credits list that will end up as a blog entry. I would rather like to fix about 2B routers and the only way to do that is to encourage those that get the benefits of fixing bufferbloat to help get rid of it for others. There's more than a few business models for that... but if you could tell me you'd also gone and fixed it for a friend, upgraded your local coffee shop, helped a non-profit or small business out of the bloat-hole, it would cheer me up more. Doing a preso for a user group, nagging your ISP, are also things worth doing. Posting more before/after results from flent, etc.
As usual, I'm teetering on the edge of burnout. I promised a few folk that I'd stick with this cause until deployment was underway with correct code, and we're very close to that now. I look forward to one day leveraging all this low latency stuff as a default with applications that can take advantage of it - imagine quality 48kHz stereo voice running at 2.7ms sample rates for example, or VR that didn't make your head spin because it runs at 4ms (240 frames/sec).
-
@tman222 - also, I'd like more folk to start posting "debloat events/day". Codel kicks in rarely, but I'm pretty sure every time it does it saves on a bit of emotional upset and jitter for the user. For example I get about 3000 drops/ecn marks a day on my inbound campus link (about 12000 on the wifi links), and outbound a 100 or so. Every one of those comforts me 'cause I feel like I'm saving a ~500ms latency incursion for all the users of the link.
I'm kind of curious how many drops/marks a day your 10gigE/1gig link has.
-
@dtaht said in Playing with fq_codel in 2.4:
@tman222 - also, I'd like more folk to start posting "debloat events/day". Codel kicks in rarely, but I'm pretty sure every time it does it saves on a bit of emotional upset and jitter for the user. For example I get about 3000 drops/ecn marks a day on my inbound campus link (about 12000 on the wifi links), and outbound a 100 or so. Every one of those comforts me 'cause I feel like I'm saving a ~500ms latency incursion for all the users of the link.
I'm kind of curious how many drops/marks a day your 10gigE/1gig link has.
Thanks @dtaht - I'd be happy to share that info. What's the best way to collect that data under FreeBSD? I'm assuming you are referring to the fq_codel packet drops over a 24hr period instead of interface drops. Thanks again.
-
Heh. I don't know. I tried to fire up bsd once recently and found it entirely alien.
In linux it's
tc -s qdisc show dev whatever
They also show up as interface drops, so I use mrtg to track it. I don't have ecn marks though, in my snmp mib. That's growing important.
If you figure it out (or it's a missing facility) let us know!
-
I note also flent can collect a lot of stats along the way on its tests - cpu usage, qdisc stuff, etc - but we have not a lot of bsd support in there. see flent/flent/tests/*.inc
-
Another test in flent I've started using more of late is the "square wave" tests. These "speak" to those with more of an EE background. There's one that directly compares bbr and cubic in particular. Most people find the rrul test overwhelming at first. I used it on a recent preso to broadcom: http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf
-
@dtaht said in Playing with fq_codel in 2.4:
Heh. I don't know. I tried to fire up bsd once recently and found it entirely alien.
In linux it's
tc -s qdisc show dev whatever
They also show up as interface drops, so I use mrtg to track it. I don't have ecn marks though, in my snmp mib. That's growing important.
If you figure it out (or it's a missing facility) let us know!
Thanks @dtaht - I do know that one can see the statistics of fq_codel with the "ipfw sched show" command, but I'm not sure how to go about logging that for a 24 hour period. If anyone else here has an idea, please do let me know. Maybe the logging functionality is something that still needs to be added?
@dtaht said in Playing with fq_codel in 2.4:
I note also flent can collect a lot of stats along the way on its tests - cpu usage, qdisc stuff, etc - but we have not a lot of bsd support in there. see flent/flent/tests/*.inc
So I do run the Flent application through Linux (Debian Stretch) and I see those different test options through the front UI. The only thing I can't seem to figure out is how to add parameters to the test through the UI -- maybe to do that I have to run from the command line.
@dtaht said in Playing with fq_codel in 2.4:
Another test in flent I've started using more of late is the "square wave" tests. These "speak" to those with more of an EE background. There's one that directly compares bbr and cubic in particular. Most people find the rrul test overwhelming at first. I used it on a recent preso to broadcom: http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf
Thanks @dtaht for this additional info. I didn't quite know which test you meant before I saw the presentation :). I ran the TCP 4 Up Squarewave test as well and got some interesting results. Using fq_codel on the WAN link and netperf.bufferbloat.net as the testing server on the other side, it doesn't look nearly as nice as what you have in your slides (in fact it looks closer to what you had on slide 36). It's somewhat similar on a local 10Gbit to 10Gbit test (although no fq_codel is enabled there). I had to turn the IDS/IPS off on the interfaces, and then the flows started to behave better, but still not quite as nicely as what you had. Having said that, I think my results might be influenced by the fact that I already have TCP BBR enabled on my Linux machines, which requires the FQ scheduler to be enabled as well via tc qdisc. Do you think I should turn that off and rerun with the cubic defaults?
Thanks again for all your insight and advice.
-
re: "ipfw sched show" command, but I'm not sure how to go about logging that for a 24 hour period.
cron job for (date; ipfw sched show) >> debloat_events
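A minimal sketch of that cron idea follows. The helper name snapshot, the script path, the log path, and the hourly schedule are all assumptions for illustration; only the "ipfw sched show" command comes from the thread.

```shell
# snapshot CMD [ARGS...]
# Prints a timestamp followed by the output of the given statistics command.
snapshot() {
    date
    "$@"
}

# Intended use from cron (path and schedule are assumptions), e.g. hourly:
#   0 * * * * /root/debloat_log.sh
# where debloat_log.sh defines snapshot and then runs:
#   snapshot ipfw sched show >> /var/log/debloat_events
```

Taking the command as arguments keeps the helper usable with other statistics commands, such as the Linux "tc -s qdisc show dev whatever" mentioned earlier.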
re: additional stats. I usually script those but there is a way to stick additional stats in the default .rc file I think.
--te=cpu_stats=machine,machine --te=qdisc_stats=login@machine,... --te=qdisc_interfaces=etc
re "nearly as nice". One of my fears is of bias without actual testing - "I made this change and the network feels better, ship it!" - where people often overestimate the network's bandwidth, or fiddle with bbr, etc and don't measure. So my other favorite flent feature is that you can upload your .flent.gz files somewhere and have an expert take a look (hint hint). Flent by default stores very little metadata about your machine so I'd hoped people would be comfortable sharing its data more widely, unlike, for example, packet captures. (note the -x option grabs a lot more).
I am under the impression pacing works universally on modern kernels but have not tested it - particularly not under a vm. That's another fav flent feature, go change out sch_fq for fq_codel and you can easily compare the two plot changes over that single variable.
I love flent. It's the tool that let bufferbloat.net pull so massively ahead of other researchers in the field.
-
It's probable your graph looks like pg 36 'cause that box is on the east coast? we have flent servers all over the world - flent-fremont.bufferbloat.net on the west coast, one in germany, denmark, and london, several on the east coast, one in singapore....
-
@dtaht said in Playing with fq_codel in 2.4:
It's probable your graph looks like pg 36 'cause that box is on the east coast? we have flent servers all over the world - flent-fremont.bufferbloat.net on the west coast, one in germany, denmark, and london, several on the east coast, one in singapore....
Thanks @dtaht - I think that box (netperf.bufferbloat.net) might be located near Atlanta judging by a traceroute to it. Is the list of all Flent servers available publicly somewhere? Or are those testing servers only available to the contributors of the project?
Thanks again.
-
we probably should publish it because it doesn't vary much any more. We used to have 15, now, it's 7? 6? So far we haven't any major scaling issues.
Unlike speedtest.net we don't have a revenue model, and the hope was we'd see folk setting up their own internal servers for testing. It's just netperf and irtt.
-
@dtaht
I really respect the work done by you and the people fighting bufferbloat all over the world.
As for me, I use AQM everywhere possible to eliminate bufferbloat from my networks. One problem still to be investigated is networks where we can't detect the bottleneck size because the bandwidth limit varies a lot over the day. I know there are a lot of such ISP networks in Japan, for example.
Yes, we can set the bandwidth limit to the minimum, but that is not smart enough; it would be good to detect the current bandwidth automatically and adjust everything. I do know that there are some software and router script samples on the net that do this in various ways, but it still needs to be studied and developed on many systems, including pfSense.
-
@w0w - and you. keep fighting the bloat!
My sadness is that I'd hoped we'd see all the core bufferbloat-fighting tools in DSL/fiber/cablemodems by now. BQL + FQ-codel work well on varying line rates, so long as your underlying driver is "tight". We also showed how to do up wifi right. Also thought we'd see ways (snmp? screen scraping?) of getting more core stats out of more devices, so as to be able to handle sag better.
with sch_cake we made modifying the bandwidth shaping nondestructive and really easy (can pfsense's shaper be changed in place?) - it's essentially
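# getlinestatssomehow is a placeholder for whatever reports the current line rate
while newbandwidth=$(getlinestatssomehow)
do
    tc qdisc change dev yourdevice root cake bandwidth "$newbandwidth"
done
So far the only deployment of that feature is in the evenroute, but I'm pretty sure at least a few devices can be screenscraped.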
As for detecting issues further upstream, there are ways appearing, but they need to be sampling streams to work (see various tools of kathie's: https://github.com/pollere )
-
Hi @dtaht,
I have a basic question regarding bufferbloat that I never quite understood. I can understand how there can be bufferbloat on the uplink of a WAN connection (e.g. 1Gbit LAN interfaces sending into a 10Mbit cable modem upload). However, what causes bufferbloat on the download, since generally the interface the data is being sent into is larger than the download speed on the WAN interface (e.g. a 250Mbit cable modem download speed sent into a 1Gbit LAN interface)?
Thanks in advance for the insight and explanation, I really appreciate it.
-
The buffering in that case builds at the CMTS (on the other side of the cablemodem). CMTS's tend to have outrageous amounts of FIFO buffering (680ms on my 120Mbit comcast connection), so, if you set your inbound shaper to less than their setting, you shift the bottleneck to your device and can control it better. It's not always effective (you can end up in a pathological situation where the CMTS is buffering madly as fast as you are trying to regain control of the link), but setting up an inbound shaper to 85% or so of your provisioned rate generally works, and you end up with zero inbound buffering for sparse packets, and 5-20ms for bigger flows, locally.
Does that work for you? (It's still an open question for me as to how netgate does inbound shaping).
It's horribly compute intensive to do it this way, but since we've been after the cablecos for 7+ years now to fix their CMTSes with no progress, shaping inbound is a necessity. In my networks, I drop 30x more packets inbound than outbound but my links stay usable for tons of users, web PLTs are good, voip and videoconferencing "just work", netflix automagically finds the right rate... etc.
That work for you? The buffering comes from bad shapers on the far side of the link. It's not just CMTSes that are awful. DSL is often horrific. I'm now seeing some 1G GPON networks with several seconds of downlink buffering. I guess they didn't get the memo.
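For a sense of scale, the two figures in this post can be worked through with a bit of shell arithmetic. This is only a sketch: the helper names shaper_rate and buffer_bytes are made up for illustration, rates are taken in kbit/s, and 120Mbit is treated as 120000 kbit/s.

```shell
# shaper_rate PROVISIONED_KBIT [PERCENT]
# Prints the inbound shaper setting in kbit/s; PERCENT defaults to 85.
shaper_rate() {
    echo $(( $1 * ${2:-85} / 100 ))
}

# buffer_bytes RATE_KBIT DELAY_MS
# Prints how many bytes sit in a FIFO that adds DELAY_MS of delay at RATE_KBIT.
# (kbit/s times ms gives bits, so dividing by 8 gives bytes.)
buffer_bytes() {
    echo $(( $1 * $2 / 8 ))
}

echo "shaper for a 120Mbit line: $(shaper_rate 120000) kbit/s"
echo "bytes queued at 680ms:     $(buffer_bytes 120000 680)"
```

Under these assumptions, 85% of a 120Mbit line is 102000 kbit/s, and 680ms of FIFO at that line rate corresponds to roughly 10 MB of queued data.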
-
I actually kind of wish I hadn't stopped work on "bobbie", a better policer. fq_codel is far too gentle and has the wrong goal for inbound shaping. Yes - it works better than anything we've ever tried, but a better policer would have zero delay for all packets at a similar cost in bandwidth and far less cpu. I think. Haven't got around to it. (basically you substitute achieving a rate in a codel like algorithm instead of a target delay). Tis research for someone else to do, I'm pretty broke after helping get sch_cake out the door.
-
Over here ( https://github.com/dtaht/fq_codel_fast ) I'm trying to speed up fq_codel a bit, and add an integrated multi-cpu inbound shaper.
-
@dtaht - thanks for the response - that makes a lot of sense.
Even though I used a cable modem in the example in my previous post, the question was really about GPON. I suppose what ends up happening is that there are buffers at the GPON card (hardware) for both upload and download. If there is too much data being pushed into the link from upstream servers (i.e. a bunch of people downloading), the buffers start to fill up and packet delay (bufferbloat) occurs for downstream users. Since GPON bandwidth is generally shared among several users, I suppose the severity can vary depending on the number of users, their usage patterns, and level of congestion. Furthermore, I suppose it's likely easier to experience bufferbloat in the uplink direction since GPON is asymmetric (2.4Gbit down / 1.2Gbit up).
In my personal experience with a gigabit GPON link I have been fortunate: I can set inbound/outbound shaping at >95% of max bandwidth and still not experience any significant delay/bloat.
-
Yes, the GPON folk did not think hard about buffering, and it won't be much of a problem in their early deployments until they start oversubscribing more links. This is a flaw repeated time after time in this industry - 3g grew to suck, 4g was "better", 5g is going to fix 4g... (and 2g, 'cause so many have exited the band now, can be surprisingly good nowadays).
Recently I gave a presentation to broadcom ( http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf ) Another one of my hopes in the bufferbloat project was that someone would solve the over-subscription problem up front for a change - and I thought we'd have a chance with GPON and gfiber, but the team I was on got dissolved about 9 months from being able to deploy.
Sigh:
http://www.dslreports.com/speedtest/results/isp/r3910-google-fiber
-
I'm going to have some serious issues attempting to measure the reduction of bufferbloat from fq_codel.
This is with shaping disabled. Dear lord, what is my ISP doing? I love fiddling and they're taking that away from me. I just realized I forgot to disable BitTorrent. Explains my low upload speed.
-
I just updated from 2.4.3 to the 2.4.4 release candidate and am keen to try out fq_codel. I haven't read all 434 (!) posts of this thread, but can anyone kindly point me to a basic beginner's guide or instructions to setting it up for a simple home connection?