Netgate Discussion Forum

    Sharing my recent discovery on shaping downstream with limiters.

    Traffic Shaping
    11 Posts 2 Posters 1.3k Views
    • C
      chrcoluk
      last edited by chrcoluk

      Most people tend to only worry about egress, and their priority there is bufferbloat.

      However, for several years now I have been fighting a problem whereby, when I am downloading at anywhere near my line rate (especially multi-threaded downloads), all other traffic suffers from packet loss while the download is running: a sticky cursor on the SSH command line, Twitch buffering, and so on.

      With limiters, I could only solve it by using a very aggressively small virtual pipe (around 60% of line speed or lower, i.e. 40% headroom), which is obviously hugely wasteful of bandwidth.

      The issues with shaping ingress are well known: the sender controls the flow of traffic (the congestion window), and any artificial changes take time for the sender to respond to, so you will also suffer a large burst at the start of each flow.

      A lot of services are designed to be resistant to packet loss, so as to get around congestion and lossy technologies, as well as to fill out very fat pipes, e.g. gigabit FTTP services. If you have a lossless small pipe, e.g. DSL, this can flood the connection with inflated congestion window sizes.

      I had another read of the dummynet documentation in 'man ipfw' and decided to try some new things.

      The first thing I tried was to stop using the scheduler and queues and just send traffic straight to the pipe (called a limiter in the pfSense UI). This actually yielded a very noticeable improvement. I switched back and forth in the pf rules between routing to a queue and routing to a pipe, and the difference was consistently noticeable. Note, however, that if you use a pipe instead of a queue for downstream you also have to do the same on upstream; you cannot mix a pipe with a queue on a rule.
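
      For anyone who wants to see the distinction outside the GUI, this is roughly what the two setups look like in dummynet terms (dnctl on recent pfSense, 'ipfw pipe/queue ...' on systems that manage dummynet through ipfw; the 70Mbit/s figure is only a placeholder, not my line rate):

          # pipe only: the firewall rule points straight at the pipe (the "limiter" in the GUI)
          dnctl pipe 1 config bw 70Mbit/s

          # pipe + child queue: the firewall rule points at the queue, which feeds the
          # scheduler sitting on the same pipe
          dnctl queue 1 config pipe 1 weight 50

      The first form corresponds to "sending traffic straight to the pipe" above; the second is the queue/scheduler path.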

      The next idea I had was to rate limit the acks on the upstream, something I had never considered before. I fine-tuned the upstream pipe capacity until it was just enough to allow a downstream flow at about 5Mbit below my line capacity (to match what I had set the ingress pipe to). The result was amazing: the instant growth to a massive congestion window was finally tamed, getting rid of the burst at the start of every new flow. I have also played with artificial delay, artificial packet loss and queue length adjustments.
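
      To sketch the shape of it (the bandwidth numbers here are placeholders rather than my exact figures): the ingress pipe sits a little below the line rate, and a second, much smaller egress pipe carries only the upstream of the bulk downloads, which is mostly acks, tuned until the download settles just under the ingress pipe:

          # downstream pipe, a few Mbit below the measured line rate
          dnctl pipe 4 config bw 60Mbit/s

          # upstream pipe for the bulk-download acks, capacity found by trial and error
          dnctl pipe 5 config bw 800Kbit/s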

      When adding forced packet loss to a downstream pipe, I found that if the sender was using the bbr congestion algorithm it was very resistant to it. Even with 10% packet loss the flow, after a momentary struggle, was able to maximise the pipe capacity and still cause packet loss connection-wide. However, the cubic algorithm was murdered by it, as expected (bbr leans more on delay to adjust its window). For reference, my testing was all done against cubic and bbr, which I expect covers most of the service providers on the internet: cubic being the default in Linux and Windows, bbr used by some of the big players including Google.

      I was surprised that it was quite difficult to slow down bbr with artificial delay. I think the reason might be that it looks for a sudden increase in delay rather than a delay that is always high. Adding 50ms of delay didn't do a lot other than slow the growth a little, but around the 150ms mark it suddenly had a big effect, preventing the window from getting big enough to exceed about 20% pipe utilisation.

      I observed that decreasing the queue size to a very low number, e.g. 20 or less, made both bbr and especially cubic struggle; bbr was still quite resistant, while cubic struggled to get above about 80-90% utilisation.
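
      For reference, these are the dummynet knobs I was experimenting with (syntax per 'man ipfw'; the bandwidth is a placeholder):

          # artificial packet loss: plr is a drop probability between 0 and 1 (0.1 = 10%)
          dnctl pipe 4 config bw 60Mbit/s plr 0.1

          # artificial delay, in milliseconds
          dnctl pipe 4 config bw 60Mbit/s delay 150

          # queue length, in slots (packets)
          dnctl pipe 4 config bw 60Mbit/s queue 20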

      Back to ack shaping: when I had a low queue size on the egress ack pipe it did slow down both algorithms, but the traffic was spiky, e.g. it would download a large burst, then nothing, then a large burst again, and the acks were about 80% above the pipe size. I found that making the queue size quite large, e.g. 2000, made the pipe enforce its size much better and the speed became constant instead of bursty, basically the same effect as having a small RWIN.

      The config I have currently settled on for the bulk download egress pipe is 'pipe 5 config bw 800Kb queue 2000 delay 20 droptail', which is really, really effective; amazing at taming downstream.

      Steam, Blizzard, FTP, consoles, the Epic store, Origin etc. will use this egress pipe along with the capped ingress pipe. The egress is a basic droptail and the ingress is also droptail, although I might try pie or codel later.
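
      If I do, dummynet lets you swap the AQM on a pipe without touching the firewall rules, along these lines (again per 'man ipfw', with the AQM options left at their defaults):

          # codel instead of plain droptail on the ack pipe
          dnctl pipe 5 config bw 800Kbit/s queue 2000 delay 20 codel

          # or pie
          dnctl pipe 5 config bw 800Kbit/s queue 2000 delay 20 pie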

      I have kept interactive traffic and cloud drive services on the fq_codel queues/schedulers I had in place before.

      Hope this is of interest to some people.

      pfSense CE 2.7.2

      • T
        thiasaef @chrcoluk
        last edited by thiasaef

        @chrcoluk, why reinvent the wheel? https://www.bufferbloat.net/projects/codel/wiki/Cake/#inbound-configuration-under-linux

        It is very unlikely that you will have to sacrifice 40% of your network bandwidth to get fq_codel to work really well.

        • C
          chrcoluk @thiasaef
          last edited by chrcoluk

          @thiasaef said in Sharing my recent discovery on shaping downstream with limiters.:

          @chrcoluk, why reinvent the wheel? https://www.bufferbloat.net/projects/codel/wiki/Cake/#inbound-configuration-under-linux

          It is very unlikely that you will have to sacrifice 40% of your network bandwidth to get fq_codel to work really well.

          Hi, it is what it is and that's what I have to do (40% is required on multi-threaded downloads like Steam). I don't think it's an fq_codel-specific issue though; I have the ingress issue on all schedulers when using queues and schedulers on ingress.

          What you said is a bit like saying to someone with a red orange, "it's very unlikely you will have a red orange". :)

          So note: (a) he is doing this on a 920Mbit pipe, (b) he is using cake, (c) on a different stack, and (d) his priority was bufferbloat; mine is making sure my packets arrive.

          You cannot apply a one size fits all to this sort of thing.

          pfSense CE 2.7.2

          • T
            thiasaef @chrcoluk
            last edited by thiasaef

            @chrcoluk said in Sharing my recent discovery on shaping downstream with limiters.:

            40% is required on multi-threaded downloads like Steam

            Which I simply do not believe without further details on the exact setup (ISP, Connection Type and Speed, ...).

            What you said is a bit like saying to someone with a red orange, "it's very unlikely you will have a red orange". :)

            No, I'm saying that it's unlikely that you need a sledgehammer to drive a nail into the wall. ;)

            • C
              chrcoluk @thiasaef
              last edited by chrcoluk

              @thiasaef I am not asking you to believe it; I have posted something here which some people may or may not be interested in, and if you think I am posting FUD just move on. (The questions you asked are also answered in my post.)

              pfSense CE 2.7.2

              • T
                thiasaef @chrcoluk
                last edited by thiasaef

                @chrcoluk, I'm not saying you are spreading FUD; I'm saying that it is very likely that there is something fundamentally wrong with your setup (unless you are on a 2 Mbit/s LTE connection, which is probably not the case).

                Example: https://www.diva-portal.org/smash/get/diva2:854117/FULLTEXT01.pdf#page=4 (at most ~ 10 % loss of throughput and basically zero packet loss even on very low speed links).

                The questions you asked are also answered in my post

                They're not.

                • C
                  chrcoluk @thiasaef
                  last edited by chrcoluk

                  @thiasaef Why do you keep linking me to research documents?

                  I will be asking the mods to delete my post, because you have somehow become obsessed that my situation doesn't fit your ideology and the thread is already polluted.

                  The problem of modern congestion algorithms flooding downstream pipes is already well known in the UK ISP industry (this was explained in my OP, which, given every single reply from you, you have clearly either not read or decided to disregard).

                  The UK industry routinely rate limits every DSL line, as it's a requirement from the wholesale provider BT Wholesale. The reason this is required (on millions of lines) is that it was discovered modern TCP sends more data than the recipient can handle. Most congestion algorithms will constantly speed up and then slow down the data flow: slowing down when acks don't arrive, speeding up again afterwards, never settling on an optimal bandwidth.

                  In recent years, with bulk download services becoming very popular, the providers don't want people complaining that "it's too slow". So they configure things to go as fast as possible, and even to be resistant to network congestion and lossy technologies; in short, TCP doesn't slow down much when it encounters problems and speeds up aggressively when it can. Tricks to accelerate slow start, to skip slow start, and to ramp up speeds to fill pipes as fast as possible are common in the industry, and multi-threading traffic flows is also common in the game industry.

                  I analysed the traffic on my own connection and determined that my problem with packet loss when downloading was primarily caused by the congestion window being far too big. This is very easy to control, e.g. by restricting the RWIN size; in Windows this can be done by disabling receive window auto-tuning, which caps it to a maximum 65k window size. Some applications also let you override the RWIN.
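
                  For anyone who wants to try that particular knob, the Windows setting is receive window auto-tuning, changed from an elevated prompt ('normal' restores the default):

                      netsh interface tcp set global autotuninglevel=disabled
                      netsh interface tcp set global autotuninglevel=normal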

                  What I didn't reveal in my first post is that one method that used to be used when ISPs traffic-shaped connections was to throttle the acks. In addition, it was also a problem on cable broadband networks which had upstream congestion on DOCSIS; that upstream congestion would affect the downstream, because without the acks the sender would send much more slowly.

                  Now, there are various congestion algorithms for a simple reason: there are many different networks out there operating under different conditions. Likewise there are many different traffic shapers. Why? Because no one algorithm and no one shaper can cover every situation; it's impossible. There will be many more shapers after cake and fq_codel, and many more congestion algorithms after bbr.

                  Now what about this latest research document you sent me?

                  I got to page 3 and I can see it's already not relevant. I will quote the article:

                  The tests are run in a controlled environment consisting of five regular desktop computers, as shown in Fig. 1. The computers are equipped with Intel 82571EB Ethernet controllers, and networked together in a daisy-chain configuration. This corresponds to a common dumbbell scenario, with the individual flows established between the endpoint nodes serving as multiple senders.

                  So this is already void as to my situation.

                  On further reading I was unable to find which congestion control algorithms were used, or how the senders were configured at the network stack level. But even so, the author of the document acknowledged that both codel and fq_codel had trouble with very low RTT and very high RTT in this particular experiment. There was also observed unfairness in TCP flows compared to FIFO; it seems you have linked a document you have not read.

                  Now, regarding fq_codel vs plain codel: I don't need to split my traffic into flows at the scheduler, because I decide the flows in the firewall rules. I send one set of traffic to one limiter, and another set of traffic to another limiter, and beyond that I don't need it split into further flows, so fq_codel is not even the right tool for the job I am giving it. The job is purely to restrict the congestion window of the sender by the most aggressive means possible.
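
                  As a rough illustration of what I mean by deciding the flows in the firewall rules (the interface, aliases and pipe numbers here are placeholders; in the pfSense GUI this is just the In/Out pipe selection on a rule), the generated pf rules end up looking something like:

                      # bulk-download hosts go to the ack pipe / capped pipe pair
                      pass in quick on $LAN proto tcp from $BulkHosts to any dnpipe(5, 4) keep state

                      # everything else keeps the existing fq_codel limiter queues
                      pass in quick on $LAN proto tcp from $LAN_net to any dnqueue(1, 2) keep state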

                  I don't know how long the thread will be up, but I will repost this on a blog or something, as it was intended for people to try themselves if they have the same problem (there have been other UK DSL users posting here with downstream packet loss problems) or just to read out of curiosity, not for a discussion of my network setup.

                  Please don't respond with another research document. :)

                  pfSense CE 2.7.2

                  • T
                    thiasaef @chrcoluk
                    last edited by

                    @chrcoluk said in Sharing my recent discovery on shaping downstream with limiters.:

                    because you have somehow become obsessed that my situation doesn't fit your ideology

                    No, I just don't want random people stumbling across your suggestions and thinking they are the right solution to their problem - which is very likely not the case.

                    So this is already void as to my situation.

                    Sorry, but if you think the test setup would not fit your situation then you are clearly wrong.

                    it was also a problem on cable broadband networks which had upstream congestion on DOCSIS, that upstream congestion would affect the downstream

                    Which is mainly due to the strongly asymmetrical ratio of download to upload on fast cable connections, but cake also supports ACK filtering out of the box.

                    There was also observed unfairness in TCP flows compared to FIFO; it seems you have linked a document you have not read.

                    I did read the entire document, and FQ-CoDel beats FIFO in regards to RTT fairness unless you are on a connection with RTTs > 200 ms, which you most likely are not.

                    The job is purely to restrict the congestion window of the sender by the most aggressive means possible.

                    There are only two ways to slow the sender down: a) ECN b) dropping packets ...

                    I don't know how long the thread will be up

                    Is this the new trend of people asking for their posts to be deleted the first time they see criticism?

                    • T
                      thiasaef
                      last edited by

                      @chrcoluk said in Sharing my recent discovery on shaping downstream with limiters.:

                      I send one set of traffic to one limiter, and another set of traffic to another limiter

                      This is probably the reason why you failed to set up FQ-CoDel properly, combined with a basic misunderstanding of how intentional packet loss is an essential part of traffic shaping.

                      The UK industry routinely rate limits every DSL line, as it's a requirement from the wholesale provider BT Wholesale.

                      Which forces you to undercut their rate limit in order to gain control over the queue.

                      Reference: https://support.aa.net.uk/CQM_Graphs.

                      • C
                        chrcoluk @thiasaef
                        last edited by chrcoluk

                        @thiasaef You have said you read the document and you read my post.

                        Yet your replies are contrary.

                        Some examples: you mention FQ-CoDel wins out on RTT fairness, but my issue isn't RTT fairness, it's the wrong packets being dropped. Higher latency isn't great, but it's the lesser evil compared to packet loss.

                        You are trying to imply I have misunderstood something, yet I clearly stated that I read the documentation from the original developers of dummynet.

                        To duplicate my environment you would have to do the following.

                        Set up a DSL line in the UK using the Openreach infrastructure.
                        Use an ISP that does its own traffic policing, including rate limiting the line below the sync speed (all UK DSL ISPs are required to do this under Openreach T&Cs).
                        Use a Zyxel VMG8924 DSL modem in bridge mode.
                        Use pfSense as your PPPoE endpoint.

                        Until you have this kind of setup you are not really able to comment on my experience in any kind of authoritative way; a test lab is not a reasonable duplicate.

                        The reason I asked you to stop replying is not that I don't accept disagreement or questioning, but rather that there is a lack of respect in your responses. Instead of saying something like "hmm, that's interesting, it doesn't match my experience", it's more like "you are wrong, and your experience with fq-codel is impossible unless you misconfigured it", you see the difference; not to mention the dozens of lines that show you have not read the documentation you linked to, nor the dummynet documentation, nor my post. So when I see someone not making that effort, then no, I won't like it, because at that point you are just trolling.

                        Also, I am not alone in these thoughts: I have received supportive DMs from people who are fearful to comment on this thread because of the hostility of your replies (they have the same issues with fq_codel on their home broadband). That is the reason I am making this reply.

                        pfSense CE 2.7.2

                        • T
                          thiasaef @chrcoluk
                          last edited by thiasaef

                          "your experience with fq-codel is impossible unless you misconfigured it"

                          I never said that.

                          Yet your replies are contrary.

                          My reply is still the same:

                          • If possible, use CAKE (preferably with the ingress keyword) if you want to maximize throughput for a given increase in latency (which ultimately affects the packet loss rate); see the sketch after this list.

                          • Question your solution if you have to sacrifice 40 % of your throughput in order for FQ-CoDel to have no packet loss on sparse flows.
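
                          For reference, the inbound configuration from the linked bufferbloat.net page boils down to something like this on Linux (interface names and the bandwidth are placeholders; the ingress keyword makes cake account for dropped packets as having already consumed link bandwidth, which is what you want when shaping the download side):

                              # redirect inbound traffic from the WAN-facing NIC to an IFB device, then shape it with cake
                              ip link add name ifb0 type ifb
                              ip link set ifb0 up
                              tc qdisc add dev eth0 handle ffff: ingress
                              tc filter add dev eth0 parent ffff: protocol all matchall action mirred egress redirect dev ifb0
                              tc qdisc add dev ifb0 root cake bandwidth 75mbit ingress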

                          not to mention the dozens of lines that show you have not read the documentation you linked to

                          I leave it to the reader to judge who has neither read nor understood it.
