Floating bandwidth value for shaper (bufferbloat checker)

w0w

If we use a traffic shaper to eliminate bufferbloat, we also need to limit bandwidth, to 80-95% of the link rate according to some sources. But that creates another problem: sometimes the limit needs to be adjusted depending on ISP channel load, time of day, or anything else, or you just do not want to waste the bandwidth you paid for. So the main idea is to use scripting to change the bandwidth value depending on the ping value. We know the current limit and the current load, and we can ping the ISP gateway every n seconds, so we can collect statistics on current link saturation and change the bandwidth value accordingly, e.g. from 50% to 100%.
Is it a good idea? What cons do you see?

Nullity

@w0w:

It's more complex than that.

From an upload perspective, your CPE modem is the most likely culprit of bufferbloat. (LAN)

From a download perspective, it's likely your ISP's network. (WAN)

Beyond both of those, there are many internet nodes that may be the bandwidth bottleneck at any given time. Even if you "know" certain aspects of your LAN network or your ISP's network, there are still unknowns beyond that.

richb-hanover

@w0w:

… We know current limit and current load, we can ping ISP gateway every n seconds, so actually we can collect statistics of current link saturation and change bandwidth value according to it, ex. from 50% to 100%.

It's more complex than that, in a different way from that mentioned by @nullity.

Sometimes, traffic is below the current setting/limit because there isn't any traffic. You wouldn't want to adjust down the limit in that case. So you do need to know the load.

But it isn't easy to know the current load. You'd need to devise a measurement of the "percentage of the bottleneck link that's being used for a long duration", say over 2-5 seconds. (Obviously, it's 100% used during a packet transmission…) I'm not aware of any measurements in a stock kernel that provide this.

That's why bufferbloat tests create their own artificial load to ensure that the link is saturated while measuring.
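
A minimal sketch of the load measurement described above, assuming cumulative byte counters for the WAN interface can be sampled a few seconds apart. Where the counters come from is left open, and `CounterSample`, `utilization` and `is_loaded` are hypothetical names. Note that it measures use relative to the configured shaper limit, not the true bottleneck capacity, which remains unknown.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    timestamp: float  # seconds, from any monotonic clock
    rx_bytes: int     # cumulative bytes received on the WAN interface


def utilization(older: CounterSample, newer: CounterSample, limit_bps: float) -> float:
    """Fraction of the configured limit used between two counter samples."""
    elapsed = newer.timestamp - older.timestamp
    if elapsed <= 0 or limit_bps <= 0:
        raise ValueError("need increasing timestamps and a positive limit")
    bits = (newer.rx_bytes - older.rx_bytes) * 8
    return bits / (elapsed * limit_bps)


def is_loaded(older: CounterSample, newer: CounterSample,
              limit_bps: float, threshold: float = 0.8) -> bool:
    """True when the link was busy enough for a latency reading to mean something.

    The 0.8 threshold is an arbitrary placeholder, not a recommended value.
    """
    return utilization(older, newer, limit_bps) >= threshold
```

With samples taken 2-5 seconds apart, a script could ignore latency readings whenever `is_loaded` is false, so an idle link never triggers a change to the limit.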

w0w

@Nullity:

Thank you for the answer!
Yes, it can be complex, but wouldn't it be a little bit better than nothing?

I don't care about those "many internet nodes that may be the bandwidth bottleneck at any given time", simply because that's not my problem. We are talking about bufferbloat, and for many users the first point of trouble will be the ISP router, modem, or other equipment that uses too long a queue when the bandwidth limit is reached. In my case the ISP router is stable enough, but sometimes, under heavy traffic, it's not so responsive, and I can see that just by pinging some ISP IPs beyond the router. This is what bufferbloat actually is.
When I manually tune my bandwidth value down, the bufferbloat goes away, but I don't want to stay on that value, because it's too low for the rest of the time.
Yes, you're right, the logic must be complex; there is something to think about.

w0w

@richb-hanover:

It's a little bit different. The bufferbloat test creators don't know your bandwidth limits and load; that's why they need to generate artificial load and try to reach those limits, causing the ISP modem or other equipment to bloat.
In our case we already have all the data: limits, current traffic, states and so on.
I think it is possible to detect some slowdowns just by monitoring queue lengths, something like what CoDel already does…
The logic must be complex: not just lowering the bandwidth, but also detecting whether lowering it to the desired minimum value has any effect. If it has none, the bandwidth should be restored to the previous maximum value and left untouched for a period of time. This should prevent lowering the bandwidth limit in cases where bufferbloat cannot be eliminated by limiting bandwidth on the user side.
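
The restore-and-hold-off logic above could be sketched roughly like this. It is only an illustration under assumed numbers: the 100 ms threshold, 10% step, 50% floor and 600 s hold-off are placeholders, and `ShaperState` and `next_state` are hypothetical names. Raising the limit again after a successful reduction is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaperState:
    limit_pct: int            # current limit as a percentage of the maximum
    hold_until: float = 0.0   # time before which the limit is left untouched


def next_state(state: ShaperState, now: float, latency_ms: float,
               high_ms: float = 100.0, floor_pct: int = 50, step_pct: int = 10,
               max_pct: int = 100, hold_s: float = 600.0) -> ShaperState:
    """Return the shaper state after one latency reading."""
    if now < state.hold_until:
        return state  # hold-off period: do not touch the limit
    if latency_ms <= high_ms:
        return state  # latency is acceptable: nothing to do in this sketch
    if state.limit_pct > floor_pct:
        # Latency is high and there is still room to lower the limit.
        return ShaperState(max(floor_pct, state.limit_pct - step_pct), state.hold_until)
    # Already at the floor and latency is still high: lowering did not help,
    # so restore the maximum and leave it alone for a while.
    return ShaperState(max_pct, now + hold_s)
```

Percentages are kept as integers so that repeated steps land exactly on the floor instead of drifting through floating-point rounding.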

Harvy66

There are other distros with scripts that do exactly what the OP mentioned. Ping a known target and when the ping goes above some threshold, the script starts to lower the provisioned bandwidth in the shaper down to some configured floor.

I guess it works well enough to be better than nothing. Or do what I do and pay $50/m for a 150/150 dedicated fiber connection with guaranteed zero congestion within the ISP's network, where the ISP has 6x more trunk bandwidth than their 95th percentile. /sigh If you move out here, you'd better get used to bar hopping or cow tipping for entertainment, but at least you'll have crazy awesome pings to Chicago game servers. Why ever leave home?

Nullity

Gargoyle has a feature called "Active Congestion Controller". I've seen other scripts that either monitor bandwidth/ping or run occasional speedtests to determine optimal QoS bandwidth.

My connection is stable though, so I never tried any of them.

It'd be great to have dynamically adjusting QoS for unstable internet connections but I dunno if any of the mentioned solutions work well in practice.

belt9

For my connection, the WAN bandwidth available to me varies a lot depending on the time.

Just like the OP if I limit the value enough then bufferbloat isn't a problem, but I don't want to limit myself to 60% of my connection all the time just because it dips there occasionally.

For me at least, a script that would temporarily adjust bandwidth down to a definable floor value based on a ping (or maybe pings to several different places?) would be a LOT better than nothing.

If someone writes that script for pfSense, please share on this thread!

w0w

Thanks Harvy66 and Nullity!
I will look at Gargoyle and its "Active Congestion Controller".

w0w

https://github.com/ericpaulbishop/gargoyle/blob/master/package/qos-gargoyle/src/qosmon.c
https://dev.openwrt.org/attachment/ticket/8536/qosmon.patch (openwrt implementation)

I do think it's possible to use the same or similar logic on FreeBSD/pfSense, but we really need some help from professionals. Should I create a bounty for it, or are there already other plans at Netgate?

Nullity

@w0w:

I highly doubt that the Linux-centric code (I see references to iproute2 & tc) is useful to FreeBSD.

I think I saw someone on these forums say they were working on a pfSense script (python, sh, etc?) that accomplishes practically what qosmon does. I dunno if they ever released anything… the post was also a few years old IIRC.

Really, it should not be too hard to script something that monitors a ping command and/or bandwidth.

I'll try to remember where I put another Linux-based QoS (shell) script that dynamically adjusted bandwidth based on bandwidth/ping, since it's likely more portable and it would also be useful to see their approach. I got it from some GPL request.
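
As a rough idea of what "monitoring a ping command" could look like, here is a small sketch that extracts round-trip times from ping output. It assumes the common `time=12.3 ms` reply format, and `parse_rtts`, `average_rtt` and `run_ping` are hypothetical helper names, not part of qosmon or pfSense.

```python
import re
import subprocess
from statistics import mean
from typing import List, Optional

# Matches the "time=12.3 ms" field of a ping reply line.
_RTT = re.compile(r"time[=<](\d+(?:\.\d+)?)\s*ms")


def parse_rtts(ping_output: str) -> List[float]:
    """Return every round-trip time (in ms) found in the ping output."""
    return [float(value) for value in _RTT.findall(ping_output)]


def average_rtt(ping_output: str) -> Optional[float]:
    """Mean RTT in ms, or None when no reply was received."""
    rtts = parse_rtts(ping_output)
    return mean(rtts) if rtts else None


def run_ping(host: str, count: int = 5) -> str:
    """Run the system ping with a fixed packet count and return its text output."""
    result = subprocess.run(["ping", "-c", str(count), host],
                            capture_output=True, text=True, check=False)
    return result.stdout
```

A monitoring loop would call `average_rtt(run_ping(gateway))` every few seconds and treat `None` (no replies) as a separate case rather than as high latency.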

w0w

Yep, I did not say we need to port some Linux code, but at least we have a working piece of code that can be used as a logic sample for a new bash script or, better, PHP code… Just dreaming 8)
Thanks Nullity, I hope you'll find something good enough!

belt9

Feature request added:

https://redmine.pfsense.org/issues/7904

belt9

The feature request was accepted but needs your input!

@Jim:

It's possible in some specific circumstances, but I don't see one of those being a way that would work properly with dummynet (limiters). It also still requires you to probe an external source at a specific destination (which would always have to respond as fast as your circuit allows), gets more difficult at higher bandwidth amounts (Takes ~16 minutes to probe minimum on a high speed link), and assumes all latency is from throughput saturation, which isn't necessarily true.

If someone comes up with a viable implementation, I'd love to see it work, because it would be useful. But like many ideas, it's simple to think of but difficult to implement.

So please provide your input on how to implement at the redmine feature request:
https://redmine.pfsense.org/issues/7904

w0w

This just sounds like it will never be coded by Netgate until they ALREADY have working source for FreeBSD. That sounds bad, but there is no way to affect it other than investing money or doing it yourself and bringing them a working solution to implement.
One interesting thing is the TEACUP experiments: http://caia.swin.edu.au/tools/teacup/
http://caia.swin.edu.au/reports/161107A/CAIA-TR-161107A.pdf
Personally, I am ready to invest some money in a QoS monitor for FreeBSD (or anything with the same function), and I am also ready to pay for a GUI version of dummynet FQ-CoDel. The problem remains that Netgate just does not have resources for low-priority tasks and new features that most of their customers do not need.

w0w

OK… Another speculation. We are using a shaper, and no matter what, when it really works at full load you will see packet drops; when your bandwidth limit is wrong (above the real limit), you have no drops, but your latency grows.
So maybe we need to monitor both queue drops at full load and ping RTT, and, depending on them, change the bandwidth limit more quickly and/or accurately.
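
One possible reading of this speculation as a decision rule, purely as a sketch: the thresholds are placeholders, `Action` and `decide` are hypothetical names, and the "raise" and "hold" branches are assumptions added for illustration rather than something stated above.

```python
from enum import Enum


class Action(Enum):
    LOWER = "lower"
    RAISE = "raise"
    HOLD = "hold"


def decide(rtt_ms: float, shaper_drops: int,
           high_ms: float = 120.0, low_ms: float = 40.0) -> Action:
    """Pick an adjustment from ping RTT and the shaper's drop count for one interval."""
    if rtt_ms > high_ms and shaper_drops == 0:
        # Latency grows but the shaper is not dropping: the limit is probably
        # above the real capacity, so some other queue is filling up.
        return Action.LOWER
    if rtt_ms < low_ms and shaper_drops > 0:
        # Assumption: the shaper is the bottleneck and latency is fine,
        # so there may be headroom to probe upward.
        return Action.RAISE
    # Everything else is ambiguous in this sketch (for example high latency
    # with drops, which may come from somewhere beyond the local link).
    return Action.HOLD
```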

belt9

Yeah it sounds like it will either be completely user built, or possibly given a higher priority if it is better defined.

If someone can chime in on redmine with a specific technical outline of how it would work reliably in FreeBSD, then it might be assigned a higher priority.

Even if it is assigned a higher priority at Netgate, it will likely be months to years before it is implemented in a release version.

This won't happen fast unless someone with the know-how does this and shares it with the community.

Harvy66

The simplest way would be to allow the end user to supply a high and a low mark for ping and possibly loss. When above the high mark, reduce the assigned bandwidth at some rate, probably N% every X seconds. When below the low mark, increase the bandwidth. You would also want some sort of sampling window size and sample rate. An example would be to ping a target N times per second and recalculate every X seconds.

I could see a default of something like 2 pings per second averaged over 10 seconds with a high mark of 120ms and a low of 40ms. When the high mark is hit, it will lower the assigned bandwidth by 10% and when the low mark is hit, raise it by 5%.

My guess is this would be an 80/20 rule that would be easy to implement.

edit: Probably have an absolute minimum assigned bandwidth, probably defaulted to 50%.
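
The scheme above can be sketched as a small controller. This is only an illustration using the suggested defaults (20 samples for 2 pings per second over 10 seconds, 120 ms high mark, 40 ms low mark, 10% down, 5% up, 50% floor). `FloatingLimit` is a hypothetical name, and applying each percentage to the current limit rather than to the maximum is an assumption.

```python
from collections import deque
from statistics import mean


class FloatingLimit:
    """Adjusts a shaper limit from a sliding window of ping samples."""

    def __init__(self, max_kbps: float, high_ms: float = 120.0, low_ms: float = 40.0,
                 down: float = 0.10, up: float = 0.05, floor: float = 0.50,
                 window: int = 20):
        self.max_kbps = float(max_kbps)
        self.limit_kbps = float(max_kbps)
        self.high_ms = high_ms
        self.low_ms = low_ms
        self.down = down
        self.up = up
        self.min_kbps = floor * self.max_kbps  # absolute minimum assigned bandwidth
        self.samples = deque(maxlen=window)

    def add_sample(self, rtt_ms: float) -> None:
        self.samples.append(rtt_ms)

    def recalculate(self) -> float:
        """Call once per window; returns the (possibly updated) limit in kbps."""
        if len(self.samples) < self.samples.maxlen:
            return self.limit_kbps  # not enough data yet
        avg = mean(self.samples)
        if avg > self.high_ms:
            self.limit_kbps = max(self.min_kbps, self.limit_kbps * (1 - self.down))
        elif avg < self.low_ms:
            self.limit_kbps = min(self.max_kbps, self.limit_kbps * (1 + self.up))
        self.samples.clear()  # start a fresh window
        return self.limit_kbps
```

The value returned by `recalculate` would then have to be pushed into the shaper by whatever mechanism the platform offers; that step is outside this sketch.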

belt9

Please pass that along on redmine to try to get it escalated!

cplmayo

I would also be willing to chip in on a bounty for this and the GUI for fq_codel dummynet.

These features have a lot of traction for home and SMB connections that do not have the budget for dedicated fiber.

While not a security issue, features such as these just make pfSense a more well-rounded product.