Calculating the required bandwidth for ACK queues on asymmetric links

eri--

Thank you for helping in this, 2 minds are always better than 1.

Then the wizard/daemon should simulate the policy-making-and-adjusting process of an experienced user, right? That seems to be a different approach (as opposed to optimization). It's more practical and simpler to implement.

We get rid of the need to define optimization criteria. (So the policy need not be optimal in any sense.)

There is a good chance that such a policy works well.

The problem is that no one knows exactly what such a policy looks like. So let's discuss it.

Well, just an update: the new wizard uses percentage values for input from the user. Since there can be multiple links, which the wizard now supports, it is not practical to ask the user how much he/she wants for each separate link, so percentages made the most sense to me. They allow me to apply the correct policy to all the selected links, which might even use different schedulers.

My comments and points of discussion are below.

Here is a policy that works fine for me (as a home user). The basic rationale behind it is to apply real-time curves whenever there is VoIP and/or game traffic, and to apply link-share curves otherwise.

A. Root queues

Set qWANroot and qLANroot to 80% of the up/downlink bandwidth. (Maybe 90-95% for fast [Mb/s] links.)

B. Sub-queues

For each up/down link, use only seven predefined sub-queues – p2p, low, default, high, games, voip, ack, and set the following queue priorities (from lowest to highest).

Priority 1. p2p
Priority 2. low
Priority 3. def[ault]
Priority 4. high
Priority 5. game
Priority 6. voip
Priority 7. ack

The number of queues might be smaller depending on what the user chooses, but the wizard will not create more than these queues even if you select all options.
I select the default queue depending on whether the user has selected p2pcatchall or not. If he has, p2p becomes the default queue; otherwise the default queue holds all uncategorized traffic. The default queue always has a better priority than the p2p and lowerPriority queues.

Keep in mind we are talking about the modified wizard that is in the RELENG_1 tree.

C. Upper-limit curves

1. Set upper-limit curves for p2p queues only.

2. Suitable upper limits of qWANp2p and qLANp2p are 80% of qWANroot and qLANroot, respectively.

1- Already done.
2- Not needed, since the default already does that. For your information, there is no longer a qWANroot/qLANroot, since the HFSC discipline provides a root queue by default. That's why I do not need to set up those queues, and for now I set the upperlimit of p2p to 20-30% of the link.

D. Linkshare curves

1. The link share (i.e., m1 and m2) of qWANack should be made as large as required (by calculation, excluding VoIP and game activities).

2. The link share of qWANhigh should be made several (say, 2 - 3) times larger than that of qWANdef.

3. The link share of qWANdef should be made several (say, 4 - 6) times larger than that of qWANlow.

4. The link share of qLANlow should be the same as that of qWANp2p.

5. The link shares of qLANack, high, def, low, p2p are set similarly.

To me it makes more sense to play with delay rather than bandwidth, since the overall result is the same.
Moreover, for HFSC I would really question the need for an ACK queue unless I know the link latency and can configure the queue so that it does not get in the way of VoIP and game shaping. VoIP is not a consumer of the ACK queue, and whether games are is questionable; see below.

E. Realtime curves

1. If VoIP or games also use ACK queues, set qWANack's m1 as large as required (by calculation, excluding everything in qWANp2p, qWANlow, qWANdef, qWANhigh).

2. Set qWANack's m2 = m1.

3. Set qLANack's m1 and m2 similarly.

4. For qWANvoip, given a required number of concurrent calls and average voip packet size, set m1 such that an average packet shall not delay longer than specified. Set qLANvoip's m1 similarly. If the 'given' parameters are not given then, as the default, set the required number of concurrent calls to 1, the average packet size to 150 bytes and the maximum allowable delay to 10 ms (so, qWANvoip's m1 = qLANvoip's m1 = 120 kb/s per call).

5. qWANgame and qLANgame's m1 are set similarly to voip. Note, however, that the average packet size (and the maximum allowable packet delay) need not be the same for the incoming and outgoing directions. If the 'given' parameters are not given then, as the default, set the required number of concurrent local players to 1, the average packet size to 200 bytes in both directions, and the maximum allowable delay to 50 ms in both directions (so, m1 = 32 kb/s per player for both qWANgame and qLANgame). Also note that in some cases, after allocating m1 to qWANack and qWANvoip and taking into account that the sum of real-time service curves must not exceed 80% of qWANroot, only a few percent of m1 remains. The best we can do then is to allocate the rest to qWANgame, so we can only guarantee a game delay that is as low as possible.

6. [Part of] the rest of m2 is allocated to qWANvoip's m2 and qWANgame's m2 in amounts required for reasonable traffic. Similarly for qLANvoip's m2 and qLANgame's m2. If not further specified, the 'reasonable traffic' means 32 kb/s per concurrent VoIP call plus 32 kb/s (in both directions) per concurrent local game player.
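
The budgeting step described in item 5 can be sketched roughly as follows. This is only an illustration in Python, not code from the wizard: the function name allocate_realtime, the cap_percent parameter and the example figures (a 200 kb/s qWANroot and 30 kb/s reserved for ACKs) are assumptions made up for the example.

```python
def allocate_realtime(root_kbps, ack_m1, voip_m1, game_m1_wanted, cap_percent=80):
    """Split the real-time budget between ack, voip and game (all in kb/s).

    The sum of real-time m1 values is capped at cap_percent of the root queue.
    ack and voip are served first; game gets what it asked for, or whatever
    is left of the budget if that is less.
    """
    budget = root_kbps * cap_percent / 100
    remaining = budget - ack_m1 - voip_m1
    if remaining < 0:
        raise ValueError("ack and voip alone exceed the real-time budget")
    return {"ack": ack_m1, "voip": voip_m1, "game": min(game_m1_wanted, remaining)}


# Hypothetical uplink: the game queue wants 32 kb/s but only 10 kb/s is left.
alloc = allocate_realtime(root_kbps=200, ack_m1=30, voip_m1=120, game_m1_wanted=32)
print(alloc["game"])  # 10.0
```

When the leftover is smaller than the requested m1, the game delay is simply as low as the remaining budget allows, as noted above.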

1- VoIP will not use ACK. I have not seen any phone using TCP for this kind of traffic, and I think we know that TCP gives pretty bad latency for such traffic.
2 and 3- I do not see the reason for that, since I still need to be convinced that the ACK queue does not get in the way of VoIP and games in the HFSC case.
4- Well, right now I set m1 = 25% of the link, d = 30 ms and m2 = 5%-20% depending on the user's choices. That should take care of most home setups; others might need to tweak it.

5- In the new shaper I use percentages as input from the user, which frees me from the speed specified for LAN/WAN and from however many links you have in the box. For games, d = 50 ms will be OK (I think, but I am not much of a gamer), and the HFSC realtime scheduler should overcommit when needed within the 80% bound, so I think it is better to leave it to choose what to do. After all, that 80% upper limit for realtime is there to allow safe overcommit when needed, and giving it some more room in which to overcommit safely is better.
I don't think an ACK queue for games is needed, since usual home setups have only 2 to at most 4 players and the delay can be guaranteed by the realtime scheduler. Considering that most setups are asymmetric, the bandwidth is already there and only delay is the culprit.

6- Well, I think most users will choose more bandwidth than needed during the wizard, so that amount will be distributed accordingly. I think the delay parameter of the linkshare part of the Games, OthersHigh and VoIP queues needs more discussion than the remaining bandwidth.

dusan

Well, it appears that in essence my figures agree with yours. Only a few details diverge.

The delay. As far as I know, the maximum allowable delay is specified by m1, not d. (Thus if m1 = m2, then d does not matter at all.) If we require that an average packet of S bytes is not delayed longer than D ms (assuming a single voip call/game player), we do that by setting m1 to

m1 = 8 * S / D [kb/s]

(kb reads 'kilobit'.)
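
As a quick check of the formula, here is a small Python sketch. The function name and the concurrent parameter are just illustrative; scaling by the number of calls or players follows the "per call" / "per player" wording used earlier in the thread.

```python
def realtime_m1_kbps(packet_bytes, max_delay_ms, concurrent=1):
    """m1 = 8 * S / D per call or player.

    S is in bytes and D in milliseconds, so 8 * S / D is in bits per
    millisecond, which is the same as kb/s.
    """
    return 8 * packet_bytes / max_delay_ms * concurrent


print(realtime_m1_kbps(150, 10))  # VoIP defaults: 120.0 kb/s
print(realtime_m1_kbps(200, 50))  # game defaults: 32.0 kb/s
```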

The unit of m1 and m2 of real-time curves. I prefer specifying m1 and m2 in kb/s rather than in %. That's because the link is asymmetric: the same m1 in kb/s in both directions would result in significantly different percentages of the uplink and the downlink. For S=150 bytes, m1 = 120 kb/s (per concurrent voip call) guarantees D=10 ms regardless of the link, while m1 = 25% of a 1024 kb/s downlink would guarantee D = 4.7 ms, and m1 = 25% of a 256 kb/s uplink would guarantee D = 18.8 ms. Although 4.7 ms and 18.8 ms may both be acceptable, they look like a choice at random rather than an exact delay specification.

The meaning of the d parameter. I believe that d should be directly related to D and never less than D. For N concurrent voip calls (or concurrent game players) and the required maximum delay D ms, the suitable d is

d = (N+2) * D [ms]

One may also set d = (N+1) * D or N*D.

Thus if N = 1 and D = 10 ms (VoIP) then d = 10-30 ms,
and if N = 1 and D = 50 ms (games) then d = 50-150 ms.
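
The figures in the last two points can be reproduced with a short Python sketch. The function names are made up for the illustration; the first one is just m1 = 8 * S / D solved for D, and the second returns the range from N*D to (N+2)*D.

```python
def delay_ms(packet_bytes, m1_kbps):
    """Delay guaranteed for one packet of S bytes at rate m1 (kb/s)."""
    return 8 * packet_bytes / m1_kbps


def d_range_ms(n, max_delay_ms):
    """Range of suitable d values: from N*D up to (N+2)*D."""
    return n * max_delay_ms, (n + 2) * max_delay_ms


# m1 given as 25% of the link: the guaranteed delay depends on the link speed.
print(delay_ms(150, 0.25 * 1024))  # 4.6875 ms on a 1024 kb/s downlink
print(delay_ms(150, 0.25 * 256))   # 18.75 ms on a 256 kb/s uplink

print(d_range_ms(1, 10))  # VoIP:  (10, 30)
print(d_range_ms(1, 50))  # games: (50, 150)
```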

eri--

Right, but I have a problem when the user selects CBQ for the downlink and HFSC for the uplink: I could not translate the formula above into something meaningful for CBQ. (Personal thought: I could only if I patched PF to let me set CBQ's maxburst :). I will think some more about handling different schedulers and let you know what I choose. In the meantime, if you have a proposal other than a different wizard for different cases (which I might even consider after all the discussion and your help), I will consider it.
My choice of percentages is forced by the generality the wizard should provide.
Yes, d is probably what you understand it to be. As I said above, I will think this through and see what conclusion I arrive at.

Thanks again for your support.

dusan

I have nothing against expressing every bandwidth as a percentage, or against a single wizard for multiple schedulers. It is nice and it should work. But I think it should offer at least two sets of percentages, one for uplink(s) and one for downlink(s), that may be specified independently. That's not because of the use of different schedulers; it's because of the link asymmetry.

Then, for example, with (uplink, downlink) = (200, 800) [kb/s] (after the 20% cut-off),

a requirement like qWANack's linkshare m2 = 40 kb/s becomes 20%,

a requirement like qLANack's linkshare m2 = 16 kb/s becomes 2%,

a requirement like qWANvoip's realtime m1 = qLANvoip's realtime m1 = 120 kb/s becomes 60% and 15%, respectively.

And these percentages can be translated meaningfully (maybe taken as-is) from HFSC to other percentage-aware schedulers as well.
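
The conversion in this example is plain arithmetic; a tiny Python sketch (with a hypothetical helper name) makes the per-direction difference explicit:

```python
def as_percent(required_kbps, link_kbps):
    """Express an absolute kb/s requirement as a percentage of one link direction."""
    return 100 * required_kbps / link_kbps


uplink, downlink = 200, 800  # kb/s, after the 20% cut-off

print(as_percent(40, uplink))     # qWANack linkshare m2: 20.0
print(as_percent(16, downlink))   # qLANack linkshare m2: 2.0
print(as_percent(120, uplink))    # qWANvoip realtime m1: 60.0
print(as_percent(120, downlink))  # qLANvoip realtime m1: 15.0
```

The same 120 kb/s requirement maps to very different percentages per direction, which is why two independent sets of percentages are needed.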

eri--

Then multiple wizards it is.

1- CBQ-only or HFSC-only WAN-LAN wizard, simple setup (1 uplink / 1 downlink).
2- CBQ-only or HFSC-only, multiple links. You specify which are the uplinks and which are the downlinks, and you enter values for each of them independently.
All values can be specified in whatever unit you prefer (%|Kb|Mb).

3- Mixed schedulers, multiple links. Just specify bandwidths in percentages and tweak them after the wizard.

And some others for DMZ setups and such.

This way everybody is happy and we provide a better product.

eri--

@dusan:

I want to ask you something that I have not found an answer to and am not totally sure of.

What does upperlimit m1 mean:
1- Is it burst?!
2- Is it that for such a delay you will not get more than this bandwidth (although this sounds like 1-)?!

dusan

What does upperlimit m1 mean:
1- Is it burst?!
2- Is it that for such a delay you will not get more than this bandwidth (although this sounds like 1-)?!

To tell the truth, I don't know. It's not very well documented.

Assume that the upper-limit curve does what it should do, i.e. acts as an upper-limiting curve. Then:

A packet (as a point plotted in the time-service coordinate system) receives service only if it is plotted on or below the curve.
The curve moves whenever the linkshare curve moves.

If this is true, one may think of upperlimit m1 as a minimal delay specification.

dusan

Here is an (empirical) open formula to calculate qWANack and qLANack for a home user:

log(X/A) = 0.8 * log(B/A) + log(0.0558)
log(Y/B) = -0.8 * log(B/A) + log(0.0558)

Linear programming (LP) with 12 traffic patterns was used to estimate maximal qWANack and qLANack utilizations. The formula was then built from the estimated figures. It approximates the figures quite well (see table).

      Using LP         Using formula
B/A   X/A      Y/B     X/A      Y/B
===   ======   =====   ======   =====
  1    5.58%   5.58%    5.58%   5.58%
  2   10.39%   3.17%    9.72%   3.20%
  3   15.21%   2.37%   13.44%   2.32%
  4   20.02%   1.97%   16.92%   1.84%
  5   24.83%   1.73%   20.22%   1.54%
  6   29.64%   1.57%   23.40%   1.33%
  7   34.09%   1.44%   26.47%   1.18%
  8   37.34%   1.31%   29.45%   1.06%
  9   40.58%   1.21%   32.36%   0.96%
 10   43.83%   1.13%   35.21%   0.88%
 11   47.08%   1.07%   38.00%   0.82%
 12   50.32%   1.01%   40.74%   0.76%
 13   53.57%   0.97%   43.43%   0.72%
 14   56.82%   0.93%   46.08%   0.68%
 15   60.06%   0.89%   48.70%   0.64%
 16   63.15%   0.86%   51.28%   0.61%
 17   65.77%   0.84%   53.83%   0.58%
 18   68.39%   0.82%   56.34%   0.55%
 19   70.25%   0.79%   58.84%   0.53%
 20   71.64%   0.75%   61.30%   0.51%

Edit 2008-01-30: The table is now correct.
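
For anyone who wants to reproduce the "Using formula" columns, here is a small Python sketch. It assumes the reading used later in the thread: A is the uplink bandwidth, B the downlink bandwidth, X the qWANack bandwidth and Y the qLANack bandwidth. The function name is made up. Since both sides are logarithms, the formula is equivalent to X/A = 0.0558 * (B/A)^0.8 and Y/B = 0.0558 * (B/A)^-0.8, whatever the base of the logarithm.

```python
def ack_shares(b_over_a, exponent=0.8, base=0.0558):
    """Return (X/A, Y/B) as fractions for a given downlink/uplink ratio B/A."""
    x_over_a = base * b_over_a ** exponent
    y_over_b = base * b_over_a ** -exponent
    return x_over_a, y_over_b


for k in (1, 2, 4, 10, 20):
    x, y = ack_shares(k)
    print(f"{k:>3}   {x:6.2%}   {y:6.2%}")
```

For B/A = 2 this gives about 9.72% and 3.20%, matching the table row.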

eri--

From the code, it seems that your conclusion makes the most sense.
After a linkshare curve has been selected as the winner, the code updates its delay to that of the upperlimit, but only if it would exceed what the upperlimit guarantees. So it guarantees that the next time linkshare takes over, it will be within the bounds specified by the upperlimit.

So this interpretation makes the most sense. It might be used as a burst if one is careful enough, though it is meant for something else.

eri--

@dusan:

Here is an (empirical) open formula to calculate qWANack and qLANack for home user:

log(X/A) = 0.8 * log(B/A) + log(0.0558)
log(Y/B) = -0.8 * log(B/A) + log(0.0558)

Linear programming with 12 traffic patterns was used to estimate maximal qWANack and qLANack bandwidths. The formula was then built from the estimated figures. It approximates the figures quite well (see table).

      Using LP         Using formula
B/A   X/A      Y/B     X/A      Y/B
===   ======   =====   ======   =====
  1    5.58%   5.58%    5.58%   5.58%
  2    9.62%   3.08%    9.72%   3.20%
  3   13.53%   2.21%   13.44%   2.32%
  4   17.44%   1.78%   16.92%   1.84%
  5   21.35%   1.52%   20.22%   1.54%
  6   25.26%   1.35%   23.40%   1.33%
  7   29.17%   1.23%   26.47%   1.18%
  8   33.08%   1.13%   29.45%   1.06%
  9   37.00%   1.06%   32.36%   0.96%
 10   40.91%   1.00%   35.21%   0.88%
 11   44.82%   0.96%   38.00%   0.82%
 12   48.73%   0.92%   40.74%   0.76%
 13   52.64%   0.88%   43.43%   0.72%
 14   56.55%   0.86%   46.08%   0.68%
 15   60.06%   0.83%   48.70%   0.64%
 16   63.15%   0.79%   51.28%   0.61%
 17   65.77%   0.74%   53.83%   0.58%
 18   68.39%   0.70%   56.34%   0.55%
 19   70.25%   0.67%   58.84%   0.53%
 20   71.64%   0.63%   61.30%   0.51%

Great, I was just dumping the ACK queue, but with this I have an approximation for it.

Thanks.

dusan

Updated formula:

log(X/A) = 0.773765872 * log(B/A) + log(0.0739086792)
log(Y/B) = -0.773765872 * log(B/A) + log(0.0739086792)

X/A, Y/B by linear programming and by the formula, corrected:

      Using LP         Using formula
B/A   X/A      Y/B     X/A      Y/B
===   ======   =====   ======   =====
  1    5.58%   5.58%    7.39%   7.39%
  2   10.39%   3.17%   12.64%   4.32%
  3   15.21%   2.37%   17.29%   3.16%
  4   20.02%   1.97%   21.60%   2.53%
  5   24.83%   1.73%   25.68%   2.13%
  6   29.64%   1.57%   29.57%   1.85%
  7   34.09%   1.44%   33.31%   1.64%
  8   37.34%   1.31%   36.94%   1.48%
  9   40.58%   1.21%   40.46%   1.35%
 10   43.83%   1.13%   43.90%   1.24%
 11   47.08%   1.07%   47.26%   1.16%
 12   50.32%   1.01%   50.55%   1.08%
 13   53.57%   0.97%   53.78%   1.02%
 14   56.82%   0.93%   56.95%   0.96%
 15   60.06%   0.89%   60.08%   0.91%
 16   63.15%   0.86%   63.15%   0.86%
 17   65.77%   0.84%   66.19%   0.83%
 18   68.39%   0.82%   69.18%   0.79%
 19   70.25%   0.79%   72.14%   0.76%
 20   71.64%   0.75%   75.06%   0.73%

eri--

Can I assume this is right now?!

I am just asking to be sure, as I would hate to rerun the calculations on this again :).

dusan

The calculation is now formally exact and the formula approximates it tightly. However, for B/A = K = 20, the old formula gives X/A = 61%, which corresponds to the largest qWANack I have ever heard of (see hoba's post in this thread), while the new formula gives X/A = 75%, which is clearly more aggressive and may be an overestimate.

Which formula is better, I don't know. I think the better approach is to use the old one as the default; users should apply the new one only if they experience packet drops.
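
To see the difference between the two formulas at B/A = 20, here is a minimal Python comparison. The constants are the ones posted above; the names OLD, NEW and x_over_a are only for the illustration.

```python
# (exponent, base) pairs of the two posted formulas
OLD = (0.8, 0.0558)
NEW = (0.773765872, 0.0739086792)


def x_over_a(b_over_a, exponent, base):
    """X/A as a fraction: log(X/A) = exponent * log(B/A) + log(base)."""
    return base * b_over_a ** exponent


for name, (exponent, base) in (("old", OLD), ("new", NEW)):
    print(name, f"{x_over_a(20, exponent, base):.0%}")
# old 61%
# new 75%
```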

eri--

@dusan:

If I solve this formula (the old one) for X, I get:

log(X) = 0.8 * log(B) + 0.2 * log(A) - log(0.0558)

and for A=512 and B=128 I get X = 9.X ?!

Have I made an error somewhere?!
Or does this formula give a result in percentages?!
Meaning I should then compute 9% * A (512). Sorry for the dumb question; I just want to be sure.

dusan

@eri--:

if i solve this formula(the old one) for X i get:

log(X) = 0.8 * log(B) + 0.2 * log(A) - log(0.0558)

You may use it this way (to compute X directly), of course. When posted, it was meant to compute the ratio X/A from the known ratio B/A, however.

Just a typo: the sign of log(0.0558). I know that you used + and just mistakenly typed - here.

@eri--:

and for A=512 and B=128 i get X = 9.X ?!

Have i made an error somewhere?!

There is no error. X = 9.42445330 [kb/s] and X/A = 0.0184 = 1.84%. Note, however, that this is for an uplink four times faster than the downlink, which may be unusual (for a home user) and may not be what you meant.
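
For reference, the direct computation of X can be sketched in Python like this (with the + sign in front of log(0.0558), and A taken as the uplink and B as the downlink, both in kb/s; the function name is illustrative):

```python
def x_kbps(a_kbps, b_kbps, exponent=0.8, base=0.0558):
    """log(X) = 0.8*log(B) + 0.2*log(A) + log(0.0558), solved for X."""
    return base * b_kbps ** exponent * a_kbps ** (1 - exponent)


x = x_kbps(512, 128)
print(round(x, 2))       # 9.42 (kb/s)
print(f"{x / 512:.2%}")  # 1.84%
```

The result is in the same unit as A and B, not a percentage.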

Note that X/A = 1.84% in this case (B/A = 1/4) equals Y/B in the case of B/A = 4 (see the old table). In general, the formula has the property that a link with B/A = 1/K behaves exactly like a link with B/A = K in reverse orientation, i.e.

X/A * Y/B = (5.58%)^2 = const

That's because I've modelled the link using pairwise symmetric traffic patterns. For example,

FTP upload pattern: (1, 0.005, 0.001, 0.025)
FTP download pattern: (0.005, 1, 0.025, 0.001)

Then I fitted a straight line as an approximation of the resulting curves (B/A, X/A) and (A/B, Y/B) in the (log, log)-scale coordinate plane (see attached figures).

Also note that the formula gives X/A > 100%, which is clearly nonsense, in case of large B/A. The suitable range of direct applicability is about 1/20 <= B/A <= 20. If B/A is out of the range, one may take the result from the nearest in-range case (i.e., B/A = about 1/20 or 20).
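
A rough Python sketch of this range handling, together with the symmetry property mentioned earlier. The clamping helper and its names are assumptions for the illustration; the constants are those of the old formula.

```python
def clamped_ratio(b_over_a, low=1 / 20, high=20):
    """Keep B/A inside the range where the formula is directly applicable."""
    return min(max(b_over_a, low), high)


def ack_shares(b_over_a, exponent=0.8, base=0.0558):
    """Return (X/A, Y/B) as fractions, using the nearest in-range B/A."""
    k = clamped_ratio(b_over_a)
    return base * k ** exponent, base * k ** -exponent


# Symmetry: B/A = 1/4 behaves like B/A = 4 in reverse orientation.
x, y = ack_shares(4)
x_rev, y_rev = ack_shares(1 / 4)
print(abs(x - y_rev) < 1e-12, abs(y - x_rev) < 1e-12)  # True True

# Out-of-range ratios fall back to the nearest in-range case.
print(ack_shares(100) == ack_shares(20))  # True
```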

[Attached figures: qwanack.gif, qwanack-old.gif]

eri--

You may use it this way (to compute X directly), of course. When posted, it was meant to compute the ratio X/A from the known ratio B/A, however.

I know I can use it this way; the rules of math allow it.
Furthermore, I came to the conclusion that, with a +-10% tolerance, I can simplify it to
log(X) = 0.8 * log(A), so it will be mostly OK for most users and removes the dependency on the other side.

One thing I noticed about ALTQ_HFSC is that, for a realtime queue, the parameter m1 is just the burst/bandwidth that the scheduler will guarantee for time d.
So it usually comes down to m1 = the packet size of the application, d = the latency you want guaranteed for it, and m2 = the bandwidth you want it to be assured of in the long run.

The last paragraph is just to make this complete.

dusan

@eri--:

Furthermore, i arrived to a conclusion that with a +-10% tolerance i can simplify it to
log(X) = 0.8 * log(A) so it will be mostly OK for most of users and remove the dependency on the other side.

I don't think that it can be simplified that way. X in the original formula depends on both A and B, while X in the simplified formula depends on A only. Thus the simplified formula is not an approximation of the original (the two are completely different and not comparable), and we cannot talk about any tolerance. I doubt such a simplification would be helpful.

eri--

Formally they are very different.
Practically, with that tolerance the values resemble the original pretty closely for the set of bandwidths home users have.

dusan

That's an unfounded argument.

Assume, as an example, A = 1 [Mb/s]. Using the simplified formula we would obtain X = 1 [Mb/s] regardless of the downlink (B).

Using the original formula, with several possible downlinks B = 1, 2, 5, 10, 20 [Mb/s] we have X = 0.0558, 0.0972, 0.2022, 0.3521, 0.6130 [Mb/s], respectively.

The relative error of the simplified formula against the original is 1692%, 929%, 395%, 184%, 63%, respectively. So the simplified formula is nowhere near a 10% tolerance.

The ratio of X in the last case (B=20) to X in the first case (B=1) is 0.6130/0.0558 = 1099%. This is big enough to see that no single value of X suits both cases and, consequently, no formula that ignores B can stay within a small tolerance.
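
These error figures are easy to re-check; a short Python sketch using the X values listed above (variable names are illustrative):

```python
# X from the original formula for A = 1 and B = 1, 2, 5, 10, 20 (same unit as A)
original = [0.0558, 0.0972, 0.2022, 0.3521, 0.6130]
simplified = 1.0  # log(X) = 0.8 * log(A) with A = 1 gives X = 1

errors = [round(100 * (simplified - x) / x) for x in original]
print(errors)  # [1692, 929, 395, 184, 63]

print(round(100 * original[-1] / original[0]))  # 1099
```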

Nostradamus

Hmm… I don't understand a shit, hehe hehe

I would like to set up the traffic shaper with max 25/10 Mbit for all IPs in the range 192.168.100.100-192.168.100.111, with full speed for all protocols, P2P and everything. And give the users in the range 192.168.100.112-192.168.100.125 max 10/5 Mbit, with low priority for P2P and other traffic. But I still want 100 Mbit speed on the LAN (local network; I have a server).
Please don't come with those formulas, because I don't understand them :) I need pictures, hehe

Cheers