Calculating the required bandwidth for ACK queues on an asymmetric link

dusan

Thank you for your data.

Based on the "400 kbit/s ack only traffic to download with 16 mbit/s", I added another pattern, "FTP client", which is (epsilon, 1, 0.025, epsilon/5), where epsilon is somewhat arbitrarily chosen (I let it be 2.5E-4).
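For readers following along, the arithmetic behind the 0.025 can be written out as below. This assumes the pattern is read in the (c, d, x, y) order defined later in the thread, so that the third component is the qWANack coefficient and the second is the qLANdef coefficient; the index 7 is the one assigned to "FTP client" further down.

```latex
\[
  \frac{x_7}{d_7} = \frac{400\ \text{kbit/s}}{16\ \text{Mbit/s}} = \frac{0.4}{16} = 0.025,
  \qquad
  c_7 = \varepsilon = 2.5\times10^{-4},
  \qquad
  y_7 = \frac{\varepsilon}{5} = 5\times10^{-5}.
\]
```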

I used them in another example calculation:

K=20, A=0.8, B=16 [Mbit/s].
"p2p upload" and "p2p download" are constrained by an upper limit of 10%.
"other bulk upload" and "other bulk download" are constrained by an upper limit of 20%.
"web server" and "web surf" are excluded (i.e. upper limited by 0).

Let v7 be the "activity" variable for FTP client (unconstrained).

Given that, the WAN ACK utilization reached its maximum of 60.59% (while the LAN ACK utilization reached 0.34%, which isn't its maximum) at (v1 = 1.6; v3 = 2.483; v7 = 11.863), with both links saturated.

As noted, my traffic patterns are only illustrative ("bulk upload" and "bulk download" were based on audio streaming). To be accurate, you would collect your own FTP download traffic pattern, say, from the values pftop shows on its "Queues" screen in the "P/S" (packets/second) and "B/S" (bytes/second) columns for qWANack, qLANack, qWANothers, qLANothers, qFTPup and qFTPdown while running (almost) solely FTP at full load.

Here qWANothers (qLANothers) is the sum of all uplink (downlink) queues except qWANack (qLANack), and qFTPup and qFTPdown are the queues containing FTP traffic.
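A minimal sketch of how such readings could be turned into a pattern vector. The function name, the dictionary layout and the sample numbers are all made up for illustration (the numbers were chosen so that they reproduce the "FTP client" pattern above), and scaling by the larger of c and d is my assumption about how the patterns are normalized.

```python
def pattern_from_readings(uplink, downlink):
    """Turn per-queue B/S readings into a (c, d, x, y) pattern.

    uplink / downlink map queue names to bytes/second, as read off pftop.
    The result is scaled so that the larger of c and d equals 1 (assumption).
    """
    x = uplink.get("qWANack", 0.0)
    y = downlink.get("qLANack", 0.0)
    # qWANothers / qLANothers: every queue on the link except the ACK queue.
    c = sum(rate for name, rate in uplink.items() if name != "qWANack")
    d = sum(rate for name, rate in downlink.items() if name != "qLANack")
    scale = max(c, d)
    if scale <= 0:
        raise ValueError("no payload traffic observed")
    return tuple(value / scale for value in (c, d, x, y))


# Hypothetical readings taken while running (almost) only an FTP download.
uplink = {"qWANack": 50_000, "qFTPup": 500, "qWANdef": 0}
downlink = {"qLANack": 100, "qFTPdown": 2_000_000, "qLANdef": 0}
ftp_pattern = pattern_from_readings(uplink, downlink)
print(ftp_pattern)
```

Averaging several readings before normalizing would probably give a steadier pattern than a single snapshot.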

hoba

If you want to calculate a traffic shaping setup for my connection and advise me how to configure it, or even send me the traffic shaper part of the config.xml (you can back up only this part from your pfSense), I can test it for you and report back how well these settings work. If we get some good results we can start integrating your formulas into the wizard ;D

dusan

As you can see, even with not-so-cool traffic patterns, my first attempt at an open formula (post #1) fails. For now, I can only offer closed formulas, which must be solved by linear programming solvers.

Linear programming based on traffic patterns cannot answer the "What should I do" question, so it cannot recommend any optimal traffic shaping setup. It can only answer the "What if" questions. For example:

What maximum percentage would qWANack reach if I run p2p limited to 80% of the bandwidth, surf the Web and check email at no more than 50% of the bandwidth, and download over FTP without limit?
Isn't that maximum an overestimate if I only want to use 90% of my downlink?

Note that if we want to predict the behavior with respect to linkshare setup, we must constrain not only the variables v_i but also ratios between some of them. Then the problem goes beyond the scope of linear programming. However, one can still analyze it with the same "traffic pattern" approach, using a non-linear solver (MS Excel, for example).

Can we still develop an (empirical) open formula? Yes, we can, but much work must be done before any second attempt:

1. Collect more realistic traffic patterns.

2. Select several of them as representatives.

3. Analyze and estimate the maximum bandwidth of qWANack (maybe qLANack too) for K=2, 4, 8, 16, 20, 24 and 32 (if any).

4. Let people try them out. Listen to feedback.

If no problems are reported, build the formula and, if desired, integrate it into the wizard.
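To give a feel for step 3, here is a rough sketch for the simplest possible case: a single traffic type (the "FTP client" pattern from above) running flat out. It assumes K means B/A, as the K=20, A=0.8, B=16 example suggests, and that the activity is capped by whichever link saturates first. It is not the full multi-pattern maximum, and the function name is made up.

```python
FTP_CLIENT = (2.5e-4, 1.0, 0.025, 2.5e-4 / 5)  # (c, d, x, y) with epsilon = 2.5E-4


def ack_share_single_pattern(K, pattern=FTP_CLIENT):
    """Share of the uplink used by qWANack when only one traffic type runs.

    Works in units of A (so A = 1 and B = K); the result does not depend on A.
    """
    c, d, x, y = pattern
    # Activity is limited by the uplink (1 / a) or the downlink (K / b).
    v = min(1.0 / (c + x), K / (d + y))
    return x * v


for K in (2, 4, 8, 16, 20, 24, 32):
    print(f"K={K:>2}: qWANack share = {ack_share_single_pattern(K):.1%}")
```

Mixing in other traffic types can push the qWANack share higher than any single pattern does, which is why the representative patterns in step 2 matter.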

–--------------------
Hoba,

Please PM me your email address and I'll send you my spreadsheet. (You'll need MS Windows and Excel.)

sullrich

What about a userland daemon that watches traffic and dynamically alters the HFSC profile as traffic patterns change?

Please send me a copy of your spreadsheet as well (sullrich@gmail.com).

Thanks!

dusan

Adaptive traffic shaping? That would be great. There is software named QoSbox that changes linkshare dynamically. Maybe you'll find it helpful. For more details, refer to http://www.cs.virginia.edu/~mngroup/projects/qosbox/docs.html.

In the context of ACK queues in HFSC, however, I don't think it's such a hot topic, because it's almost surely harmless to assign some large value (say, 60%) to the qWANack linkshare or to the m1 of its realtime curve. If that bandwidth is needed, it's used. Otherwise it is allocated to other traffic.

Just my opinion.

–----
to hoba, sullrich: spreadsheet sent.

sullrich

Sounds great. You seem to know a lot more about HFSC than we do. Want to adopt our code? :)

sullrich

Attaching the spreadsheet mentioned prior in this thread.

qWanAck.xls

david nordin

the outcome of this topic sounds really interesting and potentially very useful :)
best of luck

unreal1024

Hi, please help me! What percentage should I set for LAN ack and WAN ack on a DSL line with 512 Kbps upload and 8192 Kbps download (an asymmetric line)?
Many thanks!

sullrich

@unreal1024:

Hi, please help me! What percentage should I set for LAN ack and WAN ack on a DSL line with 512 Kbps upload and 8192 Kbps download (an asymmetric line)?
Many thanks!

Download the Excel document and plug your values in.

unreal1024

Where do I find the cells for LAN and WAN ack? I don't understand this sheet :-(
Thanks!

ack.JPG_thumb

sullrich

A (UP)
B (DOWN)

dusan

Then

click Tools/Solver…
click Set Target Cell, click R15C12 (or R16C12), click Max, click Solve
click Keep Solver Solution, click OK
The required X [kb/s or mb/s] is shown in R15C12. The required X/A [%] is shown in R16C12.

eri--

dusan, can you please explain the rationale behind this in a formal way?

I want to integrate this into the shaper wizard, and the Excel sheet is not easily readable/understandable.

Thanks in advance.

dusan

Well, here's an explanation which, I hope, is more detailed and formal.

All symbols are defined as before, but we summarize them here:

A = bandwidth of qWANroot
B = bandwidth of qLANroot
C = bandwidth of qWANdef
D = bandwidth of qLANdef
X = bandwidth of qWANack
Y = bandwidth of qLANack

In this model, we make use of no other queues:

A = C + X
B = D + Y

(To be exact, we know A and B and don't know C, D, X or Y. The analysis shows how much of qWANack and qLANack could actually be utilized, assuming they may be as large as needed, i.e. bounded by nothing other than A and B, respectively. That utilization is taken as the required value for X and Y, and the rest of A and B simply becomes C and D, respectively.)

Consider a single, i-th, traffic. The traffic varies in time and utilizes the four queues qWANdef, qLANdef, qWANack and qLANack at the same time. However, assuming that for this traffic the four queue utilization "amounts" are directly related (rather than independent) through constant coefficients (c_i, d_i, x_i, y_i), we represent the traffic "activity" as a single variable (v_i) rather than four. For example, for i=5 (Web surf traffic), the queue utilization "amounts" are assumed to be (v5 * c5, v5 * d5, v5 * x5, v5 * y5) at all times. The vector (c5, d5, x5, y5) = (0.135, 1, 0.0375, 0.0125), which is assumed constant for every network regardless of its degree of asymmetry, is called the Web surf traffic pattern.

(The assumption is supported by experimental observation. At time t in network N, it was observed that the qWANdef, qLANdef, qWANack, qLANack utilization "amounts" are

0.135, 1, 0.0375, 0.0125 [kb/s], respectively

and at time t' in network N', it was observed that the four queue utilization "amounts" are

135, 1000, 37.5, 12.5 [kb/s], respectively.

The experiments were made with Web surf traffic only, with no other traffic involved, of course.)

The spreadsheet makes use of 8 traffic patterns, indexed 0 through 7.
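The proportionality assumption can be checked mechanically. The sketch below (function name and tolerances are my own choice) scales each observation by its largest component and compares the results; the two observations quoted above pass the check.

```python
import math


def same_pattern(obs_a, obs_b, rel_tol=1e-6):
    """True if two 4-tuples of queue rates differ only by a common scale factor."""
    scale_a, scale_b = max(obs_a), max(obs_b)
    if scale_a <= 0 or scale_b <= 0:
        return False  # nothing was observed, so there is no pattern to compare
    return all(
        math.isclose(a / scale_a, b / scale_b, rel_tol=rel_tol, abs_tol=1e-12)
        for a, b in zip(obs_a, obs_b)
    )


network_n = (0.135, 1, 0.0375, 0.0125)   # qWANdef, qLANdef, qWANack, qLANack at time t
network_n2 = (135, 1000, 37.5, 12.5)     # the same queues at time t' in network N'
print(same_pattern(network_n, network_n2))
```

Real measurements will be noisy, so a much looser rel_tol than the default would be needed in practice.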

For every i in 0…7, let

a_i = c_i + x_i
b_i = d_i + y_i

At every time, the qWANdef utilization is

v0 * c0 + ... + v7 * c7

the qWANack utilization is

v0 * x0 + ... + v7 * x7

the qWANroot (i.e. uplink) utilization is the sum of the two above:

(v0 * c0 + ... + v7 * c7) + (v0 * x0 + ... + v7 * x7)
= (v0 * c0 + v0 * x0) + ... + (v7 * c7 + v7 * x7)
= v0 * (c0 + x0) + ... + v7 * (c7 + x7)
= v0 * a0 + ... + v7 * a7

the qLANdef utilization is

v0 * d0 + ... + v7 * d7

the qLANack utilization is

v0 * y0 + ... + v7 * y7

the qLANroot (i.e. downlink) utilization is the sum of the two above:

(v0 * d0 + ... + v7 * d7) + (v0 * y0 + ... + v7 * y7)
= (v0 * d0 + v0 * y0) + ... + (v7 * d7 + v7 * y7)
= v0 * (d0 + y0) + ... + v7 * (d7 + y7)
= v0 * b0 + ... + v7 * b7
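As a small illustration of these sums, the sketch below computes all six utilizations from an activity vector. Only the two patterns whose numbers appear in this thread are included (Web surf and FTP client); the activity values in the usage line are arbitrary, and the names are mine.

```python
# (c_i, d_i, x_i, y_i) for the two patterns quoted in this thread.
PATTERNS = {
    "web surf": (0.135, 1.0, 0.0375, 0.0125),
    "FTP client": (2.5e-4, 1.0, 0.025, 2.5e-4 / 5),
}


def queue_utilization(activity, patterns=PATTERNS):
    """Sum v_i * (c_i, d_i, x_i, y_i) over all traffic types in `activity`."""
    totals = [0.0, 0.0, 0.0, 0.0]
    for name, v in activity.items():
        for k, coeff in enumerate(patterns[name]):
            totals[k] += v * coeff
    wan_def, lan_def, wan_ack, lan_ack = totals
    return {
        "qWANdef": wan_def,
        "qLANdef": lan_def,
        "qWANack": wan_ack,
        "qLANack": lan_ack,
        "uplink": wan_def + wan_ack,      # equals the sum of v_i * a_i
        "downlink": lan_def + lan_ack,    # equals the sum of v_i * b_i
    }


# Arbitrary example activities, in the same unit as A and B.
usage = queue_utilization({"web surf": 2.0, "FTP client": 10.0})
print(usage)
```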

Now we can construct a system of inequalities, each representing a constraint, in 8 unknowns v0 through v7.

Uplink utilization must not exceed uplink bandwidth:

(C1) v0 * a0 + ... + v7 * a7 <= A

Downlink utilization must not exceed downlink bandwidth:

(C2) v0 * b0 + ... + v7 * b7 <= B

All network traffic must be non-negative:

(C3) v_i >= 0

Additional constraints may be added. For example, if we know that p2p uploading is limited to 80% of the uplink bandwidth, then we may add the constraint:

(C4) v0 * c0 <= A * 0.8

Similarly, if we know that p2p downloading is limited to 80% of the downlink bandwidth, then we may add the constraint:

(C5) v1 * d1 <= B * 0.8

Bounds like A and B are to be filled in as values in row 3 of the Excel spreadsheet. Bounds like A * 0.8 and B * 0.8 are pre-filled as formulas in columns 8 and 9 of the sheet.

The MS Excel Solver finds a solution of (C1)-(C5) that maximizes a user-selected target cell. Note that we are concerned with the queue utilization implied by the solution, not the solution itself. Of particular interest may be the utilization of qWANack, qLANack, the uplink and the downlink. The spreadsheet includes formulas for them. We can select one of them as the target and observe the others as the implied consequence.

As I've said, the Excel sheet is for analysis purposes. As such, it is neither suitable for nor worth integrating into the Wizard.
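Written compactly, and taking the qWANack utilization as the target (only one of the possible target cells), the problem reads as follows. This is merely a restatement of (C1)-(C5); the 0.8 factors are the example limits used above.

```latex
\[
\begin{aligned}
\text{maximize}\quad   & \sum_{i=0}^{7} v_i\,x_i \\
\text{subject to}\quad & \sum_{i=0}^{7} v_i\,a_i \le A            && \text{(C1)} \\
                       & \sum_{i=0}^{7} v_i\,b_i \le B            && \text{(C2)} \\
                       & v_i \ge 0, \quad i = 0,\dots,7           && \text{(C3)} \\
                       & v_0\,c_0 \le 0.8\,A                      && \text{(C4)} \\
                       & v_1\,d_1 \le 0.8\,B                      && \text{(C5)}
\end{aligned}
\qquad\text{where } a_i = c_i + x_i,\; b_i = d_i + y_i .
\]
```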
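For anyone without Excel, here is a toy version in plain Python. It is not what the Solver does internally: it handles only two traffic types (the Web surf and FTP client patterns quoted in this thread), uses only (C1)-(C3), and simply tries every intersection of two constraint lines, keeping the feasible one with the largest qWANack utilization. All names are mine.

```python
from itertools import combinations

WEB = (0.135, 1.0, 0.0375, 0.0125)        # (c5, d5, x5, y5)
FTP = (2.5e-4, 1.0, 0.025, 2.5e-4 / 5)    # (c7, d7, x7, y7)


def max_wan_ack(A, B, p=WEB, q=FTP):
    """Maximize qWANack over two activities (v_p, v_q) subject to (C1)-(C3).

    Returns (qWANack utilization, v_p, v_q) at the best vertex.
    """
    # Each entry (kp, kq, bound) means kp * v_p + kq * v_q <= bound.
    constraints = [
        (p[0] + p[2], q[0] + q[2], A),   # (C1) uplink
        (p[1] + p[3], q[1] + q[3], B),   # (C2) downlink
        (-1.0, 0.0, 0.0),                # (C3) v_p >= 0
        (0.0, -1.0, 0.0),                # (C3) v_q >= 0
    ]
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel lines, no vertex
        vp = (r1 * b2 - r2 * b1) / det   # Cramer's rule
        vq = (a1 * r2 - a2 * r1) / det
        feasible = all(kp * vp + kq * vq <= r + 1e-9 for kp, kq, r in constraints)
        if feasible:
            ack = p[2] * vp + q[2] * vq
            if best is None or ack > best[0]:
                best = (ack, vp, vq)
    return best


ack, v_web, v_ftp = max_wan_ack(A=0.8, B=16.0)
print(f"qWANack = {ack:.4f} ({ack / 0.8:.1%} of A) at web={v_web:.3f}, ftp={v_ftp:.3f}")
```

For A = 0.8 and B = 16 the maximum lands at the vertex where both links are saturated, with qWANack at roughly 54% of the uplink. That covers two patterns only, so it is not expected to match the figures from the full eight-pattern spreadsheet.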

eri--

A = interface/token bucket bandwidth
c = observed from the samples.
x = observed from the samples.

So you are saying that the basic equation for a queue is:
A >= c*x (if we have a single queue under A).

and it transforms to
A >= sum(c_i* x_i) (for i queues).
and each queue gets c_i + x_i

If that is right, and one wants to write a daemon that starts from the constants calculated/observed in your tests and samples traffic to adjust the i queues accordingly, how can one calculate the c_i/x_i to be used later in a new calculation?
Basically, can you also provide the calculation for the variables so one can write such a daemon?

I hope I have understood your explanation. The rationale for integrating this with the wizard is that it would have to be coupled with such a daemon to make sense.

By the way thanks for your quick reply.

dusan

@eri--:

A = interface/token bucket bandwidth
c = observed from the samples.
x = observed from the samples.

So you are saying that the basic equation for a queue is:
A >= c*x (if we have a single queue under A).

I am not sure we are using the same symbol definitions. My equations don't say anything about c*x, only c+x.
And given only one [sub-]queue under qWANroot, there exists either c or x, but not both.

@eri--:

and it transforms to
A >= sum(c_i* x_i) (for i queues).
and each queue gets c_i + x_i

i was defined as neither the number of queues nor the index of a queue. It was defined as the index of a traffic (or, to be precise, a type of traffic). We use four [leaf-level] queues and eight [types of] traffic. The 4 * 8 matrix is enough to estimate the largest qWANack utilization. You can extend it to, say, 14 * 500 if you see any benefit in the extension and have a good traffic analyzer (see below).

@eri--:

If that is right, and one wants to write a daemon that starts from the constants calculated/observed in your tests and samples traffic to adjust the i queues accordingly, how can one calculate the c_i/x_i to be used later in a new calculation?
Basically, can you also provide the calculation for the variables so one can write such a daemon?

I hope I have understood your explanation. The rationale for integrating this with the wizard is that it would have to be coupled with such a daemon to make sense.

As I understand it, such a daemon should

1) analyze traffic to determine its patterns, and
2) optimize the linkshare setup using the traffic patterns determined.

As for 1), with the firewall's services (at the transport and/or application layer) we can parse every packet and determine the traffic it belongs to. Packet classification would enable statistical analysis. Based on that analysis we can observe patterns, if any exist in any sense. So the problem is solved, in principle.

As for 2), however, one may ask: what is the objective, the goal, the target, or the criterion of such an optimization?

eri--

Heh i misunderstood you.

So what you're saying is that yours is just an approach based on values observed for specific traffic and a specific config?!

I thought it was some generalized scheme to achieve this.

For 1), I concur it is easily solvable.
2) is just providing adaptive shaping for the patterns/classes of traffic the user selects.

Think in terms of RSVP, which wants reserved traffic for a specific class.
What I wanted to accomplish was just pessimization or optimization of traffic on behalf of the consumer.
Say that if the web traffic increases by 10% and at the same time VoIP traffic does the same, we choose to serve VoIP better and pessimize the web traffic.

dusan

@eri--:

Heh i misunderstood you.

So what you're saying is that yours is just an approach based on values observed for specific traffic and a specific config?!

I thought it was some generalized scheme to achieve this.

Play with the spreadsheet, then try other traffic patterns, play with it again and draw your own conclusion.

@eri--:

For 1), I concur it is easily solvable.

Me too. I just said that it is solvable in principle.

@eri--:

2) is just providing adaptive shaping for the patterns/classes of traffic the user selects.

Think in terms of RSVP, which wants reserved traffic for a specific class.
What I wanted to accomplish was just pessimization or optimization of traffic on behalf of the consumer.
Say that if the web traffic increases by 10% and at the same time VoIP traffic does the same, we choose to serve VoIP better and pessimize the web traffic.

That's not an optimization. That's a static (and simple) policy. I see no need to adaptively change the linkshare or realtime service curves. Construct static curves preferring VoIP and pfSense will do the rest for you.

Btw, I don't think there could be a smart daemon that knows what the user wants. The user must express his/her needs, i.e. the shaping policy, in terms of service curves. That's the user's job, not the daemon's.

eri--

Yeah, but I cannot teach HFSC or CBQ to everybody in the forum, and they need some background to understand/control its behaviour. This daemon, configurable by the user, would at least make it easier for home users to get it right, and would get me some statistical data to generalize the configuration from a wizard.