Traffic shaper changes [90% completed, please send money to complete bounty]

eri--

@slicknetaaron2:

Hi Ermal,
Thanks so much for taking the time to further explain the shaper. It helps a lot. In my ongoing quest for thorough understanding of the shaper, I would like to confirm my understanding with you and ask a few more clarifying questions. With this, I will hopefully be able to support others and write a tutorial.

I said it is somewhat difficult for a not knowledgeable person to gain thorough understanding afaik.

1. Where the queues are located: Download queue limits go on the LAN side because you do not want to limit the packets coming in from the ISP. We just gotta take them as we get them. Upload limits go on the WAN interface to reorder and shape traffic going OUT to the WAN from all combined LAN interfaces.

It is just the way ALTQ works.

2. It looks like the wizard defaults to HFSC. Could we figure out a way to make editing the wizard settings friendlier to the user? Somehow hide the complexity of HFSC, but offer the benefits in the background? Maybe split the regular queue config into a Basic and an Advanced view? And explain how the queue that we are editing will interact with other queues?

What do you find unfriendly in there?
It does not default to HFSC; that just happens to be the first value in there, and it preserves compatibility since it was the only thing you had on 1.2.
I only ask for connection parameters and some schedulers to apply per interface. What do you find advanced in there?!

@ermal:

E.g. I have VoIP traffic that uses the UDP protocol with packet sizes of 1.2Kbit, which needs a delay of 30ms to feel like a normal phone call.
But I also want a hard limit, 64Kb, on all the bandwidth that VoIP traffic consumes on my network.

What does packet length of 1.2kb have to do with the shaper (realtime m1)? Isn't the shaper looking at bandwidth per second, not packet length?

My understanding of VoIP (SIP in particular) is that there is messaging and call setup on 1 port (5060) and 2 UDP ports used for the actual audio. A typical bandwidth is 96kbps per call (for the most common encoder). I have also read that several users need to have a burst of more than 96kbps (say 128kbps) for the first 5-10 seconds of the call. So I would think that if there is 1 phone on the network, m1=128kb d=10000 m2=100kb. That is my understanding of m1, d and m2: burst speed (m1) for (d) ms and then limit to (m2) for the remainder of the connection. I do not understand where 1.2kb comes from for 30ms. 1.2kb is much less than the required 128kbps at the beginning of a call.

(I will not go into detail why, since it is a very deep discussion.) Take it or leave it.
Or better, prove me wrong after you test it ;).
Follow this link for more discussion: http://forum.pfsense.org/index.php/topic,2484.0.html
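
For readers following the m1/d/m2 exchange, here is a minimal sketch of the two-piece linear service curve as it is usually described for HFSC: slope m1 for the first d milliseconds after the queue becomes backlogged, slope m2 afterwards. The function name is hypothetical, and the sketch assumes Kb means 1000 bits per second. It only illustrates the shape of the curve with the values proposed above; it is not the ALTQ implementation.

```python
def service_bits(t_ms, m1_bps, d_ms, m2_bps):
    """Cumulative bits the curve allows t_ms after the queue becomes backlogged.

    Two-piece linear curve: slope m1 up to d, slope m2 after d.
    """
    if t_ms <= d_ms:
        return m1_bps * t_ms / 1000
    return m1_bps * d_ms / 1000 + m2_bps * (t_ms - d_ms) / 1000


# Values proposed in the post above: m1=128Kb, d=10000 ms, m2=100Kb.
M1, D, M2 = 128_000, 10_000, 100_000

for t in (5_000, 10_000, 20_000):
    print(t, "ms ->", service_bits(t, M1, D, M2), "bits")
```

Setting m1 equal to m2 in this sketch gives a single straight line, which is the case debated later in the thread.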

3. Do the m1, d, m2 parameters operate on a PER-SESSION environment? ie. I pick up the phone and it will activate m1, d, m2. Next time I need the phone m1 starts over again? What happens in the case of 2 phones or 10 phones or when you can't know how many phones there are?

m1 and d are per packet. m2 is global.
They can be thought of as per session, since if you have 4 phones they send traffic at the same rate.
They all have the same delay, so packets for each phone will be scheduled in a round-robin manner, which is approximately the same as a session.
What would be ideal is to create a queue for each phone and give the exact parameters to each queue.
Then you would have perfect/exact per session tracking but even with one queue you would have pretty much the same result.

4. And how does m1, d, m2 work for a dynamic bandwidth WAN queue? When does m1 go into effect? With new sessions? hmm.. I'm hoping so! I think I am beginning to see the power of HFSC!

They scale accordingly if you have not set hard numbers in there.

@ermal:

Now there are three such schedulers in HFSC. Realtime, Linkshare, Upperlimit.
Realtime is the first scheduler that is run every time. Meaning if we are trying to send a packet the Realtime scheduler will be asked if it has one. After that the Linkshare scheduler takes the lead and if it exceeds some limits the Upperlimit one overrides its decision.
So getting back from theory, when the VoIP traffic above reaches the limit m2 it will be scheduled by the linkshare service curve till VoIP traffic gets back under m2 realtime limit. That's why you have to specify always the bandwidth parameter which is the same as specifying m2 parameter of linkshare.
When both bandwidth and linkshare m2 parameters are specified the m2 parameter is the one that prevails.

5. This is kind of confusing.. I think the terms might be mixed up? Here is what I am thinking:
a. RealTime tries to "grab" bandwidth to try to ie. guarantee a good VoIP call
b. Linkshare monitors RealTime to make sure he doesn't get out of hand for this queue's part of the bandwidth for the whole interface? This isn't quite clear to me..? Can we borrow bandwidth if it's not being used elsewhere? There is a note in the shaper that says "Linkshare overrides priority". Can you please explain that? I think we should only use priority?
c. UpperLimit is an Arbitrary maximum for a queue - no matter if we can borrow unused bandwidth or not?

A new packet needs to be transmitted on the wire.
We first ask the Realtime scheduler if it has something to transmit.
After that we ask the Linkshare scheduler, which cooperates with Upperlimit to follow the rules.
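
The order described above (Realtime first, then Linkshare constrained by Upperlimit) can be sketched roughly as follows. This is a simplified illustration, not the ALTQ code: the `Queue` fields (`rt_eligible`, `rt_deadline`, `ls_vtime`, `ul_fit`) and the `pick_queue` function are hypothetical names chosen for the example, and the real scheduler keeps these values up to date from the service curves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Queue:
    name: str
    backlog: int                         # packets waiting in this queue
    rt_eligible: Optional[float] = None  # when realtime may send; None = no realtime curve
    rt_deadline: float = 0.0             # realtime deadline of the head packet
    ls_vtime: float = 0.0                # linkshare virtual time (smaller = further behind)
    ul_fit: float = 0.0                  # earliest time upperlimit allows sending


def pick_queue(queues: List[Queue], now: float) -> Optional[Queue]:
    ready = [q for q in queues if q.backlog > 0]

    # 1. Realtime is asked first: any eligible queue, earliest deadline wins.
    rt = [q for q in ready if q.rt_eligible is not None and q.rt_eligible <= now]
    if rt:
        return min(rt, key=lambda q: q.rt_deadline)

    # 2. Otherwise linkshare decides, skipping queues held back by upperlimit.
    ls = [q for q in ready if q.ul_fit <= now]
    if ls:
        return min(ls, key=lambda q: q.ls_vtime)
    return None


queues = [
    Queue("voip", backlog=1, rt_eligible=0.0, rt_deadline=0.03, ls_vtime=5.0),
    Queue("web", backlog=3, ls_vtime=1.0),
    Queue("p2p", backlog=9, ls_vtime=0.5, ul_fit=2.0),
]
print(pick_queue(queues, now=1.0).name)
```

In this toy setup the VoIP queue is served while it is realtime-eligible; once it is empty, web is chosen over p2p until the p2p upperlimit allows it to send again.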

6. What do you mean by: "you have to specify always the bandwidth parameter which is the same as specifying m2 parameter of linkshare." Which bandwidth parameter are you referring to?

If you click "Add new queue" on top of the form there is a bandwidth parameter and that is what i refer to as "bandwidth parameter".

I'm going to head over to wikipedia to try to understand this more as well.

Good luck you need it :).

@ermal:

I will explain some things but you have to wait for the next update to actually try to configure it.

Do you have an ETA for the update? I just want to decide if I should put 1.2 back on my box and reinstall pfSense onto my network, or if it will be a day or 2 and I can just wait with my network without pfSense for a bit longer.

Default rule & Anti-lockout: Is there a way you can script to change these rules, or give a message to the user that they need to do this?

Thanks for your time!
Aaron

Probably tomorrow.

Ermal

eri--

@slicknetaaron2:

Hi Ermal,

Thanks again for the reply. I apologize, I made a couple errors and did not mean to offend.

@ermal:

I said it is somewhat difficult for a not knowledgeable person to gain thorough understanding afaik.

I was not knowledgeable about hfsc and altq, but to say that I am not knowledgeable and not able to gain thorough understanding… that's just not very nice! :) I am incredibly knowledgeable, just not in this particular area, yet. After spending some time researching last night I am well on my way to thorough understanding and the ability to explain to others how it works. I certainly do not have the knowledge and development skills you possess, but I would like to contribute to the project.

It sounded bad, but I didn't mean what you understood.
It simply means that without reading a lot you would have a hard time with it.
BTW, read the original HFSC paper to understand more.

What do you find unfriendly in there?
It does not default to HFSC; that just happens to be the first value in there, and it preserves compatibility since it was the only thing you had on 1.2.
I only ask for connection parameters and some schedulers to apply per interface. What do you find advanced in there?!

I apologize, I did not mean that portion of the wizard. That portion is not advanced at all. After reading about hfsc, I totally understand why the queue GUI is designed as it is. However, trying to figure out what conn0 and conn1 mean, and the "number of connections" questions, are very counterintuitive. Is it possible to clear up the descriptions (labels) to ask for the number of local and WAN connections? It seems that on at least 1-2 of the wizards, when I enter "2" for the number of local connections, the next screen will not even let me select my LAN port, and bugs like that. I am not the only one who had trouble with that (from responses in this thread).

Yeah i will fix the labels!

@ermal:

E.g. I have VoIP traffic that uses the UDP protocol with packet sizes of 1.2Kbit, which needs a delay of 30ms to feel like a normal phone call.
But I also want a hard limit, 64Kb, on all the bandwidth that VoIP traffic consumes on my network.

(I will not go into detail why, since it is a very deep discussion.) Take it or leave it.
Or better, prove me wrong after you test it ;).
Follow this link for more discussion: http://forum.pfsense.org/index.php/topic,2484.0.html

I remember reading a thread about VoIP service curve settings. It looks like you were very active in that, and suggested almost exact service queue as I suggested. See here:
http://forum.pfsense.org/index.php/topic,7502.msg42693.html#msg42693

After spending several hours last night reading on hfsc, it is also invalid to have a realtime service curve that is concave. m1 must be higher than m2.
In the same thread linked above, you were telling people to set m1=m2. That is not a curve but a straight line, and is redundant: not specifying m1 and d will have the same effect. Lastly, there is never a mention of packet size for any of the altq schedulers, as you are suggesting for the m1 value of the VoIP queue. Plus, isn't it impossible to have packet sizes of 125kb as listed in that same post?

Well, you cannot really configure a concave (or is it convex?) service curve in HFSC, since the starting point of the second curve is in the first service curve.

3. Do the m1, d, m2 parameters operate on a PER-SESSION environment?

m1 and d are per packet. m2 is global.

In my research, I found that the service curve is basically applied during "link congestion" only. Otherwise the scheduler is not doing much. The service curve value of m1 is not a packet size, but the total bandwidth used by the queue without regard for packet size. If m1 were a packet size and m2 global, wouldn't they be different variables instead of the same variable at different time spans?

Yeah, every discipline is non-work-conserving in ALTQ. Does it need not to be?!
Though if you want the discipline to behave as congested, take a look at the tbrconfig/tbrsize parameter.
On high-speed links it might even help to lower it from what ALTQ/pf calculates automatically, so the discipline acts properly.
Actually m1 and m2 are different parameters since they define different service curves.
I can use it as packet size since i know the details as:
m1 * d converts to bytes approximately ;). Anyway long discussion but you can configure m1 < m2 with this shaper since i patched ALTQ/pf to allow that.
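
The "m1 * d converts to bytes" remark can be made concrete with a little arithmetic. The sketch below assumes m1 is in bits per second and d in milliseconds, as in the values quoted in this thread; the function names are hypothetical. It shows that a 1.2 Kbit packet that must go out within 30 ms corresponds to a rate of 40 Kbit/s, and the reverse conversion from (m1, d) back to bytes.

```python
def m1_for_packet(packet_bits, d_ms):
    """Rate (bits/s) needed to send packet_bits within d_ms milliseconds."""
    return packet_bits * 1000 / d_ms


def bytes_in_window(m1_bps, d_ms):
    """Amount of data (bytes) that m1 allows during the first d_ms milliseconds."""
    return m1_bps * d_ms / 1000 / 8


# The VoIP example from the thread: 1.2 Kbit packets, 30 ms delay.
rate = m1_for_packet(1200, 30)
print(rate)                       # bits per second
print(bytes_in_window(rate, 30))  # bytes, i.e. one 1.2 Kbit packet
```

Read this way, m1 and d together describe how quickly one packet must be served, while m2 describes the long-term rate of the queue.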

4. And how does m1, d, m2 work for a dynamic bandwidth WAN queue? When does m1 go into effect? With new sessions? hmm.. I'm hoping so! I think I am beginning to see the power of HFSC!

@ermal:

They scale accordingly if you have not set hard numbers in there.

So what settings would I use if I have a WAN that will burst all the way up to about 15mb download but it's guaranteed 8mb down and upload burst to 3mb and guarantee 1mb? I am thinking set bandwidth to 15mb/3mb and then use one of the service curves (not sure which one yet) to m1=15mb d=30000 m2=8mb?
Nailing this will help a lot of Comcast or other cable customers that have bursts that they are not able to take advantage of with the standard shaper wizard. In fact, if you could put this as an option in the wizard all the better!

Well, I suggested it previously. Though you need to pass the duration of this bursting as the d parameter.
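
One way to estimate that duration is sketched below. It assumes the ISP grants the full burst rate for a fixed number of bytes before dropping to the guaranteed rate; that model, the 10 MB allowance and the function name are all assumptions for illustration and are not taken from this thread. Measure the real burst time on your own line before using a value for d.

```python
def burst_duration_ms(burst_bytes, burst_rate_bps):
    """Milliseconds it takes to use up burst_bytes at burst_rate_bps."""
    return burst_bytes * 8 * 1000 // burst_rate_bps


# Hypothetical: 10 MB burst allowance at the 15 Mbit/s burst rate discussed above.
d = burst_duration_ms(10_000_000, 15_000_000)
print(d)  # candidate value for d, in milliseconds
```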

As for m1 = m2 try it if you find any difference or not!

@ermal:

Good luck you need it :).

Nah, I'll just use my brain. I learn quickly.

I'm looking forward to the update today! Thanks so much for your hard work!

Good that's what i meant since the start :D.

Aaron

eri--

I have sent new links for the updated shaper to most of you.

The others will get a PM after an hour or so since there's a limit to how many PMs can be sent.

k3rmit

Thanks for the new update. However, once I installed it and followed through the revised (great, thanks) multi-LAN wizard, I got stuck at "Generating ALTQ queues…" on the filter reload page.

It's not going forward and cannot get back to the shaper page, i have this error:

Fatal error: Call to a member function on a non-object in /usr/local/www/firewall_shaper.php on line 321

Thanks for any help

albe

eri--

Can you please send me a copy of the <shaper> and <ezshaper> sections of config.xml?
Please also tell me what options you chose, since I tested it but could not get this error.

As for you, try deleting the <shaper> section and try again.

ridnhard19

When upgrading an embedded install the file is too big:

Enter the URL to the .tgz update file:
> <local ftp url>-upgrade-file.tgz

Fetching file size...

File size: 75129099

Fetching file...
looking up ***.***.***.***
connecting to ***.***.***.***:21
setting passive mode
opening data connection
initiating transfer
remote size / mtime: 75129099 / 1206885245

/: write failed, filesystem is full

fetch: /root/firmware.tgz: Inappropriate ioctl for device

Warning: filesize(): Stat failed for /root/firmware.tgz (errno=2 - No such file or directory) in /etc/rc.initial_firmware_update on line 58

File size mismatch.  Upgrade cancelled.

SlickNetAaron

@ridnhard19:

When upgrading an embedded install the file is too big:

Enter the URL to the .tgz update file:

This build seems to be a LOT larger than the last update?  40ish MB vs 70?

I'm downloading now.  Do you have that much storage available?  Will report back

Aaron

SlickNetAaron

I had the same result. The filesystem created on the card is smaller than the new shaper image. I have a 2GB card, but the file system does not use all of it.
Last half of the output:

remote size / mtime: 75129099 / 1206885245
/root/firmware.tgz 70% of 71 MB 146 kBps 02m29s
/: write failed, filesystem is full
/root/firmware.tgz 70% of 71 MB 146 kBps 02m29s
fetch: /root/firmware.tgz: No space left on device

Warning: filesize(): Stat failed for /root/firmware.tgz (errno=2 - No such file or directory) in /etc/rc.initial_firmware_update on line 58

File size mismatch. Upgrade cancelled.

Aaron

sullrich

Embedded upgrades are not supported and are not known to work all the time. See the release notes.

ridnhard19

@sullrich:

Embedded upgrades are not supported and are not known to work all the time. See the release notes.

Is it possible to roll a full/regular image for the embedded platform?

SlickNetAaron

This is true, but is the size of this image correct? 70mb? The previous embedded image was half that.

@sullrich:

Embedded upgrades are not supported and are not known to work all the time. See the release notes.

mikenl

@k3rmit:

Thanks for the new update. However, once I installed it and followed through the revised (great, thanks) multi-LAN wizard, I got stuck at "Generating ALTQ queues…" on the filter reload page.

It's not going forward and cannot get back to the shaper page, i have this error:
Fatal error: Call to a member function on a non-object in /usr/local/www/firewall_shaper.php on line 321
Thanks for any help

albe

@ermal:

Can you please send me a copy of the <shaper> and <ezshaper> sections of config.xml?
Please also tell me what options you chose, since I tested it but could not get this error.

As for you, try deleting the <shaper> section and try again.

I'm experiencing the same problem.
Removing the <shaper> section from /cf/conf/config.xml doesn't help; I also tried deleting the <ezshaper> bit.
I tried the single WAN multi-LAN wizard: HFSC, p2p catch-all, prioritize HTTP and DNS. I believe that's it.
http://twentse-es.nl/shaper_config.xml

eri--

http://cvstrac.pfsense.com/chngview?cn=21849
Found the problem; it should not happen only with the Single LAN multi WAN wizard.

If you can't wait for the next build do your fixes accordingly it is not hard afaik.

SlickNetAaron

I just reflashed with the image provided. It was labeled "upgrade" and now ALIX reports there is no boot disk. I imagine that since the image was labeled "upgrade" that we cannot flash this image? So how do we get this to go since upgrades are not supported?

I've had my network torn apart for 5 days waiting for a working shaper. I need to wrap this up.

@SlickNetAaron:

This is true, but is the size of this image correct? 70mb? The previous embedded image was half that.

@sullrich:

Embedded upgrades are not supported and are not known to work all the time. See the release notes.

sullrich

Embedded upgrades are not supported at all.

SlickNetAaron

@sullrich:

Embedded upgrades are not supported at all.

I understand that.

But the image that is provided by ermal is labeled "embedded upgrade". I flashed it (NOT using the upgrade process) and the image is invalid.

So the dev is ONLY giving us an embedded upgrade, which isn't working. And if we flash the image, it is not bootable.

Do you see the problem?

Aaron

hoba

It's a custom update file I think. You should feed it as such to the webgui. It's only labeled that way so the webgui accepts it as upgrade I think. Ermal has to comment on this.

SlickNetAaron

@hoba:

It's a custom update file I think. You should feed it as such to the webgui. It's only labeled that way so the webgui accepts it as upgrade I think. Ermal has to comment on this.

The Web GUI will NOT accept it. Option 13 on the console fails - as described above by myself and someone else.

This image is 2x the normal size?? I don't think the image was built correctly! And we only have the "upgrade" image, with no full install image provided.

Aaron

mikenl

@ermal:

http://cvstrac.pfsense.com/chngview?cn=21849
Found the problem; it should not happen only with the Single LAN multi WAN wizard.

If you can't wait for the next build do your fixes accordingly it is not hard afaik.

Fixed it indeed, thanks.

eri--

SlickNetAaron, I am building it. Check the link I gave; I will update it there.

You will notice from the date.