Auto QoS fragmentation for VoIP over WAN - FreeBSD pfSense vs Cisco


  • In the open source world, we do not have good QoS for slow WAN links, because open source operating systems have no filter able to fragment data traffic in the presence of VoIP traffic. Even one of the best packet schedulers, HFSC, can't do a good job on slow links, because 1500-byte frames take too long to transmit.

    So data traffic fragmentation is mandatory when a slow link is shared by VoIP and data at the same time. The router needs to fragment data to a smaller MTU so that VoIP jitter can be lowered.

    Fragmenting the data traffic keeps VoIP jitter at a reasonable level.

    This is done automatically on Cisco routers when AutoQoS is activated.

    It would be very nice to have this in pfSense / FreeBSD. It would allow the open source world, for the first time, to get almost the same QoS quality level that Cisco has.

    And for the first time we could use medium-bandwidth DSL links for shared VoIP and data without compromising voice quality. This would be a big step toward full IP telephony for small companies and individuals.


  • Can you please be clear on what data fragmentation in VoIP is?

    With QoS you can give VoIP precedence, but what does it mean to 'fragment' it?


  • Ermal, auto fragmentation is a function of Cisco routers that is enabled when you activate AutoQoS.

    As you know, when surfing the web or downloading, you can use big frames, as big as 1500 bytes, or even more if the network setup accepts jumbo frames.

    The problem with VoIP is that even with the best QoS scheduler (pfSense's HFSC is one of them), you still have to wait for a 1500-byte TCP frame (HTTP or FTP, for example) to leave the WAN interface before you can actually transmit a VoIP frame.

    1500 bytes is a non-negligible amount of transmission time on a slow WAN link.

    As an example, on a 256 000 bps link it takes approximately 47 ms to transmit a 1500-byte frame:

    256 000 / 8 = 32 000 Bps

    1500 / 32 000 = 0.047 s = 47 ms
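
    The same arithmetic as a minimal Python sketch, for trying other frame sizes and link speeds. The function name is made up for illustration, and link-layer framing overhead is ignored:

```python
def serialization_delay_ms(frame_bytes: int, link_bps: int) -> float:
    """Time needed to clock one frame onto the wire, in milliseconds."""
    return frame_bytes * 8 * 1000 / link_bps


if __name__ == "__main__":
    # 1500-byte frame on a 256 000 bps link, as in the example above
    print(f"{serialization_delay_ms(1500, 256_000):.1f} ms")
```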

    Even the best QoS scheduler can do no more than wait for the data TCP frame to leave the WAN interface before transmitting a VoIP frame.

    The best it can do is try to minimize the time between VoIP frame transfers, like this:

    The long lines are data TCP traffic, the short ones VoIP UDP traffic.
    ___________________   ___  ___________________________   ___   _______________________    ____________________   ___

    As you can see, it is not possible to keep good timing for VoIP frames. Jitter will be somewhere between 10 and 50 ms or even more, which is significant and not supported by some VoIP hardware, especially at telecom providers, whose costly telephony switches cannot have their buffers upgraded. So you will get dropouts in the audio because of buffer underflow.

    The solution is to reduce the data MTU, or to fragment the data traffic when there is VoIP traffic, like this:

    _______  ___  _______  ___  _______  _______  ___  ________  ___

    Here you can see that VoIP jitter has been divided by a factor of two. The remaining jitter depends on how far you set the automatic MTU reduction for data traffic.
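
    The effect of a smaller data MTU can be sketched roughly as follows, using plain IPv4 fragmentation of one 1500-byte packet. The function names, the 20-byte header without options and the 256 kbps link are assumptions for illustration; this is not a description of how Cisco implements it:

```python
from typing import List

IP_HEADER = 20  # bytes, IPv4 header without options (assumption)


def fragment_sizes(packet_bytes: int, mtu: int) -> List[int]:
    """Sizes (header included) of the IPv4 fragments of one packet."""
    payload = packet_bytes - IP_HEADER
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + IP_HEADER)
        payload -= chunk
    return sizes


def worst_wait_ms(frame_bytes: int, link_bps: int) -> float:
    """Longest time a VoIP packet waits behind one frame already being sent."""
    return frame_bytes * 8 * 1000 / link_bps


if __name__ == "__main__":
    link_bps = 256_000  # hypothetical slow uplink
    for mtu in (1500, 750, 300):
        frags = fragment_sizes(1500, mtu)
        wait = worst_wait_ms(max(frags), link_bps)
        print(f"MTU {mtu}: {len(frags)} fragments, "
              f"{sum(frags)} bytes sent, worst wait {wait:.1f} ms")
```

    Halving the MTU roughly halves the worst-case wait, at the cost of one extra IP header per additional fragment.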

    This is done automatically inside Cisco routers, but is unsupported in the open source world. This explains why Cisco today has better VoIP quality over slow WAN links when data and VoIP must share the link. But this could change quickly, as soon as a low-level programmer implements it in FreeBSD and Linux.

    There is no reason today for multinational companies to get all the money when smart people and small companies can do better for less.

    Slow links (< 500 kbps) are much more sensitive to the problem. At 128 kbps it is not possible to use Linux or FreeBSD QoS to share a link between VoIP and data, and 500 kbps is generally the lowest usable speed for good VoIP quality with Linux / BSD QoS.

    This phenomenon is mostly unknown or misunderstood, but it is responsible for a large part of the bad quality we see on shared VoIP WAN links using Linux- or FreeBSD-based routers, which means almost all small office routers.

    Olivier.


  • I think this is already doable with ALTQ.
    Check the end of this thread, http://forum.pfsense.org/index.php/topic,15516.0.html, where I put a description of what a token bucket is (the tbr option on 2.0).
    Its purpose is what you mention: tweaking the rate at which packets are sent, which is what your graph is showing.
    So in simple words, the TBR option allows you to tweak the sending rates/sizes (the slots, as you call them), and the disciplines (HFSC, PRIQ) make sure that the packets you want at higher priority are sent during these slots.

    So from what I understand, this is present today in open source!

    I do agree that ALTQ is not that well documented for new users, but ….

    NOTE: The options I am talking about are only in the 2.0 snapshots of pfSense, since the 1.2 releases do not allow doing that from the GUI.


  • Ermal, even if the new ALTQ lets us dequeue packets dynamically, reducing the wait for a VoIP packet in the presence of 1500-byte data packets to a minimum of, say, one or two 1500-byte packets, it will not be possible to do more with ALTQ.

    Again, on a slow WAN, a VoIP packet must wait as long as, say, 20 - 150 ms, even when only one queued 1500-byte data packet remains to be transmitted.

    This is a large amount of jitter, especially when the data traffic is not fully uniform. Now imagine the jitter if the scheduler lets 2 or 3 data packets go out before a VoIP packet. That translates to 60 - 450 ms. Unacceptable.

    So the problem does not come from the scheduler, but really from the length of the data packets.

    Mathematically speaking, there is no other solution than reducing the data frame length to reduce VoIP jitter.

    (In fact there is another one: stop data traffic completely during VoIP calls :=) This is what we do when we have very important calls on the trunk!) A very effective solution :=)

    Seriously:

    To reduce the data frame length, the operating system needs to be able to fragment data traffic dynamically whenever VoIP traffic needs to be transmitted in a high-priority, low-jitter class.

    Is FreeBSD capable of doing this? I do not think so, and neither is Linux.

    I think the same problem shows up in games, with erratic and/or higher latency when the data traffic load is heavy on a slow WAN link.

    Take a Cisco router, activate VoIP AutoQoS on it, load its 250 kbps WAN uplink to 100 % (250 kbps) with HTTP traffic, use iperf to generate UDP traffic classified into the VoIP queue, and measure UDP jitter in the same uplink direction.

    You will see that it stays under acceptable levels (< 50 ms).

    Do the same thing with Linux or BSD and the best QoS rules and scheduler. You will see the jitter go over 100 ms, possibly over 200 - 300 ms or worse.

    Cisco not only dequeues data packets to make room for VoIP packets, but also reduces the data packet size (by fragmenting them) to allow for smaller jitter and lower latency.

    Without fragmentation, even with very high priority for VoIP packets, the best a scheduler can do is put one VoIP packet after one data packet. No more.

    Unfortunately this is not enough on slow WAN links, for two reasons:

    • The variation in data packet sizes produces jitter on VoIP traffic:

    see this example:

    small lines are VoIP packets
    longer ones are data packets of different sizes

    --- ------------------------- --- -------- --- ------------------------------------------- --- ---------------- ---

    you can clearly see that the time between VoIP packets varies greatly.

    • To allow more efficient load sharing between different kinds of traffic, the "one VoIP packet for one data packet" rule is not usable for scheduling. It works for a simple traffic scheme, but as soon as there is more than one call, mixed traffic bandwidth, or parallel high-priority traffic such as TCP ACKs, it does not work.
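
    The first point can be illustrated with a rough sketch. It models the strict "one VoIP packet, then one data packet" pattern from the diagram and assumes a VoIP packet is always waiting. The packet sizes, the link speed and the function name are hypothetical:

```python
def voip_gaps_ms(data_sizes, voip_bytes, link_bps):
    """Time between the starts of successive VoIP packets when each
    VoIP packet is followed by exactly one data packet."""
    return [(voip_bytes + d) * 8 * 1000 / link_bps for d in data_sizes]


if __name__ == "__main__":
    # Hypothetical values: 200-byte VoIP packets on a 256 kbps uplink,
    # with data packets of varying size sent in between.
    gaps = voip_gaps_ms([1500, 400, 1500, 800], 200, 256_000)
    print([round(g, 1) for g in gaps])
    print(f"spread between shortest and longest gap: {max(gaps) - min(gaps):.1f} ms")
```

    The spread between the shortest and the longest gap is the jitter that the varying data packet sizes add on their own, before any queueing effects.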

    So the conclusion is that, on slow WANs, there is no way to get good-quality VoIP other than fragmenting the data traffic in the presence of VoIP traffic.

    Cisco did it like this and I think it is the only way. They call it "link efficiency mechanisms such as link fragmentation and interleaving (LFI) and low latency queuing (LLQ)".

    At the ATM level it is easy to do LFI without touching the IP MTU. But in the IP world, reducing the MTU means fragmenting the TCP/UDP data traffic.
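
    As a rough sketch of how a fragment size could be chosen: pick a target serialization delay and derive the largest fragment that fits within it at the link speed. The 10 ms target below is an assumption (a commonly quoted figure for voice), and the function name is made up:

```python
def lfi_fragment_bytes(link_bps: int, target_delay_ms: int) -> int:
    """Largest fragment whose serialization delay stays within the target."""
    return link_bps * target_delay_ms // 8000


if __name__ == "__main__":
    for kbps in (64, 128, 256, 512, 768):
        size = lfi_fragment_bytes(kbps * 1000, 10)  # 10 ms target (assumption)
        print(f"{kbps} kbps -> {size} bytes")
```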

    See this white paper :

    http://www.cisco.com/en/US/tech/tk543/tk759/technologies_white_paper09186a00801348bc.shtml#wp39909

    or pdf version :

    http://www.cisco.com/warp/public/cc/pd/iosw/prodlit/autwp_wp.pdf

    Sometimes it can produce undesirable effects, because a receiving router or server cannot deal with fragmented traffic, but this is still better than having micro-cuts in VoIP calls, which most of the time give a disastrous image of the company.


  • I hate to be a newb here, but couldn't you just lower the MTU of the entire interface and accomplish basically the same thing?


  • There is even the possibility of doing this by enforcing the MSS for TCP, although not from the GUI yet.
    And it's not that it's undoable, but I guess on a point-to-point link the MTU is negotiable, so you can play with that. Otherwise I do not see anything special in that Cisco thing that cannot be done with FreeBSD, hence pfSense.


  • Statically reducing the interface MTU does work; I tried it. But it is not ideal: it means a permanent performance reduction for data.

    Reducing the MSS has the same problem, as it impacts TCP performance for no reason when there is no VoIP traffic.
    Moreover, it only shrinks TCP traffic, not UDP.

    If the MTU can be changed dynamically, that is better. We can then fragment only when necessary, but again, all the traffic is fragmented. I would say it would be better to fragment data traffic selectively, for the case where IAX is used in trunk mode (the best would be to put a flag on VoIP traffic to leave it unfragmented).

    But you are right, what Cisco does is not really complex; it just makes it happen. In fact it is a bit more complex and efficient with ATM, because I think that for IP over ATM with AAL5 adaptation, Cisco is able to do the LFI work (fragmentation and interleaving) at the ATM level, where the 53-byte ATM cell size is very well suited to the job. For this reason, we will never match ATM QoS efficiency on a slow IP WAN link.
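
    To put rough numbers on the cell-size argument, here is a minimal sketch. It assumes plain AAL5 (8-byte trailer, padding up to a multiple of 48 bytes), ignores any LLC/SNAP or PPP encapsulation, and assumes voice cells can be slotted in between data cells as described above. All names are illustrative:

```python
ATM_CELL = 53      # bytes on the wire per cell
ATM_PAYLOAD = 48   # payload bytes per cell
AAL5_TRAILER = 8   # bytes appended to each packet before padding


def aal5_cells(packet_bytes: int) -> int:
    """Number of ATM cells needed to carry one packet over AAL5."""
    total = packet_bytes + AAL5_TRAILER
    return -(-total // ATM_PAYLOAD)  # ceiling division


def cell_time_ms(link_bps: int) -> float:
    """Time to send one cell, i.e. the wait behind a cell already on the wire."""
    return ATM_CELL * 8 * 1000 / link_bps


if __name__ == "__main__":
    cells = aal5_cells(1500)
    print(cells, "cells,", cells * ATM_CELL, "bytes on the wire")
    print(f"{cell_time_ms(256_000):.2f} ms per cell at 256 kbps")
```

    Under these assumptions the blocking unit is a single cell rather than a whole 1500-byte packet, which is why interleaving at the ATM level keeps the wait so short.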

    I think these questions are important for the future of the VoIP world, as many customers, private consumers as well as small companies, cannot afford to rent a high-bandwidth SDSL, cable or fiber link.

    I would say that in our country (France), WAN quality is the biggest problem we have for full IP telephony. We cannot yet offer the same level of reliability and quality as traditional telephony links, not because of technical limitations, but because the market (especially in France) is dominated and closed off by a couple of big telecom providers.

    Having better QoS for WANs, better than what we have today on Linux boxes, could help change things.


  • If you think that will make things better for you, finance it.
    Otherwise it is not on my own priority list. It is quite doable in the same way Cisco's AutoQoS does it.

    My plans include adding WFQ for HFSC, meaning being able to have a queue for each call, but that is all I have planned in this regard.


  • Ermal,

    How much money needs to be raised to get this project high on your priority list?

    Thanks,

    GNB


  • Well, minimum 1.5K


  • For reference http://www.cs.virginia.edu/~mngroup/projects/qosbox/snooplet.html should do all that Cisco does.


  • What Cisco does is LFI, Link Fragmentation and Interleaving. AutoQoS in Cisco IOS enables it if the link speed is 768 Kbps or slower. In practice, it doesn't do much unless your link speed is 256 Kbps or slower, anything faster and the serialization delay isn't enough to make a considerable difference. Serialization delay on a 1500 byte frame at 256 Kbps is about 47 ms, so at 256 Kbps you're adding potentially as much as 47 ms of jitter (worst case scenario) to a VoIP frame that comes in immediately after a 1500 byte frame is sent. Even that generally wouldn't be enough to cause much or any quality degradation. At 128 Kbps or slower, it's a considerable benefit.
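
    The point about link speed can also be turned around: for a given tolerable blocking delay, how fast must the link be for a full-size frame to fit within it? A minimal sketch, with an illustrative function name and arbitrary example budgets:

```python
def min_link_bps(frame_bytes: int, max_delay_ms: float) -> float:
    """Slowest link on which one frame serializes within max_delay_ms."""
    return frame_bytes * 8 * 1000 / max_delay_ms


if __name__ == "__main__":
    for budget_ms in (10, 20, 47):  # example delay budgets (assumptions)
        kbps = min_link_bps(1500, budget_ms) / 1000
        print(f"{budget_ms} ms budget -> at least {kbps:.0f} kbps")
```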

    In some parts of the world, this would still have some benefit today on their low speed links. In most of the world, it's irrelevant, as link speeds have become faster than the levels where this has a worthwhile impact.


  • Is there still interest in this feature becoming a bounty?


  • I have some customers that intermittently complain about voice quality.  I'm going to order a Cisco router that can do "AutoQOS" and see if it makes a difference.


  • This is a great discussion and a valid one at that. I'm no expert, but wouldn't this be better implemented as a function of the network adapter and at the switch level? Layer 1 function?

    Why burden the firewall software anymore?