HFSC/CoDel for 40 devices
-
Are you using Squid? FreeBSD does not shape incoming bandwidth, which means that with Squid you cannot shape incoming traffic, only outgoing.
-
@KOM:
Common HFSC Use Cases
-
1 WAN / 1 LAN - [LAN: VoIP phones, ACK, DNS, WWW]
-
1 WAN / 2 LAN - [LAN: VoIP phones, ACK, DNS, WWW] * [DMZ: WWW, MAIL]
-
2 WAN / 1 LAN - [LAN: VoIP phones, ACK, DNS, WWW]
-
2 WAN / 2 LAN - [LAN1: VoIP phones, ACK, DNS, WWW] * [LAN2: VoIP phones, ACK, DNS, WWW]
-
2 WAN / 3 LAN - [LAN1: VoIP phones, ACK, DNS, WWW] * [LAN2: VoIP phones, ACK, DNS, WWW] * [DMZ: WWW, MAIL]
-
Per-client shaping
-
VPN shaping
I would be willing to add to the bounty for these scenarios too. We could then add it to the pfSense handbook.
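As a concrete starting point, the first scenario above might look something like this in pf.conf ALTQ syntax. This is only a sketch: the 20Mbit uplink, the em0 interface name, and the queue names are assumptions, and pfSense normally generates the equivalent configuration from the GUI.

```
# 1 WAN / 1 LAN - HFSC hierarchy on the WAN interface (upload direction).
# Link-share m2 values set the relative weights; spare bandwidth is
# redistributed among backlogged queues in proportion to these values.
altq on em0 hfsc bandwidth 20Mb queue { qVoIP, qACK, qDNS, qWWW, qDefault }
queue qVoIP    bandwidth 2Mb   hfsc(linkshare 2Mb)
queue qACK     bandwidth 2Mb   hfsc(linkshare 2Mb)
queue qDNS     bandwidth 512Kb hfsc(linkshare 512Kb)
queue qWWW     bandwidth 8Mb   hfsc(linkshare 8Mb)
queue qDefault bandwidth 1Mb   hfsc(default linkshare 1Mb)
```

Pass rules with the `queue` keyword (or floating match rules in pfSense) then direct VoIP, ACK, DNS, and web traffic into the corresponding queues.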
-
-
Nope, not using Squid.
-
I have been trying to implement HFSC/Codel in a >100 node environment with VOIP. After four months, many weeks on the forum going through years-old information, and reading HFSC papers from the late 1990s… I have managed to circle back to the beginning. This has been a monumental task. Every time I feel the setup is good, the stats show otherwise. After a few weeks there will be notable drops in all the wrong queues: something on the order of >50% of the default queue's drops happening in the top two queues, which should have bandwidth priority per their assignments.
Been trying to find the time to come on the forum and share some of my findings, so hopefully my recent experience can help shed some light on trying to implement this. Also a big thanks to Harvy66, Nullity, and the other forum members for their many posts on traffic shaping. One of the problems is that after making changes, it can take two or three weeks to collect enough information to see reliable results. A couple rounds of mediocre changes and a month has passed with nearly no progress in shaping traffic. (or negative progress, or just ending up cutting 20% off the internet bandwidth while trying to take control of the other 80%)
The big bugbear, I feel, happens when there are multiple LANs. Everything looks good on paper and during testing after business hours, but the true running environment shreds the queues right up, every time. Whether it is because of multiple devices pulling down updates or groups of users eating bandwidth from the internet, it seems like any real stress across all LANs/queues together will immediately start throwing drops in higher-bandwidth queues instead of cutting bandwidth out of the default/lower-bandwidth queues. This is easily noticed when running a VOIP queue, which almost always has multiple streams running and will immediately start logging drops.
Tried setting M2 levels way below actual limits, for example setting qDefault to 64Kb on a parent queue that is good for 20Mbps, and then setting notably higher limits on qVOIP and qACK, on the order of 768Kb. That's 12 times the bandwidth limit of qDefault. To my understanding this -should- result in qDefault dropping significantly more than the other queues, but the results show otherwise. Consider the 20Mbps interface with all child-queue M2 values combined equaling only 3Mbps: there is considerable bandwidth available to link-share here (and priority should be granted based on the M2 bandwidth allocation).
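For reference, that setup corresponds to roughly the following in pf.conf ALTQ syntax (a sketch; the em0 interface name is assumed). With only link-share m2 set and the child m2 values summing to well under the 20Mbps link, HFSC should hand out the spare bandwidth in proportion to these link-share values, which is why a roughly 12:1 ratio against qDefault would be the expectation:

```
altq on em0 hfsc bandwidth 20Mb queue { qVoIP, qACK, qDefault }
queue qVoIP    bandwidth 768Kb hfsc(linkshare 768Kb)  # 12x qDefault's share
queue qACK     bandwidth 768Kb hfsc(linkshare 768Kb)
queue qDefault bandwidth 64Kb  hfsc(default linkshare 64Kb)
```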
Tried removing most of the queues and just using two or three, this didn't seem to help. Then tried creating seven or eight queues, and this didn't help either. One benefit of more queues is that I can see exactly which queues are dropping information and know exactly which services are dropping packets. Also can see in pftop how much data passes through those queues, which can help when looking at dropped packets per gigabyte of data.
Just started using the d parameter (delay), so the plan now is to put a delay into the ramp-up of bandwidth available to the queue. The examples seem to use this for bursting data, but I'd like to try using it to slowly ramp up bandwidth allocation. The other solution is to assign a real-time queue; this works until there is more than one "very important" stream on the network. My goal was to implement this as a fully link-share setup with no real-time queues. That is all I have for now. I just tossed out a stack of Diagnostics > pfTop > queue printouts from the last couple of months… where I thought everything was good to go! The best advice I can share right now is: don't believe the shaper is okay until a few weeks of data has run through it. ;D
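For anyone else experimenting with the d parameter: the (m1 d m2) service-curve form applies rate m1 for the first d milliseconds after the queue becomes backlogged, then switches to m2. The usual examples set m1 above m2 to allow an initial burst, but the same mechanism can ramp a queue up slowly by setting m1 below m2. A sketch, with all values purely illustrative:

```
# Hold qDefault to 32Kb for its first 5000 ms of backlog,
# then let it settle at its long-term 64Kb link-share.
queue qDefault bandwidth 64Kb hfsc(default linkshare(32Kb 5000 64Kb))
```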
-
Thank you very much for that contribution jetblackwolf. I understand your pain in trying to figure this out for so long, only to come back needing to revise it. I wish it were more user-friendly to traffic-shape than this. As a comparison, notice how simple my other configuration is on Meraki (attached). I have 0 issues with this setup: 24 active users plus torrenting and VoIP. No hiccups with Meraki.
Simply by setting VoIP traffic to high priority and Bittorrent to Low, none of my users have issues with the most important things like calls, DNS and web browsing.
![traffic shaping meraki.PNG](/public/imported_attachments/1/traffic shaping meraki.PNG)
-
> notice how simple my other configuration is on meraki (attached)
You can do exactly this with pfSense, using PRIQ ;)
> I have been trying to implement HFSC/Codel in a >100 node environment with VOIP.
Perhaps give us your details and we can walk you through it. Internet connection up/down rates, cable/dsl/fiber/t1, how many LANs do you have, how many queues do you want, etc. I have it working well for my 40-device implementation; you could simply tweak my setup. Also, packet drops are normal; don't expect zero drops - if you're not dropping, then something upstream is (which is the whole reason we want to avoid an upstream device doing that!).
-
-
I'd love to see a tutorial on that! Do you have one?
This really should be in its own thread, as this thread is for HFSC… but anyway:
It's really as straightforward as creating PRIQ queues and specifying a priority from 1 (lowest) to 7 (highest). You can have a max of only 7 queues. Then feed your traffic into the queues. There are plenty of PRIQ guides if you search the web. Here's an example I found in 30 seconds: http://sunstatetechnology.com/docs/pfSenseVoIPQoSGuide.pdf
I'll take that back; it's not quite as easy as the basic drop-down Meraki gives you, but that's what it does. Untangle does the same.
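To make that concrete, a minimal PRIQ setup in pf.conf ALTQ syntax might look like this. It's only a sketch: the interface, bandwidth, ports, and queue names are assumptions, and pfSense builds the equivalent rules from the GUI.

```
# Strict priority: the highest-priority non-empty queue always sends first.
altq on em0 priq bandwidth 20Mb queue { qVoIP, qACK, qDefault }
queue qVoIP    priority 7
queue qACK     priority 6
queue qDefault priority 1 priq(default)

# Example assignments: SIP/RTP into qVoIP; web traffic with its
# empty TCP ACKs split out into the higher-priority qACK.
pass out on em0 proto udp to port { 5060, 10000:20000 } queue qVoIP
pass out on em0 proto tcp to port { 80, 443 } queue (qDefault, qACK)
```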
-
VoIP is fixed bandwidth. If you're seeing dropped packets, it's because:
- Devices are somehow synchronized and packets are bursting in, filling up your queue faster than it's depleting. Improperly sized queue? Try using Codel?
- You have too many devices and not enough bandwidth, so something has to give
- Wrong traffic is getting into the queue and consuming precious bandwidth.
PRIQ is rarely what you want. It seemingly works well under simple tests, but it has some really nasty corner cases that can starve lower priority queues or cause massive jitter for large buffers. Network flows absolutely hate abrupt changes.
-
Thanks for sharing this moikerz. Can you see why I am confused? You say that it is easy to find a nice tutorial on how to do this with PRIQ, then Harvy66 comes along and says that PRIQ is not the right solution for this problem.
It seems like there is not a definitive answer on this. I understand that no two networks are identical, but there are some pretty common scenarios you find. KOM mentioned some great ones earlier in this thread.
Maybe we just need to figure out which Queue to use first, then layout the proper rules for it.
-
Thanks for the feedback. Trying not to hijack the thread, just meant to pop in and try to offer some information.
I'll gladly try to help with a guide (at least for HFSC). I'm trying to get the correct setup going first, before I create a big thread full of misinformation. Not on purpose, mind you: over these months I have closed the book on this many times, believing I understood what was going on, and then started all over. So I am in an odd position where I can spot a bad setup now, based on all of my tests and conclusions… but I'm not confident enough to offer up any kind of guide on my own.
Some questions I would love to see answered for HFSC/Codel: what happens when new streams come into a queue? Do they abide by M1/D while other streams have already met M2? And where would the bandwidth come from? M2? Or dig into another child queue? Or does the queue fire off one M1/D check on first use and then sit in M2 mode until the queue returns to an idle state, to then repeat the process again? Many areas of the papers I read went right over my head; not a PhD by any stretch.
I spent time trying to assign a priority for HFSC before noticing that priority doesn't actually appear in the queue documentation at all. This is confusing, because there is a note in the GUI that says it sets priority on packets during overload (for HFSC), yet HFSC only has bandwidth and time variables. Spent a bit of time on this before finding out it does nothing… at all. So if the GUI were cleaned up and only the proper options provided for the selected queue type, it would probably be less confusing. ::)
Why I believe the drops are still occurring on my end is related to how multiple LAN queues are being hammered at the same time, even though all upperlimits are correctly divided. I will look into that burst comment, Harvy66. (And yes, I am using CODEL, per your findings in older forum posts.)
-
Regarding HFSC, please post your questions in my HFSC explained - decoupled bandwidth and delay - Q&A - Ask anything thread. That thread also has links to the best HFSC documentation that I came across while researching HFSC.
You are not alone in your confusion… :)