HFSC/CoDel for 40 devices
-
I have an office with about 40 devices, with a 6/6 access (yay T1..). All I want to do is de-prioritize SMTP, and prioritize access to one website, while not letting any one user monopolize the connection. Directing traffic to the queues is not an issue.
I've had PRIQ working somewhat well, but lower queues were sometimes starved for bandwidth. (Also, the DSLReports bufferbloat grade was always an 'F'.)
I figured I could try pure CODELQ, but then I was reading that it doesn't perform well for multiple simultaneous users/threads. It gave me an instant 'A' for bufferbloat, but if it would be problematic for 40 devices, I'd prefer not. Is this true?
If so, then I guess I'm left with HFSC. But I'm getting confused. I can understand the parent/child relationship, but I can't for the life of me figure out Linkshare's relationship to Upperlimit/Realtime … ???
Example time:
Let's say I have a parent, with two children (HighPriority, LowPriority). I don't care about burst (m1,d); only about m2.
I want HighPriority to have minimum 30% bandwidth, maximum 80% bandwidth.
I want LowPriority to have minimum 10% bandwidth, maximum 40% bandwidth.
So HighPriority's Upperlimit = 80%, Realtime = 30%, Linkshare = ??
And LowPriority's Upperlimit = 40%, Realtime = 10%, Linkshare = ??
WTF goes in Linkshare for the two children? Am I not thinking of 'maximum' correctly?
-
The text is misleading or possibly wrong, but I have seen similar descriptions in tutorials. The issue is that dumbing down the descriptions loses information, which makes things more confusing when trying to reason through the more powerful features like Upper Limit and Real Time.
Upperlimit and linkshare are both relative to their parent queues, Real Time is relative to the root queue. As a general rule of thumb, I do not use realtime. It creates a mess of things by overly complicating simple issues.
I would just stick with Upper Limit and Bandwidth and use percentages. Set your Bandwidth to the minimum percentage you want to have, and remember it's relative to the current parent queue. E.g., if your parent queue has a bandwidth of 50% and your child queue has a bandwidth of 50%, then your child queue has an effective bandwidth of 25% relative to the root queue.
Upperlimit works the same way. If your parent queue can only have 80% of the root queue and your child can only have 80% of the parent queue, then your child queue can only have 64% of the root queue.
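The nesting arithmetic described above can be checked with a few lines of Python. This is only a sketch of the multiplication as described in this post, not anything pfSense itself runs; the helper name `effective_fraction` is made up for illustration.

```python
from fractions import Fraction

def effective_fraction(*percentages):
    """Multiply a chain of parent-to-child percentages into a share of the root.

    Hypothetical helper for illustration only. Percentages are listed from
    the outermost parent down to the child queue.
    """
    share = Fraction(1)
    for pct in percentages:
        share *= Fraction(pct, 100)
    return share

# Parent at 50% of root, child at 50% of parent.
print(effective_fraction(50, 50) * 100)  # 25

# Parent capped at 80% of root, child capped at 80% of parent.
print(effective_fraction(80, 80) * 100)  # 64
```

Using `Fraction` keeps the percentages exact, so deeper chains of queues do not pick up floating-point noise.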
-
That's good information, thanks Harvy66! Realtime makes perfect sense; it would be nice if the UI could just say what each block is relative to (parent/root).
So with the concept of "borrowing":
1. using Bandwidth=minimum, and Upperlimit=maximum, does that imply that the child can borrow from the parent until it reaches the child's Upperlimit? Nothing else to set?
2. using percentages, does the Upperlimit of Child1 + Child2 need to equal 100%, or can it be more?
-
Correct. The percentage can be misleading, but it really is just shorthand for fixed bandwidth amounts relative to the parent, so it's pre-computed, not dynamic. As long as the child is not at its upper limit, it follows normal distribution rules.
-
Correct. Upper limits restrict, so they can be more than 100% among queues, but no more than 100% for a single queue
Interesting to note is that HFSC effectively distributes bandwidth in ratios. If I set one queue to 1% and another queue to 1%, and they both try to use all of the bandwidth, they will get a 50/50 split.
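A rough way to picture the ratio behavior, together with upper limits that only restrict, is the toy model below. It assumes every queue is saturated and simply re-splits whatever a capped queue cannot use among the rest; the real HFSC scheduler is far more involved, and the function name `share_bandwidth` and the queue names are invented for this sketch.

```python
def share_bandwidth(link_kbps, queues):
    """Split link_kbps among saturated queues in proportion to their weights.

    queues maps name -> (weight, upper_limit_kbps or None).
    Simplified model (an assumption, not the real HFSC algorithm): bandwidth
    that a capped queue cannot use is re-split among the others by weight.
    """
    allocation = {}
    active = dict(queues)
    remaining = float(link_kbps)
    while active:
        total_weight = sum(weight for weight, _ in active.values())
        # Queues whose proportional share would exceed their cap are pinned to it.
        capped = {
            name: cap
            for name, (weight, cap) in active.items()
            if cap is not None and remaining * weight / total_weight > cap
        }
        if not capped:
            for name, (weight, _) in active.items():
                allocation[name] = remaining * weight / total_weight
            break
        for name, cap in capped.items():
            allocation[name] = float(cap)
            remaining -= cap
            del active[name]
    return allocation

# Two queues set to 1% each, both trying to use everything: a 50/50 split.
print(share_bandwidth(5000, {"qA": (1, None), "qB": (1, None)}))

# A 3:1 ratio where the heavier queue has a 2000 kbps upper limit.
print(share_bandwidth(5000, {"qHigh": (3, 2000), "qLow": (1, None)}))
```

In the second call the heavier queue would be entitled to 3750 kbps by ratio, but its upper limit pins it at 2000 kbps and the other queue picks up the remainder.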
-
-
Forget about real-time. Like Harvy66 said, focus on link-share (and maybe upper-limit on download traffic). Link-share is just a ratio, not an absolute, so pay attention to the proportional relationships of the queues. Keep your rules and queues simple.
Sadly, upload & download need to be treated differently. I would not use upper-limit on upload, because the router can throttle the LAN clients almost immediately to make bandwidth available for VIP traffic.
Download needs to be preemptively throttled, sometimes as low as 60% to account for the exponentially larger delays between when you request a slower download and when the download actually slows down.
-
So here's what I've currently got:
WAN, HFSC, Bandwidth: 5Mbps
- qInternet, CoDel, Bandwidth: 5Mbps
- qDefault, Default, CoDel, Bandwidth: 10%
- qOthersHigh, CoDel, Bandwidth: 20%
- qOthersLow, CoDel, Bandwidth: 5%
LAN1, HFSC, Bandwidth: 900Mbps
- qLink, Default, CoDel, Bandwidth: 895Mbps
- qInternet, CoDel, Bandwidth: 4Mbps, Upperlimit: 4Mbps
- qOthersHigh, CoDel, Bandwidth: 10%
- qOthersLow, CoDel, Bandwidth: 5%
LAN2, HFSC, Bandwidth: 900Mbps
- qLink, Default, Bandwidth: 895Mbps
- qInternet, Bandwidth: 1Mbps, Upperlimit: 1Mbps
- qOthersHigh, CoDel, Bandwidth: 10%
- qOthersLow, CoDel, Bandwidth: 5%
So, here are my questions and notes:
1. So I should remove the WAN-qInternet Upperlimit value? Not sure I understand your reasoning, Nullity.
2. For LAN1-qLink and LAN2-qLink, is this bandwidth correct? Or should it be my Upload bandwidth?
3. LAN1-qInternet + LAN2-qInternet = 5Mbps, is this the appropriate way to shape the LAN2 speed?
-
1) You don't need an upper limit on your WAN because the interface is already limited to 5Mb total. Your LAN interfaces are not.
2+3) Because pfSense does not allow sharing bandwidth among interfaces, you are correct that you need to split the bandwidth between LAN1 and LAN2 in fixed amounts. pfSense shapes bandwidth going out because technically you can't shape bandwidth coming in. Your LAN interfaces represent your download.
-
I understand about having the LAN1 and LAN2 Bandwidths set to my desired download speed.
But what does that mean for LAN1-qLink and LAN2-qLink? Since they are my default queues for LAN, it sounds like I should remove the qLink parents completely, and make new qDefaults as children of qInternet, correct? (With the obvious side-effect of limiting inter-vlan comms to whatever download speed I configure for the LANx parent)
WAN, HFSC, Bandwidth: 5Mbps
- qInternet, CoDel, Bandwidth: 5Mbps
- qDefault, Default, CoDel, Bandwidth: 10%
- qOthersHigh, CoDel, Bandwidth: 20%
- qOthersLow, CoDel, Bandwidth: 5%
LAN1, HFSC, Bandwidth: 4Mbps
- qLink, Default, CoDel, Bandwidth: ???Mbps
- qInternet, CoDel, Bandwidth: 4Mbps, Upperlimit: 4Mbps
- qOthersHigh, CoDel, Bandwidth: 10%
- qOthersLow, CoDel, Bandwidth: 5%
LAN2, HFSC, Bandwidth: 1Mbps
- qLink, Default, CoDel, Bandwidth: ???Mbps
- qInternet, Bandwidth: 1Mbps, Upperlimit: 1Mbps
- qOthersHigh, CoDel, Bandwidth: 10%
- qOthersLow, CoDel, Bandwidth: 5%
-
qLink is meant to be used for non-WAN related traffic, like inter-LAN or between PFSense and the LANs. That way all of that traffic can run full LAN speed and not affect the WAN traffic coming in.
-
Yes that's what I understood qLink to be for too :D But I'm not understanding the purpose of setting LAN bandwidth. Does the bandwidth of the parent queues (qLink, qInternet) need to be equal/less than the interface bandwidth?
With the config I posted, is the correct approach (see bold):
LAN bandwidth = 900Mbps
- qLink bandwidth = 896Mbps
- qInternet bandwidth = 4Mbps
- qOthersHigh, CoDel, Bandwidth: 10%
- qOthersLow, CoDel, Bandwidth: 5%
It seems like that should be correct, from how I interpret what you're saying.
-
But I'm not understanding the purpose of setting LAN bandwidth.
You can only shape egress traffic. This means if you want to shape your download, you need to shape it as it leaves your LAN interface, not as it comes into your WAN interface. I like to shape my download so that downloads don't make my ping jump and so that packet loss is reduced.
-
You've inadvertently answered my question in another thread yesterday ::)
For the sake of completion for this thread, I'll link it here:
https://forum.pfsense.org/index.php?topic=112038.msg623926#msg623926
@Harvy66:
Your LAN interface is set to 1Gb/s. Your traffic is probably going into the default queue of qLink, which is limited to... 1Gb/s. If you want your traffic to be under your qInternet, you need to place it in there somewhere.
P.S. Don't place any traffic directly in qInternet, you're only supposed to place traffic in a leaf queue with HFSC.
-
Here's what I've currently got:
WAN, HFSC, Bandwidth: 5Mbps
- qInternet, CoDel, Bandwidth: 5Mbps
- qNormal, Default, CoDel, Bandwidth: 10%
- qHigh, CoDel, Bandwidth: 20%
- qLow, CoDel, Bandwidth: 5%
LAN1, HFSC, Bandwidth: 900Mbps
- qLink, CoDel, Bandwidth: 895Mbps
- qInternet, CoDel, Bandwidth: 4Mbps, Upperlimit: 4Mbps
- qHigh, CoDel, Bandwidth: 20%
- qNormal, Default, CoDel, Bandwidth: 10%
- qLow, CoDel, Bandwidth: 5%
LAN2, HFSC, Bandwidth: 900Mbps
- qLink, CoDel, Bandwidth: 895Mbps
- qInternet, CoDel, Bandwidth: 1Mbps, Upperlimit: 1Mbps
- qHigh, CoDel, Bandwidth: 20%
- qNormal, Default, CoDel, Bandwidth: 10%
- qLow, CoDel, Bandwidth: 5%
And my classification rules (attached).
This gives me:
- 5Mbps max upload
- 4Mbps max download for LAN1
- 1Mbps max download for LAN2
- qHigh traffic can use 100% if available, always guaranteed 20% of parent
- qNormal traffic can use 100% if available, always guaranteed 10% of parent
- qLow traffic can always use 100% if available, always guaranteed 5% of parent
- All traffic defaults to qNormal
- email ports are low priority (qLow, saves approx 10% of bandwidth)
- DNS, private cloud and OpenVPN is high priority (qHigh)
- Internal-to-Internal traffic is assigned to qLink, approx 900Mbps speed
Still to do:
- push Pandora, Spotify traffic into qLow
- push Skype, Hangouts into qHigh
- consider making the defaults qLow, and prioritize back to qNormal
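To put numbers on the guarantees listed above, the percentages can be turned into kbps against each qInternet parent. This is a back-of-the-envelope sketch that takes the thread's explanation at face value (percentages are fixed fractions of the parent's bandwidth); the dictionary layout and the function name `guaranteed_kbps` are assumptions made for the example.

```python
# qInternet bandwidth per interface, in kbps, as posted above.
PARENTS_KBPS = {"WAN": 5000, "LAN1": 4000, "LAN2": 1000}

# Child queue Bandwidth settings, as a percentage of the qInternet parent.
CHILD_PERCENT = {"qHigh": 20, "qNormal": 10, "qLow": 5}

def guaranteed_kbps(parents, children):
    """Return {interface: {queue: guaranteed kbps}} for percentage-based children."""
    return {
        iface: {queue: kbps * pct // 100 for queue, pct in children.items()}
        for iface, kbps in parents.items()
    }

for iface, queues in guaranteed_kbps(PARENTS_KBPS, CHILD_PERCENT).items():
    print(iface, queues)

# Only 35% of each parent is explicitly allocated; the rest is shared by ratio.
print(sum(CHILD_PERCENT.values()), "% allocated")
```

On LAN1, for example, this works out to 800 kbps for qHigh, 400 kbps for qNormal and 200 kbps for qLow before any borrowing.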
-
-
-
I've posted many times in the past what my setup is. I'd have to go over my message history to find it.
I use Codel as a sub-discipline and HFSC as the shaper.
-
It would be great if Harvy, Nullity or sideout could draft a quickie guide to pfSense HFSC and provide some basic examples for common use cases. The pfSense Book is weak on HFSC, there hasn't been a Hangout on it, and every day there is someone new trying to wrap his head around it. I'd even cough up a few bucks if it was bounty-worthy.
-
@KOM:
It would be great if Harvy, Nullity or sideout could draft a quickie guide to pfSense HFSC and provide some basic examples for common use cases. The pfSense Book is weak on HFSC, there hasn't been a Hangout on it, and every day there is someone new trying to wrap his head around it. I'd even cough up a few bucks if it was bounty-worthy.
I have tried to encourage myself to do precisely that, but the combination of documentation being so unrewarding mixed with HFSC being beyond my full comprehension makes the task very daunting.
Defining "common use cases" might be a good beginning.
-
Common HFSC Use Cases
-
1 WAN / 1 LAN - [LAN: VoIP phones, ACK, DNS, WWW]
-
1 WAN / 2 LAN - [LAN: VoIP phones, ACK, DNS, WWW] * [DMZ: WWW, MAIL]
-
2 WAN / 1 LAN - [LAN: VoIP phones, ACK, DNS, WWW]
-
2 WAN / 2 LAN - [LAN1: VoIP phones, ACK, DNS, WWW] * [LAN2: VoIP phones, ACK, DNS, WWW]
-
2 WAN / 3 LAN - [LAN1: VoIP phones, ACK, DNS, WWW] * [LAN2: VoIP phones, ACK, DNS, WWW] * [DMZ: WWW, MAIL]
-
Per-client shaping
-
VPN shaping
-
-
I've posted many times in the past what my setup is. I'd have to go over my message history to find it.
I use Codel as a sub-discipline and HFSC as the shaper.
Do you recommend setting the priority (0-7) in the child queues? Also if using Codel as sub-discipline, do you still check the "Explicit congestion notification" option?
Thanks.
-
Sooo … my traffic shaper config for this thread is working well. EXCEPT for Windows10 updates. Holy moly, that brings my internet access to a crawl/stop. Yes, I have Windows10 configured to only get updates from Microsoft and local LAN.
But, when Win10 updates, the data ignores my HFSC rules and uses 100% of my bandwidth. Everything else obeys the rules, except these updates.
For those new to this thread, my rules are:
WAN, HFSC, Bandwidth: 5Mbps
- qInternet, CoDel, Bandwidth: 5Mbps
- qNormal, Default, CoDel, Bandwidth: 10%
- qHigh, CoDel, Bandwidth: 20%
- qLow, CoDel, Bandwidth: 5%
LAN1, HFSC, Bandwidth: 900Mbps
- qLink, CoDel, Bandwidth: 895Mbps
- qInternet, CoDel, Bandwidth: 4Mbps, Upperlimit: 4Mbps
- qHigh, CoDel, Bandwidth: 20%
- qNormal, Default, CoDel, Bandwidth: 10%
- qLow, CoDel, Bandwidth: 5%
LAN2, HFSC, Bandwidth: 900Mbps
- qLink, CoDel, Bandwidth: 895Mbps
- qInternet, CoDel, Bandwidth: 1Mbps, Upperlimit: 1Mbps
- qHigh, CoDel, Bandwidth: 20%
- qNormal, Default, CoDel, Bandwidth: 10%
- qLow, CoDel, Bandwidth: 5%
Refer to LAN1 above: how can traffic from LAN1 to WAN possibly take 6Mbps (my total bandwidth) when WAN is configured as 5Mbps and LAN is configured as 4Mbps?
-
When Windows 10 is updating and saturating LAN (I know it's saturating WAN downloads but that is regulated by shaping LAN out), what queue is the traffic in? qLink?
-
I thought it would be on qLink also.. but no, it's on qNormal.
In an interesting turn of events, when I monitor the network usage on that machine itself, it says it's capping at 4Mbps, which is how pfSense is configured. But the network graphs, confirmed by the number of complaints I get, show the bandwidth at 100% (6Mbps) instead of the 4Mbps my HFSC config allows (see attached graph).
Attached are my floating rules as well.
Any ideas? I'm stumped ???
-
Are you using squid? FreeBSD does not shape incoming bandwidth, which means squid cannot shape incoming, only outgoing.
-
@KOM:
Common HFSC Use Cases
-
1 WAN / 1 LAN - [LAN: VoIP phones, ACK, DNS, WWW]
-
1 WAN / 2 LAN - [LAN: VoIP phones, ACK, DNS, WWW] * [DMZ: WWW, MAIL]
-
2 WAN / 1 LAN - [LAN: VoIP phones, ACK, DNS, WWW]
-
2 WAN / 2 LAN - [LAN1: VoIP phones, ACK, DNS, WWW] * [LAN2: VoIP phones, ACK, DNS, WWW]
-
2 WAN / 3 LAN - [LAN1: VoIP phones, ACK, DNS, WWW] * [LAN2: VoIP phones, ACK, DNS, WWW] * [DMZ: WWW, MAIL]
-
Per-client shaping
-
VPN shaping
I would be willing to add to the bounty on these scenarios too. We could then add it to the pfsense handbook.
-
-
Nope, not using Squid.
-
I have been trying to implement HFSC/Codel in a >100 node environment with VOIP. After four months and many weeks on the forum going through years old information and reading HFSC papers from the late 1990s….I have managed to circle back to the beginning. This has been a monumental task. Every time I feel the setup is good the stats show otherwise. After a few weeks there will be notable drops in all the wrong queues. Something on the order of >50% drops of the default queue happening in the top two queues that should have a network bandwidth priority, as noted by the assignments.
Been trying to find the time to come on the forum and share some of my findings, so hopefully my recent experience can help shed some light on trying to implement this. Also a big thanks to Harvy66, Nullity, and the other forum members for their many posts on traffic shaping. One of the problems is that after making changes, it can take two or three weeks to collect enough information to see reliable results. A couple rounds of mediocre changes and a month has passed with nearly no progress in shaping traffic. (or negative progress, or just ending up cutting 20% off the internet bandwidth while trying to take control of the other 80%)
The big bugbear I feel happens when there are multiple LANs. Everything looks good on paper and during testing after business hours but the true running environment shreds the queue right up, every time. Whether it is because of multiple devices pulling down updates or groups of users eating bandwidth from the internet, it seems like any real stress across all LANs/queues together will immediately start throwing drops in higher bandwidth queues instead of cutting bandwidth out of the default/lower bandwidth queues. This is easily noticed when running a VOIP queue, that almost always has multiple bandwidth streams running, and will immediately start logging drops.
Tried setting M2 levels way below actual limits, for example setting qDefault to 64Kb on a parent queue that is good to 20Mbps, and then setting notably higher limits on the qVOIP and qACK in the order of 768Kb. That's 12 times the bandwidth limit of qDefault. To my understanding this -should- result in the qDefault dropping significantly more information than the other queues, but the results show otherwise. For example consider the 20Mbps interface with all child queue M2 combined equaling only 3Mbps, there is considerable bandwidth available to link share here. (and priority should be granted based on the M2 bandwidth allocation)
Tried removing most of the queues and just using two or three, this didn't seem to help. Then tried creating seven or eight queues, and this didn't help either. One benefit of more queues is that I can see exactly which queues are dropping information and know exactly which services are dropping packets. Also can see in pftop how much data passes through those queues, which can help when looking at dropped packets per gigabyte of data.
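The "dropped packets per gigabyte" comparison mentioned above is a simple normalization. The sketch below uses made-up counters standing in for values read off the pftop queue view; the function name `drops_per_gigabyte` and the sample numbers are illustrative assumptions, not data from this thread.

```python
def drops_per_gigabyte(dropped_packets, bytes_passed):
    """Return dropped packets per gigabyte (10**9 bytes) passed through a queue."""
    if bytes_passed <= 0:
        raise ValueError("bytes_passed must be positive")
    return dropped_packets / (bytes_passed / 1e9)

# Hypothetical counters: (dropped packets, bytes passed) per queue.
sample = {
    "qVOIP": (120, 4_000_000_000),
    "qDefault": (900, 60_000_000_000),
}

for name, (drops, nbytes) in sample.items():
    print(name, drops_per_gigabyte(drops, nbytes))
```

With these invented numbers qDefault has far more raw drops, yet qVOIP drops twice as many packets per gigabyte, which is the kind of imbalance a raw drop counter hides.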
Just started using the d parameter (delay) so the plan now is to put a delay into the ramp up on bandwidth available for the queue. The examples seem to use this for bursting data but I'd like to try using it to slowly ramp up bandwidth allocation. The other solution is to assign a real time queue, this works until there is more than one "very important" stream to operate the network. My goal was to implement this on a fully link share setup with no real time queues. That is all I have for now, just tossed out a stack of diagnostics>pftop>queue printouts from the last couple of months....where I thought everything was good to go! The best advice I can share right now is don't believe the shaper is okay until a few weeks of data has run through it. ;D
-
Thank you very much for that contribution, jetblackwolf. I understand your pain in trying to figure this out for a long time, only to come back needing to revise it. I wish traffic shaping were more user friendly than this. As a comparison, notice how simple my other configuration is on Meraki (attached). I have 0 issues with this setup: 24 active users plus torrenting and VoIP. No hiccups with Meraki.
Simply by setting VoIP traffic to high priority and Bittorrent to Low, none of my users have issues with the most important things like calls, DNS and web browsing.
![traffic shaping meraki.PNG](/public/imported_attachments/1/traffic shaping meraki.PNG)
-
notice how simple my other configuration is on meraki (attached)
You can do exactly this with pfSense, using PRIQ ;)
I have been trying to implement HFSC/Codel in a >100 node environment with VOIP.
Perhaps give us your details and we can walk you through it. Internet connection up/down rates, cable/dsl/fiber/t1, how many LANs do you have, how many queues do you want, etc. I have it working well for my 40-device implementation; you could simply tweak my setup. Also, packet drops are normal; don't expect zero drops - if you're not dropping, then something upstream is (which is the whole reason we want to avoid an upstream device doing that!).
-
-
I'd love to see a tutorial on that! Do you have one?
This really should be in its own thread, as this thread is for HFSC… but anyway:
It's really as straightforward as creating PRIQ queues and specifying a priority from 1 (lowest) to 7 (highest). You can have a max of 7 queues only. Then feed your traffic into the queues. There are plenty of PRIQ guides if you search the web. Here's an example I found in 30 seconds: http://sunstatetechnology.com/docs/pfSenseVoIPQoSGuide.pdf
I will take it back, it is not quite as easy as a basic drop-down that Meraki gives you. But that's what it does. Untangle does the same.
-
VoIP is fixed bandwidth. If you're seeing dropped packets, it's because
- Devices are somehow synchronized and packets are bursting in and filling up your queue faster than it's depleting. Improperly sized queue? Try using Codel?
- You have too many devices and not enough bandwidth, so something has to give
- Wrong traffic is getting into the queue and consuming precious bandwidth.
PRIQ is rarely what you want. It seemingly works well under simple tests, but it has some really nasty corner cases that can starve lower priority queues or cause massive jitter for large buffers. Network flows absolutely hate abrupt changes.
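The starvation corner case is easy to reproduce in a toy model of strict-priority queueing. The simulation below is a sketch under simple assumptions (fixed arrivals per tick, a link that sends a fixed number of packets per tick); it is not how pfSense or ALTQ implements PRIQ, and `run_priq` is a name invented for the example.

```python
from collections import deque

def run_priq(arrivals_per_tick, ticks, link_packets_per_tick=1):
    """Strict-priority scheduler: always serve the highest-priority backlogged queue.

    arrivals_per_tick maps priority (higher number is served first) to the
    number of packets arriving each tick. Returns packets sent per priority.
    Toy model for illustration only.
    """
    queues = {prio: deque() for prio in arrivals_per_tick}
    sent = {prio: 0 for prio in arrivals_per_tick}
    for tick in range(ticks):
        # New packets arrive in every queue.
        for prio, count in arrivals_per_tick.items():
            queues[prio].extend([tick] * count)
        # The link sends a fixed number of packets, highest priority first.
        for _ in range(link_packets_per_tick):
            backlogged = [prio for prio in queues if queues[prio]]
            if not backlogged:
                break
            top = max(backlogged)
            queues[top].popleft()
            sent[top] += 1
    return sent

# The high-priority queue alone can fill the link, so priority 1 never sends.
print(run_priq({7: 1, 1: 1}, ticks=100))  # {7: 100, 1: 0}
```

As long as the priority 7 queue can keep the link busy, the priority 1 queue sends nothing at all, which matches the starvation described earlier in this thread with PRIQ.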
-
Thanks for sharing this, moikerz. Can you see why I am confused? You say that it is easy to find a nice tutorial on how to do this with PRIQ, then Harvy66 comes along and says that PRIQ is not the right solution for this problem.
It seems like there is not a definitive answer on this. I understand that no two networks are identical, but there are some pretty common scenarios you find. KOM mentioned some great ones earlier in this thread.
Maybe we just need to figure out which queue type to use first, then lay out the proper rules for it.
-
Thanks for the feedback. Trying not to hijack the thread, just meant to pop in and try to offer some information.
I'll gladly try and help with a guide. (at least for HFSC) Trying to get the correct setup going first before I created a big thread that was full of misinformation. Not on purpose mind you, along these months I have closed the book on this many times believing I understood what was going on and then started all over. So I am in an odd position where I can spot a bad setup now based on all of my tests and what I have concluded on….but not confident enough to offer up any kind of guide on my own.
I think some questions I would love to see answered for HFSC/Codel are: what happens when new streams come into a queue? Do they abide by the M1/D while other streams have already met the M2? And where would the bandwidth come from? M2? Or dig into another child queue? Or does the queue literally fire off one M1/D check on first use of that queue and then potentially sit there in M2 mode until the queue returns to an idle state, to then repeat the process again? Many areas of the papers I read went right over my head; I'm not a PhD by any stretch.
I spent time trying to assign the priority for the HFSC before noticing it doesn't actually seem to be a part of the queue documents, at all. This is confusing because there is a note in the GUI that says it sets priority on packets during overload (for HFSC). Yet HFSC only has bandwidth and time variables. Spent a bit of time on this before finding out it does nothing......at all. So even if the GUI was cleaned up and only the proper options provided for the selected queue, it would probably be less confusing. ::)
Why I believe the drops are still occurring on my end is related to how multiple LAN queues are being hammered at the same time, even though all upperlimits are correctly divided. I will look into that burst comment Harvy66. (and yes I am using CODEL per your findings in older forum posts)
-
Regarding HFSC, please post your questions in my HFSC explained - decoupled bandwidth and delay - Q&A - Ask anything thread. That thread also has links to the best HFSC documentation that I came across while researching HFSC.
You are not alone in your confusion… :)