Traffic Shaping Worse Than Baseline?

CaptainElmo

CPU jumps to 50% when PRIQ is enabled and only 100Mb/100Mb connection?

Yes - it does this without shaping enabled as well. I also have Suricata running and I assumed it was the culprit here. In any case the CPU doesn't seem to be a limiting factor in this particular test scenario.

@Harvy66:

Since PRIQ does not seem to work, want to give HFSC a try?

Yes, I'm willing to try anything. If you'll lead the way I'll gladly follow. I'm not familiar with the HFSC settings so I'll need some guidance there.

@Harvy66:

Also, do you have your bandwidth limit set on your LAN interface also?

Aha - the bandwidth was set to 10 Mbps on the LAN interface. I bumped that to 990 Mbps and the latency spikes are now down to only 1,000 ms or so.

@Harvy66:

You can shape your download, it's just not as effective as shaping your upload, but it does still help. You may also need to waste a bit more bandwidth to gain more control.

Roger that. When the time comes that shaping is a necessity this will be an acceptable trade-off for more control. Ideally I would like to gain the necessary understanding now rather than later when I'm under pressure to keep everything running smoothly.

Harvy66

Since you're not actually doing any download shaping, just don't set up anything on your LAN.

On your WAN, set the scheduler to HFSC, then go to each of the queues and, instead of giving them a "priority", use the Bandwidth field. Don't worry about the three lower fields; leave those unchecked/unset. HFSC is all about shaping bandwidth, not priorities. Figure out how much bandwidth you need, or just set your "high priority" queues to have more bandwidth. Your bandwidth can't be greater than the interface's total bandwidth. I prefer just to use percentages instead of actual bandwidth figures.

You can just throw ballpark figures at your low bandwidth queues. VOIP should be quite low compared to your 100Mb connection, so I wouldn't worry about giving it too much bandwidth. Unused bandwidth will get fairly distributed among the other queues.

Some examples:
ICMP: Bandwidth 1%, Codel Active Queue
ACK: Bandwidth 20%, Codel Active Queue
VOIP: Bandwidth 20%, Codel Active Queue
Default: Bandwidth 59%, Codel Active Queue

Of course, you had some other queues, so you decide how you want the bandwidth distributed. Remember, the only question to concern yourself with is this: if your connection is maxed out and all of your queues are also maxed out, how would you want your bandwidth distributed?
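To make the percentages concrete, here is a rough sketch of how the example shares translate into rates on a 100Mb link, and how leftover bandwidth might be handed to queues that still want more. This is only a proportional "water-filling" approximation for illustration, not how HFSC is actually implemented; the function names and the demand figures are made up.

```python
def guaranteed_kbps(link_kbps, shares_pct):
    """Convert percentage shares into per-queue rates in kbit/s."""
    if sum(shares_pct.values()) > 100:
        raise ValueError("queue shares add up to more than the interface")
    return {name: link_kbps * pct / 100 for name, pct in shares_pct.items()}


def share_when_busy(link_kbps, shares_pct, demand_kbps):
    """Approximate sharing: each queue gets up to its demand, and whatever
    is left over is re-offered in proportion to the configured shares."""
    alloc = {name: 0.0 for name in shares_pct}
    active = {n for n in shares_pct if demand_kbps.get(n, 0) > 0}
    remaining = float(link_kbps)
    while active and remaining > 1e-9:
        weight = sum(shares_pct[n] for n in active)
        if weight == 0:
            break
        used = 0.0
        satisfied = set()
        for n in active:
            offer = remaining * shares_pct[n] / weight
            grant = min(offer, demand_kbps[n] - alloc[n])
            alloc[n] += grant
            used += grant
            if alloc[n] >= demand_kbps[n] - 1e-9:
                satisfied.add(n)
        remaining -= used
        if not satisfied:
            break  # every active queue took its full offer; link is full
        active -= satisfied
    return alloc


# Shares from the example list above; demands are hypothetical.
shares = {"ICMP": 1, "ACK": 20, "VOIP": 20, "Default": 59}
demand = {"ICMP": 100, "ACK": 5000, "VOIP": 1000, "Default": 200000}
busy = share_when_busy(100_000, shares, demand)
```

With these made-up demands the small queues get everything they ask for and Default picks up the rest of the link, which is the "unused bandwidth gets distributed" behaviour described above.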

Another general rule of thumb. 80% is 100%. If you need 1Mb of bandwidth, then give a queue at least 1.25Mb of bandwidth.
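The arithmetic behind that rule of thumb, as a tiny sketch. The function name and the `target_utilization` parameter are just illustrative; 0.8 is the "80% is 100%" figure from above.

```python
def provisioned_rate(needed_mbps, target_utilization=0.8):
    """Bandwidth to give a queue so that the needed rate is only
    target_utilization of what the queue is allowed to use."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return needed_mbps / target_utilization


# Needing 1Mb works out to roughly 1.25Mb of queue bandwidth.
voip_queue_mbps = provisioned_rate(1.0)
```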

You may want to disable Suricata during performance testing if you're having performance issues. Always reduce the number of variables when attempting to debug.

I chose 20% for ACK only because I've seen it recommended by a trustworthy source. I don't quite agree with it, since ACKs on my network have a 20:1 ratio, which puts them at only 5%, but I'm not concerned because unused bandwidth gets distributed.
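For anyone wanting to plug in their own numbers, the 20:1 figure works out like this. I'm assuming the ratio means 20 units of download data per 1 unit of ACK traffic going back up, and that the link is symmetric; the function name is made up.

```python
def ack_share_pct(download_mbps, upload_mbps, data_to_ack_ratio):
    """Percentage of upload bandwidth needed for ACKs when the download
    is running flat out, given an observed data:ACK bandwidth ratio."""
    if data_to_ack_ratio <= 0 or upload_mbps <= 0:
        raise ValueError("ratio and upload bandwidth must be positive")
    ack_mbps = download_mbps / data_to_ack_ratio
    return 100.0 * ack_mbps / upload_mbps


# 100/100 link with a 20:1 ratio: ACKs need about 5% of the upload.
pct = ack_share_pct(100, 100, 20)
```

Measure your own ratio before trusting either the 20% or the 5% figure.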

Nullity

You are having trouble with traffic-shaping because you do not need it, as KOM said, I think.
HFSC is for people with numerous, assumedly life-or-death services that all need to get their guaranteed service levels (bandwidth and/or latency guarantees). Most people really only have VIP traffic, Penalized traffic, and Bulk traffic, which can be handled just fine with PRIQ or CBQ (or FAIRQ).

Traffic-shaping is rarely a good idea unless you need to FIX a PROBLEM. If you are just looking to get web-pages loaded faster, other avenues are much easier and more effective. Also, you cannot prioritize some traffic without deprioritizing other traffic as a result, so… Unless you NEED something, let it go for now.

Harvy has some goodish hints and theories, but without packet/traffic captures to show before and after, be wary.
Only by monitoring your traffic, will you know where the problems are, if there are any...
With my 15 Mbit ADSL I allocated 300Kb for ACKs because that is what I observed with pftop and other monitors. Each person will have their own ACK ratio.

Unless you have a precise goal and a solid grasp of internetworking, so that you know what to expect, there's only going to be confusion and haphazard trial and error. That is what happened to me...

So, if you have no reason for traffic-shaping, just don't. Read up first, if you must. For example, The Book of PF, Network Flow Analysis, Practical Packet Analysis, and of course Computer Networks by Tanenbaum. Knowing how to fix a network is different from bumbling around hoping you will learn, while screwing up your network and thinking you are actually improving it.

If you are actually just interested in fixing the bufferbloat, I would find a Linux-based router OS, as Linux has been where every new scheduling algorithm has been implemented, for like 10+ years, and it has the modern CoDel with Fair-Queueing. The CoDel people are now working on Cake, which is quite a bit more ambitious than CoDel.

P.S. With CoDel enabled on my WAN traffic, I see 0 to 4 packets in the queue, regardless. Users do not control the queue depth, CoDel does by constantly measuring the packet sojourn timing.
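For the curious, the sojourn-time idea can be sketched in a few lines. This is a heavily simplified illustration, not the reference pseudocode and not what pfSense runs: a packet is only dropped once its time in the queue has stayed above a target for a whole interval, and further drops are spaced at interval/sqrt(count). The 5 ms target and 100 ms interval are the usual CoDel defaults; the class and attribute names are made up.

```python
import math
from collections import deque

TARGET_MS = 5      # acceptable standing queue delay
INTERVAL_MS = 100  # how long delay must persist before dropping


class SimpleCoDel:
    def __init__(self):
        self.queue = deque()     # entries are (enqueue_time_ms, packet)
        self.first_above = None  # time at which persistent delay is confirmed
        self.dropping = False
        self.count = 0           # drops in the current dropping episode
        self.drop_next = 0.0
        self.dropped = 0         # total drops, for inspection only

    def enqueue(self, packet, now_ms):
        self.queue.append((now_ms, packet))

    def _pop(self, now_ms):
        """Pop one packet; report whether delay has stayed above target."""
        if not self.queue:
            self.first_above = None
            return None, False
        enqueued_at, packet = self.queue.popleft()
        if now_ms - enqueued_at < TARGET_MS:
            self.first_above = None
            return packet, False
        if self.first_above is None:
            self.first_above = now_ms + INTERVAL_MS
            return packet, False
        return packet, now_ms >= self.first_above

    def dequeue(self, now_ms):
        packet, ok_to_drop = self._pop(now_ms)
        if self.dropping:
            if not ok_to_drop:
                self.dropping = False
            while self.dropping and packet is not None and now_ms >= self.drop_next:
                self.dropped += 1  # discard the packet we were holding
                self.count += 1
                packet, ok_to_drop = self._pop(now_ms)
                if ok_to_drop:
                    self.drop_next += INTERVAL_MS / math.sqrt(self.count)
                else:
                    self.dropping = False
        elif ok_to_drop:
            self.dropped += 1      # discard the head, deliver the next one
            packet, _ = self._pop(now_ms)
            self.dropping = True
            self.count = 1
            self.drop_next = now_ms + INTERVAL_MS
        return packet
```

Note there is no queue-length setting anywhere in it: the only inputs are timestamps, which is the point made above about users not controlling the queue depth.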

CaptainElmo

HFSC appears to be doing the trick! Initial tests show all parameters at expected values during testing - even with Suricata running. For thoroughness I disabled Suricata for a couple tests - with and without shaping - and the CPU stayed below 10% in all cases without Suricata's overhead while still hitting 50% in all cases with Suricata enabled.

CaptainElmo

Nullity - thank you for the excellent feedback. I agree with you wholeheartedly.

I do not plan to use traffic shaping until it is actually needed, but before reaching that point I need to get my brain around how it works in pfSense. During this practice I discovered that simple PRIQ queues induce huge unacceptable spikes in latency for reasons which are still unexplained. Harvy's suggestion to use HFSC seems to work though, so now when I need traffic shaping I will better understand the appropriate way to implement it without doing more harm than good to my network.

Thank you again for all of the excellent advice. The more I interact with the pfSense ecosystem the more impressed I am.

Harvy66

@Nullity:

HFSC is for people with numerous, assumedly life-or-death services that all need to get their guaranteed service levels (bandwidth and/or latency guarantees). Most people really only have VIP traffic, Penalized traffic, and Bulk traffic, which can be handled just fine with PRIQ or CBQ (or FAIRQ).

Even without all of the special features of HFSC, even if all you use is the bandwidth fields, HFSC is still superior to PRIQ or CBQ when it comes to fairly distributing bandwidth while maintaining tight scheduling. It's like comparing an electric motor to a combustion engine: even if you don't enable regenerative braking, it's still better.

@Nullity:

Traffic-shaping is rarely a good idea unless you need to FIX a PROBLEM.

You can't go wrong with "don't fix what ain't broken". But it's also good to learn new stuff that may become relevant, as long as you're not harming production while testing.

@Nullity:

Harvy has some goodish hints and theories, but without packet/traffic captures to show before and after, be wary.

Nullity makes a good point. I am not 100% confident, but I have had very good results on my network and I am confident enough with my reasoning to give ideas but not say "this is how you fix it". I think they should work most of the time, but I do not have a comprehensive background in these issues. I have good intentions, but take what I say with a grain of salt.

@Nullity:

Unless you have a precise goal and a solid grasp of internetworking, so that you know what to expect, there's only going to be confusion and haphazard trial and error. That is what happened to me…

I think what he's getting at is that it's a great way to learn on your own network. Empirical evidence is important when debugging problems, and solving specific problems is better than taking guesses, unless something you're doing is a general recommendation. One thing that may be worth trying and should be dead simple is using FAIRQ. I haven't used it myself, so I have no idea of its jitter characteristics or its fairness under high loads with many flows, but in theory it should be "set your interface bandwidth", and that's all. fq_codel or Cake would be a future replacement for FAIRQ, whenever those get implemented.

I still have no idea why PRIQ was giving such an issue with performance. In theory, VOIP should have been perfect since it was the highest priority.
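Just to spell out the "in theory" part, here is a toy strict-priority queue. It is only a model of the idea (always serve the highest-priority packet that is waiting, FIFO within a priority), not the actual PRIQ code, and the class name and priority numbers are made up, with a higher number meaning served first.

```python
import heapq
import itertools


class StrictPriorityQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within a priority

    def enqueue(self, priority, packet):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), packet))

    def dequeue(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


q = StrictPriorityQueue()
for i in range(3):
    q.enqueue(1, f"bulk-{i}")   # low-priority backlog arrives first
q.enqueue(7, "voip-0")          # VOIP arrives last but has higher priority
first_out = q.dequeue()         # "voip-0"
```

In this model a VOIP packet never waits behind more than the one packet already being sent, which is why multi-second latency under PRIQ is so surprising.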

I am curious as to what kind of loads you have tested and what kind of jitter, loss, latency, bandwidth you're seeing. On my network, I get pretty much exactly what I expect, but I have a strange situation of very low pings, low loss, and my ISP uses an AQM. Yeah, boo hoo, I have great Internet.

CaptainElmo

@Harvy66:

I still have no idea why PRIQ was giving such an issue with performance. In theory, VOIP should have been perfect since it was the highest priority.

The main issue is that PRIQ is inducing huge latency spikes between LAN and WAN. Even though the VOIP packets are being put on the wire first, they are still subject to the high latency of getting there in the first place. I still have no idea why PRIQ is causing these latency spikes.

@Harvy66:

I am curious as to what kind of loads you have tested and what kind of jitter, loss, latency, bandwidth you're seeing.

I do what I can to fill up the WAN pipe from the LAN interface - mostly simultaneous speed tests and FTP transfers to/from a dedicated server on the other end of the WAN pipe. The pfSense traffic graph shows constant saturation on the WAN interface during these tests.

When the WAN pipe is saturated PRIQ is inducing latency of 2-4 seconds with up to 20% packet loss according to the apinger stats. While I don't fully trust the apinger service, I am able to confirm noticeable (to the ear) VOIP issues coinciding with the high latency being reported by apinger.

The WAN pipe is a dedicated gigabit fiber circuit which the ISP limits to 100/100 Mbps. If I saturate it without traffic shaping it remains rock solid with no movement in latency and not a single lost packet. No matter what I try I am not able to make the WAN link blink at all. If I'm not mistaken that means the WAN link itself can be ruled out as the source of high latency and packet loss.
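For comparing runs with and without shaping, something like the following could turn a list of ping results into the same kind of numbers apinger reports. It is only a sketch: the sample values are invented, `None` stands for a lost probe, and "jitter" here is simply the mean difference between consecutive received replies.

```python
from statistics import mean


def summarize_pings(rtts_ms):
    """Summarize round-trip times in ms; None marks a lost probe."""
    if not rtts_ms:
        raise ValueError("no samples")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    if not received:
        return {"loss_pct": loss_pct, "avg_ms": None,
                "max_ms": None, "jitter_ms": None}
    # Differences between consecutive replies that actually came back.
    deltas = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "loss_pct": loss_pct,
        "avg_ms": mean(received),
        "max_ms": max(received),
        "jitter_ms": mean(deltas) if deltas else 0.0,
    }


# Invented samples: five probes, one lost.
stats = summarize_pings([10.0, 12.0, None, 11.0, 15.0])
```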

CaptainElmo

Is any part of the PRIQ queue processing offloaded in a manner which HFSC is not? Could there be a situation where I am hitting processing limits of an offloaded resource which are not reported as part of the main CPU statistics?

Harvy66

I don't think the shapers use any offloading, but they make use of certain driver features.

Nullity

@CaptainElmo:

Is any part of the PRIQ queue processing offloaded in a manner which HFSC is not? Could there be a situation where I am hitting processing limits of an offloaded resource which are not reported as part of the main CPU statistics?

The CPU needed for any sched algo will be minimal. Elegance and efficiency are perhaps more important than actual scheduling capability (Stochastic Fair Scheduling, for example). HFSC, perhaps the most complex and CPU intensive, was capable of 80,000+ packets per second on a 200 MHz Pentium Pro.
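To put that figure in perspective, a quick back-of-the-envelope calculation. It assumes full-size 1500-byte packets and ignores framing overhead, so treat the results as rough; the function names are made up.

```python
def packets_per_second(link_bps, packet_bytes):
    """Packet rate needed to fill a link with packets of a given size."""
    return link_bps / (packet_bytes * 8)


def cycles_per_packet(cpu_hz, pps):
    """CPU cycles available per packet at a given packet rate."""
    return cpu_hz / pps


# A saturated 100 Mbps link with 1500-byte packets: roughly 8,300 pps.
needed_pps = packets_per_second(100e6, 1500)

# 80,000 pps on a 200 MHz CPU leaves 2,500 cycles per packet.
budget = cycles_per_packet(200e6, 80_000)
```

So a saturated 100/100 link of large packets asks for roughly a tenth of the packet rate quoted for that old CPU.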