HFSC basic minimums & maximums

moikerz

I have a stable 20/20 fiber connection to my ISP. Still running 2.2.6 (because my SG1000 Gold activation hasn't reactivated my SG2440 Gold like it's supposed to - known bug. As soon as that's established I'll upgrade).

I have an HFSC configuration set to limit to 10/10, but I'm noticing that I don't seem to understand the configuration of minimum guaranteed bandwidth, particularly the purpose of "Link Share". Also my bufferbloat reports on DSLreports.com are stuck at C or D levels.

I want all queues to borrow up to maximum of the parent link if no other queue wants it. (Perhaps this is my mistake?)

I thought setting the child Bandwidth values was the way to accomplish this. But when those damn Windows10 updates start downloading, it caps the 10Mbps link but the other network devices are starved…

Have I missed something?

WAN (HFSC, Bandwidth: 10Mbps)
- qInternet (CODEL, Bandwidth: 10Mbps, Upperlimit m2: 10.0Mb)
- qHigh (Bandwidth: 20%, no other options set)
- qNormal (Bandwidth: 10%, no other options set)
- qLow (Bandwidth: 5%, default queue, no other options set)

LAN (HFSC, Bandwidth: 900Mbps)
- qLink (Bandwidth: 890Mbps, no other options set)
- qInternet (Bandwidth: 10Mbps, Upperlimit m2: 10.0Mb)
- qHigh (Bandwidth: 20%, no other options set)
- qNormal (Bandwidth: 10%, no other options set)
- qLow (Bandwidth: 5%, default queue, no other options set)

Nullity

You likely need to use HFSC's upper-limit parameter.

Link-share is a proportional value. https://forum.pfsense.org/index.php?topic=90512.msg505122#msg505122

You also probably need to separate different traffic flows into different queues that make use of upper-limit.

Downloads are more complex than uploads, from a QoS perspective. The best tutorial I know of is http://www.linksysinfo.org/index.php?threads/qos-tutorial.68795/

Give more details about what you want and what you've tried so that we can better critique.

moikerz

Thanks Nullity - I was hoping for a reply from you or Harvy66 8)

Per my config in my OP, I have Upperlimits set on the branches, but not on the leafs. Are you saying I should add Upperlimit configurations to the leafs?

I read and re-read your post, but don't understand the Linkshare purpose. I've also read that linksys link quite a few times a few years ago ;)

Essentially, out of my raw 20/20Mbps connection, I want to limit LAN to 10/10Mbps (while testing; I'll bump it later):

1. I want qHigh to use 100% of bandwidth if it's unused, with a minimum guarantee of 20% of parental bandwidth
2. I want qNormal to use 100% of bandwidth if it's unused, with a minimum guarantee of 10% of parental bandwidth
3. I want qLow to use 100% of bandwidth if it's unused, with a minimum guarantee of 5% of parental bandwidth
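
To make the goal above concrete, here is a rough Python sketch of what purely proportional sharing would do with these three numbers. It assumes the Bandwidth values behave as link-share weights (as suggested earlier in the thread), that only queues with traffic waiting take part, and that no real-time or upper-limit curves apply. The function name and the 10 Mbit parent figure are made up for illustration; this is a toy model, not how pfSense or ALTQ computes anything internally.

```python
from typing import Dict, Iterable


def share_among_active(parent_mbit: float,
                       weights: Dict[str, float],
                       active: Iterable[str]) -> Dict[str, float]:
    """Split parent_mbit among the active queues in proportion to their weights."""
    active = list(active)
    total = sum(weights[name] for name in active)
    return {name: parent_mbit * weights[name] / total for name in active}


# Weights taken from the queue list: 20%, 10%, 5% (hypothetical 10 Mbit parent).
weights = {"qHigh": 20, "qNormal": 10, "qLow": 5}

all_busy = share_among_active(10.0, weights, ["qHigh", "qNormal", "qLow"])
only_low = share_among_active(10.0, weights, ["qLow"])
high_and_low = share_among_active(10.0, weights, ["qHigh", "qLow"])

for label, result in [("all busy", all_busy),
                      ("only qLow", only_low),
                      ("qHigh + qLow", high_and_low)]:
    print(label, {name: round(mbit, 2) for name, mbit in result.items()})
```

In this model an idle queue's share is simply handed to whoever is busy, so qLow alone gets the full 10 Mbit. Because the weights only matter relative to each other, 20/10/5 under full contention works out to roughly 57%/29%/14% of the parent rather than 20%/10%/5%.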

Harvy66

Your config based on the first post looks mostly correct.

WAN - Codel on qInternet doesn't affect anything. qInternet doesn't actually have traffic in it, it's just the parent "queue". It's not really a real queue, just metadata about how to split the bandwidth. Only leaf queues get traffic.
WAN - You technically do not need to set the upper limit on your WAN since you configured the interface to be limited to 10Mb. That said, the interface may limit differently (for better or worse) than HFSC's upperlimit does.
LAN/WAN - Enable Codel on all of your child queues qHigh/qNormal/qLow for both WAN and LAN.

But when those damn Windows10 updates start downloading, it caps the 10Mbps link but the other network devices are starved

Two possible issues here:

1. If you're using a proxy, shaping won't work correctly because pfSense doesn't shape ingress, which means it can't shape how fast the proxy downloads.
2. If your ISP has an on-site CDN from which Windows Update downloads, TCP may be flooding your connection. TCP has a minimum window size of 2 segments. This means that with a 1ms ping to your ISP, that's two 1520-byte packets every 1ms, which works out to roughly 24Mb/s per connection. I have a similar issue with my ISP and Steam: downloading a game opens up 30 connections to a CDN 1ms away, which instantly pegs my 150Mb connection and causes packet loss because TCP can't back off.
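
A quick check of that arithmetic, as a small Python sketch. It takes the figures from the post at face value (2 segments in flight, 1520 bytes each, 1ms round trip, 30 connections); the function name is made up for illustration.

```python
def min_window_rate_mbps(segments: int, segment_bytes: int, rtt_ms: float) -> float:
    """Rate one connection sustains with `segments` packets in flight per round trip."""
    bits_per_round_trip = segments * segment_bytes * 8
    return bits_per_round_trip / (rtt_ms / 1000.0) / 1e6


single = min_window_rate_mbps(segments=2, segment_bytes=1520, rtt_ms=1.0)
thirty = 30 * single

print(f"one connection: {single:.2f} Mb/s")
print(f"30 connections: {thirty:.1f} Mb/s")
```

Under these assumptions a single connection cannot go below about 24 Mb/s, and 30 of them together have a floor well above a 150Mb link, which fits the description of TCP being unable to back off any further.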

chrcoluk

I am using HFSC for ingress (downstream).

I have not configured link share.
The bandwidth % from my testing seemed to have no effect on throughput, but it has to be configured to keep the firewall happy: the totals of all queues cannot exceed 100%, so you cannot over-provision bandwidth %.

I set the actual queue limit in m2 (which actually does limit throughput) and didn't populate the m1 or d boxes.
For some of my queues I set a real time % (min bandwidth) which works absolute wonders for ensuring my high priority queues are never ever affected by things like bulk downloads.

If real time queues are not populated then other queues can still use the reserved bandwidth.

HFSC is complicated, but in my testing it is by far the best performing ingress shaper on my network. All the other shapers were unable to prevent Steam downloads causing packet loss (Steam is very aggressive and sometimes opens up 30+ threads per download). It also gives me the best result on dslreports bufferbloat testing.

Outbound is way easier to shape and I didn't bother with HFSC for that. Here is my queue config output from the CLI.

root@PFSENSE ~ # pfctl -s queue
queue qInternet on igb0 bandwidth 19.50Mb fairq( red ecn ) 
queue qACK on igb0 bandwidth 0 b priority 6 fairq( codel linkshare 19.50Mb ) 
queue qDefault on igb0 bandwidth 0 b priority 4 qlimit 500 fairq( codel default linkshare 19.50Mb ) 
queue qOthersHigh on igb0 bandwidth 0 b priority 5 fairq( codel linkshare 19.50Mb ) 
queue qOthersLow on igb0 bandwidth 0 b priority 3 fairq( codel linkshare 19.50Mb ) 
queue qICMP on igb0 bandwidth 0 b priority 7 fairq( codel linkshare 19.50Mb ) 
queue qBulk on igb0 bandwidth 0 b priority 2 fairq( codel linkshare 19.50Mb ) 
queue root_igb1 on igb1 bandwidth 67.15Mb priority 0 {qDefault, qICMP, qACK, qOthersHigh, qOthersLow, qBulk}
queue  qDefault on igb1 bandwidth 16.79Mb qlimit 150 hfsc( codel default upperlimit 65.13Mb ) 
queue  qICMP on igb1 bandwidth 1.34Mb hfsc( codel realtime 1.34Mb ) 
queue  qACK on igb1 bandwidth 3.36Mb qlimit 150 hfsc( codel realtime 3.36Mb upperlimit 16.79Mb ) 
queue  qOthersHigh on igb1 bandwidth 26.86Mb qlimit 150 hfsc( codel realtime 49.02Mb upperlimit 65.13Mb ) 
queue  qOthersLow on igb1 bandwidth 12.09Mb hfsc( codel upperlimit 65.13Mb ) 
queue  qBulk on igb1 bandwidth 6.71Mb hfsc( codel upperlimit 60.43Mb )

More verbose, if preferred:

root@PFSENSE ~ # pfctl -vs queue
queue qInternet on igb0 bandwidth 19.50Mb fairq( red ecn ) 
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue qACK on igb0 bandwidth 0 b priority 6 fairq( codel linkshare 19.50Mb ) 
  [ pkts:     336811  bytes:   21668702  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue qDefault on igb0 bandwidth 0 b priority 4 qlimit 500 fairq( codel default linkshare 19.50Mb ) 
  [ pkts:     276243  bytes:  113541004  dropped pkts:   2233 bytes: 3219986 ]
  [ qlength:   0/500 ]
queue qOthersHigh on igb0 bandwidth 0 b priority 5 fairq( codel linkshare 19.50Mb ) 
  [ pkts:      24275  bytes:    2046273  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue qOthersLow on igb0 bandwidth 0 b priority 3 fairq( codel linkshare 19.50Mb ) 
  [ pkts:      21921  bytes:    9128713  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue qICMP on igb0 bandwidth 0 b priority 7 fairq( codel linkshare 19.50Mb ) 
  [ pkts:      63278  bytes:    3975972  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue qBulk on igb0 bandwidth 0 b priority 2 fairq( codel linkshare 19.50Mb ) 
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue root_igb1 on igb1 bandwidth 67.15Mb priority 0 {qDefault, qICMP, qACK, qOthersHigh, qOthersLow, qBulk}
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue  qDefault on igb1 bandwidth 16.79Mb qlimit 150 hfsc( codel default upperlimit 65.13Mb ) 
  [ pkts:     449204  bytes:  493827574  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/150 ]
queue  qICMP on igb1 bandwidth 1.34Mb hfsc( codel realtime 1.34Mb ) 
  [ pkts:         47  bytes:       4530  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue  qACK on igb1 bandwidth 3.36Mb qlimit 150 hfsc( codel realtime 3.36Mb upperlimit 16.79Mb ) 
  [ pkts:      85138  bytes:    6887208  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/150 ]
queue  qOthersHigh on igb1 bandwidth 26.86Mb qlimit 150 hfsc( codel realtime 49.02Mb upperlimit 65.13Mb ) 
  [ pkts:      77713  bytes:   87691144  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/150 ]
queue  qOthersLow on igb1 bandwidth 12.09Mb hfsc( codel upperlimit 65.13Mb ) 
  [ pkts:     347128  bytes:  506816627  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]
queue  qBulk on igb1 bandwidth 6.71Mb hfsc( codel upperlimit 60.43Mb ) 
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50 ]

Nullity

@moikerz

I think part of your config's problem is that your 10Mbit limit limits everything, leaving no room for other traffic. Always leave some bandwidth for new, high-priority traffic. For example, you could set your Default queue to 9Mbit upper-limit, leaving 1Mbit headroom for any other, non-defaulted traffic.

You might also try enabling codel on the WAN or simply set the queue length to 1 (or 0?). IMO, buffering on ingress, primarily when going from a slow link to a fast link, is completely useless. Why artificially delay a packet that can be transmitted? I dunno if pfSense is capable of bufferless traffic-shaping ("traffic-policing") though.

Especially with HFSC link-share, rather than using percentages you can just use normal values since they are both just proportioning bandwidth.

The following link-share values mean exactly the same thing to HFSC:
qOne: 10%
qTwo: 90%

qOne: 1Kbit
qTwo: 9Kbit

qOne: 1Mbit
qTwo: 9Mbit

Since the ratio is 1:9, they all are the same. During contention/congestion, for every bit qOne gets, qTwo gets 9 bits.
Only upper-limit & real-time are "hard" values. (IMO, real-time is really only useful when m1 & d are being used. Otherwise it's possibly just another point of unneeded confusion.)
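
The 1:9 point can be checked with a few lines of Python. This is only a sketch of proportional division under contention (both queues backlogged, no real-time or upper-limit involved); the function name and the 10 Mbit capacity are assumptions for the example, not anything taken from pfSense.

```python
from typing import Dict


def linkshare_split(capacity_mbit: float, shares: Dict[str, float]) -> Dict[str, float]:
    """Divide capacity in proportion to the share values (all queues backlogged)."""
    total = sum(shares.values())
    return {name: capacity_mbit * value / total for name, value in shares.items()}


percent = linkshare_split(10.0, {"qOne": 10, "qTwo": 90})      # 10% / 90%
kbit = linkshare_split(10.0, {"qOne": 1, "qTwo": 9})           # 1Kbit / 9Kbit
mbit = linkshare_split(10.0, {"qOne": 1000, "qTwo": 9000})     # 1Mbit / 9Mbit, written in Kbit

print(percent)
print(kbit)
print(mbit)
```

All three calls give qOne 1 Mbit and qTwo 9 Mbit of the hypothetical 10 Mbit, because only the ratio between the values enters the calculation.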

moikerz

@Nullity:

I think part of your config's problem is that your 10Mbit limit limits everything, leaving no room for other traffic. Always leave some bandwidth for new, high-priority traffic.

I thought the purpose of HFSC with defining minimums, was that minimums would be honoured if there is data in that queue; if no data in that queue then use that available bandwidth for other queues? This is why I have the percentages in my Bandwidth fields of the leaf nodes. Reserving 1Mbps of throughput sounds counter-intuitive. Should minimums be defined elsewhere instead?

@Nullity:

Especially with HFSC link-share, rather than using percentages you can just use normal values since they are both just proportioning bandwidth.

I understand HFSC only uses normal values, and strips additional characters. I'm keeping the percentages there for more of a visual reference to me, or anyone who may come in after me.

@Harvy66:

WAN - Codel on qInternet doesn't affect anything. qInternet doesn't actually have traffic in it, it's just the parent "queue". It's not really a real queue, just metadata about how to split the bandwidth. Only leaf queues get traffic.

WAN - You technically do not need to set the upper limit on your WAN since you configured the interface to be limited to 10Mb. That said, the interface may limit differently (for better or worse) than HFSC's upperlimit does.

LAN/WAN - Enable Codel on all of your child queues qHigh/qNormal/qLow for both WAN and LAN.

All good advice, thanks! Although I need to set the upperlimit on WAN to throttle my upload.

So my question still remains:
- Have I configured the minimum guaranteed bandwidth for each queue correctly? (see my OP for config)

Nullity

@moikerz:

@Nullity:

I think part of your config's problem is that your 10Mbit limit limits everything, leaving no room for other traffic. Always leave some bandwidth for new, high-priority traffic.

I thought the purpose of HFSC with defining minimums, was that minimums would be honoured if there is data in that queue; if no data in that queue then use that available bandwidth for other queues? This is why I have the percentages in my Bandwidth fields of the leaf nodes. Reserving 1Mbps of throughput sounds counter-intuitive. Should minimums be defined elsewhere instead?

Refer to the linksys link I posted to see the differences/problems with download. Depending on latency, you must allow for some headroom because the sender always takes time to respond to your request to slow down. The time between request and actually receiving the slower rate can vary (10ms, 100ms, 1000ms?). Until the traffic is slowed, you effectively have zero bandwidth for other traffic, which is most impactful on short-lived connections like web browsing.

Yeah, you are defining minimums but that is only as the traffic leaves the interface. Incoming traffic is unpredictable and is only "controlled" as a side-effect of controlling output.

Ultimately, just intelligently try different methods of traffic-shaping and see what works.

moikerz

@Nullity:

… Depending on latency, you must allow for some headroom because the sender always takes time to respond to your request to slow down. ... Until the traffic is slowed, you effectively have zero bandwidth for other traffic, which is most impactful on short-lived connections like web browsing.

Ah yes, I had forgotten about the response time to allow for slowdown. Good call, thanks! I'll futz with the overhead and enable Codel on my leaf nodes and see how that goes.

Harvy66

@Nullity:

@moikerz

I think part of your config's problem is that your 10Mbit limit limits everything, leaving no room for other traffic. Always leave some bandwidth for new, high-priority traffic. For example, you could set your Default queue to 9Mbit upper-limit, leaving 1Mbit headroom for any other, non-defaulted traffic.

You might also try enabling codel on the WAN or simply set the queue length to 1 (or 0?). IMO, buffering on ingress, primarily when going from a slow link to a fast link, is completely useless. Why artificially delay a packet that can be transmitted? I dunno if pfSense is capable of bufferless traffic-shaping ("traffic-policing") though.

Delaying/dropping ingress is how you signal TCP to back off. If you don't do this, your ISP will, and it will probably have massive bufferbloat. While shaping works best on egress, it does work on ingress quite well, just not nearly as "perfectly" as it does on egress.

Traffic-policing causes that pesky saw-tooth pattern with TCP and does not like the gigabit bursts that networks produce. Until packet-pacing gets implemented industry-wide, if you look at Wireshark, what's really happening with a 5Mb/s Netflix stream is that it's sending 1Gb/s microbursts several times per second. Depending on the policer, it may start dropping packets. This is why networks have buffers in the first place.

The real crazy is when you look at GPON, WiFi, or any other protocol that encapsulates multiple Ethernet frames into a super-frame. Even with a 10Mb/s connection, you may see 64KiB bursts at 1Gb/s with a 10Mb/s average.
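
To put rough numbers on that last example, here is a small Python sketch. It assumes evenly spaced 64KiB bursts arriving at exactly 1Gb/s and averaging exactly 10Mb/s, which is an idealisation; the function name is made up for illustration.

```python
from typing import Tuple


def burst_profile(burst_bytes: int,
                  line_rate_bps: float,
                  average_rate_bps: float) -> Tuple[float, float]:
    """Return (burst duration, time between burst starts) in seconds."""
    burst_bits = burst_bytes * 8
    duration_s = burst_bits / line_rate_bps        # how long one burst occupies the wire
    interval_s = burst_bits / average_rate_bps     # spacing that yields the average rate
    return duration_s, interval_s


duration_s, interval_s = burst_profile(64 * 1024, 1e9, 10e6)

print(f"burst lasts   {duration_s * 1e6:.1f} microseconds")
print(f"burst every   {interval_s * 1e3:.1f} milliseconds")
print(f"link is busy  {duration_s / interval_s:.0%} of the time")
```

Under these assumptions each burst lasts about half a millisecond and arrives roughly every 52ms, so the link is idle about 99% of the time. A policer with no buffer only ever sees the 1Gb/s part.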

In the end, you either have a natural limit or an artificial limit. You have zero control over the natural limit, but you can control the characteristics of the artificial limit.

chrcoluk

Indeed Harvy66, the rate of flow on TCP is not just controlled by the sender. Ingress shaping works exactly as you said: it drops packets or delays them to simulate congestion, and that causes the sender to back off and slow things down. TCP is a self-regulating protocol. It works very well in single streams, but gets wrecked when applications abuse it, e.g. torrents and Steam.

The congestion window is the prime driver of the flow of data. However, the congestion window is capped by the sender's send window and the recipient's receive window. All the ingress shaper has to do is either deliberately drop/delay packets to make the congestion window shrink itself, or manipulate the packets themselves to pretend the recipient has a smaller receive window (the latter is how some ISPs apply traffic shaping).
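
Both levers can be seen in a toy calculation. The Python sketch below uses the usual rule that a sender keeps at most min(congestion window, receive window) unacknowledged, and that throughput is roughly that window divided by the round-trip time. The window sizes and the 20ms RTT are invented for the example, and the function names are hypothetical.

```python
def effective_window(cwnd_bytes: int, rwnd_bytes: int) -> int:
    """Data the sender may have in flight: the smaller of the two windows."""
    return min(cwnd_bytes, rwnd_bytes)


def window_limited_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Approximate throughput of a window-limited flow: one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6


RTT_MS = 20.0
KIB = 1024

# Large congestion window, so the 64KiB receive window is the limit.
before = window_limited_mbps(effective_window(256 * KIB, 64 * KIB), RTT_MS)
# Lever 1: drops/delays make the sender shrink its congestion window.
after_drop = window_limited_mbps(effective_window(32 * KIB, 64 * KIB), RTT_MS)
# Lever 2: the advertised receive window is rewritten to something smaller.
clamped = window_limited_mbps(effective_window(256 * KIB, 16 * KIB), RTT_MS)

print(f"before:     {before:.2f} Mb/s")
print(f"after drop: {after_drop:.2f} Mb/s")
print(f"rwnd clamp: {clamped:.2f} Mb/s")
```

Either way the sender ends up with less data in flight per round trip, which is the only thing the receiving side can really influence.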

Nullity

You do not need to delay a packet to delay the returning ACK, right? A delayed ACK would trigger congestion control. ECN could also simulate congestion without actually buffering any packet.

I dunno if any ingress shaping setups use this no-buffer idea though. On download, the goal is to slow the sender below our maximum, but sadly we must simulate a slower link which in-turn simulates congestion when the buffer grows. It's not completely necessary to buffer though.

The only algorithm I know of that treats ingress & egress shaping differently is Cake, IIRC.

In some ways it's kinda neat that traffic-shaping has so many obvious improvements to come.

Nullity

@Harvy66:

@Nullity:

@moikerz

I think part of your config's problem is that your 10Mbit limit limits everything, leaving no room for other traffic. Always leave some bandwidth for new, high-priority traffic. For example, you could set your Default queue to 9Mbit upper-limit, leaving 1Mbit headroom for any other, non-defaulted traffic.

You might also try enabling codel on the WAN or simply set the queue length to 1 (or 0?). IMO, buffering on ingress, primarily when going from a slow link to a fast link, is completely useless. Why artificially delay a packet that can be transmitted? I dunno if pfSense is capable of bufferless traffic-shaping ("traffic-policing") though.

…
Traffic-policing causes that pesky saw-tooth pattern with TCP and does not like the gigabit bursts that networks produce. Until packet-pacing gets implemented industry-wide, if you look at Wireshark, what's really happening with a 5Mb/s Netflix stream is that it's sending 1Gb/s microbursts several times per second. Depending on the policer, it may start dropping packets. This is why networks have buffers in the first place.
...

The saw-tooth problem is on egress, on ingress it may happen regardless of what's happening on egress.

On my connection, whenever I use any ingress shaping or policing I get saw-tooth. When I allow my ISP to do the rate-limiting I get a smooth bitrate. My latency increases from maybe 10ms to 25ms during full saturation without shaping. With shaping, the saw-toothing dropped my average bitrate below the already synthetically lowered bitrate (15%?) so I decided against ingress shaping. I only have 12Mbit so …

Edit: Fixed quote tags.

chrcoluk

One thing I noticed, Nullity, is that to make the sender back off quickly enough, the packet is best dropped rather than delayed, which means low queue depths are important on low priority ingress queues to ensure packets get dropped quickly on those queues.

In an ideal world the congestion window would stop growing the moment the throughput is saturating the line, but it tends to actually continue to grow for a while after as it can get quite big before packets get dropped naturally.

This is how I believe HFSC differs from the others. When I run a dslreports speedtest on HFSC (with a low priority queue) it moans about retransmissions, meaning I was dropping the packets. However, even with those dropped packets the test still saturates the ALTQ pipe easily, so there is no noticeable performance lost. On the others, like fairq, I get no warning about dropped packets during the test, but my latency is more jittery and I can get packet loss on ssh and other higher priority stuff during the test. This suggests to me that on ingress, delaying is not as effective as dropping packets to simulate congestion.

Also I get no sawtooth effect; speed ramps up quickly and stays there, assuming I am downloading from a good source.

Nullity

@chrcoluk:

One thing I noticed, Nullity, is that to make the sender back off quickly enough, the packet is best dropped rather than delayed, which means low queue depths are important on low priority ingress queues to ensure packets get dropped quickly on those queues.

In an ideal world the congestion window would stop growing the moment the throughput is saturating the line, but it tends to actually continue to grow for a while after as it can get quite big before packets get dropped naturally.

This is how I believe HFSC differs from the others. When I run a dslreports speedtest on HFSC (with a low priority queue) it moans about retransmissions, meaning I was dropping the packets. However, even with those dropped packets the test still saturates the ALTQ pipe easily, so there is no noticeable performance lost. On the others, like fairq, I get no warning about dropped packets during the test, but my latency is more jittery and I can get packet loss on ssh and other higher priority stuff during the test. This suggests to me that on ingress, delaying is not as effective as dropping packets to simulate congestion.

Also I get no sawtooth effect; speed ramps up quickly and stays there, assuming I am downloading from a good source.

A "dropped" packet is just a packet that the sender received no ACK or duplicate ACKs for, right? There's nothing that forces us to actually drop that packet when our congestion is purely artificial. Or maybe there is… I'm a bit rusty in this area nowadays.

I think CoDel does some sort of intelligent rate-limiting calculations to keep bandwidth more consistent from whoever is transmitting when it experiences bufferbloat. I don't think HFSC cares about anything but transmitting packets "fairly" with regard to worst-case latency.

In general, low queue depth seems to be best. I struggle to think of scenarios where it isn't.

chrcoluk

A dropped packet inbound means the receiving computer will not report that chunk of data as arrived, because pfSense will have prevented the packet from being passed on. The ACK never gets sent back to the sender, so the sender backs off because it assumes congestion.
