Traffic Shaper - Queue Length and Dropped Packets

tuffcalc

Question - mostly out of curiosity. When I set a queue length (e.g., 500), I never get dropped packets, even under fully saturated links (although I do see the queue length increase, e.g., 25/500, etc.).

When I do not set a queue length it looks like PFSense defaults the queue length to 50. Without a queue length, I will see some dropped packets (say less than 1%) during normal load (i.e., not even close to fully saturated). When the link is fully saturated I still see no dropped packets, but I do see the queue length increase.

I thought the whole idea of traffic shaping is to drop packets. Can anyone explain this behavior?

Many thanks

KOM

The point of TS isn't to drop packets, it's to control your connection and maintain specified service levels. Dropping packets is one method of doing that, as is limiting the outgoing data rate.

Harvy66

If your queue gets too long, you get buffer bloat, which is completely separate from traffic shaping. If your queue is too small, you get packet-loss and lower throughput. One of the bigger issues with sizing your queue is that most queues are based on the number of packets, not the actual amount of data in the queue. A queue of 500 can hold 32,000 bytes of 64-byte packets or 750,000 bytes of 1500-byte packets. That is a large difference in the number of bytes.
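A quick back-of-the-envelope sketch of that arithmetic, in case anyone wants to plug in their own queue length. The function name is made up for illustration.

```python
def queue_capacity_bytes(queue_len_packets, packet_size_bytes):
    """Bytes held by a packet-counted queue when every slot is occupied."""
    return queue_len_packets * packet_size_bytes

small = queue_capacity_bytes(500, 64)     # queue full of 64-byte packets
large = queue_capacity_bytes(500, 1500)   # queue full of 1500-byte packets
print(small, large, large / small)
```

The same "500" can mean very different amounts of buffered data depending on the packet mix.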

I prefer to use Codel for my queue, and eventually fq_codel whenever it makes it into PFSense. I use an arbitrarily large queue depth with Codel, like 4096, because it already fights buffer bloat.

Nullity

My understanding of this area is foggy too. If a stream is being precisely controlled, then there should be practically no queue. A queue appears when numerous streams collide and someone needs to make a decision about who waits and for how long. Drops can happen regardless of queue size (I think), since this is how TCP throttles a stream. Drops must happen if a queue is at its limit. UDP is also an area of confusion for me.

But what happens with FAIRQ or HFSC? Yes, there are queues the user creates, but where are the stats for the arbitrary FIFO queues that these algorithms create for each and every connection?

I have seen confusing drop-rates too. I just assumed I was not seeing the whole picture. This stuff is hard to comprehend when only understanding bits and pieces.

Harvy66

Modern TCP does not need drops, but drops do cause it to back-off very quickly. UDP is just a way to immediately send a packet when the application attempts to send it. There is no waiting, buffering, or rate limiting, it just gets sent. If you need UDP to back-off during congestion, the application needs to handle that. TCP gives you that for free, even though congestion control is a very complicated issue. uTP is an example of a UDP protocol that is actually better than most TCP congestion algorithms for reducing congestion.

Queues are buffers, they allow the network to have a little bit of elasticity. You want the buffer to be large enough to allow brief bursts, but not large enough to cause latency to spike. An example of an undersized buffer was when I was playing around with PFSense and set a queue to 50 on my 100Mb connection, and my bandwidth was erratic during the ramp-up TCP phase. I would see the bandwidth spike from 0 to 100, then back down to 20, then back up to 80, and so on, until the spikes eventually flattened out after 10+ seconds. Once the TCP stream has stabilized, the queue length is nearly 0, but during the ramp-up, the queue can reach upwards of 100. Every time you add another flow, or bandwidth fluctuates and TCP needs to re-stabilize, the queue needs to handle excess packets during these discovery phases.
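For a rough feel of what a queue length means in time rather than packets, here is a sketch that converts a full queue into the time needed to drain it. It assumes 1 Mb = 10^6 bits, full-size 1500-byte packets, and no framing overhead; the function name is hypothetical.

```python
def queue_delay_ms(queue_len_packets, packet_size_bytes, link_mbps):
    """Milliseconds needed to drain a full queue at the given link rate."""
    bits = queue_len_packets * packet_size_bytes * 8
    return bits * 1000 / (link_mbps * 1_000_000)

for qlen in (50, 500, 4096):
    print(qlen, "packets of 1500 bytes at 100 Mb/s:",
          queue_delay_ms(qlen, 1500, 100), "ms")
```

By this estimate a 50-packet queue is only a few milliseconds of buffering at 100Mb, which fits with it being too shallow to absorb a ramp-up burst.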

This was with a single TCP stream. If I attempted to do anything else while doing a transfer, my bandwidth would again destabilize. What was happening is that TCP has an exponential ramp-up phase, which would cause the sender to overshoot 100Mb, even if only by a little, so TCP sees a lot of packet-loss and backs off, a lot. After dropping far down, it would then start to grow again, slightly slower at first, then quickly ramp up again, eventually getting past 100Mb again and dropping back down.

At 100Mb/s, TCP will potentially reach short-lived bursts of nearly 200Mb/s, but only for a few milliseconds. If the buffer was large enough, these packets would be buffered and delayed. TCP would see this because there would be a bunch of delayed ACKs, so it would back off some small amount from the last known good value. But with a small buffer, TCP sees a string of dropped packets, freaks out, and clamps down really hard.

One of the big reasons why what we see at home for congestion is different than what we see with congestion far up-stream, like Verizon's Netflix peering port, is the number of new TCP sessions going on. Many TCP stacks will burst the first N packets at full line rate, then watch for ACKs and start growing from there. An example of this is when watching Netflix or Youtube, I can see upwards of 1Gb/s for the first ~10ms. Even though my ISP quickly clamps my connection back down to 100Mb, I do not get packet-loss because TCP can detect the changes without loss. But when you have a peering port where there is a lot of congestion, all of those new connections are bursting at full line rate. You effectively get a larger percentage of unregulated packets.

Because TCP is written this way to allow it to quickly ramp up on modern high latency high bandwidth connections, when congestion happens and there is a huge amount of over-subscription, TCP is too aggressive.

On to the next part, why queue managers matter. A simple queue just does tail drop. When the queue is full, new packets get discarded until there is room. One of the detriments of this simplistic design is that a single TCP stream can hog a huge portion of the buffer, allowing that stream's packets to make it through, keeping TCP from backing off, while other streams are starved because the one stream has monopolized the queue. You also have the issue that when packet-loss happens, it happens in abrupt bursts. This causes many TCP streams to back-off at the same time. Suddenly the link is under-utilized. Then all of those TCP streams start to build back up at the same time, until the link reaches capacity, congestion happens again, and a bunch of packets get dropped, causing all of the streams to back off again. This is called global synchronization. This will result in huge swings from congested to under-utilized, and little in between.
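To make the tail-drop behavior concrete, here is a toy sketch of a packet-counted FIFO. The class and the "bulk"/"voip" flow labels are invented for illustration; this is not how pfSense implements its queues.

```python
from collections import deque

class TailDropQueue:
    """Toy FIFO that rejects new arrivals once it holds `limit` packets."""

    def __init__(self, limit):
        self.limit = limit
        self.packets = deque()
        self.drops = 0

    def enqueue(self, packet):
        if len(self.packets) >= self.limit:
            self.drops += 1          # tail drop: the newcomer is discarded
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        return self.packets.popleft() if self.packets else None

q = TailDropQueue(limit=50)
for i in range(50):
    q.enqueue(("bulk", i))           # one greedy flow fills every slot
accepted = q.enqueue(("voip", 0))    # a packet from another flow arrives
print("voip packet accepted:", accepted, "drops:", q.drops)
```

Once the greedy flow owns every slot, the other flow's packet is the one that gets rejected, which is the starvation problem described above.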

How do you stop a single flow from monopolizing? Head drop tends to help with this. Instead of saying "I'm full" and rejecting new packets, you replace old packets. This allows new packets to keep coming in. This method has pros and cons, but in many cases, the pros outweigh the cons. The other issue is binary full/not-full. Head or tail drop, this causes abrupt packet-loss. An arguably better way is to increase the rate of drops as certain thresholds are met. This allows shades of grey, possibly 50 of them (joke). This keeps many streams from being hit at the same time, reducing global synchronization and allowing for smaller reductions of congestion instead of large changes, keeping bandwidth utilization high and average usage stable.
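Raising the drop rate as thresholds are crossed is roughly the idea behind RED (Random Early Detection). A minimal sketch, assuming a linear ramp between two thresholds on the average queue length; the parameter names and values are illustrative and are not pfSense settings.

```python
def drop_probability(avg_qlen, min_th, max_th, max_p):
    """RED-style drop probability for a given average queue length."""
    if avg_qlen < min_th:
        return 0.0                   # short queue: never drop early
    if avg_qlen >= max_th:
        return 1.0                   # past the upper threshold: drop everything
    # In between, ramp linearly from 0 up to max_p.
    return max_p * (avg_qlen - min_th) / (max_th - min_th)

for qlen in (10, 25, 30, 35, 40):
    print(qlen, drop_probability(qlen, min_th=20, max_th=40, max_p=0.1))
```

Because only a small random fraction of packets is dropped at first, different streams get hit at different times instead of all at once.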

Nullity

Aside from ECN, or perhaps a prematurely undersized TCP RWIN, how does modern TCP work without dropped packets? Citing an RFC would be preferable.

The answer to why dropped packets would happen during low utilization but not during high utilization is still unanswered.

Harvy66

TCP has RFCs, but the congestion control typically does not.

Win8 uses CTCP by default http://en.wikipedia.org/wiki/Compound_TCP

The aim is to keep their sum approximately constant, at what the algorithm estimates is the path's bandwidth-delay product. In particular, when queueing is detected, the delay-based window is reduced by the estimated queue size to avoid the problem of "persistent congestion" reported for FAST and Vegas.

Another one out there http://en.wikipedia.org/wiki/TCP_Vegas

TCP Vegas detects congestion at an incipient stage based on increasing Round-Trip Time (RTT) values of the packets in the connection
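As a rough illustration of the delay-based idea in the Vegas quote, here is a simplified sketch of a window adjustment that reacts to rising RTT rather than to loss. It is not the code of any real TCP stack; the function name and the alpha/beta thresholds (in packets) are assumed values.

```python
def vegas_adjust(cwnd, base_rtt, current_rtt, alpha=2, beta=4):
    """Return the next congestion window (in packets), Vegas-style."""
    expected = cwnd / base_rtt       # rate if nothing were queued
    actual = cwnd / current_rtt      # rate actually being achieved
    queued = (expected - actual) * base_rtt   # estimated packets sitting in queues
    if queued < alpha:
        return cwnd + 1              # path looks idle: probe for more
    if queued > beta:
        return cwnd - 1              # queue is building: back off, no drop needed
    return cwnd                      # in the sweet spot: hold steady

print(vegas_adjust(100, base_rtt=0.020, current_rtt=0.020))
print(vegas_adjust(100, base_rtt=0.020, current_rtt=0.025))
```

The point is that the sender slows down as soon as the RTT grows, so a router in the path may never need to drop anything.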

Nullity

Interesting. So if his LAN clients were running Windows 8, he might see no dropped packets?

I thought the most widely adopted algorithms were CUBIC (Linux default) and NewReno (pfSense default). I am painfully ignorant of Windows nowadays.

Harvy66

It is possible to never see dropped packets, but they are bound to happen in extreme circumstances. I typically go days with no dropped packets in any queue for either direction. I have been seeding torrents quite hard lately, so I may have some more than I used to, but I have at least once gone a whole month with zero dropped packets, and that included at least a few times that I maxed out my connection, while still maintaining low pings.

Harvy66

Example

pfTop: Up Queue 1-10/10, View: queue
QUEUE               BW SCH  PR  PKTS BYTES DROP_P DROP_B QLEN BORR SUSP P/S  B/S
root_igb0          98M hfsc  0     0     0      0      0    0                   
 qACK                0 hfsc      32M 1872M      0      0    0                   
 qDefault          14M hfsc     591M  776G      0      0    0                   
 qHigh             29M hfsc    2137K  193M      0      0    0                   
 qNormal           29M hfsc    1244K  745M      0      0    0                   
root_igb1          95M hfsc  0     0     0      0      0    0                   
 qDefault          14M hfsc     289M   53G      0      0    0                   
 qHigh             28M hfsc    1633K  260M      0      0    0                   
 qNormal           28M hfsc      48M   70G      0      0    0                   
 qACK                0 hfsc     108M 6844M      0      0    0

Length is pretty much always at 0

Nullity

What network layers is the "QueueDrops" graph aware of?

Unless the queue is tracking the TCP layer, it could not know about missing/errored TCP segments. The only drops the graph would be logging would be IP packets (or ethernet frames?) that overflow the queue, and your AQM flow-controls the TCP streams, keeping them from overflowing the queue.

Assuming the above is true…

tuffcalc: Perhaps set up a high-priority queue for non-TCP only. Since a sender (LAN client) can sense congestion/flow control with TCP streams (via AQM/traffic-shaping perhaps), they most likely will not be dropped. Non-TCP has no flow-control so you can only delay or drop, so if you want fewer drops it is probably best to let the non-TCP packets preempt the resilient TCP streams. I am assuming most non-TCP is most likely delay-sensitive. This refers to egress.

I have wondered if non-TCP should always be highest priority on ingress, since those packets are already there. Dropping them is just wasted bandwidth.

Harvy66

Drops are any packets that the queue drops because it becomes full. In my case, I am using codel, so it will start dropping packets when more than 5ms of packets get backlogged. Since I have codel applied to all queues on all interfaces, except qACK, at no point during the transfer of all that data has it dropped any packets.

Packets could be dropped by other hops but at no point did PFSense drop anything. Packets will pretty much only be dropped at hops that go from a fast link to a slow link. The first two that come to mind are my ISP going from 10Gb+ down to my 100Mb connection and from my 1Gb LAN to my 100Mb uplink.

I have seen codel on my WAN drop packets, but only when BitTorrent ramped up really fast. I have also seen codel on my LAN drop packets, but only during similar situations.

If packet-drops were the only way modern TCP stacks used to detect congestion, I should always be seeing some dropped packets any time my connection is at 100% in either direction, because at those moments, PFSense is the bottleneck.
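For anyone wondering what "more than 5ms of packets get backlogged" looks like as logic, here is a heavily simplified sketch of a CoDel-style decision made at dequeue time. The 5 ms target comes from the post above; the 100 ms interval is an assumed value, and the class is an illustration rather than the real CoDel code.

```python
import math

TARGET = 0.005     # 5 ms of acceptable time spent in the queue
INTERVAL = 0.100   # assumed: delay must persist this long before dropping

class CoDelState:
    def __init__(self):
        self.first_above = None   # deadline set when delay first exceeds TARGET
        self.dropping = False
        self.count = 0            # drops made in the current dropping phase
        self.drop_next = 0.0

    def should_drop(self, sojourn, now):
        """Decide at dequeue time whether this packet should be dropped."""
        if sojourn < TARGET:
            self.first_above = None      # delay is fine again: reset
            self.dropping = False
            return False
        if self.first_above is None:
            self.first_above = now + INTERVAL   # start the clock
            return False
        if not self.dropping:
            if now < self.first_above:
                return False             # high delay, but not for long enough
            self.dropping = True
            self.count = 0
        elif now < self.drop_next:
            return False                 # between scheduled drops
        self.count += 1
        # Drops get closer together the longer the delay persists.
        self.drop_next = now + INTERVAL / math.sqrt(self.count)
        return True

state = CoDelState()
drops = sum(state.should_drop(sojourn=0.020, now=i * 0.010) for i in range(50))
print("drops during 0.5 s of sustained 20 ms delay:", drops)
```

Note that nothing here depends on the queue being full: the decision is driven by how long packets have waited, so even an arbitrarily deep queue can drop.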

Nullity

@Harvy66:

Drops are any packets that the queue drops because it becomes full. In my case, I am using codel, so it will start dropping packets when more than 5ms of packets get backlogged. Since I have codel applied to all queues on all interfaces, except qACK, at no point during the transfer of all that data has it dropped any packets.

Packets could be dropped by other hops but at no point did PFSense drop anything. Packets will pretty much only be dropped at hops that go from a fast link to a slow link. The first two that come to mind are my ISP going from 10Gb+ down to my 100Mb connection and from my 1Gb LAN to my 100Mb uplink.

I have seen codel on my WAN drop packets, but only when BitTorrent ramped up really fast. I have also seen codel on my LAN drop packets, but only during similar situations.

If packet-drops were the only way modern TCP stacks used to detect congestion, I should always be seeing some dropped packets any time my connection is at 100% in either direction, because at those moments, PFSense is the bottleneck.

Part of the confusion here is that I cannot tell when we are referring to drops caused by a full queue and drops by TCP, perhaps related to artificial stream throttling by a traffic-shaper. You are seeing no TCP drops because you are looking at QueueDrops. Run a packet sniffer if you want to see TCP drop/reorders/dupes.
TCP packet drops (congestion control) are not the only way to rate-limit, but flow control (TCP sliding window) is set at the receiver unlike congestion control.

Let's say Vegas or C-TCP was employed, how would a router know that the sender used an algorithm that could modify the Congestion Window without drops? Packet drops are supported by all TCP congestion control algos, the non-drop method is not.

OP: I just ran into a problem when I mixed a queue with TCP and non-TCP packets and used CoDel on the queue. Actually, no queue-drops registered, but CoDel was dropping ~30% of the ping packets being sent through. I now make sure to only use CoDel on queues exclusively filled with TCP only.

Harvy66

OP: I just ran into a problem when I mixed a queue with TCP and non-TCP packets and used CoDel on the queue. Actually, no queue-drops registered, but CoDel was dropping ~30% of the ping packets being sent through. I now make sure to only use CoDel on queues exclusively filled with TCP only.

So you have a queue for both TCP and ICMP and saw ICMP packet-loss but nothing registered with the queue? Are you sure the packets are being dropped by PFSense and not the connection to your ISP?

dtaht

I would be very interested in bufferbloat-reduction reports using either the rrul test suite or the new dslreports tool regarding hfsc + fairqueue + codel, in particular.

http://www.internetsociety.org/blog/tech-matters/2015/04/measure-your-bufferbloat-new-browser-based-tool-dslreports

Nullity

@Harvy66:

OP: I just ran into a problem when I mixed a queue with TCP and non-TCP packets and used CoDel on the queue. Actually, no queue-drops registered, but CoDel was dropping ~30% of the ping packets being sent through. I now make sure to only use CoDel on queues exclusively filled with TCP only.

So you have a queue for both TCP and ICMP and saw ICMP packet-loss but nothing registered with the queue? Are you sure the packets are being dropped by PFSense and not the connection to your ISP?

With CoDel on a mixed protocol queue, during a saturated upload test, if my ping packets went through said queue they had CoDel-like latency (30-60ms) but had ~30% of the pings unreplied. No QueueDrops registered. Queue averaged about 3 packets.
If CoDel was disabled on the queue, ping latency was 600ms with 0% ping packet loss. Queue averaged about 30 packets.

When I assigned pings to another queue, I had 0% packet loss and the same latency.

I did not run tcpdump to see where the packet was last seen.

Harvy66

So it sounds like the queue statistics may not be reporting drops correctly when using codel?

Nullity

@Harvy66:

So it sounds like the queue statistics may not be reporting drops correctly when using codel?

That is what I was thinking as well.

I wonder if it should register. The queue never overflows, technically, right?

Harvy66

I guess it could be possible that the way FreeBSD records dropped packets is if the queue responds with a "full" return code when attempting to enqueue from the wire to the queue. If that is the only way for it to track drops, then you'll never see drops that are internal to codel. That would also mean those times I did see drops, it was because the queue actually got full and not because of max latency in queue.
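To spell that guess out, here is a toy model of the hypothesis: the reported drop counter only moves when enqueue fails, while drops decided at dequeue time go unrecorded. This is purely a model of the speculation above, not FreeBSD or ALTQ code, and every name in it is invented.

```python
from collections import deque

class ModelQueue:
    def __init__(self, limit):
        self.limit = limit
        self.packets = deque()
        self.reported_drops = 0   # what the queue stats would show
        self.hidden_drops = 0     # drops made internally at dequeue time

    def enqueue(self, packet):
        if len(self.packets) >= self.limit:
            self.reported_drops += 1   # "full" return code: counted
            return False
        self.packets.append(packet)
        return True

    def dequeue(self, aqm_says_drop):
        packet = self.packets.popleft()
        if aqm_says_drop:
            self.hidden_drops += 1     # dropped inside the AQM: never counted
            return None
        return packet

q = ModelQueue(limit=50)
for i in range(10):
    q.enqueue(i)
# Pretend the AQM decides to drop every third packet it dequeues.
delivered = [p for i in range(10)
             if (p := q.dequeue(aqm_says_drop=(i % 3 == 0))) is not None]
print("reported:", q.reported_drops, "hidden:", q.hidden_drops,
      "delivered:", len(delivered))
```

Under that assumption the stats would show zero drops even while packets go missing, as long as the queue never actually fills.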

tuffcalc

@Nullity:

@Harvy66:

So it sounds like the queue statistics may not be reporting drops correctly when using codel?

That is what I was thinking as well.

I wonder if it should register. The queue never overflows, technically, right?

I have this issue as well. Codel drops never register.