CoDel - How to use
-
I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades on the DSLreports "bufferbloat" test and see high echo RTT when the link is loaded.
What settings result in a DSLreports A grade?
It seems I was unclear. Ideally I'd like to get back to getting an A or B on the DSLR test which I suspect means having an average latency of 100-200 ms. in both directions. I can get a B by carefully tweaking the test parameters after setting the WAN speed to ~90% of my uplink speed. I guess I was hoping for something both simpler and better.
-
OK, look.
pfSense does not have a problem with buffer bloat.
CoDel locally does practically nothing.
People are likely seeing buffer bloat improvement by simply setting the bandwidth lower than their expected upload so buffer bloat does not occur in the ISP gear.
CoDel is for transit, not for the edge.
If you were to start increasing queue length in local shapers AND enabling CoDel you might derive some benefit. But the only reason you would derive benefit from CoDel is because you increased the queue length so just don't.
-
I've tried a variety of bandwidth settings on the WAN link and I still get C-D grades on the DSLreports "bufferbloat" test and see high echo RTT when the link is loaded.
What settings result in a DSLreports A grade?
It seems I was unclear. Ideally I'd like to get back to getting an A or B on the DSLR test which I suspect means having an average latency of 100-200 ms. in both directions. I can get a B by carefully tweaking the test parameters after setting the WAN speed to ~90% of my uplink speed. I guess I was hoping for something both simpler and better.
Like Harvy66 said, the grading score is a combination of your upload and download bufferbloat. You have not yet told us whether the majority of your bufferbloat is occurring during upload or download saturation. The test should show you specifically what your upload and download bufferbloat is in millisecond measurements.
Also, like I already mentioned, you will get a more accurate measurement if you look at the pfSense Quality graphs (or you could simply run a ping test yourself…). The dslreports test is a program within a browser, so it is not always accurate. dslreports says my latency is ~5x worse than it actually is...
We need more info from you. For example, what is your latency (in milliseconds) during idle, upload saturation, and download saturation, with codel and without?
I have tried using codelq and it works but as soon as I start an upload to google drive I get about 50% packetloss and then the gateways appear offline according to pfsense but ping stays low.
Where are you recording 50% packet loss? tcpdump? Queue drops?
-
OK, look.
pfSense does not have a problem with buffer bloat.
CoDel locally does practically nothing.
People are likely seeing buffer bloat improvement by simply setting the bandwidth lower than their expected upload so buffer bloat does not occur in the ISP gear.
CoDel is for transit, not for the edge.
If you were to start increasing queue length in local shapers AND enabling CoDel you might derive some benefit. But the only reason you would derive benefit from CoDel is because you increased the queue length so just don't.
Hmm?
Codel is for any network node that receives more traffic than it can immediately send, which is usually at the edge.
Transit nodes (backbones) have almost no need for Codel, since they keep latency low by using the best possible option: always having excess bandwidth (~50%) available.
The Codel authors specifically advise that it is most useful at gateway/edge routers. They even spearheaded the creation of a firmware (CeroWRT) for consumer routers specifically.
Without Codel, my ping hit 600ms during upload saturation (50 packet default queue depth).
With Codel, during upload saturation, my queue is usually 1-4 packets deep with a ~50ms ping.
I wouldn't say pfSense has a bufferbloat problem (no more than everyone else), but Codel does make a big difference in worst-case latency, that's a fact.
-
I would disagree. On a high-speed connection with 50 buffers CoDel will do practically nothing. The benefit is being derived from not overloading the upstream (where the bufferbloat occurs) with the bandwidth setting.
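For a rough sense of scale, here is a back-of-the-envelope sketch of how long a full 50-packet queue takes to drain at different link rates. It assumes full-size 1500-byte packets and a plain FIFO, and the function name is made up for illustration. CoDel's default target delay is 5 ms, so the comparison is against that figure.

```python
def queue_drain_ms(packets: int, rate_mbps: float, packet_bytes: int = 1500) -> float:
    """Milliseconds needed to transmit a full queue of `packets` at `rate_mbps`.

    Assumes every packet is `packet_bytes` long (1500 is an assumption,
    roughly a full Ethernet payload) and that nothing else shares the link.
    """
    queue_bits = packets * packet_bytes * 8
    return queue_bits / (rate_mbps * 1_000_000) * 1000


# Worst-case delay added by a full 50-packet queue at a few link rates.
for rate in (1, 10, 100):
    print(f"{rate:>4} Mbit/s: {queue_drain_ms(50, rate):.1f} ms")
```

Under these assumptions a full 50-packet queue holds about 6 ms of traffic at 100 Mbit/s, but about 600 ms at 1 Mbit/s, so how much a 50-packet limit matters depends heavily on the rate the queue drains at.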
-
I would disagree. On a high-speed connection with 50 buffers CoDel will do practically nothing. The benefit is being derived from not overloading the upstream (where the bufferbloat occurs) with the bandwidth setting.
Yes, if you have your buffers set to 50, which is the default. I had the issue with my 100/100 connection that a buffer size of 50 was too small and it was dropping quite a few packets under sustained load. My ISP has a fully uncongested network and allows 1Gb microbursts from datacenters half-way across the USA. I know at least Google, and I think several others, configure their TCP stacks to transfer at full line rate for the first several packets. Their line rate is 10Gb-40Gb. My ISP does traffic shaping in their Cisco core router and it seems to have some anti-bufferbloat features built in, which not only gives me an A on DSLReports without any shaping on my end, but also allows short-lived bursts through. When running a Wireshark capture on my WAN interface, a single TCP stream from Google will give me back-to-back packets at 1Gb/s for the first 100ms-250ms before the Cisco router starts to traffic shape. I'm only connected to my ONT via 1Gb Ethernet, so who knows how fast the actual burst is.
Since I shaped my LAN to 100Mb/s, this 1Gb burst quickly fills my 50 packet queue and I get bursts of packet loss. Even worse, some of these TCP streams fully transfer in the 100ms-250ms window. Since I will ACK all of the packets, Google's TCP stack thinks I can handle the rate they're sending at. As devices make more requests to Google, I get more and more 1Gb bursts until my average utilization gets too high and the ISP's rate limiting starts to kick in.
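As a rough sketch of why the queue fills so quickly, the following estimates how long a 1Gb/s burst takes to overflow a 50-packet queue that drains at 100Mb/s. It assumes 1500-byte packets and ignores any reaction from the sender; the function name is hypothetical.

```python
import math


def queue_fill_ms(queue_packets: int, in_mbps: float, out_mbps: float,
                  packet_bytes: int = 1500) -> float:
    """Milliseconds until an empty queue overflows when traffic arrives
    faster than it drains. Returns math.inf if the queue never fills."""
    net_mbps = in_mbps - out_mbps  # rate at which the backlog grows
    if net_mbps <= 0:
        return math.inf
    queue_bits = queue_packets * packet_bytes * 8
    return queue_bits / (net_mbps * 1_000_000) * 1000


# 1Gb/s arriving, 100Mb/s leaving, 50-packet queue.
print(f"{queue_fill_ms(50, 1000, 100):.2f} ms")
```

Under these assumptions the queue overflows in well under a millisecond, so most of a 100ms burst arrives at a queue that is already full.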
Maybe you're thinking, oh, he's just getting flooded by an on-premises CDN. No. My traffic comes from a Level 3 to Google PoP 300 miles away. I even configured YouTube to pull from a European CDN in Germany. 130ms ping, less than 1ms of jitter, and still getting 1Gb bursts from them. I guess this is the price I pay for having an uncongested ISP.
-
I would disagree. On a high-speed connection with 50 buffers CoDel will do practically nothing. The benefit is being derived from not overloading the upstream (where the bufferbloat occurs) with the bandwidth setting.
Rate-limiting is a separate function (the function that avoids overloading upstream nodes) from CoDel. Technically, ALTQ rate-limiting is done by either the Token Bucket Regulator (interface rate-limiting) or the queueing discipline (HFSC, CBQ, FAIRQ).
CoDel, to keep the network buffers small, simply drops packets depending on the amount of time packets are queued.
Rate-limiting puts pfSense in control (avoids overloading upstream), then CoDel mitigates bufferbloat (by dropping packets). Two separate functions.
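To make the "drops packets depending on the amount of time packets are queued" part concrete, here is a simplified sketch of CoDel's drop decision. It is not the ALTQ code that pfSense runs, and all names are made up for illustration. It uses CoDel's default 5 ms target and 100 ms interval, and it leaves out details of the real algorithm (such as never dropping when less than one packet is queued, and reusing the drop count when dropping restarts).

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Optional

TARGET_MS = 5.0      # acceptable standing queue delay (CoDel default)
INTERVAL_MS = 100.0  # how long delay must stay above target before dropping


@dataclass
class CoDelState:
    first_above_ms: Optional[float] = None  # deadline set when delay first exceeds target
    dropping: bool = False
    count: int = 0            # drops since entering the dropping state
    drop_next_ms: float = 0.0


def should_drop(state: CoDelState, sojourn_ms: float, now_ms: float) -> bool:
    """Decide whether the packet dequeued at now_ms should be dropped.

    sojourn_ms is how long that packet waited in the queue.
    """
    if sojourn_ms < TARGET_MS:
        # Delay is acceptable: disarm and leave the dropping state.
        state.first_above_ms = None
        state.dropping = False
        return False
    if state.first_above_ms is None:
        # First packet above target: give the queue one interval to recover.
        state.first_above_ms = now_ms + INTERVAL_MS
        return False
    if not state.dropping:
        if now_ms < state.first_above_ms:
            return False
        # Delay stayed above target for a full interval: start dropping.
        state.dropping = True
        state.count = 1
        state.drop_next_ms = now_ms + INTERVAL_MS / math.sqrt(state.count)
        return True
    if now_ms >= state.drop_next_ms:
        # Still too slow: drop again, spacing drops by interval / sqrt(count).
        state.count += 1
        state.drop_next_ms += INTERVAL_MS / math.sqrt(state.count)
        return True
    return False


def simulate(sojourn_ms: float, times_ms: Iterable[float]) -> List[float]:
    """Return the dequeue times at which a constant-delay queue drops a packet."""
    state = CoDelState()
    return [t for t in times_ms if should_drop(state, sojourn_ms, t)]


# One dequeue every 10 ms for half a second, each packet having waited 50 ms.
print(simulate(50.0, range(0, 500, 10)))
```

With a constant 50 ms queueing delay the sketch tolerates the first 100 ms, then drops at increasingly short spacing until the delay falls back under the target. None of this limits the rate; that is still the job of the shaper's bandwidth setting.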
-
If you were to start increasing queue length in local shapers AND enabling CoDel you might derive some benefit. But the only reason you would derive benefit from CoDel is because you increased the queue length so just don't.
I didn't increase the queue length. Did something change it as a result of using codel?
In any case under some load situations codel has improved my performance. Under others it doesn't but it doesn't seem to make things worse so it's a net win.
-
So I've read this thread, and given the suggestion in it a try. However, when I enable Codel, I seem to lose about 20% of download speed (100mbps down to 75-80)… though it does solve the buffer bloat problem.
Is this expected behaviour?
~Spritz
-
So I've read this thread, and given the suggestion in it a try. However, when I enable Codel, I seem to lose about 20% of download speed (100mbps down to 75-80)… though it does solve the buffer bloat problem.
Is this expected behaviour?
~Spritz
It should be limited to whatever bitrate you set.
-
It should be limited to whatever bitrate you set.
Thank you for the response Nullity, I appreciate it. I've read many posts over the last week or so from you and Harvy66.
So it appears then if I'm seeing a drop in throughput, something isn't configured correctly… though I'm at a loss as to what it could be. Assuming the kids go down easy tonight, I'll play with it further tonight.
~Spritz
-
What rate did you set on your LAN and WAN?
-
What rate did you set on your LAN and WAN?
105/9.5mbps… After multiple speedtests, as well as monitoring usenet downloads I get 117/11mbps.
Again, thank you.
~Spritz
-
I've tried a few different ways to be honest. I was hoping to figure it out before posting here with my tail between my legs.
But here is what I've tried (note that the rates I used were consistent) -->
1. CODELQ - Only
2. HFSC - Parent, CODELQ - Child
3. FAIRQ - Parent, CODELQ - Child
4. CBQ - Parent, CODELQ - Child
Now 1-4 gave me very similar results, with only slight fluctuations in the throughput (1-2mbps). When I tried putting in an artificially lowered rate, it honored it.
I also changed the advanced settings to enable net.inet.tcp.inflight.enable=1, which seemed to help smooth out ping spikes, and I disabled Hardware TSO offload as per a link posted by Harvy (see, I really did read and try to figure this out on my own!).
Lastly I ran the wizard using HFSC queue, and it seemed to work as expected (buffer bloat under control, correct rate). However, two things I discovered that makes this a less than ideal situation -->
1. I don't like having to use a full-blown QoS implementation without knowing why the simpler one didn't work. Nor do I love the potential support overhead this introduces.
2. When I went into the individual queues after running the wizard to enable Codel, I was unable to because the bandwidth numbers were decimals. It wouldn't let me save unless I put whole #'s (wherever a % was). This is despite the fact that the wizard uses decimals... is this a bug?
Thanks!
~Spritz
PS - If something is unclear, or terminology is incorrect, I apologize. I'm at work and don't have access to my pfsense box currently, so I'm pulling all this from memory.
I also tried running the QoS wizard, and this seemed to work quite well. It honored my set rate, as well as keeping my buffer bloat under control... but I'm not keen on just using the wizard unless I know why the other options didn't work.
-
Very interesting subject indeed.
Just a quick question: why, when I try to use sysctl net.inet.tcp.inflight in the console, do I get
sysctl: unknown oid 'net.inet.tcp.inflight': No such file or directory
Shouldn't I see it there?
-
So I get a lot of bufferbloat when DOWNLOADS saturate my link. Uploads don't seem to cause the bloat.
I have codelq, and only codelq, turned on for my WAN and LAN connections. I have a 300/20 connection. For the WAN speed I set 17 Mb, and for the LAN connections I put 1Gb.
Is there anything that can really be done about the download bloat? My pfSense is also my central router with 3 VLANs, so I'm not sure I can do much on the LAN side as I don't want to ruin my inter-VLAN routing speeds.
Jason
-
Set your LAN to 280Mb and see how it goes.
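To put the numbers in this exchange side by side, here is a small sketch that expresses each shaper rate as a share of the advertised 300/20 link. The function name is just for illustration.

```python
def share_of_link(shaper_mbps: float, link_mbps: float) -> float:
    """Shaper rate as a fraction of the link rate (0.9 means 90%)."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    return shaper_mbps / link_mbps


# Download: the suggested 280Mb LAN rate against a 300Mb link.
print(f"LAN: {share_of_link(280, 300):.0%}")
# Upload: the 17Mb WAN rate against a 20Mb link.
print(f"WAN: {share_of_link(17, 20):.0%}")
# The 1Gb LAN setting is far above the 300Mb link, so it never limits downloads.
print(f"LAN at 1Gb: {share_of_link(1000, 300):.0%}")
```

A LAN rate above the real download speed means the queue on the LAN side never builds, so the shaper has nothing to manage; bringing it just under the link rate moves the queue into pfSense.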
-
I'm also trying to battle buffer bloat by implementing CoDel. While it works well to eliminate the bloat, it also drops my download speed by 100Mbps. Here are the details.
My link speed is 250/10
dslreports speed tests without any shaping applied show speeds of 250/11, but I usually get a D or F for buffer bloat.
When I activate CoDel on the LAN/WAN interfaces with bandwidth set to 9Mbps on the WAN and 230Mbps on the LAN, my buffer bloat grade goes to an A, but my download speed drops to 148Mbps. I've tried raising the LAN interface bandwidth up to 240 and 250Mbps but this doesn't increase the download speeds past 150Mbps.
Why is CoDel taking such a big bite out of my download bandwidth? Is there a better way to eliminate buffer bloat without such a big impact on download speed?
Hardware: Netgate SG-8860
pfSense Version: 2.3.2-RELEASE-p1
-
Where are you setting "Codel"? Got a screenshot? Sorry to sound not trusting, but you'd be amazed what people claim they're doing when they're not.