Playing with fq_codel in 2.4
-
My interpretation of information from the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z) is that masks set on pipes result in dynamic pipes and masks set on queues result in dynamic queues, the difference being that:
when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).
The takeaway is that if you have dynamic pipes, you won't get a cumulative bandwidth cap, but rather a bandwidth cap per flow. That would be useful if you don't want anyone on your network to be able to use more than a certain amount of bandwidth, but not useful if you want to prevent the total bandwidth usage of all users on your network from exceeding your ISP's bandwidth caps.
What's confusing to me is that I believe I should have dynamic queues, but I'm not seeing any evidence of them (but admittedly may be looking in the wrong place by expecting to see that evidence in the output from ipfw sched show).
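To make the distinction concrete, here is a minimal dry-run sketch of the two configurations as I read the man page. The pipe and queue numbers, the 10Mbit/s figure, and the dst-ip mask are illustrative assumptions, not anyone's actual config. The commands are only printed unless IPFW is pointed at the real binary.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set IPFW=ipfw (as root, with dummynet loaded) to actually apply them.
IPFW="${IPFW:-echo ipfw}"

# Dynamic pipes: the mask is on the pipe, so each distinct destination IP
# gets its OWN 10Mbit/s pipe (a per-host cap, no cumulative cap).
dynamic_pipes() {
    $IPFW pipe 1 config bw 10Mbit/s mask dst-ip 0xffffffff
}

# Dynamic queues: the mask is on the queue, so each distinct destination IP
# gets its own queue, and all of those queues SHARE the single 10Mbit/s pipe 2.
dynamic_queues() {
    $IPFW pipe 2 config bw 10Mbit/s
    $IPFW queue 1 config pipe 2 weight 1 mask dst-ip 0xffffffff
}

dynamic_pipes
dynamic_queues
```

With three active hosts, the first form would allow up to 30Mbit/s in total, while the second would keep the total at 10Mbit/s and split it between the hosts.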
-
Has anyone else encountered an issue with fq_codel with 2.4.3? I haven't dug into it yet, but it appears to have stopped working for me…
-
I haven't run tests yet, but I'll try to over the weekend. From all outward appearances (ipfw sched show, etc.) it seemed the same though. That said, I was never completely satisfied I had it configured correctly. I don't know any way of getting meaningful visibility into what it's actually doing, and I never saw evidence of dynamic queuing in the output from ipfw sched show, only a single line per limiter with "Source IP/Port" and "Dest. IP/Port" both 0.0.0.0/0, even though I have queues with masks configured under my limiters. But in any case, my bufferbloat did seem to be under control based on the dslreports test.
Are you judging that your setup isn't working by the results of that same bufferbloat test, or some other metrics from pfSense itself?
-
Are you judging that your setup isn't working by the results of that same bufferbloat test, or some other metrics from pfSense itself?
I run latency monitoring through pfSense (i.e. dpinger from lan through firewall to my ISP's local hub). With 2.4.3, I am now seeing large spikes (both latency base and stddev) when hosts in the local network are actively downloading. Seeing poor results with dslreports speed test as well. Previously was A/A+, now D.
-
I checked my quality monitoring logs and haven't seen any latency spikes since I updated to 2.4.3 yesterday, but I also don't know how much the connection has been stressed during that period. Are you using the shellcmd method of persisting the application of fq_codel to your limiters? And have you sanity checked that it actually was applied by running ipfw sched show? If I think of anything else I'll let you know, and I'll get back with my test results once I'm able to run them.
-
Are you using the shellcmd method of persisting the application of fq_codel to your limiters? And have you sanity checked that it actually was applied by running ipfw sched show?
Yes, using shellcmd, and the queues appear to still be set up correctly.
ipfw sched show:
00001: 150.000 Mbit/s 0 ms burst 0
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 0 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 1
00002: 20.000 Mbit/s 0 ms burst 0
q00002 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 0 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 2
The firewall rules still appear to be assigning packets to queues correctly. If I halve the bandwidth in the limiter for upload or download, I do see a corresponding reduction in speed for the expected direction.
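For anyone who wants to script that sanity check rather than eyeball it, here is a rough sketch that pulls the scheduler type out of ipfw sched show output. The function names are made up for illustration, and it assumes the "sched N type NAME" wording seen in the output above.

```shell
#!/bin/sh
# Print one "sched <n> <TYPE>" line per scheduler found in
# `ipfw sched show` output read from stdin.
sched_types() {
    awk '{
        for (i = 1; i + 3 <= NF; i++)
            if ($i == "sched" && $(i + 2) == "type")
                print "sched", $(i + 1), $(i + 3)
    }'
}

# Succeed only if at least one scheduler is listed and all are FQ_CODEL.
all_fq_codel() {
    sched_types | awk '$3 != "FQ_CODEL" { bad = 1 } END { exit (NR == 0 || bad) }'
}

# Usage on the firewall (needs root):
#   ipfw sched show | sched_types
#   ipfw sched show | all_fq_codel || echo "fq_codel not applied everywhere"
```

If the shellcmd did not run after a reboot or a limiter change, the type reported here should be something other than FQ_CODEL.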
-
Upgraded to 2.4.3 yesterday evening and fq_codel still working as expected.
Did you edit any files when you set it up originally? Is there anything particularly unique about your setup, e.g. running fq_codel on VLANs, etc.? Also when you do run a speed test and then keep refreshing ipfw sched show during the process, do you see traffic being passed?
Hope this helps.
-
No VLANs, no file edits, using shell command:
ipfw sched 1 config pipe 1 type fq_codel; ipfw sched 2 config pipe 2 type fq_codel
Here are some ipfw sched show outputs from during the download portion of a dslreports test using http:
00001: 150.000 Mbit/s 0 ms burst 0
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 0 ip 0.0.0.0/0 0.0.0.0/0 2285 3421723 193 289500 10
00002: 20.000 Mbit/s 0 ms burst 0
q00002 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 2
 0 ip 0.0.0.0/0 0.0.0.0/0 290 15423 0 0 0

00001: 150.000 Mbit/s 0 ms burst 0
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 0 ip 0.0.0.0/0 0.0.0.0/0 3793 5686580 90 135000 15
00002: 20.000 Mbit/s 0 ms burst 0
q00002 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 2
 0 ip 0.0.0.0/0 0.0.0.0/0 369 19572 0 0 0

00001: 150.000 Mbit/s 0 ms burst 0
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 0 ip 0.0.0.0/0 0.0.0.0/0 5577 8331991 140 208833 22
00002: 20.000 Mbit/s 0 ms burst 0
q00002 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 2
 0 ip 0.0.0.0/0 0.0.0.0/0 50 3092 0 0 0

00001: 150.000 Mbit/s 0 ms burst 0
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 0 ip 0.0.0.0/0 0.0.0.0/0 427 634711 91 136500 0
00002: 20.000 Mbit/s 0 ms burst 0
q00002 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
 mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
 Children flowsets: 2
 0 ip 0.0.0.0/0 0.0.0.0/0 184 9635 0 0 0
Of course, if I use https for dslreports, the problem is not nearly as bad (as you would expect).
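Since comparing these snapshots by eye is tedious, here is a small sketch that reduces each one to the counters that matter. It assumes the usual multi-line layout of ipfw sched show (a "NNNNN:" line starting each scheduler, and bucket rows with nine fields in the order given by the BKT header); the function name is just for illustration.

```shell
#!/bin/sh
# Summarise per-bucket counters from `ipfw sched show` output on stdin.
# Bucket row fields: BKT Prot Src Dst Tot_pkt Tot_bytes Pkt Byte Drp
bucket_stats() {
    awk '
        # A line such as "00001: 150.000 Mbit/s ..." starts a new scheduler.
        /^[0-9]+:/ { id = $1; sub(/:$/, "", id) }
        # Bucket rows: numeric bucket id, nine fields, address/port in field 3.
        NF == 9 && $1 ~ /^[0-9]+$/ && $3 ~ /\// {
            printf "sched %s tot_pkts=%s queued_pkts=%s drops=%s\n", id, $5, $7, $9
        }
    '
}

# Usage on the firewall during a speed test (needs root):
#   while sleep 1; do ipfw sched show | bucket_stats; done
```

Run in a loop during a test, this gives one line per active bucket, which makes it easier to see whether the queue depth and drop counters move while the connection is loaded.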
-
I'm not using fq_codel, but I am using fairq+codel.
I've noticed since updating to 2.4.3 that a speedtest gives me exactly 50Mb/s. This is interesting, because this is exactly the BW I have set on my LAN queue for web traffic.
Previously a speedtest would burst high up towards the 100Mb/s my connection is.
Downloading torrents etc. still seems to be fine; that is, multiple sessions seem to be unaffected. Also, fast.com checks report up around 70-80Mb/s. But Speedtest is always almost exactly the 50Mb/s I've set for web traffic.
But certainly 2.4.3 seems to have changed something with regards to traffic shaping etc.
Apologies for taking the fq_codel thread slightly offtopic!
-
@muppet:
I'm not using fq_codel, but I am using fairq+codel.
I've noticed since updating to 2.4.3 that a speedtest gives me exactly 50Mb/s. This is interesting, because this is exactly the BW I have set on my LAN queue for web traffic.
Previously a speedtest would burst high up towards the 100Mb/s my connection is.
Downloading torrents etc. still seems to be fine; that is, multiple sessions seem to be unaffected. Also, fast.com checks report up around 70-80Mb/s. But Speedtest is always almost exactly the 50Mb/s I've set for web traffic.
But certainly 2.4.3 seems to have changed something with regards to traffic shaping etc.
Apologies for taking the fq_codel thread slightly offtopic!
Yeah, AFAIK, limiters (dummynet) & fairq+codel (ALTQ) are very much separated in software. Though, your bug is still interesting. Maybe.
-
dennypage
The best way to check whether it is related to pfSense is to downgrade and re-test. You can also try lowering your bandwidth limits or playing with the KPTI option.
As for me, nothing changed in bufferbloat, but I have over-provisioned hardware.
-
Does anybody know why dslreports doesn't show bufferbloat? It never gives me a score regardless of settings on that website.
-
Does anybody know why dslreports doesn't show bufferbloat? It never gives me a score regardless of settings on that website.
It depends on where you are. If you live in Australia or NZ, it doesn't seem to work for us (it gives an error and doesn't report anything).
-
@w0w:
dennypage
The best way to check whether it is related to pfSense is to downgrade and re-test. You can also try lowering your bandwidth limits or playing with the KPTI option.
As for me, nothing changed in bufferbloat, but I have over-provisioned hardware.
KPTI doesn't affect it. I wouldn't expect it to, as limiters are kernel code. I've tested with half the available bandwidth, and still encounter the bloat issue. I don't think I'm under-provisioned with hardware; I have a 4860 running a 150/20 connection.
I'm assuming that this is the result of a FreeBSD kernel change. I haven't downgraded and run a test to confirm yet, but that's about what I'm down to.
I’m taking it that you haven’t seen any change with a multichannel http test?
-
I have spent endless years shaping traffic because of our poor LTE service in South Africa.
I just want to say THANKS to all for this guide. I did add a little twist, though: I am still running the HFSC traffic shaper, with limiter children in addition to the queues from the traffic shaper. I can still manage my ports as before, and just give them a nice hard piece of codel if needed.
Because speeds in my area fluctuate a lot, the schedules work wonders to "dynamically" change the pipe. This has now brought ping down from 100 to roughly 30-43 ms, which is huge.
Another thing is that when using the traffic shaper with fq_codel I can still watch Netflix / YouTube in full HD with no buffering while gaming at acceptable latency.
Tests were done while streaming Netflix, with minimal impact: my girlfriend didn't even notice while playing CS:GO, and the test added 0.3-4 ms to her game.
Note: because international latency is terrible, please don't rely on the latency results from dslreports.
Note: speeds are currently set at 10Mb / 10Mb, with 2 pipes of 4 children each: 75 / 50 / 5 / 2.
Where I place them in the shaper determines which children I am using.
Yep, I know it's weird, but for me it works.
![local speedtest.net before.png](/public/imported_attachments/1/local speedtest.net before.png)
![local speedtest.net after.png](/public/imported_attachments/1/local speedtest.net after.png)
-
I just wanted to report that I did run some tests over the weekend using 2.4.3 and don't see any differences (i.e. fq_codel still seems to be working for me). I tested using the dslreports speed test and also ran flent's rrul and rrul_torrent tests. I realize that just another report of "works for me" isn't especially helpful, though it does seem to point to some specific configuration issue rather than a wholesale breakdown of fq_codel in 2.4.3.
-
I just wanted to report that I did run some tests over the weekend using 2.4.3 and don't see any differences (i.e. fq_codel still seems to be working for me). I tested using the dslreports speed test and also ran flent's rrul and rrul_torrent tests. I realize that just another report of "works for me" isn't especially helpful, though it does seem to point to some specific configuration issue rather than a wholesale breakdown of fq_codel in 2.4.3.
No, it still helps. Just to be sure, when you test with dslreports, you used http rather than https, yes? My tests with https still show an A most of the time, but http is showing C/D. I haven't run anything with rrul.
-
That's a very interesting point. I didn't explicitly select HTTP, and I use the HTTPS Everywhere plugin, so I quite possibly was testing with HTTPS. I'm at work, but when I get home this evening I'll run a test with HTTP and get back to you. Are you aware of a specific reason why that may make a difference, or just know that it does based on your testing?
-
I ran some tests with flent/rrul, and it doesn't show any issue at all. I also ran a rrul with the limiters disabled. The bad results with limiters disabled confirmed that basic fq_codel is working. :)
Of note is that rrul seems to be hard coded to use 4 streams only. Testing with DSLReports restricted to 4 streams (http/websocket) doesn't show an issue either.
DSLReports with 8 streams is marginally okay. Above 12 streams or so, DSLReports starts to tank. By the time it gets to 24 streams, it's digging big holes in the ground.
As it seems to be an issue with the number of simultaneous streams, I'm going to try multiple lan systems running simultaneous tests next.
-
dennypage
http://www.dslreports.com/speedtest/31667858 plain http
But I found that wifi clients are in a different situation: bufferbloat is rated "C". I remember that last summer I activated QoS on the wifi access point to eliminate this problem, and now it looks like it does not work anymore. I don't think it's related to pfSense; it may instead be related to a wifi client whose Android kernel was updated against Spectre/Meltdown. I