Playing with fq_codel in 2.4
-
I'm getting the feeling that "ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel" doesn't mean the same thing when you have multiple *queues. I find it interesting that your "ipfw sched show" says something like
Sched1 weight 75 fq_codel
child flowsets: 3 2 1
Sched2 weight 25 fq_codel
child flowsets: 6 5 4
Why do your two schedulers claim to have different weights if they're unrelated? Start small. Do a single queue per direction, then work your way up to 3 each.
*I use the term "queue" in the general sense, not in the technical context of the ipfw command
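To make "start small" concrete, here is roughly what a one-queue-per-direction layout looks like in raw ipfw/dummynet terms. Treat it as a sketch only: the bandwidth figures and object numbers are placeholders, and on pfSense the pipe/queue lines are normally generated by the limiter GUI rather than typed by hand.

ipfw pipe 1 config bw 150Mbit/s           # download cap (aggregate, no mask)
ipfw pipe 2 config bw 20Mbit/s            # upload cap (aggregate, no mask)
ipfw queue 1 config pipe 1 weight 100     # single child queue feeding pipe 1
ipfw queue 2 config pipe 2 weight 100     # single child queue feeding pipe 2
ipfw sched 1 config pipe 1 type fq_codel  # switch each scheduler to fq_codel
ipfw sched 2 config pipe 2 type fq_codel

Once that behaves, adding more weighted child queues per direction is just more "ipfw queue N config pipe M weight W" lines (or more queues under each limiter in the GUI).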
Well, I thought I somehow understood the concept here, but evidently not :D
I thought
The floating rule decides which limiter to use
default traffic upload -> default limiter upload weight 25
default traffic download -> default limiter download weight 25
VIP traffic upload -> high limiter upload weight 75
… and so on
-
Thanks to everyone who has contributed to this thread. I've read through the whole thing and am beginning to try out FQ_CODEL myself, but I have two distinct points of confusion that I didn't see addressed anywhere:
-
Why are masks used on the limiters? The explanation of masks on the limiter configuration page states that "If "source" or "destination" slots is chosen a dynamic pipe with the bandwidth, delay, packet loss and queue size given above will be created for each source/destination IP address encountered, respectively. This makes it possible to easily specify bandwidth limits per host." But isn't that exactly what we don't want? We want the limiter to serve as a cumulative bandwidth cap, such that the total bandwidth usage of all hosts on the network will not exceed the ISP's upload or download caps.
-
Is it really necessary to create queues for the limiters? I had thought it would suffice to simply create one limiter each corresponding to my upload and download caps, and then assign them as my in and out pipe respectively using LAN firewall rules.
Thanks in advance if anyone can help to clarify these points for me.
-
-
1) Source/Destination masks are actually used on the queues under the limiters, but are not necessary on the limiters themselves.
2) Yes, you'll have to create queues to get fq_codel to work. The codel ("controlled delay") part of fq_codel controls the size of the packet queue and helps prevent bufferbloat by dropping packets when necessary, while the fq ("fair queuing") part helps ensure that flows get fair access to the bandwidth (vs. one flow dominating another, etc.). For some further reading, check out these links:
https://en.wikipedia.org/wiki/CoDel
https://tools.ietf.org/html/rfc8290#section-1.3
Hope this helps.
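As a rough illustration of point 1 in raw ipfw terms (placeholder numbers, not anyone's actual config): the bandwidth cap lives on the pipe, which has no mask, while the per-host mask goes on the child queue underneath it.

ipfw pipe 1 config bw 150Mbit/s                     # aggregate download cap, no mask
ipfw queue 1 config pipe 1 mask dst-ip 0xffffffff   # one dynamic queue per LAN destination host
ipfw sched 1 config pipe 1 type fq_codel            # fq_codel on the scheduler behind pipe 1

In pfSense the limiter and queue pages generate the pipe/queue lines for you, which is why the shellcmd approach discussed in this thread only needs to touch the sched lines.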
-
-
Thanks tman222. I think that makes sense. It definitely makes sense that fq_codel needs queues to work; I just got confused by the fact that the ipfw sched show command seems to indicate that fq_codel has been applied to the limiters themselves (i.e. without queues). It had me thinking that each limiter itself had one queue implicitly, but you could create child queues if desired.
As to the masking, is the goal to have a new queue created for each host, but the cumulative bandwidth is still capped by the parent limiter? I find I'm still confused about why multiple queues would be desirable, and that seems to be the main - if not only - purpose of using masks. From the second (ietf.org) link you provided, this excerpt would seem to suggest that multiple queues are created implicitly?
The intention of FQ-CoDel's scheduler is to give each flow its own queue, hence the term "flow queueing". Rather than a perfect realisation of this, a hashing-based scheme is used, where flows are hashed into a number of buckets, each of which has its own queue.
That makes it sound like the "bucket size" parameter on the limiter itself dictates how many queues there are, and packets are placed into them when they are hashed
on the 5-tuple of source and destination IP addresses, source and destination port numbers, and protocol number
So it seems like there are already queues implicitly involved, but there is also the capability to explicitly add more queues, and it's the need for and/or reason behind those that I'm struggling to understand.
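For what it's worth, my working assumption (untested, so take it with a grain of salt) is that the number of hash buckets is governed by the scheduler's own "flows" parameter rather than by the limiter's bucket size field, and that it can be overridden on the same line that sets the scheduler type, along the lines of:

ipfw sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 2048 ecn

where the values shown are just the defaults I see in ipfw sched show, with flows bumped from 1024 to 2048 as an example.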
I'm sure I'm just not grasping some (or multiple) concepts here and really appreciate your input.
-
You raise some good points. To be honest, I'm not 100% sure what would happen if you just create two limiters and then try to enable fq_codel. If fq_codel itself creates a queue for each flow, then this might be enough for things to work and you don't need the additional queue underneath each limiter. In my case, I actually use multiple queues under each limiter to help me manage bandwidth fairness across several VLANs by assigning weights to those queues.
I think you might need to experiment and report back your findings. Try creating just two limiters and then enabling fq_codel. If that doesn't yield the desired results, create two limiters with one queue under each and see if that changes the results.
Hope this helps.
-
I don't really have quantitative results yet, but I'm hoping to run some more tests tonight and report back. Basically, after reading over the traffic shaping section of the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z), I believe I've come to understand the guidance on using queues underneath pipes and on using masks on those queues. There's a lot of great information in that man page, but this excerpt is particularly relevant:
Thus, when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).
Using masks results in dynamic queues/pipes. So we do not want dynamic pipes (because we want an overriding bandwidth cap) but we do want dynamic queues. I believe that by setting a source address mask of 32 (IPv4) and 128 (IPv6) on upload queues and a destination address mask of 32 (IPv4) and 128 (IPv6) on download queues, each host on the LAN will get its own queue, but the aggregate bandwidth usage of all those queues will be constrained by their common parent pipe.
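In raw ipfw terms, my understanding of the intended layout is something like the following (IPv4 masks only for brevity; the /128 setting in the GUI covers the IPv6 side, and as before the pipe/queue lines are what the limiter GUI generates rather than anything typed by hand):

ipfw pipe 1 config bw 150Mbit/s                     # download cap, no mask, so a single shared pipe
ipfw queue 1 config pipe 1 mask dst-ip 0xffffffff   # dynamic download queue per LAN host
ipfw pipe 2 config bw 20Mbit/s                      # upload cap, no mask
ipfw queue 2 config pipe 2 mask src-ip 0xffffffff   # dynamic upload queue per LAN host
ipfw sched 1 config pipe 1 type fq_codel
ipfw sched 2 config pipe 2 type fq_codel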
I'll try to run some tests later tonight to get actual results to post. One thing I struggle with is that beyond running the dslreports speed/bufferbloat test (and running it on multiple hosts simultaneously to see whether bandwidth is being both capped and fairly shared), I don't know how to get visibility into what is actually taking place with respect to the dummynet config (e.g. is it actually dynamically creating queues?). Of course, if the network ends up behaving well that's all that ultimately matters, but I do like understanding things when I can ;)
-
While I've been poking around, I have managed to increase my confusion. From what I can tell, ipfw is the interface by which dummynet is configured (hence the ipfw commands that are set up with shellcmd in order to persist the application of fq_codel through reboots). But does the ipfw service need to be running in order for dummynet to work? Because on my system, kldstat shows dummynet.ko loaded but not ipfw.ko. And executing ipfw show results in the message ipfw: retrieving config failed: Protocol not available. But ipfw queue show, ipfw sched show, and ipfw pipe show all work as expected. My presumption is that this is okay, because pf is still the firewall subsystem being used, and can somehow throw traffic into dummynet queues even though dummynet is a subset of the ipfw firewall subsystem, but because ipfw is not performing firewall duties it does not need to be (and should not be) loaded itself. Perhaps this ability of the pf firewall to assign traffic to dummynet queues is a pfSense-specific patch?
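For anyone wanting to poke at the same question, these are the stock commands I've been using to see what's actually loaded and responding (nothing here is pfSense-specific):

kldstat | grep -E 'ipfw|dummynet'   # shows dummynet.ko loaded without ipfw.ko on my box
ipfw pipe show                      # the dummynet objects still answer
ipfw sched show
ipfw queue show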
-
Here's one other specific point of confusion. Based on the output from ipfw sched show, it looks like dynamic queues aren't created when the scheduler is fq_codel even when the masks (dst-ip for download queues, src-ip for upload queues) are set to 0xffffffff (/32). Instead, no matter how many hosts on the LAN are active, the output from ipfw sched show only ever shows a single line per pipe with both Source IP/Port and Dest. IP/Port as 0.0.0.0/0, as shown in the attached screen shot. Perhaps I'm just not looking in the right place to see dynamically created queues, but I did notice that before changing the scheduler type to fq_codel, the default was WFQ2+, and with that scheduler and all other settings the same, the output from ipfw sched show showed multiple lines with real IPs and port numbers. Am I fundamentally misunderstanding something that explains this discrepancy?
-
Here's one other specific point of confusion. Based on the output from ipfw sched show, it looks like dynamic queues aren't created when the scheduler is fq_codel even when the masks (dst-ip for download queues, src-ip for upload queues) are set to 0xffffffff (/32). Instead, no matter how many hosts on the LAN are active, the output from ipfw sched show only ever shows a single line per pipe with both Source IP/Port and Dest. IP/Port as 0.0.0.0/0, as shown in the attached screen shot. Perhaps I'm just not looking in the right place to see dynamically created queues, but I did notice that before changing the scheduler type to fq_codel, the default was WFQ2+, and with that scheduler and all other settings the same, the output from ipfw sched show showed multiple lines with real IPs and port numbers. Am I fundamentally misunderstanding something that explains this discrepancy?
My understanding was that dynamic pipes were only created if masks were set on the queues, not the pipes. That's how I have my limiters set up, and things have been working well for the past 6+ months I have had them enabled.
-
My interpretation of information from the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z) is that masks set on pipes result in dynamic pipes and masks set on queues result in dynamic queues, the difference being that:
when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).
The takeaway is that if you have dynamic pipes, you won't get a cumulative bandwidth cap, but rather a bandwidth cap per flow. That would be useful if you don't want anyone on your network to be able to use more than a certain amount of bandwidth, but not useful if you want to prevent the total bandwidth usage of all users on your network from exceeding your ISP's bandwidth caps.
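To make the contrast concrete in ipfw syntax (placeholder numbers, and this is my reading of the man page rather than something I've verified end to end):

ipfw pipe 1 config bw 10Mbit/s mask dst-ip 0xffffffff   # dynamic pipes: every host gets its own 10 Mbit/s cap
ipfw pipe 2 config bw 150Mbit/s                         # static pipe: one shared 150 Mbit/s cap
ipfw queue 2 config pipe 2 mask dst-ip 0xffffffff       # dynamic queues: per-host queues sharing pipe 2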
What's confusing to me is that I believe I should have dynamic queues, but I'm not seeing any evidence of them (but admittedly may be looking in the wrong place by expecting to see that evidence in the output from ipfw sched show).
-
Has anyone else encountered an issue with fq_codel with 2.4.3? I haven't dug into it yet, but it appears to have stopped working for me…
-
I haven't run tests yet, but I'll try to over the weekend. From all outward appearances (ipfw sched show, etc.) it seemed the same though. That said, I was never completely satisfied I had it configured correctly. I don't know any way of getting meaningful visibility into what it's actually doing, and I never saw evidence of dynamic queuing in the output from ipfw sched show, only a single line per limiter with "Source IP/Port" and "Dest. IP/Port" both 0.0.0.0/0, even though I have queues with masks configured under my limiters. But in any case, my bufferbloat did seem to be under control based on the dslreports test.
Are you judging that your setup isn't working by the results of that same bufferbloat test, or some other metrics from pfSense itself?
-
I run latency monitoring through pfSense (i.e. dpinger from lan through firewall to my ISP's local hub). With 2.4.3, I am now seeing large spikes (both latency base and stddev) when hosts in the local network are actively downloading. Seeing poor results with dslreports speed test as well. Previously was A/A+, now D.
-
I checked my quality monitoring logs and haven't seen any latency spikes since I updated to 2.4.3 yesterday, but I also don't know how much the connection has been stressed during that period. Are you using the shellcmd method of persisting the application of fq_codel to your limiters? And have you sanity checked that it actually was applied by running ipfw sched show? If I think of anything else I'll let you know, and I'll get back with my test results once I'm able to run them.
-
Yes, using shellcmd, and the queues appear to still be set up correctly.
ipfw sched show:
00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 0 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 0 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
The firewall rules still appear to be assigning packets to queues correctly. If I halve the bandwidth in the limiter for upload or download, I do see a corresponding reduction in speed for the expected direction.
-
Upgraded to 2.4.3 yesterday evening and fq_codel still working as expected.
Did you edit any files when you set it up originally? Is there anything particularly unique about your setup, e.g. running fq_codel on VLANs, etc.? Also when you do run a speed test and then keep refreshing ipfw sched show during the process, do you see traffic being passed?
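If it helps, a simple way to watch it live from an SSH session while the speed test runs is just a shell loop (plain sh, nothing fancy):

while true; do ipfw sched show; sleep 1; done

You should see the packet/byte counters and the number of active buckets change while traffic is flowing.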
Hope this helps.
-
No VLANs, no file edits, using shell command:
ipfw sched 1 config pipe 1 type fq_codel; ipfw sched 2 config pipe 2 type fq_codel
Here are some ipfw sched show outputs captured during the download portion of a dslreports test using http:
00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0     2285  3421723 193 289500  10
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0      290    15423   0      0   0

00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0     3793  5686580  90 135000  15
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0      369    19572   0      0   0

00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0     5577  8331991 140 208833  22
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0       50     3092   0      0   0

00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0      427   634711  91 136500   0
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0      184     9635   0      0   0
Of course, if I use https for dslreports, the problem is not nearly as bad (as you would expect).
-
I'm not using fq_codel, but I am using fairq+codel.
I've noticed since updating to 2.4.3 that a speedtest gives me exactly 50Mb/s. This is interesting, because this is exactly the BW I have set on my LAN queue for web traffic.
Previously a speedtest would burst high up towards the 100Mb/s my connection is.
Downloading torrents etc. still seems to be fine; that is, multiple sessions seem to be unaffected. Also, fast.com checks report around 70-80Mb/s. But Speedtest is always almost exactly the 50Mb/s I've set for web traffic.
But certainly 2.4.3 seems to have changed something with regards to traffic shaping etc.
Apologies for taking the fq_codel thread slightly offtopic!
-
@muppet:
Yeah, AFAIK, limiters (dummynet) & fairq+codel (ALTQ) are very much separated in software. Though, your bug is still interesting. Maybe.
-
@dennypage:
The best way to check whether it's related to pfSense is to downgrade and re-test. You can also try lowering your bandwidth limits or playing with the KPTI option.
As for me, nothing has changed in terms of bufferbloat, but I have over-provisioned hardware.