Playing with fq_codel in 2.4
-
Thanks tman222. I think that makes sense. It definitely makes sense that fq_codel needs queues to work; I just got confused by the fact that the ipfw sched show command seems to indicate that fq_codel has been applied to the limiters themselves (i.e. without queues). It had me thinking that each limiter itself had one queue implicitly, but that you could create child queues if desired.
As to the masking, is the goal to have a new queue created for each host, while the cumulative bandwidth is still capped by the parent limiter? I'm still confused about why multiple queues would be desirable, and that seems to be the main - if not only - purpose of using masks. From the second (ietf.org) link you provided, this excerpt would seem to suggest that multiple queues are created implicitly:
The intention of FQ-CoDel's scheduler is to give each flow its own queue, hence the term "flow queueing". Rather than a perfect realisation of this, a hashing-based scheme is used, where flows are hashed into a number of buckets, each of which has its own queue.
That makes it sound like the "bucket size" parameter on the limiter itself dictates how many queues there are, and packets are placed into them when they are hashed
on the 5-tuple of source and destination IP addresses, source and destination port numbers, and protocol number
So it seems like there are implicit queues involved already, but there is also the capability to explicitly add more queues, and it's the need for and/or purpose of those that I'm struggling to understand.
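If it helps anyone following along, my reading is that the number of those internal flow queues is controlled by the fq_codel parameters themselves rather than by anything on the limiter. Written out long-hand as a shell command (these are just the dummynet defaults, shown only to illustrate where the knobs live, not values I'm recommending):

  # 'flows 1024' sets how many hash buckets (internal flow queues) fq_codel uses;
  # 'limit 10240' is the overall packet limit shared across all of them
  ipfw sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ecn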
I'm sure I'm just not grasping some (or multiple) concepts here and really appreciate your input.
-
You raise some good points. To be honest, I'm not 100% sure what would happen if you just create two limiters and then try to enable fq_codel. If fq_codel itself creates a queue for each flow, then this might be enough for things to work and you don't need the additional queue underneath each limiter. In my case, I actually use multiple queues under each limiter to help me manage bandwidth fairness across several VLANs by assigning weights to those queues.
I think you might need to experiment and report back your findings. Try creating just two limiters and then enabling fq_codel. If that doesn't yield the desired results, create two limiters with one queue under each and see if that changes the results.
Hope this helps.
-
I don't really have quantitative results yet, but I'm hoping to run some more tests tonight and report back. Basically, after reading over the traffic shaping section of the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z), I believe I've come to understand the guidance on using queues underneath pipes and on using masks on those queues. There's a lot of great information in that man page, but this excerpt is particularly relevant:
Thus, when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).
Using masks results in dynamic queues/pipes. So we do not want dynamic pipes (because we want an overriding bandwidth cap), but we do want dynamic queues. I believe that by setting a source-address mask of /32 (IPv4) or /128 (IPv6) on upload queues and a destination-address mask of /32 (IPv4) or /128 (IPv6) on download queues, each host on the LAN will get its own queue, but the aggregate bandwidth usage of all those queues will be constrained by their common parent pipe.
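To make that concrete, my understanding is that pfSense ends up generating something roughly like the following raw ipfw configuration from those limiter/queue settings (the bandwidth figures are just my own 150/20 connection, and I actually configure all of this through the GUI, so treat this purely as an illustration):

  # Download: one pipe capping total bandwidth, with a child queue whose
  # dst-ip mask spawns a dynamic queue per LAN host
  ipfw pipe 1 config bw 150Mbit/s
  ipfw queue 1 config pipe 1 mask dst-ip 0xffffffff

  # Upload: same idea, keyed on the source address instead
  ipfw pipe 2 config bw 20Mbit/s
  ipfw queue 2 config pipe 2 mask src-ip 0xffffffff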
I'll try to run some tests later tonight to get actual results to post. One thing I struggle with is that beyond running the dslreports speed/bufferbloat test (and running it on multiple hosts simultaneously to see whether bandwidth is being both capped and fairly shared), I don't know how to get visibility into what is actually taking place with respect to the dummynet config (e.g. is it actually dynamically creating queues?). Of course, if the network ends up behaving well that's all that ultimately matters, but I do like understanding things when I can ;)
-
While I've been poking around, I have managed to increase my confusion. From what I can tell, ipfw is the interface by which dummynet is configured (hence the ipfw commands set up with shellcmd to persist the application of fq_codel across reboots). But does the ipfw service need to be running in order for dummynet to work? On my system, kldstat shows dummynet.ko loaded but not ipfw.ko, and executing ipfw show results in the message "ipfw: retrieving config failed: Protocol not available". But ipfw queue show, ipfw sched show, and ipfw pipe show all work as expected. My presumption is that this is okay: pf is still the firewall subsystem in use, and it can somehow hand traffic to dummynet queues even though dummynet is part of the ipfw firewall subsystem; because ipfw is not performing firewall duties, it does not need to be (and should not be) loaded itself. Perhaps this ability of the pf firewall to assign traffic to dummynet queues is a pfSense-specific patch?
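For anyone who wants to poke at the same things on their own box, these are the commands I'm referring to:

  kldstat | grep -E 'ipfw|dummynet'   # on my system only dummynet.ko shows up
  ipfw show                           # fails with "Protocol not available" since ipfw.ko isn't loaded
  ipfw pipe show                      # these three still work (presumably because they talk to dummynet rather than ipfw proper)
  ipfw queue show
  ipfw sched show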
-
Here's one other specific point of confusion. Based on the output from ipfw sched show, it looks like dynamic queues aren't created when the scheduler is fq_codel even when the masks (dst-ip for download queues, src-ip for upload queues) are set to 0xffffffff (/32). Instead, no matter how many hosts on the LAN are active, the output from ipfw sched show only ever shows a single line per pipe with both Source IP/Port and Dest. IP/Port as 0.0.0.0/0, as shown in the attached screen shot. Perhaps I'm just not looking in the right place to see dynamically created queues, but I did notice that before changing the scheduler type to fq_codel, the default was WF2Q+, and with that scheduler and all other settings the same, the output from ipfw sched show showed multiple lines with real IPs and port numbers. Am I fundamentally misunderstanding something that explains this discrepancy?
-
Here's one other specific point of confusion. Based on the output from ipfw sched show, it looks like dynamic queues aren't created when the scheduler is fq_codel even when the masks (dst-ip for download queues, src-ip for upload queues) are set to 0xffffffff (/32). Instead, no matter how many hosts on the LAN are active, the output from ipfw sched show only ever shows a single line per pipe with both Source IP/Port and Dest. IP/Port as 0.0.0.0/0, as shown in the attached screen shot. Perhaps I'm just not looking in the right place to see dynamically created queues, but I did notice that before changing the scheduler type to fq_codel, the default was WF2Q+, and with that scheduler and all other settings the same, the output from ipfw sched show showed multiple lines with real IPs and port numbers. Am I fundamentally misunderstanding something that explains this discrepancy?
My understanding was that dynamic pipes were only created if masks were set on the queues, not the pipes. That's how I have my limiters set up, and things have been working well for the 6+ months I've had them enabled.
-
My interpretation of the information in the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z) is that masks set on pipes result in dynamic pipes and masks set on queues result in dynamic queues, the difference being that:
when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).
The takeaway is that if you have dynamic pipes, you won't get a cumulative bandwidth cap, but rather a bandwidth cap per flow. That would be useful if you don't want anyone on your network to be able to use more than a certain amount of bandwidth, but not useful if you want to prevent the total bandwidth usage of all users on your network from exceeding your ISP's bandwidth caps.
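Expressed as raw ipfw commands, my understanding of the difference would be roughly this (illustrative only, using my 20 Mbit/s upload cap as the example):

  # Mask on the PIPE -> dynamic pipes: every LAN host gets its own 20 Mbit/s cap,
  # so aggregate usage can greatly exceed 20 Mbit/s
  ipfw pipe 2 config bw 20Mbit/s mask src-ip 0xffffffff

  # Mask on the QUEUE -> dynamic queues: all hosts share a single 20 Mbit/s pipe,
  # with the bandwidth divided among the per-host queues
  ipfw pipe 2 config bw 20Mbit/s
  ipfw queue 2 config pipe 2 mask src-ip 0xffffffff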
What's confusing to me is that I believe I should have dynamic queues, but I'm not seeing any evidence of them (but admittedly may be looking in the wrong place by expecting to see that evidence in the output from ipfw sched show).
-
Has anyone else encountered an issue with fq_codel with 2.4.3? I haven't dug into it yet, but it appears to have stopped working for me…
-
I haven't run tests yet, but I'll try to over the weekend. From all outward appearances (ipfw sched show, etc.) it seemed the same though. That said, I was never completely satisfied that I had it configured correctly. I don't know any way of getting meaningful visibility into what it's actually doing, and I never saw evidence of dynamic queuing in the output from ipfw sched show, only a single line per limiter with both "Source IP/Port" and "Dest. IP/Port" as 0.0.0.0/0, even though I have queues with masks configured under my limiters. But in any case, my bufferbloat did seem to be under control based on the dslreports test.
Are you judging that your setup isn't working by the results of that same bufferbloat test, or some other metrics from pfSense itself?
-
Are you judging that your setup isn't working by the results of that same bufferbloat test, or some other metrics from pfSense itself?
I run latency monitoring through pfSense (i.e. dpinger from the LAN through the firewall to my ISP's local hub). With 2.4.3, I am now seeing large spikes (in both base latency and stddev) when hosts on the local network are actively downloading. I'm seeing poor results with the dslreports speed test as well: previously A/A+, now D.
-
I checked my quality monitoring logs and haven't seen any latency spikes since I updated to 2.4.3 yesterday, but I also don't know how much the connection has been stressed during that period. Are you using the shellcmd method of persisting the application of fq_codel to your limiters? And have you sanity checked that it actually was applied by running ipfw sched show? If I think of anything else I'll let you know, and I'll get back with my test results once I'm able to run them.
-
Are you using the shellcmd method of persisting the application of fq_codel to your limiters? And have you sanity checked that it actually was applied by running ipfw sched show?
Yes, using shellcmd, and the queues appear to still be set up correctly.
ipfw sched show:
00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 0 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 0 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
The firewall rules still appear to be assigning packets to queues correctly. If I halve the bandwidth in the limiter for upload or download, I do see a corresponding reduction in speed for the expected direction.
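To be clear about what I mean by halving the bandwidth: I change it in the limiter settings, but in raw ipfw terms it would amount to roughly the following (if you were doing it from the command line, you'd probably want to re-apply the fq_codel sched type afterwards in case reconfiguring the pipe resets it):

  ipfw pipe 1 config bw 75Mbit/s              # download limiter temporarily dropped from 150 to 75
  ipfw sched 1 config pipe 1 type fq_codel    # re-apply fq_codel just in case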
-
Upgraded to 2.4.3 yesterday evening and fq_codel still working as expected.
Did you edit any files when you set it up originally? Is there anything particularly unique about your setup, e.g. running fq_codel on VLANs, etc.? Also, when you run a speed test and keep refreshing ipfw sched show during the process, do you see traffic being passed?
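A quick loop from a shell makes that easy to watch while the test runs, e.g.:

  while :; do clear; ipfw sched show; sleep 1; done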
Hope this helps.
-
No VLANs, no file edits, using this shell command:
ipfw sched 1 config pipe 1 type fq_codel; ipfw sched 2 config pipe 2 type fq_codel
Here are some ipfw sched show outputs from during the download portion of a dslreports test using http:
00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0     2285  3421723 193 289500  10
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0      290    15423   0      0   0

00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0     3793  5686580  90 135000  15
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0      369    19572   0      0   0

00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0     5577  8331991 140 208833  22
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0       50     3092   0      0   0

00001: 150.000 Mbit/s    0 ms burst 0
q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip           0.0.0.0/0             0.0.0.0/0      427   634711  91 136500   0
00002:  20.000 Mbit/s    0 ms burst 0
q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 2
  0 ip           0.0.0.0/0             0.0.0.0/0      184     9635   0      0   0
Of course, if I use https for dslreports, the problem is not nearly as bad (as you would expect).
-
I'm not using fq_codel, but I am using fairq+codel.
I've noticed since updating to 2.4.3 that a speedtest gives me exactly 50Mb/s. This is interesting, because this is exactly the BW I have set on my LAN queue for web traffic.
Previously a speedtest would burst up towards the 100Mb/s my connection is rated at.
Downloading torrents etc. still seems to be fine, that is, multiple sessions seem to be unaffected. Also, fast.com checks report up around 70-80Mb/s. But Speedtest is always almost exactly the 50Mb/s I've set for web traffic.
But certainly 2.4.3 seems to have changed something with regards to traffic shaping etc.
Apologies for taking the fq_codel thread slightly offtopic!
-
@muppet:
I'm not using fq_codel, but I am using fairq+codel.
I've noticed since updating to 2.4.3 that a speedtest gives me exactly 50Mb/s. This is interesting, because this is exactly the BW I have set on my LAN queue for web traffic.
Previously a speedtest would burst up towards the 100Mb/s my connection is rated at.
Downloading torrents etc. still seems to be fine, that is, multiple sessions seem to be unaffected. Also, fast.com checks report up around 70-80Mb/s. But Speedtest is always almost exactly the 50Mb/s I've set for web traffic.
But certainly 2.4.3 seems to have changed something with regards to traffic shaping etc.
Apologies for taking the fq_codel thread slightly offtopic!
Yeah, AFAIK, limiters (dummynet) & fairq+codel (ALTQ) are very much separated in software. Though, your bug is still interesting. Maybe.
-
dennypage
The best way to check whether it's related to pfSense is to downgrade and re-test. You can also try lowering your bandwidth limits or playing with the KPTI option.
As for me, nothing changed in bufferbloat, but I have over-provisioned hardware.
-
Anybody know why dslreports doesn't show bufferbloat? It never gives me a score regardless of settings on that website.
-
Anybody know why dslreports doesn't show bufferbloat? It never gives me a score regardless of settings on that website.
It depends on where you are. If you live in Australia or NZ, it doesn't seem to work for us (it gives an error and doesn't report anything).
-
@w0w:
dennypage
The best way to check whether it's related to pfSense is to downgrade and re-test. You can also try lowering your bandwidth limits or playing with the KPTI option.
As for me, nothing changed in bufferbloat, but I have over-provisioned hardware.
KPTI doesn't affect it. Wouldn't expect it to, as limiters are kernel code. I've tested with half the available bandwidth and still encounter the bloat issue. I don't think I'm under-provisioned on hardware; I have a 4860 running a 150/20 connection.
I'm assuming that this is the result of a FreeBSD kernel change. I haven't downgraded and run a test to confirm yet, but that's about what I'm down to.
I take it that you haven't seen any change with a multi-channel HTTP test?