Playing with fq_codel in 2.4



  • @tman222

    if0 -> 10.5.5.1 -> PFSense WAN interface
    if2 -> WAN Connection (DHCP From ISP) -> Cable Modem

    pf0 -> 10.5.5.2

    On pfSense you would configure the WAN interface with a static IP in the 10.5.5.0 subnet, and set the gateway to the address of if0.

    It's probably easiest if you use a firewall to set up forwarding etc., and also have one interface. Firehol is great; here is a basic configuration which would work for your setup. It's just an example, so you'd have to adjust it for your needs. Please note that this does not firewall anything :)

    /etc/firehol/firehol.conf:

    version 6
    
    # LAN subnets.
    lan_ips="10.5.5.0/24 WHATEVER_SUBNET_FOR_YOUR_PFSENSE_LAN/24"
    
    interface4 if0 lan src "${lan_ips}"
        server  all             accept
        client  all             accept
        policy                  accept
    
    
    interface4 if1 wan src not "${lan_ips} ${UNROUTABLE_IPS}"
        protection strong 100/sec 10
        server all              accept
        client all              accept
    
    
    router4 lan2wan inface if0 outface if1
        server  all             accept
        client  all             accept
        route   all             accept
        masquerade
    

    Also, you'd need to add a route back to your pfSense WAN interface static IP (you can also do this from Firehol, but I prefer the network manager):

    /etc/network/if-up.d/custom-routes:

    #!/bin/sh
    route add -net 192.168.0.0/24 gw 10.5.5.2 dev if0
    

    Hope this helps to clarify things.



  • @gsakes said in Playing with fq_codel in 2.4:

    @tman222 In my case, I'd prefer to use PFSense for firewalling and shaping, but my testing showed that Linux/Cake performs better than PfSense/fq_codel, albeit not by much, maybe 10-15% depending on the load.

    As far as the architecture is concerned you don't need to run a firewall on the linux host, you can simply configure it as a router; you'd need two network interfaces, where you'd configure cake using the 'layer.cake' script from the cake github repo on the egress interface.

    Taking from @mattund 's example, this is what my setup looks like:

                |   pfSense FW          | Router                    |                   |
                |   (no shaping)        | Ubuntu Server             |                   |
                |   (192.168.0.0/24)|   | (10.18.9.0/24)            |                   |
       LAN -->  |                       | eth1 --> (cake) --> eth2  | --> Cable Modem   | --> WAN
                |                       |                           |                   |
    

    While this is a cool idea, you lose a cake feature by not doing NAT on it: per-host FQ. None of our testing here to date has established the coolness of this feature; you need two or more source IP addresses to test from to see it work.

    (yep, flent can do this test too. Not gonna describe how here).

    Anyway, if you go this route, use "flows" instead of triple-isolate on cake, and nonat, save some cpu. But I thought we were dealing with a nat bug in pfsense in the first place?
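    A minimal sketch of that suggestion, assuming cake sits on the egress interface (eth2 in the diagram above). The 35mbit rate is a made-up placeholder, and the helper name is hypothetical. The function only prints the tc command so it can be reviewed before being run as root:

```shell
# Print (do not run) a tc command that switches cake to plain per-flow
# fairness and turns off NAT lookups, as suggested above.
# Arguments: $1 = egress interface, $2 = shaped rate (e.g. 35mbit).
# Both values are placeholders, not taken from the thread.
cake_flows_cmd() {
    printf 'tc qdisc replace dev %s root cake bandwidth %s flows nonat\n' "$1" "$2"
}

cake_flows_cmd eth2 35mbit
```

    If the Linux box does the NAT itself, the keywords would presumably go back to triple-isolate and nat.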



  • @gsakes You can even get rid of any need to have the cake box have ips. Just create a bump in the wire:

    https://apenwarr.ca/log/?m=201808



  • Thanks @dtaht - I have seen that link you shared before. Now I'm curious how that "bump" should be spec'd hardware-wise to shape at gigabit speeds. Does anyone have any thoughts on that? Also, does OpenWRT run on a normal Linux box? If not, is it possible to duplicate the functionality on just a regular Linux install?



  • @dtaht Thanks for the tips Dave, as always much appreciated:) I'm decommissioning my PFSense box for now. I've been using PFSense since 2010, and I don't think there's anything better, but I'll put it on ice until fq_codel matures and/or Cake is implemented. I'm slowly building out this Ubuntu box to be my firewall, using Firehol/Fireqos, netdata and PiHole.

    So yes - I will be doing NAT on the box:)



  • @dtaht said in Playing with fq_codel in 2.4:

    triple-isolate

    Yep, it's the 'per-host-fq' that is a real big factor, compared to fq_codel - frankly the biggest reason for me switching over:)

    BTW - Anyone wanting to learn about fq_codel, Cake and the design of both should read this:

    Piece of CAKE: A Comprehensive Queue
    Management Solution for Home Gateways



  • So I read this paper Dummynet AQM v0.1 – CoDel and FQ-CoDel for FreeBSD’s ipfw/dummynet framework

    The paper is written by the folks that implemented Codel and FQ-CoDel into FreeBSD ipfw/dummynet. I know @dtaht knows this because he reviewed the source and there is correspondence between them and him from back in the day. I'm just catching up - thanks for your patience.

    Looking at the examples in the paper, I'm wondering why the Codel AQM is selected in the pfSense WebUI in the August 2018 hangout? Per the FQ-CoDel examples in the paper above, it does not seem appropriate, and removing Codel as the AQM from the pipe and queue removes the "flowset busy" error @mattund mentioned 4 months ago. @dtaht this is why I was stating codel+fq-codel: when I first learned about FQ-CoDel being added to pfSense 2.4.4, it was in the hangout video, which instructs you to choose Codel as the AQM.

    Concerning buckets and CPU utilization, I played with net.inet.ip.dummynet.hash_size, which is the closest thing I could find to what you were explaining. pfSense defaults to 256, and I doubled the value on each flent rrul test up to 16384. I had to use sysctl -w net.inet.ip.dummynet.hash_size=$value on the fly in the console because /etc/inc/shaper.inc overwrites the setting to 256 any time you make a change to the limiters. I did not find setting this above 256 to provide real value.
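    For anyone wanting to repeat that series, here is a rough sketch of the doubling loop. The helper name is hypothetical, and the loop only prints each sysctl command (a dry run) rather than changing the running system; a flent rrul run would go between steps.

```shell
# Print the hash_size values for the test series: start at the
# pfSense default and double until the upper bound is reached.
# Arguments: $1 = starting value, $2 = largest value to try.
hash_sizes() {
    size=$1
    max=$2
    while [ "$size" -le "$max" ]; do
        echo "$size"
        size=$((size * 2))
    done
}

# Dry run: show the sysctl command for each step instead of running it.
for value in $(hash_sizes 256 16384); do
    echo "sysctl -w net.inet.ip.dummynet.hash_size=$value"
done
```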

    So, unfortunately I haven't made much progress...



  • @xraisen
    I found using CODEL for QMA & FQ_PIE for Scheduler, along with CODEL for queue QMA does NOT error with “config_aqm Unable to configure flowset, flowset busy!” Hope this is helpful to those reading this thread late.



  • @markn6262 Thanks for the suggestion, but it doesn't work on my end. codel/tail drop+fq_codel do wonders even though it nags “config_aqm Unable to configure flowset, flowset busy!”



  • Thanks for the cake + OpenWRT in bridge mode suggestion. For my 150mbps+ speeds, an ipq806x-based or mvebu (Cortex-A9) device was suggested to me on IRC. Can't wait for it to arrive next week.



  • yes the ipx8xx and a15 gear is good to a couple hundred mbit.

    For a gbit the lowest end x86 I recommend is the apu2 or an i3.

    Is there any way to push harder on the pfsense nat bug?



  • flent has now been packaged up and made available for freebsd.

    https://github.com/tohojo/flent/commit/c928c03a301258c26c7d045c74ecce6dfeaa3d5a



  • @xraisen You're right, FQ_PIE initially didn't exhibit the error but later did in some cases. Your recommendation appears more solid, so I'm now using it as well. Thanks.



  • @uptownvagrant said in Playing with fq_codel in 2.4:

    Looking at the examples in the paper, I'm wondering why the Codel AQM is selected in the pfSense WebUI in the August 2018 hangout? Per the FQ-CoDel examples in the paper above, it does not seem appropriate, and removing Codel as the AQM from the pipe and queue removes the "flowset busy" error @mattund mentioned 4 months ago. @dtaht this is why I was stating codel+fq-codel: when I first learned about FQ-CoDel being added to pfSense 2.4.4, it was in the hangout video, which instructs you to choose Codel as the AQM.

    I have wondered the same as well (if you look up a few posts I shared some thoughts on this based on my current understanding of Dummynet and Limiters). In most situations, the scheduler chosen is just that - a scheduler only. In that case controlling the traffic flowing to the scheduler in the queue(s) makes sense to me. However, fq_codel combines scheduling and AQM into one, so having Codel on the input queue(s) seems a bit redundant to me. Now having said that, I currently have both enabled and it provides the best performance in my case. I'm still trying to figure out exactly why, but it might be because I'm trying to push a lot of packets from a 10Gbit LAN link into a 1Gbit WAN link and the additional AQM helps keep things orderly.

    I would be very interested to see some comparisons of Codel + fq_codel vs. just fq_codel as I do wonder at which point it actually starts to make a difference vs. just using additional processing without any real benefit.



  • Just wanted to report that my ipq806x based router tp-link archer c2600 could only do about 85-119 mbps download with qos on but was able to max out my speed with qos off. Guess I'll be returning the router.



  • @dtaht said in Playing with fq_codel in 2.4:

    flent has now been packaged up and made available for freebsd.

    https://github.com/tohojo/flent/commit/c928c03a301258c26c7d045c74ecce6dfeaa3d5a

    This has not gone unnoticed, it just took me a while to take a look!
    In order to use the flent pkg you most likely have to be on the latest pkg train (the default is often the quarterly branch). For the FreeNAS users:

    Create the file /usr/local/etc/pkg/repos/FreeBSD.conf as noted but override the URL instead of disabling the repository:

    FreeBSD: {
      url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest"
    }
    

    then in shell

    pkg update
    pkg install flent
    # first test
    flent rrul -p all_scaled -l 60 -H flent-london.bufferbloat.net -t no_shaper -o RRUL_no_shaper.png
    

    cheers!



  • @mattund wouldn't there be a security issue with wan facing bridge interfaces and lan facing management interface on the same machine?



  • @strangegopher

    Eh, it's an unaddressed layer 2 bridge and just passes the traffic straight through. It won't route to the internal net from the external side unless it's through pfSense. On the internal side, I guess, but this is all home usage. I think it's ultimately up to the implementer to decide how they best want to manage it, my choice to use that management interface doesn't impact the functionality of what it accomplishes (although, like dtaht said, you lose per-host FQ if you neglect the NAT to pfSense like I am, but I am OK with that)

    I do recommend disabling IPv6 on the machine's "default" interfaces (everything but eth0) though: net.ipv6.conf.default.disable_ipv6=1, so it will only forward Layer 2 traffic. Also double-check that ip route and ip -6 route are all via eth0. Let me know if there's something I'm missing.
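    That last check can be scripted. The following is only a sketch with a hypothetical helper name: it reads route listings on stdin and prints any route that does not leave through the management interface. The sample input is made up; on the real box you would pipe ip route and ip -6 route into it.

```shell
# Read `ip route` / `ip -6 route` output on stdin and print every route
# that does not go out through the given interface ($1).
# No output means all routes use that interface.
routes_not_via() {
    awk -v ifname="$1" '
        NF == 0 { next }    # skip blank lines
        {
            ok = 0
            for (i = 1; i < NF; i++)
                if ($i == "dev" && $(i + 1) == ifname) ok = 1
            if (!ok) print
        }'
}

# Made-up sample output for illustration only.
printf '%s\n' \
    'default via 192.168.1.1 dev eth0' \
    '192.168.1.0/24 dev eth0 proto kernel scope link' \
    'fe80::/64 dev eth1 proto kernel metric 256' | routes_not_via eth0
```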



  • @mattund I see, just checking if it was an issue.



  • Hello everyone,

    I am the developer of Dummynet AQM (CoDel/PIE/FQ-CoDel and FQ-PIE). I have read part of this very long and interesting thread, and I can see some of you have the "config_aqm Unable to configure flowset, flowset busy!" error.
    That error tells you that you are trying to reconfigure a flowset (pipe or queue) while actual traffic is using that flowset (it has an active scheduler). I prevent reconfiguring an active flowset because I had some difficulties freeing/reallocating individual AQM (CoDel and PIE) memory space (as well as the timeout function used by PIE, which can cause a kernel panic). The issue can definitely be solved with some work, but I haven't had (and still don't have) enough time to fix it.
    However, there is an easy workaround to avoid this error. If you make sure that no traffic passes through the pipe/queue that you want to reconfigure, then you can reconfigure the pipe/queue without problems. To achieve that, you can use the "skipto" action (in ipfw) to skip the rules that include the pipes/queues you would like to configure. Here is an example:

    00010  70451180  71665155748 skipto 65534 ip from any to any
    00100    188029      9852488 pipe 1 ip from 172.16.10.0/24 to 172.16.11.0/24 out
    65534 165525594 168354591831 allow ip from any to any
    65535         0            0 deny ip from any to any
    

    Now, pipe 1 can be reconfigured without problems.

    ipfw pipe 1 config bw 10mbit/s codel
    

    Please note that sometimes you should wait for a short time (based on net.inet.ip.dummynet.expire sysctl) to allow queues/schedulers to drain before reconfiguring the pipe/queue.
    For scheduler/AQM cases (i.e. fq_codel and fq_pie), you have to skip the queues instead of pipes.
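    Put together, the workaround could be scripted roughly as follows. This is a sketch, not a tested procedure: the helper name is hypothetical, the 5-second drain wait is an assumed value, and the function only prints the commands so they can be checked against your own ruleset first.

```shell
# Print the command sequence for reconfiguring a busy pipe:
#   1. insert a low-numbered skipto rule so traffic bypasses the pipe rule
#   2. wait for queues/schedulers to drain
#   3. reconfigure the pipe
#   4. remove the skipto rule again
# Arguments: $1 = skipto rule number, $2 = rule number to jump to,
#            $3 = pipe number, $4 = new pipe settings,
#            $5 = drain wait in seconds (an assumed value).
reconfigure_pipe_plan() {
    echo "ipfw add $1 skipto $2 ip from any to any"
    echo "sleep $5"
    echo "ipfw pipe $3 config $4"
    echo "ipfw delete $1"
}

reconfigure_pipe_plan 10 65534 1 "bw 10mbit/s codel" 5
```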

    I hope this workaround works for you. Please let me know if you still have issues and I will try to do my best to provide solutions.

    Regards,
    Rasool Al-Saadi



  • Additionally, I cannot understand the purpose of using Codel + fq_codel.
    I want to clarify some dummynet/AQM internals to explain what Codel + fq_codel means (all figures are from [http://caia.swin.edu.au/reports/160708A/CAIA-TR-160708A.pdf]).

    CoDel and PIE:
    In a simple setup, when a new dummynet pipe is configured, a queue (Droptail queue), two schedulers (FIFO and WF2Q+) and a link (traffic shaper and delay emulation) are created. The queue is connected to FIFO scheduler and the scheduler is connected to a link. If we reconfigure the pipe, dummynet will configure the queue and the link.
    0_1541641053361_codel.jpg
    For example, the following command will set traffic shaping to 10mbit/s for the link and set Codel AQM (instead of droptail) to be used with the queue. Note that CoDel and PIE can be used to manage the queue (not the pipe).

    ipfw pipe 1 config bw 10mbit/s codel
    

    fq_codel and fq_pie:
    fq_codel and fq_pie algorithms are implemented as dummynet schedulers and they have internal sub-queues as shown in the figure below.
    0_1541641603923_fq_codel.jpg
    We need a queue, a scheduler and a link to configure fq_codel/fq_pie. The queue is not "really" used to buffer packets but just to add it to the ipfw rules. All packets are stored in fq_codel/fq_pie sub-queues directly. The scheduler is the full fq_codel or fq_pie implementation (sub-queue buffers, DRR and codel/pie). The link is a normal dummynet link (traffic shaper).
    We have to configure a pipe to create a link since we cannot configure a link alone (according to my knowledge).
    Example:

    ipfw pipe 1 config  bw 800Mb
    ipfw sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 1024 flows 1024 ecn
    ipfw queue 1 config pipe 1 sched 1
    

    Then we add queue 1 to ipfw rules (in my case)

     ipfw add 100 queue 1 ip from 172.16.10.0/24 to 172.16.11.0/24 out
    

    Now, Codel + fq_codel is basically just fq_codel if the firewall is configured using the method above.

    Finally, I saw some configurations that use a queue with the "mask" option and Codel. That is useful if you want to try different scheduling algorithms with codel/pie and flow separation to produce something similar to fq_codel. For example, you can use that method to produce WF2Q+_codel, qfq_pie, qfq_codel, etc. This method produces results similar to fq_codel in general, but not exactly the same (no priority for new flows, for example).

    Hopefully, the explanation above was clear and useful.

    Regards,
    Rasool



  • @Rasool can this be done in current implementation in the UI? or does @mattund need to get involved.



  • @strangegopher Unfortunately, I cannot help with regard to the UI. To be honest, I haven't used pfsense before and I use only ipfw/dummynet CLI to configure my router/firewall.



  • @Rasool I was able to get it working by first creating a random limiter, setting that limiter in the firewall pipe rule, and then creating this shellcmd and rebooting.

    0_1541684169624_Annotation 2018-11-08 053512.jpg

    edit: actually setting Queue to droptail in settings does the same thing. ignore my last comment.



  • @strangegopher Excellent work!
    I installed pfSense and was able to setup fq_codel correctly (without CoDel) using just the WebUI. Here are the steps:

    1- Create "out" limiter

    • Tick Enable
    • Name: pipe_out
    • Set the bandwidth
    • Queue Management Algorithm: Tail Drop
    • Scheduler: FQ_CODEL

    2- Add new Queue

    • Tick "Enable"
    • Name: queue_out
    • Queue Management Algorithm: Tail Drop
    • Save

    3- Create "in" limiter

    • Tick Enable
    • Name: pipe_in
    • Set the bandwidth
    • Queue Management Algorithm: Tail Drop
    • Scheduler: FQ_CODEL

    4- Add new Queue

    • Tick "Enable"
    • Name: queue_in
    • Queue Management Algorithm: Tail Drop
    • Save

    5- Add limiter in firewall rule

    • Configure floating rule (as normal)
    • In / Out pipe: queue_in / queue_out

    I believe these steps prevent the "config_aqm Unable to configure flowset, flowset busy!" error, with no need to reboot pfSense.

    Could you please test the above setup?
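    For reference, the WebUI steps above should end up as a pipe/sched/queue triple per direction in the generated limiter config. The sketch below just prints that shape. The helper name is hypothetical, and the fq_codel parameters are the values shown in the limiter configs posted further down the thread, not something the WebUI steps themselves specify.

```shell
# Print the dummynet config lines for one limiter with a child queue,
# using Tail Drop on both the pipe and the queue and FQ_CODEL as the
# scheduler. Arguments: $1 = limiter number, $2 = bandwidth (e.g. 90Mb).
limiter_config() {
    echo "pipe $1 config bw $2 droptail"
    echo "sched $1 config pipe $1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn"
    echo "queue $1 config pipe $1 droptail"
}

limiter_config 1 90Mb
limiter_config 2 90Mb
```

    The floating rule then points at the two queues, not at the pipes.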



  • @rasool Yes!!! That is exactly what I did and I no longer see those errors even when my bandwidth is being used.



  • @rasool

    This is what I found too. Big thanks to you for implementing these schedulers into ipfw/dummynet!

    @uptownvagrant said in Playing with fq_codel in 2.4:

    So I read this paper Dummynet AQM v0.1 – CoDel and FQ-CoDel for FreeBSD’s ipfw/dummynet framework

    The paper is written by the folks that implemented Codel and FQ-CoDel into FreeBSD ipfw/dummynet. I know @dtaht knows this because he reviewed the source and there is correspondence between them and him from back in the day. I'm just catching up - thanks for your patience.

    Looking at the examples in the paper, I'm wondering why the Codel AQM is selected in the pfSense WebUI in the August 2018 hangout? Per the FQ-CoDel examples in the paper above, it does not seem appropriate, and removing Codel as the AQM from the pipe and queue removes the "flowset busy" error @mattund mentioned 4 months ago. @dtaht this is why I was stating codel+fq-codel: when I first learned about FQ-CoDel being added to pfSense 2.4.4, it was in the hangout video, which instructs you to choose Codel as the AQM.

    Concerning buckets and CPU utilization, I played with net.inet.ip.dummynet.hash_size, which is the closest thing I could find to what you were explaining. pfSense defaults to 256, and I doubled the value on each flent rrul test up to 16384. I had to use sysctl -w net.inet.ip.dummynet.hash_size=$value on the fly in the console because /etc/inc/shaper.inc overwrites the setting to 256 any time you make a change to the limiters. I did not find setting this above 256 to provide real value.

    So, unfortunately I haven't made much progress...



  • Hi @rasool - Welcome! It's great to see you joining the discussion.

    I'm very glad you were able to confirm the proper setup. I had postulated something similar a few posts back up in this thread based on what I had read about Limiters/Dummynet:

    https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/635

    Here's one interesting thing though (and a question that I'm still trying to answer): In my case I have a 10Gbit LAN feeding into a 1Gbit WAN link. If I enable Codel on the limiter's child queues I see slightly better performance than just enabling fq_codel and leaving AQM alone on the queues (i.e. just going with the default Tail Drop).

    Now, is it the case that with Codel enabled on the child queues, some of the packets being received would already be dropped before reaching the fq_codel scheduler and its own set of queues? In other words there are two stages of AQM occurring? Or is Codel AQM on the child queues ignored when fq_codel is enabled?

    I definitely agree that it is not necessary to have Codel enabled on child queues since fq_codel handles the AQM with its own set of queues. However, could adding the Codel to child queues help when dealing with high speed (high pps) networks with a slower uplink? Or am I just creating additional CPU overhead and no benefit?

    Thanks in advance for the help and clarification, I really appreciate it.



  • @uptownvagrant
    Sorry, I missed that post :( Anyway, if you use the setup above without dynamic flows (without using the mask option), I don't think changing net.inet.ip.dummynet.hash_size can improve CPU utilization when FQ-CoDel is used, since the Dummynet fq_codel implementation creates and manages its own sub-queues.



  • @tman222 said in Playing with fq_codel in 2.4:

    I'm very glad you were able to confirm the proper setup. I had postulated something similar a few posts back up in this thread based on what I had read about Limiters/Dummynet:
    https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/635

    You provided very nice information in that post. Actually, I tried your RR+Codel setup a long time ago to confirm a specific behaviour I saw with our fq_codel implementation. Regarding the differences from fq_codel (in addition to the quantum value you mentioned, and regardless of the internal implementation differences), FQ-CoDel has two groups of sub-queues: one group for new flows (and very short-lived ones, like a DNS query) and another for old flows. The new-flows group has higher priority than the old-flows group. This prioritisation improves network responsiveness (RFC 8290 explains that very clearly).

    @tman222 said in Playing with fq_codel in 2.4:

    Here's one interesting thing though (and a question that I'm still trying to answer): In my case I have a 10Gbit LAN feeding into a 1Gbit WAN link. If I enable Codel on the limiter's child queues I see slightly better performance than just enabling fq_codel and leaving AQM alone on the queues (i.e. just going with the default Tail Drop).

    That's so weird. Choosing CoDel or Tail Drop with fq_codel should not change the performance at all. In fact, the Tail Drop or CoDel enqueue/dequeue code is not executed at all when the fq_codel scheduler is configured (fq_codel has separate enqueue/dequeue functions).

    @tman222 said in Playing with fq_codel in 2.4:

    Now, is it the case that with Codel enabled on the child queues, some of the packets being received would already be dropped before reaching the fq_codel scheduler and its own set of queues? In other words there are two stages of AQM occurring? Or is Codel AQM on the child queues ignored when fq_codel is enabled?

    The answer is that Codel AQM on the child queues is ignored when fq_codel is enabled. So, CoDel will not drop any packets in this setup, and the buffer space of the child queue will not be used to store packets at all.

    @tman222 said in Playing with fq_codel in 2.4:

    I definitely agree that it is not necessary to have Codel enabled on child queues since fq_codel handles the AQM with its own set of queues. However, could adding the Codel to child queues help when dealing with high speed (high pps) networks with a slower uplink? Or am I just creating additional CPU overhead and no benefit?

    As you mentioned, fq_codel uses its own queues. These queues are accessible only by fq_codel instances. As mentioned in ipfw(8) man page, you can configure the number of these queues using fq_codel flows parameter. I don't think you are creating additional CPU overhead by enabling CoDel.

    Sorry if you mentioned that in an earlier post (so many posts), but I am curious about how you measure your firewall performance when testing fq_codel. Do you use a local testing environment? Additionally, have you tried to use just CoDel, PIE or Tail Drop with traffic shaping (limiter) to see how many pps can be achieved? That is important to see which part causes a reduction in performance.



  • Hi @rasool - thanks for getting back to me. After reading your response, I decided to start over once more and followed your steps - i.e. I created child queues but did not enable Codel this time. After doing some initial testing, I'm happy to report that performance was similar to what I had before, so indeed it seems that Codel on the child queues is just being ignored. The only thing that would speak against that is that I found myself increasing the queue size (from the default 50) when I originally had Codel enabled on the child queues, and this did improve performance.

    I also had one other quick question for you: Why are the child queues necessary in the first place if fq_codel has its own set of queues? Put another way, why can't the Limiter be applied directly to the firewall rules instead of the child queues underneath the limiter?

    Thanks again for all your help, I really appreciate it.



  • @tman222 Thanks for posting your question. Was wondering the same but hadn't got around to asking.



  • @tman222
    Thank you for testing and confirming that. Enabling Codel for child queues and increasing the queue size should not improve the performance of fq_codel either, similar to the pipe/limiter case, because fq_codel bypasses the dummynet queues.

    @tman222 said in Playing with fq_codel in 2.4:

    I also had one other quick question for you: Why are the child queues necessary in the first place if fq_codel has its own set of queues? Put another way, why can't the Limiter be applied directly to the firewall rules instead of the child queues underneath the limiter?

    @tman222 and @markn6262

    A simple answer is that the current pfSense WebUI requires you to create "child" queues. You could use fq_codel with just limiters if the WebUI configured the pipe to use the created scheduler. Here is an example of how to use fq_codel without creating a queue (I haven't tested that though).

    ipfw pipe 1 config bw 800Mb sched 1
    ipfw sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 1024 flows 1024 ecn
    

    Basically, just sched <sched number> should be added to pipe configuration.
    Please note that creating a pipe will also create a queue internally because dummynet needs a flowset to interact with packets. There is no way to send packets directly to the fq_codel scheduler.



  • @Rasool, @tman222 , and @markn6262

    I mentioned this in a previous post but what I have found is that with lower configured mbit pipes, latency is higher under load when a child queue is not used. At 90mbit, latency is higher when not using a child queue and just using the limiter pipe - at 800mbit everything is basically the same with or without a configured child queue. The only changes I make between tests are to increase/decrease the pipe bandwidth and/or remove/add child queues and change floating rules to reflect in/out limiter/queue - the firewall is rebooted between tests to make sure all states are flushed and rules.limiter is reloaded properly. I'm not sure if the aforementioned latency behavior is specific to pfSense 2.4.4 or if this is also the case in vanilla FreeBSD 11.2. While the limiter does technically work with just a pipe and fq-codel, it appears to currently be more performant using a child queue.

    0_1542149748217_90Mb.jpg

    0_1542149759149_800Mb.jpg

    Edit 1 - Adding tested limiter configs

    No Child Queue - 90mbit

    pipe 1 config  bw 90Mb droptail
    sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    
    pipe 2 config  bw 90Mb droptail
    sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    

    Child Queue - 90mbit

    pipe 1 config  bw 90Mb droptail
    sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    queue 1 config pipe 1 droptail
     
    
    pipe 2 config  bw 90Mb droptail
    sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    queue 2 config pipe 2 droptail
    

    No Child Queue - 800mbit

    pipe 1 config  bw 800Mb droptail
    sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    
    pipe 2 config  bw 800Mb droptail
    sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    

    Child Queue - 800mbit

    pipe 1 config  bw 800Mb droptail
    sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    queue 1 config pipe 1 droptail
    
    pipe 2 config  bw 800Mb droptail
    sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    queue 2 config pipe 2 droptail
    


  • The other thing of note is that pfSense appears to use ipfw to configure the pipes/sched/queue but traffic is sent to "dnpipe" using a patched version of pf and not via ipfw rules. So we're not exactly comparing apples to apples with @Rasool.



  • @uptownvagrant said in Playing with fq_codel in 2.4:

    At 90mbit, latency is higher when not using a child queue and just using the limiter pipe

    Could you please increase the pipe size for the No Child Queue - 90mbit experiment and rerun the same test? I feel like fq_codel was not working in that configuration because at 90mbit, with a 50-packet queue and DropTail, the maximum queueing delay is around 6.5ms. For 800mbit, the queueing delay is less than 1ms, so you should not see a large queueing delay even without fq_codel.
    I think the maximum queue size you can set is 100 by default. You can increase that limit using sysctl net.inet.ip.dummynet.pipe_slot_limit
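    The arithmetic behind those delay figures can be checked with a small helper. This is a sketch with a hypothetical function name that assumes full-size 1500-byte packets; under that assumption a full 50-slot queue takes roughly 6.7ms to drain at 90mbit and 0.75ms at 800mbit, in line with the estimates above.

```shell
# Worst-case drain time of a full tail-drop queue, in milliseconds:
#   slots * packet_bytes * 8 / link_rate
# Arguments: $1 = queue size in packets, $2 = link rate in Mbit/s,
#            $3 = packet size in bytes (defaults to an assumed 1500).
queue_delay_ms() {
    awk -v slots="$1" -v mbit="$2" -v bytes="${3:-1500}" \
        'BEGIN { printf "%.2f\n", slots * bytes * 8 / (mbit * 1000000) * 1000 }'
}

queue_delay_ms 50 90     # default 50-slot queue at 90 Mbit/s
queue_delay_ms 50 800    # same queue at 800 Mbit/s
```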



  • @rasool

    I reran the tests using a 90mbit pipe and changing the queue size to 15 slots (delay approx. 2ms) and 100 slots (delay approx. 13.5ms). As you suspected, it does appear that the FQ-CoDel scheduler is not being executed when just a pipe, without an associated queue, is used in pfSense.

    0_1542221109938_FQ-CoDel_Pipe_No_Queue.JPG

    And here it is with 1 child queue with a queue size of 1000, to show that FQ-CoDel is properly handling the queue and not the Dummynet directive.

    Confirming ipfw limiter config:

    pipe 1 config  bw 90Mb queue 1000 droptail
    sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    queue 1 config pipe 1 queue 1000 droptail
    
    pipe 2 config  bw 90Mb queue 1000 droptail
    sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
    queue 2 config pipe 2 queue 1000 droptail
    

    Confirming pf rules are using queues and root pipes:

    [2.4.4-RELEASE][admin@dev.localdomain]/root: pfctl -vvsr | grep "FQ-CoDel"
    @84(1540490406) match in on ix0 inet all label "USER_RULE: WAN in FQ-CoDel" dnqueue(2, 1)
    @85(1540490464) match out on ix0 inet all label "USER_RULE: WAN out FQ-CoDel" dnqueue(1, 2)
    [2.4.4-RELEASE][admin@dev.localdomain]/root:
    

    Confirming slot limit:

    [2.4.4-RELEASE][admin@dev.localdomain]/root: sysctl -n net.inet.ip.dummynet.pipe_slot_limit
    1000
    [2.4.4-RELEASE][admin@dev.localdomain]/root:
    

    0_1542224490733_FQ-CoDel_Pipe_1_Queue.JPG

    Edit 1: Adding flent.gz files.
    0_1542313536664_rrul-2018-11-14T081901.482271.C3558_pfSense2_4_4_90Mb_100qlen_FQ-CoDel_BBR_t010.flent.gz
    0_1542313547047_rrul-2018-11-14T082439.469459.C3558_pfSense2_4_4_90Mb_15qlen_FQ-CoDel_BBR_t011.flent.gz
    0_1542313567879_rrul-2018-11-14T113005.867344.C3558_pfSense2_4_4_90Mb_1q_1000qlen_FQ-CoDel_BBR_t012.flent.gz



  • @uptownvagrant
    Thank you for confirming that. So that means if the CoDel+FQ_CoDel limiter is selected directly (not the child queue) in floating rules, the traffic will be controlled by the CoDel algorithm.

    I can say, to avoid any possible problems when configuring fq_codel using current WebUI, the limiter child queue method should be used (with DropTail selected for both the limiter and child queue).
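    One way to check for this is to scan the pf rules for anything that still points at a limiter directly. The sketch below uses a hypothetical helper that filters pfctl -vvsr output read from stdin. The dnqueue(...) form is the one shown in the pfctl output earlier in the thread; the dnpipe(...) form is an assumption by analogy, and the sample lines are made up.

```shell
# Read `pfctl -vvsr` output on stdin and print the rules that reference
# a limiter pipe directly instead of a child queue.
# No output means every limiter rule goes through a queue.
rules_using_pipe_directly() {
    grep -F 'dnpipe(' || true
}

# Made-up sample rules for illustration only.
printf '%s\n' \
    '@84 match in on ix0 inet all label "WAN in" dnqueue(2, 1)' \
    '@85 match out on ix0 inet all label "WAN out" dnpipe(1, 2)' | rules_using_pipe_directly
```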

    Now we have to figure out which part(s) cause the performance issues. I think we have to compare the results (pps, CPU % utilisation, throughput) when using a limiter with DropTail+FIFO (limiter only) and DropTail+FQ_CoDel (using the child queue method).



  • @uptownvagrant

    @uptownvagrant said in Playing with fq_codel in 2.4:

    @rasool

    I reran the tests using a 90mbit pipe and changing the queue size to 15 slots (delay approx. 2ms) and 100 slots (delay approx. 13.5ms). As you suspected, it does appear that the FQ-CoDel scheduler is not being executed when just a pipe, without an associated queue, is used in pfSense.

    0_1542221109938_FQ-CoDel_Pipe_No_Queue.JPG

    I am really loving watching y'all go at this, trying different things, fiddling with params, etc - coming up with things I'd have never thought of!

    I have to admit that I'd really like the *.flent.gz files from tests like these. In particular, it's obvious to my eye that you are using BBR, due to the drop every 10 sec.

    This comparison plot, though, was awesome.

    In this work, you are exposing a BBR pathology, where 4 flows start at exactly the same time, all go into their PROBE_RTT mode all at the same time, and you can see things go wrong at T+35 on the non-fq_codel case where one flow actually grabs the right bandwidth and the other flows do not, then it gets a mis-estimate of the queue and comes back too strong 10sec later.

    Which doesn't happen in the fq_codel case. We get good ole sawtooths, and no pathologies.

    (though I'd love to be getting drop statistics and other stuff, I imagine packet loss is pretty high as BBR is kind of confused). I gotta go repeat this style test on my own testbed!

    However, well, I do tend to stress rrul is a stress test, and applications really shouldn't be opening up 4 flows at the same time to the same place. BBR would hopefully behave much better were the starts staggered by 200ms.

    0_1542224490733_FQ-CoDel_Pipe_1_Queue.JPG

    Still, joy! no pathologies, low latency. Wish I had the .flent.gz file.... :)

    And a huge welcome to Rasool. I'm just amazed he did such a great job with fq_codel working from the RFC alone.



  • I have to note that things seem to be looking very good here.

    But, we had a huge problem with nat and UDP and a bug report filed on that a few weeks back. Is that fixed by getting the ipfw pipe right and being able to swap stuff out in flight?

