Playing with fq_codel in 2.4
-
Well I upgraded to the 2.4.4 release today and thought I would share a few thoughts and tips on the upgrade process, since I had been using fq_codel in prior versions of pfSense already.
My initial setup had fq_codel enabled via the Shellcmd package, i.e. I enabled the scheduler using the ipfw command at boot (if you read back in this thread a little you'll see this configuration mentioned). Furthermore, I had two limiters set up (one for upload and one for download) with 3 queues underneath each (weighted queues for three different network subnets). I did not remove any of these settings before upgrading to 2.4.4, hoping that they would just carry over. Unfortunately they did not: after the upgrade to 2.4.4 and a reboot, I was left with only two unconfigured limiters and no queues at all. I was able to reconfigure the limiters, but for some reason was unable to add queues (they would simply not show up after I saved and the settings were applied). So I decided to start from scratch: I removed the limiters and any remnants of the old queues from my firewall rules, and also deleted the fq_codel setup command from Shellcmd. I then rebooted, and after that was able to configure everything just fine: up and down limiters along with 3 queues underneath each of them. I applied the queues back to the firewall rules on the three subnets, and everything is working great, like it did before the 2.4.4 upgrade. Also, as @TheNarc alluded to already, it's not necessary to use a floating rule on the WAN interface; applying queues to the LAN side works fine as well.
If you are in a similar situation to mine, with an existing fq_codel setup from 2.4.3 or prior that is not overly complicated, I would recommend removing it and reconfiguring everything via the GUI once the upgrade to 2.4.4 is complete.
-
I had a very similar experience to @tman222. One important thing to note is that if you were using the Shellcmd package in 2.4.3 for fq_codel, it is not sufficient to simply uninstall the package. If you uninstall the Shellcmd package without first deleting any shell commands that you installed with it, those shell commands will persist after the package's uninstallation. You must first delete all shell commands that you had installed with the Shellcmd package, and then uninstall the Shellcmd package itself.
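For anyone who wants to double-check that no stale entries were left behind, here is a minimal sketch that scans a copy of the configuration XML for leftover shell commands mentioning fq_codel. The element names `shellcmd` and `earlyshellcmd` are an assumption about how the commands are stored, and `leftover_shellcmds` is a hypothetical helper name, so treat this as an illustration rather than a supported tool.

```python
import xml.etree.ElementTree as ET


def leftover_shellcmds(config_xml, keyword="fq_codel"):
    """Return shell command entries in config_xml that mention keyword.

    Assumption: commands live in elements named 'shellcmd' or
    'earlyshellcmd'; adjust the tag names if your config differs.
    """
    root = ET.fromstring(config_xml)
    hits = []
    for elem in root.iter():
        if elem.tag in ("shellcmd", "earlyshellcmd"):
            text = (elem.text or "").strip()
            if keyword in text:
                hits.append(text)
    return hits
```

An empty list would suggest the old fq_codel command is gone; anything else is worth deleting by hand before uninstalling the package.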
-
@thenarc
Generally you don't need to uninstall it at all if you are still using it. But you are right anyway: first delete all unneeded entries.
-
I just tried all possible variants: playing with the Netgate-recommended configuration, messing with masks and rules, even deliberately misconfiguring things, and I still get my normal bandwidth and bufferbloat results. So I do think there is some configuration/package conflict that could be causing the issues users reported above, which is why we need more information about what else is configured on the firewall. Since some of the reported hardware is fully supported, I think the problem is limited to software.
-
I'm finding no matter what I do, I get the following errors quite often in my dmesg:
config_aqm Unable to configure flowset, flowset busy!
config_aqm Unable to configure flowset, flowset busy!
config_aqm Unable to configure flowset, flowset busy!
This is straight after a reboot without me editing/configuring anything. They only appear a few times, sometimes once, sometimes twice, or in the example above, three times.
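To compare how often the message appears across reboots, a small sketch like the following can count occurrences in saved dmesg text. The function name `count_flowset_busy` is hypothetical; the message string is copied from the output above.

```python
# Message text as it appears in the dmesg output quoted above.
BUSY_MESSAGE = "config_aqm Unable to configure flowset, flowset busy!"


def count_flowset_busy(dmesg_text):
    """Count how many times the 'flowset busy' message occurs in dmesg_text."""
    # str.count works whether the messages are on separate lines or run together.
    return dmesg_text.count(BUSY_MESSAGE)
```

Feeding it the saved output of `dmesg` after each reboot gives a number that is easier to compare than eyeballing the log.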
This is my /tmp/rules.limiter
pipe 1 config bw 97Mb codel target 5ms interval 100ms ecn
sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ecn
queue 1 config pipe 1 mask dst-ip6 /128 dst-ip 0xffffffff codel target 5ms interval 100ms ecn
pipe 2 config bw 19Mb droptail
sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
queue 2 config pipe 2 mask src-ip6 /128 src-ip 0xffffffff codel target 5ms interval 100ms noecn
I have the pipes configured on my default LAN out rule, because if I configure them on a floating rule, I get the routing loop problem with ICMP as posted above.
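The three-line pattern in that file (pipe, then sched, then queue) can be sketched as a small generator. This is only an illustration of the layout shown above: `limiter_rules` is a hypothetical helper, the fq_codel parameters are hard-coded to the values in the post, and it reproduces the upload (pipe 2) lines rather than every variant pfSense may write.

```python
def limiter_rules(num, bw_mbit, mask_dir, ecn=True, aqm="droptail"):
    """Build the pipe/sched/queue lines for one fq_codel limiter.

    mask_dir is 'src' (upload) or 'dst' (download), as in the file above.
    """
    ecn_kw = "ecn" if ecn else "noecn"
    codel = "target 5ms interval 100ms"  # values copied from the post
    return [
        # The pipe sets the bandwidth and the AQM keyword.
        f"pipe {num} config bw {bw_mbit}Mb {aqm}",
        # The scheduler attached to the pipe is what enables fq_codel.
        f"sched {num} config pipe {num} type fq_codel {codel} "
        f"quantum 1514 limit 10240 flows 1024 {ecn_kw}",
        # The child queue carries the per-host mask.
        f"queue {num} config pipe {num} mask {mask_dir}-ip6 /128 "
        f"{mask_dir}-ip 0xffffffff codel {codel} {ecn_kw}",
    ]
```

For example, `limiter_rules(2, 19, "src", ecn=False)` yields the three upload lines from the file above.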
-
@w0w said in Playing with fq_codel in 2.4:
I just tried all possible variants: playing with the Netgate-recommended configuration, messing with masks and rules, even deliberately misconfiguring things, and I still get my normal bandwidth and bufferbloat results. So I do think there is some configuration/package conflict that could be causing the issues users reported above, which is why we need more information about what else is configured on the firewall. Since some of the reported hardware is fully supported, I think the problem is limited to software.
I hear ya. Wish I had the answer. CodelQ shows A+ for bufferbloat. With fq_codel I get no better than C, and when I run speed tests my gateways show as out of service because they trip the loss threshold.
-
I just tried changing the roots to droptail and rebooting, but no difference:
pipe 1 config bw 97Mb droptail
sched 1 config pipe 1 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ecn
queue 1 config pipe 1 mask dst-ip6 /128 dst-ip 0xffffffff codel target 5ms interval 100ms ecn
pipe 2 config bw 19Mb droptail
sched 2 config pipe 2 type fq_codel target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 noecn
queue 2 config pipe 2 mask src-ip6 /128 src-ip 0xffffffff codel target 5ms interval 100ms noecn
<<logs>>
ng_pppoe[11]: no matching session
ng_pppoe[11]: no matching session
ng_pppoe[11]: no matching session
config_aqm Unable to configure flowset, flowset busy!
config_aqm Unable to configure flowset, flowset busy!
config_aqm Unable to configure flowset, flowset busy!
config_aqm Unable to configure flowset, flowset busy!
config_aqm Unable to configure flowset, flowset busy!
It seems to be working OK though:
[2.4.4-RELEASE][admin@trogdor.muppetz.com]/root: ipfw sched show
00001: 97.000 Mbit/s 0 ms burst 0
q65537 50 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail
 sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
   Children flowsets: 1
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip 0.0.0.0/0 0.0.0.0/0 43 26721 0 0 0
00002: 19.000 Mbit/s 0 ms burst 0
q65538 50 sl. 0 flows (1 buckets) sched 2 weight 0 lmax 0 pri 0 droptail
 sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
 FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 NoECN
   Children flowsets: 2
  0 ip 0.0.0.0/0 0.0.0.0/0 29 2695 0 0 0
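If you want to check this from a script rather than by eye, a rough sketch is to pull the `sched N type X` pairs out of the `ipfw sched show` text. The helper name `scheduler_types` is hypothetical, and the pattern is based only on the output format shown above, which may differ between versions.

```python
import re


def scheduler_types(sched_show_output):
    """Map scheduler number -> scheduler type from 'ipfw sched show' text."""
    # Matches e.g. 'sched 1 type FQ_CODEL' but not 'sched 1 weight 0'.
    pairs = re.findall(r"sched (\d+) type (\w+)", sched_show_output)
    return {int(number): sched_type for number, sched_type in pairs}
```

If both limiters map to `FQ_CODEL`, the scheduler took effect despite the `config_aqm` messages.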
-
@gsmornot
Which packages do you have installed? What is your WAN connection type? Do you have any other features configured on the traffic shaper side?
-
@muppet said in Playing with fq_codel in 2.4:
config_aqm Unable to configure flowset, flowset busy!
I've seen those messages too, both after rebooting this morning and before. I am not sure whether they impact anything.
-
@w0w said in Playing with fq_codel in 2.4:
@gsmornot
Which packages do you have installed? What is your WAN connection type? Do you have any other features configured on the traffic shaper side?
Avahi, pfblocker, OpenVPN export, and nut. WAN is gigabit fiber with cellular backup.
I have a much more powerful backup system. I will update it tomorrow and test.
-
@gsmornot said in Playing with fq_codel in 2.4:
pfblocker
I do think that pfblocker has issues with limiters. That has been reported before. Can you uninstall it and try again sometime?
-
@w0w said in Playing with fq_codel in 2.4:
@gsmornot said in Playing with fq_codel in 2.4:
pfblocker
I do think that pfblocker has issues with limiters. That has been reported before. Can you uninstall it and try again sometime?
I won’t on my main system because it is one of the main packages I use but I will on my backup. The other issue is like I said before, CodelQ fixes my bufferbloat perfectly, or at least it passes the dslreports test with an A or A+.
-
@w0w said in Playing with fq_codel in 2.4:
@muppet said in Playing with fq_codel in 2.4:
config_aqm Unable to configure flowset, flowset busy!
I've seen those messages too, both after rebooting this morning and before. I am not sure whether they impact anything.
Yeah, I agree, everything seems to be working fine (and fq_codel is showing up in ipfw sched show), but I thought it worth reporting as I didn't encounter such issues with 2.4.3-p1 when manually applying the following rules.limiter:
pipe 1 config bw 95Mb
sched 1 config pipe 1 type fq_codel
queue 1 config pipe 1 mask dst-ip6 /128 dst-ip 0xffffffff
pipe 2 config bw 18Mb
sched 2 config pipe 2 type fq_codel
queue 2 config pipe 2 mask src-ip6 /128 src-ip 0xffffffff
-
@tman222 thanks for the info mr tman...
I was just wondering one more thing... did you set up masks on the child queues (source for upload / destination for download)? The reason I ask is that the Netgate video does not set a mask.
-
@muppet
Your configuration seems to not set/configure an AQM (explicitly defined with the inline parameter droptail/codel/pie/red/gred). You're just setting up the necessary fq_codel part. I found this:
if (busy) {
	D("Unable to configure flowset, flowset busy!");
	err = EINVAL;
	break;
}
That's the config_aqm function in dummynet/limiters for FreeBSD. My theory right now is that, because my patch explicitly supplies one of the aforementioned AQM arguments, dummynet is interpreting that as "re-configure the AQM". Unfortunately, from what I know, dummynet has a limitation where if the queue is "busy" (and I don't understand the specifics of that), you cannot re-configure the AQM only. I don't think this affects the Scheduler option though, unless I am reading this wrong. Maybe someone can double-check this. So, in summary, if you see these lines it just means that your AQM option didn't save, and my guess is that most people would be leaving it at Drop Tail anyway.
EDIT: I'm continuing to dig into this more.
/*
 * Reconfigure AQM as the parameters can be changed.
 * We consider the flowset as busy if it has scheduler
 * instance(s).
 */
s = locate_scheduler(nfs->sched_nr);
config_aqm(fs, ep, s != NULL && s->siht != NULL);
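To make the control flow easier to follow, here is a toy Python model of the excerpt above. It is not the kernel code: the `Scheduler` and `Flowset` classes are simplified stand-ins, and treating "no AQM argument supplied" as a no-op is an assumption based on the theory in this post, not something confirmed from the source.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scheduler:
    # Stand-in for the kernel scheduler; siht is None until instances exist.
    siht: Optional[dict] = None


@dataclass
class Flowset:
    aqm: str = "droptail"


def config_aqm(fs, new_aqm, busy):
    """Toy model: refuse to reconfigure the AQM while the flowset is busy."""
    if new_aqm is None:
        return 0  # assumption: nothing requested, nothing to do
    if busy:
        # This is the branch that logs "flowset busy!" in the real code.
        return errno.EINVAL
    fs.aqm = new_aqm
    return 0


def reconfigure(fs, sched, new_aqm):
    # Mirrors the call site: busy means the scheduler has instance(s).
    busy = sched is not None and sched.siht is not None
    return config_aqm(fs, new_aqm, busy)
```

In this model, a scheduler whose `siht` is set makes every explicit AQM change fail with `EINVAL`, while a configuration that supplies no AQM keyword never reaches the busy check.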
s->siht != NULL maps to busy. This may mean that, if my patch were to de-configure the scheduler first and then run the current commands, it might not print this error?
-
@mattund said in Playing with fq_codel in 2.4:
@muppet
Your configuration seems to not set/configure an AQM (explicitly defined with the inline parameter droptail/codel/pie/red/gred). You're just setting up the necessary fq_codel part. I found this:
if (busy) {
	D("Unable to configure flowset, flowset busy!");
	err = EINVAL;
	break;
}
That's the config_aqm function in dummynet/limiters for FreeBSD. My theory right now is that, because my patch explicitly supplies one of the aforementioned AQM arguments, dummynet is interpreting that as "re-configure the AQM". Unfortunately, from what I know, dummynet has a limitation where if the queue is "busy" (and I don't understand the specifics of that), you cannot re-configure the AQM only. I don't think this affects the Scheduler option though, unless I am reading this wrong. Maybe someone can double-check this. So, in summary, if you see these lines it just means that your AQM option didn't save, and my guess is that most people would be leaving it at Drop Tail anyway.
@tman222 have you watched the video? Would you mind sharing your thoughts on ECN and masks?
As I remember it, ECN should only be on the up queue, and masks should be set appropriately: destination on the download queue, source on the upload queue.
-
@mattund said in Playing with fq_codel in 2.4:
EDIT: I'm continuing to dig into this more.
In case you missed it:
https://www.reddit.com/r/PFSENSE/comments/9j1h8u/244_codel_limiter_error/?st=JMJ7GJB0&sh=4db7939a
-
Personally, I don't use it on that side (upload), and I haven't noticed any performance loss. I am not sure where the idea came from to not use ECN on the outgoing queues; in saying that, I don't mean to discredit the idea. I have a limited understanding of what ECN actually accomplishes besides setting a TCP flag for the channel participants when the queue/link is at capacity, so I'll have to pass on saying much more than that. I will say that, at first impression, it seems as though it would help to have it set on the upload side. We may need to carefully benchmark it enabled, over connections to ECN-supporting hosts (not all hosts support it).
As for masks, I have played with them a little. I do believe they work still, and you can use them if you choose to. Personally, my setup is extremely basic so I don't need them configured outside of the default. FQ_CODEL will show one flow if you have one mask set up, by the way, usually 0.0.0.0/0 for the source and destination. From my experiences so far, this doesn't mean it's not working, it's just how it seems a dummynet scheduler configured as FQ_CODEL ingests streams. I think the developer of dummynet chose to use the internals of the scheduler type to determine the flow instead of using dummynet's capabilities of identifying unique flows, so maybe it "anonymizes" the traffic before heading into the FQ_CODEL code to save on CPU cycles.
Unrelated to that post, I am looking into why people aren't able to configure their limiters after upgrading to 2.4.4. I had no trouble; however, I was using the 2.4.3 patch. I hope I'm not too late to help there.
-
You don't need masks if you don't want additional features/filtering like evenly shared bandwidth. Anyway, I've followed the Netgate guide and even tried changing some settings incorrectly; nothing I've tried affects the bufferbloat or bandwidth numbers at all.
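For readers unsure what the mask in the queue lines actually does, here is a small sketch assuming the mask is applied as a bitwise AND to the address to pick which dynamic queue a packet falls into. `mask_bucket` is a hypothetical name used only for this illustration.

```python
import ipaddress


def mask_bucket(ip, mask=0xFFFFFFFF):
    """Return the address bits that survive the mask, i.e. the queue key."""
    masked = int(ipaddress.IPv4Address(ip)) & mask
    return str(ipaddress.IPv4Address(masked))
```

With the full `0xffffffff` mask every host keeps its own key, so each LAN address gets its own queue and bandwidth is shared per host. With a mask of `0` every address collapses to `0.0.0.0`, which is the single shared queue you get when no mask is set.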