Playing with fq_codel in 2.4

Nullity

Effects on latency for uploads are better than with ALTQ, which matches my earlier experience.

How much better?

You were using CoDel with ALTQ, yeah?

chrcoluk

I was using fairq+codel. It was doing a reasonable job but had perhaps double the jitter I see now on thinkbroadband upload tests at the same throughput.

tman222

I'm glad you got things working chrcoluk. I'm starting to wonder if the instructions in the OP might be incompatible with the latest version of pfSense and that is causing issues.

I posted these instructions in another thread - https://forum.pfsense.org/index.php?topic=142321.0

----------------------------
Basic Instructions For Setting Up fq_codel:

1) Set up limiters - at minimum you'll need to create two root limiters and then create one queue under each root limiter. You can set up more queues if required/desired. This is also where you set your bandwidth limits.
2) Apply the queues to the necessary firewall rules (e.g. to the LAN rule(s) that allows your outbound traffic in the "In/Out Pipe" section).
3) Enable fq_codel via the command line (you can SSH into the firewall for that) by issuing the following command:

ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel

To validate that the command has indeed enabled fq_codel, issue this command:

ipfw sched show

If all looks good (you should now see fq_codel listed in the output), go ahead and test to see if performance is acceptable. If not, you can make changes by tweaking the algorithm's default parameters and/or your bandwidth limits. For instance, you may have to increase the algorithm's target latency if you have a connection with slower upload speed, or decrease your bandwidth limits if e.g. your upload/download speeds aren't stable.

4) To make sure that your settings stick between reboots, install the ShellCmd add-on package in pfSense. Once you have done that, add the command from step 3 to ShellCmd.

Some additional notes:
1) On setting up limiters: See post #121 in the thread: https://forum.pfsense.org/index.php?topic=126637.msg754199#msg754199
2) On tweaking algorithm parameters: See post #198 (and following) in the thread: https://forum.pfsense.org/index.php?topic=126637.msg769665#msg769665
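
As a rough illustration of the target-latency advice in step 3: a common rule of thumb (as I understand it from RFC 8290) is that the target should not be lower than the time it takes to send one full-size packet at the link speed. The sketch below works that out for a hypothetical 2 Mbit/s upload with 1514-byte frames and prints a candidate command instead of running it. The function name `packet_time_ms`, the example link speed, and the assumption that scheduler 1 is the upload side are illustrative and not part of the original instructions.

```sh
#!/bin/sh
# Time in whole milliseconds (rounded up) needed to transmit one packet
# of $1 bytes on a link of $2 bits per second.
packet_time_ms() {
    bytes=$1
    bps=$2
    echo $(( (bytes * 8 * 1000 + bps - 1) / bps ))
}

# Hypothetical example link: 2 Mbit/s upload, 1514-byte Ethernet frames.
target_ms=$(packet_time_ms 1514 2000000)

# Print (rather than run) a candidate command so it can be reviewed first.
# Assumption: scheduler 1 / pipe 1 is the upload limiter.
echo "ipfw sched 1 config pipe 1 type fq_codel target ${target_ms}ms"
```

With these inputs the suggested target comes out at 7ms, a little above fq_codel's 5ms default. Treat the number as a starting point for testing rather than a recommendation.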

----------------------------

These instructions have worked for me through 2.4.2-RELEASE-p1. Would it make sense to start a new thread with them?

Thanks in advance.

chrcoluk

I agree with a new thread. I would also add findings by others as notes, such as tuning the quantum size; I think someone mentioned that using 300 is good for prioritising small packets?
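
For reference, the quantum is one of the options the FreeBSD ipfw man page lists for fq_codel (the default there is 1514 bytes). A hedged sketch of the change being discussed, reusing the scheduler and pipe numbers from the instructions above; 300 is just the value mentioned in this thread, not a measured recommendation:

```sh
# Lower the fq_codel quantum on both schedulers (example value only).
ipfw sched 1 config pipe 1 type fq_codel quantum 300 && ipfw sched 2 config pipe 2 type fq_codel quantum 300

# Check which parameters are now in effect.
ipfw sched show
```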

This thread has a lot of pages, so it's easy to miss stuff.

For reference, I am still using the newer method: I have left pfSense unpatched and added the command in shellcmd so it applies on every reboot. (I still have the patch applied that enhances the limiter diagnostics page in the GUI.)

Doing a filter reload doesn't seem to break it, so it's fine for me.

It is possible I somehow mixed up the modules, as I had added extra modules for functionality. At some point I will do a clean install of 2.4.2 to ensure both modules are in sync, as right now I am still using the 2.4.0 modules. Or I might install 2.4.2 elsewhere and just copy the modules across from that.

kcallis

I am getting ready to take the plunge on 2.4.2_p1. I have been using the wizard with Multiple LAN/WAN (I currently have 10 VLANs, 1 WAN and three VPN_WAN connections). I do so enjoy and envy those people that have 100/50 and 50/25 connections, but I have been cursed with AT&T, and my DSL is 18/2, so I need to squeeze out the most optimal setup.

I have been reading, but was wondering if someone has possibly started a new thread so that I can be up to date on all the tricks to make this work smoothly?

gsmornot

Quick question. If I run the command ipfw sched show I see fq_codel. If I look in the gui at diagnostics limiter, I see fifo. Is that what I should see? The limiters are working fine but I wonder if fq_codel is really applied to the stream or is what I see just the result of setting limiters.

Edit: Just to see what happens, I left everything in place but removed the entry from shellcmd and restarted. This restored the system settings related to limiters. The result on DSLReports is A+ across the board. The limiter info in the GUI now populates with information about the limiters. Maybe I missed something in this process, but this is much better for me. I notice, as shown in the screenshots and as mentioned here in other places, that I use schedulers 1 and 2 in my script but the system limiters do not. My DSLReports ratings prior to this change were D and F. I have a feeling it's something I'm missing, but for the moment I am getting the result I was after.

One caveat about this current config: I have gig symmetrical that will do 920 each way without limiters. With my current config it tests at about 750, which is fine.

![Screen Shot 2018-03-02 at 3.31.46 PM.png](/public/imported_attachments/1/Screen Shot 2018-03-02 at 3.31.46 PM.png)
![Screen Shot 2018-03-02 at 3.31.33 PM.png](/public/imported_attachments/1/Screen Shot 2018-03-02 at 3.31.33 PM.png)

whitewidow

@Animosity022:

I figured I'd share my config, as I spent some time today, with little to do at work, converting my pfSense setup over to fq_codel. I have a 1Gig Verizon FIOS line coming in which is rated at 940 down and 880 up. My setup is pretty straightforward: I only split traffic into 3 queues, prioritizing my games and VoIP as high and putting all my p2p / Plex download traffic below everything else.

I have the Shell Command to create the proper queue setup:

https://i.imgur.com/k08PJQZ.png

I have upload and download limiters, each with 3 buckets, at 880Mb/s and 940Mb/s respectively. In those queues, I have a high, default and low at weights of 75, 25 and 5.

https://i.imgur.com/6JZTEXd.png

https://i.imgur.com/6cDzTe5.png

Source and Destination in the config gets a little squirrelly for me as I want to make sure I have a clear break in my upload and download traffic so I didn't select either there as I handle that in the rules config.

I have a series of match floating rules with logging setup so I can validate. All shaping is selected on my WAN interface:

https://i.imgur.com/HeMy45B.png

My rule examples are a bit big, so I linked them a little differently:

Default queue
http://i.imgur.com/CQDQGcf.png


Low priority rule
http://i.imgur.com/MDuvFFe.png

http://i.imgur.com/CADTE77.pn

For floating rules and pipes, the in and out are switched, as noted in the help text. I checked that in my speed test, where the speeds are exactly what I expected. I noticed much better performance compared with the other schedulers that are stock in pfSense.

My speedtest results made me happy:

Edit 1: I seem to have a slight problem with matching my internal (Private) IPs properly. I've gotta do a little more testing to figure out why they aren't matching. My WAN rules work perfect though so it's a start. I just want to make sure I can get internal stuff matched as well.

From what I remember about limiters, I thought that the mask needs to be set depending on whether the traffic is inbound or outbound.

I have my upload mask set to "source address" and download to "destination address" for the limiter and each queue nested under it.

Is this correct? It seems to work and I see traffic passing. I didn't see any with it set to "none".

zwck

I have some "issues" with the download queue, so I'd like to tell you what I have done so far.

I am on pfSense 2.4.2_1 and I have a symmetrical 1000Mbit line. A DSLReports image from before is attached, and my DSLReports result looks as expected (image_1…)

Creating Limiters (screenshots attached for the upload part; the download part is the same but with a different name)

Upload (limited to 900Mbit)
  highUp 75
  defaultUp 25
  lowUp 5
Download (limited to 900Mbit)
  HighDown
  defaultDown
  lowDown

Creating Floating Rules
I created 6 floating rules in total but am only going to show the default ones in the screenshots;
the other ones are basically clones anyway.
Installing the shellcmd package and adding
ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel
Horrible results. Something is not working right on the download side, dunno what it is :D

Also an imgur album to just take a look at all the screenshots: https://imgur.com/a/bkIuA

Attachments: 01-DSL-Report_before.JPG, 02_upload.JPG, 03_outhigh.JPG, 04outlow.JPG, 06_in_01.JPG, 07_out_01.JPG, 08_shellcmf.JPG, 09_commands_after.JPG, 11_horrible.png
![05_rule setup.JPG](/public/imported_attachments/1/05_rule setup.JPG)
![10_horrible_download results.JPG](/public/imported_attachments/1/10_horrible_download results.JPG)

Harvy66

@zwck

I'm getting the feeling that "ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel" doesn't mean the same thing when you have multiple *queues. I find it interesting that your "ipfw sched show" says something like

Sched1 weight 75 fq_codel
child flowsets: 3 2 1

Sched2 weight 25 fq_codel
child flowsets: 6 5 4

Why do your two scheds claim to have different weights if they're unrelated? Start small. Do a single queue per direction, then work your way up to 3 each.

*I use the term "queue" in the general concept, not the technical context of the ipfw command

zwck

@Harvy66:

@zwck

I'm getting the feeling that "ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel" doesn't mean the same thing when you have multiple *queues. I find it interesting that your "ipfw sched show" says something like

Sched1 weight 75 fq_codel
child flowsets: 3 2 1

Sched2 weight 25 fq_codel
child flowsets: 6 5 4

Why do your two scheds claim to have different weights if they're unrelated? Start small. Do a single queue per direction, then work your way up to 3 each.

*I use the term "queue" in the general concept, not the technical context of the ipfw command

Well, I thought I somehow understood the concept here, but evidently not :D

I thought

The floating rule decides which limiter to use

default traffic upload -> default limiter upload weight 25
default traffic download -> default limiter download weight 25

VIP traffic upload -> high limiter upload weight 75
… and so on
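
To put rough numbers on the mapping above: if the three queues under one 900Mbit limiter were all busy, and their weights were honored the way a weighted fair scheduler such as WF2Q+ would honor them, each queue's share would be its weight divided by the sum of the weights. Whether fq_codel applies these weights at all is exactly what is in question in this thread, so this only shows the intended split. The function name `weighted_share` is made up for the example.

```sh
#!/bin/sh
# Share of a parent bandwidth $1 (Mbit/s) for a queue with weight $2,
# given the sum of all active weights $3. Integer division rounds down.
weighted_share() {
    echo $(( $1 * $2 / $3 ))
}

total=$(( 75 + 25 + 5 ))
for w in 75 25 5; do
    echo "weight $w: about $(weighted_share 900 "$w" "$total") Mbit/s"
done
```

For weights 75, 25 and 5 this works out to roughly 642, 214 and 42 Mbit/s when every queue is saturated.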

TheNarc

Thanks to everyone who has contributed to this thread. I've read through the whole thing and am beginning to try out FQ_CODEL myself, but I have two distinct points of confusion that I didn't see addressed anywhere:

1) Why are masks used on the limiters? The explanation of masks on the limiter configuration page states that "If "source" or "destination" slots is chosen a dynamic pipe with the bandwidth, delay, packet loss and queue size given above will be created for each source/destination IP address encountered, respectively. This makes it possible to easily specify bandwidth limits per host." But isn't that exactly what we don't want? We want the limiter to serve as a cumulative bandwidth cap, such that the total bandwidth usage of all hosts on the network will not exceed an ISP's upload or download caps.
2) Is it really necessary to create queues for the limiters? I had thought it would suffice to simply create one limiter each corresponding to my upload and download caps, and then assign them as my in and out pipe respectively using LAN firewall rules.

Thanks in advance if anyone can help to clarify these points for me.

tman222

@TheNarc:

Thanks to everyone who has contributed to this thread. I've read through the whole thing and am beginning to try out FQ_CODEL myself, but I have two distinct points of confusion that I didn't see addressed anywhere:

1) Why are masks used on the limiters? The explanation of masks on the limiter configuration page states that "If "source" or "destination" slots is chosen a dynamic pipe with the bandwidth, delay, packet loss and queue size given above will be created for each source/destination IP address encountered, respectively. This makes it possible to easily specify bandwidth limits per host." But isn't that exactly what we don't want? We want the limiter to serve as a cumulative bandwidth cap, such that the total bandwidth usage of all hosts on the network will not exceed an ISP's upload or download caps.

2) Is it really necessary to create queues for the limiters? I had thought it would suffice to simply create one limiter each corresponding to my upload and download caps, and then assign them as my in and out pipe respectively using LAN firewall rules.

Thanks in advance if anyone can help to clarify these points for me.

1) Source/Destination masks are actually used on the queues under the limiters, but are not necessary on the limiters themselves.
2) Yes, you'll have to create queues to get fq_codel to work. The codel ("controlled delay") part of fq_codel controls the size of the queue of packets and helps prevent bufferbloat by dropping packets when necessary, while the fq (or "fair queuing") helps to ensure that flows get fair access to the bandwidth (vs. one flow dominating another etc.). For some further reading, check out these links:

https://en.wikipedia.org/wiki/CoDel
https://tools.ietf.org/html/rfc8290#section-1.3

Hope this helps.

TheNarc

Thanks tman222. I think that makes sense. It definitely makes sense that fq_codel needs queues to work; I just got confused by the fact that the ipfw sched show command seems to indicate that fq_codel has been applied to the limiters themselves (i.e. without queues). It had me thinking that each limiter itself had one queue implicitly, but you could create child queues if desired.

As to the masking, is the goal to have a new queue created for each host, but the cumulative bandwidth is still capped by the parent limiter? I find I'm still confused about why multiple queues would be desirable, and that seems to be the main - if not only - purpose of using masks. From the second (ietf.org) link you provided, this excerpt would seem to suggest that multiple queues are created implicitly?

The intention of FQ-CoDel's scheduler is to give each flow its own queue, hence the term "flow queueing". Rather than a perfect realisation of this, a hashing-based scheme is used, where flows are hashed into a number of buckets, each of which has its own queue.

That makes it sound like the "bucket size" parameter on the limiter itself dictates how many queues there are, and packets are placed into them when they are hashed

on the 5-tuple of source and destination IP addresses, source and destination port numbers, and protocol number

So it seems like there are implicitly queues involved already, but there is also the capability to explicitly add more queues, and I'm struggling to understand the need and/or reason for that.
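
A toy sketch of the hashing idea in that excerpt, to make it concrete: each packet's 5-tuple is hashed and the result picks one of a fixed number of buckets, so packets of the same flow always land in the same queue. `cksum` is only a stand-in for whatever hash the real scheduler uses, and `flow_bucket`, the addresses and the bucket count of 1024 are illustrative assumptions.

```sh
#!/bin/sh
# Map a flow's 5-tuple to one of N buckets.
# Arguments: $1=src ip  $2=dst ip  $3=src port  $4=dst port
#            $5=protocol number  $6=number of buckets
# cksum (a CRC) is a stand-in for the scheduler's real hash function.
flow_bucket() {
    sum=$(printf '%s|%s|%s|%s|%s' "$1" "$2" "$3" "$4" "$5" | cksum | cut -d ' ' -f 1)
    echo $(( sum % $6 ))
}

# Two LAN hosts talking to the same server: separate flows, each hashed
# independently. The same 5-tuple always yields the same bucket.
flow_bucket 192.168.1.10 203.0.113.5 51000 443 6 1024
flow_bucket 192.168.1.11 203.0.113.5 51000 443 6 1024
```

Two different flows can still collide in one bucket, which is presumably why the RFC describes the scheme as less than a perfect realisation of one queue per flow.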

I'm sure I'm just not grasping some (or multiple) concepts here and really appreciate your input.

tman222

You raise some good points. To be honest, I'm not 100% sure what would happen if you just create two limiters and then try to enable fq_codel. If fq_codel itself creates a queue for each flow then this might be enough for things to work and you don't need the additional queue underneath each limiter. In my case, I actually use multiple queues under each limiter to help me manage bandwidth fairness across several VLANs by assigning weights to those queues.

I think you might need to experiment and report back your findings. Try creating just two limiters and then enabling fq_codel. If that doesn't yield the desired results, create two limiters with one queue under each and see if that changes the results.

Hope this helps.

TheNarc

I don't really have quantitative results yet, but I'm hoping to run some more tests tonight and report back. Basically, after reading over the traffic shaping section of the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z), I believe that I've come to understand the guidance on using queues underneath pipes and for using masks on those queues. There's a lot of great information in that man page, but this excerpt is particularly relevant:

Thus, when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).

Using masks results in dynamic queues/pipes. So we do not want dynamic pipes (because we want an overriding bandwidth cap) but we do want dynamic queues. I believe that by setting a source address mask of 32 (IPv4) and 128 (IPv6) on upload queues and a destination address mask of 32 (IPv4) and 128 (IPv6) on download queues, each host on the LAN will get its own queue, but the aggregate bandwidth usage of all those queues will be constrained by their common parent pipe.
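
In plain ipfw syntax on stock FreeBSD (pfSense generates its own limiter rules from the GUI, so this is for illustration rather than something to paste in), the two cases from the man page excerpt might look like this. The pipe and queue numbers and the 10Mbit/s figure are arbitrary.

```sh
# (a) Mask on the pipe: a separate 10Mbit/s dynamic pipe is created for
#     each source address, so there is no cumulative cap.
ipfw pipe 1 config bw 10Mbit/s mask src-ip 0xffffffff

# (b) Mask on the queue: one 10Mbit/s pipe in total, with a dynamic queue
#     per source address sharing that pipe's bandwidth.
ipfw pipe 2 config bw 10Mbit/s
ipfw queue 1 config pipe 2 mask src-ip 0xffffffff
```

Case (b) is the arrangement described above: per-host queues under a single parent pipe that enforces the overall limit.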

I'll try to run some tests later tonight to get actual results to post. One thing I struggle with is that beyond running the dslreports speed/bufferbloat test (and running it on multiple hosts simultaneously to see whether bandwidth is being both capped and fairly shared), I don't know how to get visibility into what is actually taking place with respect to the dummynet config (e.g. is it actually dynamically creating queues?). Of course, if the network ends up behaving well that's all that ultimately matters, but I do like understanding things when I can ;)

TheNarc

While I've been poking around, I have managed to increase my confusion. From what I can tell, ipfw is the interface by which dummynet is configured (hence the ipfw commands that are set up with shellcmd in order to persist the application of fq_codel through reboots). But does the ipfw service need to be running in order for dummynet to work? Because on my system, kldstat shows dummynet.ko loaded but not ipfw.ko. And executing ipfw show results in the message ipfw: retrieving config failed: Protocol not available. But ipfw queue show, ipfw sched show, and ipfw pipe show all work as expected. My presumption is that this is okay, because pf is still the firewall subsystem being used, and can somehow throw traffic into dummynet queues even though dummynet is a subset of the ipfw firewall subsystem, but because ipfw is not performing firewall duties it does not need to be (and should not be) loaded itself. Perhaps this ability of the pf firewall to assign traffic to dummynet queues is a pfSense-specific patch?

TheNarc

Here's one other specific point of confusion. Based on the output from ipfw sched show, it looks like dynamic queues aren't created when the scheduler is fq_codel even when the masks (dst-ip for download queues, src-ip for upload queues) are set to 0xffffffff (/32). Instead, no matter how many hosts on the LAN are active, the output from ipfw sched show only ever shows a single line per pipe with both Source IP/Port and Dest. IP/Port as 0.0.0.0/0, as shown in the attached screen shot. Perhaps I'm just not looking in the right place to see dynamically created queues, but I did notice that before changing the scheduler type to fq_codel, the default was WFQ2+, and with that scheduler and all other settings the same, the output from ipfw sched show showed multiple lines with real IPs and port numbers. Am I fundamentally misunderstanding something that explains this discrepancy?

ipfw_sched_show.png_thumb

tibere86

@TheNarc:

Here's one other specific point of confusion. Based on the output from ipfw sched show, it looks like dynamic queues aren't created when the scheduler is fq_codel even when the masks (dst-ip for download queues, src-ip for upload queues) are set to 0xffffffff (/32). Instead, no matter how many hosts on the LAN are active, the output from ipfw sched show only ever shows a single line per pipe with both Source IP/Port and Dest. IP/Port as 0.0.0.0/0, as shown in the attached screen shot. Perhaps I'm just not looking in the right place to see dynamically created queues, but I did notice that before changing the scheduler type to fq_codel, the default was WFQ2+, and with that scheduler and all other settings the same, the output from ipfw sched show showed multiple lines with real IPs and port numbers. Am I fundamentally misunderstanding something that explains this discrepancy?

My understanding was that dynamic pipes were only created if masks were set on the queues, not the pipes. That's how I have my limiters set up, and things have been working well for the 6+ months I have had them enabled.

TheNarc

My interpretation of information from the FreeBSD man page for ipfw (https://tinyurl.com/jfzok5z) is that masks set on pipes result in dynamic pipes and masks set on queues result in dynamic queues, the difference being that:

when dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe).

The takeaway is that if you have dynamic pipes, you won't get a cumulative bandwidth cap, but rather a bandwidth cap per flow. That would be useful if you don't want anyone on your network to be able to use more than a certain amount of bandwidth, but not useful if you want to prevent the total bandwidth usage of all users on your network from exceeding your ISP's bandwidth caps.

What's confusing to me is that I believe I should have dynamic queues, but I'm not seeing any evidence of them (but admittedly may be looking in the wrong place by expecting to see that evidence in the output from ipfw sched show).

dennypage

Has anyone else encountered an issue with fq_codel with 2.4.3? I haven't dug into it yet, but it appears to have stopped working for me…