Playing with fq_codel in 2.4

tman222

Hi @gsakes and @mattund: The setups you have described sound intriguing and it might be something I want to try as well down the road when I have some spare time and access to an additional machine:

Could you guys talk a little more about how this is set up? Does the Linux machine in front take in all the WAN traffic, shape it and pass it through to pfSense? Are there then two sets of firewalls traffic must traverse (one on pfSense and one on the Linux box) or just one? I'm just trying to understand the architecture a bit better and how one would set up something like what you have described.

Thanks in advance.

gsakes

@tman222 In my case, I'd prefer to use PFSense for firewalling and shaping, but my testing showed that Linux/Cake performs better than PfSense/fq_codel, albeit not by much, maybe 10-15% depending on the load.

As far as the architecture is concerned you don't need to run a firewall on the linux host, you can simply configure it as a router; you'd need two network interfaces, where you'd configure cake using the 'layer.cake' script from the cake github repo on the egress interface.

Taking from @mattund 's example, this is what my setup looks like:

            |   pfSense FW          | Router                    |                   |
            |   (no shaping)        | Ubuntu Server             |                   |
            |   (192.168.0.0/24)    | (10.18.9.0/24)            |                   |
   LAN -->  |                       | eth1 --> (cake) --> eth2  | --> Cable Modem   | --> WAN
            |                       |                           |                   |
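
A minimal sketch of the cake part of a router like this, using plain tc rather than any helper script. The kbit rates are placeholders made up for illustration (set them a little under your real line rate), and eth1/eth2 follow the naming in the diagram. It only shapes what leaves each interface:

```shell
# Placeholder rates, not measured values: adjust for your own line.
UP_KBIT=12000
DOWN_KBIT=120000

# Upload: shape traffic leaving eth2 toward the cable modem.
tc qdisc replace dev eth2 root cake bandwidth "${UP_KBIT}kbit"

# Download: shape traffic leaving eth1 toward pfSense.
tc qdisc replace dev eth1 root cake bandwidth "${DOWN_KBIT}kbit"

# Confirm the qdisc is attached and look at its counters.
tc -s qdisc show dev eth2
```

These commands need root, and the kernel must have the cake qdisc available.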

mattund

@tman222 said in Playing with fq_codel in 2.4:

Does the Linux machine in front take in all the WAN traffic, shape it and pass it through to pfSense?

That is correct. This is an entirely separate VM that I have set up bridging on; I'm using Debian since it's what I'm most familiar with. I was able to do it all in /etc/network/interfaces after installing Cake following https://www.bufferbloat.net/projects/codel/wiki/Cake/#installing-cake-out-of-tree-on-linux (the build may need some extra packages; not to worry, if your build system is missing them just install them):

# The loopback network interface
auto lo
iface lo inet loopback

# MANAGEMENT
allow-hotplug eth0
iface eth0 inet dhcp

# WAN1
iface eth1 inet manual
iface eth2 inet manual
auto br0
iface br0 inet manual
    bridge_stp off 
    bridge_waitport 0 
    bridge_fd 0
    bridge_ports eth1 eth2
    up tc qdisc add dev eth1 root cake bandwidth 12280kbit ; tc qdisc add dev eth2 root cake bandwidth 122800kbit
    down tc qdisc del dev eth1 root ; tc qdisc del dev eth2 root

# WAN2
iface eth3 inet manual
iface eth4 inet manual
auto br1
iface br1 inet manual
    bridge_stp off 
    bridge_waitport 0 
    bridge_fd 0
    bridge_ports eth3 eth4
    up tc qdisc add dev eth3 root cake bandwidth 2000kbit ; tc qdisc add dev eth4 root cake bandwidth 27000kbit
    down tc qdisc del dev eth3 root ; tc qdisc del dev eth4 root

You just add 5 NICs to a VM:
1: Management
2/3: Internal WAN1, External WAN1 attached to hardware
4/5: Internal WAN N+1, External WAN N+1 attached to hardware

FYI, I was not aware of netdata before I was informed here, but it is absolutely fantastic if you want to audit your setup.

gsakes

@mattund Yes, netdata is fantastic - I prefer telegraf/grafana or Prometheus for monitoring, but out-of-the box netdata is one of the best apps I've seen in a long time.

tman222

Thanks @gsakes and @mattund.

So in terms of configuration:

Let's say I had my pfSense FW --> Debian Linux Box --> WAN Connection.

Let's say the Debian box had three network interfaces:
if0: Connected to pfSense WAN Interface
if1: DHCP From Management VLAN
if2: WAN Connection (DHCP From ISP)

Would I set up if0 with a static IP, e.g. let's say 10.5.5.1, and then connect if0 to my pfSense WAN interface (let's call it pf0)? On the pfSense side I would assign my WAN interface (pf0) another static IP, e.g. 10.5.5.2, and then set 10.5.5.1 as the gateway? If that's correct so far, what would I have to set up on the Linux box to make sure traffic can get out to the internet? I would have to enable IP forwarding and then ensure that if2 becomes the gateway for if0? Also, do I have to set up any firewall rules on the Linux box to make sure that, even though it's just a router, it's not open to the whole internet?

Apologies for these basic questions, I'm having just a bit of trouble visualizing how this would all work in concert.

Thanks again.

mattund

@tman222

I opted to only address my management interface; I have not addressed either the if0 or if2 in your scenario, instead I have created a basic bridge over each interface (note my creation of br0 and br1 and lack of addressing above Layer 2 in the interface configuration). Now, in my case I feel fine managing the VM over a management port, so this is fine.

Effectively, I've just made a "dumb" Layer2 switch between pfSense and the modems with isolated port pairs, that happens to also use CAKE on each participating interface's outbound, root queue.

Reason being, I wanted an extremely simple setup with as little overhead as possible and as few layers in the stacks of the participating bridge interfaces.

gsakes

@tman222

if0 -> 10.5.5.1 -> PFSense WAN interface
if2 -> WAN Connection (DHCP From ISP) -> Cable Modem

pf0 -> 10.5.5.2

On PfSense you would configure the WAN interface with a static IP in the 10.5.5.0 subnet, and set the gateway to the address of if0.

It's probably easiest if you use a firewall to set up forwarding etc., and also have one interface. Firehol is great, here is a basic configuration which would work for your setup - just an example, you'd have to adjust for your needs. Please note that this does not firewall anything:)

/etc/firehol/firehol.conf:

version 6

# LAN subnets.
lan_ips="10.5.5.0/24 WHATEVER_SUBNET_FOR_YOUR_PFSENSE_LAN/24"

interface4 if0 lan src "${lan_ips}"
    server  all             accept
    client  all             accept
    policy                  accept


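# if2 is the WAN-facing interface in this scenario (if1 is the management port).
interface4 if2 wan src not "${lan_ips} ${UNROUTABLE_IPS}"
    protection strong 100/sec 10
    server all              accept
    client all              accept


# Forward and NAT traffic from the pfSense side (if0) out to the WAN (if2).
router4 lan2wan inface if0 outface if2
    server  all             accept
    client  all             accept
    route   all             accept
    masquerade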

Also, you'd need to add a route back to your PFSense LAN subnet via the PFSense WAN interface's static IP (you can also do this from firehol, but I prefer the network manager):

/etc/network/if-up.d/custom-routes:

#!/bin/sh
route add -net 192.168.0.0/24 gw 10.5.5.2 dev if0
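
For boxes without net-tools installed, here is a sketch of the same hook using iproute2 instead. It assumes the same addresses as above and that ifupdown is running the script, which exports the interface name in $IFACE; the guard keeps the route from being re-added for every interface that comes up:

```shell
#!/bin/sh
# Only act when if0 (the pfSense-facing interface) comes up.
[ "$IFACE" = "if0" ] || exit 0

# Reach the pfSense LAN (192.168.0.0/24) through the pfSense WAN address.
# "replace" avoids an error if the route already exists.
ip route replace 192.168.0.0/24 via 10.5.5.2 dev if0
```

Running ip route show afterwards should list the 192.168.0.0/24 entry.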

Hope this helps to clarify things.

dtaht

@gsakes said in Playing with fq_codel in 2.4:

@tman222 In my case, I'd prefer to use PFSense for firewalling and shaping, but my testing showed that Linux/Cake performs better than PfSense/fq_codel, albeit not by much, maybe 10-15% depending on the load.

As far as the architecture is concerned you don't need to run a firewall on the linux host, you can simply configure it as a router; you'd need two network interfaces, where you'd configure cake using the 'layer.cake' script from the cake github repo on the egress interface.

Taking from @mattund 's example, this is what my setup looks like:
            |   pfSense FW          | Router                    |                   |
            |   (no shaping)        | Ubuntu Server             |                   |
            |   (192.168.0.0/24)    | (10.18.9.0/24)            |                   |
   LAN -->  |                       | eth1 --> (cake) --> eth2  | --> Cable Modem   | --> WAN
            |                       |                           |                   |

While this is a cool idea, you lose a cake feature by not doing NAT on it: per-host fq. None of our testing here to date has established the coolness of this feature; you need two or more source IP addresses to test from to see it work.

(yep, flent can do this test too. Not gonna describe how here).

Anyway, if you go this route, use "flows" instead of "triple-isolate" on cake, plus "nonat", to save some CPU. But I thought we were dealing with a NAT bug in pfSense in the first place?
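
As a sketch of what those two choices look like on the tc command line, assuming an interface named eth2 and a made-up 12mbit rate (both are placeholders, not values from anyone's setup here):

```shell
# Box that only sees traffic already NATed elsewhere: per-host fairness has
# nothing to work with, so use plain per-flow fairness and skip the NAT lookup.
tc qdisc replace dev eth2 root cake bandwidth 12mbit flows nonat

# Box that does the NAT itself: keep per-host fairness and let cake look
# through the NAT table. Left commented out so only one variant is applied.
# tc qdisc replace dev eth2 root cake bandwidth 12mbit triple-isolate nat
```

Only one of the two should be active on a given interface; replace swaps the root qdisc in place.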

dtaht

@gsakes You can even get rid of any need to have the cake box have ips. Just create a bump in the wire:

https://apenwarr.ca/log/?m=201808
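
A rough sketch of such an address-less bridge on a regular Linux install using iproute2, assuming the two ports are called eth1 and eth2 and the bridge br0 (all three names are placeholders). No IP address is assigned anywhere:

```shell
# Create a bridge with spanning tree off; it is just a two-port wire.
ip link add name br0 type bridge stp_state 0

# Enslave both ports to the bridge.
ip link set dev eth1 master br0
ip link set dev eth2 master br0

# Bring everything up. Nothing gets an address.
ip link set dev eth1 up
ip link set dev eth2 up
ip link set dev br0 up
```

Cake would then be attached to eth1 and eth2 with tc, as in the /etc/network/interfaces example earlier in the thread.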

tman222

Thanks @dtaht - I have seen that link you shared before. Now I'm curious how that "bump" should be spec'd hardware-wise to shape at gigabit speeds. Does anyone have any thoughts on that? Also, does OpenWRT run on a normal Linux box? If not, is it possible to duplicate the functionality on just a regular Linux install?

gsakes

@dtaht Thanks for the tips Dave, as always much appreciated:) I'm decommissioning my PFSense box for now. I've been using PFSense since 2010, and I don't think there's anything better, but I'll put it on ice until fq_codel matures and/or Cake is implemented. I'm slowly building out this Ubuntu box to be my firewall, using Firehol/Fireqos, netdata and PiHole.

So yes - I will be doing NAT on the box:)

gsakes

@dtaht said in Playing with fq_codel in 2.4:

triple-isolate

Yep, it's the 'per-host-fq' that is a real big factor, compared to fq_codel - frankly the biggest reason for me switching over:)

BTW - Anyone wanting to learn about fq_codel, Cake and the design of both should read this:

Piece of CAKE: A Comprehensive Queue Management Solution for Home Gateways

uptownVagrant

So I read this paper: Dummynet AQM v0.1 – CoDel and FQ-CoDel for FreeBSD’s ipfw/dummynet framework

The paper is written by the folks that implemented Codel and FQ-CoDel in FreeBSD ipfw/dummynet. I know @dtaht knows this because he reviewed the source, and there was correspondence between them and him back in the day. I'm just catching up - thanks for your patience.

Looking at the examples in the paper, I'm wondering why the Codel AQM is selected in the pfSense WebUI in the August 2018 hangout? Per the FQ-CoDel examples in the paper above, it does not seem appropriate, and removing Codel as the AQM from the pipe and queue removes the "flowset busy" error @mattund mentioned 4 months ago. @dtaht this is why I was stating codel+fq-codel - when I first learned about FQ-CoDel being added to pfSense 2.4.4, it was in the hangout video, which instructs you to choose Codel as the AQM.

Concerning buckets and CPU utilization, I played with net.inet.ip.dummynet.hash_size, which is the closest thing I could find to what you were explaining. pfSense defaults to 256, and I doubled the value on each flent rrul test up to 16384. I had to use sysctl -w net.inet.ip.dummynet.hash_size=$value on the fly in the console because /etc/inc/shaper.inc overwrites the setting to 256 any time you make a change to the limiters. I did not find setting this above 256 to provide real value.
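
For anyone who wants to repeat that kind of sweep, a sketch of how it could be scripted. The function names, the 60 second test length and the netperf server argument are assumptions for illustration, not what was actually run:

```shell
# Print hash sizes from $1 up to $2, doubling each step.
hash_sizes() {
    size=$1
    while [ "$size" -le "$2" ]; do
        echo "$size"
        size=$((size * 2))
    done
}

# Run one flent rrul test per hash size against the netperf server given in $1.
# Must be run as root on the pfSense console, with flent installed.
sweep() {
    for size in $(hash_sizes 512 16384); do
        sysctl -w net.inet.ip.dummynet.hash_size="$size"
        flent rrul -l 60 -H "$1" -t "hash_size-$size"
    done
}
```

Calling sweep with the hostname of a netperf server would then step through 512, 1024 and so on up to 16384, one test per value.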

So, unfortunately I haven't made much progress...

markn6262

@xraisen
I found using CODEL for QMA & FQ_PIE for Scheduler, along with CODEL for queue QMA does NOT error with “config_aqm Unable to configure flowset, flowset busy!” Hope this is helpful to those reading this thread late.

xRaisen

@markn6262 Thanks for the suggestion. But it doesn't work on my end. codel/tail drop+fq_codel do wonders even though it nags “config_aqm Unable to configure flowset, flowset busy!”

strangegopher

Thanks for the cake + OpenWRT in bridge mode suggestion. For my 150 Mbps+ speeds, an ipq806x-based or mvebu (Cortex-A9) device was suggested to me on IRC. Can't wait for it to arrive next week.

dtaht

Yes, the ipq8xx and A15 gear is good to a couple hundred Mbit.

For a gbit the lowest end x86 I recommend is the apu2 or an i3.

Is there any way to push harder on the pfsense nat bug?

dtaht

flent has now been packaged up and made available for freebsd.

https://github.com/tohojo/flent/commit/c928c03a301258c26c7d045c74ecce6dfeaa3d5a

markn6262

@xraisen You're right, FQ_PIE initially didn't exhibit the error but later did in some cases. Your recommendation appears more solid so I'm now using it as well. Thanks.

tman222

@uptownvagrant said in Playing with fq_codel in 2.4:

Looking at the examples in the paper, I'm wondering why the Codel AQM is selected in the pfSense WebUI in the August 2018 hangout? Per the FQ-CoDel examples in the paper above, it does not seem appropriate, and removing Codel as the AQM from the pipe and queue removes the "flowset busy" error @mattund mentioned 4 months ago. @dtaht this is why I was stating codel+fq-codel - when I first learned about FQ-CoDel being added to pfSense 2.4.4, it was in the hangout video, which instructs you to choose Codel as the AQM.

I have wondered the same as well (if you look up a few posts I shared some thoughts on this based on my current understanding of Dummynet and Limiters). In most situations, the scheduler chosen is just that - a scheduler only. In that case controlling the traffic flowing to the scheduler in the queue(s) makes sense to me. However, fq_codel combines scheduling and AQM into one, so having Codel on the input queue(s) seems a bit redundant to me. Having said that, I currently have both enabled and it provides the best performance in my case. I'm still trying to figure out exactly why, but it might be because I'm trying to push a lot of packets from a 10Gbit LAN link into a 1Gbit WAN link and the additional AQM helps keep things orderly.

I would be very interested to see some comparisons of Codel + fq_codel vs. just fq_codel as I do wonder at which point it actually starts to make a difference vs. just using additional processing without any real benefit.