Playing with fq_codel in 2.4
-
@pentangle I assume this was implied, but just to clarify, you do not see these connections being dropped when you don't use any limiters? One thing that may be worth trying (and at least is easy to try) is setting masks on the queues. That will cause a dynamic queue to be created for each host, rather than all hosts funneling into a single queue, while still enforcing a cumulative bandwidth limit. To be clear, don't set masks on the limiters, but on the child queues of the limiters. For download queues set a destination mask and for upload queues set a source mask; for both set them to 32/128 bits for IPv4/IPv6.
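To illustrate why a 32/128-bit mask gives each host its own dynamic queue, here is a minimal Python sketch. It is not how dummynet is implemented; `queue_key` and `bucket_flows` are hypothetical helpers that only model the idea of using the masked address as the queue selector.

```python
import ipaddress
from collections import defaultdict


def queue_key(address: str, mask_bits: int) -> str:
    """Return the masked address that would select a dynamic queue."""
    network = ipaddress.ip_network(f"{address}/{mask_bits}", strict=False)
    return str(network.network_address)


def bucket_flows(packets, direction):
    """Group (src, dst) packets into per-host queues.

    direction == "download": mask on the destination address.
    direction == "upload":   mask on the source address.
    A full-length mask (32 bits for IPv4, 128 for IPv6) is assumed.
    """
    queues = defaultdict(list)
    for src, dst in packets:
        host = dst if direction == "download" else src
        bits = 32 if ipaddress.ip_address(host).version == 4 else 128
        queues[queue_key(host, bits)].append((src, dst))
    return dict(queues)
```

With a full-length mask every LAN host maps to its own key; a shorter mask (for example 24 bits) would lump a whole subnet into one queue.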
-
I would try @TheNarc's suggestions first. If you still experience drops, here are a couple more things to try:
-
Reduce the quantum parameter on the algorithm to something lower, such as 300. Reference: https://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel/
-
If that also does not help, you could create separate weighted queues under your limiters (e.g. one set of queues for VoIP and RDP traffic, another set for the other traffic) to ensure that VoIP and RDP are guaranteed a certain amount of bandwidth. This would also require you to create the appropriate firewall rules to route VoIP and RDP through the one set of queues, while the rest of the traffic would go through the other set of queues.
Hope this helps.
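As a rough illustration of what weighted child queues buy you, the sketch below splits a limiter's bandwidth in proportion to the queue weights. The queue names, the weights, and the 20 Mbit/s figure are made-up examples, and the result only describes the case where every queue is backlogged; an idle queue's share is normally available to the others.

```python
def weighted_shares(limit_mbit, weights):
    """Split a limiter's bandwidth in proportion to queue weights.

    limit_mbit: total bandwidth of the parent limiter, in Mbit/s.
    weights:    dict of queue name -> weight (hypothetical names).
    Returns the share each queue gets when all queues are busy.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: limit_mbit * w / total for name, w in weights.items()}
```

For example, `weighted_shares(20, {"voip_rdp": 70, "other": 30})` gives the VoIP/RDP queues 14 Mbit/s and everything else 6 Mbit/s under full load.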
-
As far as I know, there are no weights or priorities with FQ_CODEL. If you want such functionality, you could try using QFQ + CoDel. From the ipfw man page:
"implements the QFQ algorithm, which is a very fast variant of WF2Q+, with similar service guarantees and O(1) processing costs (roughly, 200-250ns per packet)."
Using QFQ + CoDel works pretty well for me when it comes to sharing bandwidth based on priority.
-
@tman222 Try your test further from the AP.
-
@pentangle Normally things like VoIP "just work" with fq_codel; no classification is required. https://www.researchgate.net/publication/327781871_Analysing_the_Latency_of_Sparse_Flows_in_the_FQ-CoDel_Queue_Management_Algorithm
-
How do I set up fq_codel on a dual-WAN (load balancing + failover) setup?
I tried to follow the jim-p video on fq_codel. It works on a single WAN, but on dual WAN it's not working.
-
More long-duration flent-based tests would help. I'm concerned about the various bits of flakiness y'all are reporting, like dropping all connections at the end of a test.
-
If there is someone out there who would like to send me a box to play with, or (more simply) open up an ssh port, that would help.
-
@knowbe4 said in Playing with fq_codel in 2.4:
How do I set up fq_codel on a dual-WAN (load balancing + failover) setup?
I tried to follow the jim-p video on fq_codel. It works on a single WAN, but on dual WAN it's not working.
I have several gateways (WAN + VPNs) and I did it using limiter rules on LAN. They all get put into the same pipe. The tricky part with your setup is that your overall bandwidth is shared over two connections, so if I were you I would use interface QoS per WAN interface and a round-robin WAN assignment, which I assume you are already doing given that you are load balancing. The only way to properly load balance and share bandwidth evenly between your customers would be to weight the random gateway assignment based on real-time bandwidth usage, effectively load balancing the bandwidth as opposed to the connections. I'm not sure such a thing exists off the shelf.
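To make the "weight the gateway assignment by real-time bandwidth usage" idea concrete, here is a small Python sketch. It is purely illustrative: `pick_gateway`, the gateway names, and the capacity/usage numbers are hypothetical, and nothing here corresponds to an existing pfSense feature.

```python
import random


def pick_gateway(capacity_mbit, usage_mbit, rng=random):
    """Pick a gateway with probability proportional to its free capacity.

    capacity_mbit: dict of gateway name -> link capacity (Mbit/s).
    usage_mbit:    dict of gateway name -> current measured usage (Mbit/s).
    rng:           random module or random.Random instance (for testing).
    """
    names = list(capacity_mbit)
    headroom = [
        max(capacity_mbit[n] - usage_mbit.get(n, 0.0), 0.0) for n in names
    ]
    if sum(headroom) == 0:
        # Every link is saturated: fall back to a uniform random choice.
        return rng.choice(names)
    return rng.choices(names, weights=headroom, k=1)[0]
```

A new connection would then land more often on whichever WAN currently has the most unused bandwidth, rather than being spread evenly by connection count.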
-
What exactly would you need: a port open on the pfSense box so you could ssh into it and play with fq_codel? Or are you more interested in pure data acquisition with flent against a standard pfSense config? I have not looked into flent; are there some guides I can follow? Or would it already be enough to set up an Ubuntu VM and let you do the rest?
-
@thenarc Sorry for the delay, this is my first time back in the office since I wrote my last post. I've tried adding masks to the queues but still get connections dying right at the end of the upload test. I can confirm when disabling the limiters the connections stay up.
-
@tman222 I'm holding my breath, but I think that Quantum change might have fixed this. It's not dropped this time. I'll try further tests later, but I suspect this might be the issue - thanks!
-
@tman222 Done a bunch more tests and no drops!! :) thanks.
-
@zwck The easiest thing from my perspective would be for you to compile netperf with
./configure --enable-demo
for this platform, run "netserver -N" on the box (open up the relevant port, 12865, on both IPv6 and IPv4 if available), then I can run flent tests from anywhere if you give me the IP.
That's all flent needs to target tests at a box from the outside world. If you are extra ambitious you can also install irtt (it's written in Go) and open up its UDP port....
Flent has other good guidelines on flent.org if you wish to play with it yourself (the rrul test is the way to get the mostest fastest). but it involves installing a lot of python code somewhere.
Having ssh into the box also available would let me tcpdump and monitor other things (I'm not usually big on asking for access of that level, being able to target some tests at the box would be a start)
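Before handing out the IP, it can be worth checking that the netserver control port is actually reachable from outside. The sketch below is a generic TCP reachability check, assuming netserver listens on netperf's default control port, 12865; `port_is_open` is a hypothetical helper and is not part of netperf or flent.

```python
import socket

NETPERF_CONTROL_PORT = 12865  # netperf's default control port (assumed unchanged)


def port_is_open(host, port=NETPERF_CONTROL_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run it from a machine outside the firewall; a `False` result usually means the firewall rule or port forward is missing, or netserver is not running.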
-
@pentangle You shouldn't have to fiddle with the quantum at all. You are saying that a quantum 300 works when a quantum 1514 does not? The quantum 300 thing is an optimization for very slow links, it shouldn't "cause connections to drop at the end of a test".
-
Hi @dtaht - the intuition behind my suggestion was to try to make the algorithm more "fair" since VoIP packets tend to be a bit smaller.
I was going off of what I read here:
https://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel/
"We have generally settled on a quantum of 300 for usage below 100mbit as this is a good compromise between SFQ and pure DRR behavior that gives smaller packets a boost over larger ones."
@Pentangle also mentioned that his connection speed was 80/20.
Is this not the right way to think about the quantum?
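A toy deficit round robin (DRR) simulation can make the quoted "boost" for smaller packets concrete. This is a sketch of plain DRR only, not of fq_codel's actual scheduler (which also keeps new/old flow lists and runs CoDel on each queue); the flow names and packet sizes are made up.

```python
from collections import deque


def drr_order(flows, quantum):
    """Return the order in which plain DRR would transmit packets.

    flows:   dict of flow name -> list of packet sizes in bytes.
             Flows are visited in dict insertion order each round.
    quantum: bytes of credit added to a backlogged flow per round.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queues = {name: deque(sizes) for name, sizes in flows.items()}
    deficit = {name: 0 for name in flows}
    order = []
    while any(queues.values()):
        for name, q in queues.items():
            if not q:
                continue
            deficit[name] += quantum
            # Send packets while the head packet fits in the credit.
            while q and q[0] <= deficit[name]:
                deficit[name] -= q.popleft()
                order.append(name)
            if not q:
                deficit[name] = 0  # an emptied flow keeps no credit
    return order
```

With `{"bulk": [1514, 1514], "voip": [200, 200, 200]}`, a quantum of 1514 sends a full-size bulk packet before any VoIP packet, while a quantum of 300 lets all three small packets out before the first bulk packet has accumulated enough credit.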
-
@dtaht said in Playing with fq_codel in 2.4:
@zwck The easiest thing from my perspective would be for you to compile netperf with
./configure --enable-demo
for this platform, run "netserver -N" on the box (open up the relevant port, 12865, on both IPv6 and IPv4 if available), then I can run flent tests from anywhere if you give me the IP.
That's all flent needs to target tests at a box from the outside world. If you are extra ambitious you can also install irtt (it's written in Go) and open up its UDP port....
Flent has other good guidelines on flent.org if you wish to play with it yourself (the rrul test is the way to get the mostest fastest). but it involves installing a lot of python code somewhere.
Having ssh into the box also available would let me tcpdump and monitor other things (I'm not usually big on asking for access of that level, being able to target some tests at the box would be a start)
I mean, I can set up a netperf server behind the pfSense box; running it on the box itself might not be possible for me.
-
@tman222 It is a right way to think about the problem. :) However, the OP was reporting "all my connections drop at the end of the test". There should be a "burp" at the beginning of the test as all the flows start, but I guess I don't know what he means by what he said.
-
@zwck If you can port forward or use IPv6 to reach a box behind it, that would work. Otherwise I tend to originate flent tests from the box(es) behind it to one or more of our flent servers in the cloud.
-
@dtaht said in Playing with fq_codel in 2.4:
@zwck If you can port forward or use IPv6 to reach a box behind it, that would work. Otherwise I tend to originate flent tests from the box(es) behind it to one or more of our flent servers in the cloud.
For that type of test, should I just turn off all traffic shaping, or should I keep it running as I have it configured?
-
If the Netgate folks could make netperf (and irtt) available, that would be very helpful overall. I really don't trust web-based tests; most of them peak out at ~400mbit in the browser...