UPDATE: 2.4.3 is slower than 2.4.2 (Was: Has FAIRQ Behaviour Changed in 2.4.3?)
-
So I just changed my queues to PRIQ and I still see the same thing.
I think the 50Mb/s was a red herring.
I am also testing with iperf and similar tools (that traffic goes into my Bulk queue).
Even with just plain ol' PRIQ enabled, I still only see 60-70Mb/s of traffic.
But as soon as I disable the WAN and LAN queues (at the Interface/Parent level) I can get 93-94Mb/s.
I think I'm going to have to roll back a VM to 2.4.2-P1 and do more testing, because PRIQ should always allow me my maximum bandwidth when nothing else is using the network (and nothing else is, as shown by removing the config and getting full speed).
-
If I am not mistaken, you should not be setting bandwidth values on individual FAIRQ queues for the behaviour you want. I believe they work as hard caps rather than allowing one queue to borrow from another.
The problem is that I cannot remember for certain, and there is no reference to FAIRQ on either the pf.conf man page or the pfSense documentation page. What I do know is that I left the box blank in my ALTQ configuration when I used to use it. If I left it blank, it was probably because the lack of queue borrowing stopped me using it, so I just applied priorities instead.
HFSC allows queues to borrow from each other.
-
DragonFly BSD has a manual page for pf.conf with fairq details.
However, it appears my problem, whatever it is, isn't specific to FAIRQ. As I posted above, I see slower than expected traffic even using PRIQ.
-
I've done a lot more testing and something is definitely amiss, but it might be a problem of mine.
I am running pfSense virtualised on Proxmox. When I run a single iperf stream from an external host into my network with no traffic queues configured (i.e. unticked on WAN/LAN), I get a good 94-95Mb/s.
When I tick them, regardless of PRIQ or FAIRQ, that drops to about 60-70Mb/s, but the WAN interface still records that it's getting 100Mb/s. I will have to do some more digging, but for the moment it looks like maybe my CPU is being pegged harder (though both pfSense and Proxmox show only 50% CPU in use when a queue is active, versus 15% when one isn't).
Is there a way to download 2.4.2-P1 so I can go back and forth and test?
-
https://atxfiles.pfsense.org/mirror/downloads/old/
I would strongly advise trying all of this on physical hardware. It wouldn't be the first time that someone got tripped up by a hypervisor glitch.
-
Yea, I'll see if I can dig out some actual hardware to try it on.
It's just the old thing of it was working on 2.4.2-P1.
I know because that was the first pfSense version I've deployed and I did extensive testing to make sure my QoS was working correctly and with excellent performance, and it was!
And then after 2.4.3 I suddenly have this bad performance problem.
I agree though, it's possible it's a hypervisor issue or a hardware issue that's crept in somewhere.
Thanks for your help.
-
So I've done some more testing - can anyone smarter than me tell me if the below shows that my 2 x vCPUs are being pegged here?
  PID USERNAME PRI NICE SIZE  RES STATE C   TIME   WCPU COMMAND
   11 root     155 ki31   0K  32K CPU1  1 291.7H 55.16% [idle{idle: cpu1}]
   11 root     155 ki31   0K  32K RUN   0 290.1H 50.37% [idle{idle: cpu0}]
   12 root     -92    -   0K 400K WAIT  0 213:36 48.55% [intr{irq261: virtio_pci2}]
   12 root     -92    -   0K 400K WAIT  1 148:00 43.18% [intr{irq264: virtio_pci3}]
That's running a 100Mb/s iperf that only gets ~50-70Mb/s with FAIRQ turned on.
With it off:
  PID USERNAME PRI NICE SIZE  RES STATE C   TIME   WCPU COMMAND
   11 root     155 ki31   0K  32K CPU1  1 291.7H 91.36% [idle{idle: cpu1}]
   11 root     155 ki31   0K  32K RUN   0 290.1H 87.53% [idle{idle: cpu0}]
   12 root     -92    -   0K 400K WAIT  0 215:07 11.89% [intr{irq261: virtio_pci2}]
   12 root     -92    -   0K 400K WAIT  1 149:20  8.10% [intr{irq264: virtio_pci3}]
That, to me, says that I'm CPU bound in the first example.
Can someone confirm/deny this for me please :)
-
55 & 50% idle isn't that bad for your dual cores. Turning on shaping quadruples the CPU load.
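As a rough check of that ratio, the numbers from the two top outputs above can be plugged into a short Python sketch. It assumes busy time per core is simply 100 minus the idle thread's WCPU, which ignores sampling noise; the variable and function names are made up for illustration.

```python
# Idle-thread WCPU percentages copied from the two top outputs in this thread.
idle_shaping_on = [55.16, 50.37]   # cpu1, cpu0 with FAIRQ enabled
idle_shaping_off = [91.36, 87.53]  # cpu1, cpu0 with the queues disabled


def mean_busy(idle_percentages):
    """Approximate busy % per core as 100 - idle %, averaged over the cores."""
    busy = [100.0 - idle for idle in idle_percentages]
    return sum(busy) / len(busy)


busy_on = mean_busy(idle_shaping_on)
busy_off = mean_busy(idle_shaping_off)
ratio = busy_on / busy_off

print(f"busy with shaping:    {busy_on:.1f}%")
print(f"busy without shaping: {busy_off:.1f}%")
print(f"ratio:                {ratio:.1f}x")
```

On these figures the ratio comes out at roughly 4.5x, with about half of each vCPU still idle while shaping is on.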
-
CPU usage depends on the hardware. I do HFSC+CoDel+NAT at gigabit line rate, half-duplex, at around 17% CPU. I can't test full-duplex because I don't have the client hardware to test bidirectionally at line rate.
-
Hardware is an Intel(R) Core(TM) i5-5250U CPU. pfSense is one of two VMs on a Proxmox (KVM) host, and the CPU type is Host. pfSense is allocated a gig of memory and is using ~25% of that. PTI is turned off, both in the pfSense guest and on the VM host. The only extra packages I use are openvpn-export and Avahi.
I have allocated it 2 CPUs in Proxmox, so really it's probably getting a single core with Hyperthreading turned on.
I can see the old pfSense v2.4.2 but is there a way to get the -P1 I was originally using? I assume if I install 2.4.2 and then do an upgrade I'll go to 2.4.3, not -P1?
I think I have an old VM backup somewhere, but it would be good to reinstall from a proper image.
-
I'm not aware of a way to pull down individual patches for their general releases.
-
OK, so I finally got the time this weekend, with the kids asleep, to install pfSense 2.4.2 (I can't find a way to load -P1, but it doesn't matter).
With the same config/QoS config, I get the following results (repeatable every time):
pfSense 2.4.2 - Speedtest 92Mb/s Down, 18Mb/s Up
pfSense 2.4.3 - Speedtest 49Mb/s Down, 18Mb/s Up
Both of those are with the same FAIRQ configuration.
So there appears to be some sort of performance regression between 2.4.2 and 2.4.3. What can I do to diagnose this further? I'd log a ticket on Redmine, but I can't actually point to a bug, and "It's different between versions" isn't something any sane developer can work with.
Does anyone have any suggestions for how to debug this further so we can bisect it to a FreeBSD or pfSense patch?
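One thing that might make the comparison easier to attach to a ticket is recording each run in a machine-readable form and summarising it per version. Below is a minimal Python sketch, assuming iperf3 is the test tool (its -J option emits JSON, and for TCP tests the receiver-side rate is under end.sum_received.bits_per_second). The function names are hypothetical, and the only real measurements used are the two Speedtest download figures above.

```python
import json
import statistics


def received_mbps(iperf3_json_text):
    """Return receiver-side throughput in Mb/s from one `iperf3 -J` result."""
    result = json.loads(iperf3_json_text)
    return result["end"]["sum_received"]["bits_per_second"] / 1e6


def summarise(label, runs_mbps):
    """Describe repeated runs so two versions can be compared side by side."""
    median = statistics.median(runs_mbps)
    return f"{label}: median {median:.1f} Mb/s over {len(runs_mbps)} runs"


def relative_drop(baseline_mbps, candidate_mbps):
    """Fractional throughput loss of the candidate relative to the baseline."""
    return (baseline_mbps - candidate_mbps) / baseline_mbps


# Speedtest download figures from this post: 2.4.2 vs 2.4.3.
drop = relative_drop(92.0, 49.0)
print(f"2.4.2 -> 2.4.3 download drop: {drop:.0%}")
```

Saving several runs per version (for example with `iperf3 -c <server> -J > run1.json`) and feeding each file's contents to `received_mbps` would give a median per version rather than a single number.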
-
I'd log a ticket on Redmine, but I can't actually point to a bug, and "It's different between versions" isn't something any sane developer can work with.
I would do exactly that, just as long as you can faithfully reproduce the condition.
-
Thanks KOM, I have created a ticket here.
I fully expect to get shouted at though :)
Thanks for your help.