Bad performance of pfSense 2.5.[0,1] compared to 2.4.5_p1
-
I have done several measurements using VMware ESXi 7.0.2 on a decent host (a Dell R740 with many cores). The network adapter on the pfSense side is vmxnet3; the physical adapter is a Mellanox ConnectX-3 (40 Gbit/s).
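The measurement tool isn't named in the post, but the per-stream numbers below suggest a multi-stream TCP test along the lines of iperf3 (addresses and stream counts here are placeholders, not taken from the post):

# on the server side (e.g. the WAN-side host):
iperf3 -s
# on the client side (e.g. the LAN-side host), here with 4 parallel TCP streams:
iperf3 -c 192.0.2.10 -P 4 -t 30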
Performance does not scale with the number of vCPUs until
hw.pci.honor_msi_blacklist="0"
is set in /boot/loader.conf. The results show that pfSense 2.5.[0,1] is much slower than 2.4.5_p1.
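To make this persistent on pfSense, something along these lines works (on pfSense, /boot/loader.conf.local survives upgrades, unlike loader.conf):

# /boot/loader.conf.local
# let FreeBSD use MSI/MSI-X on blacklisted (VMware virtual) PCI devices
hw.pci.honor_msi_blacklist="0"

# after a reboot, check that the vmx interfaces report MSI-X:
dmesg | grep -i vmx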
Test setup: 8 vCPUs, 2 adapters (vmxnet3).

Forwarding results (pf disabled), 2.4.5_p1
(client in LAN, server at WAN):
1 stream:  7.44 Gbit/s
2 streams: 9.32 Gbit/s
4 streams: 9.21 Gbit/s
8 streams: 10.2 Gbit/s
(server at WAN, client at LAN):
1 stream:  8.03 Gbit/s
2 streams: 8.23 Gbit/s
4 streams: 8.64 Gbit/s
8 streams: 9.82 Gbit/s

Results with pf active (pass rule), 2.4.5_p1
(client in LAN, server at WAN):
1 stream:  4.40 Gbit/s
2 streams: 6.04 Gbit/s
4 streams: 8.75 Gbit/s
8 streams: 9.43 Gbit/s
(server at WAN, client at LAN):
1 stream:  4.34 Gbit/s
2 streams: 6.07 Gbit/s
4 streams: 8.57 Gbit/s
8 streams: 9.31 Gbit/s

Forwarding results (pf disabled), 2.5.1
(client in LAN, server at WAN):
1 stream:  2.71 Gbit/s
2 streams: 5.22 Gbit/s
4 streams: 8.95 Gbit/s
8 streams: 9.66 Gbit/s
(server at WAN, client at LAN):
1 stream:  2.80 Gbit/s
2 streams: 4.37 Gbit/s
4 streams: 7.55 Gbit/s
8 streams: 9.58 Gbit/s

Results with pf active (pass rule), 2.5.1
(client in LAN, server at WAN):
1 stream:  2.17 Gbit/s
2 streams: 3.56 Gbit/s
4 streams: 8.36 Gbit/s
8 streams: 9.46 Gbit/s

Results with pf active (pass rule), 2.5.0
(server in LAN, client in LAN):
1 stream:  2.24 Gbit/s
2 streams: 4.19 Gbit/s
4 streams: 5.88 Gbit/s
8 streams: 8.82 Gbit/s

With only a few streams I see a massive loss of performance.
For comparison, a Debian system reaches a forwarding performance of more than 12 Gbit/s (using a single vCPU).
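The "pf active" runs above use a single pass rule; the exact rule isn't shown, but in plain pf.conf terms the setup is roughly equivalent to:

# minimal test ruleset: one stateful rule that passes all traffic
pass all

# load and enable for the "pf active" runs:
pfctl -e -f /etc/pf.conf
# disable pf entirely for the "forwarding (pf disabled)" runs:
pfctl -d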
Any ideas for improvement? Kind regards. If anything is missing or unclear, just ask.
-
@fwcheck said in Bad performance of pfSense 2.5.[0,1] compared to 2.4.5_p1:
honor_msi_blacklist
https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist
https://redmine.pfsense.org/issues/11010
https://docs.netgate.com/pfsense/en/latest/hardware/tune.html#vmware-vmx-4-interfaces
-
I have checked the suggested optimizations.
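For reference, the linked tuning guide's approach uses iflib overrides in /boot/loader.conf.local. I can't reproduce the guide's exact values here, but settings along these lines would yield the descriptor counts shown in the dmesg output below:

# /boot/loader.conf.local - iflib descriptor overrides for vmx0
# (repeat per interface; depending on the driver, per-ring values
# may need to be given as a comma-separated list)
dev.vmx.0.iflib.override_ntxds="4096"
dev.vmx.0.iflib.override_nrxds="2048"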
default:
vmx0: Using 512 TX descriptors and 512 RX descriptors
vmx0: Using 8 RX queues 8 TX queues

with optimizations:
vmx0: Using 4096 TX descriptors and 2048 RX descriptors
vmx0: Using 8 RX queues 8 TX queues
vmx0: Using MSI-X interrupts with 9 vectors

2.5.1:
mean: pf 2.15 Gbit/s / forwarding: 2.74 Gbit/s
mean: pf 4.35 Gbit/s / forwarding: 5.48 Gbit/s
mean: pf 7.42 Gbit/s / forwarding: 8.67 Gbit/s
mean: pf 8.67 Gbit/s / forwarding: 8.61 Gbit/s

If anything, the values are even a little bit slower.
Any other ideas? I will crosscheck with FreeBSD 12.2. Update: I have checked FreeBSD 12.2 and FreeBSD 13-RC4 on bare metal; FreeBSD 13 is much faster, by about a factor of 2.
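For anyone repeating the bare-metal crosscheck, the forwarding setup on stock FreeBSD boils down to something like this (a sketch; interfaces and addressing are omitted):

# make FreeBSD forward packets between its interfaces
sysctl net.inet.ip.forwarding=1
# disable pf (if it was enabled) for the plain forwarding numbers
pfctl -d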