Massive 10x Performance Regression in AES-GCM
-
Please see the results of my openssl speed tests:
I am seeing a massive performance regression in AES-GCM in pfSense 2.8.0 and 2.8.1-BETA.
I have two boxes in production and a test box. The production box on pfSense+ 24.11 is fine. The test box is exactly the same hardware, and I am testing a new firewall build on it. I ran some crypto speed tests and was floored to see a 10x decrease in speed for AES-GCM!!! Both have AES-NI enabled and apparently working. The AES-CBC results between the two are similar. It's just the AES-GCM that is terrible... and that is the cipher standard I use.
I then jumped on another production box that uses different CPUs, but still 4-core, and ran tests. It's on pfSense CE 2.8.0 (but thinking of upgrading that one to Plus). The results are also extremely poor for AES-GCM, but again AES-CBC is fine. The AES-GCM one does improve at 8192 bytes size, and is fine at 16384 bytes though. Weird.
I don't think this is a CE vs Plus thing, as I have results from all this gear from an older version of pfSense where all of it was testing fine. I didn't record the exact version of pfSense for that one (2.6.x??), but it was definitely from a time before we upgraded anything to Plus.
Seems to be a newer kernel thing. Something is broken, and I am hesitant to upgrade the Plus box to 25.7 if the new version has an issue with AES-GCM.
Anyone care to do their own testing and share results? Or shed light on what might be going on? Thank you.
-
I did the timed theoretical throughput tests for OpenVPN, and this is what I got:
Again, the AES-GCM results on the test node with CE 2.8.1-BETA seem to be well below what is to be expected.
-
OK a third post to myself. Seems like this is a known issue, at least to some with web searching skills...
https://github.com/openwrt/openwrt/issues/18929
The box that is always testing fine shows this output, from "openssl version"
OpenSSL 3.0.14 4 Jun 2024 (Library: OpenSSL 3.0.14 4 Jun 2024)
The other two (that report speed problems) have:
OpenSSL 3.0.16 11 Feb 2025 (Library: OpenSSL 3.0.16 11 Feb 2025)
The summary in the linked URL is:
"
This performance degradation is only reflected in the speed test program that comes with OpenSSL and does not affect the actual performance.
If it is caused by a bug fix in the test program, it should be OK, but it will cause some users to misunderstand that there is a performance problem.
"
Reading https://github.com/openssl/openssl/issues/28063, I've learned that there was an intentional change from 3.0.15 to 3.0.16 due to prior results being "erroneously high because much of the decryption effort was skipped". But I believe their implementation is buggy, because of how poorly GCM compares against CBC at the low byte sizes: there's no way that CBC can beat GCM by that much (a factor of 20x) at the 16 byte size.
Anyway, I hope more people complain to the OpenSSL devs, and they can fix this up in 3.0.17. At least it's only the testing that gives bugged results, and real-world performance is supposed to be fine.
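For anyone comparing numbers across boxes, here is a minimal Python sketch that checks whether two "openssl version" banners sit on the same side of that change. It is only an illustration: the function names are made up for this post, and the 3.0.16 cut-off is taken from the issue linked above, so it says nothing about other OpenSSL branches.

```python
import re
from typing import Tuple

# Assumption taken from this thread: the `openssl speed` change for AEAD
# ciphers landed between 3.0.15 and 3.0.16. Other branches are not covered.
FIRST_CHANGED_3_0_PATCH = 16


def parse_openssl_version(banner: str) -> Tuple[int, int, int]:
    """Return (major, minor, patch) from an `openssl version` banner line."""
    match = re.match(r"\s*OpenSSL (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"unrecognised banner: {banner!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def has_changed_speed_test(banner: str) -> bool:
    """True if this 3.0.x build is at or after the 3.0.16 speed-test change."""
    major, minor, patch = parse_openssl_version(banner)
    if (major, minor) != (3, 0):
        raise ValueError("only the 3.0.x branch is discussed in this thread")
    return patch >= FIRST_CHANGED_3_0_PATCH


def speed_results_comparable(banner_a: str, banner_b: str) -> bool:
    """True when both builds sit on the same side of the change."""
    return has_changed_speed_test(banner_a) == has_changed_speed_test(banner_b)


if __name__ == "__main__":
    good = "OpenSSL 3.0.14 4 Jun 2024 (Library: OpenSSL 3.0.14 4 Jun 2024)"
    slow = "OpenSSL 3.0.16 11 Feb 2025 (Library: OpenSSL 3.0.16 11 Feb 2025)"
    print(speed_results_comparable(good, slow))  # False
```

If this returns False for two boxes, their "openssl speed" AES-GCM figures were produced by different test code and should not be read as a hardware or kernel regression on their own.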
-
So basically... only the benchmark results themselves are degraded, NOT 'real world' AES-GCM performance. This limited regression is a known issue with (at least) the OpenSSL library version 3.0.16 11 Feb 2025.
Great beta testing. Thanks for all the info!
-
Hmm, that's fun*!
For reference what sort of crypto hardware do you have enabled on those systems?
-
Both my "CPU Type A" boxes are Intel Xeon E5-2697v4 dual CPU Dell R730 chassis set up for virtualization. The one running pfSense+ 24.11 is a VM on ESXi 7.0 U2 with CPU AES-NI exposed to the VM and AES-NI CPU-based Acceleration enabled in the pfSense settings.
The second "CPU Type A" "Test node" is the exact same Dell R730 configuration but with Proxmox 8.4, which I will migrate to as I escape the clutches of Broadcom. It runs Qemu CPU profile "x86-64-v3" which exposes all the features of the Xeon E5-2697v4 to guests - particularly CPU-based AES-NI, which is enabled on the pfSense 2.8.1-BETA running on that. I am doing a full firewall rebuild as it's just cleaner what with the hypervisor change, and changing over from vmx3 to vtnet interfaces, plus a whole bunch of other reasons I won't go into here. Once I'm satisfied, I'll get Netgate to xfer the Plus licence to that; we paid for a 2-year sub about 6 months ago.
On both hypervisors the VM is dimensioned for 4 cores, 8GB RAM, and a 60GB thin-provisioned virtual disk formatted with UFS on SSD storage. I have tried ZFS for VMs before but there's too much disk write amplification, as the backing storage is also copy-on-write (ESXi VMFS, and the ZFS I use for my Proxmox). ZFS on ZFS... yukky.
The "CPU Type B" is a Protectli Vault 6-port model FW6B (revision 1). Intel 7th Generation Core i3 7100U Kaby Lake-U, dual-core CPU. 6x Intel Gigabit 82583V Ethernet NICs. coreboot BIOS. 32GB RAM, running on dual SSDs formatted as ZFS RAID1. I thought this was a 4-core model, but actually it's only dual core (I have some other newer small boxes that are 4-core).
I was half-way through the building of the new pfSense firewall on Proxmox when I discovered the AES-GCM test anomalies. Spent half a day running speed tests and checking all my server BIOS and hypervisor settings and running around in circles before I realised it's just an OpenSSL non-event. Doh!
-
@Gcon Now I understand why, on pfSense 25.07.1, my OpenVPN client internet speed dropped from almost 1 gb to 200 mb with AES-GCM, even with DCO enabled. I was using Proton VPN on 24.11 and the speed was almost 1 gb; now it is terrible))) Both AES-NI CPU Crypto: Yes (active) and IPsec-MB Crypto: Yes (active)! But even if I set CHACHA20-POLY1305, I get the same low speed. For the moment I have moved to WireGuard.
-
@Antibiotic Hi, apparently the issue I mention here is supposed to be only a cosmetic bug, due to a change in the code for performing AES-GCM (and other AEAD cipher) speed tests that rely on OpenSSL routines. The change came in between OpenSSL 3.0.15 and 3.0.16.
So there are most likely other reasons for your speed regression. I suggest reading the URLs I linked to further up and digging a bit further into those.