Massive performance drop after upgrade from 23.05 to 23.09
-
Hi,
I’ve recently updated to 23.09 and realized that HA Proxy throughput was reduced by more than 30%.
I'm using HA Proxy to offload SSL and allow all internal hosted applications to have a valid SSL certificate. One instance is OpenSpeedTest.
Download dropped from 1540Mbps to 870-950 Mbps and Upload even further down to ~760Mbps.
Downgrading to R25.xx immediately restores the original performance; upgrading again brings the figures back down. No config was changed.
pfSense is running on a NUC11 Core i3 with a 2.5 Gbit/s connection. Bypassing SSL brings the performance up to almost interface speed, 2467 Mbps up and down.
Has anyone seen a similar performance issue, maybe related to the new OpenSSL version? Any suggestions on what could be tuned or configured to get at least back to the previous throughput?
Thx & Greets
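For reference, the relative drop implied by the download figures above can be worked out with a few lines of Python. This is only an illustrative sketch: the helper name `percent_drop` is made up for this example, and the inputs are the Mbps values quoted in the post.

```python
def percent_drop(before_mbps: float, after_mbps: float) -> float:
    """Return the throughput loss as a percentage of the original value."""
    return (before_mbps - after_mbps) / before_mbps * 100.0

# Download figures quoted in the post (Mbps): 1540 before, 870-950 after.
baseline = 1540.0
for after in (870.0, 950.0):
    drop = percent_drop(baseline, after)
    print(f"{baseline:.0f} -> {after:.0f} Mbps: {drop:.1f}% lower")
```

With these inputs the download drop works out to roughly 38-44%, which is consistent with "more than 30%".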
-
-
Hmm, interesting. How exactly are you testing? Where from?
Are you able to roll back to 23.05 to check it's still good there? Was it actually 23.05 or 23.05.1?
Steve
-
@stephenw10
Testing from a Win10 Client, Chrome, 13700k on 10G fiber -> https://openspeedtest.local.mydomain -> NUC 11 2,5G HA Proxy on PFsense -> openspeedtest on the same host.
Yes, rollback / downgrade brings back the previous performance.
-
So the client device is directly on the WAN side of the firewall or somewhere remote?
Not that it should make any difference to a relative difference between 23.05 and 23.09. I agree it seems likely it was something in the openssl update.
To be clear were you running 23.05 or 23.05.1 previously?
-
@stephenw10
All local, Client and Server are on the LAN side
I was on 23.05.01.
-
@sunny1081 I have the same problem on a Netgate 6100.
When I fetch data from a website (local LAN), top on the pfSense box shows a utilization of around 70% or more.
Netgate 6100, pfSense+ 23.09
Is more information needed?
-
Also via HAProxy I assume?
-
@stephenw10 yes, the website is delivered via the haproxy.
I noticed it when I did my local speed test and the result was very bad (the pfSense box and my client are connected at 10 Gbit/s).
What I also noticed when I tested the AES performance in the terminal was that it was also very bad, far worse than on the hardware I had before the Netgate 6100.
[23.09-RELEASE][admin@fw1.in.xxx.de]/root: openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-GCM for 3s on 16 size blocks: 8843058 AES-256-GCM's in 3.03s
Doing AES-256-GCM for 3s on 64 size blocks: 2811809 AES-256-GCM's in 3.02s
Doing AES-256-GCM for 3s on 256 size blocks: 735595 AES-256-GCM's in 3.02s
Doing AES-256-GCM for 3s on 1024 size blocks: 184877 AES-256-GCM's in 3.01s
Doing AES-256-GCM for 3s on 8192 size blocks: 23542 AES-256-GCM's in 3.04s
Doing AES-256-GCM for 3s on 16384 size blocks: 11779 AES-256-GCM's in 3.05s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x4ff8e3bfefebffff:0x2294e283
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 46676.76k 59520.26k 62284.18k 62940.77k 63459.06k 63339.37k
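To put the summary row above in network terms, the figures can be converted from "1000s of bytes per second" to Mbit/s. The snippet below is a minimal sketch; the function name `openssl_k_to_mbit` is hypothetical, and the row is copied from the 23.09 output in this post.

```python
def openssl_k_to_mbit(value: str) -> float:
    """Convert an `openssl speed` figure such as '63339.37k' to Mbit/s.

    The tool reports thousands of bytes per second, so multiply by 1000
    to get bytes/s, by 8 to get bits/s, then divide by 1e6 for Mbit/s.
    """
    kbytes_per_s = float(value.rstrip("k"))
    return kbytes_per_s * 1000 * 8 / 1e6

# Summary row from the 23.09 run above, one figure per block size (bytes).
sizes = [16, 64, 256, 1024, 8192, 16384]
row = "46676.76k 59520.26k 62284.18k 62940.77k 63459.06k 63339.37k".split()

for size, figure in zip(sizes, row):
    print(f"{size:>6} bytes: {openssl_k_to_mbit(figure):8.1f} Mbit/s")
```

For the 16384-byte blocks this comes out at roughly 507 Mbit/s, far below the 10 Gbit/s link mentioned above.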
-
Try running top with:
top -HaSP
to show the per core usage.
-
Also which crypto device settings are you using on the 6100?
-
OK, I think we have a lead on this. Devs are digging into it....
-
@stephenw10 said in Massive performance drop after upgrade from 23.05 to 23.09:
Try running top with:
top -HaSP
to show the per core usage.
I'm happy to do that if I can help narrow down the problem.
Yes, QAT is activated.
-
I have now tested 23.09.1. The values are slightly better, but I would have expected more (I have no earlier comparison on the Netgate 6100). I now get 60 MByte/s in the download.
[23.09.1-RELEASE][admin@fw1.in.xxx.de]/root: openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-GCM for 3s on 16 size blocks: 25549143 AES-256-GCM's in 3.02s
Doing AES-256-GCM for 3s on 64 size blocks: 18785835 AES-256-GCM's in 3.02s
Doing AES-256-GCM for 3s on 256 size blocks: 7343485 AES-256-GCM's in 3.04s
Doing AES-256-GCM for 3s on 1024 size blocks: 2143908 AES-256-GCM's in 3.01s
Doing AES-256-GCM for 3s on 8192 size blocks: 281723 AES-256-GCM's in 3.03s
Doing AES-256-GCM for 3s on 16384 size blocks: 141599 AES-256-GCM's in 3.03s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x4ff8e3bfefebffff:0x2294e283
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 135556.07k 398687.98k 618589.50k 729886.52k 761360.76k 765346.97k
-
You should see openssl speed back at essentially the 23.05.1 speeds.
Had you never tested that in 23.05.1?
-
@stephenw10 I can't say, I always only had 23.09 on the netgate 6100
-
Do you still see HAProxy using close to 100% of one CPU core?
Is that speedtest actually using multiple streams?
-
@stephenw10 As you can see in the screenshot, the cpu is very busy https://forum.netgate.com/assets/uploads/files/1702051796330-scr-20231208-oypq.png. I can't say anything about the second question, I didn't develop it. I just use it.
-
Hi,
I can confirm that with 23.09.1 AES-NI acceleration is back. For me this showed up as a massive performance drop in OpenVPN with 23.09, which is now fixed with 23.09.1.
23.05.1
root: openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 15066311 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 7855959 aes-256-gcm's in 3.01s
Doing aes-256-gcm for 3s on 256 size blocks: 2719459 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 756518 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 97459 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 16384 size blocks: 48892 aes-256-gcm's in 3.01s
OpenSSL 1.1.1t-freebsd 7 Feb 2023
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-gcm 80353.66k 167158.48k 232060.50k 258224.81k 266128.04k 266321.96k
23.09
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-GCM for 3s on 16 size blocks: 5880314 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 64 size blocks: 1862428 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 256 size blocks: 493665 AES-256-GCM's in 3.01s
Doing AES-256-GCM for 3s on 1024 size blocks: 125115 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 8192 size blocks: 15699 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 16384 size blocks: 7843 AES-256-GCM's in 3.00s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x43d8e3bfefebffff:0x2282
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 31361.67k 39731.80k 42016.66k 42705.92k 42868.74k 42833.24k
23.09.1
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-GCM for 3s on 16 size blocks: 15367274 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 64 size blocks: 7672483 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 256 size blocks: 2691809 AES-256-GCM's in 3.01s
Doing AES-256-GCM for 3s on 1024 size blocks: 754613 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 8192 size blocks: 97312 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 16384 size blocks: 48977 AES-256-GCM's in 3.01s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x43d8e3bfefebffff:0x2282
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 81958.79k 163679.64k 229104.41k 257574.57k 265726.63k 266784.97k
This is with AES-NI CPU acceleration enabled.
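To make the comparison between the three releases explicit, the 16384-byte figures from the runs above can be expressed relative to 23.05.1. This is just a sketch for illustration; the helper `relative_to` and the dictionary layout are assumptions of this example, and the numbers are copied from the outputs in this post.

```python
# 16384-byte AES-256-GCM figures from the three runs above,
# in thousands of bytes per second as printed by `openssl speed`.
results_k = {
    "23.05.1": 266321.96,
    "23.09": 42833.24,
    "23.09.1": 266784.97,
}

def relative_to(baseline: str, results: dict[str, float]) -> dict[str, float]:
    """Express every result as a fraction of the chosen baseline release."""
    base = results[baseline]
    return {release: value / base for release, value in results.items()}

ratios = relative_to("23.05.1", results_k)
for release, ratio in ratios.items():
    print(f"{release:>8}: {ratio:6.1%} of 23.05.1")
```

On these figures 23.09 reaches only about 16% of the 23.05.1 throughput, while 23.09.1 is back to within 1% of it.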
-
Issue is resolved for me on 23.09.1, thx to whoever fixed it.
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-GCM for 3s on 16 size blocks: 126245097 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 64 size blocks: 75396240 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 256 size blocks: 41552001 AES-256-GCM's in 3.01s
Doing AES-256-GCM for 3s on 1024 size blocks: 15179553 AES-256-GCM's in 3.00s
Doing AES-256-GCM for 3s on 8192 size blocks: 2251028 AES-256-GCM's in 3.02s
Doing AES-256-GCM for 3s on 16384 size blocks: 1139204 AES-256-GCM's in 3.09s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0xfffab2234f8bffff:0x18405f5ef1bf07ab
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 673307.18k 1608453.12k 3536560.96k 5181287.42k 6099157.46k 6033040.27k
Still not line speed, but back to the level it was on 23.05.01
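As a sanity check, the summary figures can be approximately reproduced from the "Doing ..." lines: blocks processed times block size, divided by elapsed time. The sketch below uses a made-up helper `throughput_k` and the 16384-byte line from the output above. A small mismatch is expected, presumably because the printed elapsed time is rounded to two decimals.

```python
def throughput_k(count: int, block_bytes: int, elapsed_s: float) -> float:
    """Thousands of bytes per second for `count` blocks done in `elapsed_s`."""
    return count * block_bytes / elapsed_s / 1000.0

# 16384-byte line from the run above: 1139204 blocks in 3.09 s.
estimate = throughput_k(1139204, 16384, 3.09)
reported = 6033040.27  # summary row value for 16384 bytes, in k

print(f"estimate: {estimate:.2f}k, reported: {reported:.2f}k")
```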
-
@micneu said in Massive performance drop after upgrade from 23.05 to 23.09:
I have now tested 23.09.1. The values are slightly better, but I would have expected more (I have no earlier comparison on the Netgate 6100). I now get 60 MByte/s in the download.
Which software do you use for this testing, please? (The picture looks like the speed test from Fast.com.)