Mobile IPSEC VPN Slower with Async Crypto Enabled After Upgrade to 2.5.0 CE
-
Post-upgrade to 2.5.0, I've noticed a significant drop in IPSEC mobile/road-warrior throughput, specifically with asynchronous crypto enabled. Disabling async crypto actually increases speed, but throughput is then limited by single-core performance, leaving me with less total throughput than on 2.4.5p1. CPU use with async crypto disabled looks like you'd expect, being limited to a single core. With it enabled, the CPU is pegged and throughput is worse. At the moment, I haven't been able to lab up a site-to-site tunnel to compare the below results against, so I'm not sure whether this is limited to mobile configs or not.
Past threads and bug reports going back a few years suggest potential issues with either the SHA256 hash or AES GCM modes and how they're implemented in some hardware modules, so I've explicitly avoided using those in the following test cases (except at the very end, where noted).
Originally, I encountered this on a virtualized installation and over the internet. To rule out resource contention, all of the below testing was done on a spare (old-ish) Dell server, entirely over an internal network, starting on 2.4.5p1 and working up through the latest RC snapshot.
The iperf3 server used is still virtualized, but is not on a particularly busy network. Tests done without the VPN, but still going through the DUT for NAT and routing, run at essentially wire rate, suggesting this isn't due to other network or hardware issues.
The following are the config bits on the DUT that I figure are worth noting. If I've missed something, I have backups of the config on each version (not attached, as I haven't sanitized them).
Full tunnel, mobile extensions enabled
IPSEC MSS: 1350 (probably not needed here, but used for consistency with prior/original observations)
NAT-T: force
IKEv1, aggressive mode, x-auth + PSK
P1: AES-CBC 128, SHA1, DH 14
P2: ESP, AES-CBC 128, SHA1, PFS off
PowerD enabled, all set to max.
DUT firewall rules are any/any from LAN to any.
No installed packages.
Kernel PTI and MDS Mitigation were left as default ("Enabled" and "Inactive", respectively)
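For reference, my reading of the P1/P2 settings above in strongSwan swanctl.conf terms (my own mapping for clarity, not pulled from the DUT; the connection name is made up):

```
connections {
  mobile {
    # IKEv1, aggressive mode (x-auth + PSK handled separately)
    version = 1
    aggressive = yes
    # P1: AES-CBC 128, SHA1, DH group 14 (modp2048)
    proposals = aes128-sha1-modp2048
    children {
      mobile {
        # P2: ESP, AES-CBC 128, SHA1; no DH group appended since PFS is off
        esp_proposals = aes128-sha1
      }
    }
  }
}
```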
DUT CPU:
Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
12 CPUs: 1 package(s) x 6 core(s) x 2 hardware threads
Supported HW crypto, as reported on v2.4.5p1: AES-CBC,AES-XTS,AES-GCM,AES-ICM
Supported HW crypto, as reported on v2.5.0: AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS

The client, an M1 MBP 13" with a wired USB gigabit ethernet dongle, connects to the DUT through a few switches on the "LAN" network. The iperf3 server is virtualized on ESXi, connected to the DUT through a few switches on the "WAN" network. Other than the DUT, no other devices are routing this traffic.
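Comparing the two advertised crypto lists, the only change in 2.5.0 is AES-CCM showing up. A quick sanity check of the two strings (copied verbatim from above):

```shell
# Diff the advertised HW crypto capability strings between versions
OLD="AES-CBC,AES-XTS,AES-GCM,AES-ICM"
NEW="AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS"
echo "$OLD" | tr ',' '\n' | sort > /tmp/old.txt
echo "$NEW" | tr ',' '\n' | sort > /tmp/new.txt
# Show modes present only in the 2.5.0 list
comm -13 /tmp/old.txt /tmp/new.txt
# prints: AES-CCM
```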
Network layout is thus roughly:
client device <-> switches <-> DUT <-> switches <-> iperf3 server

Since my original observations were with iperf3 in "reverse" mode, I've stuck with that for these tests. Otherwise, I've kept the iperf3 test case as simple as possible.
Server: 'iperf3 -s'
Client: 'iperf3 -R -c <server>'

Summary output of each iperf3 run, per hardware and async crypto settings and pfSense version:
2.4.5-r-p1:
AES-NI hw crypto off, async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 442 MBytes 371 Mbits/sec 70 sender
[ 7] 0.00-10.00 sec 441 MBytes 370 Mbits/sec receiver

AES-NI hw crypto off, async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 1.01 GBytes 867 Mbits/sec 83 sender
[ 7] 0.00-10.00 sec 1.01 GBytes 864 Mbits/sec receiver

AES-NI hw crypto on, async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 443 MBytes 371 Mbits/sec 55 sender
[ 7] 0.00-10.00 sec 442 MBytes 370 Mbits/sec receiver

AES-NI hw crypto on, async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 1.01 GBytes 870 Mbits/sec 64 sender
[ 7] 0.00-10.00 sec 1.01 GBytes 867 Mbits/sec receiver

w/o VPN:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 1.04 GBytes 891 Mbits/sec 68 sender
[ 7] 0.00-10.00 sec 1.03 GBytes 887 Mbits/sec receiver

2.5.0-r:
AES-NI hw crypto off, async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 474 MBytes 398 Mbits/sec 67 sender
[ 7] 0.00-10.00 sec 473 MBytes 397 Mbits/sec receiver

AES-NI hw crypto off, async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 386 MBytes 324 Mbits/sec 395 sender
[ 7] 0.00-10.00 sec 382 MBytes 321 Mbits/sec receiver

AES-NI hw crypto on, async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 474 MBytes 397 Mbits/sec 77 sender
[ 7] 0.00-10.00 sec 472 MBytes 396 Mbits/sec receiver

AES-NI hw crypto on, async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 387 MBytes 324 Mbits/sec 663 sender
[ 7] 0.00-10.00 sec 383 MBytes 321 Mbits/sec receiver

w/o VPN:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 1.06 GBytes 914 Mbits/sec 46 sender
[ 7] 0.00-10.00 sec 1.06 GBytes 911 Mbits/sec receiver

2.5.1.rc.20210318.0300:
AES-NI hw crypto off, async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 477 MBytes 400 Mbits/sec 71 sender
[ 7] 0.00-10.00 sec 475 MBytes 398 Mbits/sec receiver

AES-NI hw crypto off, async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 386 MBytes 324 Mbits/sec 171 sender
[ 7] 0.00-10.00 sec 384 MBytes 322 Mbits/sec receiver

AES-NI hw crypto on, async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 474 MBytes 398 Mbits/sec 125 sender
[ 7] 0.00-10.00 sec 473 MBytes 397 Mbits/sec receiver

AES-NI hw crypto on, async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 385 MBytes 323 Mbits/sec 357 sender
[ 7] 0.00-10.00 sec 383 MBytes 321 Mbits/sec receiver

w/o VPN:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 1.05 GBytes 904 Mbits/sec 51 sender
[ 7] 0.00-10.00 sec 1.05 GBytes 901 Mbits/sec receiver

All other tests were done with just AES-NI flipped on/off; BSD crypto was never enabled. The following is with both AES-NI and BSD crypto enabled, just to be sure (still on 2.5.1rc). Once enabled, the DUT was rebooted to ensure the modules had been loaded.
AES-NI and BSD crypto enabled:
async off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 476 MBytes 399 Mbits/sec 76 sender
[ 7] 0.00-10.00 sec 475 MBytes 398 Mbits/sec receiver

async on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 387 MBytes 324 Mbits/sec 360 sender
[ 7] 0.00-10.00 sec 383 MBytes 321 Mbits/sec receiver

And, since there have been a few cases reported of the upgrade causing some funky configs, I went through each part of the IPSEC config, re-saved it, then retested (AES-NI and BSD crypto enabled).
async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 477 MBytes 400 Mbits/sec 108 sender
[ 7] 0.00-10.00 sec 476 MBytes 399 Mbits/sec receiver

async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 385 MBytes 323 Mbits/sec 305 sender
[ 7] 0.00-10.00 sec 382 MBytes 320 Mbits/sec receiver

For kicks, I gave in and tested AES-CBC 256 (v2.5.1.rc):
async crypto off:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 429 MBytes 360 Mbits/sec 57 sender
[ 7] 0.00-10.00 sec 428 MBytes 359 Mbits/sec receiver

async crypto on:
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-10.00 sec 344 MBytes 288 Mbits/sec 308 sender
[ 7] 0.00-10.00 sec 341 MBytes 286 Mbits/sec receiver

So far, I haven't found a configuration or external reason for the drop in speeds. Where I originally ran into this, rolling back to a snapshot taken prior to the upgrade (i.e., back to 2.4.5p1) immediately restored the expected throughput. Since I can replicate this on completely different hardware, and without pfSense virtualized, this strongly suggests it isn't hardware related.
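To put rough numbers on it, here are the percent changes between the receiver bitrates quoted above (867 -> 321 Mbit/s with async on, 370 -> 397 Mbit/s with async off, hw crypto off in both cases):

```shell
# Percent change between 2.4.5p1 and 2.5.x receiver bitrates quoted above
awk 'BEGIN {
  printf "async on:  %.0f%% drop (867 -> 321 Mbit/s)\n", (1 - 321/867) * 100
  printf "async off: %.0f%% gain (370 -> 397 Mbit/s)\n", (397/370 - 1) * 100
}'
# prints:
# async on:  63% drop (867 -> 321 Mbit/s)
# async off: 7% gain (370 -> 397 Mbit/s)
```

So async on has gone from roughly doubling throughput to more than halving it, while the single-core (async off) path actually improved slightly.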
Happy to test just about anything with the DUT; it'll just take a few days to turn around, as I'm not always on-site with it.
Thanks!