AMD AES-NI performance issues? faster when off
-
Whenever I enable AES-NI acceleration on AMD platforms (both my Athlon 5150 and A4-5300), VPN performance drops a lot.
AES-NI ON
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      3410.06k    13240.97k    48273.89k   145732.61k   356466.69k
AES-NI OFF
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc    265019.69k   364898.52k   453860.48k   480797.35k   487650.65k
Anyone else having issues with AES-NI on AMD platforms?
I'm running pfsense 2.3.2.1 installed directly on the computer (no virtualization, but this also happens in virtualized environments).
thanks
-
afaik aes-ni will only help with aes-gcm
-
I read on this forum that it only works with cbc.
Anyway, I tried gcm and the results are the same with aes-ni on and off.
-
take a look at this
https://forum.pfsense.org/index.php?topic=115627.msg646775#msg646775
-
Yes, that explains the huge performance drop when the option is enabled.
thank you!
-
The AES-NI checkbox in the GUI enables AES-NI for AES 128/192/256 CBC via cryptodev. That means that for each block of data to encrypt, the openssl library will issue an ioctl to send that block to the kernel, suffering a context switch penalty. Since the computation being performed is exactly the same as what openssl would do without cryptodev (and in that case, without the context switch) it is necessarily slower; there is no advantage at all in enabling AES-NI via cryptodev. You do not see a penalty for GCM modes because those are not implemented in cryptodev and so openssl continues to use its internal routines.
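As a rough sanity check on the per-call penalty, here is a minimal Python sketch that fits a simple cost model to the AES-NI ON numbers from the first post. The model (a fixed overhead per call plus a constant per-byte cost) and the function names are my own assumptions for illustration, not anything taken from openssl or cryptodev, and the fixed overhead it estimates may include more than just the context switch.

```python
# Throughput from the first post's "AES-NI ON" row, converted to bytes/second
# (the "k" suffix in the output means 1000s of bytes per second).
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
CRYPTODEV_ON = [3410.06e3, 13240.97e3, 48273.89e3, 145732.61e3, 356466.69e3]


def time_per_call(block_size, throughput):
    """Seconds spent on one call that processes block_size bytes."""
    return block_size / throughput


def fit_overhead_model(sizes, throughputs):
    """Fit t(n) = overhead + n / rate through the smallest and largest sizes.

    Assumed model: every call pays a fixed cost (overhead, in seconds)
    plus a per-byte cost (1 / rate, where rate is in bytes/second).
    """
    t_small = time_per_call(sizes[0], throughputs[0])
    t_large = time_per_call(sizes[-1], throughputs[-1])
    seconds_per_byte = (t_large - t_small) / (sizes[-1] - sizes[0])
    overhead = t_small - sizes[0] * seconds_per_byte
    return overhead, 1.0 / seconds_per_byte


def predicted_throughput(block_size, overhead, rate):
    """Bytes/second the model predicts for a given block size."""
    return block_size / (overhead + block_size / rate)


overhead, rate = fit_overhead_model(BLOCK_SIZES, CRYPTODEV_ON)
print(f"fixed cost per call: {overhead * 1e6:.2f} microseconds")
print(f"underlying cipher rate: {rate / 1e6:.0f} million bytes/second")

# Compare the model against the block sizes that were not used for the fit.
for n, actual in zip(BLOCK_SIZES[1:-1], CRYPTODEV_ON[1:-1]):
    model = predicted_throughput(n, overhead, rate)
    print(f"{n:5d} bytes: measured {actual / 1e3:.2f}k, model {model / 1e3:.2f}k")
```

With these numbers the fitted fixed cost comes out at a few microseconds per call, and the underlying rate lands in the same range as the AES-NI OFF results at large block sizes. That is what you would expect if the cipher work is the same and the difference is a fixed cost per call.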
So why does the AES-NI kernel module exist at all? If you are using ipsec, which does all of its encryption in the kernel, then you need the AES-NI kernel module to let the ipsec module use AES-NI, and in that case it's a performance gain because everything is happening in-kernel. Ideally, pfsense would enable a configuration in which you can load aesni.ko for ipsec without loading cryptodev, so you can get the benefits without the drawbacks.
So when would you ever want cryptodev? The /dev/crypto interface is only worth using with external crypto processors, like the old via padlock or the hifn cards (though you're generally much better off just throwing out such hardware and buying something new if you care at all about crypto performance; the crypto accelerator on the old alix boards, for example, was about as fast as a new raspberry pi or an APU1 without hardware crypto, and an order of magnitude slower than an APU2). In theory it might also have a benefit for quick assist, but I think that's implemented in openssl in a way that avoids using /dev/crypto. There's been speculation over the years that cryptodev might help improve cpu utilization, but I haven't seen results on modern hardware where any speculative gain outweighs the performance penalty of the context switching overhead.