Weird - OpenSSL running better without hardware crypto?
-
I've dabbled with pfSense on and off over the last few years, and finally moved house and sorted a decent network. We have 200/20 cable and I installed the latest pfSense on an APU2C4 with an mSATA SSD. All is working great after a few basic tweaks (thanks to guides on here), but I noticed something weird when testing openssl speeds.
The board's CPU supports AES, and on IPFire (which I also tested) I got better results than pfSense. IPFire is Linux, so no biggie, but I did notice something about pfSense that confused me. When testing with crypto disabled (none) in the webUI, I get better throughput than when I enable it (AES-NI). That doesn't make sense, right? Here are some results using openssl speed -elapsed -evp aes-128-cbc
IPFire Core 107
Crypto hardware disabled:
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128 cbc     15113.74k   16143.35k   16541.54k   42767.93k   43560.07k
Crypto hardware enabled:
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128-cbc    124401.38k  166706.39k  202282.67k  215010.65k  218371.41k
Makes sense. Using the AES instructions of the CPU we see a big increase.
Latest pfSense
Crypto hardware disabled (webUI):
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128-cbc    115278.96k  164246.85k  200432.64k  214334.81k  216842.24k
Not quite IPFire fast but definitely in the right ballpark.
Crypto hardware enabled (webUI), without the -evp parameter:
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128 cbc     14669.95k   15897.94k   16247.14k   41517.74k   42207.91k
Crypto hardware enabled (webUI), with the -evp parameter:
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128-cbc      1463.93k    5774.74k   21920.26k   63687.00k  156327.94k
This is literally back to front, isn't it? :-\ Shouldn't enabling AES-NI in the webUI increase speeds and not cripple them? Or have I missed something? Thanks in advance guys.
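To put a number on how "back to front" the two pfSense -evp runs are, here is a quick Python sketch that divides one row by the other. The figures are copied straight from the results above (the "k" suffix in openssl speed output means thousands of bytes per second); the names BLOCK_SIZES, DISABLED, ENABLED and slowdown_factors are only made up for this illustration.

```python
# Throughput figures copied from the pfSense results above, in units of
# 1000 bytes per second (the "k" suffix printed by `openssl speed`).
BLOCK_SIZES = [16, 64, 256, 1024, 8192]

# pfSense, crypto hardware disabled in the webUI (aes-128-cbc row)
DISABLED = [115278.96, 164246.85, 200432.64, 214334.81, 216842.24]
# pfSense, crypto hardware enabled in the webUI, run with -evp
ENABLED = [1463.93, 5774.74, 21920.26, 63687.00, 156327.94]


def slowdown_factors(fast, slow):
    """Return fast/slow for each block size, in the same order as the inputs."""
    return [f / s for f, s in zip(fast, slow)]


if __name__ == "__main__":
    for size, factor in zip(BLOCK_SIZES, slowdown_factors(DISABLED, ENABLED)):
        print(f"{size:>5} bytes: {factor:6.1f}x slower with the webUI option enabled")
```

The gap is largest at 16-byte blocks and shrinks steadily as the blocks get bigger, so whatever is slowing things down seems to cost the most on small blocks.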
-
It's been said in other threads around here that OpenSSL/VPN can directly access the AES-NI hardware when it's available. In that case, adding a kernel module would add more overhead, hence the slower results when the kernel module is enabled (through the web UI). I can't speak from personal experience, just restating what has been said elsewhere here in various threads showing OpenSSL results.
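As a rough illustration of the overhead idea, here is a small Python sketch that fits a very simple model to the "enabled, with -evp" row posted above: time per call = fixed overhead + bytes / bandwidth. The model itself is an assumption made for this example, not something measured, and it uses only two data points (the 16-byte and 8192-byte columns). The function names are hypothetical.

```python
def per_call_seconds(block_bytes, kbytes_per_sec):
    """Time for one call on block_bytes, given throughput in 1000 bytes/s."""
    return block_bytes / (kbytes_per_sec * 1000.0)


def fit_overhead(small, large):
    """Two-point fit of t(b) = overhead + b / bandwidth (an assumed model).

    small and large are (block_bytes, kbytes_per_sec) pairs.
    Returns (overhead_seconds, bandwidth_bytes_per_sec).
    """
    (b1, r1), (b2, r2) = small, large
    t1 = per_call_seconds(b1, r1)
    t2 = per_call_seconds(b2, r2)
    seconds_per_byte = (t2 - t1) / (b2 - b1)
    overhead = t1 - b1 * seconds_per_byte
    return overhead, 1.0 / seconds_per_byte


if __name__ == "__main__":
    # pfSense, webUI option enabled, -evp: 16-byte and 8192-byte columns
    overhead, bandwidth = fit_overhead((16, 1463.93), (8192, 156327.94))
    print(f"fixed cost per call: {overhead * 1e6:.1f} microseconds")
    print(f"asymptotic bandwidth: {bandwidth / 1e6:.0f} MB/s")
```

On those two points the fit comes out at roughly 11 microseconds of fixed cost per call and a bit under 200 MB/s once that cost is amortised, which is in the same ballpark as the ~217 MB/s seen with the option disabled. That would be consistent with a per-call overhead, though it doesn't prove where the overhead comes from.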
-
@virgiliomi:
It's been said in other threads around here that OpenSSL/VPN can directly access the AES-NI hardware when it's available. In that case, adding a kernel module would add more overhead, hence the slower results when the kernel module is enabled (through the web UI). I can't speak from personal experience, just restating what has been said elsewhere here in various threads showing OpenSSL results.
I see, thanks. Everything I'd read so far (threads, wiki, Mastering pfSense) suggested that the module would either work or do nothing and so enabling it was recommended. I didn't consider that it would actually decrease performance that way. Thanks again… Still learning. :)
-
There was a big thread not too long ago discussing this very thing. Search for it, I don't have it bookmarked. Maybe someone else will post a link to it.
-
Here are the threads.
https://forum.pfsense.org/index.php?topic=121141.0
https://forum.pfsense.org/index.php?topic=115627.0
-
On less powerful hardware, such as the PC Engines APU series, you may see different results under Linux, because it is more agile and/or is coded closer to the hardware and better supplied with vendor drivers than FreeBSD is. On more powerful hardware this may not be an issue, and the results should come out much closer together.
Anyway, Linux is not FreeBSD or BSD-like! It may be Unix-like, but it is not the same, so please don't expect the two to behave alike. Both are fruit, but comparing apples with pears is not very wise either.
If high encryption throughput is really needed, be sure to get hardware that comes with AES-NI and use IPsec (AES-GCM) instead of OpenVPN, or get a CPU at ~3.0GHz with two to four cores, such as an Intel Core i processor or a smaller Xeon E3.
-
Also note that OpenVPN is currently single-threaded.
If your goal is to connect to PIA or the like, then you need to run multiple clients to take advantage of multi-core CPUs.
-
Thanks again for the replies everyone. I'm learning a lot (and reading Mastering pfSense to try to get the best out of it). I'm currently considering upgrading the APU2C4 to a Dell PowerEdge T20 (Xeon E3-1225 v3), which should handle our line speed for VPN easily. The APU handles our 200Mbps ISP speed (no VPN) without even breaking a sweat, just a few % CPU, but it'd be nice to offload the VPN to the router/firewall rather than having multiple locally connected clients at home.