AES-NI not working?
-
Intel(R) Core(TM) i5 CPU 660 @ 3.33GHz
(cryptodev) BSD cryptodev engine
[RSA, DSA, DH, AES-128-CBC, AES-192-CBC, AES-256-CBC]
[ available ]
(rsax) RSAX engine support
[RSA]
[ available ]
cryptotest -a aes 100000 100000
23.461 sec, 200000 aes crypts, 100000 bytes, 852493443 byte/sec, 6504.0 Mb/sec
/usr/local/bin/openssl speed -evp aes-128-cbc -engine cryptodev -multi 4
OpenSSL 1.0.1c 10 May 2012
evp 33879.67k 137175.74k 474658.63k 1254087.68k 1675531.61k
/usr/local/bin/openssl speed -evp aes-256-cbc -engine cryptodev -multi 4
evp 33888.18k 135526.57k 447022.51k 1109458.88k 1423601.97k
-
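For reference, the cryptotest line in the first post can be sanity-checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 2^20 divisor is inferred from the reported numbers, not taken from the cryptotest source.

```python
def bytes_per_sec(crypts: int, size: int, seconds: float) -> float:
    """Total bytes processed divided by elapsed time."""
    return crypts * size / seconds


def to_mbit_binary(byte_rate: float) -> float:
    """Convert bytes/sec to megabits/sec, assuming 1 Mb = 2**20 bits."""
    return byte_rate * 8 / 2**20


if __name__ == "__main__":
    reported_rate = 852493443  # byte/sec, as printed by cryptotest above
    print(f"reported rate:   {to_mbit_binary(reported_rate):.1f} Mb/sec")

    # Recompute from the other printed fields (time is rounded to 3 decimals).
    recomputed = bytes_per_sec(crypts=200000, size=100000, seconds=23.461)
    print(f"recomputed rate: {recomputed:.0f} byte/sec")
```

With the 2^20 assumption the reported byte rate works out to the same 6504.0 Mb/sec that cryptotest printed. The byte rate recomputed from 200000 crypts of 100000 bytes in 23.461 sec lands within a small fraction of a percent of the reported value, presumably because the printed time is rounded.
-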
Input from my machine: a virtualized pfSense on ESXi 5.1. (AES-NI works in another Win7 guest, so it is passed through correctly.)
ESXi host specs:
Xeon 1220
32 GB RAM
Intel NICs
pfSense guest specs:
2 cores
1 GB RAM
VMXNET3 NICs
Before kldload aesni
[2.1-BETA1][admin@pfsense.localdomain]/root(1): /usr/bin/openssl speed -evp aes-128-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
To get the most accurate results, try to run this
program when this computer is idle.
Doing aes-128-cbc for 3s on 16 size blocks: 25200854 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 7556040 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 256 size blocks: 1974553 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 506622 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 63906 aes-128-cbc's in 3.01s
OpenSSL 0.9.8q 2 Dec 2010
built on: date not available
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: cc
available timing options: USE_TOD HZ=128 [sysconf value]
timing function used: gettimeofday
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     134377.68k   160686.63k   167961.52k   172378.58k   173953.57k
[2.1-BETA1][admin@pfsense.localdomain]/root(3): /usr/local/bin/openssl speed -evp aes-128-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 111268869 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 30363529 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 256 size blocks: 7753535 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 1944836 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 243389 aes-128-cbc's in 3.01s
OpenSSL 1.0.1c 10 May 2012
built on: Sun Jan 27 13:08:29 EST 2013
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -pthread -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -DL_ENDIAN -DTERMIOS -O3 -DMD32_REG_T=int -Wall -O2 -pipe -fno-strict-aliasing -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     593433.97k   646072.80k   659916.45k   662113.10k   662887.96k
After kldload aesni
[2.1-BETA1][admin@pfsense.localdomain]/root(5): /usr/bin/openssl speed -evp aes-128-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
To get the most accurate results, try to run this
program when this computer is idle.
Doing aes-128-cbc for 3s on 16 size blocks: 2914003 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 2776488 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 256 size blocks: 2127090 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 1097708 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 129159 aes-128-cbc's in 3.01s
OpenSSL 0.9.8q 2 Dec 2010
built on: date not available
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: cc
available timing options: USE_TOD HZ=128 [sysconf value]
timing function used: gettimeofday
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      15517.00k    59045.34k   180937.99k   373499.22k   351573.93k
[2.1-BETA1][admin@pfsense.localdomain]/root(6): /usr/local/bin/openssl speed -evp aes-128-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 2870466 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 2702743 aes-128-cbc's in 3.02s
Doing aes-128-cbc for 3s on 256 size blocks: 2093458 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 1087780 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 130583 aes-128-cbc's in 3.01s
OpenSSL 1.0.1c 10 May 2012
built on: Sun Jan 27 13:08:29 EST 2013
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -pthread -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -DL_ENDIAN -DTERMIOS -O3 -DMD32_REG_T=int -Wall -O2 -pipe -fno-strict-aliasing -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      15309.15k    57359.77k   178177.74k   370331.17k   355652.47k
I can add that I have also tested actual VPN performance. The results:
Speed was measured with iperf between two Windows 7 machines, one on the LAN and one on the WAN.
If I just route between the two nets without a tunnel, the speeds are well above Gbit speed. CPU usage = ~75%
If I use the VPN tunnel with AES-128, the speed is around 300 Mbit (about the same with the BSD cryptodev engine, with no hardware, and with the RSAX engine). CPU usage ~40%
If I use the VPN tunnel with NO encryption, the speed is still around 300 Mbit.
Not really sure why the speed tops out at 300 Mbit as soon as the tunnel is used.
Hope this helps!
Let me know if I should test something else.
-
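To put the benchmark numbers next to the iperf result, the `openssl speed` figures can be converted to Mbit/s. A rough sketch, using the 16-byte and 8192-byte columns copied from the output above and the ~300 Mbit tunnel figure; the names are illustrative only, and megabits are taken as 10^6 bits here.

```python
# Figures copied from the `openssl speed` summary rows above,
# in openssl's unit of 1000 bytes per second ("k").
SPEED_K = {
    "0.9.8q, aesni.ko not loaded": {16: 134377.68, 8192: 173953.57},
    "1.0.1c, aesni.ko not loaded": {16: 593433.97, 8192: 662887.96},
    "0.9.8q, aesni.ko loaded": {16: 15517.00, 8192: 351573.93},
    "1.0.1c, aesni.ko loaded": {16: 15309.15, 8192: 355652.47},
}

TUNNEL_MBIT = 300.0  # approximate iperf throughput through the VPN tunnel


def k_to_mbit(k_per_sec: float) -> float:
    """Convert '1000s of bytes per second' to megabits per second."""
    return k_per_sec * 1000 * 8 / 1_000_000


def headroom(k_per_sec: float, tunnel_mbit: float = TUNNEL_MBIT) -> float:
    """How many times faster the raw cipher is than the observed tunnel."""
    return k_to_mbit(k_per_sec) / tunnel_mbit


if __name__ == "__main__":
    for label, row in SPEED_K.items():
        print(
            f"{label:30s} "
            f"16 B: {k_to_mbit(row[16]):8.1f} Mbit/s  "
            f"8192 B: {k_to_mbit(row[8192]):8.1f} Mbit/s  "
            f"({headroom(row[8192]):.1f}x the tunnel)"
        )
```

Even the slowest 8192-byte figure is several times the ~300 Mbit seen through the tunnel, which fits the observation that the tunnel is just as slow with no encryption at all. The 16-byte column drops sharply once aesni.ko is loaded; a plausible explanation, not confirmed anywhere in this thread, is per-request overhead when small blocks are handed to the kernel.
-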
You might be hitting a general OpenVPN limit at some point there. Check the threads around the forum here; you might at least try this tweak:
http://forum.pfsense.org/index.php/topic,47567.0.html
Your numbers seem to coincide with the similar numbers from the previous tester as well.
Did you happen to try the VPN speed without aesni.ko loaded? Or just with and toggling the engine setting?
-
Actually, now that you say it: I only tested the VPN speed without aesni.ko loaded. I should test it with the module loaded.
I'll also check the thread with the tweak.
EDIT: I tested with aesni.ko loaded: no speed change. CPU usage might be higher, though I'm not entirely sure.
I also tested the IP fastforwarding tweak, which had no effect.
-
6 months later
-
Did anyone ever discover why there was no apparent change in performance with aes-ni enabled? I did a search for aes-ni and aesni but didn't see any further threads. I don't have a system with aes-ni on 2.1 yet.
-
I see you tested IPsec earlier and some OpenVPN. I would be interested in knowing the maximum throughput you might get with all 4 cores enabled, using 4 separate clients connecting to 1 server, each client on a different port with a separate OpenVPN instance for each. It's probably not part of your testing, but it would be interesting to know if it will saturate a gigabit interface.
As far as file transfers from one computer to another go, be careful that drive read/write speed isn't a bottleneck.
-
Did anyone ever discover why there was no apparent change in performance with aes-ni enabled? I did a search for aes-ni and aesni but didn't see any further threads. I don't have a system with aes-ni on 2.1 yet.
Not yet, mostly for lack of a good test setup. We're building up some test rigging/infrastructure to get some good throughput numbers for the new book and for other purposes and I believe some of that hardware does have AES-NI, so we may have better information in the coming months.
-
jimp, if I have time I will probably throw a 2.1 snapshot on one of the Dell R320 servers I have and see how the openssl test does on it. I assume it will show the same results as everyone else's, though.
kejianshi, I just don't have time to do that kind of testing right now.
-
A single test probably won't really tell us much. What we'd really need is a pair of identical systems, configured identically back-to-back (but with different IPs/subnets as needed), to see what kind of LAN-to-LAN throughput we can obtain through an active/live VPN in each of the test cases:
1. aesni.ko loaded, OpenVPN set to use cryptodev
2. aesni.ko loaded, OpenVPN set to "no hardware"
3. aesni.ko unloaded, OpenVPN set to use cryptodev
4. aesni.ko unloaded, OpenVPN set to "no hardware"
-
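The four cases listed above are just the cross product of two settings, so a small script can enumerate them and keep the results together. A sketch only: the throughput values have to come from real iperf runs, and `build_cases`, `record`, and the dictionary keys are hypothetical names, not anything provided by pfSense or OpenVPN.

```python
from itertools import product

AESNI_STATES = ("loaded", "unloaded")
OPENVPN_ENGINES = ("cryptodev", "no hardware")


def build_cases():
    """Return the four test cases in the same order as the numbered list."""
    return [
        {"case": n, "aesni_ko": state, "openvpn_engine": engine, "mbit": None}
        for n, (state, engine) in enumerate(
            product(AESNI_STATES, OPENVPN_ENGINES), start=1
        )
    ]


def record(cases, case_number, mbit):
    """Store a measured LAN-to-LAN throughput (Mbit/s) for one case."""
    for case in cases:
        if case["case"] == case_number:
            case["mbit"] = mbit
            return case
    raise ValueError(f"unknown case {case_number}")


if __name__ == "__main__":
    for case in build_cases():
        measured = (
            "not measured" if case["mbit"] is None else f"{case['mbit']:.0f} Mbit/s"
        )
        print(
            f"{case['case']}. aesni.ko {case['aesni_ko']}, "
            f"OpenVPN engine {case['openvpn_engine']!r}: {measured}"
        )
```

Keeping every case in one structure makes it harder to forget a combination, which matters here because the earlier tests in this thread toggled only one of the two settings at a time.
-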
My AMD FX-8150 at a remote site with aes-ni absolutely smokes my Intel CPUs without aes-ni in these openssl tests.
It's not even close.