OpenVPN bsdcrypto acceleration



  • I'm running across something strange. When I run openssl speed -evp aes-128-cbc, I get 121k. When I run openssl speed -evp aes-128-cbc -engine cryptodev, I get the same result. I thought the BSD crypto engine was a software accelerator? Any help is appreciated. Thanks.


  • Banned

    And? You have some HW accelerator on your system?



  • I don't have a hardware accelerator on my system. I read the pfSense post on VPN accelerators and am trying to decide whether to buy this: http://store.netgate.com/-P40.aspx with a VPN1411, or an Atom 1.8 GHz box with 4 GB of RAM, to build an SMB router. I mainly need to stream CCTV video over OpenVPN. I have a 1.8 GHz Atom now but only achieve very slow benchmarks, 27k, compared to a Neoware 1 GHz, 1 GB RAM system. Is the 2D13 with an accelerator a better system for VPN performance than an Atom?


  • Banned

    Well, if you do not have an accelerator, then you will obviously get the same result. This is on the exact same board you linked:

    openssl speed -evp aes-128-cbc

    
    To get the most accurate results, try to run this
    program when this computer is idle.
    Doing aes-128-cbc for 3s on 16 size blocks: 77880 aes-128-cbc's in 0.13s
    Doing aes-128-cbc for 3s on 64 size blocks: 79122 aes-128-cbc's in 0.11s
    Doing aes-128-cbc for 3s on 256 size blocks: 67042 aes-128-cbc's in 0.08s
    Doing aes-128-cbc for 3s on 1024 size blocks: 48649 aes-128-cbc's in 0.09s
    Doing aes-128-cbc for 3s on 8192 size blocks: 11248 aes-128-cbc's in 0.01s
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc       9330.79k    44088.70k   202735.21k   550313.46k 12985289.74k
    
    

    openssl speed -evp aes-128-cbc -engine cryptodev

    
    engine "cryptodev" set.
    To get the most accurate results, try to run this
    program when this computer is idle.
    Doing aes-128-cbc for 3s on 16 size blocks: 82417 aes-128-cbc's in 0.03s
    Doing aes-128-cbc for 3s on 64 size blocks: 80139 aes-128-cbc's in 0.07s
    Doing aes-128-cbc for 3s on 256 size blocks: 69611 aes-128-cbc's in 0.09s
    Doing aes-128-cbc for 3s on 1024 size blocks: 48680 aes-128-cbc's in 0.08s
    Doing aes-128-cbc for 3s on 8192 size blocks: 11247 aes-128-cbc's in 0.00s
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc      52340.72k    73602.20k   202518.54k   610452.37k 80397403.14k
    
    


  • Which board, the ALIX? If so, then it smokes my dual-core 1.8 GHz Atom. I was only getting 27k on the Atom with 4 GB of RAM.


  • Banned

    Yeah, Alix.2D13. Note that just the aes-128 is supported, though.

    
    glxsb0: <AMD Geode LX Security Block (AES-128-CBC, RNG)> mem 0xefff4000-0xefff7f:
    

    There are some figures for a hifn accelerator here: http://store.netgate.com/Soekris-VPN1411-Crypto-accelerator-P319.aspx

    IPsec maximum throughput on ALIX boards without and with vpn1411*:

    3DES: 13.7 Mbps vs 34.6 Mbps
    AES: 19.4 Mbps vs 34.2 Mbps
    AES256: 13.5 Mbps vs 34.2 Mbps



  • Obviously you probably don't run Snort on a CF card, do you? How many concurrent OpenVPN connections have you had? Have you noticed a performance drop, or do you think highly of the ALIX board?


  • Banned

    I do NOT run snort, ever. Anywhere. Period. :P (For the Alix, absolutely a no go anyway.)

    As for OpenVPN, it's pretty good for what the HW is used for. All I need is a couple of users connected via OVPN or IPsec using some DB servers on LAN, though. I haven't done any bandwidth benchmarks, frankly; not needed for me.

    For busy sites, well this one should rock: http://www.pcengines.ch/apu.htm - however "Production expected for early 2014"  :( :'(



  • Do you not like snort? I know it probably wouldn't work well on the cf card, but do you have a reason not to use it?  So overall you are satisfied with the alix?  I just need those for smaller networks supporting 10 or less users with cctv over openvpn at night.


  • Banned

    @newbieuser1234:

    Do you not like snort? I know it probably wouldn't work well on the cf card, but do you have a reason not to use it?

    The CF is not the main problem. The CPU/RAM definitely is. Otherwise, beyond being the ultimate source of all sorts of cryptic breakage, requiring 24/7 babysitting and endless tuning and disabling of the broken rules, I'm pretty sure it's excellent software.  ::)



  • 10-4. Do you use pfblocker?


  • Netgate Administrator

    You should be able to get at least 50Mbps of VPN from that Atom board, probably more, at least without anything else running. See this post:

    http://forum.pfsense.org/index.php/topic,27780

    Steve



  • Thanks. I wonder why mine is so slow. I have glxsb or whatever checked and the option for bsdcryptoengine is selected in the openvpn server settings.


  • Netgate Administrator

    Glxsb won't help you on an Atom, it's a Geode specific hardware driver.
    Are you actually seeing very bad VPN throughput or just bad results from openssl speed?

    Steve



  • No complaints, just the speed test for openssl. Both of the boxes I tested are running a magnetic HD; would a CF card unit return a faster speed?


  • Netgate Administrator

    It shouldn't make any difference to either real VPN throughput or openssl speed results.

    Steve



  • I don't see the Geode recognized in the dmesg output. Do I need to have 64-bit? This is weird.

    cryptosoft0: <software crypto> on motherboard
    padlock0: No ACE support.

    There is no glxsb entry like the one described in this post either:

    "Boards utilizing the AMD Geode platform typically have the "AMD Geode LX Security Block" which supports certain encryption types. It will show up in dmesg as the glxsb device:"  glxsb0: <AMD Geode LX Security Block (AES-128-CBC, RNG)> mem 0xefff4000-0xefff7fff irq 9 at device 1.2 on pci0
    http://doc.pfsense.org/index.php/Are_cryptographic_accelerators_supported


  • Netgate Administrator

    This is on your Atom box yes? Then that's expected, there's no hardware crypto.

    Steve



  • So essentially the dual core is slower for openvpn than an Alix 2d3 with a vpn1411 accelerator ?


  • Netgate Administrator

    Well I would say no because of Databeestje's test report on the D510. He was seeing >50Mbps VPN traffic in one direction. The Alix can't manage that even with the Hifn accelerator.

    You haven't posted a complete output from openssl speed yet. That might show something.
    Coincidentally, I have been playing around with an old Firebox this evening, testing its Safenet crypto card. I've found some interesting things. Here's some output for comparison:

    Without the Safenet 1141.

    [2.0.3-RELEASE][root@pfSense.localdomain]/root(1): openssl speed -evp aes-128-cbc
    
    Doing aes-128-cbc for 3s on 16 size blocks: 4443103 aes-128-cbc's in 2.89s
    Doing aes-128-cbc for 3s on 64 size blocks: 1258138 aes-128-cbc's in 2.91s
    Doing aes-128-cbc for 3s on 256 size blocks: 318359 aes-128-cbc's in 2.87s
    Doing aes-128-cbc for 3s on 1024 size blocks: 80907 aes-128-cbc's in 2.89s
    Doing aes-128-cbc for 3s on 8192 size blocks: 10450 aes-128-cbc's in 2.98s
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx) 
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc      24627.37k    27709.88k    28411.35k    28646.12k    28707.23k
    
    

    With the card:

    [2.0.3-RELEASE][root@pfSense.localdomain]/root(13): openssl speed -evp aes-128-cbc
    
    Doing aes-128-cbc for 3s on 16 size blocks: 117285 aes-128-cbc's in 0.14s
    Doing aes-128-cbc for 3s on 64 size blocks: 110095 aes-128-cbc's in 0.05s
    Doing aes-128-cbc for 3s on 256 size blocks: 93032 aes-128-cbc's in 0.04s
    Doing aes-128-cbc for 3s on 1024 size blocks: 56316 aes-128-cbc's in 0.05s
    Doing aes-128-cbc for 3s on 8192 size blocks: 8643 aes-128-cbc's in 0.00s
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx) 
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc      13690.32k   156398.83k   538937.61k  1147202.67k 70803456.00k
    
    

    The numbers make it look as though the card speeds things up massively, but in reality my testing has shown that the box performs better, for OpenVPN at least, without the card in it. Moreover, the card has to actually be removed from the box. No amount of selecting 'no hardware encryption' had any effect, which is how the OCF is supposed to work as I understand it. The wiki page explains this somewhat by saying that in reality VPN traffic is small blocks of data, so the really big numbers are not any help.

    Steve
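
One way to sanity-check figures like those above is to recompute throughput from the raw "Doing ..." lines using the 3-second run length instead of the reported time. The outputs say "timing function used: getrusage", so the time on each line looks like CPU time rather than elapsed time, and with a crypto card most of the work happens off the CPU. The Python sketch below assumes the "for 3s" is wall-clock time; the function names are made up for illustration, and the sample lines are copied from the two Firebox runs above. (openssl speed also has an -elapsed option that should report wall-clock figures directly, if your build supports it.)

```python
import re

# Matches lines such as:
#   Doing aes-128-cbc for 3s on 16 size blocks: 117285 aes-128-cbc's in 0.14s
LINE_RE = re.compile(
    r"Doing (\S+) for (\d+)s on (\d+) size blocks: (\d+) \S+ in ([\d.]+)s"
)

def parse_speed_lines(text):
    """Extract run length, block size, operation count and reported time."""
    rows = []
    for m in LINE_RE.finditer(text):
        rows.append({
            "cipher": m.group(1),
            "wall_s": int(m.group(2)),    # assumed to be wall-clock seconds
            "block": int(m.group(3)),
            "count": int(m.group(4)),
            "cpu_s": float(m.group(5)),   # time reported by openssl speed
        })
    return rows

def wall_clock_kbytes_per_s(row):
    """Throughput in 1000s of bytes/s, based on the run length."""
    return row["count"] * row["block"] / row["wall_s"] / 1000.0

def reported_kbytes_per_s(row):
    """Throughput based on the reported time; None if that time is 0.00s."""
    if row["cpu_s"] == 0:
        return None
    return row["count"] * row["block"] / row["cpu_s"] / 1000.0

# Sample lines copied from the thread (16-byte and 8192-byte blocks only).
WITH_CARD = """
Doing aes-128-cbc for 3s on 16 size blocks: 117285 aes-128-cbc's in 0.14s
Doing aes-128-cbc for 3s on 8192 size blocks: 8643 aes-128-cbc's in 0.00s
"""
WITHOUT_CARD = """
Doing aes-128-cbc for 3s on 16 size blocks: 4443103 aes-128-cbc's in 2.89s
Doing aes-128-cbc for 3s on 8192 size blocks: 10450 aes-128-cbc's in 2.98s
"""

for label, text in (("with card", WITH_CARD), ("without card", WITHOUT_CARD)):
    for row in parse_speed_lines(text):
        print(label, row["block"], "bytes:",
              round(wall_clock_kbytes_per_s(row), 1), "k wall-clock,",
              reported_kbytes_per_s(row), "k by reported time")
```

Under that assumption, the 16-byte case works out to roughly 626k with the card versus roughly 23,697k without it, and the card is still a little behind at 8192 bytes. That would be in line with the box passing more OpenVPN traffic once the card is removed.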



  • $ openssl speed -evp aes-128-cbc
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type            16 bytes    64 bytes    256 bytes  1024 bytes  8192 bytes
    aes-128-cbc      20742.63k    22943.08k    23652.34k    23832.40k    23883.13k

    I get 7Mb on a 1 GHz, 1 GB of RAM VIA Neoware box. Smokes my dual core…

    $ openssl speed -evp aes-128-cbc
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type            16 bytes    64 bytes    256 bytes  1024 bytes  8192 bytes
    aes-128-cbc      38135.72k  190433.15k  884307.27k  2274631.30k  4013679.73k

    $ openssl speed -evp aes-128-cbc -engine via
    OpenSSL 0.9.8y 5 Feb 2013
    built on: date not available
    options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
    compiler: cc
    available timing options: USE_TOD HZ=128 [sysconf value]
    timing function used: getrusage
    The 'numbers' are in 1000s of bytes per second processed.
    type            16 bytes    64 bytes    256 bytes  1024 bytes  8192 bytes
    aes-128-cbc      46147.44k  189884.01k  676914.89k  3349549.48k  6945314.10k

    Any way to test actual throughput?  iperf?


  • Netgate Administrator

    It should do. The VIA probably has the PadLock encryption engine built in. But like I said above, numbers aren't everything.  ;)

    Steve



  • How are you testing throughput?


  • Netgate Administrator

    A ridiculously long chain of machines!  :D
    An OpenVPN connection between two machines, the box I'm testing and one that's much more powerful to guarantee it's not slowing things down. I establish the VPN and then run iperf using the powerful end as the server and a laptop behind the test box as a client.
    I saw ~25Mbps with various encryption types with the card but ~33Mbps once I removed it.

    Steve



  • Two machines on the same router, but on different interfaces?


  • Netgate Administrator

    Yes as it happens they are connected via separate interfaces on my home router. They could just as easily have been connected directly though.

    Steve



  • Thanks for the info. I'll try it out and see what results I get. What setup do you like for the best bang for your buck for SMB users, 10 or fewer?



  • I did your test with OpenVPN and iperf on separate interfaces. I got around 70 Mbits/sec, far better than what the openssl test showed. Weird stuff. Thanks for your help. Both the server and client were running in VMs, so that may have slowed it down a bit too; not sure. I will try with standalone machines next.
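
For comparing the two kinds of numbers: the openssl speed output states that its figures are in 1000s of bytes per second, while iperf reports bits per second. A rough Python sketch of the conversion follows; the function name is hypothetical and the inputs are two of the 8192-byte software-only figures posted earlier in the thread. The cipher-only rate is best read as an upper bound, since OpenVPN also has to authenticate packets and move them through the tunnel device.

```python
def openssl_k_to_mbps(k_value):
    """Convert an openssl speed figure (1000s of bytes/s) to Mbit/s."""
    return k_value * 1000 * 8 / 1_000_000

# 8192-byte aes-128-cbc figures quoted earlier in the thread (no accelerator).
samples = {
    "first posted run (23883.13k)": 23883.13,
    "Firebox without card (28707.23k)": 28707.23,
}

for label, k_value in samples.items():
    print(label, "->", round(openssl_k_to_mbps(k_value), 1), "Mbit/s")
```

Read that way, the two software-only runs come to roughly 191 and 230 Mbit/s of raw AES, so 70 Mbit/s through OpenVPN would sit below the cipher-only figure rather than above it.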


  • Netgate Administrator

    Ah well there you go.  :)
    About twice as fast as my Pentium 3 era Celeron 1200.
    I was running 'top' on the console of the test box to make sure it was running at 100% and could not pass more traffic. Also I tested the connection outside the VPN to make sure I wasn't being restricted by something else in the route. However, if that's possible, you have to be sure that the test traffic is actually using the VPN!  ;) I did that by using the WAN interface on the remote box to test the route and the LAN to test the VPN. The LAN address is only accessible over the VPN.

    Steve



  • I'll try the LAN/WAN test next. I tried it with the client and server on different LAN interfaces that couldn't talk to each other except over the VPN. When I disconnected the VPN and tried without it on the same LAN, I got 250Mb. Is that normal for a gig interface? Maybe the VM was limiting it some? They were on the same switch.


  • Netgate Administrator

    I would expect more from an Atom with Gigabit interfaces. Something >500Mbps.
    It's not clear exactly how you had the test setup connected. If that's between two VMs connected to the same switch, I would expect near Gigabit results, since the traffic would not be going through the pfSense box at all.

    It's very easy to overlook something and end up testing the wrong thing in these sorts of tests.

    Steve

