Netgate Discussion Forum

    AMD AES-NI performance issues? faster when off

    OpenVPN
      spyshagg

      Whenever I enable AES-NI acceleration on AMD platforms (both my Athlon 5150 and A4-5300), VPN performance drops a lot.

      AES-NI ON

      The 'numbers' are in 1000s of bytes per second processed.
      type                16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
      aes-128-cbc         3410.06k    13240.97k    48273.89k   145732.61k   356466.69k

      AES-NI OFF

      type                16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
      aes-128-cbc       265019.69k   364898.52k   453860.48k   480797.35k   487650.65k

      Is anyone else having issues with AES-NI on AMD platforms?

      I'm running pfSense 2.3.2.1 installed directly on the hardware (no virtualization), but this also happens in virtualized environments.

      thanks

        heper

        afaik aes-ni will only help with aes-gcm

          spyshagg

          I read on this forum that it only works with CBC.

          Anyway, I tried GCM and the results are the same with AES-NI on and off.

            mauroman33

            take a look at this

            https://forum.pfsense.org/index.php?topic=115627.msg646775#msg646775

              spyshagg

              Yes, that explains the huge performance drop when the option is enabled.

              thank you!

                VAMike

                The AES-NI checkbox in the GUI enables AES-NI for AES 128/192/256 CBC via cryptodev. That means that for each block of data to encrypt, the openssl library will issue an ioctl to send that block to the kernel, suffering a context switch penalty. Since the computation being performed is exactly the same as what openssl would do without cryptodev (and in that case, without the context switch) it is necessarily slower; there is no advantage at all in enabling AES-NI via cryptodev. You do not see a penalty for GCM modes because those are not implemented in cryptodev and so openssl continues to use its internal routines.
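
                As a rough sanity check of the context-switch explanation, the benchmark figures from the first post can be turned into time per call. This is only a sketch: it assumes each benchmark block corresponds to one call, and that "k" means 1000 bytes per second, as the benchmark output states. The function names are made up for illustration.

```python
# Throughput figures copied from the first post, in thousands of bytes per second.
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
AESNI_ON = [3410.06, 13240.97, 48273.89, 145732.61, 356466.69]
AESNI_OFF = [265019.69, 364898.52, 453860.48, 480797.35, 487650.65]


def microseconds_per_call(block_size, kbytes_per_second):
    """Time spent on one block, derived from the reported throughput."""
    return block_size / (kbytes_per_second * 1000.0) * 1e6


def per_call_overhead():
    """Extra microseconds per block with the checkbox on, for each block size."""
    return [
        microseconds_per_call(size, on) - microseconds_per_call(size, off)
        for size, on, off in zip(BLOCK_SIZES, AESNI_ON, AESNI_OFF)
    ]


if __name__ == "__main__":
    for size, extra in zip(BLOCK_SIZES, per_call_overhead()):
        print(f"{size:>5} bytes: about {extra:.2f} us extra per call")
```

                With these numbers the extra time per call stays within a few microseconds at every block size, even though the block size grows 512-fold. That is what a roughly fixed per-call cost would look like, and it is why the small block sizes are hit hardest.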

                So why does the AES-NI kernel module exist at all? If you are using ipsec, which does all of its encryption in the kernel, then you need the AES-NI kernel module to let the ipsec module use AES-NI, and in that case it's a performance gain because everything is happening in-kernel. Ideally, pfSense would offer a configuration in which you can load aesni.ko for ipsec without loading cryptodev, so you can get the benefits without the drawbacks.

                So when would you ever want cryptodev? The /dev/crypto interface is only worth using with external crypto processors, like the old VIA PadLock or the Hifn cards. Even then, you're generally much better off just throwing out such hardware and buying something new if you care at all about crypto performance; the crypto accelerator on the old ALIX boards, for example, was about as fast as a new Raspberry Pi or an APU1 without hardware crypto, and an order of magnitude slower than an APU2. In theory it might also have a benefit for QuickAssist, but I think that's implemented in openssl in a way that avoids using /dev/crypto. There's been speculation over the years that cryptodev might help improve CPU utilization, but I haven't seen results on modern hardware where any speculative gain outweighs the performance penalty of the context switching overhead.

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.