Higher than expected cpu utilization with openvpn on RG-2440…

  • I've got the Netgate RG-2440.  It is supposed to have hardware crypto.  The dashboard states this:
    CPU Type Intel(R) Atom(TM) CPU C2358 @ 1.74GHz
    2 CPUs: 1 package(s) x 2 core(s)
    AES-NI CPU Crypto: Yes (active)
    Hardware crypto AES-CBC,AES-XTS,AES-GCM,AES-ICM

    It is a client connection.  /var/etc/openvpn/client1.conf has this content:

    dev ovpnc1
    verb 1
    dev-type tun
    dev-node /dev/tun1
    writepid /var/run/openvpn_client1.pid
    #user nobody
    #group nobody
    script-security 3
    keepalive 10 60
    proto udp4
    cipher AES-128-CBC
    auth SHA256
    up /usr/local/sbin/ovpn-linkup
    down /usr/local/sbin/ovpn-linkdown
    local x.x.x.x
    engine rdrand
    lport 0
    management /var/etc/openvpn/client1.sock unix
    remote gateway.domain.com 1194
    ca /var/etc/openvpn/client1.ca
    cert /var/etc/openvpn/client1.cert
    key /var/etc/openvpn/client1.key
    tls-auth /var/etc/openvpn/client1.tls-auth 1
    comp-lzo adaptive
    resolv-retry infinite

    In the pfsense gui, VPN -> OpenVPN -> Clients -> Edit shows Hardware Crypto set to Intel RDRAND engine - RAND

    I'm seeing 8-10Mbit traffic on the vpn connection – but around 40 to 60% cpu utilization.  When this was running in a VM, the cpu utilization was in the single digits.

    Any idea what's going on?  Is this normal utilization for the RG-2440?  The RG-2440 was originally set up by restoring from the VM version of pfsense -- is there a conflicting setting that got restored?

  • LAYER 8 Netgate

    Hardware crypto does very little to accelerate OpenVPN. OpenVPN spends most of its time context switching, not performing AES.

    Be sure powerd is enabled in System > Advanced, Miscellaneous. Set everything to Hiadaptive. Since you restored a configuration it might not be enabled.

    I would not load any crypto modules or select any crypto modules in OpenVPN.

    OpenVPN/OpenSSL will make use of AES-NI regardless.

  • Thanks Derelict.

    Looks like those powerd settings weren't changed.

    I've disabled the crypto module in OpenVPN and enabled the settings mentioned in another thread:

    In particular, enabling fast-io, and increasing the buffer size to 512k.
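
For reference, a rough sketch of what those two settings look like as raw OpenVPN directives. This assumes "512k" means 512 KiB (524288 bytes) and that the send and receive buffers are set to the same value; on pfSense these are normally set from the client edit page rather than typed in by hand, so the generated client1.conf may differ.

```
# Assumption: 512k = 524288 bytes, same size for both directions.
# fast-io only applies to UDP (this client uses proto udp4).
fast-io
sndbuf 524288
rcvbuf 524288
```

The smaller sizes tried below would be 262144 (256k) and 131072 (128k) under the same assumption.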

    I saw a bump in speed – a sustained 15Mbits at 64% CPU utilization.  However, my ping tests showed 22% packet loss.
    Dropping the buffer down to 256k got a sustained 12Mbits at 64%, but again with packet drops.
    Dropping the buffer down to 128k gets 10Mbits sustained without the drops, but I'm still at 64%.
    (These numbers are from what Dashboard is reporting)

    From what I remember, openvpn is supposed to be single threaded, so I'm a bit perplexed at that 64% number.  When I watch the process with top, the state column switches from RUN, to CPU0, to CPU1.  Is this the context switching you're talking about? The number for utilization shows 70+% too -- higher than what Dashboard is reporting for total cpu usage.

    I had a question about AES-NI -- Netgate posted a blog entry back in May of last year about requiring AES-NI in hardware for version 2.5 and up:

    Is AES-NI different than the hardware crypto choices?  I assume it must be because it wouldn't make much sense to require AES-NI hardware if those instruction sets aren't really ever used by anything.

    Does anyone else know if this is as good as it can get for this RG-2440?  The server is on a 300mbit circuit (guaranteed SLA from ATT).  The client is on a 1gbit fiber circuit (ATT - no SLA).

    This test suggests that I should get better performance:
    time openvpn --test-crypto --secret /tmp/secret --verb 0 --tun-mtu 20000 --cipher aes-128-cbc
    Sun Apr  1 13:04:56 2018 disabling NCP mode (--ncp-disable) because not in P2MP client or server mode
    31.917u 0.094s 0:33.30 96.0%    813+177k 0+0io 0pf+0w

    3200/32 = 100mbits?
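
The arithmetic in that last line can be sketched as a small shell helper. It applies the same rule of thumb the post uses (3200 divided by the elapsed seconds of the --test-crypto run with --tun-mtu 20000); the function name estimate_mbps is made up for illustration, and the result is a rough ceiling for the crypto path only, not a prediction of real tunnel throughput.

```shell
#!/bin/sh
# Rule of thumb from the post above: 3200 / seconds ~= Mbit/s ceiling.
# estimate_mbps is a hypothetical helper name, not part of OpenVPN.
estimate_mbps() {
    # $1 = elapsed seconds as reported by time(1)
    awk -v secs="$1" 'BEGIN { printf "%.1f\n", 3200 / secs }'
}

estimate_mbps 31.917   # user CPU time from the run above  -> 100.3
estimate_mbps 33.30    # wall-clock time from the run above -> 96.1
```

Either way the figure lands near 100Mbits, well above the 10 to 15Mbits actually observed, which is consistent with the bottleneck being somewhere other than AES itself.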

  • I see that OpenVPN performance issues have been discussed a lot here (and elsewhere on the internet).

    From what I've read:

    • OpenVPN is still single threaded, so single core CPU performance only.  Netgate home/business equipment is not up to the task for gigabit speeds.  One workaround is to create multiple VPN tunnels and somehow combine them, but this apparently comes with its own problems.

    • OpenVPN is partway userland and partway kernel.  This is why context switching is a thing.  One question about this – as I watched top, I could see the OpenVPN process jumping back and forth between CPU0 and CPU1.  Is this required for userland<->kernel switching?  Wouldn't there be a performance boost from setting the affinity to a single core?

    • IPSec seems to be recommended as an alternative… has anyone done this with pfsense?