Netgate Discussion Forum

    Gigabit VPN Router

    • Ryu945

      @johnkeates:

      It's algorithms that need to be executed faster than a CPU can

      So the CPU is holding it back? Didn't someone say it was something other than the CPU?

      @Derelict:

      In general, it is the context switching between user and kernel/system mode and the tun driver that gets in the way of faster OpenVPN performance.

      In hardware terms, is the cost that the CPU has to swap out the data in its cache when it switches between user and kernel/system mode? Does it not have enough cache for both, so that swapping the cache contents is too slow because of a bus speed limitation?

      Basically, I am trying to understand this well enough that I can look at a piece of hardware and estimate how fast it can go.

      • stephenw10 (Netgate Administrator)

        OpenVPN will always be slower than IPSec can be, because each packet that is sent requires more instructions to be processed due to switching context between kernel and user mode. In its current form at least.
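To get a rough feel for what a user/kernel crossing costs per packet, here is a minimal sketch, not a measurement of OpenVPN itself. It compares copying a 1500-byte buffer in-process with pushing the same buffer through a pipe, which takes one write and one read system call. The pipe is only a stand-in for the tun device, the function names are made up for illustration, and Python's interpreter overhead inflates both numbers, so only the difference between them is of interest.

```python
import os
import time


def time_per_call(fn, n=20_000):
    """Average wall-clock seconds per call of fn over n calls."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n


def make_user_space_copy(size=1500):
    """Return a function that copies a packet-sized buffer with no system call."""
    payload = bytes(size)

    def copy():
        return bytearray(payload)

    return copy


def make_kernel_round_trip(size=1500):
    """Return a function that pushes one packet-sized buffer through a pipe.

    Each call makes two system calls (write, then read), so it enters and
    leaves kernel mode twice. The pipe is an assumption standing in for tun.
    """
    read_fd, write_fd = os.pipe()
    payload = bytes(size)

    def round_trip():
        os.write(write_fd, payload)
        return os.read(read_fd, size)

    return round_trip


if __name__ == "__main__":
    in_process = time_per_call(make_user_space_copy())
    via_kernel = time_per_call(make_kernel_round_trip())
    print(f"in-process copy:   {in_process * 1e6:.2f} us per packet")
    print(f"pipe write + read: {via_kernel * 1e6:.2f} us per packet")
```

This only shows mode switches caused by system calls. A full context switch to another process costs more and is not captured here.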

        Steve

        • Ryu945

          @stephenw10:

          OpenVPN will always be slower than IPSec can be, because each packet that is sent requires more instructions to be processed due to switching context between kernel and user mode. In its current form at least.

          Steve

          I know OpenVPN is slower.  What I am trying to figure out is what hardware limitation is OpenVPN running into.  Somewhere on the hardware, something is being maxed out and that is why it is not going any faster.

          • jgiannakas

            @Ryu945:

            @stephenw10:

            OpenVPN will always be slower than IPSec can be, because each packet that is sent requires more instructions to be processed due to switching context between kernel and user mode. In its current form at least.

            Steve

            I know OpenVPN is slower.  What I am trying to figure out is what hardware limitation is OpenVPN running into.  Somewhere on the hardware, something is being maxed out and that is why it is not going any faster.

            This might help you understand the limitations of OpenVPN, it certainly helped me :)

            https://community.openvpn.net/openvpn/wiki/Gigabit_Networks_Linux

            Summary:
            1. The first bottleneck is that the OpenSSL encryption/decryption routines perform better with larger packet sizes. Larger packets also reduce context switching between user space and kernel space: more data is carried per packet, so fewer switches are needed for the same amount of traffic.
            2. The second is AES-NI acceleration on the CPU.
            3. The third is the encryption itself. Without encryption they managed to hit almost gigabit speeds with jumbo frames on the TUN interface.

            With the above settings I am hitting about 300 Mbps from my Digital Ocean web server to my gigabit connection at home. CPU utilisation on the Digital Ocean Ubuntu box is about 90% on the OpenVPN process, so it could be the virtual CPU limiting me, or the network stack/virtualisation drivers they are using. On my personal devices I use IPSec, where I get a comfortable 400-500 Mbps throughput.
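The packet-size point in the summary can be put into numbers with a quick back-of-the-envelope sketch. It works out how many packets per second a 1 Gbit/s link carries at a given packet size, and how much time that leaves for each packet. The three sizes are illustrative assumptions (standard Ethernet MTU, a jumbo frame, and a very large tun MTU), and header overhead is ignored.

```python
def packets_per_second(link_bits_per_s, packet_bytes):
    """Packets needed per second to fill the link, ignoring header overhead."""
    return link_bits_per_s / (packet_bytes * 8)


def per_packet_budget_us(link_bits_per_s, packet_bytes):
    """Microseconds available to handle one packet at full line rate."""
    return 1e6 / packets_per_second(link_bits_per_s, packet_bytes)


if __name__ == "__main__":
    gigabit = 1e9
    # Illustrative packet sizes in bytes (assumptions, not measured settings).
    for size in (1500, 9000, 48000):
        pps = packets_per_second(gigabit, size)
        budget = per_packet_budget_us(gigabit, size)
        print(f"{size:>6} bytes: {pps:>9.0f} packets/s, {budget:>6.1f} us per packet")
```

At 1500 bytes everything done for a packet (reading it from tun, encrypting it, sending it, and the mode switches in between) has to fit in about 12 microseconds. Larger packets stretch that budget because the fixed per-packet work is paid less often.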

            • Derelict (LAYER 8, Netgate)

              You are simply not listening. http://www.linfo.org/context_switch.html

              Context and Mode switching.

              • Guest

                Basically, unless OpenVPN could be implemented like IPsec, or like Open vSwitch (which does the initial matching, flow creation, setup etc. in user space and then handles all subsequent packets/frames in the kernel), it will not get 'faster'.

                • Guest

                  Why isn't AES-NI hardware crypto acceleration mentioned here? My understanding is that both i3 and i5 support this instruction set.

                  Because it is mostly IPsec that gains from using that instruction set.

                  I wouldn't expect gigabit speeds over VPN out of any commodity hardware.  And no, I'm not aware of any situation where Optane would improve pfSense performance.  I could be wrong, though. The technology is very new.

                  That said, your hardware list does look like it would provide very good performance.

                  A VPN has two ends! Most users forget this and think only about the powerful hardware on their own side, but the other end of the VPN must also be strong enough to deliver the desired speed.

                  I am trying to understand what is this limit being hit that prevents gigabit vpn.

                  Gigabit VPN is no big secret and is certainly reachable, but not with OpenVPN on the hardware we are discussing here, given the tun/tap design and the OpenVPN code as a whole.

                  Also, it wasn't until recently that IPSec could be done really fast on commodity hardware, it used to need crypto accelerators. OpenVPN has different issues, but will need a comparable order of magnitude improvement before you can use it for high speeds.

                  @gonzopancho has a 1 GBit/s symmetric Internet connection and uses the mid-range SG-4860 appliance from pfSense. With AES-NI and IPsec he gets around ~470 MBit/s through the tunnel, and with the VPN and TCP/IP overhead added on top that is nearly ~500 MBit/s. Once again, this is a small 4-core Intel Atom CPU! Reddit: SG-4860 vs SG-8860

                  I know OpenVPN is slower.  What I am trying to figure out is what hardware limitation is OpenVPN running into.  Somewhere on the hardware, something is being maxed out and that is why it is not going any faster.

                  Perhaps if OpenVPN were rewritten to use multiple CPU cores it would scale better, or, depending on the tun/tap interface, it would be better to use other mechanisms that are a better match.

                  • Guest

                    Exactly. Also, making OpenVPN multicore won't immediately get you an Ncores performance increase, as context switches will still happen, but now Ncores times as many, along with possible IPC. On top of that, imagine having to do: packet-ctx(to kernel)-packet-ctx(to user mode)-ipc-ctx(back to kernel)-packet before a packet can be processed in multicore mode, if multiple threads or processes need to exchange information on certain packets. The horror.

                    I think a split user-kernel design would help a lot more, but I have no idea how that would be implemented since key material should probably not be stored in two places in memory, and unless you can do session setup and control in user space, and raw packet processing in the other, it would probably require such a big redesign that it won't be compatible with existing versions (i.e. needing a protocol change to allow separate control and data flows). At the same time, the fact that there is a daemon mode with control interface and a client for that means that they might have been thinking about that. Oh well, I should really dig in more before talking about all of this.

                    • Pippin

                      Info:
                      https://community.openvpn.net/openvpn/wiki/RoadMap#OpenVPN3.0

                      • Guest

                        @Pippin:

                        Info:
                        https://community.openvpn.net/openvpn/wiki/RoadMap#OpenVPN3.0

                        Nice! Looks like my speculation wasn't too far off. I guess anyone having more questions about OpenVPN speed should just be directed there as it both explains the current limitations as well as solutions, answering their questions just fine.

                        • Ryu945

                          @jgiannakas:

                          @Ryu945:

                          @stephenw10:

                          OpenVPN will always be slower than IPSec can be, because each packet that is sent requires more instructions to be processed due to switching context between kernel and user mode. In its current form at least.

                          Steve

                          I know OpenVPN is slower.  What I am trying to figure out is what hardware limitation is OpenVPN running into.  Somewhere on the hardware, something is being maxed out and that is why it is not going any faster.

                          This might help you understand the limitations of OpenVPN, it certainly helped me :)

                          https://community.openvpn.net/openvpn/wiki/Gigabit_Networks_Linux

                          Summary:
                          1. The first bottleneck is that the OpenSSL encryption/decryption routines perform better with larger packet sizes. Larger packets also reduce context switching between user space and kernel space: more data is carried per packet, so fewer switches are needed for the same amount of traffic.
                          2. The second is AES-NI acceleration on the CPU.
                          3. The third is the encryption itself. Without encryption they managed to hit almost gigabit speeds with jumbo frames on the TUN interface.

                          With the above settings I am hitting about 300 Mbps from my Digital Ocean web server to my gigabit connection at home. CPU utilisation on the Digital Ocean Ubuntu box is about 90% on the OpenVPN process, so it could be the virtual CPU limiting me, or the network stack/virtualisation drivers they are using. On my personal devices I use IPSec, where I get a comfortable 400-500 Mbps throughput.

                          @Derelict:

                          You are simply not listening. http://www.linfo.org/context_switch.html

                          Context and Mode switching.

                          @Pippin:

                          Info:
                          https://community.openvpn.net/openvpn/wiki/RoadMap#OpenVPN3.0

                          Thank you.  Those links were really useful.

                          From what this seems to be saying, the bottleneck is how fast the CPU swaps processes. My understanding is that the hardware bottleneck is how long it takes to move data between the L1 cache and the RAM. This leads me to believe that higher-clocked RAM will mean a faster VPN router, since it is the RAM access speed that is slowing it down. Is this understanding correct? When OpenVPN context switches, is it dumping its L1 cache and CPU state into the RAM, or just into the L2/L3 cache? Is RAM access the bottleneck that makes context switches take so long?

                          • Guest

                            It's not that. Speed (bandwidth) isn't the issue; it's response time, or latency, or 'expensive' operations (i.e. many CPU cycles wasted getting from one task to another). It's probably more comparable to the SSD vs. HDD thing, where an SSD isn't necessarily faster in terms of bandwidth but is always faster in terms of access time, which is what users experience.

                            There is no magical fix here, mostly because of architectural and x86 reasons, both of which cannot be changed anytime soon. OpenVPN 3 might help with some OpenVPN architectural changes, so that is your best bet. Putting 'better' hardware in a box only does a little for VPN speeds. Spending 10x more money might get you only 5% more speed, and it gets worse as you get higher.
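One way to see why faster hardware gives diminishing returns is a toy model with a fixed per-packet cost. This is only a sketch under made-up numbers: it assumes each packet pays a fixed overhead (mode switches, tun handling) plus a crypto time that shrinks as the CPU gets faster. The 20 microsecond overhead and the 2 and 4 Gbit/s crypto rates are hypothetical, not measurements from any box in this thread.

```python
def throughput_mbps(packet_bytes, fixed_overhead_s, crypto_bits_per_s):
    """Throughput when every packet pays a fixed cost plus a size-dependent crypto cost."""
    bits = packet_bytes * 8
    seconds_per_packet = fixed_overhead_s + bits / crypto_bits_per_s
    return bits / seconds_per_packet / 1e6


if __name__ == "__main__":
    overhead = 20e-6  # hypothetical fixed cost per packet, in seconds
    base = throughput_mbps(1500, overhead, 2e9)    # hypothetical 2 Gbit/s crypto rate
    faster = throughput_mbps(1500, overhead, 4e9)  # same overhead, crypto twice as fast
    print(f"2 Gbit/s crypto: {base:.1f} Mbit/s")
    print(f"4 Gbit/s crypto: {faster:.1f} Mbit/s")
    print(f"gain from doubling crypto speed: {(faster / base - 1) * 100:.0f}%")
```

With these assumed numbers, doubling the crypto speed raises throughput from about 462 to about 522 Mbit/s, roughly 13%, because the fixed overhead does not shrink with it.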

                            • Ryu945

                              @johnkeates:

                              It's not that. Speed (bandwidth) isn't the issue; it's response time, or latency, or 'expensive' operations (i.e. many CPU cycles wasted getting from one task to another). It's probably more comparable to the SSD vs. HDD thing, where an SSD isn't necessarily faster in terms of bandwidth but is always faster in terms of access time, which is what users experience.

                              There is no magical fix here, mostly because of architectural and x86 reasons, both of which cannot be changed anytime soon. OpenVPN 3 might help with some OpenVPN architectural changes, so that is your best bet. Putting 'better' hardware in a box only does a little for VPN speeds. Spending 10x more money might get you only 5% more speed, and it gets worse as you get higher.

                              Should I be looking at cache response time (if context switches only happen in cache), or should I look into RAM response time (if there is RAM interaction in this context switch)?

                              • Guest

                                @Ryu945:

                                @johnkeates:

                                It's not that. Speed (bandwidth) isn't the issue; it's response time, or latency, or 'expensive' operations (i.e. many CPU cycles wasted getting from one task to another). It's probably more comparable to the SSD vs. HDD thing, where an SSD isn't necessarily faster in terms of bandwidth but is always faster in terms of access time, which is what users experience.

                                There is no magical fix here, mostly because of architectural and x86 reasons, both of which cannot be changed anytime soon. OpenVPN 3 might help with some OpenVPN architectural changes, so that is your best bet. Putting 'better' hardware in a box only does a little for VPN speeds. Spending 10x more money might get you only 5% more speed, and it gets worse as you get higher.

                                Should I be looking at cache response time (if context switches only happen in cache), or should I look into RAM response time (if there is RAM interaction in this context switch)?

                                There is no single component that does it all, and there also is no guarantee on CPU behaviour with specific programs. That's the whole issue here: it's not just some specific action on a specific port on a specific device that makes OpenVPN either slow or fast. It's everything. Unless you are into low level system architecture and design, there is very little you can do to either fix it in code or in hardware in this case.

                                If you really really really really want more information, just hook into an OpenVPN process with a debugger/tracer and start recording call performance etc. Not sure what OS you'll be doing it on, but check things like strace, dtrace, gdb, valgrind etc.
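As a lighter first step before a full tracer, here is a small sketch that reads the kernel's context switch counters for a process. It assumes a Linux box, where /proc/<pid>/status carries voluntary_ctxt_switches and nonvoluntary_ctxt_switches lines. pfSense runs on FreeBSD, which does not expose this file, so treat this as something to try on a Linux test machine. The helper names are made up for illustration.

```python
from pathlib import Path


def parse_ctxt_switches(status_text):
    """Extract the two context switch counters from /proc/<pid>/status text."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            counts[key] = int(value.strip())
    return counts


def read_ctxt_switches(pid="self"):
    """Read the counters for a pid (Linux only; 'self' means this process)."""
    return parse_ctxt_switches(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    if Path("/proc/self/status").exists():
        print(read_ctxt_switches())
    else:
        print("No Linux-style /proc here; use the tracing tools for your OS instead.")
```

Sampling these counters for the OpenVPN pid before and after a speed test, then dividing by the number of packets sent, gives a rough count of context switches per packet.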

                                • stephenw10 (Netgate Administrator)

                                  I agree it reads like you're looking for an answer that doesn't exist here.

                                  If you want the highest OpenVPN speeds you can have, get the CPU with the fastest single-thread performance you can afford. Though, as said above, spending twice as much will probably not result in twice the throughput.

                                  Steve

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.