IPSec VPN performance slow…



  • I have dozens of tunnels throughout the world, all using pfSense, and I love it!  However, I never seem to be able to get the performance I want out of the VPN connections no matter what I do.  Please allow me to provide as much detail as possible to help troubleshoot this problem.  I appreciate, in advance, anyone taking time to respond and assist with this issue.  ;D

    I'll break this down as much as possible.  Here goes…

    ::::::::::::::::::
    :: HARDWARE ::
    ::::::::::::::::::

    HQ Office:  Core 2 Duo 2.0 GHz | 2 GB RAM | 4 x 1 Gbps Intel | 4 GB CF | VPN Accelerator (Hifn 7955) | pfSense 2.0.2 | SHDSL 16/16 Mbps Internet
    Remote Office: AMD Geode 500 MHz | 256 MB RAM | 3 x 100 Mbps Via | 4 GB CF | No accel card | pfSense 2.0.2 | Cable 50/10 Mbps Internet

    :::::::::::::::::::::::::
    :: CURRENT CONFIG ::
    :::::::::::::::::::::::::

    IPSec Site-to-Site

    PH1:

    Auth:  Mutual PSK
    Neg: main
    Policy: Default
    Proposal: Default
    Enc: 3DES
    Hash: MD5
    DH: 2

    PH 2:

    Proto: ESP
    Enc: 3DES
    Hash: MD5
    PFS: Off

    When I have the link configured in this way I get the following via iperf:


    Client connecting to 192.168.4.4, TCP port 5001
    TCP window size: 64.0 KByte (default)

    [156] local 192.168.1.198 port 5599 connected with 192.168.4.4 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [156]  0.0-10.3 sec  3.12 MBytes  2.55 Mbits/sec

    UDP shows the following:

    ------------------------------------------------------------
    Client connecting to 192.168.4.4, UDP port 5001
    Sending 1470 byte datagrams
    UDP buffer size: 64.0 KByte (default)

    [156] local 192.168.1.198 port 51293 connected with 192.168.4.4 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [156]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec
    [156] Server Report:
    [156]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec  8.630 ms    1/  893 (0.11%)
    [156] Sent 893 datagrams
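
    A note on the UDP figure: iperf 2 sends UDP at a default target rate of 1 Mbit/sec unless a rate is given with -b, so the 1.05 Mbits/sec above most likely reflects the offered load rather than a limit of the tunnel. The short Python sketch below only rechecks the arithmetic from the numbers in the report (893 datagrams of 1470 bytes in 10.0 seconds); the function name is made up for illustration.

```python
def udp_rate_mbits(datagrams: int, payload_bytes: int, seconds: float) -> float:
    """Payload rate in Mbit/s (10^6 bits per second)."""
    return datagrams * payload_bytes * 8 / seconds / 1e6


# Figures taken from the iperf client report above.
rate = udp_rate_mbits(893, 1470, 10.0)
transferred_mib = 893 * 1470 / 2**20  # iperf reports MBytes as 2^20 bytes

print(f"{rate:.2f} Mbit/s, {transferred_mib:.2f} MBytes sent")
```

    A UDP run with a higher target, for example -b 20M, would be needed before the UDP number says anything about the tunnel itself.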

    Obviously I'm doing these tests when both connections are idle and nobody else is using the Internet so they are given the best chance of performing.

    Well, the first thing I did to troubleshoot the issue is try to figure out what my processors could crunch the fastest.  So I ran openssl speed on each device and received the following results:

    HQ Office:
    The 'numbers' are in 1000s of bytes per second processed.
    type            16 bytes    64 bytes    256 bytes  1024 bytes  8192 bytes
    md2              1659.82k    3480.80k    4840.05k    5335.25k    5504.70k
    mdc2              5319.60k    6059.71k    6271.54k    6313.53k    6330.05k
    md4              17240.30k    61751.73k  182953.16k  359497.68k  499266.77k
    md5              14282.69k    49024.11k  134284.62k  237846.57k  307724.72k
    hmac(md5)        17198.61k    57369.41k  149283.13k  249123.43k  309769.22k
    sha1            12477.83k    36818.88k    81640.94k  117601.57k  134872.33k
    rmd160          11511.13k    33534.44k    72386.00k  101824.57k  115529.89k
    rc4            215933.90k  268120.96k  283980.69k  291399.65k  290060.12k
    des cbc          44939.77k    46998.29k    47431.21k    47599.64k    47656.99k
    des ede3        17001.83k    17173.74k    17233.57k    17243.63k    17241.12k
    idea cbc            0.00        0.00        0.00        0.00        0.00
    seed cbc            0.00        0.00        0.00        0.00        0.00
    rc2 cbc          19782.37k    20331.20k    20485.82k    20582.68k    20465.19k
    rc5-32/12 cbc  136187.42k  152410.81k  157617.24k  158551.36k  159041.45k
    blowfish cbc    72896.90k    76868.12k    77971.14k    78347.78k    78362.01k
    cast cbc        70242.55k    74448.71k    75896.64k    75750.25k    75835.41k
    aes-128 cbc      49219.02k    50609.53k    51101.22k    51247.47k    51068.72k
    aes-192 cbc      42385.80k    43573.45k    43955.34k    43921.37k    43831.72k
    aes-256 cbc      37324.43k    38069.19k    38452.41k    38451.08k    38332.24k
    camellia-128 cbc    48012.21k    49951.89k    50456.26k    50615.27k    50535.68k
    camellia-192 cbc    37194.79k    38297.02k    38630.58k    38670.14k    38673.31k
    camellia-256 cbc    37231.60k    38331.17k    38499.48k    38640.83k    38632.70k
    sha256            8397.97k    20822.54k    37781.77k    48714.03k    52936.68k
    sha512            3449.34k    13739.54k    20885.03k    29089.56k    32931.82k
    aes-128 ige      49748.12k    52399.09k    53206.97k    53414.35k    53404.16k
    aes-192 ige      42801.25k    44821.20k    45391.82k    45519.34k    45488.85k
    aes-256 ige      37698.81k    39056.68k    39505.62k    39616.58k    39566.29k
                      sign    verify    sign/s verify/s
    rsa  512 bits 0.000682s 0.000075s  1465.2  13392.6
    rsa 1024 bits 0.003240s 0.000180s    308.6  5561.9
    rsa 2048 bits 0.018860s 0.000561s    53.0  1782.7
    rsa 4096 bits 0.125113s 0.001934s      8.0    517.1
                      sign    verify    sign/s verify/s
    dsa  512 bits 0.000550s 0.000627s  1819.4  1595.0
    dsa 1024 bits 0.001577s 0.001824s    634.2    548.3
    dsa 2048 bits 0.005055s 0.006057s    197.8    165.1

    This device has an accelerator card, so we can see some good numbers here.  The remote site's device is only a 500 MHz AMD Geode with no accelerator, so it obviously won't perform as well.  It produced the following results:

    The 'numbers' are in 1000s of bytes per second processed.
    type            16 bytes    64 bytes    256 bytes  1024 bytes  8192 bytes
    md2                334.01k      710.90k    1005.26k    1108.05k    1151.45k
    mdc2              561.47k      634.79k      659.04k      666.16k      673.44k
    md4              2284.33k    7985.06k    22237.18k    40621.68k    53257.10k
    md5              1737.32k    5777.80k    15250.69k    26202.77k    32755.54k
    hmac(md5)        2020.69k    6529.12k    16518.14k    27100.72k    32963.17k
    sha1              1485.90k    4109.91k    8470.16k    11565.00k    12991.89k
    rmd160            1485.02k    4131.16k    8613.36k    11930.47k    13347.61k
    rc4              22104.75k    26471.34k    27729.55k    28097.17k    28028.23k
    des cbc          5942.92k    6286.65k    6385.52k    6426.94k    6396.98k
    des ede3          2154.33k    2203.92k    2229.73k    2225.74k    2200.75k
    idea cbc            0.00        0.00        0.00        0.00        0.00
    seed cbc            0.00        0.00        0.00        0.00        0.00
    rc2 cbc          2857.53k    2974.34k    3015.45k    3020.43k    3021.39k
    rc5-32/12 cbc    16781.33k    19511.91k    20359.36k    20582.86k    20626.70k
    blowfish cbc      9983.13k    11051.41k    11291.04k    11360.61k    11369.82k
    cast cbc          8726.10k    9416.21k    9636.47k    9695.25k    9703.13k
    aes-128 cbc      5468.26k    5715.25k    5817.59k    5843.74k    5836.81k
    aes-192 cbc      4732.20k    4964.46k    5037.87k    5102.09k    5086.74k
    aes-256 cbc      4291.03k    4450.14k    4510.26k    4511.12k    4521.81k
    camellia-128 cbc    5800.16k    6158.16k    6281.17k    6320.09k    6295.45k
    camellia-192 cbc    4615.07k    4867.09k    4911.12k    4968.32k    4943.24k
    camellia-256 cbc    4561.92k    4826.93k    4914.36k    4873.11k    4878.62k
    sha256            993.95k    2260.63k    3922.08k    4833.08k    5153.44k
    sha512            395.29k    1578.48k    2368.90k    3301.85k    3715.76k
    aes-128 ige      5565.77k    5908.00k    6017.24k    6058.39k    6056.14k
    aes-192 ige      4809.31k    5086.57k    5220.05k    5208.61k    5217.06k
    aes-256 ige      4288.23k    4511.80k    4602.72k    4624.59k    4605.74k
                      sign    verify    sign/s verify/s
    rsa  512 bits 0.006852s 0.000667s    145.9  1500.3
    rsa 1024 bits 0.031129s 0.001624s    32.1    615.9
    rsa 2048 bits 0.176131s 0.004923s      5.7    203.1
    rsa 4096 bits 1.100097s 0.016854s      0.9    59.3
                      sign    verify    sign/s verify/s
    dsa  512 bits 0.005221s 0.005985s    191.5    167.1
    dsa 1024 bits 0.014252s 0.016788s    70.2    59.6
    dsa 2048 bits 0.044936s 0.054130s    22.3    18.5

    Judging from this data, and using the 1024-byte column, we could presume that when creating an IPSec tunnel we would want to use Blowfish, as it can pull just over 11 Mbps on that 500 MHz processor, and then use MD5, which outperformed SHA1 by more than 2x.
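
    One thing worth checking here is the unit: per the table header, openssl speed reports thousands of bytes per second, not bits. A minimal sketch of the conversion, using figures from the remote (Geode) table's 1024-byte column. The helper name is hypothetical, and these are userland benchmark numbers, not a prediction of what the IPsec stack will deliver.

```python
def openssl_k_to_mbits(k_bytes_per_sec: float) -> float:
    """Convert an 'openssl speed' figure (1000s of bytes/s) to Mbit/s."""
    return k_bytes_per_sec * 1000 * 8 / 1e6


# 1024-byte column of the remote (AMD Geode) table above.
geode_1024 = {
    "blowfish cbc": 11360.61,
    "aes-128 cbc": 5843.74,
    "des ede3": 2225.74,
    "md5": 26202.77,
    "sha1": 11565.00,
}

for name, k in geode_1024.items():
    print(f"{name:14s} {openssl_k_to_mbits(k):7.1f} Mbit/s")
```

    Read that way, Blowfish on the Geode comes out at roughly 91 Mbit/s in this benchmark and 3DES at about 18 Mbit/s, so the raw cipher speed may not be the tightest limit on this link.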

    So, I went back and changed only the encryption and auth protocols to Blowfish 128 and MD5.  After doing that and restarting racoon (just for good measure) on the remote site, the tunnel reconnected and I reran my iperf tests, which got me the following:

    ------------------------------------------------------------
    Client connecting to 192.168.4.4, TCP port 5001
    TCP window size: 64.0 KByte (default)

    [156] local 192.168.1.198 port 7681 connected with 192.168.4.4 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [156]  0.0-10.1 sec  6.88 MBytes  5.72 Mbits/sec

    As you can see, the tunnel improved a bit.  But I'm still not seeing anything near the 11 Mbps that I should be able to achieve with these settings.  So the next step I took was to leave P1 at Blowfish 128 & MD5 but switch the tunnel to AH (no encryption) with MD5.  After doing that I received nearly the same results as with encryption on.  I would think with AH I should get much higher speeds!?

    What am I doing wrong that my speeds are going so slow?  Does anyone have any guidance for me on this?  Thanks again in advance.



  • Cable 50/10 Mbps Internet

    That means you have a MAX 10 Mbps uplink at that cable site! You did not explain whether the cable site is the downloading site or the uploading site. AND what kind of results do you get from the ISP network in general (speedtest.net)? If you have other traffic ongoing, you of course lose some usable bandwidth in the IPsec tunnel.



  • Clouseau:

    Thank you for your response.  It seems to be the same in either direction when it comes to file transfers.  However, for our purposes, let's say I'm located at the cable site (remote site) and I want to download a single 3 GB file from HQ site.  Since my download is 50 Mbps and the uploading site (HQ) can pump out a reliable 16 Mbps all day long I should, theoretically and not counting basic overhead, be able to pull at least 10 Mbps and even upwards of 14+ Mbps most days.

    Both ISPs have been providing the advertised speeds very reliably over the last couple of years, so we can rule them out in this particular issue.  Again, I've performed these tests after hours (and watched both sides' routers to see the traffic levels - nearly zero at the time of the test) and so have near perfect conditions in which to get the best possible speeds.  So I have to assume the problem lies in the protocols, hardware limitations at the router level, or some other issue, and that's what I'm trying to troubleshoot.

    Again, thanks for the response. Hopefully my answers here can help you to help me.  ;)



  • I think it might be your window size. TCP window size: 64.0 KByte (default).
    According to your tables the remote MD5 is capable of only 5777.80k/s at that size. I think the problem is at the remote side with the AMD Geode 500MHz.



  • Podilarious:

    Thank you for your response as well.  I was a little confused as to how the openssl speed test was registering the table.  I was assuming that the header was for the encryption level I set and not for the size of the TCP window.  But now that I'm seeing those speeds almost darned near matching what you posted then it makes much more sense.

    That begs the question, then: how do I change the TCP window to get to 1024 so I can get the 26 Mbps instead of the 6 Mbps I'm seeing now?  Is that something I can do on pfSense, or some tweak I need to make to Windows SMB clients/servers to make it work?  This is great progress, thanks Podilarious!



  • What kind of latency does the IPsec tunnel have? If it's very high, you'd have to tune your TCP window size accordingly.

    I also would start with checking the performance of AMD Geode 500MHz (btw if it's an Alix I seem to remember that it has VPN accel for AES-128, so try using that after enabling it from webGUI System -> Advanced -> Miscellaneous -> Crypto HW). Try testing the IPsec tunnel with a more powerful system.

    Finally, on my pfSense 2.1-BETA system the latest openssl 1.0.1 performs much better (30% to 100% faster) than the old openssl 0.9.8, and on pfSense 2.1 ipsec-tools is compiled with the new openssl 1.0.1e. That might help people with heavily loaded VPN servers.

    On 2.1 try:
    /usr/bin/openssl speed aes-128-cbc
    /usr/local/bin/openssl speed aes-128-cbc



  • I think you want -w. See http://doc.pfsense.org/index.php/Iperf_man_page for more details.
    dhatz is right, I have seen better performance since moving to 2.1.
    I think on most systems the window size is set automatically … I've never really had to change or troubleshoot that before.
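
    For reference, a sketch of how the -w option might be passed on the client side. It assumes iperf 2 is on the PATH and that a server was started on 192.168.4.4 with a matching window (iperf -s -w 256K). The 256K value and the helper names are only illustrative, and the OS may clamp the requested size.

```python
import subprocess


def iperf_client_cmd(server: str, window: str = "256K", seconds: int = 10) -> list[str]:
    """Build an iperf 2 client command line with an explicit TCP window (-w)."""
    return ["iperf", "-c", server, "-w", window, "-t", str(seconds)]


def run_iperf(cmd: list[str]) -> str:
    """Run the command and return its text output (needs iperf and a reachable server)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


cmd = iperf_client_cmd("192.168.4.4")
print(" ".join(cmd))
```

    iperf prints the window size it actually obtained at the top of its report, which is worth comparing against the value requested.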



  • Dhatz:

    With the tunnel saturated (currently pushing about 6 Mbps through it) I'm able to get an average of 79 ms, which isn't too bad.  There are 14 hops between us, and pinging outside the tunnel to the router's WAN IP gives me an average of 70 ms, so the tunnel has little effect on my ping, which is great.
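
    Those latency figures allow a quick sanity check. A single TCP stream can have at most one window of data in flight per round trip, so its throughput is bounded by window size divided by RTT. A minimal sketch using the 64 KByte window from the iperf output and the 79 ms average above (function names are hypothetical; real throughput also depends on packet loss and on whether window scaling is in effect):

```python
def max_tcp_mbits(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on single-stream TCP throughput in Mbit/s: window / RTT."""
    return window_bytes * 8 / rtt_s / 1e6


def window_needed_bytes(target_mbits: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to sustain the target rate."""
    return target_mbits * 1e6 / 8 * rtt_s


ceiling = max_tcp_mbits(64 * 1024, 0.079)   # iperf default window, measured RTT
needed = window_needed_bytes(16, 0.079)     # 16 Mbit/s is the HQ uplink

print(f"64 KByte window at 79 ms: {ceiling:.2f} Mbit/s")
print(f"window to fill 16 Mbit/s at 79 ms: {needed / 1024:.0f} KByte")
```

    The ceiling of roughly 6.6 Mbit/s is in the same range as the 5.72 Mbits/sec measured with Blowfish, which would be consistent with the window, rather than the cipher, being the limit. Filling the 16 Mbps uplink at this latency would take a window of around 154 KByte.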

    The remote site is using an Alix.2D13 (http://store.netgate.com/-P40.aspx) board.  And now that you mention it, that site does say it comes with an OCF encryption accelerator.  I enabled the Crypto option as you suggested and ran the test again (mind that the tunnel is active, so the results will be a little skewed) and got this:

    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    md2                329.17k      708.05k      996.65k     1118.89k     1148.68k
    mdc2               549.04k      637.85k      659.26k      663.64k      655.26k
    md4               2289.13k     7854.97k    21920.32k    40305.54k    52503.86k
    md5               1753.12k     5772.70k    15143.99k    26220.17k    32645.97k
    hmac(md5)         2004.90k     6419.02k    16479.07k    26775.02k    33190.16k
    sha1              1475.25k     4074.76k     8455.58k    11490.71k    12844.19k
    rmd160            1468.63k     4096.69k     8569.98k    11790.03k    13403.30k
    rc4              22092.32k    26689.23k    27538.92k    27871.09k    27791.23k
    des cbc           5988.65k     6292.11k     6409.92k     6530.34k     6423.43k
    des ede3          2176.81k     2194.97k     2241.66k     2237.70k     2205.02k
    idea cbc             0.00         0.00         0.00         0.00         0.00
    seed cbc             0.00         0.00         0.00         0.00         0.00
    rc2 cbc           2854.43k     2940.76k     3002.53k     2941.11k     2999.06k
    rc5-32/12 cbc    16558.64k    19628.17k    20311.70k    20462.82k    20403.23k
    blowfish cbc      9999.56k    10856.40k    11422.71k    11376.40k    11248.24k
    cast cbc          8665.50k     9402.30k     9916.11k     9865.39k     9650.14k
    aes-128 cbc       5381.94k     5666.46k     5714.98k     5762.96k     5767.86k
    aes-192 cbc       4734.72k     4987.94k     4974.26k     5053.12k     5030.78k
    aes-256 cbc       4266.21k     4379.63k     4440.91k     4463.24k     4461.63k
    camellia-128 cbc     5725.98k     6261.62k     6313.85k     6278.99k     6223.47k
    camellia-192 cbc     4604.60k     4865.82k     4862.31k     4924.04k     4892.54k
    camellia-256 cbc     4502.63k     4857.29k     4870.93k     4862.82k     4926.56k
    sha256            1007.89k     2288.96k     3873.00k     4783.14k     5079.48k
    sha512             390.13k     1567.99k     2360.24k     3260.45k     3649.77k
    aes-128 ige       5449.03k     5863.30k     6074.94k     6049.77k     6101.76k
    aes-192 ige       4723.68k     5036.47k     5225.38k     5217.74k     5220.90k
    aes-256 ige       4214.93k     4501.45k     4583.63k     4629.43k     4645.62k
                     sign    verify    sign/s verify/s
    rsa  512 bits 0.006918s 0.000674s    144.6   1484.0
    rsa 1024 bits 0.031551s 0.001653s     31.7    605.0
    rsa 2048 bits 0.179939s 0.004950s      5.6    202.0
    rsa 4096 bits 1.113613s 0.016874s      0.9     59.3
                     sign    verify    sign/s verify/s
    dsa  512 bits 0.005288s 0.006044s    189.1    165.5
    dsa 1024 bits 0.014283s 0.016861s     70.0     59.3
    dsa 2048 bits 0.045229s 0.053605s     22.1     18.7

    I'm not sure I see much of an improvement, at least for that test.  Secondly, I switched over the tunnel to be as follows:

    IPSec Site-to-Site

    PH1:

    Auth:  Mutual PSK
    Neg: main
    Policy: Default
    Proposal: Default
    Enc: AES (128 bits)
    Hash: SHA1
    DH: 2

    PH 2:

    Proto: ESP
    Enc: AES (128 bits)
    Hash: SHA1

    I'm not seeing much of a difference in the tunnel.  Is this the part in ADVANCED -> SYSTEM TUNABLES that I would change, and if so, what are some options that I should try?  Also, would I change this on both sides or just the remote side (as I have other VPNs to other sites as well that I don't want to affect yet)?

    net.inet.tcp.recvspace Maximum incoming/outgoing TCP datagram size (receive) default (65228)

    net.inet.tcp.sendspace Maximum incoming/outgoing TCP datagram size (send) default (65228)

    I really appreciate everyone's help and I'll do my best to provide the data you need to help me.  I hope this helps others in the future as well!

