Occasional Large Spike in Latency through Site-Site OpenVPN Tunnel



  • New user here:

    I have two pfSense gateways: one running on dedicated hardware, the other under KVM. Both instances are on fully capable hardware, and both are running 2.0.3-RELEASE amd64.

    KVM pfSense: LAN: 192.168.99.250  Tunnel Client: 10.0.8.2
    Bare-metal pfSense: LAN: 192.168.1.1  Tunnel Server: 10.0.8.1

    The site-to-site tunnel uses a shared key, has LZO compression enabled, and uses AES 128-bit CBC for crypto.

    The latency between both sites is relatively low: In the clear (no tunnel) the latency stays between 14ms-17ms and does not spike.

    Through the tunnel, latency averages 15ms-20ms. Every so often (sometimes it seems predictable) there is a large (500ms) spike in latency that slowly tapers back to normal levels. This happens with no other traffic passing through the gateways.

    Here is one spike tapering off:
    64 bytes from 192.168.1.1: icmp_seq=57 ttl=63 time=529 ms
    64 bytes from 192.168.1.1: icmp_seq=58 ttl=63 time=516 ms
    64 bytes from 192.168.1.1: icmp_seq=59 ttl=63 time=506 ms
    64 bytes from 192.168.1.1: icmp_seq=60 ttl=63 time=497 ms
    64 bytes from 192.168.1.1: icmp_seq=61 ttl=63 time=487 ms
    64 bytes from 192.168.1.1: icmp_seq=62 ttl=63 time=492 ms
    64 bytes from 192.168.1.1: icmp_seq=63 ttl=63 time=485 ms
    64 bytes from 192.168.1.1: icmp_seq=64 ttl=63 time=479 ms
    64 bytes from 192.168.1.1: icmp_seq=65 ttl=63 time=469 ms
    64 bytes from 192.168.1.1: icmp_seq=66 ttl=63 time=464 ms
    64 bytes from 192.168.1.1: icmp_seq=68 ttl=63 time=463 ms
    64 bytes from 192.168.1.1: icmp_seq=69 ttl=63 time=443 ms
    64 bytes from 192.168.1.1: icmp_seq=70 ttl=63 time=424 ms
    64 bytes from 192.168.1.1: icmp_seq=71 ttl=63 time=404 ms
    64 bytes from 192.168.1.1: icmp_seq=72 ttl=63 time=387 ms
    64 bytes from 192.168.1.1: icmp_seq=73 ttl=63 time=367 ms
    64 bytes from 192.168.1.1: icmp_seq=74 ttl=63 time=347 ms
    64 bytes from 192.168.1.1: icmp_seq=75 ttl=63 time=328 ms
    64 bytes from 192.168.1.1: icmp_seq=76 ttl=63 time=322 ms
    64 bytes from 192.168.1.1: icmp_seq=77 ttl=63 time=303 ms
    64 bytes from 192.168.1.1: icmp_seq=78 ttl=63 time=283 ms
    64 bytes from 192.168.1.1: icmp_seq=79 ttl=63 time=263 ms
    64 bytes from 192.168.1.1: icmp_seq=80 ttl=63 time=245 ms
    64 bytes from 192.168.1.1: icmp_seq=81 ttl=63 time=225 ms
    64 bytes from 192.168.1.1: icmp_seq=82 ttl=63 time=206 ms
    64 bytes from 192.168.1.1: icmp_seq=83 ttl=63 time=186 ms
    64 bytes from 192.168.1.1: icmp_seq=84 ttl=63 time=173 ms
    64 bytes from 192.168.1.1: icmp_seq=85 ttl=63 time=153 ms
    64 bytes from 192.168.1.1: icmp_seq=86 ttl=63 time=133 ms
    64 bytes from 192.168.1.1: icmp_seq=87 ttl=63 time=117 ms
    64 bytes from 192.168.1.1: icmp_seq=88 ttl=63 time=112 ms
    64 bytes from 192.168.1.1: icmp_seq=89 ttl=63 time=96.9 ms
    64 bytes from 192.168.1.1: icmp_seq=90 ttl=63 time=90.5 ms
    64 bytes from 192.168.1.1: icmp_seq=91 ttl=63 time=76.6 ms
    64 bytes from 192.168.1.1: icmp_seq=92 ttl=63 time=65.2 ms
    64 bytes from 192.168.1.1: icmp_seq=93 ttl=63 time=55.3 ms
    64 bytes from 192.168.1.1: icmp_seq=94 ttl=63 time=43.1 ms
    64 bytes from 192.168.1.1: icmp_seq=95 ttl=63 time=37.9 ms
    64 bytes from 192.168.1.1: icmp_seq=96 ttl=63 time=26.8 ms
    64 bytes from 192.168.1.1: icmp_seq=97 ttl=63 time=19.9 ms
    64 bytes from 192.168.1.1: icmp_seq=98 ttl=63 time=16.1 ms
    64 bytes from 192.168.1.1: icmp_seq=99 ttl=63 time=16.9 ms
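
    For anyone wanting to quantify a log like the one above, here is a minimal Python sketch that parses Linux-style ping output, lists missing sequence numbers (the log above skips icmp_seq=67), and estimates how fast the latency tapers per reply. The function names and the 100 ms spike threshold are illustrative assumptions, not anything from pfSense or OpenVPN.

```python
import re
from statistics import median

# Matches the "icmp_seq=.. ttl=.. time=.. ms" part of a Linux-style ping reply.
PING_LINE = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=([\d.]+)\s*ms")

# Short excerpt copied from the ping output in this post.
SAMPLE = """
64 bytes from 192.168.1.1: icmp_seq=65 ttl=63 time=469 ms
64 bytes from 192.168.1.1: icmp_seq=66 ttl=63 time=464 ms
64 bytes from 192.168.1.1: icmp_seq=68 ttl=63 time=463 ms
64 bytes from 192.168.1.1: icmp_seq=69 ttl=63 time=443 ms
64 bytes from 192.168.1.1: icmp_seq=70 ttl=63 time=424 ms
"""


def parse_ping(text):
    """Return a list of (seq, rtt_ms) tuples found in the ping output."""
    return [(int(m.group(1)), float(m.group(2))) for m in PING_LINE.finditer(text)]


def missing_seqs(samples):
    """Sequence numbers with no reply between the first and last sample."""
    seen = {seq for seq, _ in samples}
    return [s for s in range(samples[0][0], samples[-1][0] + 1) if s not in seen]


def spike_samples(samples, threshold_ms=100.0):
    """Replies slower than threshold_ms (the threshold is an arbitrary assumption)."""
    return [(seq, rtt) for seq, rtt in samples if rtt > threshold_ms]


def decay_per_reply(samples):
    """Median RTT change between back-to-back replies; negative means tapering."""
    deltas = [b[1] - a[1] for a, b in zip(samples, samples[1:]) if b[0] - a[0] == 1]
    return median(deltas)


if __name__ == "__main__":
    samples = parse_ping(SAMPLE)
    print("missing:", missing_seqs(samples))
    print("spikes:", len(spike_samples(samples)))
    print("median change per reply (ms):", decay_per_reply(samples))
```

    Feeding it the full log instead of the excerpt (for example the saved output of a long-running ping) should make it easier to see whether the spikes recur at a fixed interval and whether the taper is as steady as it looks by eye.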

    Have tried the following to see if it fixes the issue:
    Disabled Encryption
    Disabled LZO
    Changed OpenVPN process priority to -19 nice
    Played around with MTU settings

    I have already run multiple ICMP tests between the sites with no tunnel, and latency stays between 14ms-17ms with no spikes.

    I do have the option of isolating which gateway is the cause by connecting either gateway to a third pfSense box and seeing whether the issue is replicated.
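
    The isolation idea can be written down as a small elimination rule. This is only a sketch under a strong assumption: a tunnel that shows no spikes over a long enough test clears both of its endpoints. The gateway labels ("kvm", "bare", "third") and the function name are made up for illustration.

```python
def implicated_gateways(spikes_by_pair):
    """Return the gateways that no spike-free tunnel has cleared.

    spikes_by_pair maps frozenset({a, b}) -> True if spikes were observed
    on the tunnel between gateways a and b, False if it stayed clean.
    """
    # Start by suspecting every gateway that took part in any test.
    suspects = set().union(*spikes_by_pair)
    for pair, spiked in spikes_by_pair.items():
        if not spiked:
            # Assumption: a clean tunnel means both of its endpoints are fine.
            suspects -= pair
    return suspects


if __name__ == "__main__":
    results = {
        frozenset({"kvm", "bare"}): True,    # the tunnel described in this post
        frozenset({"kvm", "third"}): True,   # hypothetical test result
        frozenset({"bare", "third"}): False, # hypothetical test result
    }
    print(implicated_gateways(results))
```

    If every tunnel spikes, all gateways stay on the list, which would point at something common to all of them rather than at a single box.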

    Are there any ideas or further optimization/configurations I can perform to mitigate this issue?

    Time and help are always appreciated :)


  • Banned

    1/ What's the "fully capable HW"?
    2/ QEMU is sloooow without HW acceleration (see #1)
    3/ "Disabled Encryption" - oh pleaaaase…  ::)



  • :) lol… Sorry, the pfSense VM is not running under QEMU but under KVM.

    The bare-metal instance is running on an HP ProLiant M115.

    The host machine for the KVM pfSense is running dual Westmere Xeons (6 cores each). The VM has 4 cores allotted and 2 GB of RAM.

    I disabled encryption to see if the problem was CPU bound...