pfSense 2.1-RELEASE OpenVPN: can't see LAN and weird packet loss



  • Okay, I've been banging my head against this, on and off, for a week now. And I'm completely stumped, so I'm hoping someone else can shed some light on the situation. I'm a pfSense and BSD newbie, but I have some networking experience.

    This is my LAN: 172.16.104.0/21 (172.16.104.1 - 172.16.111.254). The pfSense LAN address is 172.16.104.1.

    I configured OpenVPN using the wizard to be at 172.16.103.0/24.
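
    For anyone double-checking the addressing, here is a small sketch using Python's standard `ipaddress` module. It only confirms that the /21 covers the range quoted above, that its mask is the 255.255.248.0 used in the `push "route ..."` line of the config further down, and that the tunnel network does not overlap the LAN. The variable names are just for illustration.

```python
import ipaddress

# Networks as described in the post
lan = ipaddress.ip_network("172.16.104.0/21")
vpn = ipaddress.ip_network("172.16.103.0/24")

# First and last usable host addresses of the LAN
first_host, last_host = lan[1], lan[-2]

print("LAN netmask:", lan.netmask)
print("LAN hosts:  ", first_host, "-", last_host)
print("Overlap with VPN subnet:", lan.overlaps(vpn))
print("172.16.111.50 on LAN:", ipaddress.ip_address("172.16.111.50") in lan)
```

    So the addressing itself looks consistent: the pushed route matches the LAN, and the tunnel subnet sits just below it without overlapping.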

    I can connect to it using Viscosity (1.4.7) from my Mac to test it (tethered to my phone). But I can't ping or reach anything in the LAN. I can ping 172.16.103.1 (OpenVPN), 172.16.103.2 (myself), but most of the time, nothing else.

    Occasionally, if I let ping run long enough, I'll get a response from something on the LAN side, but with incredibly high packet loss and latency:

    
    PING 172.16.111.50 (172.16.111.50): 56 data bytes
    Request timeout for icmp_seq 0
    Request timeout for icmp_seq 1
    Request timeout for icmp_seq 2
    Request timeout for icmp_seq 3
    Request timeout for icmp_seq 4
    Request timeout for icmp_seq 5
    Request timeout for icmp_seq 6
    Request timeout for icmp_seq 7
    Request timeout for icmp_seq 8
    Request timeout for icmp_seq 9
    Request timeout for icmp_seq 10
    64 bytes from 172.16.111.50: icmp_seq=0 ttl=63 time=11489.223 ms
    64 bytes from 172.16.111.50: icmp_seq=1 ttl=63 time=10488.166 ms
    64 bytes from 172.16.111.50: icmp_seq=2 ttl=63 time=9487.107 ms
    64 bytes from 172.16.111.50: icmp_seq=3 ttl=63 time=8485.985 ms
    64 bytes from 172.16.111.50: icmp_seq=4 ttl=63 time=7485.086 ms
    64 bytes from 172.16.111.50: icmp_seq=5 ttl=63 time=6481.646 ms
    64 bytes from 172.16.111.50: icmp_seq=6 ttl=63 time=5659.207 ms
    64 bytes from 172.16.111.50: icmp_seq=7 ttl=63 time=4664.262 ms
    64 bytes from 172.16.111.50: icmp_seq=8 ttl=63 time=3663.154 ms
    64 bytes from 172.16.111.50: icmp_seq=9 ttl=63 time=5124.334 ms
    64 bytes from 172.16.111.50: icmp_seq=10 ttl=63 time=4123.130 ms
    64 bytes from 172.16.111.50: icmp_seq=11 ttl=63 time=3121.966 ms
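
    A side note on these numbers: the round-trip times drop by almost exactly one second per reply, which hints that the replies were held up somewhere and then delivered together. The sketch below is only arithmetic on the output above. It assumes ping sent one request per second (the usual default, not confirmed here) and adds each reply's RTT to its send time to estimate when it arrived. The 50 ms grouping threshold is an arbitrary choice for illustration.

```python
# (sequence number, reported RTT in ms), copied from the ping output above
replies = [
    (0, 11489.223), (1, 10488.166), (2, 9487.107), (3, 8485.985),
    (4, 7485.086), (5, 6481.646), (6, 5659.207), (7, 4664.262),
    (8, 3663.154), (9, 5124.334), (10, 4123.130), (11, 3121.966),
]

SEND_INTERVAL_MS = 1000.0  # assumption: one echo request per second


def arrival_times(samples, interval_ms=SEND_INTERVAL_MS):
    """Estimate each reply's arrival time in ms after the first request."""
    return [(seq, seq * interval_ms + rtt) for seq, rtt in samples]


def group_bursts(arrivals, gap_ms=50.0):
    """Group consecutive replies that arrived within gap_ms of each other."""
    bursts = []
    for seq, t in arrivals:
        if bursts and abs(t - bursts[-1][-1][1]) <= gap_ms:
            bursts[-1].append((seq, t))
        else:
            bursts.append([(seq, t)])
    return bursts


bursts = group_bursts(arrival_times(replies))
for burst in bursts:
    seqs = [seq for seq, _ in burst]
    print(f"seq {seqs[0]}-{seqs[-1]} arrived around {burst[0][1] / 1000:.1f} s")
```

    Under that assumption the twelve replies fall into three bursts (seq 0-5, 6-8 and 9-11), so this looks less like twelve independently slow packets and more like traffic being queued and then released in batches.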
    
    

    At first I thought, hey, cell connection, may not be too stable. But I can ping the pfSense box without any issues:

    
    PING 172.16.104.1 (172.16.104.1): 56 data bytes
    64 bytes from 172.16.104.1: icmp_seq=0 ttl=64 time=587.627 ms
    64 bytes from 172.16.104.1: icmp_seq=1 ttl=64 time=313.684 ms
    64 bytes from 172.16.104.1: icmp_seq=2 ttl=64 time=929.949 ms
    64 bytes from 172.16.104.1: icmp_seq=3 ttl=64 time=157.482 ms
    64 bytes from 172.16.104.1: icmp_seq=4 ttl=64 time=205.083 ms
    64 bytes from 172.16.104.1: icmp_seq=5 ttl=64 time=610.010 ms
    64 bytes from 172.16.104.1: icmp_seq=6 ttl=64 time=299.231 ms
    64 bytes from 172.16.104.1: icmp_seq=7 ttl=64 time=256.673 ms
    64 bytes from 172.16.104.1: icmp_seq=8 ttl=64 time=528.816 ms
    64 bytes from 172.16.104.1: icmp_seq=9 ttl=64 time=357.289 ms
    64 bytes from 172.16.104.1: icmp_seq=10 ttl=64 time=372.927 ms
    64 bytes from 172.16.104.1: icmp_seq=11 ttl=64 time=237.702 ms
    64 bytes from 172.16.104.1: icmp_seq=12 ttl=64 time=538.612 ms
    64 bytes from 172.16.104.1: icmp_seq=13 ttl=64 time=755.040 ms
    64 bytes from 172.16.104.1: icmp_seq=14 ttl=64 time=310.363 ms
    64 bytes from 172.16.104.1: icmp_seq=15 ttl=64 time=164.132 ms
    64 bytes from 172.16.104.1: icmp_seq=16 ttl=64 time=464.220 ms
    64 bytes from 172.16.104.1: icmp_seq=17 ttl=64 time=466.591 ms
    
    

    And I can ping a random server on the Internet with no packet loss:

    
    PING yahoo.com (206.190.36.45): 56 data bytes
    64 bytes from 206.190.36.45: icmp_seq=0 ttl=45 time=160.929 ms
    64 bytes from 206.190.36.45: icmp_seq=1 ttl=45 time=533.260 ms
    64 bytes from 206.190.36.45: icmp_seq=2 ttl=45 time=454.148 ms
    64 bytes from 206.190.36.45: icmp_seq=3 ttl=45 time=435.526 ms
    64 bytes from 206.190.36.45: icmp_seq=4 ttl=45 time=446.299 ms
    64 bytes from 206.190.36.45: icmp_seq=5 ttl=45 time=193.898 ms
    64 bytes from 206.190.36.45: icmp_seq=6 ttl=45 time=422.611 ms
    
    

    So then I thought, "packet loss on a wired internal network?" I pinged the pfSense box from 172.16.111.50, and that worked fine:

    
    PING 172.16.104.1 (172.16.104.1): 56 data bytes
    64 bytes from 172.16.104.1: icmp_seq=0 ttl=64 time=0.253 ms
    64 bytes from 172.16.104.1: icmp_seq=1 ttl=64 time=0.215 ms
    64 bytes from 172.16.104.1: icmp_seq=2 ttl=64 time=0.239 ms
    64 bytes from 172.16.104.1: icmp_seq=3 ttl=64 time=0.190 ms
    64 bytes from 172.16.104.1: icmp_seq=4 ttl=64 time=0.212 ms
    64 bytes from 172.16.104.1: icmp_seq=5 ttl=64 time=0.223 ms
    64 bytes from 172.16.104.1: icmp_seq=6 ttl=64 time=0.266 ms
    64 bytes from 172.16.104.1: icmp_seq=7 ttl=64 time=0.265 ms
    
    

    But if I ping out across OpenVPN from that same machine (to my VPN client at 172.16.103.2), I get packet loss and high latency:

    
    PING 172.16.103.2 (172.16.103.2): 56 data bytes
    Request timeout for icmp_seq 0
    Request timeout for icmp_seq 1
    Request timeout for icmp_seq 2
    Request timeout for icmp_seq 3
    Request timeout for icmp_seq 4
    64 bytes from 172.16.103.2: icmp_seq=0 ttl=63 time=5378.553 ms
    64 bytes from 172.16.103.2: icmp_seq=1 ttl=63 time=4377.591 ms
    64 bytes from 172.16.103.2: icmp_seq=2 ttl=63 time=4874.247 ms
    64 bytes from 172.16.103.2: icmp_seq=3 ttl=63 time=4873.775 ms
    64 bytes from 172.16.103.2: icmp_seq=4 ttl=63 time=4034.666 ms
    64 bytes from 172.16.103.2: icmp_seq=5 ttl=63 time=4060.898 ms
    64 bytes from 172.16.103.2: icmp_seq=6 ttl=63 time=5976.489 ms
    
    

    So it seems like any time I have to traverse the VPN, I get high packet loss.

    Here's my config:

    
    dev ovpns1
    dev-type tap
    dev-node /dev/tap1
    writepid /var/run/openvpn_server1.pid
    #user nobody
    #group nobody
    script-security 3
    daemon
    keepalive 10 60
    ping-timer-rem
    persist-tun
    persist-key
    proto tcp-server
    cipher AES-128-CBC
    up /usr/local/sbin/ovpn-linkup
    down /usr/local/sbin/ovpn-linkdown
    client-connect /usr/local/sbin/openvpn.attributes.sh
    client-disconnect /usr/local/sbin/openvpn.attributes.sh
    local REDACTED
    tls-server
    server 172.16.103.0 255.255.255.0
    client-config-dir /var/etc/openvpn-csc
    client-cert-not-required
    username-as-common-name
    auth-user-pass-verify /var/etc/openvpn/server1.php via-env
    tls-verify /var/etc/openvpn/server1.tls-verify.php
    lport 1194
    management /var/etc/openvpn/server1.sock unix
    push "route 172.16.104.0 255.255.248.0"
    duplicate-cn
    ca /var/etc/openvpn/server1.ca
    cert /var/etc/openvpn/server1.cert
    key /var/etc/openvpn/server1.key
    dh /etc/dh-parameters.1024
    tls-auth /var/etc/openvpn/server1.tls-auth 0
    persist-remote-ip
    float
    
    

    The machine is a MITAC PD12TI Mini-ITX (D2500CCE), Intel Atom Processor D2500 (dual core), 1.86 GHz with 4 GB RAM.

    Any ideas?



  • Just wanted to say, finally got this to work right.

    In case anyone comes after me, I never did figure out what was causing all the latency issues, but they seemed to go away when I switched from TCP to UDP (!). No idea if there's still packet loss.

    These instructions were enormously helpful: http://hardforum.com/showthread.php?t=1663797



  • It appears you have routed setup, so why are you using Device Mode "Tap"?  You should be using "Tun".



  • I never did figure out what was causing all the latency issues, but they seemed to go away when I switched from TCP to UDP (!).

    OpenVPN has its own protocol for keeping track of and retransmitting lost packets, and that has timers etc. in it. It is meant to work over UDP - it does not particularly want its underlying transport protocol to provide reliability services. TCP provides reliability as well, retransmitting packets that are lost at the TCP level. Often the OpenVPN timers are similar to or shorter than the TCP ones - so OpenVPN retransmits lost things, a bit later TCP retransmits them too, and things get a bit confusing for the receiving end. The same stacking happens with any TCP sessions carried inside the tunnel, since they also retransmit on their own.
    I used to be on a project that had to run DECnet over TCP/IP - that had all the same issues, in that DECnet did its own loss handling while TCP underneath was doing it too. It was a pain whenever there was any packet loss. Unfortunately there was no DECnet over UDP available.
    I would definitely recommend always using UDP with OpenVPN - letting OpenVPN implement its own reliability protocol without interference from a lower layer.
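
    To make the effect on a plain ping concrete, here is a deliberately simplified toy model, not a simulation of real TCP or of OpenVPN itself. It assumes one probe per second, a link that drops everything during a short outage, and a "reliable" transport that simply delivers every queued probe the moment the link recovers (real TCP uses retransmission timers with backoff). The function name and parameters are made up for illustration.

```python
from typing import List, Optional


def probe_rtts(n_probes: int, outage_start: float, outage_end: float,
               base_rtt: float, reliable: bool) -> List[Optional[float]]:
    """RTT in seconds for probes sent at t = 0, 1, 2, ...; None means lost.

    Toy model: anything sent in [outage_start, outage_end) hits a dead link.
    """
    rtts: List[Optional[float]] = []
    for seq in range(n_probes):
        sent = float(seq)
        in_outage = outage_start <= sent < outage_end
        if not in_outage:
            rtts.append(base_rtt)
        elif reliable:
            # A reliable, in-order transport keeps retrying, so the probe
            # is delivered late (when the link recovers) instead of lost.
            rtts.append((outage_end - sent) + base_rtt)
        else:
            # A datagram transport just loses the probe.
            rtts.append(None)
    return rtts


udp_like = probe_rtts(8, 2.0, 6.0, 0.2, reliable=False)
tcp_like = probe_rtts(8, 2.0, 6.0, 0.2, reliable=True)
print("datagram transport:", udp_like)
print("reliable transport:", tcp_like)
```

    With the datagram transport the probes sent during the outage are simply lost and everything else stays fast. With the reliable transport nothing is lost, but the delayed probes come back with round-trip times that fall by one second each, which is the same staircase pattern as in the ping output at the top of the thread.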



  • @marvosa:

    It appears you have routed setup, so why are you using Device Mode "Tap"?  You should be using "Tun".

    Yeah, that was a derp on my part. tap is actually correct - I was trying to get to a server bridged configuration (so I could get broadcasts working across the VPN). The hard part was just figuring out how to do that the "pfSense way." I could have copied my old config out of DD-WRT, along with the script I had written to bring everything online, but then I wouldn't have learned anything.

    @phil.davis:

    OpenVPN has its own protocol for keeping track of and retransmitting lost packets, and that has timers etc.

    Wow, TIL! It makes sense now that I know that.

