pfSense 2.1-RELEASE OpenVPN, can't see LAN and weird packet loss

peckrob

Okay, I've been banging my head against this, on and off, for a week now. And I'm completely stumped, so I'm hoping someone else can shed some light on the situation. I'm a pfSense and BSD newbie, but I have some networking experience.

This is my LAN: 172.16.104.0/21 (172.16.104.1 - 172.16.111.254). The pfSense LAN address is 172.16.104.1.

I configured OpenVPN using the wizard to be at 172.16.103.0/24.
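A quick sanity check of those two ranges, as a minimal sketch using Python's standard `ipaddress` module. The function name `describe` is only illustrative. It confirms the usable host range of the /21, the dotted netmask that a pushed route for the LAN would need, and that the tunnel network does not overlap the LAN.

```python
import ipaddress


def describe(lan_cidr, vpn_cidr):
    """Summarise a LAN network and check it against a VPN tunnel network."""
    lan = ipaddress.ip_network(lan_cidr)
    vpn = ipaddress.ip_network(vpn_cidr)
    return {
        "first_host": str(lan[1]),    # address right after the network address
        "last_host": str(lan[-2]),    # address right before the broadcast address
        "netmask": str(lan.netmask),  # dotted form, as used in a pushed route
        "overlaps_vpn": lan.overlaps(vpn),
    }


info = describe("172.16.104.0/21", "172.16.103.0/24")
for key, value in info.items():
    print(f"{key}: {value}")
```

The two networks are adjacent but distinct, so the addressing plan itself is not the source of the trouble.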

I can connect to it using Viscosity (1.4.7) from my Mac to test it (tethered to my phone). But I can't ping or reach anything in the LAN. I can ping 172.16.103.1 (OpenVPN), 172.16.103.2 (myself), but most of the time, nothing else.

Occasionally, if I let ping run long enough, I'll get a response from something on the LAN side, but with incredibly high packet loss and latency:


PING 172.16.111.50 (172.16.111.50): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5
Request timeout for icmp_seq 6
Request timeout for icmp_seq 7
Request timeout for icmp_seq 8
Request timeout for icmp_seq 9
Request timeout for icmp_seq 10
64 bytes from 172.16.111.50: icmp_seq=0 ttl=63 time=11489.223 ms
64 bytes from 172.16.111.50: icmp_seq=1 ttl=63 time=10488.166 ms
64 bytes from 172.16.111.50: icmp_seq=2 ttl=63 time=9487.107 ms
64 bytes from 172.16.111.50: icmp_seq=3 ttl=63 time=8485.985 ms
64 bytes from 172.16.111.50: icmp_seq=4 ttl=63 time=7485.086 ms
64 bytes from 172.16.111.50: icmp_seq=5 ttl=63 time=6481.646 ms
64 bytes from 172.16.111.50: icmp_seq=6 ttl=63 time=5659.207 ms
64 bytes from 172.16.111.50: icmp_seq=7 ttl=63 time=4664.262 ms
64 bytes from 172.16.111.50: icmp_seq=8 ttl=63 time=3663.154 ms
64 bytes from 172.16.111.50: icmp_seq=9 ttl=63 time=5124.334 ms
64 bytes from 172.16.111.50: icmp_seq=10 ttl=63 time=4123.130 ms
64 bytes from 172.16.111.50: icmp_seq=11 ttl=63 time=3121.966 ms

At first I thought, hey, cell connection, may not be too stable. But I can ping the pfSense box without any issues:


PING 172.16.104.1 (172.16.104.1): 56 data bytes
64 bytes from 172.16.104.1: icmp_seq=0 ttl=64 time=587.627 ms
64 bytes from 172.16.104.1: icmp_seq=1 ttl=64 time=313.684 ms
64 bytes from 172.16.104.1: icmp_seq=2 ttl=64 time=929.949 ms
64 bytes from 172.16.104.1: icmp_seq=3 ttl=64 time=157.482 ms
64 bytes from 172.16.104.1: icmp_seq=4 ttl=64 time=205.083 ms
64 bytes from 172.16.104.1: icmp_seq=5 ttl=64 time=610.010 ms
64 bytes from 172.16.104.1: icmp_seq=6 ttl=64 time=299.231 ms
64 bytes from 172.16.104.1: icmp_seq=7 ttl=64 time=256.673 ms
64 bytes from 172.16.104.1: icmp_seq=8 ttl=64 time=528.816 ms
64 bytes from 172.16.104.1: icmp_seq=9 ttl=64 time=357.289 ms
64 bytes from 172.16.104.1: icmp_seq=10 ttl=64 time=372.927 ms
64 bytes from 172.16.104.1: icmp_seq=11 ttl=64 time=237.702 ms
64 bytes from 172.16.104.1: icmp_seq=12 ttl=64 time=538.612 ms
64 bytes from 172.16.104.1: icmp_seq=13 ttl=64 time=755.040 ms
64 bytes from 172.16.104.1: icmp_seq=14 ttl=64 time=310.363 ms
64 bytes from 172.16.104.1: icmp_seq=15 ttl=64 time=164.132 ms
64 bytes from 172.16.104.1: icmp_seq=16 ttl=64 time=464.220 ms
64 bytes from 172.16.104.1: icmp_seq=17 ttl=64 time=466.591 ms

And I can ping a random server on the Internet with no packet loss:


PING yahoo.com (206.190.36.45): 56 data bytes
64 bytes from 206.190.36.45: icmp_seq=0 ttl=45 time=160.929 ms
64 bytes from 206.190.36.45: icmp_seq=1 ttl=45 time=533.260 ms
64 bytes from 206.190.36.45: icmp_seq=2 ttl=45 time=454.148 ms
64 bytes from 206.190.36.45: icmp_seq=3 ttl=45 time=435.526 ms
64 bytes from 206.190.36.45: icmp_seq=4 ttl=45 time=446.299 ms
64 bytes from 206.190.36.45: icmp_seq=5 ttl=45 time=193.898 ms
64 bytes from 206.190.36.45: icmp_seq=6 ttl=45 time=422.611 ms

So, then I thought, "packet loss on a wired internal network?" So I tried pinging the pfSense box from 172.16.111.50 on the LAN, and it worked fine.


PING 172.16.104.1 (172.16.104.1): 56 data bytes
64 bytes from 172.16.104.1: icmp_seq=0 ttl=64 time=0.253 ms
64 bytes from 172.16.104.1: icmp_seq=1 ttl=64 time=0.215 ms
64 bytes from 172.16.104.1: icmp_seq=2 ttl=64 time=0.239 ms
64 bytes from 172.16.104.1: icmp_seq=3 ttl=64 time=0.190 ms
64 bytes from 172.16.104.1: icmp_seq=4 ttl=64 time=0.212 ms
64 bytes from 172.16.104.1: icmp_seq=5 ttl=64 time=0.223 ms
64 bytes from 172.16.104.1: icmp_seq=6 ttl=64 time=0.266 ms
64 bytes from 172.16.104.1: icmp_seq=7 ttl=64 time=0.265 ms

But if I try to ping out across OpenVPN, I get packet loss and high latency:


PING 172.16.103.2 (172.16.103.2): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
64 bytes from 172.16.103.2: icmp_seq=0 ttl=63 time=5378.553 ms
64 bytes from 172.16.103.2: icmp_seq=1 ttl=63 time=4377.591 ms
64 bytes from 172.16.103.2: icmp_seq=2 ttl=63 time=4874.247 ms
64 bytes from 172.16.103.2: icmp_seq=3 ttl=63 time=4873.775 ms
64 bytes from 172.16.103.2: icmp_seq=4 ttl=63 time=4034.666 ms
64 bytes from 172.16.103.2: icmp_seq=5 ttl=63 time=4060.898 ms
64 bytes from 172.16.103.2: icmp_seq=6 ttl=63 time=5976.489 ms

So it seems like any time I have to traverse the VPN, I get high packet loss.
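One detail in the first trace (the ping to 172.16.111.50) is worth checking: every "timed out" request was eventually answered, and the round-trip times fall by about one second per sequence number. The sketch below assumes ping sent one request per second (the usual default, not stated in the post) and estimates when each reply arrived as send time plus reported round-trip time. The names `arrival_times` and `bursts`, and the 0.5 s grouping gap, are illustrative choices.

```python
# (icmp_seq, reported round-trip time in ms), copied from the first ping trace
REPLIES = [
    (0, 11489.223), (1, 10488.166), (2, 9487.107), (3, 8485.985),
    (4, 7485.086), (5, 6481.646), (6, 5659.207), (7, 4664.262),
    (8, 3663.154), (9, 5124.334), (10, 4123.130), (11, 3121.966),
]

SEND_INTERVAL_S = 1.0  # assumption: one echo request per second


def arrival_times(replies, interval=SEND_INTERVAL_S):
    """Estimate when each reply arrived, in seconds after the first request."""
    return [seq * interval + rtt_ms / 1000.0 for seq, rtt_ms in replies]


def bursts(times, gap=0.5):
    """Group arrival times separated by no more than `gap` seconds."""
    groups = [[times[0]]]
    for t in times[1:]:
        if t - groups[-1][-1] <= gap:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


groups = bursts(arrival_times(REPLIES))
for g in groups:
    print(f"{len(g)} replies arrived between t={g[0]:.2f}s and t={g[-1]:.2f}s")
```

Under that assumption the twelve replies arrive in two clumps rather than spread over twelve seconds. That looks more like packets being held up and then released together than like packets being dropped.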

Here's my config:


dev ovpns1
dev-type tap
dev-node /dev/tap1
writepid /var/run/openvpn_server1.pid
#user nobody
#group nobody
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto tcp-server
cipher AES-128-CBC
up /usr/local/sbin/ovpn-linkup
down /usr/local/sbin/ovpn-linkdown
client-connect /usr/local/sbin/openvpn.attributes.sh
client-disconnect /usr/local/sbin/openvpn.attributes.sh
local REDACTED
tls-server
server 172.16.103.0 255.255.255.0
client-config-dir /var/etc/openvpn-csc
client-cert-not-required
username-as-common-name
auth-user-pass-verify /var/etc/openvpn/server1.php via-env
tls-verify /var/etc/openvpn/server1.tls-verify.php
lport 1194
management /var/etc/openvpn/server1.sock unix
push "route 172.16.104.0 255.255.248.0"
duplicate-cn
ca /var/etc/openvpn/server1.ca
cert /var/etc/openvpn/server1.cert
key /var/etc/openvpn/server1.key
dh /etc/dh-parameters.1024
tls-auth /var/etc/openvpn/server1.tls-auth 0
persist-remote-ip
float

The machine is a MITAC PD12TI Mini-ITX (D2500CCE), Intel Atom Processor D2500 (Dual Core), 1.86 GHz with 4 GB RAM.

Any ideas?

peckrob

Just wanted to say, finally got this to work right.

In case anyone comes after me, I never did figure out what was causing all the latency issues, but they seemed to go away when I switched from TCP to UDP (!). No idea if there's still packet loss.

These instructions were enormously helpful: http://hardforum.com/showthread.php?t=1663797

marvosa

It appears you have a routed setup, so why are you using Device Mode "Tap"? You should be using "Tun".

phil.davis

@peckrob:

I never did figure out what was causing all the latency issues, but they seemed to go away when I switched from TCP to UDP (!).

OpenVPN has its own protocol for keeping track of and retransmitting lost packets, and that has timers etc. in it. It is meant to work over UDP - it does not particularly want its underlying transport protocol to provide reliability services. TCP also provides reliability, retransmitting packets that are lost at the TCP level. Often the OpenVPN timers are similar to or shorter than the TCP ones - so OpenVPN retransmits lost things, a bit later TCP retransmits as well, and things get a bit confusing for the receiving end.
I used to be on a project that had to run DECnet over TCP/IP - that had all the same issues, in that DECnet did its own loss handling for reliability, but underneath TCP was doing it also. It was a pain whenever there was any packet loss. Unfortunately there was no DECnet over UDP available.
I would definitely recommend always using UDP with OpenVPN - letting OpenVPN implement its own reliability protocol without interference from a lower layer.
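The interaction described above can be sketched with a toy model. It is not OpenVPN's or TCP's actual timer logic. It assumes an upper layer that resends at a fixed interval until it sees an acknowledgement, running over a reliable lower layer that stalls for a while after a loss and then delivers everything it queued. All names and numbers are hypothetical.

```python
def redundant_copies(stall_s, upper_rto_s, rtt_s):
    """Count copies the upper layer resends while the lower layer is stalled.

    Toy assumptions: the upper layer retransmits every `upper_rto_s` seconds
    (no backoff) until the ack arrives; the reliable lower layer drops nothing,
    so every resent copy is eventually delivered and is pure duplicate traffic.
    """
    ack_time = stall_s + rtt_s  # original gets through once the stall clears
    copies = 0
    t = upper_rto_s
    while t < ack_time:
        copies += 1
        t += upper_rto_s
    return copies


# Lower layer stalls 3 s recovering a loss; upper layer's timer fires every 1 s.
print(redundant_copies(stall_s=3.0, upper_rto_s=1.0, rtt_s=0.3))
# No stall and a timer longer than the round trip: nothing is resent.
print(redundant_copies(stall_s=0.0, upper_rto_s=1.0, rtt_s=0.3))
```

In this model, the shorter the upper layer's timer is relative to the lower layer's recovery time, the more duplicates pile up behind the stall. Over UDP there is no lower-layer stall, so the upper layer's own timer is the only recovery mechanism in play.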

peckrob

@marvosa:

It appears you have a routed setup, so why are you using Device Mode "Tap"? You should be using "Tun".

Yeah, that was a derp on my part. tap is actually correct - I was trying to get to a server bridged configuration (so I could get broadcasts working across the VPN). It was just figuring out how to do that in the "pfSense way." I could have copied my old config out of DD-WRT and the script I had written to bring everything online, but then I wouldn't have learned anything.

@phil.davis:

OpenVPN has its own protocol for keeping track of and retransmitting lost packets, and that has timers etc.

Wow, TIL! It makes sense now that I know that.