pfSense 2.1-RELEASE OpenVPN: can't see LAN and weird packet loss
-
Okay, I've been banging my head against this, on and off, for a week now. And I'm completely stumped, so I'm hoping someone else can shed some light on the situation. I'm a pfSense and BSD newbie, but I have some networking experience.
This is my LAN: 172.16.104.0/21 (172.16.104.1 - 172.16.111.254). The pfSense LAN address is 172.16.104.1.
I configured OpenVPN using the wizard to be at 172.16.103.0/24.
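As a quick sanity check on the addressing above, here is a minimal sketch using Python's standard `ipaddress` module. It only confirms the arithmetic: the host range of the /21, the netmask that shows up in the pushed route in the config further down, and that the tunnel network does not overlap the LAN. The variable names are just for illustration.

```python
import ipaddress

# Networks as described in the post
lan = ipaddress.ip_network("172.16.104.0/21")
vpn = ipaddress.ip_network("172.16.103.0/24")

# First and last usable host addresses in the LAN
first_host = lan[1]
last_host = lan[-2]

print(first_host, last_host)   # usable host range of the /21
print(lan.netmask)             # dotted netmask, as used in a "route" push
print(lan.overlaps(vpn))       # True would mean the tunnel collides with the LAN
```

The range and netmask agree with what is posted, so the subnetting itself does not look like the culprit.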
I can connect to it using Viscosity (1.4.7) from my Mac to test it (tethered to my phone). But I can't ping or reach anything in the LAN. I can ping 172.16.103.1 (OpenVPN), 172.16.103.2 (myself), but most of the time, nothing else.
Occasionally, if I let ping run long enough, I'll get a response from something on the LAN side, but with incredibly high packet loss and latency:
PING 172.16.111.50 (172.16.111.50): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5
Request timeout for icmp_seq 6
Request timeout for icmp_seq 7
Request timeout for icmp_seq 8
Request timeout for icmp_seq 9
Request timeout for icmp_seq 10
64 bytes from 172.16.111.50: icmp_seq=0 ttl=63 time=11489.223 ms
64 bytes from 172.16.111.50: icmp_seq=1 ttl=63 time=10488.166 ms
64 bytes from 172.16.111.50: icmp_seq=2 ttl=63 time=9487.107 ms
64 bytes from 172.16.111.50: icmp_seq=3 ttl=63 time=8485.985 ms
64 bytes from 172.16.111.50: icmp_seq=4 ttl=63 time=7485.086 ms
64 bytes from 172.16.111.50: icmp_seq=5 ttl=63 time=6481.646 ms
64 bytes from 172.16.111.50: icmp_seq=6 ttl=63 time=5659.207 ms
64 bytes from 172.16.111.50: icmp_seq=7 ttl=63 time=4664.262 ms
64 bytes from 172.16.111.50: icmp_seq=8 ttl=63 time=3663.154 ms
64 bytes from 172.16.111.50: icmp_seq=9 ttl=63 time=5124.334 ms
64 bytes from 172.16.111.50: icmp_seq=10 ttl=63 time=4123.130 ms
64 bytes from 172.16.111.50: icmp_seq=11 ttl=63 time=3121.966 ms
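One way to read those numbers: the round-trip times drop by about one second per sequence number, which suggests the replies arrived together rather than being individually slow. A rough sketch of that arithmetic, assuming ping was sending one request per second (its usual default; the post does not say which interval was used):

```python
# Round-trip times in ms, copied from the ping output above (icmp_seq 0..11)
rtts_ms = [
    11489.223, 10488.166, 9487.107, 8485.985, 7485.086, 6481.646,
    5659.207, 4664.262, 3663.154, 5124.334, 4123.130, 3121.966,
]

INTERVAL_MS = 1000.0  # assumption: one echo request per second

# Approximate arrival time of each reply, measured from the first request
arrivals_ms = [seq * INTERVAL_MS + rtt for seq, rtt in enumerate(rtts_ms)]

first_burst = arrivals_ms[:9]    # replies to icmp_seq 0..8
second_burst = arrivals_ms[9:]   # replies to icmp_seq 9..11

first_spread_ms = max(first_burst) - min(first_burst)
second_spread_ms = max(second_burst) - min(second_burst)

for seq, arrival in enumerate(arrivals_ms):
    print(f"icmp_seq={seq:2d} reply arrived at ~{arrival / 1000:.2f} s")
print(f"spread of first burst:  {first_spread_ms:.0f} ms")
print(f"spread of second burst: {second_spread_ms:.0f} ms")
```

If the one-second assumption holds, the replies to icmp_seq 0 through 8 all land within roughly 0.2 s of each other, about 11.5 s after the first request, and the last three land together a few seconds later. That looks more like packets being held and then released in bursts than like random loss.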
At first I thought, hey, cell connection, may not be too stable. But I can ping the pfSense box without any issues:
PING 172.16.104.1 (172.16.104.1): 56 data bytes
64 bytes from 172.16.104.1: icmp_seq=0 ttl=64 time=587.627 ms
64 bytes from 172.16.104.1: icmp_seq=1 ttl=64 time=313.684 ms
64 bytes from 172.16.104.1: icmp_seq=2 ttl=64 time=929.949 ms
64 bytes from 172.16.104.1: icmp_seq=3 ttl=64 time=157.482 ms
64 bytes from 172.16.104.1: icmp_seq=4 ttl=64 time=205.083 ms
64 bytes from 172.16.104.1: icmp_seq=5 ttl=64 time=610.010 ms
64 bytes from 172.16.104.1: icmp_seq=6 ttl=64 time=299.231 ms
64 bytes from 172.16.104.1: icmp_seq=7 ttl=64 time=256.673 ms
64 bytes from 172.16.104.1: icmp_seq=8 ttl=64 time=528.816 ms
64 bytes from 172.16.104.1: icmp_seq=9 ttl=64 time=357.289 ms
64 bytes from 172.16.104.1: icmp_seq=10 ttl=64 time=372.927 ms
64 bytes from 172.16.104.1: icmp_seq=11 ttl=64 time=237.702 ms
64 bytes from 172.16.104.1: icmp_seq=12 ttl=64 time=538.612 ms
64 bytes from 172.16.104.1: icmp_seq=13 ttl=64 time=755.040 ms
64 bytes from 172.16.104.1: icmp_seq=14 ttl=64 time=310.363 ms
64 bytes from 172.16.104.1: icmp_seq=15 ttl=64 time=164.132 ms
64 bytes from 172.16.104.1: icmp_seq=16 ttl=64 time=464.220 ms
64 bytes from 172.16.104.1: icmp_seq=17 ttl=64 time=466.591 ms
And I can ping a random server on the Internet with no packet loss:
PING yahoo.com (206.190.36.45): 56 data bytes
64 bytes from 206.190.36.45: icmp_seq=0 ttl=45 time=160.929 ms
64 bytes from 206.190.36.45: icmp_seq=1 ttl=45 time=533.260 ms
64 bytes from 206.190.36.45: icmp_seq=2 ttl=45 time=454.148 ms
64 bytes from 206.190.36.45: icmp_seq=3 ttl=45 time=435.526 ms
64 bytes from 206.190.36.45: icmp_seq=4 ttl=45 time=446.299 ms
64 bytes from 206.190.36.45: icmp_seq=5 ttl=45 time=193.898 ms
64 bytes from 206.190.36.45: icmp_seq=6 ttl=45 time=422.611 ms
So, then I thought, "packet loss on a wired internal network?" So I tried it from 172.16.111.50, and it worked fine.
PING 172.16.104.1 (172.16.104.1): 56 data bytes
64 bytes from 172.16.104.1: icmp_seq=0 ttl=64 time=0.253 ms
64 bytes from 172.16.104.1: icmp_seq=1 ttl=64 time=0.215 ms
64 bytes from 172.16.104.1: icmp_seq=2 ttl=64 time=0.239 ms
64 bytes from 172.16.104.1: icmp_seq=3 ttl=64 time=0.190 ms
64 bytes from 172.16.104.1: icmp_seq=4 ttl=64 time=0.212 ms
64 bytes from 172.16.104.1: icmp_seq=5 ttl=64 time=0.223 ms
64 bytes from 172.16.104.1: icmp_seq=6 ttl=64 time=0.266 ms
64 bytes from 172.16.104.1: icmp_seq=7 ttl=64 time=0.265 ms
But if I try to ping out across OpenVPN, I get packet loss and high latency:
PING 172.16.103.2 (172.16.103.2): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
64 bytes from 172.16.103.2: icmp_seq=0 ttl=63 time=5378.553 ms
64 bytes from 172.16.103.2: icmp_seq=1 ttl=63 time=4377.591 ms
64 bytes from 172.16.103.2: icmp_seq=2 ttl=63 time=4874.247 ms
64 bytes from 172.16.103.2: icmp_seq=3 ttl=63 time=4873.775 ms
64 bytes from 172.16.103.2: icmp_seq=4 ttl=63 time=4034.666 ms
64 bytes from 172.16.103.2: icmp_seq=5 ttl=63 time=4060.898 ms
64 bytes from 172.16.103.2: icmp_seq=6 ttl=63 time=5976.489 ms
So it seems like any time I have to traverse the VPN, I get high packet loss.
Here's my config:
dev ovpns1
dev-type tap
dev-node /dev/tap1
writepid /var/run/openvpn_server1.pid
#user nobody
#group nobody
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto tcp-server
cipher AES-128-CBC
up /usr/local/sbin/ovpn-linkup
down /usr/local/sbin/ovpn-linkdown
client-connect /usr/local/sbin/openvpn.attributes.sh
client-disconnect /usr/local/sbin/openvpn.attributes.sh
local REDACTED
tls-server
server 172.16.103.0 255.255.255.0
client-config-dir /var/etc/openvpn-csc
client-cert-not-required
username-as-common-name
auth-user-pass-verify /var/etc/openvpn/server1.php via-env
tls-verify /var/etc/openvpn/server1.tls-verify.php
lport 1194
management /var/etc/openvpn/server1.sock unix
push "route 172.16.104.0 255.255.248.0"
duplicate-cn
ca /var/etc/openvpn/server1.ca
cert /var/etc/openvpn/server1.cert
key /var/etc/openvpn/server1.key
dh /etc/dh-parameters.1024
tls-auth /var/etc/openvpn/server1.tls-auth 0
persist-remote-ip
float
The machine is a MITAC PD12TI Mini-ITX (D2500CCE), Intel Atom Processor D2500 (dual core), 1.86 GHz with 4 GB RAM.
Any ideas?
-
Just wanted to say, finally got this to work right.
In case anyone comes after me, I never did figure out what was causing all the latency issues, but they seemed to go away when I switched from TCP to UDP (!). No idea if there's still packet loss.
These instructions were enormously helpful: http://hardforum.com/showthread.php?t=1663797
-
It appears you have routed setup, so why are you using Device Mode "Tap"? You should be using "Tun".
-
"I never did figure out what was causing all the latency issues, but they seemed to go away when I switched from TCP to UDP (!)."
OpenVPN has its own protocol for keeping track of and retransmitting lost packets, and that has timers etc. in it. It is meant to work over UDP: it does not particularly want its underlying transport protocol to provide reliability services. TCP provides reliability as well, retransmitting packets that are lost at the TCP level. Often the OpenVPN timers are similar to or shorter than the TCP ones, so OpenVPN retransmits lost things, a bit later TCP retransmits as well, and things get a bit confusing for the receiving end.
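To make the timer interaction concrete, here is a toy model, not a description of OpenVPN's or TCP's real behaviour. It assumes a reliable outer transport whose retransmit timeout starts at one second and doubles after each failed attempt, and an inner layer that resends on a fixed timer. Both function names and all the numbers are made up for illustration.

```python
def outer_recovery_time(link_down_s, initial_rto_s=1.0):
    """Time of the first outer retransmission that happens once the link is back.

    Toy assumption: the outer transport retries with a timeout that doubles
    after every failed attempt.
    """
    t, rto = 0.0, initial_rto_s
    while t < link_down_s:
        t += rto
        rto *= 2
    return t


def inner_copies(stall_s, inner_timeout_s):
    """Copies of one message a fixed-timer inner layer submits during a stall.

    Toy assumption: the inner layer resends every inner_timeout_s seconds
    until it sees an acknowledgement, and the outer transport never drops
    anything it has accepted.
    """
    return 1 + int(stall_s // inner_timeout_s)


# Hypothetical scenario: the link drops for 5 s, the inner timer is 2 s.
stall = outer_recovery_time(5.0)
copies = inner_copies(stall, 2.0)
print(f"outer transport stalls for {stall:.0f} s")
print(f"inner layer queued {copies} copies of the same message")
```

In this made-up scenario the outer layer only gets through at its third retry, so a 5 s outage becomes a 7 s stall, and every copy the inner layer submitted in the meantime is delivered in one burst afterwards. Over UDP the extra copies sent while the link was down would simply have been lost.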
I used to be in a project that had to run DECnet over TCP/IP. That had all the same issues, in that DECnet did its own loss handling for reliability while TCP underneath was doing it too. It was a pain whenever there was any packet loss. Unfortunately there was no DECnet over UDP available.
I would definitely recommend always using UDP with OpenVPN, letting OpenVPN implement its own reliability protocol without interference from a lower layer.
-
"It appears you have routed setup, so why are you using Device Mode "Tap"? You should be using "Tun"."
Yeah, that was a derp on my part. tap is actually correct - I was trying to get to a server bridged configuration (so I could get broadcasts working across the VPN). It was just figuring out how to do that in the "pfSense way." I could have copied my old config out of DD-WRT and the script I had written to bring everything online, but then I wouldn't have learned anything.
"OpenVPN has its own protocol for keeping track of and retransmitting lost packets, and that has timers etc."
Wow, TIL! It makes sense now that I know that.