One external IP is being (wrongly) routed to OpenVPN

MTHead

I should give a little extra background: I only discovered the OpenVPN tie-in after barking up many wrong trees. I don't understand the connection at all…

The client is a doctor's office, and Palmetto is the Medicare carrier for our region, so the office accesses the website every day. This had always worked until Monday morning. I set up an OpenVPN road-warrior tunnel at the same time that I installed pfSense (about six months ago), and there hadn't been any previous problems.

Suddenly, on Monday, the office could not reach the Palmetto site. Besides being unable to browse the site, I could not ping it from the client machines, from the pfSense shell (via PuTTY), or from Diagnostics/Ping in the WebGUI. I have several other clients in the same medical building who also use pfSense, and I was able to ping Palmetto from their pfSense boxes, so initially I blamed the ISP. The ISP's tech support instructed me to bypass the router and plug my laptop directly into the T1… and imagine how foolish I felt when it worked!

However, I still couldn't find anything in the pfSense configuration that would cause such a result - all packets destined for Palmetto seemed to vanish into thin air. (And ONLY Palmetto - all other sites seem to be just fine, and even an IP address that's off-by-one from the Palmetto address works.)

Finally, this morning I logged in via PuTTY from home and got the result that I posted earlier. It hadn't even crossed my mind that OpenVPN might be involved, as this seemed to be a strictly internal issue (and I hadn't made any changes recently).
It makes no difference whether anyone is connected via OpenVPN or not; if the tunnel is enabled, "arp who-has Palmetto" gets the answer "OpenVPN has it" - and presumably, the packet then gets sent through OpenVPN and into a black hole (especially if no-one is connected at the time!) If I disable the tunnel, packets intended for Palmetto are routed through the WAN interface as they should be, and the ping succeeds. I could delete the tunnel and re-create it, but I would rather not (unless I were sure that it would fix the problem) because I dread the thought of re-creating and re-distributing certificates...

In any case, I think I've looked in all the usual places. I was hoping that someone would have some ideas/experience of UNusual places to look?

danswartz

I am confused by the client arping for an address not on its subnet and getting a reply. This may be some openvpn oddity, though. Have you looked at the openvpn config (/conf/config.xml) to see if that address shows up?

jimp

Yes, the contents of the OpenVPN client (on windows) config and the server on pfsense (/var/etc/openvpn_*.conf) might also help shed some light on the situation.

MTHead

@jimp:

Yes, the contents of the OpenVPN client (on windows) config and the server on pfsense (/var/etc/openvpn_*.conf) might also help shed some light on the situation.

Windows config file (tz.ovpn):


client
dev tap
dev-node OpenVPN
proto udp
remote OurPublicIP 1194
resolv-retry infinite
nobind
persist-key
persist-tun
mute-replay-warnings

ca tz-ca.crt
cert tz-marc-hp.crt
key tz-marc-hp.key

ns-cert-type server
cipher BF-CBC
comp-lzo
verb 3

cat openvpn_server0.conf


writepid /var/run/openvpn_server0.pid
#user nobody
#group nobody
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
dev tun
proto udp
cipher BF-CBC
up /etc/rc.filter_configure
down /etc/rc.filter_configure
tls-server
ifconfig 192.168.35.1 192.168.35.2
push "route 192.168.254.0 255.255.255.0"
lport 1194
push "dhcp-option DNS 192.168.254.254"
push "dhcp-option NBT 4"
ca /var/etc/openvpn_server0.ca
cert /var/etc/openvpn_server0.cert
key /var/etc/openvpn_server0.key
dh /var/etc/openvpn_server0.dh
comp-lzo
dev tap0
server-bridge 192.168.254.254 255.255.255.0 192.168.254.150 192.168.254.160

MTHead

@danswartz:

I am confused by the client arping for an address not on its subnet and getting a reply. This may be some openvpn oddity, though.

I'm not sure what you mean by "the client" in this context. I was pinging from the pfSense shell… I do wonder where the reply came from. Is there any switch I can use with tcpdump (or another tool) to find out?

@danswartz:

Have you looked at the openvpn config (/conf/config.xml) to see if that address shows up?

Yes I have, and no it doesn't. In fact, even the first octet doesn't appear anywhere in the file. I can post it if you'd like, but it would obviously require some obfuscation.

I think the answer lies in the fact that somebody somewhere on the network is answering that ARP who-has, and if I can find out who it is (and stop it) the problem will be solved. Perhaps I'm in for a day of good old-fashioned unplugging and re-plugging… I'm hoping for a more modern answer, though.

danswartz

Sorry, I misread who was doing the ARP. You could try 'tcpdump -lnvvv' instead. What I think must be going on is that the local IP (192.168.x.x) would not normally be ARPing for the Palmetto address, much less getting a reply, so this must be coming to/from the tunnel. What does 'clog /var/log/openvpn.log' show?

jimp

Why are you using "dev tap" on the client? That is probably your problem. And why the server-bridge line on the server side?

iirc, tap devices are more for bridging than routing. You probably want that to be tun.

If you are trying to bridge, you want that to be tap on both sides, but I'm not familiar with a bridged setup.

danswartz

Good point about tap vs tun.

MTHead

@jimp:

Why are you using "dev tap" on the client? That is probably your problem. And why the server-bridge line on the server side?

iirc, tap devices are more for bridging than routing. You probably want that to be tun.

If you are trying to bridge, you want that to be tap on both sides, but I'm not familiar with a bridged setup.

I'm using tap intentionally because I do use bridging. (It's not necessary at this office, but I have been using a standard configuration across all of my installations - much less hassle that way!) One of my offices has a permanent (or, at least, we WISH it were permanent) IPSec tunnel to a hosting company. The hosting company's road-warrior IPSec VPN (from a company that rhymes with Crisco) is much less reliable than the site-to-site, so I decided to have my road warriors connect to the office network and access the hosted network via bridging. It's been very stable and very easy to add/remove clients, as opposed to dealing with the hosting people… Once I got that set up and working, I standardized my OpenVPN setup; this particular office doesn't currently need bridging, but it's been working smoothly.

I can change the OpenVPN setup to use tun instead; again, I'd prefer not to unless I'm sure it will fix this problem, because it means modifying all the client machines (most of which are at the doctors'/employees' houses). I have this same setup in nearly a dozen offices, running for over two years now, and this is the first time anything like this has come up. Perhaps this is a hidden danger of using bridging, in which case I should expect to get hit with it at my other sites sooner or later; meanwhile, I'd like to get to the bottom of why it happened here.

It's persistent across reboots.
It doesn't matter whether any OpenVPN clients are connected or not.
As far as I can tell, it's only this one IP address that's affected. (Google.com and forum.pfsense.org, for instance, work just fine.)

These things make me think that pfSense has stored this (bad) routing information somewhere. At this point it's more of an intellectual curiosity to me to find out where. Should I perhaps ask on a FreeBSD forum?

MTHead

@danswartz:

Sorry, I misread who was doing the ARP. You could try 'tcpdump -lnvvv' instead. What I think must be going on is that the local IP (192.168.x.x) would not normally be ARPing for the Palmetto address, much less getting a reply, so this must be coming to/from the tunnel. What does 'clog /var/log/openvpn.log' show?


# clog /var/log/openvpn.log
Jan  1 22:55:25 pfsense openvpn[345]: OpenVPN 2.0.6 i386-portbld-freebsd7.2 [SSL] [LZO] built on Dec  4 2009
Jan  1 22:55:25 pfsense openvpn[345]: WARNING: file '/var/etc/openvpn_server0.key' is group or others accessible
Jan  1 22:55:25 pfsense openvpn[345]: WARNING: Since you are using --dev tap, the second argument to --ifconfig must be a netmask, for example something like 255.255.255.0. (silence this warning with --ifconfig-nowarn)
Jan  1 22:55:25 pfsense openvpn[345]: TUN/TAP device /dev/tap0 opened
Jan  1 22:55:25 pfsense openvpn[345]: /sbin/ifconfig tap0 192.168.35.1 netmask 192.168.35.2 mtu 1500 up
Jan  1 22:55:25 pfsense openvpn[345]: /etc/rc.filter_configure tap0 1500 1574 192.168.35.1 192.168.35.2 init
Jan  1 22:55:25 pfsense openvpn[352]: UDPv4 link local (bound): [undef]:1194
Jan  1 22:55:25 pfsense openvpn[352]: UDPv4 link remote: [undef]
Jan  1 22:55:25 pfsense openvpn[352]: Initialization Sequence Completed
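
One detail in this log may be worth working through by hand. Because the device is tap, the second --ifconfig argument was applied as a netmask, so tap0 came up as 192.168.35.1 with netmask 192.168.35.2, which is not a contiguous mask. The sketch below is only arithmetic on the values shown in the log and in the static route posted later in this thread (216.251.231.64); the helper names are made up for illustration, and whether the FreeBSD routing code actually treats such a mask this way is an assumption, not something the thread confirms.

```python
import ipaddress


def to_int(addr: str) -> int:
    """Convert a dotted-quad IPv4 address (or mask) to a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def is_contiguous(mask: str) -> bool:
    """True if the mask is a run of 1 bits followed only by 0 bits."""
    inverted = ~to_int(mask) & 0xFFFFFFFF
    # A valid inverted mask has the form 2**k - 1, so adding 1 clears every bit.
    return (inverted & (inverted + 1)) == 0


def on_link(addr: str, iface_addr: str, mask: str) -> bool:
    """True if addr falls in the 'subnet' implied by iface_addr and mask."""
    m = to_int(mask)
    return (to_int(addr) & m) == (to_int(iface_addr) & m)


# Values taken from the ifconfig line in the log above.
TAP_ADDR = "192.168.35.1"
TAP_MASK = "192.168.35.2"  # really the tun peer address, misused as a netmask

print("mask contiguous:", is_contiguous(TAP_MASK))
for candidate in ("216.251.231.63", "216.251.231.64", "216.251.231.65"):
    print(candidate, "matches tap0:", on_link(candidate, TAP_ADDR, TAP_MASK))
```

Under that assumption, 216.251.231.64 ANDed with 192.168.35.2 gives 192.168.35.0, the same result as for tap0's own address, so the Palmetto address would look directly attached to tap0 and be ARPed for there. Of its two neighbours, .63 does not match while .65 does; the thread does not say which off-by-one address was tested.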

MTHead

Just thought I'd post the eventual solution, in case anyone else ever has the same problem. I added a static route:

Interface  Network            Gateway        Description
WAN        216.251.231.64/32  (our gateway)  Palmetto

In other words, I added an explicit route to reinforce what should be happening anyway. And now it works. What caused the original problem, I don't know…
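
A rough way to picture why a /32 static route can override a stray match is a "most specific route wins" lookup. The sketch below is a simplified model, not FreeBSD's actual routing lookup: the table layout, the `lookup` helper, and the rule that the route with the most mask bits wins are all assumptions made for illustration. The tap0 entry uses the address and netmask from the ifconfig line in the OpenVPN log earlier in the thread.

```python
import ipaddress


def to_int(addr: str) -> int:
    """Convert a dotted-quad IPv4 address (or mask) to a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def matches(dest: str, network: str, mask: str) -> bool:
    """True if dest agrees with network on every bit set in mask."""
    m = to_int(mask)
    return (to_int(dest) & m) == (to_int(network) & m)


def lookup(dest: str, routes: list) -> str:
    """Return the interface of the matching route with the most mask bits."""
    candidates = [r for r in routes if matches(dest, r[0], r[1])]
    best = max(candidates, key=lambda r: bin(to_int(r[1])).count("1"))
    return best[2]


# (network, mask, interface); a deliberately tiny, hypothetical table.
BASE_ROUTES = [
    ("0.0.0.0", "0.0.0.0", "WAN"),             # default route
    ("192.168.35.1", "192.168.35.2", "tap0"),  # as configured in the log
]
STATIC_ROUTE = ("216.251.231.64", "255.255.255.255", "WAN")  # the /32 above

PALMETTO = "216.251.231.64"
print("without static route:", lookup(PALMETTO, BASE_ROUTES))
print("with static route:   ", lookup(PALMETTO, BASE_ROUTES + [STATIC_ROUTE]))
```

In this model the lookup picks tap0 before the static route is added and WAN afterwards, because a host route matches on all 32 bits and so outranks any other entry.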