Problem with OpenVPN clients and routing?

cosmoxl

An update on Saturday changed something and I basically need to know if things are buggy or if this is new, expected behavior. Prior to the update on Saturday everything was working well - the same as under 2.3.x.

I run two openvpn clients simultaneously. These are clients connecting to two different VPN services.

The first symptom I noticed was that gateway monitoring couldn't ping an OpenVPN gateway; ping reports that the TTL is exceeded. It seems one of the two always can't be pinged, perhaps whichever one connects second? The gateways are different, of course: 10.30.0.1 and 10.201.0.1.

Secondly, if I change anything in an OpenVPN client's setup, it can't reconnect afterwards, citing a FreeBSD ifconfig error.

Neither of these has ever been a problem before. I understand that instability is part of the beta testing process. But, I'm not experienced enough to know if this is new, intended behavior or if I just need to be patient for bugs to get worked out.

Yes, I was more ambitious than I should have been installing 2.4 beta but I wanted to start testing OpenVPN 2.4. Thanks.

jimp

What version were you on before that update?

I'm not seeing anything in the changelog that looks relevant, and I haven't seen any behavior changes in OpenVPN on any of my 2.4 firewalls here.

cosmoxl

@jimp:

What version were you on before that update?

I'm not seeing anything in the changelog that looks relevant, and I haven't seen any behavior changes in OpenVPN on any of my 2.4 firewalls here.

I was on 2.4 beta for several days with no problem. Then after one of the updates things went wrong. I can't remember the exact build. :)

jimp

You'll need to provide a lot more details about what "went wrong", including specific error messages, log entries, etc, etc.

Also make sure that the update completed all the way. There is another bug we're tracking at the moment where in some rare cases the kernel gets updated but the rest of the OS isn't up-to-date, and a mismatched kernel and world can cause various unpredictable issues.

cosmoxl

@jimp:

You'll need to provide a lot more details about what "went wrong", including specific error messages, log entries, etc, etc.

Also make sure that the update completed all the way. There is another bug we're tracking at the moment where in some rare cases the kernel gets updated but the rest of the OS isn't up-to-date, and a mismatched kernel and world can cause various unpredictable issues.

I'm happy to provide more details. But, I'll need to be coached as to what data you want and perhaps how to get it.

If I recall correctly, this all started after I ran 'pkg upgrade -f' because the GUI update info was saying there was an update to the same version I was supposedly already running. So, I thought I would try to fix it. And here I am now…

All subsequent updates done via the GUI have seemed to proceed correctly. Currently I'm on 2.4.0.b.20170815.0703

jimp

Since you had to take that step of manually using pkg, to be absolutely certain it's not related to a mismatch of some kind, you should reinstall. Thankfully it's super quick to get back to your current setup on 2.4.

1. Back up your config for safety
2. Download and write out a fresh 2.4 snapshot install image
3. Boot the image and choose "Recover config.xml" and then pick your existing installation drive (it will read in your current config and copy it back post-install)
4. Continue through the install and then reboot

It will boot back up with your current configuration, reinstall packages if it needs to, and then you'll be up and running.

If you still have a problem, then start by going to the OpenVPN logs (Status > System Logs, OpenVPN tab) and copy/paste the log here. You can obfuscate your public IP addresses if needed.

cosmoxl

OK, I did the reinstall. Everything came up just right. Very nice.

The state of the system prior to reinstall was that 1 OpenVPN client was running and gateway monitoring was working.

Reboot after reinstall and that continued to work.

However, as soon as I started up a second OpenVPN client, I could no longer ping the gateway of the 1st. Gateway monitoring is now working fine on the 2nd OVPN client.

From the shell I see this regarding OVPN client 1:

ovpnc1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 20000
        options=80000<LINKSTATE>
        inet6 fe80::dacb:8aff:fe70:1374%ovpnc1 prefixlen 64 scopeid 0x7
        inet 10.30.0.2 --> 10.30.0.1  netmask 0xffff0000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: tun openvpn
        Opened by PID 23894

and if I try to ping 10.30.0.1, which I have manually set as the IP to monitor for gateway monitoring, I get:

ping 10.30.0.1
PING 10.30.0.1 (10.30.0.1): 56 data bytes
36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 e3e3   0 0000  01  01 0000 127.0.0.1  10.30.0.1

36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 fdbf   0 0000  01  01 0000 127.0.0.1  10.30.0.1

Really odd that it just worked prior to starting up OVPN client 2.
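
For anyone reading along, the header dump under each "Time to live exceeded" line can be decoded field by field. The sketch below is only an illustration: it assumes the numeric columns are hexadecimal, as in FreeBSD ping's header dump, and the function and field names are made up for this example. It is applied to the first dump above.

```python
# Decode the IPv4 header dump printed by ping under "Time to live exceeded".
# Assumption: every numeric column is hexadecimal.
FIELDS = ["version", "header_len_words", "tos", "total_len", "ident",
          "flags", "frag_offset", "ttl", "protocol", "checksum"]


def decode_header_dump(line):
    """Map the columns of one dump line to named fields (hypothetical helper)."""
    parts = line.split()
    header = {name: int(value, 16) for name, value in zip(FIELDS, parts[:10])}
    header["src"] = parts[10]
    header["dst"] = parts[11]
    return header


hdr = decode_header_dump(
    " 4  5  00 0054 e3e3   0 0000  01  01 0000 127.0.0.1  10.30.0.1"
)
print(hdr["total_len"], hdr["ttl"], hdr["protocol"], hdr["src"], hdr["dst"])
# prints: 84 1 1 127.0.0.1 10.30.0.1
```

The length of 84 bytes matches a 20-byte IP header plus an 8-byte ICMP header plus the 56 data bytes ping reports, and protocol 1 is ICMP. The source of 127.0.0.1, together with the error being reported by localhost, would be consistent with the echo request never leaving the firewall, though that is an interpretation and not something the output proves.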

Below is ifconfig of my second OVPN client:

ovpnc3: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 20000
        options=80000<LINKSTATE>
        inet6 fe80::dacb:8aff:fe70:1374%ovpnc3 prefixlen 64 scopeid 0x8
        inet6 2001:db8:f0:b2::4 prefixlen 64
        inet 10.201.255.1 --> 10.201.0.1  netmask 0xffff0000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: tun openvpn
        Opened by PID 85118

cosmoxl

Another thing I haven't mentioned is that traffic through OVPN gateways that can't be pinged continues to flow. I just have to disable gateway monitoring action. So, it's not actually causing a problem. But, this knowledge might help somebody figure out the problem.

cosmoxl

When I make changes to an OVPN client, this is the relevant part of the log upon reconnection:

Aug 15 20:49:22 	openvpn 	61193 	TUN/TAP device ovpnc1 exists previously, keep at program end
Aug 15 20:49:22 	openvpn 	61193 	TUN/TAP device /dev/tun1 opened
Aug 15 20:49:22 	openvpn 	61193 	do_ifconfig, tt->did_ifconfig_ipv6_setup=0
Aug 15 20:49:22 	openvpn 	61193 	/sbin/ifconfig ovpnc1 10.30.0.13 10.30.0.1 mtu 20000 netmask 255.255.0.0 up
Aug 15 20:49:22 	openvpn 	61193 	FreeBSD ifconfig failed: external program exited with error status: 1
Aug 15 20:49:22 	openvpn 	61193 	Exiting due to fatal error

This was never a problem with pfsense 2.2, 2.3, and my first few days of 2.4. It seems now routes aren't being flushed properly? Or the usage of the route that already exists doesn't work anymore? I don't see this problem if I'm running only 1 OVPN client.
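
When a longer log needs to be searched for the same failure, a small script can pair each "ifconfig failed" line with the ifconfig command logged just before it. This is a rough sketch, not anything pfSense or OpenVPN ships: the regular expression only covers the command shape seen in the excerpt above, and the function name and sample text are illustrative.

```python
import re

# Matches the ifconfig invocation as it appears in the log excerpt above.
IFCONFIG_RE = re.compile(
    r"/sbin/ifconfig (?P<dev>\S+) (?P<local>\S+) (?P<peer>\S+) "
    r"mtu (?P<mtu>\d+) netmask (?P<netmask>\S+) up"
)


def failed_ifconfig_calls(log_text):
    """Return the parsed ifconfig command preceding each 'ifconfig failed' line."""
    failures = []
    last_call = None
    for line in log_text.splitlines():
        match = IFCONFIG_RE.search(line)
        if match:
            last_call = match.groupdict()
        elif "ifconfig failed" in line and last_call is not None:
            failures.append(last_call)
            last_call = None
    return failures


# Sample lines copied from the log above (whitespace simplified).
SAMPLE = """\
Aug 15 20:49:22 openvpn 61193 TUN/TAP device /dev/tun1 opened
Aug 15 20:49:22 openvpn 61193 /sbin/ifconfig ovpnc1 10.30.0.13 10.30.0.1 mtu 20000 netmask 255.255.0.0 up
Aug 15 20:49:22 openvpn 61193 FreeBSD ifconfig failed: external program exited with error status: 1
"""

for call in failed_ifconfig_calls(SAMPLE):
    print(call["dev"], call["local"], call["peer"], call["netmask"])
# prints: ovpnc1 10.30.0.13 10.30.0.1 255.255.0.0
```

Comparing the extracted local address with what ifconfig currently shows on the same device (10.30.0.2 in the earlier output, 10.30.0.13 here) may help show whether the failure coincides with the pushed address changing between connections.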

JeGr

Aug 15 20:49:22 openvpn 61193 /sbin/ifconfig ovpnc1 10.30.0.13 10.30.0.1 mtu 20000 netmask 255.255.0.0 up

You sure that is correct? Is that happening more than once? Seems to me that the config is bonkers as an MTU of 20000 makes no sense to me?!

johnpoz

Yeah that mtu seems a bit high ;) and the mask as well.. /16 on a vpn interface?

Here example from my log for bringing up a vpn interface
/sbin/ifconfig ovpns2 10.0.200.1 10.0.200.2 mtu 1500 netmask 255.255.255.0 up

cosmoxl

Yes, I've set the MTU high based on some other reading I've done which indicated high MTU sped up encrypt/decrypt. I've used that for about a year now with no problem. Just for kicks I removed the tun-mtu 20000 directive and it does not fix the problems I'm having.

The 10.30.0.1 VPN is AirVPN, a quality, reputable VPN provider. What mask they push is what they push. :)
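
As a quick sanity check on the mask question, Python's standard ipaddress module can show which networks the two tunnel addresses fall into. This is just a sketch using the addresses from the ifconfig output earlier in the thread; the variable names are arbitrary.

```python
import ipaddress

# 0xffff0000, as shown by ifconfig, is the same mask as 255.255.0.0 (a /16).
netmask = ipaddress.ip_address(int("0xffff0000", 16))

tun1 = ipaddress.ip_interface(f"10.30.0.13/{netmask}")
tun2 = ipaddress.ip_interface(f"10.201.255.1/{netmask}")

print(netmask)                                            # 255.255.0.0
print(tun1.network, tun2.network)                         # 10.30.0.0/16 10.201.0.0/16
print(tun1.network.overlaps(tun2.network))                # False
print(ipaddress.ip_address("10.30.0.1") in tun1.network)  # True
```

The two /16 networks do not overlap, so the mask on its own should not make the tunnels collide, although this says nothing about how the routes actually get installed.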

Are others running 2 openvpn clients with no problem and I'm the only one?

chpalmer

@cosmoxl:

Are others running 2 openvpn clients with no problem and I'm the only one?

I'm running 6 servers here right now. I have one machine with one server and one client.

All my tunnels are 10.10.1.x/30

johnpoz

"I've done which indicated high MTU sped up encrypt/decrypt"

What?? Where did you read such a thing?

cosmoxl

@johnpoz:

"I've done which indicated high MTU sped up encrypt/decrypt"

What?? Where did you read such a thing?

Some time ago I came across an article on some testing done on high throughput openvpn. I think this may have been it. https://community.openvpn.net/openvpn/wiki/Gigabit_Networks_Linux

johnpoz

"For a LAN-based setup this can work, but when handling various types of remote users (road warriors, cable modem users, etc) this is not always a possibility. "

So this is a LAN based setup?

cosmoxl

@johnpoz:

"For a LAN-based setup this can work, but when handling various types of remote users (road warriors, cable modem users, etc) this is not always a possibility. "

So this is a LAN based setup?

No, but in the testing I've done I have seen some small improvement in performance with that setting.

Anyway, this is getting off topic. As I've tried to reiterate, this setting I've used for quite some time. It doesn't cause the problem nor does its removal fix the problem.

johnpoz

src 127.0.0.1 doesn't seem right.. Shouldn't the source be your IP on this side of the tunnel?

cosmoxl

@johnpoz:

src 127.0.0.1 doesn't seem right.. Shouldn't the source be your IP on this side of the tunnel?

That is from the command line of the firewall which has a NAT rule to access all VPN tunnels. This should simulate what gateway monitoring does, right?

The NAT outbound rules allow the firewall, 127.0.0.0/8, out to each VPN interface.

Just for testing purposes I made all those NAT rules as "this firewall" out to each interface, instead of 127.0.0.0/8

The same problem persists.

As soon as I even enable the gateway (system_gateways.php) of another openvpn client (I didn't even start the tunnel), I'm suddenly unable to ping the other side of the VPN tunnel that's already up.

ovpnc1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 20000
        options=80000<LINKSTATE>
        inet6 fe80::dacb:8aff:fe70:1374%ovpnc1 prefixlen 64 scopeid 0x7
        inet 10.30.0.13 --> 10.30.0.1  netmask 0xffff0000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: tun openvpn
        Opened by PID 86920
[2.4.0-BETA][removed]/root: ping 10.30.0.1
PING 10.30.0.1 (10.30.0.1): 56 data bytes
64 bytes from 10.30.0.1: icmp_seq=0 ttl=64 time=21.782 ms
64 bytes from 10.30.0.1: icmp_seq=1 ttl=64 time=21.251 ms
^C
--- 10.30.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 21.251/21.517/21.782/0.266 ms
[2.4.0-BETA][removed]/root: ping 10.30.0.1
PING 10.30.0.1 (10.30.0.1): 56 data bytes
36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 781b   0 0000  01  01 0000 127.0.0.1  10.30.0.1

36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 631d   0 0000  01  01 0000 127.0.0.1  10.30.0.1

cosmoxl

I went back to 2.3.4p1 and I have no more problems. Also please remember I didn't have problems for several days on 2.4. One of the 2.4 updates broke things. I hope it can be found and fixed.