OpenVPN broken since pfSense 2.1.1
-
Great, we've reached step 1!
My suggestion would be to let it run a day or so (until you're happy it's OK), then try and break it again ;)
Actually, the previous post from cmb:
@cmb:
If you don't have any gateway alarms, there is a potential second cause I just fixed. Edit /etc/rc.newwanip and find where it has curnwanip in there. Replace that with curwanip (just remove the n), save, might want to reboot afterwards just to be 100% sure nothing is using the old code. Be careful when editing any code like that. That made things think your WAN IP had changed in cases where it hadn't, so it did things like restart VPNs where it was unnecessary.
Looks veeery much like the real culprit to your (and perhaps others' problems). That being said, I think it's important to keep your changes down to one at a time. So my suggestion is to leave it run with the "latency fix" for tonight/tomorrow (whatever you feel "proves" the fix) and then back the latency fix out (reset the values back to nothing/default).
Then apply the "rc.newwanip" fix and see if the problem stays away. If so I think that would be a great indication that you've isolated out an insidious little bug :-\
Perhaps cmb (or someone appropriate) could be talked into turning this into a patch?
Thanks for all the work on this guys :)
-
Last night's connections were definitely improved… but still not perfect.
Gateway log
May 20 23:56:37 apinger: ALARM: DSL1GW([REDACTED]) *** DSL1GWdelay *** May 20 23:56:52 apinger: alarm canceled: DSL1GW([REDACTED]) *** DSL1GWdelay *** May 20 23:58:19 apinger: alarm DSL1GW([REDACTED])canceled: WAN_DHCP([WAN-GW REDACTED]) *** WAN_DHCPdelay *** May 20 23:58:21 apinger: alarm DSL1GW([REDACTED])canceled: WAN_DHCP([WAN-GW REDACTED]) *** delay *** May 20 23:59:47 apinger: ALARM: DSL1GW([REDACTED]) *** DSL1GWdelay *** May 21 00:00:04 apinger: alarm canceled: DSL1GW([REDACTED]) *** DSL1GWdelay ***
OpenVPN log
May 20 22:54:19 openvpn[92487]: VPN-User/[REMOTE-VPN]:63582 send_push_reply(): safe_cap=940 May 20 23:54:23 openvpn: user 'VPN-User' authenticated May 20 23:56:49 openvpn[92487]: event_wait : Interrupted system call (code=4) May 20 23:56:49 openvpn[92487]: /usr/local/sbin/ovpn-linkdown ovpns1 1500 1558 192.168.129.1 192.168.129.2 init May 20 23:56:49 openvpn[92487]: SIGTERM[hard,] received, process exiting May 20 23:56:49 openvpn[36824]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 May 20 23:56:49 openvpn[36824]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts May 20 23:56:49 openvpn[36824]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file May 20 23:56:49 openvpn[36824]: TUN/TAP device ovpns1 exists previously, keep at program end May 20 23:56:49 openvpn[36824]: TUN/TAP device /dev/tun1 opened May 20 23:56:49 openvpn[36824]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0 May 20 23:56:49 openvpn[36824]: /sbin/ifconfig ovpns1 192.168.129.1 192.168.129.2 mtu 1500 netmask 255.255.255.255 up May 20 23:56:50 openvpn[36824]: /usr/local/sbin/ovpn-linkup ovpns1 1500 1558 192.168.129.1 192.168.129.2 init May 20 23:56:50 openvpn[38144]: UDPv4 link local (bound): [AF_INET][DSL1-IP]:1194 May 20 23:56:50 openvpn[38144]: UDPv4 link remote: [undef] May 20 23:56:50 openvpn[38144]: Initialization Sequence Completed May 20 23:57:04 openvpn[38144]: event_wait : Interrupted system call (code=4) May 20 23:57:04 openvpn[38144]: /usr/local/sbin/ovpn-linkdown ovpns1 1500 1558 192.168.129.1 192.168.129.2 init May 20 23:57:04 openvpn[38144]: SIGTERM[hard,] received, process exiting May 20 23:57:05 openvpn[28211]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 May 20 23:57:05 openvpn[28211]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts May 20 23:57:05 openvpn[28211]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file May 20 23:57:05 openvpn[28211]: TUN/TAP device ovpns1 exists previously, keep at program end May 20 23:57:05 openvpn[28211]: TUN/TAP device /dev/tun1 opened May 20 23:57:05 openvpn[28211]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0 May 20 23:57:05 openvpn[28211]: /sbin/ifconfig ovpns1 192.168.129.1 192.168.129.2 mtu 1500 netmask 255.255.255.255 up May 20 23:57:05 openvpn[28211]: /usr/local/sbin/ovpn-linkup ovpns1 1500 1558 192.168.129.1 192.168.129.2 init May 20 23:57:05 openvpn[30193]: UDPv4 link local (bound): [AF_INET][DSL1-IP]:1194 May 20 23:57:05 openvpn[30193]: UDPv4 link remote: [undef] May 20 23:57:05 openvpn[30193]: Initialization Sequence Completed May 20 23:57:54 openvpn: user 'VPN-User' authenticated May 20 23:57:54 openvpn[30193]: [REMOTE-VPN]:50268 [VPN-User] Peer Connection Initiated with [AF_INET][REMOTE-VPN]:50268 May 20 23:57:54 openvpn[30193]: VPN-User/[REMOTE-VPN]:50268 MULTI_sva: pool returned IPv4=192.168.129.6, IPv6=(Not enabled) May 20 23:57:56 openvpn[30193]: VPN-User/[REMOTE-VPN]:50268 send_push_reply(): safe_cap=940 May 20 23:59:59 openvpn[30193]: event_wait : Interrupted system call (code=4) May 20 23:59:59 openvpn[30193]: /usr/local/sbin/ovpn-linkdown ovpns1 1500 1558 192.168.129.1 192.168.129.2 init May 20 23:59:59 openvpn[30193]: SIGTERM[hard,] received, process exiting May 21 00:00:00 openvpn[59943]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 May 21 00:00:00 openvpn[59943]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts May 21 00:00:00 openvpn[59943]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file May 21 00:00:00 openvpn[59943]: TUN/TAP device ovpns1 exists previously, keep at program end May 21 00:00:00 openvpn[59943]: TUN/TAP device /dev/tun1 opened May 21 00:00:00 openvpn[59943]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0 May 21 00:00:00 openvpn[59943]: /sbin/ifconfig ovpns1 192.168.129.1 192.168.129.2 mtu 1500 netmask 255.255.255.255 up May 21 00:00:00 openvpn[59943]: /usr/local/sbin/ovpn-linkup ovpns1 1500 1558 192.168.129.1 192.168.129.2 init May 21 00:00:00 openvpn[61620]: UDPv4 link local (bound): [AF_INET][DSL1-IP]:1194 May 21 00:00:00 openvpn[61620]: UDPv4 link remote: [undef] May 21 00:00:00 openvpn[61620]: Initialization Sequence Completed
Since it is catching issues on the regular WAN as well, I think I need to apply the /etc/rc.newwanip fix and see what happens. I think my few problems were just internet latency issues (ISP, traffic, whatever)… but I'd like to try the other solution separately to see the result.
Thanks everyone for all your input and help.
-nb -
Hi,
I had similar issues. Adjusting the gateway monitoring issues seem to work for me as well.
Thanks,
Sean -
@cmb:
If you don't have any gateway alarms, there is a potential second cause I just fixed. Edit /etc/rc.newwanip and find where it has curnwanip in there. Replace that with curwanip (just remove the n), save, might want to reboot afterwards just to be 100% sure nothing is using the old code. Be careful when editing any code like that. That made things think your WAN IP had changed in cases where it hadn't, so it did things like restart VPNs where it was unnecessary.
Early indications (meaning: I applied this fix and rebooted, so exactly one data point) are that this fixed the EADDRINUSE bug that was causing my Openvpn (and that of others) to show up as "not running" in the gui even though it was (because somehow the pid in the pid file was wrong).
-
…what really is unpleasant is the time till the pfSense starts to use the new IP from DynDNS to re-establish the tunnel after a new IP has been given by the provider.
Right now, the new IP was at the DynDNS service at 21:49:18, but the box on the other side tried 6 times with the old IP to establish the tunnel and only at the 7th time (at 21:54:24, i.e. about 5 minutes after the update of the IP at the DynDNS) the tunnel came back...
-
@chemlud:
…what really is unpleasant is the time till the pfSense starts to use the new IP from DynDNS to re-establish the tunnel after a new IP has been given by the provider.
Right now, the new IP was at the DynDNS service at 21:49:18, but the box on the other side tried 6 times with the old IP to establish the tunnel and only at the 7th time (at 21:54:24, i.e. about 5 minutes after the update of the IP at the DynDNS) the tunnel came back...
OpenVPN will attempt reconnect once a minute, and will succeed as soon as its DNS lookup obtains the new IP. That's dependent on the TTL of your dyndns provider, and whether or not the DNS servers you're using obey the TTL to being with. OpenVPN will successfully reconnect within 1 minute at most of your DNS record's TTL expiring. From that, it sounds like you have a 5 minute TTL on your dyndns.
The best case scenario is the DNS TTL plus up to 1 minute. Cases where up to a ~2 minute outage in the case of an IP change server-side aren't acceptable shouldn't be using dynamic IPs on the server side (doesn't matter client-side). You can get it faster than 6 minutes for sure, but potentially up to 2 minutes is the best case scenario.
-
Many thanks for the detailed explanation, the TTL is crucial, I understand that fully… When I have a little time I will switch server and client for some tunnels... ;)
-
Update: Yesterday tunnels were stable for 24 hrs (since the last IP change on one tunnel end), no mysterious restarts of packages.
But 3 minutes after successful re-connect yesterday evening (change of IP on the other end of the tunnel), suddenly the Syslog has a
"check_reload_status: updating dyndns WAN_DHCP"
although the IP has NOT changed for this connection and the same second the box restarts IPsec/openVPN tunnels, reloads filters, restarts snort, the full program…
ADD: It was an apinger alarm delay, again…. :-\
Is it possible that there is an additional bug in the dyndns-updating script?
At another pfSense, where the IP does not change frequently I get a daily:
"php:rc.dyndns.update: phpDynDNS (): No change in my IP address and/or 25 days not passed. Not updating dynamic DNS entry"
in the Syslog.
![restart tunnels packages 22.05.2014.JPG](/public/imported_attachments/1/restart tunnels packages 22.05.2014.JPG)
![restart tunnels packages 22.05.2014.JPG_thumb](/public/imported_attachments/1/restart tunnels packages 22.05.2014.JPG_thumb) -
@cmb:
If you don't have any gateway alarms, there is a potential second cause I just fixed. Edit /etc/rc.newwanip and find where it has curnwanip in there. Replace that with curwanip (just remove the n), save, might want to reboot afterwards just to be 100% sure nothing is using the old code. Be careful when editing any code like that. That made things think your WAN IP had changed in cases where it hadn't, so it did things like restart VPNs where it was unnecessary.
I just tried reverting the latency settings for the gateway back to default and applying the change to /etc/rc/newwanip
Looks like that setting alone is not enough to fix the problem.
I'll add the latency settings back to the gateway.
-nb![DSL1 still choking.jpg](/public/imported_attachments/1/DSL1 still choking.jpg)
![DSL1 still choking.jpg_thumb](/public/imported_attachments/1/DSL1 still choking.jpg_thumb) -
It looks like i'm in the same boat as you guys … Keep loosing my VPN tunnel ... I'll try to change the latency settings + edit that file...
Will see how it goes...
-
Just to add something… It looks like the tunnel drop each hours for me ... When it drops, I need to restart the OpenVPN instance ... I'll post some logs later, Maybe I have another issue....
-
Hello!
Same for me - if I remember right - the problem started with 2.1.1 - OpenVPN-Tunnles crash on heavy throughput.
openvpn[13066]: event_wait : Interrupted system call (code=4)
Bye,
eweri -
I'm still having occasional problems with bandwidth over OpenVPN links, even after applying the update to rc.newwanip and increasing my latency thresholds to 1000/2000ms.
Ugh.
I'll have to start digging around through logs again next week.
-
Check if you have the same MTU on both ends.
Maybe you should lower MTU a bit, perhaps the route between the two ends caps the package size, and fragmentation occurs, and it gets unhandled properly. -
What I noted earlier in this thread, and here:
https://forum.pfsense.org/index.php?topic=76975.msg426742#msg426742will fix the "event_wait : Interrupted system call". specifics discussed there.
-
I believe I'm also having this issue. I was seeing the same Interrupt messages until I put in the latency fix mentioned earlier. Now I see the below in the logs. What I don't understand is why are both of my OpenVPN Client Gateways showing an IP address (that they should get from the OpenVPN server), and yet, both gateways show as down under STATUS>OpenVPN? I'm running 2.1.4 i386. Thanks!
Aug 3 18:27:36 openvpn[94132]: UDPv4 link remote: [AF_INET]OpenVPN_Server:1194 Aug 3 18:27:36 openvpn[94132]: UDPv4 link local (bound): [AF_INET]WAN_IP Aug 3 18:27:36 openvpn[94132]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Aug 3 18:27:36 openvpn[94132]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Aug 3 18:27:34 openvpn[67441]: UDPv4 link remote: [AF_INET]OpenVPN_Server:1194 Aug 3 18:27:34 openvpn[67441]: UDPv4 link local (bound): [AF_INET]WAN_IP Aug 3 18:27:34 openvpn[67441]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Aug 3 18:27:34 openvpn[67441]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Aug 3 18:27:34 openvpn[94132]: SIGUSR1[soft,ping-restart] received, process restarting Aug 3 18:27:34 openvpn[94132]: [UNDEF] Inactivity timeout (--ping-restart), restarting Aug 3 18:27:32 openvpn[67441]: SIGUSR1[soft,ping-restart] received, process restarting Aug 3 18:27:32 openvpn[67441]: [UNDEF] Inactivity timeout (--ping-restart), restarting