OpenVPN keeps dropping out



  • Hello !!!  :)

    I have a problem where users VPN drops out, this happens every 3-4 hours and lasts a few seconds before they're reconnected. The rest of the firewall continues to work fine, it only appears to be the VPN.

    This affects 5 users connected from 3 different locations.

    The pfSense version is 2.1.3.

    Can someone take a look and point me in the right direction please ? :)

    This is the system/general logs

    Jun 24 06:56:52	check_reload_status: updating dyndns GW_WAN_2
    Jun 24 06:56:52	check_reload_status: Restarting ipsec tunnels
    Jun 24 06:56:52	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Jun 24 06:56:52	check_reload_status: Reloading filter
    Jun 24 06:56:54	php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use GW_WAN_2.
    Jun 24 06:56:54	php: rc.openvpn: OpenVPN: Resync server1
    Jun 24 06:56:54	kernel: ovpns1: link state changed to DOWN
    Jun 24 06:56:54	check_reload_status: Reloading filter
    Jun 24 06:56:55	kernel: ovpns1: link state changed to UP
    Jun 24 06:56:55	check_reload_status: rc.newwanip starting ovpns1
    Jun 24 06:56:57	php: rc.newwanip: rc.newwanip: Informational is starting ovpns1.
    Jun 24 06:56:57	php: rc.newwanip: rc.newwanip: on (IP address: 10.0.0.1) (interface: []) (real interface: ovpns1).
    Jun 24 06:56:57	php: rc.newwanip: pfSense package system has detected an ip change -> 10.0.0.1 ... Restarting packages.
    Jun 24 06:56:57	check_reload_status: Starting packages
    Jun 24 06:56:59	php: rc.start_packages: Restarting/Starting all packages.
    Jun 24 06:57:37	check_reload_status: updating dyndns GW_WAN_2
    Jun 24 06:57:37	check_reload_status: Restarting ipsec tunnels
    Jun 24 06:57:37	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Jun 24 06:57:37	check_reload_status: Reloading filter
    Jun 24 06:57:39	php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use GW_WAN_2.
    Jun 24 06:57:40	php: rc.openvpn: OpenVPN: Resync server1
    Jun 24 06:57:40	kernel: ovpns1: link state changed to DOWN
    Jun 24 06:57:40	check_reload_status: Reloading filter
    Jun 24 06:57:40	kernel: ovpns1: link state changed to UP
    Jun 24 06:57:40	check_reload_status: rc.newwanip starting ovpns1
    Jun 24 06:57:42	php: rc.newwanip: rc.newwanip: Informational is starting ovpns1.
    Jun 24 06:57:42	php: rc.newwanip: rc.newwanip: on (IP address: 10.0.0.1) (interface: []) (real interface: ovpns1).
    Jun 24 06:57:42	php: rc.newwanip: pfSense package system has detected an ip change -> 10.0.0.1 ... Restarting packages.
    Jun 24 06:57:42	check_reload_status: Starting packages
    Jun 24 06:57:45	php: rc.start_packages: Restarting/Starting all packages.
    Jun 24 06:57:54	kernel: pid 51257 (ntpd), uid 0: exited on signal 11 (core dumped)
    

    This is the OpenVPN logs

    Jun 24 06:56:54	openvpn[28777]: event_wait : Interrupted system call (code=4)
    Jun 24 06:56:54	openvpn[28777]: /usr/local/sbin/ovpn-linkdown ovpns1 1500 1558 10.0.0.1 10.0.0.2 init
    Jun 24 06:56:54	openvpn[28777]: SIGTERM[hard,] received, process exiting
    Jun 24 06:56:55	openvpn[2918]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014
    Jun 24 06:56:55	openvpn[2918]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
    Jun 24 06:56:55	openvpn[2918]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file
    Jun 24 06:56:55	openvpn[2918]: TUN/TAP device ovpns1 exists previously, keep at program end
    Jun 24 06:56:55	openvpn[2918]: TUN/TAP device /dev/tun1 opened
    Jun 24 06:56:55	openvpn[2918]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0
    Jun 24 06:56:55	openvpn[2918]: /sbin/ifconfig ovpns1 10.0.0.1 10.0.0.2 mtu 1500 netmask 255.255.255.255 up
    Jun 24 06:56:55	openvpn[2918]: /usr/local/sbin/ovpn-linkup ovpns1 1500 1558 10.0.0.1 10.0.0.2 init
    Jun 24 06:56:55	openvpn[4001]: UDPv4 link local (bound): [AF_INET]xx.xx.xxx.xx:xxxx
    Jun 24 06:56:55	openvpn[4001]: UDPv4 link remote: [undef]
    Jun 24 06:56:55	openvpn[4001]: Initialization Sequence Completed
    Jun 24 06:57:40	openvpn[4001]: event_wait : Interrupted system call (code=4)
    Jun 24 06:57:40	openvpn[4001]: /usr/local/sbin/ovpn-linkdown ovpns1 1500 1558 10.0.0.1 10.0.0.2 init
    Jun 24 06:57:40	openvpn[4001]: SIGTERM[hard,] received, process exiting
    Jun 24 06:57:40	openvpn[37329]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014
    Jun 24 06:57:40	openvpn[37329]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
    Jun 24 06:57:40	openvpn[37329]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file
    Jun 24 06:57:40	openvpn[37329]: TUN/TAP device ovpns1 exists previously, keep at program end
    Jun 24 06:57:40	openvpn[37329]: TUN/TAP device /dev/tun1 opened
    Jun 24 06:57:40	openvpn[37329]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0
    Jun 24 06:57:40	openvpn[37329]: /sbin/ifconfig ovpns1 10.0.0.1 10.0.0.2 mtu 1500 netmask 255.255.255.255 up
    Jun 24 06:57:40	openvpn[37329]: /usr/local/sbin/ovpn-linkup ovpns1 1500 1558 10.0.0.1 10.0.0.2 init
    Jun 24 06:57:40	openvpn[38397]: UDPv4 link local (bound): [AF_INET]xx.xx.xx.xx:xxxx
    Jun 24 06:57:40	openvpn[38397]: UDPv4 link remote: [undef]
    Jun 24 06:57:40	openvpn[38397]: Initialization Sequence Completed
    Jun 24 06:57:54	openvpn: user 'xxx' authenticated
    Jun 24 06:57:54	openvpn[38397]: xx.xx.xx.xx:xxxx [xxx] Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:xxxx
    Jun 24 06:57:54	openvpn[38397]: xxx/xx.xx.xx.xx:xxxx MULTI_sva: pool returned IPv4=10.0.0.6, IPv6=(Not enabled)
    Jun 24 06:57:56	openvpn: user 'xxx' authenticated
    Jun 24 06:57:56	openvpn[38397]: xx.xx.xx.xx:xxxxx [xxx] Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:xxxx
    Jun 24 06:57:56	openvpn[38397]: xxx/xx.xx.xx.xx:xxxxx MULTI_sva: pool returned IPv4=10.0.0.10, IPv6=(Not enabled)
    Jun 24 06:57:56	openvpn[38397]: xxx/xx.xx.xx.xx:xxxxx send_push_reply(): safe_cap=940
    Jun 24 06:57:59	openvpn[38397]: xxx/xx.xx.xx.xx:xxxxx send_push_reply(): safe_cap=940
    Jun 24 07:57:54	openvpn: user 'xxx' authenticated
    Jun 24 07:57:56	openvpn: user 'xxx' authenticated
    Jun 24 08:30:21	openvpn: user 'xxx.xxx' authenticated
    

    This is the client logs

    Tue Jun 24 06:58:06 2014 [vpn.xxxxxxx.xxx] Inactivity timeout (--ping-restart), restarting
    Tue Jun 24 06:58:06 2014 SIGUSR1[soft,ping-restart] received, process restarting
    Tue Jun 24 06:58:08 2014 UDPv4 link local (bound): [undef]
    Tue Jun 24 06:58:08 2014 UDPv4 link remote: [AF_INET]31.24.109.65:1194
    Tue Jun 24 06:58:09 2014 [vpn.xxxxxxxx.xxx Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:xxxx
    Tue Jun 24 06:58:11 2014 Preserving previous TUN/TAP instance: Local Area Connection
    Tue Jun 24 06:58:11 2014 Initialization Sequence Completed
    


  • i'm assuming the pfsense is a client. If so i've more to add to the picture. If you have 2 openvpn clients both will drop at the same time, and if both have "retry server box ticked on the gui", both will auto restart.

    i can consistently get the situation to occur when in one vpn i slam a large amount of data across the vpn.

    I don't see the problem with 2.0.1, i do see the problem with 2.1.2 and 2.1.3.
    I thought it was due to the alix 2.1.2 board utilizing 100% system when it would happen.  Yesterday I replaced the alix  with APU 2.1.3 only using 30% of the system, today i find a post with the same problem, which persisted in spite of upgrading the horse power of the box.

    it seems that when the line jams with data, the keep alive is thinking the line is down.

    the data being pushed is a vnc remote desktop.. i get 1/3 to 2/3 of the screen across before the reset occurs. I don't see it when the bandwidth is able to hit 100m. I'm consistently seeing it when the connection has 7m max and it takes several second to populate the remote viewer.



  • check https://redmine.pfsense.org/issues/3669

    might or might not be related … that will cause frequent openvpn-restarts. (test by disabling gateway-monitoring on all openvpn instances)



  • heper you are correct. i disabled the gateway monitoring and bingo.. functional. :)



  • this should be fixed in 2.1.4-release (should be out any day now)



  • Thanks, I've now disabled the gateway monitoring and will monitor and let you know.

    FYI this is what the gateway logs show

    Jun 20 19:24:33	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 20 20:18:15	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 20 20:18:24	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 20 20:44:12	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 20 20:44:17	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 20 22:35:17	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 20 22:35:26	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 01:18:29	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 21 01:19:12	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 21 03:44:40	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 03:44:51	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 06:04:14	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 06:04:31	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 08:11:06	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 08:11:17	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 10:11:00	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 10:11:09	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 12:22:11	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 12:22:21	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 17:43:21	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 17:43:30	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 17:44:06	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 17:44:15	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 21 21:56:27	apinger: Starting Alarm Pinger, apinger(19495)
    Jun 21 22:07:40	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** down ***
    Jun 21 22:07:40	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** down ***
    Jun 22 02:38:48	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 22 02:39:03	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 22 10:10:41	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 22 10:10:48	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 22 15:25:19	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 22 15:25:28	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 22 16:57:57	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 22 16:58:07	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 22 18:50:44	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 22 18:51:27	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 22 23:53:24	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 22 23:54:16	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 00:37:40	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 00:38:14	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 06:52:05	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 06:52:13	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 09:38:50	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 09:40:00	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 12:14:19	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 12:14:35	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 12:14:40	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 12:15:31	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 12:33:56	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 12:34:10	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 14:49:35	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 14:49:45	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 16:57:56	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** down ***
    Jun 23 16:57:58	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** down ***
    Jun 23 19:12:47	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 19:13:07	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 20:19:31	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:19:46	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:21:55	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:22:17	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:23:23	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:23:35	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:23:45	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 20:24:10	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:24:21	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:25:12	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 23 20:25:28	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 20:25:44	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 23:20:50	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 23 23:21:00	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 00:56:06	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 24 00:56:49	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 24 01:56:43	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 01:56:51	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 02:06:24	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 02:06:33	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 03:15:03	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 24 03:15:24	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 24 06:56:41	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 24 06:57:27	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 24 10:23:01	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 10:23:08	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 11:58:16	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 11:58:26	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 16:40:05	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 16:40:13	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 19:30:09	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 19:30:20	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 19:35:46	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 19:35:58	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 21:47:07	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 24 21:47:16	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 25 01:50:57	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 25 01:51:40	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** loss ***
    Jun 25 08:00:36	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 25 08:00:46	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 25 08:25:46	apinger: ALARM: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 25 08:25:54	apinger: alarm canceled: GW_WAN_2(xx.xx.xx.xx) *** delay ***
    Jun 25 09:25:24	apinger: SIGHUP received, reloading configuration.
    Jun 25 09:25:24	apinger: No usable targets found, exiting
    


  • I thought I'd best post an update:

    After turning off the gateway monitoring I monitored the VPN connectivity for around 48 hours, it dropped maybe 3 packets the whole time but the VPN stayed connected.

    Thanks.


Log in to reply