OpenVPN continues to work even after it's terminated due to fatal error
-
I am using a just installed pfSense version 2.1.2-RELEASE (amd64). I'm using the OpenVPN client in pfSense to connect to my VPN provider Private Internet Access (PIA). Earlier this evening I noticed that my VPN connection is often dropping with message "Inactivity timeout (–ping-restart), restarting". Any way to stop that?
Also the last time it happened OpenVPN exited with a fatal error. The really weird thing is I'm still connected to the VPN, and my LAN clients are still connected to my VPN provider even though according to the logs and the web interface the connection is down. I'm 100% certain it's up and running, I'm looking at it right this moment and I've got upstream and downstream traffic working. I've refreshed the log multiple times but it still shows as down. Also Status > Services and Status > OpenVPN both show OpenVPN as not running. And yet… it is? Is this a bug or am I just uninformed? Can someone explain this to me? Thanks.
Apr 17 19:07:36 openvpn[63307]: [server] Inactivity timeout (--ping-restart), restarting Apr 17 19:07:36 openvpn[63307]: SIGUSR1[soft,ping-restart] received, process restarting Apr 17 19:07:38 openvpn[63307]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Apr 17 19:07:38 openvpn[63307]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 19:07:38 openvpn[63307]: UDPv4 link local (bound): [AF_INET]#LAN-IP# Apr 17 19:07:38 openvpn[63307]: UDPv4 link remote: [AF_INET]75.126.39.103:1194 Apr 17 19:07:39 openvpn[63307]: [server] Peer Connection Initiated with [AF_INET]75.126.39.103:1194 Apr 17 19:07:41 openvpn[63307]: Preserving previous TUN/TAP instance: ovpnc1 Apr 17 19:07:41 openvpn[63307]: NOTE: Pulled options changed on restart, will need to close and reopen TUN/TAP device. Apr 17 19:07:41 openvpn[63307]: /usr/local/sbin/ovpn-linkdown ovpnc1 1500 1542 10.139.1.6 10.139.1.5 init Apr 17 19:07:42 openvpn[63307]: TUN/TAP device ovpnc1 exists previously, keep at program end Apr 17 19:07:42 openvpn[63307]: TUN/TAP device /dev/tun1 opened Apr 17 19:07:42 openvpn[63307]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0 Apr 17 19:07:42 openvpn[63307]: /sbin/ifconfig ovpnc1 10.165.1.6 10.165.1.5 mtu 1500 netmask 255.255.255.255 up Apr 17 19:07:42 openvpn[63307]: /usr/local/sbin/ovpn-linkup ovpnc1 1500 1542 10.165.1.6 10.165.1.5 init Apr 17 19:07:42 openvpn[63307]: Initialization Sequence Completed Apr 17 20:41:49 openvpn[63307]: [server] Inactivity timeout (--ping-restart), restarting Apr 17 20:41:49 openvpn[63307]: SIGUSR1[soft,ping-restart] received, process restarting Apr 17 20:41:51 openvpn[63307]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Apr 17 20:41:51 openvpn[63307]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 20:41:51 openvpn[63307]: UDPv4 link local (bound): [AF_INET]#LAN-IP# Apr 17 20:41:51 openvpn[63307]: UDPv4 link remote: [AF_INET]50.97.94.50:1194 Apr 17 20:41:55 openvpn[63307]: [server] Peer Connection Initiated with [AF_INET]50.97.94.50:1194 Apr 17 20:41:57 openvpn[63307]: Preserving previous TUN/TAP instance: ovpnc1 Apr 17 20:41:57 openvpn[63307]: NOTE: Pulled options changed on restart, will need to close and reopen TUN/TAP device. Apr 17 20:41:57 openvpn[63307]: /usr/local/sbin/ovpn-linkdown ovpnc1 1500 1542 10.165.1.6 10.165.1.5 init Apr 17 20:41:58 openvpn[63307]: TUN/TAP device ovpnc1 exists previously, keep at program end Apr 17 20:41:58 openvpn[63307]: TUN/TAP device /dev/tun1 opened Apr 17 20:41:58 openvpn[63307]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0 Apr 17 20:41:58 openvpn[63307]: /sbin/ifconfig ovpnc1 10.164.1.6 10.164.1.5 mtu 1500 netmask 255.255.255.255 up Apr 17 20:41:58 openvpn[63307]: /usr/local/sbin/ovpn-linkup ovpnc1 1500 1542 10.164.1.6 10.164.1.5 init Apr 17 20:41:58 openvpn[63307]: Initialization Sequence Completed Apr 17 20:42:02 openvpn[70822]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 Apr 17 20:42:02 openvpn[70822]: WARNING: file '/etc/pia/userpass.txt' is group or others accessible Apr 17 20:42:02 openvpn[70822]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Apr 17 20:42:02 openvpn[70822]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 20:42:02 openvpn[71343]: UDPv4 link local (bound): [AF_INET]#LAN-IP# Apr 17 20:42:02 openvpn[71343]: UDPv4 link remote: [AF_INET]75.126.39.103:1194 Apr 17 20:42:02 openvpn[71343]: event_wait : Interrupted system call (code=4) Apr 17 20:42:02 openvpn[71343]: SIGTERM[hard,] received, process exiting Apr 17 20:42:02 openvpn[72426]: OpenVPN 2.3.2 amd64-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 Apr 17 20:42:02 openvpn[72426]: WARNING: file '/etc/pia/userpass.txt' is group or others accessible Apr 17 20:42:02 openvpn[72426]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Apr 17 20:42:02 openvpn[72426]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 20:42:02 openvpn[72478]: UDPv4 link local (bound): [AF_INET]#LAN-IP# Apr 17 20:42:02 openvpn[72478]: UDPv4 link remote: [AF_INET]50.97.94.15:1194 Apr 17 20:42:02 openvpn[72478]: WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Apr 17 20:42:03 openvpn[72478]: [server] Peer Connection Initiated with [AF_INET]50.97.94.15:1194 Apr 17 20:42:05 openvpn[72478]: TUN/TAP device ovpnc1 exists previously, keep at program end Apr 17 20:42:05 openvpn[72478]: Cannot open TUN/TAP dev /dev/tun1: Device busy (errno=16) Apr 17 20:42:05 openvpn[72478]: Exiting due to fatal error
-
That was an attempt to get a second instance of the same OpenVPN client running, it failed for that reason and the original continued on. Were you trying to manually start it or something?
-
@cmb:
That was an attempt to get a second instance of the same OpenVPN client running, it failed for that reason and the original continued on. Were you trying to manually start it or something?
No I'm sure I didn't do anything. I went in the browser to get the log so that I could post about the inactivity timeout issue and that's when I saw it had exited due to a fatal error and its status was down. I only have one OpenVPN client profile and it's for Private Internet Access.
-
Anything in the system log from around the time it happened?
-
@cmb:
Anything in the system log from around the time it happened?
Yes. I assume you mean System > General ? There was a link state down a few minutes before it happened.
Apr 17 19:04:47 kernel: ovpnc1: link state changed to DOWN Apr 17 19:04:47 check_reload_status: Reloading filter Apr 17 19:04:48 check_reload_status: updating dyndns WANGW Apr 17 19:04:48 check_reload_status: Restarting ipsec tunnels Apr 17 19:04:48 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 19:04:48 check_reload_status: Reloading filter Apr 17 19:04:48 check_reload_status: updating dyndns WANGW Apr 17 19:04:48 check_reload_status: Restarting ipsec tunnels Apr 17 19:04:48 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 19:04:48 kernel: ovpnc1: link state changed to UP Apr 17 19:04:48 check_reload_status: rc.newwanip starting ovpnc1 Apr 17 19:04:49 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 19:04:50 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WANGW. Apr 17 19:04:50 php: rc.openvpn: OpenVPN: Resync client1 Private Internet Access Apr 17 19:04:53 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WANGW. Apr 17 19:04:53 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 19:04:53 kernel: ovpnc1: link state changed to DOWN Apr 17 19:04:54 php: rc.openvpn: OpenVPN: Resync client1 Private Internet Access Apr 17 19:04:54 php: rc.newwanip: rc.newwanip: Informational is starting ovpnc1. Apr 17 19:04:54 php: rc.newwanip: Interface does not have an IP address, nothing to do. Apr 17 19:04:58 kernel: ovpnc1: link state changed to UP Apr 17 19:04:58 check_reload_status: rc.newwanip starting ovpnc1 Apr 17 19:04:58 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 19:05:00 php: rc.newwanip: rc.newwanip: Informational is starting ovpnc1. Apr 17 19:05:00 php: rc.newwanip: rc.newwanip: on (IP address: 10.139.1.6) (interface: opt1) (real interface: ovpnc1). Apr 17 19:05:05 php: rc.newwanip: Creating rrd update script Apr 17 19:05:07 php: rc.newwanip: pfSense package system has detected an ip change 10.171.1.6 -> 10.139.1.6 ... Restarting packages. Apr 17 19:05:07 check_reload_status: Starting packages Apr 17 19:05:07 check_reload_status: Reloading filter Apr 17 19:05:09 php: rc.start_packages: Restarting/Starting all packages. Apr 17 19:05:09 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 19:05:20 check_reload_status: updating dyndns OPT1_VPNV4 Apr 17 19:05:20 check_reload_status: Restarting ipsec tunnels Apr 17 19:05:20 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 19:05:20 check_reload_status: Reloading filter Apr 17 19:05:22 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use OPT1_VPNV4. Apr 17 19:05:22 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 19:07:41 kernel: ovpnc1: link state changed to DOWN Apr 17 19:07:41 check_reload_status: Reloading filter Apr 17 19:07:42 kernel: ovpnc1: link state changed to UP Apr 17 19:07:42 check_reload_status: rc.newwanip starting ovpnc1 Apr 17 19:07:43 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 19:07:44 php: rc.newwanip: rc.newwanip: Informational is starting ovpnc1. Apr 17 19:07:44 php: rc.newwanip: rc.newwanip: on (IP address: 10.165.1.6) (interface: opt1) (real interface: ovpnc1). Apr 17 20:41:49 php: rc.newwanip: Creating rrd update script Apr 17 20:41:51 php: rc.newwanip: pfSense package system has detected an ip change 10.139.1.6 -> 10.165.1.6 ... Restarting packages. Apr 17 20:41:51 check_reload_status: Starting packages Apr 17 20:41:51 check_reload_status: Reloading filter Apr 17 20:41:53 php: rc.start_packages: Restarting/Starting all packages. Apr 17 20:41:53 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 20:41:57 kernel: ovpnc1: link state changed to DOWN Apr 17 20:41:57 check_reload_status: Reloading filter Apr 17 20:41:58 kernel: ovpnc1: link state changed to UP Apr 17 20:41:58 check_reload_status: rc.newwanip starting ovpnc1 Apr 17 20:41:59 check_reload_status: updating dyndns OPT1_VPNV4 Apr 17 20:41:59 check_reload_status: Restarting ipsec tunnels Apr 17 20:41:59 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 20:41:59 check_reload_status: updating dyndns WANGW Apr 17 20:41:59 check_reload_status: Restarting ipsec tunnels Apr 17 20:41:59 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 20:41:59 check_reload_status: updating dyndns WANGW Apr 17 20:41:59 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 20:41:59 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 20:42:00 php: rc.newwanip: rc.newwanip: Informational is starting ovpnc1. Apr 17 20:42:00 php: rc.newwanip: rc.newwanip: on (IP address: 10.164.1.6) (interface: opt1) (real interface: ovpnc1). Apr 17 20:42:01 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use OPT1_VPNV4. Apr 17 20:42:01 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WANGW. Apr 17 20:42:01 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WANGW. Apr 17 20:42:01 php: rc.openvpn: OpenVPN: Resync client1 Private Internet Access Apr 17 20:42:02 php: rc.openvpn: OpenVPN: Resync client1 Private Internet Access Apr 17 20:42:05 php: rc.newwanip: Creating rrd update script Apr 17 20:42:07 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 20:42:07 php: rc.newwanip: pfSense package system has detected an ip change 10.165.1.6 -> 10.164.1.6 ... Restarting packages. Apr 17 20:42:07 check_reload_status: Starting packages Apr 17 20:42:09 php: rc.start_packages: Restarting/Starting all packages. Apr 17 20:42:13 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 20:42:20 check_reload_status: updating dyndns OPT1_VPNV4 Apr 17 20:42:20 check_reload_status: Restarting ipsec tunnels Apr 17 20:42:20 check_reload_status: Restarting OpenVPN tunnels/interfaces Apr 17 20:42:20 check_reload_status: Reloading filter Apr 17 20:42:23 php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use OPT1_VPNV4. Apr 17 20:42:23 php: rc.filter_configure_sync: Could not find IPv6 gateway for interface(wan). Apr 17 21:24:39 lighttpd[18673]: (connections.c.137) (warning) close: 8 Connection reset by peer Apr 17 22:29:12 check_reload_status: Syncing firewall Apr 17 22:29:12 syslogd: exiting on signal 15 Apr 17 22:29:12 syslogd: kernel boot file is /boot/kernel/kernel Apr 17 22:29:12 kernel: pflog0: promiscuous mode enabled
-
I rebooted a few days ago but it happened again.
-
I'm trying to chase down something similar and was planning to do more homework before posting, but seeing this makes me wonder. I have the same symptom - dashboard Services Status reports openvpn down, but it isn't. And my logs contain fatal error messages from openvpn, pertaining to EADDRINUSE (1194) and it looks to my untrained (as far as pfSense goes) eyes that the system tried to start multiple openvpn instances.
Everything is "working" except for the status reporting.
I have two different pfSense installations (different networks) and as far as I can tell both are configured "similarly" but obviously something is different. Was settling in for a debug/archeology session but thought maybe I'd get some pointers/clues here.
Here's what my log looks like after booting:
Apr 17 11:02:43 pfsense openvpn[32177]: OpenVPN 2.3.2 i386-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 Apr 17 11:02:43 pfsense openvpn[59735]: OpenVPN 2.3.2 i386-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 Apr 17 11:02:43 pfsense openvpn[50028]: OpenVPN 2.3.2 i386-portbld-freebsd8.3 [SSL (OpenSSL)] [LZO] [eurephia] [MH] [IPv6] built on Mar 27 2014 Apr 17 11:02:43 pfsense openvpn[59735]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 11:02:43 pfsense openvpn[32177]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 11:02:43 pfsense openvpn[50028]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Apr 17 11:02:44 pfsense openvpn[32177]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file Apr 17 11:02:44 pfsense openvpn[59735]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file Apr 17 11:02:44 pfsense openvpn[32177]: TCP/UDP: Socket bind failed on local address [AF_INET]66.68.35.6:1194: Address already in use Apr 17 11:02:44 pfsense openvpn[32177]: Exiting due to fatal error Apr 17 11:02:45 pfsense openvpn[59735]: TUN/TAP device ovpns1 exists previously, keep at program end Apr 17 11:02:45 pfsense openvpn[59735]: TUN/TAP device /dev/tun1 opened Apr 17 11:02:45 pfsense openvpn[59735]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0 Apr 17 11:02:45 pfsense openvpn[59735]: /sbin/ifconfig ovpns1 192.168.80.129 192.168.80.130 mtu 1500 netmask 255.255.255.255 up Apr 17 11:02:45 pfsense openvpn[50028]: Control Channel Authentication: using '/var/etc/openvpn/server1.tls-auth' as a OpenVPN static key file Apr 17 11:02:45 pfsense openvpn[50028]: TCP/UDP: Socket bind failed on local address [AF_INET]66.68.35.6:1194: Address already in use Apr 17 11:02:45 pfsense openvpn[50028]: Exiting due to fatal error Apr 17 11:02:45 pfsense openvpn[59735]: /usr/local/sbin/ovpn-linkup ovpns1 1500 1558 192.168.80.129 192.168.80.130 init Apr 17 11:02:45 pfsense openvpn[80689]: UDPv4 link local (bound): [AF_INET]66.68.35.6:1194 Apr 17 11:02:45 pfsense openvpn[80689]: UDPv4 link remote: [undef] Apr 17 11:02:45 pfsense openvpn[80689]: Initialization Sequence Completed
-
I have not noticed this recently, but it certainly happened in the past. Somehow the OpenVPN instance tried to be started again, and I think the PID file got updated. Then the process dies because it can't listen on the specified port (the other process has it already). The PID file now has the new (gone) process id. Now the system can't find the old process any more - it thinks the PID file is a reliable indicator to use to find the process. But, as you say, the VPN is happily running and working.
Eventually you "ps aux | grep openvpn" and find the lost process, kill it (interrupting service) and then you can restart from the webGUI and the Dashboard will be happy again.
I guess there will be a timing thing somewhere with particular real-time interface events. If you can work out a
sequence to make it happen, then it should be easy to analyse and fix ;)Edit: Fixed wrong command text.
-
Ok, that at least helped me correct the situation - thanks!
If it's a race condition my only clue I can provide is I'm on a soekris 6501-50 (both instances). My configuration that works is pretty vanilla (one WAN one LAN). The configuration exhibiting the problem has an extra lan card (8 ports in the machine) and 6 networks configured (one WAN, 5 LAN, 4 of which are actually connected and one is dangling/down). The problem manifests at boot (the log I posted is from bootup). Not entirely sure if it manifests every boot or just sometimes. I don't reboot on a regular basis :)
Not much help, I know. Willing to provide more data if given some pointers/tips things to try/instrument/whatever.
-
Eventually you "ps aux | grep ovpn" and find the lost process, kill it (interrupting service) and then you can restart from the webGUI and the Dashboard will be happy again.
I just tried that and it doesn't find any process ovpn. I tried the restart button in the GUI but that doesn't do anything either so I'm rebooting again. Thanks anyway
-
ps auxww | grep openvpn
I half-jokingly thought the "not quite right" instruction was a safety hurdle - before being qualified to kill random PIDs maybe Phil wanted to make sure we could get over that hump. :) In reality I imagine it was just a typo / brain-hiccup.
-
Thanks for sharing guys. I ran into the same issue here. Killing the process through SSH and then restarting OpenVPN through the web interface made it work again for me.
For the sake of completeness for the less Linux skilled people, like myself, the complete sequence of actions would be:
-
SSH into your pfSense server, i.e. using Putty
-
Log in as user root
-
Use option 8 in the menu to go to the shell
-
Type: ps auxww | grep openvpn
-
It should give a list of openvpn processes. The first number shown is the process id (pid). You'll need that to kill the process.
-
Type: kill
-
Now through the pfSense Web interface, go to Status -> Services and click on the Play icon in the line with openvpn to start it. It should now start again.
Attached a screen capture of the commands I used to do this trick. Worked like a charm!
-
-
Thank you for the SSH instruction, but what if this happens on a regular basis; is there a way to prevent this from happening?
-
Thank you for the SSH instruction, but what if this happens on a regular basis; is there a way to prevent this from happening?
Can't answer that question for you. What I can say that once I went through the steps outlined above, it has never occurred anymore on my pfSense system, so it might be a one time thing that can occur.
-
Pretty sure that's this scenario.
https://redmine.pfsense.org/issues/3894