Netgate Discussion Forum

    Cron based OpenVPN watchdog

    General pfSense Questions
    • ssheikh

      In my losing battle to keep OpenVPN running consistently without crashing all the time, I have thrown in the towel and created yet another cron-based restart script, this time for OpenVPN.

      I run this every 7 minutes.

      Note that this will not restart the service if it is already running but the OpenVPN tunnel is incorrectly shown as not connected on the server side. That is another problem I am having with OpenVPN p-2-p tunnels.

      
      #!/usr/local/bin/php -f
      <?php
      /* $Id$ */
      /*
          /etc/watchdog_openvpn
          written for pfSense (http://www.pfSense.com)
          Shahid Sheikh
      
          This is a quick hack to restart any stopped
          or crashed openvpn client or server instances.
      */
      
      require_once("service-utils.inc");
      require_once("system.inc");
      /* openvpn_get_settings() and openvpn_restart() live in openvpn.inc;
         require_once is harmless if it has already been loaded. */
      require_once("openvpn.inc");
      
      /* Walk every registered service and restart only the OpenVPN
         instances that pfSense reports as not running. */
      $services = get_services();
      foreach ($services as $service) {
      	if ($service['name'] == "openvpn") {
      		if (!get_service_status($service)) {
      			log_error($service['description'] . " was found dead.");
      			$settings = openvpn_get_settings($service['mode'], $service['vpnid']);
      			openvpn_restart($service['mode'], $settings);
      		}
      	}
      }
      ?>
      
      

      For those interested, my OpenVPN client is the one that crashes most often, at least once every 24 hours. Here is the log, sorted in reverse (newest entry first):

      Sep 9 22:26:12 openvpn[52037]: Exiting due to fatal error
      Sep 9 22:26:12 openvpn[52037]: FreeBSD ifconfig failed: external program exited with error status: 1
      Sep 9 22:26:12 openvpn[52037]: /sbin/ifconfig ovpnc1 192.168.5.2 192.168.5.1 mtu 1500 netmask 255.255.255.255 up
      Sep 9 22:26:12 openvpn[52037]: do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0
      Sep 9 22:26:12 openvpn[52037]: TUN/TAP device /dev/tun1 opened
      Sep 9 22:26:12 openvpn[52037]: TUN/TAP device ovpnc1 exists previously, keep at program end
      Sep 9 22:26:10 openvpn[52037]: [<CN of my cert>] Peer Connection Initiated with [AF_INET]<openvpn_server_ip>:16001
      Sep 9 22:26:10 openvpn[52037]: WARNING: 'ifconfig' is present in remote config but missing in local config, remote='ifconfig 192.168.5.2 192.168.5.1'
      Sep 9 22:26:10 openvpn[52037]: WARNING: 'tun-ipv6' is present in remote config but missing in local config, remote='tun-ipv6'
      Sep 9 22:26:09 openvpn[52037]: UDPv4 link remote: [AF_INET]<openvpn_server_ip>:16001
      Sep 9 22:26:09 openvpn[52037]: UDPv4 link local (bound): [AF_INET]
      
      • phil.davis

        The new Service Watchdog package should do that for you - it works for me.
        Of course, it doesn't work in the odd cases where the OpenVPN instance is still running in a process somewhere, but the pid file does not point to that pid. That happens sometimes, somehow, with all the "killed" memory problems. Then the OpenVPN link is working but the dashboard can't find it. Service Watchdog, like the good puppy it is, faithfully tries to restart the OpenVPN instance every minute, without success since the port is already in use by the working but "lost" process. (I don't think your script will be any better at detecting this condition, since it uses the built-in pfSense routine get_service_status the same as the dashboard…)

        As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
        If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

        • ssheikh

          It is too brute force, in that it attempts to restart the service as soon as it goes down. That causes loops, race conditions, and all sorts of other headaches. A backoff algorithm needs to be built into that watchdog service to make it a little less intrusive.
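As an illustration of the kind of backoff meant here, the following is a rough sketch of exponential backoff between restart attempts. It is not taken from the Service Watchdog package. The function names, the 60-second base, the one-hour cap, and the shape of the `$state` array are all assumptions made for the example.

```php
<?php
// Seconds to wait before the next restart attempt after $failures consecutive
// failed attempts: base, 2*base, 4*base, ... capped at $cap.
function backoff_delay($failures, $base = 60, $cap = 3600) {
    if ($failures <= 0) {
        return 0;                      // no failures yet: restart immediately
    }
    $exp = min($failures - 1, 20);     // bound the shift so the product stays small
    return (int) min($cap, $base * (1 << $exp));
}

// Decide whether enough time has passed since the last attempt.
// $state is assumed to look like array('failures' => int, 'last_attempt' => unix time).
function should_restart(array $state, $now, $base = 60, $cap = 3600) {
    $wait = backoff_delay($state['failures'], $base, $cap);
    return ($now - $state['last_attempt']) >= $wait;
}
```

A watchdog using this would persist `$state` between cron runs, increment `failures` after each unsuccessful restart, and reset it to zero once the service is seen running again.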

          • kejianshi

            Have you been able to diagnose why your OpenVPN crashes all the time? That's strange behavior. It's not something that happens for me here.

            • ssheikh

              @phil.davis:

              …but the pid file does not point to that pid... That happens sometimes, somehow, with all the "killed" memory problems.

              In my case that wasn't happening because of low memory conditions. It was happening because PHP was attempting to restart openvpn faster than the disk could commit the .pid file, and it would eventually lose track of which pid was running and which one had crashed.

              There need to be proper dynamic wait states for forks of processes to finish processing before a loop re-iterates or carries on.
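One way to picture such a wait state is a bounded poll on the pid file after the daemon has been started. This is only a sketch: `wait_for_pidfile()` is a hypothetical helper, not a pfSense function, and the timeout and polling interval are arbitrary defaults.

```php
<?php
// Poll until $pidfile contains a numeric pid or $timeout seconds have elapsed.
// Returns the pid as an integer, or null if the file never became valid.
function wait_for_pidfile($pidfile, $timeout = 10.0, $interval_us = 100000) {
    $deadline = microtime(true) + $timeout;
    do {
        clearstatcache(true, $pidfile);      // do not trust cached stat results
        if (is_readable($pidfile)) {
            $raw = trim((string) file_get_contents($pidfile));
            if (ctype_digit($raw)) {
                return (int) $raw;           // file is fully written
            }
        }
        usleep($interval_us);                // brief pause before checking again
    } while (microtime(true) < $deadline);
    return null;
}
```

A restart loop could call this after each start and move on to the next instance only once a pid has been returned, treating `null` as a failed start rather than immediately starting another copy.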

              Ever since I simplified my setup to the point that OpenVPN was completely CARP independent even when running on a CARP cluster and every client always had one or two unique server IPs to connect to, things have been fairly stable. Both my primary and backup firewalls have OpenVPN connections established to primary and backup firewalls at the other sides. Now I am trying to get OSPF to be stable enough to correctly make all routing decisions.

              The setup is a full cross mesh of firewall pairs, one pair at each of the three sites, with each site having a primary and a backup internet connection.

              • ssheikh

                @kejianshi:

                Have you been able to diagnose why your OpenVPN crashes all the time? That's strange behavior. It's not something that happens for me here.

                It's a combination of using the firewalls in CARP clusters, having OSPF running, and having a full cross mesh of OpenVPN connections between three sites.

                The single biggest reason OpenVPN dies with a fatal error is that it cannot bring up the ovpn tunnel interface or cannot inject the route into the kernel's routing table.

                The next thing I am working on is to make the start script resilient to such problems, so that it tries to recover from them, fixes the issue, and restarts the openvpn service. I am probably not going to be able to finish it, since I am already 2 weeks behind in delivering this overall solution to a client.

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.