HOWTO: Ping hosts and reset/reboot on failure



  • Virtually all my clients are remote and running pfSense.  A few have unresolved issues where they lose their Internet connection and/or pfSense becomes inaccessible (even on-site).  Consequently, I was looking for a way to test/restart the WAN connection and reboot pfSense if all else failed.  I found pfSense's ping_hosts.sh, but that was overkill.  Plus, I wanted to ping several hosts (in case one was down) and execute an action ONLY if ALL hosts fail to respond, with a reboot being the last resort.

    I couldn't find exactly what I wanted, so I wrote my own script and wanted to contribute it.  I wrote/tested this on pfSense v1.2.3 RC1, but it's simple enough that it should work on all versions.  And although I'm a veteran programmer, this is my first *nix shell script, so if there's a better way to do something here, please post improvements and I'll add them to the code.

    For those of you who are *nix handicapped like me, installing and activating this script is easy to do via the web interface.

    • Go to Diagnostics > Edit file

    • Cut & paste the code below, editing the user variables to match your settings.

    • Save the file as /usr/local/bin/pingtest.sh (or path/name of your choosing)

    • Go to Diagnostics > Command

    • Execute the command "chmod +x /usr/local/bin/pingtest.sh" (makes the file executable)

    • Go to System > Packages and install the Cron package

    • Go to Services > Cron

    • Install a new cron with settings "*/5 * * * * root /usr/local/bin/pingtest.sh" (runs test every 5 minutes)
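
    If you have console or SSH access, the steps above can be double-checked with a small helper.  This is only a sketch and is separate from pingtest.sh itself (the listing that starts with "#!/bin/sh" further down); the function name check_install is made up for illustration.  It confirms the file exists, is executable, and parses cleanly with "sh -n", which reads the script without running it.

```shell
# Hypothetical helper (not part of pingtest.sh): confirm a script is
# installed and usable.
# Usage: check_install /usr/local/bin/pingtest.sh
check_install() {
    script=$1
    # The file must exist and be a regular file
    [ -f "$script" ] || { echo "missing: $script"; return 1; }
    # "chmod +x" must have been applied
    [ -x "$script" ] || { echo "not executable: $script"; return 1; }
    # "sh -n" parses the script without executing any of it
    sh -n "$script" || { echo "syntax error: $script"; return 1; }
    echo "ok: $script"
}
```

    Nothing is pinged, reset, or rebooted by this check, so it is safe to run at any time.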

    If you want to verify the test is running, uncomment (remove the "#" from) the two "echo" commands inside the "for" loop.  Then use Diagnostics > Edit file to view the contents of /root/pingtest.log.  There should be a log entry every 5 minutes.  Once you're sure it's running, be sure to comment those lines out again so you don't fill your hard drive with this script's status messages.
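
    If you would rather leave the status messages on for a while, one option is to cap the log size.  The following is only a sketch and is not part of pingtest.sh (the listing that starts with "#!/bin/sh"); the function name trim_log and the default of 500 lines are assumptions chosen for illustration.

```shell
# Hypothetical helper (not part of pingtest.sh): keep only the newest
# lines of a log file.
# Usage: trim_log /root/pingtest.log 500
trim_log() {
    log=$1
    keep=${2:-500}      # assumed default; pick what suits your disk
    # Nothing to do if the log has not been created yet
    [ -f "$log" ] || return 0
    # Write the tail to a temp file first, then replace the original
    tail -n "$keep" "$log" > "$log.tmp" && mv "$log.tmp" "$log"
}
```

    It could be called from its own cron entry or by hand; either way the newest entries are the ones kept.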

    #!/bin/sh
    
    #=====================================================================
    # pingtest.sh, v1.0.1
    # Created 2009 by Bennett Lee
    # Released to public domain
    #
    # (1) Attempts to ping several hosts to test connectivity.  After
    #     first successful ping, script exits.
    # (2) If all pings fail, resets interface and retries all pings.
    # (3) If all pings fail again after reset, then reboots pfSense.
    #
    # History
    # 1.0.1   Added delay to ensure interface resets (thx ktims).
    # 1.0.0   Initial release.
    #=====================================================================
    
    #=====================================================================
    # USER SETTINGS
    #
    # Set multiple ping targets separated by space.  Include numeric IPs
    # (e.g., remote office, ISP gateway, etc.) so that a DNS failure,
    # which a reboot will not correct, does not trigger one by itself.
    ALLDEST="google.com yahoo.com 24.93.40.36 24.93.40.37"
    # Interface to reset, usually your WAN
    BOUNCE=em0
    # Log file
    LOGFILE=/root/pingtest.log
    #=====================================================================
    
    COUNT=1
    while [ $COUNT -le 2 ]
    do
    
    	for DEST in $ALLDEST
    	do
    		#echo `date +%Y%m%d.%H%M%S` "Pinging $DEST" >> $LOGFILE
    		ping -c1 $DEST >/dev/null 2>/dev/null
    		if [ $? -eq 0 ]
    		then
    			#echo `date +%Y%m%d.%H%M%S` "Ping $DEST OK." >> $LOGFILE
    			exit 0
    		fi
    	done
    
    	if [ $COUNT -le 1 ]
    	then
    		echo `date +%Y%m%d.%H%M%S` "All pings failed. Resetting interface $BOUNCE." >> $LOGFILE
    		/sbin/ifconfig $BOUNCE down
    		# Give interface time to reset before bringing back up
    		sleep 10
    		/sbin/ifconfig $BOUNCE up
    		# Give WAN time to establish connection
    		sleep 60
    	else
    		echo `date +%Y%m%d.%H%M%S` "All pings failed twice. Rebooting..." >> $LOGFILE
    		/sbin/shutdown -r now >> $LOGFILE
    		exit 1
    	fi
    
    	COUNT=`expr $COUNT + 1`
    done
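
    To try the "succeed on first reply, fail only if ALL hosts fail" logic without touching the interface or rebooting, the ping loop can be pulled into a function.  This is a sketch, not a change to the script above: any_host_up and the PING_CMD override are names invented here, and PING_CMD defaults to the same "ping -c1" the script uses.

```shell
# PING_CMD is a hypothetical override used only for dry runs; by
# default it is the same "ping -c1" that pingtest.sh uses.
PING_CMD=${PING_CMD:-"ping -c1"}

# Returns 0 as soon as one host answers, 1 if none of them do.
# Usage: any_host_up google.com yahoo.com 24.93.40.36
any_host_up() {
    for dest in "$@"
    do
        # $PING_CMD is left unquoted on purpose so that "ping -c1"
        # splits into a command and its argument
        if $PING_CMD "$dest" >/dev/null 2>&1
        then
            return 0
        fi
    done
    return 1
}
```

    Setting PING_CMD=false simulates a total outage and PING_CMD=true simulates a healthy link, so both branches can be exercised from a workstation before the real script is trusted with a reboot.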
    


  • It's nice but… does it help? resetting interface or rebooting the firewall?



  • @Eugene:

    It's nice but… does it help? resetting interface or rebooting the firewall?

    One of my clients has "sticky" IPs but the PPPoE auth doesn't always work (don't know if it's pfSense or modem problem).  Just bouncing the WAN connection usually fixes it.

    Got another client who loses remote/local access to/through pfSense, particularly when I'm working remotely via IPSEC.  Hoping this is the IPSEC issue mentioned in the pfSense blog that will be fixed in RC2, but in the meantime this script reboots pfSense and everything works again.

    Have yet another client whose pfSense box occasionally locks up so bad that this script can't even run.  Only thing that would probably help in this case is a hardware watchdog (besides fixing the actual problem).  ;)

    So, help is a mixed bag.  But if it saves even one trip to your remote site just to press the reset button, it's well worth it.



  • Script looks good to me, my only suggestion is that you add a couple seconds delay between setting the interface down and back up. I know the Intel hardware is well-behaved here, but I've seen other cards where the PHY doesn't reset properly or at all (link maintained) if you bring the state back up too soon.



  • @ktims:

    Script looks good to me, my only suggestion is that you add a couple seconds delay between setting the interface down and back up. I know the Intel hardware is well-behaved here, but I've seen other cards where the PHY doesn't reset properly or at all (link maintained) if you bring the state back up too soon.

    Updated script.  Good call.  That might save me on a couple of clients where I scraped together a firewall with crappy NICs.

