How to run sh or php script for filer or cron

wakson005

@stephenw10 Just to clarify, as I wasn't being too clear: I do have two OpenVPN TAP tunnels up and they are bridged, but each TAP goes to a different endpoint. So looping is most likely occurring (not sure how I would check that, but I will look into it somehow). I enabled Spanning Tree to help, but I'm not sure it is helping much.

So to clarify my setup I have the following:
Site0
Server:
Site0 to Site1 TAP
Site1 to Site2 TAP
Client: N/A
Both interfaces bridged.

Site1
Server:
Site1 to Site2 TAP
Client:
Site0 to Site1 TAP
Both interfaces bridged.

Site2
Server: N/A
Client:
Site0 to Site1 TAP
Site1 to Site2 TAP
Both interfaces bridged.

That's how the TAP connections work between all 3 sites. It works fine except during HA failover. Each site has a Master and a Backup pfSense, and it is during failover that I have issues. I follow a similar concept for my TUN tunnels, and TUN at all 3 to N locations fails over flawlessly, but TAP requires manual intervention. I do have a script to turn interfaces off and back on, but I can't find a script option to restart the VPN client/server as if we were pressing the play button.

The script I thought would start the VPN TAP, as if pressing play, didn't seem to work:

/usr/local/sbin/pfSsh.php playback svc start 21
# or
/usr/local/sbin/pfSsh.php playback svc start ovpns21

Both cases only show something like this: [screenshot not preserved]

But when I manually press the button, it switches (before and after screenshots not preserved) without any issue.

Does anyone know the actual script that presses the play button to turn it on?

stephenw10

Hmm, so at all three sites the two TAP tunnels are bridged together and bridged to a local interface?

wakson005

@stephenw10 Yes, the 2 TAP tunnels at each of the 3 sites are bridged to a single LAN interface, so the same LAN subnet is bridged together everywhere. The DHCP ranges are all different, only the subnet is the same, as that was the hard requirement :( Otherwise I would have stuck with TUN, as it was easier and works flawlessly between the 3 sites, each with HA. I am able to ping and run iperf from one pfSense to another without issue like this.

It's difficult because the TAP connections and iperf work between them, though I see high Retr counts when running iperf. They are all up and running; it's just that during HA it never switches over flawlessly...

stephenw10

Hmm, I mean just to be clear, those TAP tunnels are effectively in a mesh between the sites? It sounds like an L2 loop is inevitable without something in place to prevent it. STP on the bridges perhaps.

You would certainly need to have them run on the VIPs to avoid further loops between the HA nodes in that case.

wakson005

@stephenw10 Yeah, they are in a mesh between the sites (meaning all sites share the same subnet here; the DHCP server at each site just hands out a different range of that same subnet). I do have RSTP on the bridge and put source-to-destination rules in place on each bridge, hopefully to help with that, but there are still some weird issues. It could be that my setup is not correct. This is a site-to-site setup and I haven't found a concrete guide for it. Do you know if anyone has been able to set this up successfully? I understand there is no guide for this, and I think the pfSense docs say it is not recommended, probably for a reason lol. If not, I will just stick with the manual process for now until a future improvement is added.

Server: Tunnel Settings all blank besides the Tunnel Network. Everything else below this is left as default:

Client: everything under Tunnel Network is blank and left as default.

stephenw10

Hmm, well I would expect that to work but the addition of HA makes things.... interesting.

I would expect to see some errors logged on the secondary node when it fails over. Basically I don't expect to need a script there.

wakson005

@stephenw10 OK, I have identified the issue in more detail. When the bridge interface is enabled, the failover fails. I need to disable the TAP interfaces on both Site 0 HA1 and HA2 and Site 1 HA1 and HA2, and the same at the Site 2 end, for it to work properly. When all TAP interfaces are disabled, failover works flawlessly. The tricky part is the timing: when failover occurs, all the TAP interfaces need to be disabled, then re-enabled after the TAP connection is automatically re-established. If all the TAP interfaces are disabled, you can do as many failovers as you want "using maintenance mode" without breaking the connection. Only turn the TAP interfaces back on after the failover is completed and all VPN TAP tunnels are up.

So the TAP interface is causing the issue, as that connection probably disappears and it stops working... I wonder what makes this different from the TUN case from the interface viewpoint.

wakson005

@stephenw10 Really appreciate your guidance :) . Oh, so can you confirm that it has definitely worked for you or other people before? If a script is not needed, is there step-by-step guidance somewhere for this?

stephenw10

Hmm, how does the failover fail when TAP is enabled? Like it actually doesn't switch nodes?

wakson005

@stephenw10 Just did multiple tests on it and noticed that the interfaces either disappear or go down and never come back up. I ran ifconfig and it gives me much more detail. The interface disappears or gets renamed for some reason, which I thought was weird.

@stephenw10 How does the failover fail when TAP is enabled? (There is the TAP VPN server, the client, and the interface, all 3 different things.)
Answer: Just to be as clear as possible for others reading besides me and stephenw10: the "TAP VPN Server", when enabled, works fine and failover works flawlessly IF the TAP interfaces are disabled. So my claim that failover is not working in general is probably not completely true, as the VIP failover for the LAN network works flawlessly. This is the LAN network used in the bridge connection with the VPN. As the bridge, LAN, and VIP creation doesn't directly impact the TAP VPN connections, it works fine. But IF the TAP interfaces are enabled, there is a high chance of failure (maybe because the original master is holding onto the connection). Interfaces start disappearing or their up status becomes down, and the bridge loses the interfaces that were part of it. Sometimes the bridge no longer has the TAP interfaces, as they were down, and it doesn't register them again when they come up later.

Things that work:

Failover of the VIPs from master to backup is working great for TAP and TUN. (But it is only the VIP failover that works; that doesn't mean the TAP connections still work.)

Issues noticed during the switch IF the interfaces are not taken down for the failover:

Interface disappears from ifconfig (worst case, there is a new interface called tap## which used to be ovpns##, which is definitely the weirder case...)
Interface is not part of the bridge anymore
The system log and OpenVPN log were not too helpful; they only show link up, link down, fatal error... (maybe I can try a log level higher than the default to get more status...)

Ways to resolve the issue manually after all those issues appear (works almost every time):

Turn off all TAP interfaces.
Reset only the TAP interfaces that have issues. (Note that the status is not reported correctly in the GUI, as the ifconfig status doesn't match the GUI. Example: the GUI shows up while ifconfig shows the interface down [without the UP flag].)
Turn the TAP interfaces back on and everything is back to normal.
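The two symptoms above (the GUI saying up while ifconfig lacks the UP flag, and a TAP interface dropping out of the bridge) can be checked from a script. This is only a minimal sketch, not something taken from pfSense itself: the helper names are hypothetical, and it assumes the usual FreeBSD ifconfig output format. The helpers read ifconfig output on stdin so they can be tried without touching a live interface.

```sh
#!/bin/sh
# Hypothetical helpers; both read `ifconfig` output on stdin.

# Succeeds if the flags line carries the UP flag,
# e.g. "flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST>".
iface_is_up() {
    grep -Eq 'flags=[0-9a-f]+<([A-Z0-9_]+,)*UP[,>]'
}

# Succeeds if interface $1 appears as a "member:" line,
# as printed by `ifconfig bridgeN`.
is_bridge_member() {
    grep -Eq "^[[:space:]]*member: $1 "
}
```

Possible usage, with bridge0 and ovpns18 as example names only: `ifconfig ovpns18 | iface_is_up || echo "ovpns18 is down"` and `ifconfig bridge0 | is_bridge_member ovpns18 || echo "ovpns18 missing from bridge0"`. As noted in this post, detecting the problem is the easy part; bringing the interface up again from a script was not enough on its own.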

I conclude that an sh script with "ifconfig interface down/up" won't be enough to resolve the issue. The same goes for a php script to enable/disable the interface. TUN seems to be doing much more than turning the interfaces on and off. If I do only that for TAP, I am definitely missing some key component in the script. Changing it manually does more than just turn the interfaces on/off; I believe it actually resets the bridge, interfaces, and routes in some way, which is probably why it works manually but not with a script.

So as far as I can tell there is no perfect solution yet for TAP. Use TUN if possible, as it's faster and more reliable, unless TAP is absolutely necessary, like poor me where I have to use it no matter what for a shared subnet across the sites.

Thanks!

In the meantime, if others have ideas I would like to try them :)

wakson005

@wakson005 Is this the same as pressing the VPN restart button? "/usr/local/sbin/pfSsh.php playback svc restart openvpn server Server1" If so, what do I need to put in place of Server1: is it "S00000C00001TAP00" or "ovpns18" or "18" or "Server 18"? Same question for how to do this for the client, though the client might be fine.

Like which one is the correct one to run as it just said run

Like, I tried "/usr/local/sbin/pfSsh.php playback svc stop 18" and got back

but the GUI shows:

and the status stayed the same in the GUI, which makes me think the GUI is not updating because the script doesn't update it, like in all the other cases I have seen for interfaces.

I think the above will get me many steps closer to a solution, as the VPN restart needs to be done per TAP interface. TUN is working, so I don't want to touch those.

wakson005

OK, for VPN restart, start, and stop, refer to:
https://forum.netgate.com/topic/176435/disable-openvpn-clients-on-reboot/3

I will try this with my current code. Hopefully it fixes a lot of my issues, as this was probably the key ingredient I was missing...

stephenw10

Yup, you would use: pfSsh.php playback svc restart openvpn server 18

As shown:

Netgate pfSense Plus shell: playback svc

Playback of file svc started.

Usage: playback svc <action> <service name> [service-specific options]

Examples:
playback svc stop dhcpd
playback svc restart openvpn client 2
playback svc stop captiveportal zone1

wakson005

@stephenw10 Thanks, that resolved my issue :) as it lets me restart the OpenVPN server and client perfectly. Doing final testing before calling everything foolproof.

wakson005

@stephenw10 The script is supposed to run continuously and check CARP for when the master-to-backup transition occurs.

The script works fine when I do the following:
Diagnostics > Command Prompt > Execute Shell Command, and enter:
/usr/local/bin/openvpn_server_client_tap_auto_failover.sh

The issue is that this forever loop stops at some point, as I think it is not meant to run forever until shutdown.
I tried moving the .sh script to:
/usr/local/etc/rc.d/openvpn_server_client_tap_auto_failover.sh
and that causes it to trigger multiple times for some reason, as if it resets itself and runs again.

Is there somewhere to run an sh script at boot and let the loop run forever until shutdown? Restarting the script doesn't work, as it stores a temporary record of the previous CARP state so it knows whether to reset or not. If the script starts fresh every time, it will always reset, as it assumes the CARP status changed.
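The failover script itself is not shown in this thread, so the following is only a rough sketch of how the CARP check might look. It assumes ifconfig prints a line such as "carp: MASTER vhid 1 advbase 1 advskew 0" for the interface holding the VIP; the function name carp_state is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print the CARP state (e.g. MASTER or BACKUP) of the
# first vhid found in `ifconfig` output read from stdin.
carp_state() {
    awk '$1 == "carp:" { print $2; exit }'
}
```

It could be called as `ifconfig em1 | carp_state`, where em1 stands in for whichever interface carries the LAN VIP.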

stephenw10

Can you see what's killing the script?

wakson005

@stephenw10 Sounds good. Is there an easy way to monitor the script's comment output to know what is occurring?

I have been using the Filer package to add my scripts and running them all from the command prompt for testing, and it's working great. Is there a better way to monitor a script than the output below? With that I will only know that it stopped, not why it stopped. I assume there might be a kill switch for the for or while loop for some reason...

# Start off the comments with overwrite
echo "# This is a comment" > /path/to/taperrorlog.txt
# then use below to append to current file
echo "# This is a comment" >> /path/to/taperrorlog.txt
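A small helper makes this kind of logging easier to read afterwards by appending a timestamp to every line. This is just a sketch: log_msg and LOG_FILE are made-up names, and the /tmp path is an assumption, not a pfSense convention.

```sh
#!/bin/sh
# Hypothetical log helper; LOG_FILE defaults to an assumed path under /tmp.
LOG_FILE="${LOG_FILE:-/tmp/taperrorlog.txt}"

# Usage: log_msg <text...>
# Appends "<text> Time: HH:MM:SS" to the log file.
log_msg() {
    printf '%s Time: %s\n' "$*" "$(date +%H:%M:%S)" >> "$LOG_FILE"
}
```

Calling `log_msg "Start Script: $0"` at the top of each script shows when, and how often, it is launched. Logging the exit status of each command the loop runs would help narrow down why it stops, not only that it stopped.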

Hopefully this gives me a better idea. From my rudimentary understanding, .sh scripts under the "/usr/local/etc/rc.d/" directory will run automatically, but I'm not sure how that handles a script with a while/for loop that never ends. There is possibly a kill switch to prevent infinite loops. Based on what I have seen, I think I need this approach to keep it running forever:

#!/bin/sh
# PROVIDE: autostartopenvpntap
# AFTER: NETWORKING
# KEYWORD: shutdown

. /etc/rc.subr

name="autostartopenvpntap"
desc="Auto Start OpenVPN TAP Connections"
rcvar="${name}_enable"

start_cmd="${name}_start"
stop_cmd="${name}_stop"

autostartopenvpntap_start() {
    # Add your script execution command here
#    while true; do
        # .sh script below has local variable being stored for comparison check later
        # So this needs to be fixed if not it wont work. Easiest solution is to move
        # openvpn_server_client_tap_auto_failover.sh into this script I think...
        /usr/local/bin/openvpn_server_client_tap_auto_failover.sh
        echo "Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh from autostartopenvpntap" >> /usr/local/bin/tapErrorLog.txt
#    done
}

autostartopenvpntap_stop() {
    # Add the command to stop your script here
    pkill -f /usr/local/bin/openvpn_server_client_tap_auto_failover.sh
}

load_rc_config $name
run_rc_command "$@"

## Set the script to start on boot by adding the following line to /etc/rc.conf.local:
## autostartopenvpntap_enable="YES"
## Reboot the system or start the script manually using the following command:
## /usr/local/etc/rc.d/autostartopenvpntap.sh start
#############################################################
## File:
##   /usr/local/etc/rc.d/autostartopenvpntap.sh
## Permissions:
##   755
## Script/Command:
##   N/A
#############################################################

wakson005

@stephenw10 OK, I found out why after echoing output to a text file. It turns out the script runs a bunch of times and ends up in an infinite reset loop, so that's why I see it go up and then go back down instantly... So saving files in this location, "/usr/local/etc/rc.d", runs the script repeatedly??? lol... Still trying to figure that out.

The best way is to save the data to a temporary file and load it during each loop, as I think the local variable is reset each time. If it were the same script running in a loop, the local variable data would be maintained, but it seems like a new script is loaded and run each time, so local data won't be maintained.

Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh from autostartopenvpntap Time: 12:05:28
Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh Time: 12:05:28
Start Script: /usr/local/bin/enable_bridge_tap.sh Time: 12:05:28
Start Script: /usr/local/bin/disable_bridge_tap.sh Time: 12:05:28

Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh from autostartopenvpntap Time: 12:06:22
Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh Time: 12:06:22
Start Script: /usr/local/bin/enable_bridge_tap.sh Time: 12:06:22
Start Script: /usr/local/bin/disable_bridge_tap.sh Time: 12:06:23

Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh from autostartopenvpntap Time: 12:07:33
Start Script: /usr/local/bin/openvpn_server_client_tap_auto_failover.sh Time: 12:07:33
Start Script: /usr/local/bin/enable_bridge_tap.sh Time: 12:07:33
Start Script: /usr/local/bin/disable_bridge_tap.sh Time: 12:07:33
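The idea of keeping the previous CARP state in a temporary file, so that a freshly started script does not treat its first check as a state change, might look like the sketch below. The function name carp_changed and the STATE_FILE path are assumptions for illustration only.

```sh
#!/bin/sh
# Hypothetical state tracking; STATE_FILE is an assumed path.
STATE_FILE="${STATE_FILE:-/tmp/tap_failover_carp.state}"

# Usage: carp_changed <current-state>
# Succeeds only if a previous state was recorded and it differs from the
# current one. Always records the current state for the next call.
carp_changed() {
    current="$1"
    previous=""
    if [ -f "$STATE_FILE" ]; then
        previous="$(cat "$STATE_FILE")"
    fi
    printf '%s\n' "$current" > "$STATE_FILE"
    [ -n "$previous" ] && [ "$previous" != "$current" ]
}
```

With this, a first run with no state file reports no change, repeated runs in the same state stay quiet, and only a real MASTER/BACKUP transition would trigger the TAP reset, whether the script is a long-running loop or is started fresh each time.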

stephenw10

Hmm, if it's really an rc script like that it will get triggered by package/service restarts etc which could explain the multiple instances.

The only time I've dealt with this was with the lcdproc package. We had to add a line to kill any existing instances before starting a new process.

I would have expected it to run fine as a shellcmd to be honest.
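The "kill any existing instances before starting a new process" approach could be sketched with a PID file as below. This is not the lcdproc package's actual code; claim_single_instance and the PID_FILE path are hypothetical, and a PID file can in principle point at an unrelated process if the PID has been reused.

```sh
#!/bin/sh
# Hypothetical single-instance guard; PID_FILE is an assumed path.
PID_FILE="${PID_FILE:-/var/run/tap_failover.pid}"

# Stop a previously recorded instance if it is still alive,
# then record this shell's PID as the current instance.
claim_single_instance() {
    if [ -f "$PID_FILE" ]; then
        old_pid="$(cat "$PID_FILE")"
        if [ -n "$old_pid" ] && [ "$old_pid" != "$$" ] \
            && kill -0 "$old_pid" 2>/dev/null; then
            kill "$old_pid"
        fi
    fi
    echo "$$" > "$PID_FILE"
}
```

Calling `claim_single_instance` once at the top of the long-running script means that every extra start triggered by a package or service restart replaces the old loop instead of running alongside it.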

wakson005

@stephenw10 Yeah, it's really weird. I guess my best option is to move it completely out of that location. If, as you said, it is triggered by package/service restarts, then there are too many things going on that could trigger it. I will move it to "/usr/local/bin/" and use cron to trigger my script in an infinite loop. It's the next best solution really lol. Not the best way, but it's what I have, I guess.

My script works when I trigger it manually through shellcmd during the failover, but it doesn't work when there is all this other stuff triggering it in an infinite loop.