Using cron to restart dpinger
-
Hello,
I have a multi-WAN installation. WANA is a slow satellite connection and WANB is a fast cellular connection. The WANA port on pfSense is connected to the satellite, and the WANB port is connected to a LAN port of a MikroTik router. The MikroTik's WAN port is connected to the LAN port of the cellular modem. (Yes, I know, double NAT or more.) But things work well.
The reason for using the MikroTik is that it runs an L2TP client to a VPN provider, thus bypassing the video streaming resolution limits of the cellular provider. The MikroTik does not support OpenVPN using TLS, hence I have to use L2TP.
I have gateway monitoring enabled. Everything runs great, but when the cellular modem reboots (due to loss of signal or whatever), the VPN connection goes down. When the modem is back up, the connection is good again and the VPN comes back up as well. However, pfSense no longer sees the connection as good.
The monitoring IP I use is 4.2.2.2, and I also tried the PPP server-side address of the VPN, which is 10.100.100.1. Both of these respond to ping requests, but pfSense still does not see them after the VPN goes down and comes back up on the MikroTik. If I turn off the VPN and reboot the modem, the connection comes back up and pfSense sees it. So something about the VPN causes dpinger to not see the connection come back up properly.
I can go into diagnostics and ping 4.2.2.2 and other addresses through WANB and replies come back. It's just that dpinger no longer sees the connection properly.
I have tried messing with thresholds and payload size but nothing works.
The only thing that seems to work is if I stop and start the dpinger service.
I have set up a cron manager in pfSense and would like to use it to stop and start dpinger properly. I can set up the schedules without a problem; I just need to know the command to restart dpinger (or to stop it and then start it).
I seem to have run out of options trying to fix this in a different way.
-
See https://forum.netgate.com/post/922843. I set up the simple script for that client and a cron job that runs every half hour as "/usr/bin/nice -n20 /usr/local/bin/php-cgi -f /root/gateway_check.php". Note that if you don't use the cron package, the job will disappear. I think it was at the next boot, if not the next pfSense upgrade.
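Since the job can silently vanish, it may be worth checking for it after a reboot or upgrade. This is only a rough sketch: it assumes the job ends up in /etc/crontab (I have not confirmed that for every version), and the function name and variables are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether the gateway check job is still scheduled.
# Assumptions (not confirmed in this thread): the job is stored in
# /etc/crontab and its command line contains "gateway_check.php".

CRONTAB_FILE="${CRONTAB_FILE:-/etc/crontab}"      # assumed location
JOB_PATTERN="${JOB_PATTERN:-gateway_check.php}"   # script name from the post above

# Succeeds (exit 0) if a non-commented line mentions the job.
cron_job_present() {
    # First grep drops comment lines; second does a quiet fixed-string match.
    grep -v '^[[:space:]]*#' "$CRONTAB_FILE" 2>/dev/null | grep -qF "$JOB_PATTERN"
}

# Only act when called with "run", so the file can be sourced without side effects.
if [ "${1:-}" = "run" ]; then
    if cron_job_present; then
        echo "cron job present"
    else
        echo "cron job missing"
    fi
fi
```

Run it from a shell with the `run` argument after a reboot or an upgrade; if it reports the job as missing, re-add it through the cron package.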
-
I noticed this was fixed in 2.5/21.2:
https://redmine.pfsense.org/issues/10546
"In this case, pfsense will consider a gateway down when it has actually returned to a normal state, necessitating administrator action to return it back to a proper state."
-
@teamits Are you saying that this is fixed (as in not requiring admin action)?
Because right now I am using cron to restart dpinger every 2 minutes. It is working to satisfaction. But I would rather have pfSense handle this process internally and not via a hack.
(Running v2.4.5).
For anyone interested, the cron job I set up is this:
*/2 * * * * root /usr/local/sbin/pfSsh.php playback svc restart dpinger
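A gentler variant would be to restart dpinger only when the link has just recovered, instead of unconditionally every 2 minutes. The following is a sketch, not something I have run in production. It assumes FreeBSD's ping flags (-c count, -t overall timeout, -S source address); the source address, state file path, and function name are hypothetical. Only the monitor IP and the restart command come from this thread.

```shell
#!/bin/sh
# Sketch: restart dpinger once when the WANB path goes from failing to working.
# Hypothetical values: SRC_ADDR (pfSense's WANB address) and STATE_FILE.

SRC_ADDR="${SRC_ADDR:-192.168.88.2}"              # hypothetical WANB address
TARGET="${TARGET:-4.2.2.2}"                       # monitor IP from the first post
STATE_FILE="${STATE_FILE:-/tmp/wanb_link.state}"  # hypothetical state file
# FreeBSD ping: 3 probes, 5 second overall timeout, sent from the WANB address.
PROBE_CMD="${PROBE_CMD:-ping -c 3 -t 5 -S $SRC_ADDR $TARGET}"
RESTART_CMD="${RESTART_CMD:-/usr/local/sbin/pfSsh.php playback svc restart dpinger}"

check_and_restart() {
    # Result of the previous run; treated as "up" if there is no state yet.
    previous="up"
    [ -f "$STATE_FILE" ] && previous=$(cat "$STATE_FILE")

    # Probe the path now (the command is split into words on purpose).
    if $PROBE_CMD >/dev/null 2>&1; then
        current="up"
    else
        current="down"
    fi
    echo "$current" > "$STATE_FILE"

    # Restart only on the down -> up transition.
    if [ "$previous" = "down" ] && [ "$current" = "up" ]; then
        $RESTART_CMD
    fi
}

# Only act when called with "run", so the file can be sourced without side effects.
if [ "${1:-}" = "run" ]; then
    check_and_restart
fi
```

Saved under some path of your choosing, the cron entry would call the script with the `run` argument in place of the direct restart. Two caveats: an outage shorter than the cron interval can be missed, and this does nothing when dpinger marks the gateway down for latency or loss while pings still get through, so the unconditional restart remains the safer fallback.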
-
I haven't upgraded our client's router yet (and probably won't for a while) but the redmine issue is marked as resolved. Not sure if this is your issue but it sounds an awful lot like mine.
-
@teamits As far as I can tell it is not resolved. Either changing the gateway monitoring IP address or stopping and restarting the dpinger service from the web interface works. That is not practical because this particular installation is at a client's office. So I resorted to using cron to restart the service every 2 minutes. It has worked out great.
-
Additionally, I reboot their cellular modem automatically at 4 a.m., or it reboots on its own when the cell connection starts experiencing high latency (for example, when it attaches to a tower that is not ideal). When this happens, pfSense switches to the backup connection (satellite) and does not come back even if the cellular connection is working again. Doing the dpinger restart via cron ensures that the connection is seen as good and all the routing priorities go through it again.
-
@rizwan602 said in Using cron to restart dpinger:
pfSense switches to the backup connection (satellite) and does not come back even if the cellular connection is working again
That sounds like the issue our client had (in the post I referenced last month), and sounds like the bug fixed in 2.5.
-
Bit of a necromancy, but I'm on 2.5.2 and have had this issue for months. One of my two WAN connections will fail out due to latency or packet loss, then never come back up. The moment I change settings on the gateway itself, or restart dpinger, it is resolved.
At the moment I am also just restarting dpinger all the time with a cron job, but I don't love it.
-
@mantis0711 See thread https://forum.netgate.com/topic/167206/gateway-drops-and-never-comes-back for another report and diagnosis.