    New Package: Service Watchdog

    pfSense Packages
      doktornotor Banned

      @jimp:

      Odd, cron shows fine here for me on the latest snapshot (also amd64).

      Ditto, and yeah, cron makes no sense there. :D

      As for packages, almost all of them seem to be missing descriptions (gwled, blinkled, nut, darkstat). The only installed package whose description I can see is unbound.

        jimp Rebel Alliance Developer Netgate

        The packages apparently aren't setting their own (proper) service description, so the status page pulled their package descriptions instead. I just pushed a fix for that (and to skip cron and empty services).

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

          phil.davis

          Looks good - I can stop services from status services, and they spring back to life 1 minute later. Looking forward to trying to list multiple OpenVPN servers (a site-to-site and a road-warrior) once the necessary new snapshot comes.
          One thought - during the boot process cron is configured/started. The service watchdog job might run while the boot is still in progress. If so, it should not really do anything, as the boot might still be getting things up and running. Should the code check $g['booting'] and bail out in that case?
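
A minimal sketch of the boot guard suggested here, written in Python rather than the package's PHP. The function name `check_services` and the `is_running` / `start` callbacks are hypothetical stand-ins, not the package's actual code; `booting` plays the role of `$g['booting']`.

```python
from typing import Callable, List


def check_services(
    booting: bool,
    watched: List[str],
    is_running: Callable[[str], bool],
    start: Callable[[str], None],
) -> List[str]:
    """Restart watched services that are down; do nothing while booting."""
    if booting:
        # The boot process may still be bringing services up, so bail out.
        return []
    restarted = []
    for name in watched:
        if not is_running(name):
            start(name)
            restarted.append(name)
    return restarted
```

With the guard in place, a cron run that fires mid-boot returns without touching anything, and the next run after boot finishes handles whatever is still down.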

          As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
          If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

            jimp Rebel Alliance Developer Netgate

            Yeah you're right. I just pushed a new rev to make it do nothing if it's booting.

              doktornotor Banned

              You might as well blacklist unbound, I'd say; it has its own "watchdog" script, /usr/local/bin/unbound_monitor.sh. Though I'm not completely sure we wouldn't be better off dropping the looping shell script from unbound and using this package instead. :)

                phil.davis

                Squid also has its own sqp_monitor process. That function could now be done by this package and the special sqp_monitor code removed, if anyone cares or thinks it is a good idea.

                  Supermule Banned

                  Just a quick thought…Snort running a heavy set of rules can take minutes to start and running this every minute could cause Snort to start multiple times... Would it be a thing to make a 5 minute penalty period after a boot before the script begins to monitor packages??

                    phil.davis

                    2.1-RC1 (i386)
                    built on Wed Aug 28 16:55:08 EDT 2013
                    FreeBSD 8.3-RELEASE-p10

                    I created 2 OpenVPN servers on a test system. They both appear in the dropdown list of services to add for Service Watchdog. After adding the 1st server, the 2nd server no longer appears in the dropdown list, so I can't add it as well.

                    I stopped NTPD and both OpenVPN server services from status services. Waited a few minutes. NTPD and Test Server 1 restarted.

                    Does the dropdown list need a bit more tweaking to allow multiple individual OpenVPN servers to be added to the watch list?

                      phil.davis

                      @jimp:

                      Yeah you're right. I just pushed a new rev to make it do nothing if it's booting.

                      We also have situations where the boot script itself gets "killed: out of swap space". In that case, the /var/run/bootup file never gets removed, so it always looks like the system is booting. I submitted pull requests so that Service Watchdog will start doing its thing anyway 15 minutes after boot time. See what you think.
                      I engineered it to put a new function, get_uptime_sec, into the base system for the benefit of anything that cares to call it. But that means the base system change has to go into the 2.1 branch, and people have to have it in their snapshot to use my changes to Service Watchdog. So feel free to engineer it however…
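
The proposal above amounts to treating the boot flag as stale once the system has been up long enough. A rough Python sketch, assuming the uptime in seconds is supplied by something like the proposed get_uptime_sec; the name `should_skip_check` and the constant are hypothetical, and only the 15-minute figure comes from the post.

```python
BOOT_GRACE_SECONDS = 15 * 60  # 15 minutes, the figure given in the post


def should_skip_check(
    boot_flag_present: bool,
    uptime_seconds: float,
    grace_seconds: float = BOOT_GRACE_SECONDS,
) -> bool:
    """Return True when the watchdog should do nothing on this run.

    The boot flag is honoured only during the grace period. After that,
    a leftover flag file is assumed to be stale and is ignored.
    """
    return boot_flag_present and uptime_seconds < grace_seconds
```

For example, `should_skip_check(True, 60)` skips the run, while `should_skip_check(True, 3600)` lets the watchdog proceed despite the leftover flag.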

                        jimp Rebel Alliance Developer Netgate

                        As I mentioned on the pull request, that isn't a good workaround. There are other things that will break if that flag file is left around too long, and the package shouldn't have to care about that.

                        It would be better to find a way to clean that up automatically in the base system, but that isn't really a discussion for this thread since it's unrelated.

                          jimp Rebel Alliance Developer Netgate

                          @Supermule:

                          Just a quick thought…Snort running a heavy set of rules can take minutes to start and running this every minute could cause Snort to start multiple times... Would it be a thing to make a 5 minute penalty period after a boot before the script begins to monitor packages??

                          The "snort" binary would be in the list and it should show that it's running as far as this check is concerned.

                          That said, it probably would not work right with snort anyhow, since an instance for one interface could die and this would never know, because of how snort handles its instances. It would only show snort as down if all instances were dead.
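
To illustrate the limitation, here is a small Python model of a check by binary name next to a per-instance check. The `(binary, interface)` tuple model, the function names, and the interface labels are all hypothetical; this is not how snort or the package actually tracks instances.

```python
from typing import List, Tuple

# (binary name, interface the instance is bound to) - a made-up model
Process = Tuple[str, str]


def running_by_name(procs: List[Process], binary: str) -> bool:
    """Name-only check: true as long as any instance of the binary exists."""
    return any(name == binary for name, _ in procs)


def dead_instances(
    procs: List[Process], binary: str, expected_ifaces: List[str]
) -> List[str]:
    """Per-instance check: list expected interfaces with no live instance."""
    alive = {iface for name, iface in procs if name == binary}
    return [iface for iface in expected_ifaces if iface not in alive]
```

If only the "wan" instance survives, `running_by_name` still reports snort as up, while `dead_instances` would flag "lan" as missing.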

                            jimp Rebel Alliance Developer Netgate

                            @phil.davis:

                            2.1-RC1 (i386)
                            built on Wed Aug 28 16:55:08 EDT 2013
                            FreeBSD 8.3-RELEASE-p10

                            I created 2 OpenVPN servers on a test system. They both appear in the dropdown list of services to add for Service Watchdog. After adding the 1st server, the 2nd server no longer appears in the dropdown list, so I can't add it as well.

                            I stopped NTPD and both OpenVPN server services from status services. Waited a few minutes. NTPD and Test Server 1 restarted.

                            Does the dropdown list need a bit more tweaking to allow multiple individual OpenVPN servers to be added to the watch list?

                            Yes that still needs some work.

                              jimp Rebel Alliance Developer Netgate

                              OpenVPN and captive portal instance matching should be fixed in 1.4, up now.

                                doktornotor Banned

                                Thanks again for the work on this, this feature hopefully should get to the base install once the whole stuff gets polished.

                                  jimp Rebel Alliance Developer Netgate

                                  @doktornotor:

                                  Thanks again for the work on this, this feature hopefully should get to the base install once the whole stuff gets polished.

                                  Seems better as a package to me. Not everyone needs/wants it, and it can react to changes faster as a package. After it's fairly well set it may not change much, but it still seems like a better fit as an add-on.

                                    kejianshi

                                    There are certain conditions that, in the past, have meant that racoon was essentially offline (not able to connect anyone properly), even though the process still shows as running.

                                    If all those conditions haven't been cleared up, any chance to use this package to restart it when racoon goes buggy?

                                      jimp Rebel Alliance Developer Netgate

                                      Not likely, if the process is running this would believe it to be up.

                                      That kind of check would add a whole mess of code that would be irrelevant to anything else it does. Seems like maybe a better fit as some sort of dedicated racoon watchdog that is capable of more than a running/not running check.
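
The difference between a running/not running check and a functional check can be sketched as follows. This is an illustrative Python model only; `needs_restart` and the optional `health_probe` callback are hypothetical and do not exist in the package.

```python
from typing import Callable, Optional


def needs_restart(
    process_running: bool,
    health_probe: Optional[Callable[[], bool]] = None,
) -> bool:
    """Decide whether a daemon should be restarted.

    Without a probe this is a plain running/not running check, so a
    daemon that is alive but not working is never restarted.
    """
    if not process_running:
        return True
    if health_probe is None:
        return False
    # A dedicated watchdog could plug a service-specific test in here.
    return not health_probe()
```

A dedicated watchdog would supply a probe that exercises the service itself, which is exactly the service-specific code that does not belong in a generic package.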

                                        phil.davis

                                        Another little tweak:

                                        function is_service_enabled($service_name)
                                        

                                        servicewatchdog_check_services() could check is_service_enabled and only bother to try and start it if it is both enabled and not running.
                                        It is probably best to allow people to add whatever services they like to the Service Watchdog watch list, as it is now. Then they can be in the list ready to be watched, even if they happen to be disabled at any particular time.
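
The suggested filter could look roughly like this, sketched in Python rather than PHP. `services_to_start` and both callbacks are hypothetical; `is_enabled` stands in for is_service_enabled and `is_running` for whatever status check the package already performs.

```python
from typing import Callable, List


def services_to_start(
    watched: List[str],
    is_enabled: Callable[[str], bool],
    is_running: Callable[[str], bool],
) -> List[str]:
    """Return watched services that are enabled but not currently running.

    Disabled services stay on the watch list; they are simply skipped
    until they are enabled again.
    """
    return [s for s in watched if is_enabled(s) and not is_running(s)]
```

The watch list itself is never modified, so a service that is disabled today is picked up again as soon as it is re-enabled.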

                                          phil.davis

                                          @jimp:

                                          OpenVPN and captive portal instance matching should be fixed in 1.4, up now.

                                          I can "watchdog" multiple OpenVPN instances now, and Watchdog restarts them if I stop them. A great thing.

                                            jimp Rebel Alliance Developer Netgate

                                            @phil.davis:

                                            Another little tweak:

                                            function is_service_enabled($service_name)
                                            

                                            servicewatchdog_check_services() could check is_service_enabled and only bother to try and start it if it is both enabled and not running.
                                            It is probably best to allow people to add whatever services they like to the Service Watchdog watch list, as it is now. Then they can be in the list ready to be watched, even if they happen to be disabled at any particular time.

                                            I can add that, but that function only works for packages, not for base system services. And even then, only for packages that actually support an enable option using the exact option name that function checks for.

                                            Seems safer to put the burden on the user to only watch services they know they need to stay active.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.