Netgate Discussion Forum

    Beta test of new NUT UPS package

    pfSense Packages
    114 Posts 15 Posters 33.8k Views
    • w0w

      @dennypage:

      It introduces a ten second delay in all alert circumstances. However the issue we are trying to address only occurs on service start and only with a slow (remote) UPS.

      I am not sure we are talking about the same thing. I tested it in Firefox and it behaves as it should. If the UPS connection is established, there is no delay in loading the status page (less than 1 second). If there is a connection problem (cable unplugged or similar), the page returns "Status Alert: The UPS requires attention", as it should, but only after the 10 second delay; if the connection is established during that 10 second sleep, it returns the UPS data instead. Why would sleep be called in all alert circumstances? I don't understand, sorry. I am not a PHP programmer, but the logic should be the same in all the languages I know. If I am doing something wrong, feel free to tell me :)

      @dennypage:

      Btw, forgot to ask this earlier: Is the use of v2c a holdover from the prior NUT package, or did you explicitly configure it? Have you tried running without it?

      It's not a holdover from the prior package; it's required for some other clients on the network.

      EDIT:
      Now I understand what you mean; please forgive my stupidity. I looked more closely at nut.inc.
      Yes, we need some other way to fix it.

      • dennypage

        I'm not a PHP guy either. I'm a C and assembler programmer. :)

        The proposed change introduces a 10 second delay to page processing if there is any alert condition. Let's say you have a UPS, either local or remote, which has been running fine and then goes into an alert state. You click on the widget header to go to the UPS status page. That page load will experience a 10 second delay, which it shouldn't.

        The core issue comes from the delay between service start and service availability due to the time taken in driver initialization when talking to the SNMP UPS. It is only during this interval that a delay would be appropriate.

        @w0w:

        @dennypage:

        It introduces a ten second delay in all alert circumstances. However the issue we are trying to address only occurs on service start and only with a slow (remote) UPS.

        I am not sure we are talking about the same thing. I tested it in Firefox and it behaves as it should. If the UPS connection is established, there is no delay in loading the status page (less than 1 second), and if there is a connection problem (cable unplugged or similar), it returns "Status Alert: The UPS requires attention", as it should. Why should sleep be called in all alert circumstances?

        • w0w

          Now I understand what you mean; please forgive my stupidity. I looked more closely at nut.inc.
          Yes, we need some other way to fix it.

          • dennypage

            Okay. Just to make sure I understand, driver initialization fails if you remove the snmp_version=v2c from the Extra Arguments? What is in the log file?

            • w0w

              No, driver initialization does not fail. I am using the SNMP card with other devices, and one of them polls using SNMP v2c, so I decided to configure the SNMP card for this version and configure all clients to use it.

              OK, I have restricted the sleep to the "UPS Monitor not running" condition only. This looks much better ;)

              
              // Wait for a slow driver start, but only when the monitor is not running yet.
              $status = nut_ups_status();
              if ($status['_summary'] == "UPS Monitor not running") {
                  sleep(10);
              }

              $pgtitle = array(gettext("Services"), gettext("UPS"), gettext("Status"));
              include("head.inc");
              $tab_array = array();
              $tab_array[] = array(gettext("UPS Status"), true, "/nut_status.php");
              $tab_array[] = array(gettext("UPS Settings"), false, "/nut_settings.php");
              display_top_tabs($tab_array);

              // Re-read the status after the optional wait, then show the alert box if needed.
              $status = nut_ups_status();
              if ($status['_alert']) {
                  print_info_box("Status Alert: The UPS requires attention", "alert-danger");
              }
              
              
              • dennypage

                I sent out a new version to everyone. If you have a moment, please give it a run and let me know if you see any issues. Thanks!

                • w0w

                  Nothing changed for me.

                  same.jpg

                  • dennypage

                    Yes, the changes affect service status on restart and reboot. Please confirm the checksum of nut.inc:

                    49946 12 /usr/local/pkg/nut/nut.inc

                    If you haven't already, please go to the settings page and re-save before testing again. Thanks.
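A minimal sketch of how such a checksum comparison could be scripted. It assumes the "49946 12" pair above is the checksum and block count printed by sum(1); that is a guess based on the output format, and the helper name check_sum is made up for illustration.

```shell
#!/bin/sh
# Compare a file's sum(1) output with an expected checksum and block count.
# Assumption: the expected values were produced by sum(1) on the same platform.
# Pass the expected values without leading zeros.
check_sum() {
    file=$1
    expected_checksum=$2
    expected_blocks=$3
    # Read from stdin so no file name is printed; "+ 0" strips padding and leading zeros.
    actual=$(sum < "$file" | awk '{print $1 + 0, $2 + 0}')
    [ "$actual" = "$expected_checksum $expected_blocks" ]
}
```

It could then be called as: check_sum /usr/local/pkg/nut/nut.inc 49946 12 && echo "nut.inc matches"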

                    • dennypage

                      Also if you still see an issue with service restart, please post fresh system log file entries for ups*. Thanks.

                      • w0w

                        I have reinstalled the package completely, and now a slightly different error appears.
                        I looked at nut.sh.
                        You forgot to remove the ampersand in /usr/local/sbin/upsdrvctl start &
                        When I remove it, everything works as it should!

                        some_new.jpg

                        • dennypage

                          Yes, that makes much more sense, and what I was expecting to see.

                          Removing the ampersand was a testing item to ensure I understood what was happening on your system. I wasn't intending to remove it from the production package as it is required to support the startup retry.
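To illustrate what the ampersand changes, here is a small self-contained sketch. The slow_driver_start function is a hypothetical stand-in that just sleeps for two seconds; it is not the real upsdrvctl, and the timing is only an example.

```shell
#!/bin/sh
# Hypothetical stand-in for a slow driver start; NOT the real upsdrvctl.
slow_driver_start() {
    sleep 2
}

start=$(date +%s)
slow_driver_start &          # backgrounded: the script continues immediately
driver_pid=$!
elapsed_background=$(( $(date +%s) - start ))

wait "$driver_pid"           # only now block until the background job finishes
elapsed_total=$(( $(date +%s) - start ))

echo "continued after ${elapsed_background}s, driver finished after ${elapsed_total}s"
```

Without the ampersand, the script would not reach the next command until slow_driver_start returned, which is the blocking behaviour discussed in this thread.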

                          The thing that I fixed was the service status issue (the restart vs start button at the top). The "Failed to retrieve status" is kind of a fact of life because it is taking a very long time to initialize via SNMP. About the only thing that I could do about that is to have it report "pending", but then you would still have to press refresh in the page because the UI pages don't auto-refresh.

                          • w0w

                            Hmm… startup retry? Why do we need that?

                            Without the ampersand, everything works.

                            
                             Aug 2 09:35:23 	upsd 	5793 	User monuser@::1 logged into UPS [SMK-1000A]
                            Aug 2 09:35:23 	upsd 	5793 	Startup successful
                            Aug 2 09:35:23 	upsd 	5467 	Connected to UPS [SMK-1000A]: snmp-ups-SMK-1000A
                            Aug 2 09:35:23 	upsd 	5467 	listening on 127.0.0.1 port 3493
                            Aug 2 09:35:23 	upsd 	5467 	listening on ::1 port 3493
                            Aug 2 09:35:21 	snmp-ups 	5221 	Startup successful
                            Aug 2 09:35:14 	upsmon 	3642 	Startup successful
                            Aug 2 09:35:14 	snmp-ups 	49799 	Signal 15: exiting
                            Aug 2 09:35:14 	upsd 	64420 	Signal 15: exiting
                            Aug 2 09:35:14 	upsd 	64420 	mainloop: Interrupted system call
                            Aug 2 09:35:14 	upsd 	64420 	User monuser@::1 logged out from UPS [SMK-1000A]
                            Aug 2 09:35:14 	upsmon 	48594 	Signal 15: exiting 
                            
                            

                            With ampersand (Failed to retrieve status)

                            
                            Aug 2 09:39:33 	upsmon 	96136 	Communications with UPS SMK-1000A established
                            Aug 2 09:39:30 	upsd 	96547 	Connected to UPS [SMK-1000A]: snmp-ups-SMK-1000A
                            Aug 2 09:39:29 	snmp-ups 	97935 	Startup successful
                            Aug 2 09:39:28 	upsmon 	96136 	UPS SMK-1000A is unavailable
                            Aug 2 09:39:28 	upsmon 	96136 	Poll UPS [SMK-1000A] failed - Driver not connected
                            Aug 2 09:39:23 	upsmon 	96136 	Communications with UPS SMK-1000A lost
                            Aug 2 09:39:23 	upsmon 	96136 	Poll UPS [SMK-1000A] failed - Driver not connected
                            Aug 2 09:39:23 	upsd 	96547 	User monuser@::1 logged into UPS [SMK-1000A]
                            Aug 2 09:39:21 	upsd 	96547 	Startup successful
                            Aug 2 09:39:21 	upsd 	96511 	Can't connect to UPS [SMK-1000A] (snmp-ups-SMK-1000A): No such file or directory
                            Aug 2 09:39:21 	upsd 	96511 	listening on 127.0.0.1 port 3493
                            Aug 2 09:39:21 	upsd 	96511 	listening on ::1 port 3493
                            Aug 2 09:39:20 	upsmon 	95585 	Startup successful
                            Aug 2 09:39:20 	snmp-ups 	5221 	Signal 15: exiting
                            Aug 2 09:39:20 	upsd 	5793 	Signal 15: exiting
                            Aug 2 09:39:20 	upsd 	5793 	mainloop: Interrupted system call
                            Aug 2 09:39:20 	upsd 	5793 	User monuser@::1 logged out from UPS [SMK-1000A]
                            Aug 2 09:39:20 	upsmon 	4091 	Signal 15: exiting 
                            
                            
                            • w0w

                              It looks like this "retry" mechanism is not working at all with the SNMP connection; it just causes the connection to be established twice.

                              • w0w

                                I increased the sleep in nut.sh:
                                ..........................................................
                                /usr/local/sbin/upsdrvctl start &
                                sleep 12
                                /usr/local/sbin/upsd -u root
                                ..........................................................

                                
                                Aug 2 09:49:14 	upsd 	31334 	User monuser@::1 logged into UPS [SMK-1000A]
                                Aug 2 09:49:13 	upsd 	31334 	Startup successful
                                Aug 2 09:49:13 	upsd 	31056 	Connected to UPS [SMK-1000A]: snmp-ups-SMK-1000A
                                Aug 2 09:49:13 	upsd 	31056 	listening on 127.0.0.1 port 3493
                                Aug 2 09:49:13 	upsd 	31056 	listening on ::1 port 3493
                                Aug 2 09:49:12 	snmp-ups 	30987 	Startup successful
                                Aug 2 09:49:01 	upsmon 	9616 	Startup successful
                                Aug 2 09:49:01 	snmp-ups 	6655 	Signal 15: exiting
                                Aug 2 09:49:01 	upsd 	7053 	Signal 15: exiting
                                Aug 2 09:49:01 	upsd 	7053 	mainloop: Interrupted system call
                                Aug 2 09:49:01 	upsd 	7053 	User monuser@::1 logged out from UPS [SMK-1000A]
                                Aug 2 09:49:01 	upsmon 	5726 	Signal 15: exiting 
                                
                                

                                Is it OK for you?
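As an alternative to a fixed sleep, a bounded wait could return as soon as the driver is ready. The following is only a sketch under assumptions: wait_for_path is a made-up helper, and it assumes that the driver socket appearing on disk is a good enough readiness signal. It still blocks for up to the timeout when the UPS is slow or unreachable.

```shell
#!/bin/sh
# Wait until a path exists, checking once per second, for at most $2 seconds.
# Returns 0 as soon as the path appears, 1 if the timeout expires first.
wait_for_path() {
    path=$1
    timeout=$2
    waited=0
    while [ ! -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in a start script (the socket directory is an assumption;
# the socket name is taken from the upsd log lines above):
#   /usr/local/sbin/upsdrvctl start &
#   wait_for_path /var/db/nut/snmp-ups-SMK-1000A 20
#   /usr/local/sbin/upsd -u root
```

In the common case, where the driver starts in under a second, this adds at most one second rather than a fixed twelve.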

                                • dennypage

                                  @w0w:

                                  Hmm… startup retry? Why do we need that?

                                  It was requested: https://forum.pfsense.org/index.php?topic=114871.msg638897#msg638897

                                  It's necessary to address the case of the UPS being unreachable at pfSense startup.

                                  • w0w

                                    I am sorry :'(, but maybe I have tested it the wrong way. If I unplug the SNMP card from the network and then restart the service, it does not reconnect to the UPS. All I get in the logs is:

                                    
                                    endless....
                                    Aug 2 10:22:46 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:41 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:36 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:31 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:26 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:21 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:16 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                    Aug 2 10:22:11 	upsmon 	50160 	UPS SMK-1000A is unavailable 
                                    
                                    

                                    It looks like it does not matter whether sleep 1 or sleep 12 is applied.

                                    • dennypage

                                      Do you still have maxretry and retrydelay in the advanced section?

                                      https://forum.pfsense.org/index.php?topic=114871.msg640123#msg640123

                                      You should be seeing a stream of messages from the snmp driver…

                                      @w0w:

                                      If I unplug the SNMP card from the network and then restart the service, it does not reconnect to the UPS. All I get in the logs is:

                                      
                                      endless....
                                      Aug 2 10:22:46 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:41 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:36 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:31 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:26 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:21 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:16 	upsmon 	50160 	Poll UPS [SMK-1000A] failed - Driver not connected
                                      Aug 2 10:22:11 	upsmon 	50160 	UPS SMK-1000A is unavailable 
                                      
                                      
                                      • w0w

                                        I double-checked my setup and config and found that I am an idiot.
                                        maxretry=60
                                        retrydelay=60
                                        were placed in upsmon.conf instead of ups.conf.
                                        I have removed both the ampersand and the sleep from nut.sh.
                                        Testing again.
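A quick way to catch this kind of misplacement is to check which file actually sets a directive. This is a minimal sketch; files_setting is a made-up helper, and the configuration paths in the usage note are assumptions about where the package writes its files.

```shell
#!/bin/sh
# Print the name of each given file that sets the given directive,
# i.e. contains a line of the form "name = value" or "name=value".
files_setting() {
    name=$1
    shift
    for f in "$@"; do
        if grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$f" 2>/dev/null; then
            echo "$f"
        fi
    done
}
```

For example: files_setting maxretry /usr/local/etc/nut/ups.conf /usr/local/etc/nut/upsmon.conf should list only ups.conf once the settings are in the right place.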

                                        • w0w

                                          It looks like everything works without the ampersand and the sleep.
                                          But with the ampersand removed, the status page takes a while to load when the UPS is disconnected ;).
                                          So I ask you again: is it OK with you if the sleep is 12 seconds instead of 1?
                                          With this value everything seems OK on my side. Reconnection after the SNMP card goes down works, and the status page is correct after a service restart with the SNMP card online. No issues currently!

                                          • dennypage

                                            I really can't do a 12 second sleep in service start. The 95% case is that UPS startup takes well under 1 second. An additional delay in service start affects everyone and everything including system startup (boot). Introducing a 10+ second delay to system startup would not be generally acceptable. It also creates a situation in which there can be multiple service starts underway at the same time, which can result in all sorts of problems with nut. And of course, 12 seconds works in this specific case, but what happens when a case comes along that requires 20 seconds for UPS initialization?

                                            The backgrounding for driver start is required to support initialization retries. Without backgrounding, status page load (and system startup) will hang until driver initialization completes. With the maxretry/retrydelay example previously posted, this could be 1 hour. Note that maxretry/retrydelay only affects driver initialization (startup). It does not affect disconnect/reconnect.
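The "1 hour" figure follows from the values quoted earlier in the thread (maxretry=60, retrydelay=60). As a rough worked example, treating the total as a simple product and ignoring the time each attempt itself takes:

```shell
#!/bin/sh
# Values from the maxretry/retrydelay settings quoted earlier in the thread.
maxretry=60      # number of initialization attempts
retrydelay=60    # seconds between attempts

# Approximate upper bound on the time spent waiting between attempts.
total_seconds=$((maxretry * retrydelay))
echo "worst case: ${total_seconds}s (about $((total_seconds / 60)) minutes)"
```

That is 3600 seconds, so a foreground driver start could block for roughly an hour if the UPS stays unreachable.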

                                            About the only thing I can do is have the status be "pending" like what is shown in the widget following system boot.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.