Beta test of new NUT UPS package
-
Yea, there isn't much ill that a 10 second sleep doesn't fix. However I don't think I can introduce a 10 second sleep in the UX. It creates a delay in communicating the error condition for local UPSs.
-
The sleep is activated only if the first $status query returns '_alert'; otherwise NO DELAY is applied. Under normal conditions, if I understand your nut.inc correctly, it should never return '_alert' while the connection is OK, regardless of whether the connection is local or remote.
-
It introduces a ten second delay in all alert circumstances. However the issue we are trying to address only occurs on service start and only with a slow (remote) UPS.
-
Btw, forgot to ask this earlier: Is the use of v2c a holdover from the prior NUT package, or did you explicitly configure it? Have you tried running without it?
-
I am not sure we are talking about the same thing. I have tested it with Firefox and it works as it should in the UX. If the UPS connection is OK and established, there is no delay in loading the status page; I mean the delay is less than 1 second. If there is a problem with the connection (cable unplugged or whatever), then it returns "Status Alert: The UPS requires attention" as it should, but after a 10 second delay, or it returns UPS data if the connection is established during those 10 seconds of sleep. Why would sleep be called in all alert circumstances? I don't understand. I am sorry, I am not a PHP programmer, but the logic is the same in all the languages I know. If I am doing something wrong, feel free to tell me :)
Btw, forgot to ask this earlier: Is the use of v2c a holdover from the prior NUT package, or did you explicitly configure it? Have you tried running without it?
It's not from prior, it's required for some other clients on network.
EDIT:
Now I understand what you mean; please forgive my stupidity. I looked more deeply into nut.inc.
Yes, we need some other way to fix it.
-
I'm not a PHP guy either. I'm a C and assembler programmer. :)
The proposed change introduces a 10 second delay to page processing if there is any alert condition. Let's say you have a UPS, either local or remote, which has been running fine and then goes into an alert state. You click on the widget header to go to the UPS status page. That page load will experience a 10 second delay, which it shouldn't.
The core issue comes from the delay between service start and service availability due to the time taken in driver initialization when talking to the SNMP UPS. It is only during this interval that a delay would be appropriate.
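To make that distinction concrete, here is a minimal shell sketch of gating a wait on how recently the service was started rather than on the alert state. The function name, its arguments, and the idea of recording a start timestamp are illustrative assumptions, not code from the package (whose status page is PHP).

```shell
#!/bin/sh
# Hypothetical helper: succeed only while the service is still inside its
# startup grace period.
#   $1  start_epoch  when the service was (re)started, seconds since the epoch
#   $2  now_epoch    current time, seconds since the epoch
#   $3  grace        how long driver initialization may take, in seconds
in_startup_window() {
    start_epoch=$1
    now_epoch=$2
    grace=$3
    age=$((now_epoch - start_epoch))
    # Inside the window only if the start is in the past and still recent.
    [ "$age" -ge 0 ] && [ "$age" -lt "$grace" ]
}
```

With something like this, a caller would wait only when the check succeeds, so a UPS that goes into alert long after startup is reported immediately. How the start time gets recorded (for example, a timestamp written by the start script) is left open here.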
-
Okay. Just to make sure I understand, driver initialization fails if you remove the snmp_version=v2c from the Extra Arguments? What is in the log file?
-
No, not the driver init. I am using the SNMP card with other devices, and one of them polls using SNMP v2c, so I decided to configure the SNMP card with this version and configure all clients to use it.
OK, I have restricted the sleep so that it runs only in the "UPS Monitor not running" condition. This looks much better ;)
$status = nut_ups_status();
if ($status['_summary'] == "UPS Monitor not running") {
    sleep(10);
}

$pgtitle = array(gettext("Services"), gettext("UPS"), gettext("Status"));
include("head.inc");

$tab_array = array();
$tab_array[] = array(gettext("UPS Status"), true, "/nut_status.php");
$tab_array[] = array(gettext("UPS Settings"), false, "/nut_settings.php");
display_top_tabs($tab_array);

$status = nut_ups_status();
if ($status['_alert']) {
    print_info_box("Status Alert: The UPS requires attention", "alert-danger");
}
-
I sent out a new version to everyone. If you have a moment, please give it a run and let me know if you see any issues. Thanks!
-
Nothing changed for me.
-
Yes, the changes affect service status on restart and reboot. Please confirm the checksum of nut.inc:
49946 12 /usr/local/pkg/nut/nut.inc
If you haven't already, please go to the settings page and re-save before testing again. Thanks.
-
Also if you still see an issue with service restart, please post fresh system log file entries for ups*. Thanks.
-
I have reinstalled the package completely and now a slightly different error appears.
I looked at nut.sh.
You forgot to remove the ampersand at /usr/local/sbin/upsdrvctl start &
When I remove it, everything works as it should!
-
Yes, that makes much more sense, and what I was expecting to see.
Removing the ampersand was a testing item to ensure I understood what was happening on your system. I wasn't intending to remove it from the production package as it is required to support the startup retry.
The thing that I fixed was the service status issue (the restart vs start button at the top). The "Failed to retrieve status" is kind of a fact of life because it is taking a very long time to initialize via SNMP. About the only thing that I could do about that is to have it report "pending", but then you would still have to press refresh in the page because the UI pages don't auto-refresh.
-
Hmm… startup retry? Why do we need that?
Without the ampersand, everything works:
Aug 2 09:35:23 upsd 5793 User monuser@::1 logged into UPS [SMK-1000A]
Aug 2 09:35:23 upsd 5793 Startup successful
Aug 2 09:35:23 upsd 5467 Connected to UPS [SMK-1000A]: snmp-ups-SMK-1000A
Aug 2 09:35:23 upsd 5467 listening on 127.0.0.1 port 3493
Aug 2 09:35:23 upsd 5467 listening on ::1 port 3493
Aug 2 09:35:21 snmp-ups 5221 Startup successful
Aug 2 09:35:14 upsmon 3642 Startup successful
Aug 2 09:35:14 snmp-ups 49799 Signal 15: exiting
Aug 2 09:35:14 upsd 64420 Signal 15: exiting
Aug 2 09:35:14 upsd 64420 mainloop: Interrupted system call
Aug 2 09:35:14 upsd 64420 User monuser@::1 logged out from UPS [SMK-1000A]
Aug 2 09:35:14 upsmon 48594 Signal 15: exiting
With the ampersand (Failed to retrieve status):
Aug 2 09:39:33 upsmon 96136 Communications with UPS SMK-1000A established
Aug 2 09:39:30 upsd 96547 Connected to UPS [SMK-1000A]: snmp-ups-SMK-1000A
Aug 2 09:39:29 snmp-ups 97935 Startup successful
Aug 2 09:39:28 upsmon 96136 UPS SMK-1000A is unavailable
Aug 2 09:39:28 upsmon 96136 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 09:39:23 upsmon 96136 Communications with UPS SMK-1000A lost
Aug 2 09:39:23 upsmon 96136 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 09:39:23 upsd 96547 User monuser@::1 logged into UPS [SMK-1000A]
Aug 2 09:39:21 upsd 96547 Startup successful
Aug 2 09:39:21 upsd 96511 Can't connect to UPS [SMK-1000A] (snmp-ups-SMK-1000A): No such file or directory
Aug 2 09:39:21 upsd 96511 listening on 127.0.0.1 port 3493
Aug 2 09:39:21 upsd 96511 listening on ::1 port 3493
Aug 2 09:39:20 upsmon 95585 Startup successful
Aug 2 09:39:20 snmp-ups 5221 Signal 15: exiting
Aug 2 09:39:20 upsd 5793 Signal 15: exiting
Aug 2 09:39:20 upsd 5793 mainloop: Interrupted system call
Aug 2 09:39:20 upsd 5793 User monuser@::1 logged out from UPS [SMK-1000A]
Aug 2 09:39:20 upsmon 4091 Signal 15: exiting
-
It looks like this "retry" thing is not working at all with an SNMP connection; it just causes the connection to be established twice.
-
Increased sleep
..........................................................
/usr/local/sbin/upsdrvctl start &
sleep 12
/usr/local/sbin/upsd -u root
..........................................................
Aug 2 09:49:14 upsd 31334 User monuser@::1 logged into UPS [SMK-1000A]
Aug 2 09:49:13 upsd 31334 Startup successful
Aug 2 09:49:13 upsd 31056 Connected to UPS [SMK-1000A]: snmp-ups-SMK-1000A
Aug 2 09:49:13 upsd 31056 listening on 127.0.0.1 port 3493
Aug 2 09:49:13 upsd 31056 listening on ::1 port 3493
Aug 2 09:49:12 snmp-ups 30987 Startup successful
Aug 2 09:49:01 upsmon 9616 Startup successful
Aug 2 09:49:01 snmp-ups 6655 Signal 15: exiting
Aug 2 09:49:01 upsd 7053 Signal 15: exiting
Aug 2 09:49:01 upsd 7053 mainloop: Interrupted system call
Aug 2 09:49:01 upsd 7053 User monuser@::1 logged out from UPS [SMK-1000A]
Aug 2 09:49:01 upsmon 5726 Signal 15: exiting
Is it OK for you?
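A fixed sleep only helps when it happens to be longer than the driver's initialization time. One possible alternative, sketched here under assumptions, is to poll for the driver's socket file (the thing upsd reported as "No such file or directory") and give up after a timeout. The function name is hypothetical and the state directory shown in the usage comment is an assumption; this is not what nut.sh actually does.

```shell
#!/bin/sh
# wait_for_path PATH TIMEOUT
# Poll once per second until PATH exists or TIMEOUT seconds have elapsed.
# Returns 0 as soon as PATH exists, 1 on timeout.
# Uses -e (exists) rather than -S (is a socket) to keep the sketch generic.
wait_for_path() {
    path=$1
    timeout=$2
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in a start script. The directory is an assumption and may
# differ on your system; the socket name is the one shown in the upsd log.
#   /usr/local/sbin/upsdrvctl start &
#   wait_for_path /var/db/nut/snmp-ups-SMK-1000A 30
#   /usr/local/sbin/upsd -u root
```

Compared with sleep 12, this returns as soon as the driver is ready on a fast UPS and tolerates a slower one up to the chosen timeout.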
-
@w0w:
Hmm… startup retry? Why do we need that?
It was requested: https://forum.pfsense.org/index.php?topic=114871.msg638897#msg638897
It's necessary to address the case of the ups being unreachable at pfSense startup.
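For readers unfamiliar with the idea, a startup retry can be sketched as a bounded loop around the start command. This is a generic illustration only: the function name is hypothetical, it is not the package's actual retry logic, and the usage comment assumes upsdrvctl exits non-zero when a driver fails to start.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, waiting DELAY seconds between attempts.
# Returns 0 on the first success, 1 after MAX_ATTEMPTS failures.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Possible use (assumption: a failed driver start yields a non-zero exit):
#   retry 10 30 /usr/local/sbin/upsdrvctl start &
```

Running the retry in the background is what lets the rest of the startup continue while the UPS is unreachable, which is why the ampersand matters in that design.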
-
I am sorry :'(, but maybe I have tested it the wrong way. If I unplug the SNMP card from the network and then restart the service, it does not reconnect to the UPS. All I get in the logs is this, endlessly:
Aug 2 10:22:46 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:41 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:36 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:31 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:26 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:21 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:16 upsmon 50160 Poll UPS [SMK-1000A] failed - Driver not connected
Aug 2 10:22:11 upsmon 50160 UPS SMK-1000A is unavailable
It looks like it does not matter whether sleep 1 or sleep 12 is applied.