-
Hello,
I have a "NoName Polish brand" UPS connected over local USB that works with the blazer driver on 2.3.2-RELEASE-p1 (amd64).
I get a few emails every day with "UPS is unavailable" and "Communication with UPS lost", even though I set cron to restart NUT every 9 minutes.
Is it possible to edit the UPS monitoring module so that it restarts NUT instead of sending these emails?
Any idea where to look? Usually a manual restart of the UPS monitoring daemon fixes the problem before cron does, but it is getting annoying to receive so many emails and syslog entries from NUT… :o
Thank you
edit:
I am now trying to automate the UPS service restart with:
upssched.conf
CMDSCRIPT /usr/local/bin/custom-upssched-cmd
PIPEFN /var/db/nut/upssched.pipe
LOCKFN /var/db/nut/upssched.lock
AT NOCOMM * START-TIMER ups-no-comm 20
AT COMMBAD * START-TIMER ups-comm-bad 20
AT COMMOK * CANCEL-TIMER ups-comm-bad COMMOK
custom-upssched-cmd
#!/bin/sh
NAME=`basename $0`
LOGGER="/usr/bin/logger -t $NAME"
case $1 in
    ups-no-comm|ups-comm-bad)
        $LOGGER "Timer event '$1' fired, restarting NUT..."
        /usr/local/etc/rc.d/nut.sh restart
        ;;
    *)
        $LOGGER "Unrecognized command: '$1'"
        ;;
esac
-
Continual disconnects generally indicate an issue with the driver, USB hardware, or with the UPS itself. Before trying to implement scripting work-arounds, I would recommend confirming the source of the problem. Lots of suggestions for this…
First thing I would check would be the physicals: good USB cable, direct connection with no hub, etc. Also, try a different USB port and see if this affects the behavior. Run "usbconfig dump_device_desc" during normal operation. What is the vendor id? Does it match what you expect? Run usbconfig again after the disconnect happens. Does the device still appear?
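The before/after usbconfig comparison can be scripted. Below is a minimal sketch, assuming the usual multi-line layout of "usbconfig dump_device_desc" output on FreeBSD. The function name usb_device_present is made up for illustration, and the vendor and product IDs are passed in rather than assumed.

```sh
#!/bin/sh
# usb_device_present VENDOR PRODUCT
# Reads `usbconfig dump_device_desc` output on stdin and succeeds (exit 0)
# if a single device block reports both IDs. Hypothetical helper.
usb_device_present() {
    awk -v v="$1" -v p="$2" '
        /^ugen/           { vend = ""; prod = "" }   # a new device block starts
        $1 == "idVendor"  { vend = tolower($3) }
        $1 == "idProduct" { prod = tolower($3) }
        vend == tolower(v) && prod == tolower(p) { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage would look like `usbconfig dump_device_desc | usb_device_present 0x0665 0x5161 && echo "UPS still on the bus"`, run once during normal operation and again right after a disconnect.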
I would also validate the NUT driver. Did the vendor recommend using the blazer driver? Have you tried any of the other USB-based drivers? On the Services / UPS / Status page, do all the items in the UPS Detail section make sense? Are they reasonable values? If not, then there is a driver/configuration problem.
-
In the ~12h since my last post, when I deleted the cron job that restarted NUT every 9 minutes and added upssched.conf and custom-upssched-cmd, I have had 0 lost/bad communication events.
Here is some data from USB and the UPS, along with my settings:
ugen0.2: <usb to serial inno tech> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (100mA)
  bLength = 0x0012
  bDescriptorType = 0x0001
  bcdUSB = 0x0110
  bDeviceClass = 0x0000 <Probed by interface class>
  bDeviceSubClass = 0x0000
  bDeviceProtocol = 0x0000
  bMaxPacketSize0 = 0x0008
  idVendor = 0x0665
  idProduct = 0x5161
  bcdDevice = 0x0002
  iManufacturer = 0x0001 <inno tech>
  iProduct = 0x0002 <usb to serial>
  iSerialNumber = 0x0000 <no string>
  bNumConfigurations = 0x0001
upsmon.conf
DEADTIME 21
POLLFREQ 7
POLLFREQALERT 7
SHUTDOWNCMD "/sbin/halt -p"
ups.conf
pollinterval = 7
default.battery.voltage.high = 13.7
default.battery.voltage.low = 11.5
runtimecal = 180,100,360,50
ignorelb
override.battery.charge.low = 50
override.battery.runtime.low = 600
UPS Detail

| Variable | Value |
| --- | --- |
| battery.charge | 100 |
| battery.voltage | 13.70 |
| battery.voltage.high | 13.00 |
| battery.voltage.low | 10.40 |
| battery.voltage.nominal | 12.0 |
| device.type | ups |
| driver.name | blazer_usb |
| driver.parameter.pollinterval | 7 |
| driver.parameter.port | auto |
| driver.parameter.synchronous | no |
| driver.version | 2.7.4 |
| driver.version.internal | 0.12 |
| input.current.nominal | 2.0 |
| input.frequency | 50.0 |
| input.frequency.nominal | 50 |
| input.voltage | 241.2 |
| input.voltage.fault | 241.2 |
| input.voltage.nominal | 230 |
| output.voltage | 241.2 |
| ups.beeper.status | enabled |
| ups.delay.shutdown | 30 |
| ups.delay.start | 180 |
| ups.load | 19 |
| ups.productid | 5161 |
| ups.status | OL |
| ups.type | offline / line interactive |
| ups.vendorid | 0665 |

Blazer is the only driver that works with this UPS. As for the vendor of this ActiveJet UPS, they have zero support and documentation and don't reply to emails. I bought this one because it has isolated UTP IN/OUT ports, which I am using for the WAN, but I found that it is not documented or marked which one is IN and which is OUT…
-
USB to serial adapters are often the source of problems.
You might try adding this global directive:
pollinterval=10
to the section entitled "Additional configuration lines for ups.conf" in Services / UPS / Settings / Advanced Settings.
-
Now that I have deleted the cron job that restarted the UPS service, I don't have any more problems with lost communication. Zero problems in 2 days!
Using upssched.conf and custom-upssched-cmd is the proper way to deal with lost communication, if you ask me; there is no need to schedule a restart of the UPS driver while communication is good.
edit:
I was too happy… it worked for only 2-3 days...
-
Something is NOT OK!
After 2-3 days without any problem, today UPS communication went offline for no apparent reason, and the changes I made to custom-upssched-cmd and upssched.conf did not restore it.
I think the commands I defined there are not being executed; syslog shows no trace of a NUT service restart.
I restarted the service manually, but the problem continued… randomly.
I even changed:
DEADTIME 30
POLLFREQ 10
POLLFREQALERT 10
and
pollinterval = 10
No long-term fix... communication still goes offline randomly.
Only a pfSense reboot solved the UPS communication problem.
It is definitely a NUT problem, as I did not touch anything on the UPS.
Any ideas?
Do I need to set something else for upssched to work? I hope I don't have to return to the cron-scheduled NUT restart… because scheduling a pfSense reboot every night for NUT to work properly is not a good option.
-
Dude, how about changing your "NoName Polish brand" UPS for something that works, instead of wasting time and blaming NUT?
-
No chance yet, "duck"; Santa is not authorized to transport these devices yet.
If you use a UPS you will know that this problem is old and common with NUT.
https://forum.pfsense.org/index.php?topic=33860.0
https://forum.pfsense.org/index.php?topic=78977.0
…
Just search for it if you want more...
https://duckduckgo.com/?q=nut+ups+lost+communication
If a reboot of pfSense fixed this problem after 2-3 days when it worked OK, then there is little reason to blame only the UPS hardware.
Cron seems a quick and dirty way to fix the problem, and everybody recommends it because it is easy to implement... but do you want to restart a process that is working OK every 15 minutes, just to be sure?
I think a better quick fix is to use a monitor that restarts the driver/package only when required (this is the purpose of upssched.conf). This is what I am trying to get working properly. It already sends a notification at a defined event, so why not use that to restart the NUT driver if, and only if, it is required?
P.S.
Having this (upssched.conf) working properly will help everybody, not only me!
-
> If a reboot of pfSense fixed this problem after 2-3 days when it worked OK, then there is little reason to blame only the UPS hardware.

Just the opposite. If a reboot is required to fix the issue, this guarantees that it is not a NUT problem. This leaves you with the USB controller kernel driver, the USB controller itself, the cabling and the UPS. The kernel driver is extremely unlikely, so we'll set that aside. This leaves us with the hardware.
Please do the following:
1. Confirm that you have a good quality USB cable.
2. Confirm that you are not using a hub.
3. Confirm that you have tried a different USB port on the host.
4. Check to see if usbconfig shows the device as still being present after the UPS goes offline.
5. Try unplugging the USB cable for 10 seconds instead of rebooting.
I suspect that what you will find in #4 is that the UPS no longer appears in the USB tree. In #5 I suspect you will find that unplugging the USB connection will cause the UPS to return to the tree.
Btw, the threads you referred to are not really pertinent to your issue. Please do the above steps first before going out to research more complicated situations. After the above steps, if you want to research more information on your issues, I would google for "NUT USB 0665 5161".
-
This time it happened again ~1:30' after the pfSense restart:
1 - Original USB UPS cable. I am not sure how good it is; I will try changing it, but for the moment I don't have another one.
2 - NO HUB - direct connection
3 - YES, the same.
4 - YES, it is present, and also present after I unplugged and replugged the cable.
ugen0.2: <usb to serial inno tech> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (100mA)
  bLength = 0x0012
  bDescriptorType = 0x0001
  bcdUSB = 0x0110
  bDeviceClass = 0x0000 <Probed by interface class>
  bDeviceSubClass = 0x0000
  bDeviceProtocol = 0x0000
  bMaxPacketSize0 = 0x0008
  idVendor = 0x0665
  idProduct = 0x5161
  bcdDevice = 0x0002
  iManufacturer = 0x0001 <inno tech>
  iProduct = 0x0002 <usb to serial>
  iSerialNumber = 0x0000 <no string>
  bNumConfigurations = 0x0001
5 - Not fixed automatically… fixed only after I saved the UPS Settings again.
Any ideas why custom-upssched-cmd is not working properly?
#!/bin/sh
NAME=`basename $0`
LOGGER="/usr/bin/logger -t $NAME"
case $1 in
    ups-no-comm)
        $LOGGER "Timer event '$1' fired, restart NUT..."
        /usr/local/etc/rc.d/nut.sh restart
        ;;
    *)
        $LOGGER "Unrecognized command: '$1'"
        ;;
esac
upssched.conf
CMDSCRIPT /usr/local/bin/custom-upssched-cmd
PIPEFN /tmp/upssched.pipe
LOCKFN /tmp/upssched.lock
AT NOCOMM * START-TIMER ups-no-comm 30
AT COMMBAD * START-TIMER ups-no-comm 30
AT COMMOK * CANCEL-TIMER ups-no-comm COMMOK
![2016-12-20 21.35.31.jpg](/public/imported_attachments/1/2016-12-20 21.35.31.jpg)
![2016-12-20 21.35.39.jpg](/public/imported_attachments/1/2016-12-20 21.35.39.jpg)
![2016-12-20 21.38.07.jpg](/public/imported_attachments/1/2016-12-20 21.38.07.jpg)
![2016-12-20 21.56.42.jpg](/public/imported_attachments/1/2016-12-20 21.56.42.jpg)
-
Is it a problem with user rights?
![2016-12-20 22.09.06.jpg](/public/imported_attachments/1/2016-12-20 22.09.06.jpg)
-
> This time it happened again ~1:30' after the pfSense restart:
> 4 - YES, it is present, and also present after I unplugged and replugged the cable.
> 5 - Not fixed automatically… fixed only after I saved the UPS Settings again.

Re-saving the UPS settings just stops and starts NUT. Simpler to just restart the service.
You may be experiencing multiple communication issues. Previously, you had an event where restarting the service did not work, and you had to reboot the system, yes? It is this type of event that I'm asking you to run usbconfig for. Do you still have pollinterval set to 10 for testing? If not, please put it back as described above. If you are setting anything else, please remove it.
If you want to diagnose the source of the problem, I will do my best to help you. In the end, we will likely find that the UPS is the source of the problem. I know that this is not what you want to hear, but it's almost certainly the case. I'm not able to help you with upssched other than to tell you that working around communications errors is absolutely not what it's intended for.
-
These are the settings I have:
upsmon.conf
DEADTIME 30
POLLFREQ 10
POLLFREQALERT 10
SHUTDOWNCMD "/sbin/halt -p"
ups.conf
pollinterval = 10
default.battery.voltage.high = 13.7
default.battery.voltage.low = 11.5
runtimecal = 180,100,360,50
ignorelb
override.battery.charge.low = 50
override.battery.runtime.low = 600
Attached is an image of the usbconfig output after the error, taken before I unplugged the cable.
![2016-12-20 22.16.17.jpg](/public/imported_attachments/1/2016-12-20 22.16.17.jpg)
-
> Any ideas why custom-upssched-cmd is not working properly?

And since you are probably still thinking about this… yes, because your UPS is going offline for longer than it takes to restart the NUT service.
-
> ups.conf
> pollinterval = 10
> default.battery.voltage.high = 13.7
> default.battery.voltage.low = 11.5
> runtimecal = 180,100,360,50
> ignorelb
> override.battery.charge.low = 50
> override.battery.runtime.low = 600

This cannot be your ups.conf as there is no actual ups definition. Please do not edit files by hand. Go into the UPS configuration (Services / UPS / Settings), remove everything from the advanced section except for "pollinterval=10", and save the configuration. This is where we will start.
Please post the contents of cat /usr/local/etc/nut/ups.conf, /usr/local/etc/nut/upsmon.conf, and /usr/local/etc/nut/upsd.conf for confirmation.
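A quick way to confirm that a ups.conf actually defines a UPS is to list its [name] section headers. This is a minimal sketch; list_ups_sections is a hypothetical helper, not part of NUT, and it only looks at lines that consist of a bracketed name.

```sh
#!/bin/sh
# list_ups_sections: reads ups.conf text on stdin and prints one UPS
# section name per line. Prints nothing when the file holds only
# global directives. Hypothetical helper for illustration.
list_ups_sections() {
    sed -n 's/^[[:space:]]*\[\([^]]*\)\][[:space:]]*$/\1/p'
}
```

Run as `list_ups_sections < /usr/local/etc/nut/ups.conf`. Empty output would mean the file has no UPS definition at all, which is the situation described above.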
-
I have this config because I wanted the firewall to shut down quickly (~1-2 min) if power is lost, not when the UPS reports low battery.
I also have another computer, a file server, on this UPS. A script I made checks whether the firewall is on (it pings the LAN interface): if it is, the UPS is online; otherwise the file server shuts down.
All this is because I can power on these computers remotely through IAMT, but that requires the UPS to be on; if the battery runs empty, the UPS turns off and stays off even when mains power is restored.
With these 2 computers running normally on the UPS I have a load of ~20% (90 W), so there is plenty of time for the UPS to stay on when power is lost, and these 2 computers will also power off quickly.
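The ping-based check on the file server could look roughly like the sketch below. It is not the script described above, only an illustration of the idea: every name here (firewall_reachable, outage_confirmed, PING_CMD) is hypothetical, and it requires several failed pings in a row so that a single dropped packet does not trigger a shutdown.

```sh
#!/bin/sh
# PING_CMD is a hypothetical override hook so the logic can be dry-run
# without touching the network; by default one echo request is sent.
firewall_reachable() {
    ${PING_CMD:-ping -c 1} "$1" >/dev/null 2>&1
}

# outage_confirmed HOST TRIES DELAY
# Succeeds (exit 0) only if HOST failed TRIES pings in a row,
# waiting DELAY seconds between attempts.
outage_confirmed() {
    host=$1 tries=$2 delay=$3
    n=0
    while [ "$n" -lt "$tries" ]; do
        firewall_reachable "$host" && return 1   # any reply: no outage
        n=$((n + 1))
        [ "$n" -lt "$tries" ] && sleep "$delay"
    done
    return 0
}
```

A caller would then do something like `outage_confirmed 192.0.2.1 3 20 && shutdown -p now`, where the address, retry count and delay are placeholders to adapt.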
I made a small change, increasing the timer to 60s in upssched.conf, and changed
custom-upssched-cmd
#!/bin/sh
NAME=`basename $0`
LOGGER="/usr/bin/logger -t $NAME"
case $1 in
    ups-no-comm)
        $LOGGER "Timer event '$1' fired, restart NUT..."
        /usr/local/etc/rc.d/nut.sh stop
        sleep 30
        /usr/local/etc/rc.d/nut.sh start
        ;;
    *)
        $LOGGER "Unrecognized command: '$1'"
        ;;
esac
![2016-12-20 22.50.12.jpg](/public/imported_attachments/1/2016-12-20 22.50.12.jpg)
-
> I have this config because I wanted the firewall to shut down quickly (~1-2 min) if power is lost, not when the UPS reports low battery.

I understand what you are trying to do, however your current configuration does not actually do this. Most of the configuration is in the wrong place. It is unfortunate that NUT generally ignores invalid configuration rather than generating an error. Anyway, save this stuff for later–it's just creating complication in diagnosing the issue.
If you want help you will need to listen and stop changing things willy-nilly. Stop messing around with upssched.conf. Stop messing around with the advanced settings. Remove everything in the advanced section except "pollinterval=10" in the ups.conf section. Don't edit any files by hand.
If you don't want help, that's okay too. I won't be offended at all. Really.
-
/usr/local/etc/nut/ups.conf
pollinterval = 10
[ActiveJet]
driver=blazer_usb
port=auto
/usr/local/etc/nut/upsmon.conf
MONITOR ActiveJet 1 monuser 09959dba1bb426d94a50 master
SHUTDOWNCMD "/sbin/shutdown -p +0"
POWERDOWNFLAG /etc/killpower
NOTIFYCMD /usr/local/pkg/nut/nut_email.php
NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
NOTIFYFLAG FSD SYSLOG+WALL+EXEC
NOTIFYFLAG COMMOK SYSLOG+WALL+EXEC
NOTIFYFLAG COMMBAD SYSLOG+WALL+EXEC
NOTIFYFLAG SHUTDOWN SYSLOG+WALL+EXEC
NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC
NOTIFYFLAG NOCOMM SYSLOG+WALL+EXEC
NOTIFYFLAG NOPARENT SYSLOG+WALL+EXEC
SHUTDOWNCMD "/sbin/halt -p"
/usr/local/etc/nut/upsd.conf
LISTEN 127.0.0.1
LISTEN ::1
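One thing worth checking in the upsmon.conf above: upsmon hands EXEC events to whatever program NOTIFYCMD names, and upssched only sees events when it is that program. Here NOTIFYCMD points at nut_email.php, which may explain why the timers in upssched.conf never fired. Below is a small sketch of that check. Both function names are hypothetical, and if several NOTIFYCMD lines appear, this sketch simply reports the last one it reads.

```sh
#!/bin/sh
# notifycmd_target: reads upsmon.conf text on stdin and prints the
# program named on the NOTIFYCMD line (surrounding quotes stripped).
notifycmd_target() {
    awk '$1 == "NOTIFYCMD" { cmd = $2; gsub(/"/, "", cmd) }
         END { if (cmd != "") print cmd }'
}

# runs_upssched: succeeds (exit 0) only if that program is upssched.
runs_upssched() {
    case "$(notifycmd_target)" in
        */upssched|upssched) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: `runs_upssched < /usr/local/etc/nut/upsmon.conf || echo "NOTIFYCMD is not upssched"`.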
![2016-12-21 10.36.41.jpg](/public/imported_attachments/1/2016-12-21 10.36.41.jpg)
-
Okay. When the problem next occurs, please do the following:
1. Grab system log entries from Status / System Logs / System / General. Logs at the time of the event and the preceding few minutes. Plus any log entries with "usb" "ups" or "nut" since startup even if not near the event.
2. Log in and run "usbconfig dump_device_desc".
3. Attempt restart of the UPS service (Status / Services).
4. If #3 does not bring the service back, unplug the host end of the USB cable for 10 seconds, then attempt service restart again.
5. If #3 and #4 do not bring the service back, run usbconfig again, and then reboot.
Please post the results.
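For step 1, the relevant log lines can also be pulled out from a shell instead of the GUI. This is a minimal sketch with a hypothetical filter_ups_log helper. It does a plain case-insensitive substring match, so expect some unrelated lines (anything containing "ups", for instance).

```sh
#!/bin/sh
# filter_ups_log: reads log text on stdin and keeps only lines that
# mention usb, ups or nut (case-insensitive substring match).
filter_ups_log() {
    grep -iE 'usb|ups|nut'
}
```

On pfSense 2.3 the system log is stored in the circular clog format, so something like `clog /var/log/system.log | filter_ups_log` should work. Treat that command and path as an assumption to verify on your install.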