NUT package (2.8.0 and below)
-
The "usbhid-ups" binary built from the FreeBSD nut-devel port that you provided earlier this week for testing looks to have solved the problem for me (or at least for those that have CyberPower UPSs). As of this morning Eastern time, it's been running for over 72 hours with no more "exit on signal 10" errors in my system log file. Thanks for all your help in identifying the issue and establishing that it's already been fixed in the newer version of NUT.
-
@shaffergr is this available from the package manager now?
-
No. Denny provided me a build from the nut-devel branch so that we could validate whether the signal 10 issue was fixed or not.
-
@dennypage I've been seeing logs like this:
Feb 25 19:37:13 gatekeeper kernel: pid 29800 (usbhid-ups), jid 0, uid 0: exited on signal 10
Feb 25 19:37:15 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:15 gatekeeper upsmon[28298]: Communications with UPS tripplite lost
Feb 25 19:37:20 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:25 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:30 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:35 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
<goes on the same until manually restarted.>
Setting interruptonly appears to have mitigated it. The UPS is a Tripp Lite SMART1500LCD rack mount unit. It seemed to work fine prior to the last update.
-
@jpp-0 I sent you the dev build of usbhid-ups. Please let me know if it works for you.
-
@dennypage Thanks for sending the dev version. It looks to be working: it's been running for over 8 hours with all the extra config (interruptonly and user=root etc.) removed. No crashes. It correctly detected power loss and power restored. If it fails later I'll post an update, but I'm not expecting it to, as it would fail within minutes before.
-
@dennypage can confirm interruptonly does the job as a workaround with my CyberPower UPS. Many thanks.
-
@jpp-0 I couldn't get the Smart1500 LCD I had to work if my life depended on it. I must have had a defective unit. I ended up returning it and getting an APC BGM1500B, which worked right out of the box.
I couldn't get offdelay=60 and ondelay=90 to hold. It would hold the settings for 20 mins max then revert to default values. To top that off, the UPS would run down until the battery died and never send a shutdown command to my Netgate box.
Can you screenshot or post your settings? I'm interested to find out what actually works with that UPS. I killed myself for weeks until my return window was about to close before my autism would allow me to give up.
-
@whoami-tm right now I just have the defaults set on pfSense. My setup is a bit odd in that it should never get to shutdown because the generator / automatic transfer kicks in after 25 seconds and I have about a week's worth of propane on site.
I have my Proxmox servers (the very originally named pxoxie and moxie, both NUT clients) set to shut down after 10 mins if for some reason the generator fails.
The pfsense box is not configured to shutdown. I'm running ram disk filesystems so I'm less worried about disk corruption and want to keep it running as long as possible.
I blew away all the config I had on pfSense trying to debug the crashes, I may add some complexity back but not for a while.
Lastly, I would not buy Tripp Lite again: it completely fails to give a decent runtime estimate (best guess is the power factor on modified sine wave is different enough from true sine on the grid that it just gets confused).
-
@whoami-tm have you tried this?
[23.01-RELEASE][root@gatekeeper]/root: upsrw -s ups.delay.shutdown tripplite
Username (root): admin
Password:
Enter new value for ups.delay.shutdown: 59
OK
Sub in your UPS name and get the password for admin from /usr/local/etc/nut/upsd.users.
ups.delay.shutdown seems to be the only value you can program, but if I'm reading it correctly it should let you make it wait longer before powering off.
-
I'm using a CyberPower UPS, and I'm experiencing the same signal 10 error. I've found the interruptonly setting to work until the USB issue is resolved.
If there is anything I can do to help/test a fix then let me know :-)
Anecdotally, I've noticed my firewall's processor is running hotter than it used to. I don't think the CPU is idling properly. I'm not sure if anyone else is noticing a similar problem, or if it is related in some way.
-
@dennypage I just wanted to thank you again sincerely for this post and for finding my erroneous post. This research has made the upgrade to pfSense+ 23.01 stable.
-
@lamaz You're welcome. Glad it's working for you.
-
Thanks for the work everyone has put into this so far.
Decided to upgrade from 2.6 CE to Plus 23.01 today. I'm having the same issues as others. It looks to be with the USB driver. It started happening shortly after the upgrade completed.
I am getting the same results after adding root as a user.
Mar 8 21:53:35 pfSense upsd[9141]: Can't connect to UPS [eaton9130rm] (usbhid-ups-eaton9130rm): Connection refused
Mar 8 21:53:35 pfSense kernel: pid 8842 (usbhid-ups), jid 0, uid 0: exited on signal 10
Mar 8 21:53:39 pfSense upsmon[1672]: Poll UPS [eaton9130rm] failed - Driver not connected
Mar 8 21:53:39 pfSense upsmon[1672]: Communications with UPS eaton9130rm lost
Mar 8 21:53:45 pfSense upsmon[1672]: Poll UPS [eaton9130rm] failed - Driver not connected
Mar 8 21:53:50 pfSense upsmon[1672]: Poll UPS [eaton9130rm] failed - Driver not connected
Mar 8 21:53:55 pfSense upsmon[1672]: Poll UPS [eaton9130rm] failed - Driver not connected
-
@trentk10 said in NUT package:
Mar 8 21:53:35 pfSense kernel: pid 8842 (usbhid-ups), jid 0, uid 0: exited on signal 10
Unfortunately, this is being hit by a lot of people with NUT 2.8.0. See this post for information.
-
@dennypage I seem to have a similar but possibly different problem.
My UPS is an Eaton Eclipse ECO 650 connected by USB to my Netgate-3100 running 23.01-RELEASE (arm) with NUT 2.8.0_2.
Note: the setup worked perfectly before my update to 23.01 and 2.8.0 for over 6 months, there have been no powercuts/surges or hardware changes.
As with others I am seeing repeated log entries from upsmon.
Mar 12 09:57:09 upsmon 80760 Poll UPS [EatonUPS] failed - Driver not connected
Mar 12 09:57:04 upsmon 80760 Poll UPS [EatonUPS] failed - Driver not connected
If I filter my logs to usbhid-ups I only see the following:
Mar 9 22:26:16 usbhid-ups 91631 Startup successful
Mar 9 22:26:09 usbhid-ups 70838 Signal 15: exiting
Mar 9 22:20:46 usbhid-ups 70838 Startup successful
Mar 9 22:20:41 usbhid-ups 47718 Signal 15: exiting
Mar 9 22:13:14 usbhid-ups 47718 Startup successful
Mar 9 22:10:05 usbhid-ups 75991 Signal 15: exiting
Mar 9 21:57:06 usbhid-ups 75991 Startup successful
If I filter the logs to upsd I see:
Mar 12 09:07:25 upsd 82791 Can't connect to UPS [EatonUPS] (usbhid-ups-EatonUPS): Connection refused
Mar 12 09:02:25 upsd 82791 Can't connect to UPS [EatonUPS] (usbhid-ups-EatonUPS): Connection refused
Mar 12 08:57:25 upsd 82791 Can't connect to UPS [EatonUPS] (usbhid-ups-EatonUPS): Connection refused
Mar 12 08:52:25 upsd 82791 Can't connect to UPS [EatonUPS] (usbhid-ups-EatonUPS): Connection refused
Mar 9 22:26:16 upsd 82791 Connected to UPS [EatonUPS]: usbhid-ups-EatonUPS
Mar 9 22:26:14 upsd 82791 User local-monitor@::1 logged into UPS [EatonUPS]
Mar 9 22:26:10 upsd 82791 Startup successful
Mar 9 22:26:10 upsd 82791 Can't connect to UPS [EatonUPS] (usbhid-ups-EatonUPS): No such file or directory
Mar 9 22:26:10 upsd 82791 not listening on 127.0.0.1 port 3493
Mar 9 22:26:10 upsd 82791 listening on ::1 port 3493
Mar 9 22:26:10 upsd 82791 listening on 127.0.0.1 port 3493
Mar 9 22:26:10 upsd 82791 not listening on 192.168.200.254 port 3493
Mar 9 22:26:10 upsd 82791 listening on pfsense.{internaldomainname} port 3493
Mar 9 22:26:09 upsd 62680 Signal 15: exiting
Mar 9 22:26:09 upsd 62680 mainloop: Interrupted system call
Mar 9 22:26:09 upsd 62680 User local-monitor@::1 logged out from UPS [EatonUPS]
This morning the failure occurred at 08:52 (from notification email):
8:52:27 UPS Notification from pfSense.irwazu.co.uk - Sun, 12 Mar 2023 08:52:27 +0000
Communications with UPS EatonUPS lost
My configuration is as follows.
Extra arguments to driver:
pollfreq=90
Additional configuration lines for upsmon.conf:
RUN_AS_USER root
Additional configuration lines for ups.conf:
user=root
pollinterval=15
Other than using the "interruptonly" option is there anything I can do to resolve or help debug the cause? Is this likely the same issue as for CyberPower UPSs you've already identified?
Full logs of a restart of the UPS service are as follows:
Mar 12 10:05:19 upsmon 69168 Communications with UPS EatonUPS established
Mar 12 10:05:16 upsd 70834 Connected to UPS [EatonUPS]: usbhid-ups-EatonUPS
Mar 12 10:05:15 usbhid-ups 80232 Startup successful
Mar 12 10:05:14 upsmon 69168 UPS EatonUPS is unavailable
Mar 12 10:05:14 upsmon 69168 Poll UPS [EatonUPS] failed - Driver not connected
Mar 12 10:05:14 upsd 70834 User local-monitor@::1 logged into UPS [EatonUPS]
Mar 12 10:05:10 upsd 70834 Startup successful
Mar 12 10:05:10 upsd 70834 Can't connect to UPS [EatonUPS] (usbhid-ups-EatonUPS): Connection refused
Mar 12 10:05:10 upsd 70834 not listening on 127.0.0.1 port 3493
Mar 12 10:05:10 upsd 70834 listening on ::1 port 3493
Mar 12 10:05:10 upsd 70834 listening on 127.0.0.1 port 3493
Mar 12 10:05:10 upsd 70834 not listening on 192.168.200.254 port 3493
Mar 12 10:05:10 upsd 70834 listening on pfsense.irwazu.co.uk port 3493
Mar 12 10:05:09 upsmon 69168 Communications with UPS EatonUPS lost
Mar 12 10:05:09 upsmon 69168 UPS [EatonUPS]: connect failed: Connection failure: Connection refused
Mar 12 10:05:09 upsmon 69168 Startup successful
Mar 12 10:05:08 upsd 82791 Signal 15: exiting
Mar 12 10:05:08 upsd 82791 mainloop: Interrupted system call
Mar 12 10:05:08 upsd 82791 User local-monitor@::1 logged out from UPS [EatonUPS]
Mar 12 10:05:08 upsmon 80760 Signal 15: exiting
Mar 12 10:05:07 upsmon 80760 Poll UPS [EatonUPS] failed - Driver not connected
-
@davidir Your log messages do not show anything particularly unusual. Signal 15 indicates that the usbhid-ups process was terminated via a kill signal. This is usually triggered by a package restart, such as when your DHCP WAN address changes.
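To tell a crash from a routine restart in logs like the ones in this thread, it can help to translate the signal number into a name. The following is a minimal sketch, assuming FreeBSD signal numbering (the OS underneath pfSense). The function name `describe_signal` and the short descriptions are illustrative only and are not part of NUT.

```python
import re

# Signal numbers as defined on FreeBSD (assumption: logs come from pfSense).
# Linux numbers differ for some signals (for example, 10 is SIGUSR1 there).
FREEBSD_SIGNALS = {
    10: ("SIGBUS", "bus error: the process crashed"),
    11: ("SIGSEGV", "segmentation fault: the process crashed"),
    15: ("SIGTERM", "termination request: a normal stop or restart"),
}

# Matches both the kernel form ("exited on signal 10")
# and the NUT daemon form ("Signal 15: exiting").
_SIGNAL_RE = re.compile(r"signal (\d+)", re.IGNORECASE)


def describe_signal(log_line):
    """Return (name, meaning) for the signal named in a log line, or None."""
    match = _SIGNAL_RE.search(log_line)
    if match is None:
        return None
    number = int(match.group(1))
    # Fall back to a generic label for signals not listed in the table.
    return FREEBSD_SIGNALS.get(number, ("signal %d" % number, "not in this table"))
```

Under these assumptions, a "Signal 15: exiting" line maps to SIGTERM (a requested stop), while "exited on signal 10" maps to SIGBUS (a crash), which is the case reported by the CyberPower and Tripp Lite users above.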
Btw, I'm not sure what you are intending to do with the poll interval settings. Given that you are using a USB connection, there is no good reason to be setting these, particularly pollinterval in ups.conf, which may negatively affect your shutdown. Unless you have a very concrete problem that you are fixing, I would recommend that you remove both of them, as well as the RUN_AS_USER setting in upsmon.conf.
-
I've received several requests for the dev build of usbhid-ups, so I thought I would upload the file here.
For reference, the shasum and sha256sum checksums of the unzipped file are:
49ce9131502bfb8b789ee97b7fb3fc81fc9f8fff usbhid-ups
999a2653559dbc50ecc8ba592a67587b1e307a1495f6e8ebbd3d8e90e3967133 usbhid-ups
If you use the file, please post and let me know if it resolves an issue for you.
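For anyone who would rather check the download programmatically, here is a minimal sketch that computes both digests of a file and compares them with expected values. The function names are hypothetical; only Python's standard `hashlib` module is used.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (sha1_hex, sha256_hex) for the file at path."""
    sha1 = hashlib.sha1()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha1.update(chunk)
            sha256.update(chunk)
    return sha1.hexdigest(), sha256.hexdigest()


def matches_posted_checksums(path, expected_sha1, expected_sha256):
    """Return True only if both digests match the expected hex strings."""
    sha1_hex, sha256_hex = file_digests(path)
    return (sha1_hex == expected_sha1.lower()
            and sha256_hex == expected_sha256.lower())
```

Calling `matches_posted_checksums("usbhid-ups", "49ce9131502bfb8b789ee97b7fb3fc81fc9f8fff", "999a2653559dbc50ecc8ba592a67587b1e307a1495f6e8ebbd3d8e90e3967133")` on the unzipped file should return True if the download is intact.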
-
@dennypage thank you very much for this. I loaded it up today and so far, it has continued to run for about 5 hours. I'll report back tomorrow to let you know if it hangs up overnight.
For other folks' information, I put the file Denny shared into /usr/local/libexec/nut replacing the file already there. (Be sure to make a copy of the original in case this doesn't work for you.) Make sure that the permissions are set to rwxr-xr-x (0755). Also I had to include "user=root" in the ups.conf section in pfSense.
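The steps above (keep a copy of the original, replace the file, set 0755) can be sketched roughly as follows. This is only an illustration: the function name `replace_driver` and the `.orig` backup suffix are my own choices, and it assumes the UPS service is stopped while the file is replaced and that the script runs as root.

```python
import os
import shutil


def replace_driver(new_binary, target, backup_suffix=".orig"):
    """Back up target (once), copy new_binary over it, and set mode 0755."""
    backup = target + backup_suffix
    # Keep only the first backup so a later run never overwrites
    # the copy of the originally packaged binary.
    if os.path.exists(target) and not os.path.exists(backup):
        shutil.copy2(target, backup)
    shutil.copyfile(new_binary, target)
    os.chmod(target, 0o755)  # rwxr-xr-x
    return backup
```

With the location mentioned above, the call would look like `replace_driver("usbhid-ups", "/usr/local/libexec/nut/usbhid-ups")`. To roll back, copy the `.orig` file over the replaced one.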
Thanks again Denny.