-
For those of you affected by the CyberPower issue (and those that will be in the future):
First, I want to thank all of you for your help in tracking this down. Particularly @shaffergr, who was kind enough, and trusting enough, to run test builds for me against his CyberPower.
I tracked this down to a double free in nut's USB code. FWIW, I don’t believe that this issue is limited to CyberPower UPSs, but it is probably pretty difficult to encounter with other units, as hitting the issue requires a reconnect of the UPS on the bus. CyberPower units are well known to disconnect and reconnect seemingly at random.
For those wondering why this issue did not happen prior to 23.01: 22.05 and below used nut version 2.7.4. In 2.7.4, nut did not actually close the device when a disconnect happened. Presumably, this may have resulted in a memory leak at shutdown, but I didn’t explore enough to confirm. In 2.8.0, nut actually closes the device when it disconnects. In fact, it closes it twice on all systems other than Linux. There is even a comment in the code noting that the double close would cause corruption on Linux systems. Unfortunately, contrary to the code comments, it also causes corruption on FreeBSD.
Having traced this down in the 2.8.0 release code, when I went to the current development version I was surprised to find that someone had beaten me to it and fixed the issue back in August. It's just not been released yet. :)
Resolving this will require a new version of nut: either a new release from the nut team, or the pfSense team moving from the 2.8.0 release version to the current development version of nut. The nut team is looking toward 2.8.1, but it appears that they have a few things they still want to address before putting that out. I will explore the concept of moving to nut-devel with the pfSense team as time permits (I think they are pretty busy right now). But no matter which way it goes, it’s going to take some time.
In the interim, the only known work-around using the release code (discovered by @tman222) is to add the line
interruptonly
to the Extra Arguments to driver section. This will cut down some of the information you can see about your UPS, but the important stuff needed to monitor and shut down should still be there.
Alternatively, if brave souls are interested, I have a build of usbhid-ups made from the FreeBSD nut-devel package. I do not have a CyberPower available so I haven't been able to directly test it, but I expect that it will work.
If you decide you would like a copy, reach out to me and I will see about getting it to you. For reference, the shasum and sha256sum checksums are:
49ce9131502bfb8b789ee97b7fb3fc81fc9f8fff usbhid-ups
999a2653559dbc50ecc8ba592a67587b1e307a1495f6e8ebbd3d8e90e3967133 usbhid-ups
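If it helps, here is a minimal sketch of checking a downloaded copy against the SHA-256 value above. It assumes a `sha256sum` command is on the PATH and that you run it under /bin/sh rather than an interactive csh-style shell; `verify_sha256` is just a made-up helper name, not part of nut or pfSense.

```shell
#!/bin/sh
# Sketch: compare a file's SHA-256 digest against an expected value.
# Assumption: a sha256sum command is available on this system.
# verify_sha256 is a hypothetical helper name used only for this example.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    # Exit status 0 on a match, non-zero otherwise.
    [ "$actual" = "$expected" ]
}

# Example, run from the directory holding the binary:
#   verify_sha256 usbhid-ups 999a2653559dbc50ecc8ba592a67587b1e307a1495f6e8ebbd3d8e90e3967133 \
#       && echo "checksum OK" || echo "checksum MISMATCH"
```

If the function reports a mismatch, do not install the file.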
-
I'm going to sleep now...
-
Hi!
Last year I was struggling to connect to an Ever ECO Pro 1200 AVR CDS UPS.
This was mostly because nut was at version 2.7 and support for that UPS was added in 2.8. Lately I tried it again. This time the nut package is already at version 2.8 (2.8.0_2).
I've installed it and configured it with the usbhid driver and some basic extra arguments: port=auto vendorid=2e51 productid=0000
Unfortunately, it still doesn't work. Why?
Log:
Feb 22 20:42:29 upsmon 71579 Poll UPS [ever] failed - Driver not connected
-
@tnowak First, I would recommend removing everything from the extra arguments section. Following that, test again. If it doesn't work please post the output from usbhid-ups, either from the system log or from the command line. The command line would be this:
/usr/local/libexec/nut/usbhid-ups -a ever
FYI, the port=auto is handled by the package and should never be added to the extra arguments section for usbhid. The vendorid/productid can be added back later if necessary.
-
@dennypage Solved it by adding user=root in ups.conf section. But this is rather a workaround than a solution. Anyway, this seems to be a problem of nut / file / dev permissions.
-
@tnowak said in NUT package:
Solved it by adding user=root in ups.conf section. But this is rather a workaround than a solution. Anyway, this seems to be a problem of nut / file / dev permissions.
I expect that you are actually in the same situation as the new gen APC listed above: No quirk covering your UPS device (I actually don't see anything from Ever in the table at all).
To confirm, use these steps:
- disable nut (Services / UPS / Settings) and save the config
- unplug the usb connection to the ups and wait 5 seconds
- re-plug the usb connection to the ups
- run "usbconfig -d ugen0.2 show_ifdrv"
and post the result. My expectation is that you will see two lines, similar to this:
ugen0.2: <American Power Conversion Smart-UPS1000 FW:UPS 16.0 / ID1047> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (10mA)
ugen0.2.0: uhid0: <American Power Conversion Smart-UPS1000 FW:UPS 16.0 / ID1047, class 0/0, rev 2.00/0.01, addr 1>
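If you would rather not eyeball the output, a rough sketch of the same check in /bin/sh follows. It only looks for an interface line of the form `ugenX.Y.Z: ` in whatever text is fed to it; `has_kernel_driver` is a made-up helper name, and the exact output layout is assumed to match the sample above.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if show_ifdrv output read from stdin contains an
# interface line such as "ugen0.2.0: uhid0: ...", meaning a kernel driver is
# attached to the device.
# has_kernel_driver is a hypothetical helper name used only for this example.
has_kernel_driver() {
    # The device line looks like "ugen0.2: ..." (two numbers); an interface
    # line has a third number, e.g. "ugen0.2.0: ...".
    grep -Eq '^[[:space:]]*ugen[0-9]+\.[0-9]+\.[0-9]+: '
}

# Example:
#   usbconfig -d ugen0.2 show_ifdrv | has_kernel_driver \
#       && echo "kernel driver attached" || echo "no kernel driver attached"
```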
-
Hi @dennypage - thanks again for all your help looking into the signal 10/11 issue with CyberPower UPS units. I'm fine to continue running mine with the
interruptonly
flag workaround for now even if fewer variables are monitored. After a couple days of running this way, things appear to be stable. If this setup ends up crashing at some point, I'll probably give the updated usbhid-ups driver a try. Also, if you do end up releasing a nut-devel package at some point that includes fixes post 2.8.0, I'd be happy to try that out as well.
-
@dennypage You are amazing. I appreciate all you do. Again, thanks for taking the time to look into the issue reported in this discussion. It seems to be a problem for many other users now, and you already have a solid solution for it.
-
@dennypage said in NUT package:
@tnowak said in NUT package:
Solved it by adding user=root in ups.conf section. But this is rather a workaround than a solution. Anyway, this seems to be a problem of nut / file / dev permissions.
I expect that you are actually in the same situation as the new gen APC listed above: No quirk covering your UPS device (I actually don't see anything from Ever in the table at all).
To confirm, use these steps:
- disable nut (Services / UPS / Settings) and save the config
- unplug the usb connection to the ups and wait 5 seconds
- re-plug the usb connection to the ups
- run "usbconfig -d ugen0.2 show_ifdrv"
and post the result. My expectation is that you will see two lines, similar to this:
ugen0.2: <American Power Conversion Smart-UPS1000 FW:UPS 16.0 / ID1047> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (10mA)
ugen0.2.0: uhid0: <American Power Conversion Smart-UPS1000 FW:UPS 16.0 / ID1047, class 0/0, rev 2.00/0.01, addr 1>
Result:
ugen0.2: <EVER ECO PRO AVR CDS> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (100mA)
ugen0.2.0: uhid0: <EVER ECO PRO AVR CDS, class 0/0, rev 2.00/1.00, addr 2>
PS. I've also noticed a problem with nut losing connection to this UPS after some time, even with user=root. Then when I restart nut it shows up again.
-
@tnowak said in NUT package:
Result:
ugen0.2: <EVER ECO PRO AVR CDS> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (100mA)
ugen0.2.0: uhid0: <EVER ECO PRO AVR CDS, class 0/0, rev 2.00/1.00, addr 2>
PS. I've also noticed a problem with nut losing connection to this UPS after some time, even with user=root. Then when I restart nut it shows up again.
Yep, that shows a kernel driver attached to the device. Same situation as the new series APC devices. Not too surprising, because I don't find any Ever devices defined in the usb table.
You can either use the user=root approach, or you can develop a quirk setting for /boot/loader.conf.local. Based on your prior post, I believe that the correct value would be this:
hw.usb.quirk.0="0x2e51 0x0002 0x0000 0xffff UQ_HID_IGNORE"
You can test this in advance by running this:
usbconfig add_dev_quirk_vplh 0x2e51 0x0002 0x0000 0xffff UQ_HID_IGNORE
followed by unplugging and replugging the usb cable to your ups. If the values are correct, when you run "usbconfig -d ugen0.2 show_ifdrv" again, you should only see one line of output like so:
ugen0.2: <EVER ECO PRO AVR CDS> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (100mA)
The ugen0.2.0 should be gone.
If this test works then you can add the line to your /boot/loader.conf.local file.
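For anyone with a different device, here is a small sketch that builds the loader.conf.local line from the IDs, using the same layout as the line above (vendor, product, lowest revision, highest revision, quirk name). `quirk_line` is a made-up helper name, and the 0x0000 to 0xffff revision range is simply carried over from the example.

```shell
#!/bin/sh
# Sketch: print a loader.conf.local quirk entry for a USB HID UPS.
# quirk_line is a hypothetical helper name used only for this example.
# Arguments: quirk slot number, vendor ID, product ID (4-digit hex, no 0x prefix).
quirk_line() {
    slot=$1
    vendor=$2
    product=$3
    # The revision range 0x0000..0xffff covers every device revision.
    printf 'hw.usb.quirk.%s="0x%s 0x%s 0x0000 0xffff UQ_HID_IGNORE"\n' \
        "$slot" "$vendor" "$product"
}

# Example: quirk_line 0 2e51 0002
```

Check the IDs against what your own device reports before adding the line, and pick a slot number that is not already in use in your loader.conf.local.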
As to losing communication after a time, I would still need to see the output from usbhid-ups, either from the system log or from the command line. There seem to be a few issues that do not produce entries in the system log, so I would recommend using the command line as previously discussed.
-
@dennypage said in NUT package:
hw.usb.quirk.0="0x2e51 0x0002 0x0000 0xffff UQ_HID_IGNORE"
You can test this in advance by running this:
usbconfig add_dev_quirk_vlph 0x2e51 0x0002 0x0000 0xffff UQ_HID_IGNORE
followed by unplugging and replugging the usb cable to your ups. If the values are correct, when you run "usbconfig -d ugen0.2 show_ifdrv" again, you should only see one line of output like so:
ugen0.2: <EVER ECO PRO AVR CDS> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (100mA)
The ugen0.2.0 should be gone.
Thanks, this was very helpful. I had to modify the command a bit, as I noticed it's add_dev_quirk_vplh not vlph, and I changed the pid (product id) to 0x0000. Now the second line is gone:
Now NUT starts without user=root just fine:
Feb 23 22:27:05 upsmon 20539 Communications with UPS ever established
Feb 23 22:27:05 upsd 23193 User local-monitor@127.0.0.1 logged into UPS [ever]
Feb 23 22:27:01 php 16832 /usr/local/sbin/acbupload.php: End of configuration backup to https://acb.netgate.com/save (success).
Feb 23 22:27:01 upsd 23193 Startup successful
Feb 23 22:27:01 upsd 23193 Connected to UPS [ever]: usbhid-ups-ever
Feb 23 22:27:01 upsd 23193 listening on 127.0.0.1 port 3493
Feb 23 22:27:01 upsd 23193 listening on ::1 port 3493
Feb 23 22:27:00 usbhid-ups 21611 Startup successful
But soon after one minute or so:
Feb 23 22:28:15 upsmon 20539 Poll UPS [ever] failed - Driver not connected
Feb 23 22:28:10 upsmon 20539 Communications with UPS ever lost
Feb 23 22:28:10 upsmon 20539 Poll UPS [ever] failed - Driver not connected
Feb 23 22:28:08 kernel pid 21611 (usbhid-ups), jid 0, uid 66: exited on signal 10
Feb 23 22:28:08 upsd 23193 Can't connect to UPS [ever] (usbhid-ups-ever): Connection refused
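For anyone wanting to check their own log for the same signature, here is a rough sketch that counts the kernel lines reporting a usbhid-ups crash. It reads log text on stdin; `count_driver_crashes` is a made-up helper name, and /var/log/system.log in the example is an assumption about where your system log lives.

```shell
#!/bin/sh
# Sketch: count log lines where the kernel reports that usbhid-ups exited
# on a signal (for example "exited on signal 10").
# count_driver_crashes is a hypothetical helper name used only for this example.
count_driver_crashes() {
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status at 0 when the count is zero.
    grep -Ec '\(usbhid-ups\).*exited on signal [0-9]+' || true
}

# Example (the log path is an assumption):
#   count_driver_crashes < /var/log/system.log
```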
-
@tnowak said in NUT package:
I had to modify the command a bit, as I noticed it's add_dev_quirk_vplh not vlph
Sorry, typo. I corrected the original post.
But soon after one minute or so:
Feb 23 22:28:08 kernel pid 21611 (usbhid-ups), jid 0, uid 66: exited on signal 10
Congratulations, you are a double winner.
The post above regarding the CyberPower UPS units applies to you as well.
-
@dennypage said in NUT package:
Congratulations, you are a double winner.
Wow, amazing! I deployed that workaround for the time being and it works reliably now! Looking forward to future nut package releases that solve this issue.
You're the man @dennypage! Thank you VERY much for your support that is extremely competent and helpful.
-
The "usbhid-ups" binary built from the FreeBSD nut-devel port that you provided earlier this week for testing looks to have solved the problem for me (or at least for those that have CyberPower UPSs). As of this morning Eastern time, it's been running for over 72 hours with no more "exited on signal 10" errors in my system log file. Thanks for all your help in identifying the issue and finding that it's already been fixed in the newer version of nut.
-
@shaffergr is this available from the package manager now?
-
No. Denny provided me a build from the nut-devel branch so that we could validate whether the signal 10 issue was fixed or not.
-
@dennypage I've been seeing logs like this:
Feb 25 19:37:13 gatekeeper kernel: pid 29800 (usbhid-ups), jid 0, uid 0: exited on signal 10
Feb 25 19:37:15 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:15 gatekeeper upsmon[28298]: Communications with UPS tripplite lost
Feb 25 19:37:20 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:25 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:30 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
Feb 25 19:37:35 gatekeeper upsmon[28298]: Poll UPS [tripplite] failed - Driver not connected
<goes on the same until manually restarted.>
Setting
interruptonly
appears to have mitigated it. UPS is a Tripp Lite SMART1500LCD rack mount unit. It seemed to work fine prior to the last update.
-
@jpp-0 I sent you the dev build of usbhid-ups. Please let me know if it works for you.