-
I upgraded NUT on pfSense to latest 2.7.4_5, and ever since I did that, it seems there is a high uptick in UPS communication errors. It's possible that it's another environmental condition that is going on here, but wanted to see if anyone else has had issues. PfSense is running on a small physical box, and it is directly connected via USB to a CyberPower EC650.
I've been getting flooded with emails as a result of this (yes, I know that I can turn them off). But it points to something else that is going on. Here is a dump of the ups related events for a small window on 9/4. In the last 24 hours, I have 232 events that are the Permission denied error.
One thing I did notice, if you look in /usr/local/etc/nut, all the *.conf files have perms of 640, but all the *.conf.sample files have perms of 644. Not sure if that matters or not, and also not sure if that's "new" or not.
Sep 4 19:36:09 upsmon[32699]: Communications with UPS ups established Sep 4 19:36:09 upsmon[32699]: Communications with UPS ups established Sep 4 19:36:09 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:36:09 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:36:04 upsmon[32699]: Communications with UPS ups lost Sep 4 19:36:04 upsmon[32699]: Communications with UPS ups lost Sep 4 19:36:04 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:36:04 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:32:48 upsmon[32699]: Communications with UPS ups established Sep 4 19:32:48 upsmon[32699]: Communications with UPS ups established Sep 4 19:32:48 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:32:48 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:32:43 upsmon[32699]: Communications with UPS ups lost Sep 4 19:32:43 upsmon[32699]: Communications with UPS ups lost Sep 4 19:32:43 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:32:43 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:30:12 upsmon[32699]: Communications with UPS ups established Sep 4 19:30:12 upsmon[32699]: Communications with UPS ups established Sep 4 19:30:12 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:30:12 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:30:07 upsmon[32699]: Communications with UPS ups lost Sep 4 19:30:07 upsmon[32699]: Communications with UPS ups lost Sep 4 19:30:07 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:30:07 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied
Okay, this isn't a NUT issue.
I dove deeper into the logging before these errors hit, and I see each time a log line like this:
Sep 4 19:32:37 dpinger: Mobility_Server 192.168.35.21: Alarm latency 266us stddev 32us loss 21%
After that line, bad things were happening. I'm surprised I wasn't having other network issues - I probably was, just unknown to me.
Sep 4 19:32:48 php-cgi: nut_email.php: Message sent to <redacted>@gmail.com OK Sep 4 19:32:48 php-cgi: nut_email.php: Message sent to <redacted>@gmail.com OK Sep 4 19:32:48 upsmon[32699]: Communications with UPS ups established Sep 4 19:32:48 upsmon[32699]: Communications with UPS ups established Sep 4 19:32:48 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:32:48 upsd[73255]: User monuser@::1 logged into UPS [ups] Sep 4 19:32:43 php-cgi: nut_email.php: Message sent to <redacted>@gmail.com OK Sep 4 19:32:43 php-cgi: nut_email.php: Message sent to <redacted>@gmail.com OK Sep 4 19:32:43 upsmon[32699]: Communications with UPS ups lost Sep 4 19:32:43 upsmon[32699]: Communications with UPS ups lost Sep 4 19:32:43 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:32:43 upsmon[32699]: Poll UPS [ups] failed - Write error: Permission denied Sep 4 19:32:38 php-fpm[30305]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use Mobility_Server. Sep 4 19:32:38 php-fpm[30305]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use Mobility_Server. Sep 4 19:32:37 check_reload_status: Reloading filter Sep 4 19:32:37 check_reload_status: Reloading filter Sep 4 19:32:37 check_reload_status: Restarting OpenVPN tunnels/interfaces Sep 4 19:32:37 check_reload_status: Restarting OpenVPN tunnels/interfaces Sep 4 19:32:37 check_reload_status: Restarting ipsec tunnels Sep 4 19:32:37 check_reload_status: Restarting ipsec tunnels Sep 4 19:32:37 check_reload_status: updating dyndns Mobility_Server Sep 4 19:32:37 check_reload_status: updating dyndns Mobility_Server Sep 4 19:32:37 dpinger: Mobility_Server 192.168.35.21: Alarm latency 266us stddev 32us loss 21%</redacted></redacted></redacted></redacted>
The fix was to go into the gateways, and disable the gateway checking on 192.168.35.21. I still don't know what was causing that in the first place, but that's not something that needs to be addressed in the NUT forum.
-
dennypage, thanks for your great job!
My old Back-UPS RS 1000 connected to pfsense working great, but once or few times a day shows the following in the log\email:
Sep 6 10:47:16 upsmon 42126 UPS Back-UPS_RS_1000 on battery
Sep 6 10:47:21 upsmon 42126 UPS Back-UPS_RS_1000 on line power.Time is always random, but event duration is always 5 seconds exactly.
My settings:
ups.conf[Back-UPS_RS_1000] driver=usbhid-ups port=auto ignorelb override.battery.charge.low = 30 override.battery.runtime.low = 300
upsmon.conf
MONITOR Back-UPS_RS_1000 1 monuser 7504xxxxxxxxxxxx master SHUTDOWNCMD "/sbin/shutdown -p +0" POWERDOWNFLAG /etc/killpower NOTIFYCMD /usr/local/pkg/nut/nut_email.php NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC NOTIFYFLAG FSD SYSLOG+WALL+EXEC NOTIFYFLAG COMMOK SYSLOG+WALL+EXEC NOTIFYFLAG COMMBAD SYSLOG+WALL+EXEC NOTIFYFLAG SHUTDOWN SYSLOG+WALL+EXEC NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC NOTIFYFLAG NOCOMM SYSLOG+WALL+EXEC NOTIFYFLAG NOPARENT SYSLOG+WALL+EXEC RUN_AS_USER root
I realize that is UPS problem but apart from that it works fine.
Are there any NUT tweaks to avoid such behavior? -
dennypage, thanks for your great job!
My old Back-UPS RS 1000 connected to pfsense working great, but once or few times a day shows the following in the log\email:
Sep 6 10:47:16 upsmon 42126 UPS Back-UPS_RS_1000 on battery
Sep 6 10:47:21 upsmon 42126 UPS Back-UPS_RS_1000 on line power.Time is always random, but event duration is always 5 seconds exactly.
…
I realize that is UPS problem but apart from that it works fine.
Are there any NUT tweaks to avoid such behavior?You're welcome!
The event itself may be very brief in duration, but is logged as 5 seconds due to the polling interval. There are several reasons that a momentary transfer to battery can happen depending upon the UPS.
Some UPSs allow the variables which control transfer to be changed. The complete list of variables supported by NUT can be found here. To see what your UPS actually supports, use the upsc command. For transfer items, look for variables "input sensitivity" or "input.transfer…"
Be careful with these. :)
-
My UPS variables:
battery.charge: 100 battery.charge.low: 30 battery.charge.warning: 50 battery.date: 2001/09/25 battery.mfr.date: 2016/10/10 battery.runtime: 5798 battery.runtime.low: 300 battery.type: PbAc battery.voltage: 26.9 battery.voltage.nominal: 24.0 device.mfr: American Power Conversion device.model: Back-UPS RS 1000 device.serial:xxxxxxxxxx device.type: ups driver.flag.ignorelb: enabled driver.name: usbhid-ups driver.parameter.pollfreq: 30 driver.parameter.pollinterval: 2 driver.parameter.port: auto driver.parameter.synchronous: no driver.version: 2.7.4 driver.version.data: APC HID 0.96 driver.version.internal: 0.41 input.sensitivity: medium input.transfer.high: 264 input.transfer.low: 194 input.transfer.reason: input voltage out of range input.voltage: 228.0 input.voltage.nominal: 230 ups.beeper.status: enabled ups.delay.shutdown: 20 ups.firmware: 7.g8 .I ups.firmware.aux: g8 ups.load: 5 ups.mfr: American Power Conversion ups.mfr.date: 2004/07/28 ups.model: Back-UPS RS 1000 ups.productid: 0002 ups.realpower.nominal: 600 ups.serial: xxxxxxxxxxxx ups.status: OL ups.test.result: No test initiated ups.timer.reboot: 0 ups.timer.shutdown: -1 ups.vendorid: 051d
input.transfer.reason: input voltage out of range - probably is the event reason?
I believe that
input.transfer.high:264 and input.transfer.low:194
are just voltage threshold? Am I right?Should I try change input.sensitivity to low? If so - where to put it in NUT settings page?
-
Why would you be doing any such thing? The UPS is doing its job, just leave it alone.
-
nput.transfer.reason: input voltage out of range - probably is the event reason?
I believe that
input.transfer.high:264 and input.transfer.low:194
are just voltage threshold? Am I right?Should I try change input.sensitivity to low? If so - where to put it in NUT settings page?
Yes, those are the voltage levels that will trigger transfer.
Setting UPS variables is not something that you do in your NUT config. You set them in the UPS using the upsrw command.
As Dok notes, the UPS is doing it's job, protecting the attached systems from excessive power fluctuations. You might want to investigate the source of the voltage problem before turning down the sensitivity. Consulting with the UPS manufacturer might be a good idea as well.
-
dennypaged, oktornotor, thanks for answers.
Agreed, the UPS is doing it's job. I'm just trying to avoid annoying mail reports (at the night time especially).
Another SMART UPSes connected to same power input do not have any complains about power conditions. -
Denny (or whoever else wants to lend a hand!),
Running pfSense 2.4.1 as a firewall and router, with NUT 2.7.4_5. I have the pfSense box, a FreeNAS server, and another FreeNAS server (Replication target) all on a CyberPower CP1000AVRLCD UPS. (My switch is also on that, a 16-port Netgear item.) I currently have my FreeNAS server as the master, and both the backups box and pfSense are "seeing" the master no problem. When I pull the cord on the UPS, I get three nice e-mail notifications (one from each of the boxes), and all boxes shutdown in an orderly fashion when ordered to by the master (tested both Battery Low and On Battery, the latter with a 10 second timer in my tests).
The problem is : none of the boxes start back up when I plug the UPS back in! I'm sure this is something dumb, because I'm not that knowledgeable at all - but what I CAN say is, the blue LED is lit on the UPS (but the display itself is blank). As in, it's not shut down. If I press it, it goes completely dark, and if I press it again, it lights up again, along with the LCD display and all - and, voila, most of the boxes start up. That is, both FreeNAS boxes start up - the pfSense one needs to be started manually. All three boxes have Always On setting in the BIOS for waking up after power loss.
Any ideas…? I think it would make more sense to have the box doing the routing (pfSense) also being the master, but I'm not 100% sure how to configure that. But the better question is - do I need to? Would that fix anything?
Thanks in advance, and thanks also both for all your work on this package, and for the continued support you offer to us plebes on this forum! =)
Cheers!
-
Can you clarify what you mean by “the pfSense one needs to be started manually”?
Traveling, so expect delayed responses…
Thanks.
-
Thanks for the quick reply - far from "delayed"!
I mean I have to physically push the power button on the pfSense box.
Enjoy your travels!
-
Thanks for the quick reply - far from "delayed"!
I mean I have to physically push the power button on the pfSense box.
Enjoy your travels!
This is kind of a known issue…what should happen is that the master should shut down the UPS before the UPS fully drains. In your case, the master is your free NAS so you likely need to look there.
I reported this issue back in June or so. Now that 2.4 is out I believe a bug fix for the nut package for pfSense can be released to fix this on the pfSense side. But I don’t think this will help you out, as it sounds like the freeNAS isn’t shutting down the UPS prior to fully draining it.
-
This is kind of a known issue…what should happen is that the master should shut down the UPS before the UPS fully drains. In your case, the master is your free NAS so you likely need to look there.
I reported this issue back in June or so. Now that 2.4 is out I believe a bug fix for the nut package for pfSense can be released to fix this on the pfSense side. But I don’t think this will help you out, as it sounds like the freeNAS isn’t shutting down the UPS prior to fully draining it.
Oh! So the UPS NEEDS to be shut down too? Too easy - FreeNAS has a setting for that in the UPS settings. I'll tick that box, try again and report.
-
And THERE it is!! As soon as I instructed the UPS to shut down in my settings, everything worked perfectly. All boxes shutdown properly, UPS shut down, and when I plugged the UPS back in, all three computers spun right back up. The very first time I went through the whole cycle, comms with the UPS couldn't be re-established for some reason (got a flurry of e-mails from all three boxes), but I've tested it a few times since and it's always worked. The one I'm most worried about is the pfSense box - as long as IT comes back online while I'm away, I can VPN in and restart or whatnot, and do what I need to do, and it always has, regardless of UPS comms status. So, all seems well.
The one scenario I haven't tested yet is a power race, but that'll have to wait - the little one is up…=)
Thanks for your help!!
-
This is kind of a known issue…what should happen is that the master should shut down the UPS before the UPS fully drains. In your case, the master is your free NAS so you likely need to look there.
I reported this issue back in June or so. Now that 2.4 is out I believe a bug fix for the nut package for pfSense can be released to fix this on the pfSense side.
Hello,
I'd like to report I have the same issue: UPS brand is Eaton, fully supported by usbhid driver and "upsdrvctl shutdown" works perfectly on a Debian Jessie but not with pfSense NUT package.I've tried to dig the init script in /usr/local/etc/rc.d/nut modifying it with no luck.
Can I solve it or is it a bug in the pfSense package? If it is a bug, do the maintainer of the package has any forecast about its resolution?
Thank you
-
Sorry again for the delay. Had a few real-life things come up.
For a temporary work-around, please see reply #250.
-
PRs are in, one for mainline pfSense and one for the NUT package, both of which are required to address the power kill issue. The mainline PR needs to go into a pfSense release cycle, so 2.4.2 is the earliest possible, but I expect 2.4.3 is more likely at this point.
-
So does the latest beta 2.4.3 have these PR's?
-
Hey w0w,
I’m traveling… I don’t recall seeing merge notifications, but I get a lot of email. You should be able to check the PRs in the main pfSense and FreeBSD repos to see if they’re still outstanding. If not I’ll check it when I get home.
Thanks,
Denny -
But i found a Problem with the override values. As i said above, i have set the following to the additional ups.conf field in pfSense:
ignorelb override.battery.charge.low = 50 override.battery.runtime.low = 1200
BUT These values are added on top of the ups.conf file, which means in my case, they are ignored. When i edit the ups.conf manually from ssh / Shell, and set thes override values at the bottom of the file and restart the daemon, the everything works and also the Monitoring in pfsense Show the new values. Is this a bug? Problem is, after reboot or restart the daemon, These values are not the anymore…
The section entitled "Additional configuration lines for ups.conf" in Advanced settings is for global directives. The values you are trying to set are UPS specific arguments. UPS specific arguments should be entered in the section above that says "Extra Arguments to driver"
Sorry this isn't clear. I'll have a look at clarifying.
Hello and thanks for the great work.
I'm working with pfSense and NUT 2.7.4_5 package. The "bug" still remains, if I edit (via webpage) the "Extra Arguments to driver (optional)" withignorelb
override.battery.charge.low = 30all works correctly, it adds the 2 lines AFTER
[myups]
driver=usbhid-ups
port=autoin "/usr/local/etc/nut/ups.conf" file.
But if I edit the "Additional configuration lines for ups.conf", which it seems to be more correct (http://networkupstools.org/docs/man/ups.conf.html), it adds the lines at beginning of the file, and it doesn't work.
All other configuration settings (upsmon.conf, upsd.conf, upsd.users) add lines at the end of the files (checked via ssh), as it should be I think… -
The "bug" still remains, if I edit (via webpage) the "Extra Arguments to driver (optional)" with
ignorelb
override.battery.charge.low = 30all works correctly, it adds the 2 lines AFTER
[myups]
driver=usbhid-ups
port=autoin "/usr/local/etc/nut/ups.conf" file.
But if I edit the "Additional configuration lines for ups.conf", which it seems to be more correct (http://networkupstools.org/docs/man/ups.conf.html), it adds the lines at beginning of the file, and it doesn't work.Not sure what "bug" you are referring to…
The directives you are trying to use are not global directives, but are UPS specific directives.
It's easy to overlook this at the beginning of the ups.conf manual page:"The file begins with global directives, and then each UPS has a section which contains a number of directives that set parameters for that UPS."
In other words, the general format of ups.conf looks like this:
global_directive1 global_directive2 global_directive3 [ups1] ups_directive1 ups_directive2 ups_directive3 [ups2] ups_directive1 ups_directive2 ups_directive3
All global directives must appear before the first UPS section. So the "Additional configuration lines for ups.conf" in the Advanced section is only for global directives only, and anything put there will appear before the UPS section. The "Extra Arguments to driver" section is for things that apply to your UPS driver, and will be placed in the UPS section.
Adding extra driver arguments is somewhat common, but the vast majority of users do not need anything under Advanced settings which is why they are hidden by default.