NUT package (2.8.0 and below)
-
So, I confess that, after I'd initially set up NUT, I'd been content with the fact that I could see UPS status and that it would probably function as intended in a power outage. We rarely lose power here and, when we do, it usually comes back in a few minutes, so I'd never hit an extended outage before. However, I did hit that situation a few weeks ago and realized my pfSense (Netgate 3100) did not shut down correctly.
Only in the past few days have I been able to truly play around with it to see what might be going on. I tried running upsmon -c fsd to simulate an outage and found the system shut down the UPS in about 10 seconds, well before pfSense could actually shut down. Here is what I see from the console when I run that command:
[22.05-RELEASE][root@pfSense.localdomain]/root: upsmon -c fsd
Network UPS Tools upsmon 2.7.4
[22.05-RELEASE][root@pfSense.localdomain]/root:
Netgate pfSense Plus is now shutting down ...
Network UPS Tools upsmon 2.7.4
kill: No such process
UPS: ups (master) (power value 1)
Using power down flag file /etc/killpower
Power down flag is set
Network UPS Tools - UPS driver controller 2.7.4
Network UPS Tools - Generic HID driver 0.41 (2.7.4)
USB communication driver 0.33
Using subdriver: CyberPower HID 0.4
Initiating UPS shutdown
pflog0: promiscuous mode disabled
ugen1.2: <CPS CP685AVRa> at usbus1 (disconnected)
When it says "Netgate pfSense Plus is now shutting down", the UPS power turns off within 10 seconds and I see nothing in the logs about pfsense actually shutting down. It just goes from normal system messages to rebooting and I see it having to repair the file system (the repair messages don't seem to actually be in the system.log):
Jul 30 07:30:47 pfSense kernel: Root mount waiting for: CAM
Jul 30 07:30:47 pfSense kernel: Root mount waiting for: CAM
Jul 30 07:30:47 pfSense kernel: mountroot: waiting for device /dev/diskid/DISK-0E32776As2a...
Jul 30 07:30:47 pfSense kernel: WARNING: / was not properly dismounted
Jul 30 07:30:47 pfSense kernel: random: unblocking device.
Thoughts as to why pfSense is not actually initiating a shutdown and the UPS kill command is being sent so soon?
Thanks,
Dan
-
For shutdown testing, I would recommend that you connect your NUT master (pfSense in this case) directly to power rather than through the UPS. This guarantees that you have complete logs and avoids potential corruption of your file system.
Now, re-run your test and time how long it takes for the shutdown to happen. Pay particular attention to the time between the OS shutdown completion and the actual power cut.
The UPS must be configured to have sufficient delay between the kill command and the actual load shut off. Often, the default value (usually in the 10-20 second range) is insufficient for the operating system shutdown to complete and needs to be increased.
With higher end UPSs you can usually configure this value via upsrw. Look for a "ups.delay.shutdown" or "load.off.delay" parameter.
With other UPSs the off delay has to be set every time the UPS starts up. In particular, if you're using usbhid-ups, it's almost certain that you will need to set the value "offdelay" in the driver configuration.
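As a rough sketch of what that driver configuration can look like with usbhid-ups: the lines below go in the UPS's section of ups.conf (on pfSense, as far as I know, that means the extra driver arguments field in the NUT package settings). The numbers are illustrative assumptions, not recommendations. Pick offdelay so that it comfortably exceeds the shutdown time you measured, and keep ondelay larger than offdelay.

```
# Illustrative values only; base offdelay on your own measured shutdown time.
# Seconds between the kill command and the UPS switching off the load:
offdelay=60
# Seconds before the UPS switches the load back on; keep it greater than offdelay:
ondelay=70
```

After restarting the driver, the new values should normally be visible in the UPS variables (for example ups.delay.shutdown), which is a quick way to confirm the setting took effect before running a real test.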
-
@dennypage Thank you very much for the feedback. Based on what you described, I think I might have misunderstood when the UPS power off command is issued. Based on this documentation on the NUT site:
6. The upsmon primary:
   - generates a NOTIFY_SHUTDOWN event
   - waits FINALDELAY seconds, typically 5
   - creates the POWERDOWNFLAG file in its local filesystem, usually /etc/killpower
   - calls the SHUTDOWNCMD
7. On most systems, init takes over, kills your processes, syncs and unmounts some filesystems, and remounts some read-only.
8. init then runs your shutdown script. This checks for the POWERDOWNFLAG, finds it, and tells the UPS driver(s) to power off the load by sending commands to the connected UPS device(s) they manage.
9. All the systems lose power.
10. Time passes. The power returns, and the UPS switches back on.
11. All systems reboot and go back to work.
I had assumed that the power off command would not be issued until pfSense had performed most of its shutdown, but that wasn't the case at all.
I did in fact also try this by pulling the plug, with an artificially high battery.charge.low setting, and experienced the same rapid power off before pfSense even began shutting down.
If the power off command is being issued before pfSense even begins a shutdown sequence, I understand what you are saying about needing to increase the ups.delay.shutdown. I see that my UPS (CP685AVR) allows setting this via upsrw. I'll try playing around with that this evening after the household heads to bed. Don't think they'd appreciate losing Internet.
Thanks,
Dan
-
@dennypage - just a quick follow-up. I tried the upsrw way of setting ups.delay.shutdown and, as you suggested, that didn't stick, so I set offdelay and ondelay in the additional driver args section. I still won't be testing a full off scenario with the new settings until this evening so the family doesn't lose connectivity, though.
In the interim, would you be able to comment on my observations of the UPS power off command being sent immediately rather than after pfSense has had init shut everything down as per the NUT documentation I copied in the post above? Is that just due to the way NUT is being controlled under pfSense?
The way the NUT documentation describes it, the UPS driver shouldn't tell the UPS to power off the load until after all the processes have shut down and disks have been unmounted or remounted read-only.
Thanks,
Dan
-
@dannyboy2k said in NUT package:
The way the NUT documentation describes it, the UPS driver shouldn't tell the UPS to power off the load until after all the processes have shut down and disks have been unmounted or remounted read-only.
The pfSense-specific shutdown scripts are called early in the shutdown process. It's a two-edged sword: on one hand you want things like the rrd backup to happen early, on the other you want things like NUT to happen late. Right now there is only one option, and it's early.
-
-
@whoami-tm said in NUT package:
vendorid=09ae
productid=2012
offdelay=60
How on earth did you get your SMART1500LCD to even show up and stay connected? Mine just disconnects (and immediately reconnects) every 10 seconds (approx.).
I'm using an old NetGate SG-4860:
Plug-in USB
ugen0.4: <Tripp Lite Tripp Lite UPS> at usbus0
uhid0 on uhub1
uhid0: <Tripp Lite Tripp Lite UPS, class 0/0, rev 1.10/0.09, addr 4> on usbus0
Every 10 seconds (or so) it disconnects
ugen0.4: <Tripp Lite Tripp Lite UPS> at usbus0 (disconnected)
uhid0: at uhub1, port 2, addr 4 (disconnected)
uhid0: detached
usbconfig
ugen0.1: <Intel EHCI root HUB> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.2: <vendor 0x8087 product 0x07db> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.3: <Generic Ultra Fast Media> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (96mA)
ugen0.4: <Tripp Lite Tripp Lite UPS> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (0mA)
I even tried using the same extra arguments to the usbhid driver that you showed in your settings, but I don't think it matters since the USB device keeps disconnecting.
Perhaps I'm just unlucky
Broadcast Message from root@pfSense.localdomain (no tty) at 14:31 EDT ...
Communications with UPS tripplite lost
Broadcast Message from root@pfSense.localdomain (no tty) at 14:31 EDT ...
UPS tripplite is unavailable
upscmd -l tripplite
Error: Driver not connected
-
@racecarr said in NUT package:
Plug-in USB
ugen0.4: <Tripp Lite Tripp Lite UPS> at usbus0
uhid0 on uhub1
uhid0: <Tripp Lite Tripp Lite UPS, class 0/0, rev 1.10/0.09, addr 4> on usbus0
Every 10 seconds (or so) it disconnects
Start with basics. Disable NUT. Reboot pfSense.
Does the log still fill with disconnect/reconnect messages every 10 seconds?
If so, do you have a USB hub in-line with the UPS? If you do, try removing it.
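To make the before/after comparison less of an eyeball exercise, something like the following could count the disconnect events in the kernel log. This is only a sketch: count_disconnects is a made-up helper name, and it assumes the log lines look like the ones quoted above, with the device description between angle brackets and "(disconnected)" on the detach line.

```shell
# Count kernel-log disconnect events for one USB device.
# Reads log text on stdin; $1 is the device description exactly as it
# appears between < and > in the log (hypothetical helper, not part of NUT).
count_disconnects() {
    grep -F "<$1>" | grep -c -F "(disconnected)"
}
```

Run it as `dmesg | count_disconnects "Tripp Lite Tripp Lite UPS"` once with NUT enabled and once after disabling NUT and rebooting. If the count keeps climbing with NUT disabled, the problem is at the USB level (cable, hub, port) rather than in the NUT driver.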
-
@racecarr I would also test the USB cable, just to be sure. I have a CyberPower system and it works great. It saved my Java programming final one year: the power went out during the timed test, in the last 10 minutes.
-
@racecarr Not sure if you read all my posts, but the conclusion was that I returned the Tripp Lite Smart1500 to Amazon because I couldn't get it to work with NUT the way I needed, and bought an APC BGM1500B and a rack shelf. The APC just worked and did everything with NUT that I needed it to do. I am also running a Netgate SG-4860-1u.
-
@whoami-tm yeah it does seem like a potential big waste of time. I got the Tripp-Lite for free so I figured I would try and make it work.
Ironically it seems to work great when plugged in to a Mac
-
Hello,
At home I have a VDSL line as the main connection and an LTE Netgear modem as backup.
Both are configured in pfSense as a Multi WAN failover gateway.
pfSense Plus 22.0. An APC UPS is connected directly via USB.
NUT is installed and working fine.
However, when one of the two connections drops, e.g. due to high ping latency, I get a message from pfSense that the UPS has been disconnected from pfSense.
Why does the gateway have an impact on the NUT service?
How can I avoid NUT throwing an error when one of the two gateways goes down?
-
@renegade A gateway failover results in pfSense restarting all packages. The reason is that pfSense does not know which services are impacted by the gateway and which are not. If you had a single gateway you could just disable the gateway monitoring action, but with two gateways that you are switching between I don't believe there is anything you can do currently. Perhaps in the future.
-
@renegade said in NUT package:
I get a message from pfSense that the UPS has been disconnected from pfSense
Can you show the message?
-
@dennypage Alright, thank you. Understood, and good to know that I can stop testing different things.
@gertjan I've configured Pushover to send me messages:
Firewall: firewall.home
UPS Notification from firewall.home - Mon, 24 Oct 2022 03:47:12 +0200
Communications with UPS ups lost
————————————-
Firewall: firewall.home
UPS Notification from firewall.home - Mon, 24 Oct 2022 03:47:17 +0200
Communications with UPS ups established
-
-
@gwaitsi said in NUT package:
@Teken did you look at my post above?
http://rogerprice.org/NUT/ConfigExamples.A5.pdf
i believe the scenario you require is covered in the examples.
I was also told the scenario i wanted is not supported....just needed to change from a bash based script to a shell based one and everything turned on.
@teken or @gwaitsi - Did you ever find a way to do this - read the status from multiple UPS units into NUT? I just made a post along similar lines before finding these messages. But that rogerprice.org website seems to be down now, did either of you by chance save that PDF?
-
@occamsrazor No. After thinking about the problem some more, I put it on the back burner.
- I have a dumb battery on the J1900
- I have an APC smart UPS on the NAS upstairs
- in between are a couple of switches.
a) if I shut the J1900 down after x minutes due to loss of comms, I wouldn't be able to restart it, because you can't use the Netgear switches as a WakeOnLAN provider
b) I am now using ZFS as the pfSense file system and I should get about 30-60 min with the battery. If the outage is longer, ZFS should provide a more resilient recovery (I think). I can then simply use the PowerOn Restore BIOS option to force the restart.
On balance, I decided it was easier to take the risk of a long outage and get the benefit of an autorestart. Of course, it would have been a different story if I could have used the Netgear switches to trigger WakeOnLAN for the pfSense box.
-
NUT (2.7.4_20) with CyberPower RMCARD205: incorrect decimal point (0.1) in variable battery.runtime
There is a problem with CyberPower RMCARD205 + NUT installations. Maybe it's time to switch to 2.8 ( https://www.freshports.org/sysutils/nut ), as the "snmp-ups" driver in the installed version does not include this card type.
The problem with this wrong value is that the shutdown threshold slips from, say, a few minutes (20%) to a few hours :) and so the UPS protection does not kick in when it should (because it "thinks" there are 5 hours left in the batteries when in fact there are only 5 minutes left).
I repeat, this happens only with the RMCARD205 network SNMP card that we use and, unfortunately, with the old driver set ("driver.list", snmp-ups)...
OLD "snmp-ups" file in pfS inst driver.list NUT.:
RMCARD205 web interface:
pfS NUT console:
According to new compatibility:
-
@daddygo said in NUT package:
maybe it's time to switch to 2.8
Pending pull request and redmine issue.
-
Hi Denny,
Super, then we had the same thought (2.8), and I hope it happens quickly, as we have purchased 10 pcs. of 1.5 kVA UPS units with the RMCARD205.
(I say softly: yeah, my fault for not checking the pfS NUT package and the RMCARD drivers first.)
Anyway, thanks for your selfless work on this package, I have been watching it for a while...
-
-
-