-
Is the NUT pfSense package set up to deal with power race conditions?
I'm trying to understand when /usr/local/etc/rc.d/shutdown.nut.sh gets executed.
My UPS will cancel a delayed shutdown if the power returns during the shutdown delay, so that would leave my firewall in a halted state, never to restart.
So should a shutdown delay ever be used with the pfSense NUT package? Or is it best to shut down the UPS immediately when "/usr/local/sbin/upsdrvctl shutdown" gets called?
The NUT FAQ recommends adding a reboot after 120 seconds if the system is still up after the upsdrvctl shutdown command has been run.
https://networkupstools.org/docs/FAQ.html#_i_8217_m_facing_a_power_race
-
@stompro said in NUT package:
Is the nut pfsense package setup to deal with power race conditions?
No. Once a low battery situation has been declared, the pfSense system will perform a complete shutdown.
The "power race" situation discussed in the FAQ entry is rather antiquated, and was only pertinent to really dumb UPSs. The approach recommended in the FAQ required the OS to remain operational and not actually perform a complete shutdown. The expectation was that the secondary file systems would all be unmounted, and the root FS quiesced or remounted read-only to minimize damage to the file system. 20+ years ago this may have been a somewhat common approach to system shutdown, but no longer.
My UPS will cancel a delayed shutdown if the power returns during the shutdown delay, so that would leave my firewall in a halted state, never to restart.
Are you sure? Most all modern UPSs, once the kill command has been received, will carry forward with the disconnect of power regardless of whether or not mains power is present. Unless you have a very old and dumb UPS, I would not worry about it. If you do have such a UPS, I would seriously consider getting a new one.
-
@dennypage said in NUT package:
@stompro said in NUT package:
Is the nut pfsense package setup to deal with power race conditions?
No. Once a low battery situation has been declared, the pfSense system will perform a complete shutdown.
The "power race" situation discussed in the FAQ entry is rather antiquated, and was only pertinent to really dumb UPSs. The approach recommended in the FAQ required the OS to remain operational and not actually perform a complete shutdown. The expectation was that the secondary file systems would all be unmounted, and the root FS quiesced or remounted read-only to minimize damage to the file system. 20+ years ago this may have been a somewhat common approach to system shutdown, but no longer.
Are you sure about this? It seems like a better method than to hope and assume that the shutdown delay is set to a correct length of time to allow a machine to shut down before the power gets cut. Modern file systems are better, but there are tons of pfSense systems using UFS, which is known to handle power cuts poorly.
I'm not sure if RAM disks get synced before the NUT shutdown happens... I'll have to check on that since I use the ramdisk feature on all my firewalls.
I'm pretty sure all my Debian-based systems use the official method of running a shutdown and then killing the power at the end of the shutdown. And it looks like they also support a "POWEROFF_WAIT=15m" option in /etc/nut/nut.conf to handle this type of power race condition.
See the systemd nutshutdown script https://github.com/networkupstools/nut/blob/master/scripts/systemd/nutshutdown.in
Looks like that was committed in 2018, so it seems like this style of system shutdown with NUT isn't something that went away 20+ years ago as I believe you are implying.
My UPS will cancel a delayed shutdown if the power returns during the shutdown delay, so that would leave my firewall in a halted state, never to restart.
Are you sure? Most all modern UPSs, once the kill command has been received, will carry forward with the disconnect of power regardless of whether or not mains power is present. Unless you have a very old and dumb UPS, I would not worry about it. If you do have such a UPS, I would seriously consider getting a new one.
Ha, I have a brand new Tripp Lite SMART750RM1U that does seem to have this problem. I'll test it again to make sure, but I also checked with Tripp Lite support and they said all their UPSs behave like that. If the power returns after the shutdown delay command is sent, the shutdown gets canceled. (I'm not sure how much I believe their support though, or my ability to communicate what I'm asking correctly.)
Here is a Debian bug from 2016 that mentions that an APC UPS (no model given) has a similar quirk: the shutdown command is ignored when on mains power. So I don't know that this behavior is all that rare yet.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=835634
But I think you have answered my question: no, the pfSense NUT package doesn't handle power race conditions.
-
@stompro said in NUT package:
Are you sure about this? It seems like a better method than to hope and assume that the shutdown delay is set to a correct length of time to allow a machine to shut down before the power gets cut. Modern file systems are better, but there are tons of pfSense systems using UFS which is known to handle power cuts poorly.
Yes. The default kill delay of 20 seconds covers most non NAS systems. The delay is usually controllable so you can adjust it if needed. See the description of variable offdelay in usbhid-ups.
UFS is an example of why the approach recommended in that FAQ entry is generally a bad idea because it requires that the UFS file system still be mounted when the power is cut. UFS is somewhat fragile and even if quiesced does not appreciate unclean shutdowns.
Regardless, the quiesce shutdown approach is not something that pfSense supports, let alone the pfSense NUT package. Even if it did, it still wouldn't be a good idea...
If you are using a UPS that will not obey the kill command if mains returns, it is still a much better approach to use a safe, complete shutdown that requires manual intervention to recover rather than an unsafe shutdown that may result in damage to the root file system.
Let's put some numbers to this. Let's say we have a UPS with a 10 minute run-time before low battery, a 20 second delay on the kill command, and power events that last 0-4 hours with an even distribution.
4% of the time there will be no issue because mains will be restored before shutdown begins. 0.1% of the time, mains will be restored during the kill command delay, and the system will continue the shutdown and require external intervention to reboot. However, if the system does not use a complete shutdown, 96% of the time the system will still be running when power is cut, exposing the root file system to potential corruption. Even if we say that there is only a 1 in 100 chance (~1%) that the file system will experience corruption, it's still nearly a 10:1 win to use the full and complete shutdown approach.
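For anyone who wants to rerun those numbers with their own runtime and delay, here is a minimal Python sketch of the same arithmetic. It assumes exactly what the post assumes (outage length uniform between 0 and 4 hours, 10 minute runtime, 20 second kill delay, 1 in 100 corruption chance). The function and key names are made up for illustration.

```python
def outage_outcomes(runtime_min=10.0, kill_delay_s=20.0,
                    max_outage_h=4.0, p_corrupt=0.01):
    """Outage length is assumed uniform on [0, max_outage_h], as in the post."""
    total_s = max_outage_h * 3600.0
    # Mains returns before low battery is declared: nothing happens.
    p_no_issue = (runtime_min * 60.0) / total_s
    # Mains returns inside the kill delay: a UPS that cancels the kill
    # leaves the halted system waiting for someone to power it back on.
    p_stranded = kill_delay_s / total_s
    # Outage outlasts battery plus kill delay: power is cut. Without a
    # complete shutdown the system is still running at that moment.
    p_power_cut = 1.0 - p_no_issue - p_stranded
    p_corruption = p_power_cut * p_corrupt
    return {
        "no_issue": p_no_issue,
        "stranded": p_stranded,
        "power_cut": p_power_cut,
        "corruption": p_corruption,
        "ratio": p_corruption / p_stranded,
    }


for name, value in outage_outcomes().items():
    if name == "ratio":
        print(f"{name}: {value:.1f} to 1")
    else:
        print(f"{name}: {value:.2%}")
```

Unrounded, the kill-delay window is about 0.14% rather than 0.1%, so the ratio comes out nearer 7:1 than 10:1. The conclusion is the same either way.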
Now let's look at the associated costs for those failure cases. In the 0.1% case, the cost of manual intervention is flipping a switch which a lay person can do. In the 1% case, the cost of manual intervention is an experienced system administrator spending significant time on the console fixing file system corruption or performing a complete re-install along with the associated data loss. When you look at the entire picture, it's a very clear and easy choice.
One last log to throw on the fire... with a dumb UPS and the quiesce and reboot approach, there is a significant exposure if there is a second power event shortly after the system decides to reboot. There is something like a 45 second window during which the UPS will likely power off before the system even gets to the point of starting NUT, let alone completing another shutdown. With UFS, the chance of corruption in this case is much higher than 1 in 100. Yes, I know... see variable ondelay in usbhid-ups.
Ha, I have a brand new Tripp Lite SMART750RM1U that does seem to have this problem.
Well, that is unfortunate. I can't speak to your specific model, but I used baby Tripp Lites for several years and did not have that problem. I still have a leftover ECO model under my desk for my workstation.
My "main" UPSs have generally been APCs, and they have obeyed power kill with mains live. On one occasion I really wished that they didn't because I was doing NUT testing without sufficient precaution and accidentally took out all my servers at once. Yes, I know... really stupid.
-
Was the Cyberpower usb issue resolved in 23.05??
-
@jonathanlee said in NUT package:
Was the Cyberpower usb issue resolved in 23.05??
Not at this time. While there is a new release of NUT available, it hasn't been marked as stable in FreeBSD upstream. I'll reach out to the maintainer.
-
@dennypage thanks
-
@dennypage said in NUT package:
@jonathanlee said in NUT package:
Was the Cyberpower usb issue resolved in 23.05??
Not at this time. While there is a new release of NUT available, it hasn't been marked as stable in FreeBSD upstream. I'll reach out to the maintainer.
Can confirm. Had to replace usbhid-ups with the above version from @dennypage again. It's worth noting that in 23.01 I did not have to modify ups.conf, but with 23.05, I did need to add user=root to ups.conf.
-
Just to confirm: even with all the extra setting changes for NUT on pfSense 23.05 (for example, timers and "user=root"), the system again lost connection after a couple of hours.
-
@dennypage said in NUT package:
Not at this time. While there is a new release of NUT available, it hasn't been marked as stable in FreeBSD upstream. I'll reach out to the maintainer.
My bad. I jumped the gun a bit. While there is a new release version pending, it has not actually been tagged yet and still has a few blockers. Sorry about that.
-
I forgot about this issue and it returned after upgrading from pfSense Plus 23.01 to pfSense Plus 23.05. It took me a bit of time to track this down and find the file in this thread that fixes it.
Because of that, I decided to put this in an Ansible playbook so the fix is in code and I don't need to worry about it in the future.
I put a gist up on GitHub of my playbook in case anyone finds it helpful:
pfSense NUT Package Fix Through Ansible
-
I recently set up a new Tripp Lite ECO850LCD UPS for my Netgate SG-1100 (23.05). I installed the NUT package (2.8.0_2). Other than adding user=root to the ups.conf section of the advanced settings, it's a clean install (as far as I know).
usbhid-ups seems to start up and upsmon is getting updates.
Problem is: ups.status changes from OL to OB and back to OL, also indicating discharging and charging state, but there is never a LOW BATT status.
Messages are sent to console and syslog showing UPS is online and on battery, but no low battery notifications.
That means pfSense never shuts down and the UPS never turns off the load.
The UPS runs down to battery.charge=0 and battery.runtime=30, and the SG-1100 is still powered (I don't know how much longer it would go since I went back on mains at that point.)
Sorry to be such a noob, but what do I need to do to get the UPS to notify low battery or work around this problem?
I don't know if it is significant, but I note that battery.charge.low is not reported by upsc. Also, battery.charge.low cannot be set using upsrw.
Any help appreciated.
-
@ghound said in NUT package:
Problem is: ups.status changes from OL to OB and back to OL also indicating discharging and charging state, but there is never a LOW BATT status.
Messages are sent to console and syslog showing UPS is online and on battery, but no low battery notifications.
That means PfSense never does shutdown and the UPS never turns off load.
UPS runs down to battery.charge=0 and battery.runtime=30, SG-1100 is still powered (I don't know how much longer it would go since I went back on mains at that point.)
I have used an ECO Tripp Lite in the past, and don't recall it having that issue. That said, to address the situation you can add the following in the Extra Arguments to driver section on the UPS Settings page:
ignorelb
override.battery.charge.low = 10
override.battery.runtime.low = 120
For more information, see the UPS FIELDS section in the ups.conf documentation.
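As a rough mental model of what those settings do: my understanding of the ups.conf documentation is that with ignorelb set, the driver stops trusting the low battery flag from the UPS and instead compares battery.charge and battery.runtime against the two .low values. A minimal Python sketch of that decision follows. This is an illustration, not NUT's actual code; the function name and the strict "less than" comparison are assumptions.

```python
def is_low_battery(charge, runtime, charge_low=10, runtime_low=120):
    """Illustrative model only, not NUT's code.

    charge is battery.charge in percent, runtime is battery.runtime in
    seconds. The defaults mirror the override values suggested above.
    Low battery is declared when either reading drops below its threshold.
    """
    return charge < charge_low or runtime < runtime_low


# Readings reported earlier in the thread at the end of the discharge test.
print(is_low_battery(charge=0, runtime=30))
# Hypothetical readings for a healthy battery shortly after mains is lost.
print(is_low_battery(charge=80, runtime=900))
```

With the thresholds above, the readings from the discharge test (charge 0, runtime 30) would count as low battery, which is the state the UPS itself never reported.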
-
@dennypage
Thank you Denny. Those changes did the trick!
NUT shut down pfSense and the UPS disconnected shortly afterward.
Everything started normally after reconnecting to mains.
I noticed new syslog entries by usbhid-ups for:
using 'battery.charge' to set battery low state
using 'battery.runtime' to set battery low state
Also, new entries appeared from upsc for:
battery.charge.low: 10
battery.runtime.low: 120
driver.flag.ignorelb: enabled
I thought I had tried those changes before without success; I suspect I didn't do a good job of isolating what I was trying.
Appreciate your help.
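If you want to check for those entries in a script instead of by eye, a small Python sketch along these lines should do. It only parses text in the "name: value" form that upsc prints; the sample text is just the three lines quoted above, and the function names are hypothetical.

```python
def parse_upsc(text):
    """Parse 'name: value' lines (the format upsc prints) into a dict."""
    variables = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            variables[name.strip()] = value.strip()
    return variables


def overrides_active(variables):
    """True when ignorelb is enabled and both low thresholds are reported."""
    return (variables.get("driver.flag.ignorelb") == "enabled"
            and "battery.charge.low" in variables
            and "battery.runtime.low" in variables)


# The three upsc entries quoted in the post above.
SAMPLE = """\
battery.charge.low: 10
battery.runtime.low: 120
driver.flag.ignorelb: enabled
"""

print(overrides_active(parse_upsc(SAMPLE)))
```

To use it against a live system you would feed it the saved output of upsc for your own UPS name, which depends on your configuration.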
-
@dennypage said in NUT package:
I've received several requests for the dev build of usbhid-ups, so I thought I would upload the file here.
For reference, the shasum and sha256sum checksums of the unzipped file are:
49ce9131502bfb8b789ee97b7fb3fc81fc9f8fff  usbhid-ups
999a2653559dbc50ecc8ba592a67587b1e307a1495f6e8ebbd3d8e90e3967133  usbhid-ups
If you use the file, please post and let me know if it resolves an issue for you.
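For anyone without shasum or sha256sum handy, here is a minimal Python sketch that checks a downloaded file against the two checksums quoted above. The 40-character value is treated as SHA-1 (the shasum default) and the 64-character value as SHA-256. The function names are made up for illustration.

```python
import hashlib

# Checksums quoted in the post above for the unzipped usbhid-ups file.
EXPECTED_SHA1 = "49ce9131502bfb8b789ee97b7fb3fc81fc9f8fff"
EXPECTED_SHA256 = (
    "999a2653559dbc50ecc8ba592a67587b1e307a1495f6e8ebbd3d8e90e3967133"
)


def file_digest(path, algorithm):
    """Return the hex digest of the file at path, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_posted_checksums(path):
    """True only if both the SHA-1 and SHA-256 digests match the post."""
    return (file_digest(path, "sha1") == EXPECTED_SHA1
            and file_digest(path, "sha256") == EXPECTED_SHA256)
```

Calling matches_posted_checksums("usbhid-ups") in the directory holding the unzipped file should return True for an intact download.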
I'm assuming this is an Intel build, since it (unsurprisingly) didn't work at all on my Netgate 2100. lol Do you have any idea what the ETA is for an update to the NUT package that would include this patch? If not, or if it's still going to be a while, would it be possible for you to post an ARM build of the patched driver?
-
@Maltz said in NUT package:
I'm assuming this is an Intel build, since it (unsurprisingly) didn't work at all on my Netgate 2100.
Yes, it's an intel build. I don't have an ARM system. I have some free time coming up, and will look at building a cross compile environment.
-
@dennypage said in NUT package:
@Maltz said in NUT package:
I'm assuming this is an Intel build, since it (unsurprisingly) didn't work at all on my Netgate 2100.
Yes, it's an intel build. I don't have an ARM system. I have some free time coming up, and will look at building a cross compile environment.
I've got an RPi 4 running Ubuntu, if that'll work. I tried a few times to build it, but my Netgate didn't care for the results, so far. I'm probably configuring the build wrong somehow. If you have a ./configure command, I'd be happy to build 2.8.0 myself and post the result here.
-
@Maltz said in NUT package:
I've got an RPi 4 running Ubuntu, if that'll work. I tried a few times to build it, but my Netgate didn't care for the results, so far. I'm probably configuring the build wrong somehow. If you have a ./configure command, I'd be happy to build 2.8.0 myself and post the result here.
A Pi running Ubuntu will not work. pfSense is based on FreeBSD rather than Linux. You need to be running FreeBSD to build from the FreeBSD Ports collection.
The cross compile situation I was referring to was building FreeBSD ARM from a FreeBSD Intel system.
-
@dennypage said in NUT package:
A Pi running Ubuntu will not work. pfSense is based on FreeBSD rather than Linux. You need to be running FreeBSD to build from the FreeBSD Ports collection.
The cross compile situation I was referring to was building FreeBSD ARM from a FreeBSD Intel system.
I figured Linux vs FreeBSD was the other likely reason it didn't work. I'm quite unfamiliar with FreeBSD and its package/ports system, but that post gave me enough info to google my way through compiling 2.8.0 on FreeBSD on my spare Pi4. (Interestingly, the RPi version of FreeBSD doesn't have nut in its package repository, so I DID have to compile it from ports.) The replacement USB driver is running fine so far on my Netgate 2100. I'll post the binary here later today.
Thanks so much for sussing out the various issues in this thread - it's certainly helped me tremendously, and I'm sure many others as well!
-
@Maltz said in NUT package:
The replacement USB driver is running fine so far on my Netgate 2100. I'll post the binary here later today.
Thank you for this.
Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.