NUT package (2.8.0 and below)
-
@dennypage said in NUT package:
@kevindd992002 To be clear, is this the config for the system that is saying the upsmon process died? Or is this the config for the server side, and it is actually the client side that is complaining about upsmon failing?
Re test result, it's because a self test has never been run on the UPS, the UPS "forgetting" the self test results, or an error in the driver.
Yes, this is the config for the NUT server that complains that the upsmon process has died. My two pfSense boxes are both servers.
Ok. The ups does a self-test every time it is turned on from an off state, right? Doesn't that count?
-
@kevindd992002 said in NUT package:
Re test result, it's because a self test has never been run on the UPS, the UPS "forgetting" the self test results, or an error in the driver.
Ok. The ups does a self-test every time it is turned on from an off state, right? Doesn't that count?
This is at the edge of my knowledge regarding APC. I had APC units many years ago, but moved away many years ago... Most UPSs do a quick transfer test on start up, but whether or not this is considered a full self test, or if the result is stored, will vary from UPS to UPS. The UPS units I currently use do not perform a full self test on power up and will not report any test results until a test is explicitly run.
-
@dennypage said in NUT package:
@kevindd992002 said in NUT package:
@dennypage
Also, I'm getting this on my other pfsense box: upsmon parent process died - shutdown impossible
But when I check the NUT (UPS monitoring daemon) service, it is up and running. If you restart pfsense, is this message normal and will it be sent by pfsense as soon as it boots up?
Yes, this is a problem. Please post your UPS config for that pfSense box.
Just to make sure I have this correct:
The config you posted is for the server side; It is on this box that you are seeing a complaint about upsmon dying; When you see the upsmon dying message, you still see nut showing as running in the service status page (Status / Services).
Do I have that correct?
If so, then the following questions come to mind:
- Do you have the watchdog daemon installed?
- How frequently do you see upsmon dying?
- Are there any other entries in the system log around the time of upsmon death?
- Is this a new issue, or has it occurred since install? Have you always had the remote nut configuration?
- Do you have apcupsd installed?
-
@dennypage said in NUT package:
@kevindd992002 said in NUT package:
Re test result, it's because a self test has never been run on the UPS, the UPS "forgetting" the self test results, or an error in the driver.
Ok. The ups does a self-test every time it is turned on from an off state, right? Doesn't that count?
This is at the edge of my knowledge regarding APC. I had APC units many years ago, but moved away many years ago... Most UPSs do a quick transfer test on start up, but whether or not this is considered a full self test, or if the result is stored, will vary from UPS to UPS. The UPS units I currently use do not perform a full self test on power up and will not report any test results until a test is explicitly run.
Ok, makes sense. Thanks.
@dennypage said in NUT package:
@dennypage said in NUT package:
@kevindd992002 said in NUT package:
@dennypage
Also, I'm getting this on my other pfsense box: upsmon parent process died - shutdown impossible
But when I check the NUT (UPS monitoring daemon) service, it is up and running. If you restart pfsense, is this message normal and will it be sent by pfsense as soon as it boots up?
Yes, this is a problem. Please post your UPS config for that pfSense box.
Just to make sure I have this correct:
The config you posted is for the server side; It is on this box that you are seeing a complaint about upsmon dying; When you see the upsmon dying message, you still see nut showing as running in the service status page (Status / Services).
Do I have that correct?
If so, then the following questions come to mind:
- Do you have the watchdog daemon installed?
- How frequently do you see upsmon dying?
- Are there any other entries in the system log around the time of upsmon death?
- Is this a new issue, or has it occurred since install? Have you always had the remote nut configuration?
- Do you have apcupsd installed?
You got it.
- No, these are all the packages that I have installed (no watchdog daemon):
-
It looks like I receive the message after reboot but not all the time. It happens randomly and it happens for both my pfsense boxes which have the same exact UPS, with the same exact pfsense version and set of settings.
-
System logs around the time of the latest issue occurrence:
May 5 17:48:01 upsmon 12113 Startup successful
May 5 17:48:02 usbhid-ups 35012 Startup successful
May 5 17:48:02 upsd 47110 listening on 192.168.55.1 port 3493
May 5 17:48:02 upsd 47110 listening on 192.168.10.1 port 3493
May 5 17:48:02 upsd 47110 listening on ::1 port 3493
May 5 17:48:02 upsd 47110 listening on 127.0.0.1 port 3493
May 5 17:48:02 upsd 47110 Connected to UPS [ups]: usbhid-ups-ups
May 5 17:48:02 upsd 47942 Startup successful
May 5 17:48:04 upsmon 43097 Startup successful
May 5 17:48:05 usbhid-ups 72062 Startup successful
May 5 17:48:05 upsd 10700 listening on 192.168.55.1 port 3493
May 5 17:48:05 upsd 10700 listening on 192.168.10.1 port 3493
May 5 17:48:05 upsd 10700 listening on ::1 port 3493
May 5 17:48:05 upsd 10700 listening on 127.0.0.1 port 3493
May 5 17:48:05 upsd 10700 Connected to UPS [ups]: usbhid-ups-ups
May 5 17:48:05 upsd 12533 Startup successful
May 5 17:48:06 upsd 12533 User local-monitor@127.0.0.1 logged into UPS [ups]
May 5 17:48:06 upsmon 43843 upsmon parent process died - shutdown impossible
May 5 17:48:06 upsmon 43843 Parent died - shutdown impossible
May 5 17:48:06 upsd 12533 mainloop: Interrupted system call
May 5 17:48:06 upsd 12533 Signal 15: exiting
May 5 17:48:06 usbhid-ups 72062 Signal 15: exiting
May 5 17:48:06 upsmon 77021 Startup successful
May 5 17:48:07 usbhid-ups 9091 Startup successful
May 5 17:48:07 upsd 19878 listening on 192.168.55.1 port 3493
May 5 17:48:07 upsd 19878 listening on 192.168.10.1 port 3493
May 5 17:48:07 upsd 19878 listening on ::1 port 3493
May 5 17:48:07 upsd 19878 listening on 127.0.0.1 port 3493
May 5 17:48:07 upsd 19878 Connected to UPS [ups]: usbhid-ups-ups
May 5 17:48:07 upsd 21841 Startup successful
May 5 17:48:09 upsmon 77021 upsmon parent: read
May 5 17:48:09 upsmon 37716 Startup successful
May 5 17:48:10 usbhid-ups 88380 Startup successful
May 5 17:48:10 upsd 11383 listening on 192.168.55.1 port 3493
May 5 17:48:10 upsd 11383 listening on 192.168.10.1 port 3493
May 5 17:48:10 upsd 11383 listening on ::1 port 3493
May 5 17:48:10 upsd 11383 listening on 127.0.0.1 port 3493
May 5 17:48:10 upsd 11383 Connected to UPS [ups]: usbhid-ups-ups
May 5 17:48:10 upsd 12532 Startup successful
May 5 17:48:12 upsd 12532 User local-monitor@::1 logged into UPS [ups]
May 5 17:48:13 upsmon 39038 Signal 15: exiting
May 5 17:48:13 upsd 12532 User local-monitor@::1 logged out from UPS [ups]
May 5 17:48:13 upsd 12532 mainloop: Interrupted system call
May 5 17:48:13 upsd 12532 Signal 15: exiting
May 5 17:48:13 usbhid-ups 88380 Signal 15: exiting
It looks like it shut itself down and then went back up.
- This has occurred since install. Yes, the configuration I posted was there from the very start.
- I do not have apcupsd installed as shown above.
-
@kevindd992002: not related, but you missed at least 2 versions of FreeRADIUS.
-
@Gertjan said in NUT package:
@kevindd992002: not related, but you missed at least 2 versions of FreeRADIUS.
Yes, I noticed and I just updated both my boxes now :) Is there a way to auto-update these things? Or is auto-updating not recommended?
-
@kevindd992002 said in NUT package:
auto-updating not recommended
Nope.
The system might reboot to finish an update.
And you were just about to shut down that nuclear power plant that was reaching dangerous levels... pfSense is not so solid that it can auto-update and take care of itself like, for example... euh... Windows 10 :)
Put an "installed packages" widget on the dashboard, and visit the GUI once in a while. That will do.
Same thing for pfSense updates. Sorry for going off-subject.
-
@Gertjan said in NUT package:
@kevindd992002 said in NUT package:
auto-updating not recommended
Nope.
The system might reboot to finish an update.
And you were just about to shut down that nuclear power plant that was reaching dangerous levels... pfSense is not so solid that it can auto-update and take care of itself like, for example... euh... Windows 10 :)
Put an "installed packages" widget on the dashboard, and visit the GUI once in a while. That will do.
Same thing for pfSense updates. Sorry for going off-subject.
Why didn't I think of that, lol! That's not a bad idea. Thanks!
-
@kevindd992002 said in NUT package:
-
It looks like I receive the message after reboot but not all the time. It happens randomly and it happens for both my pfsense boxes which have the same exact UPS, with the same exact pfsense version and set of settings.
-
System logs around the time of the latest issue occurrence:
It looks like it shut itself down and then went back up.
Ah... now I get it...
There should be a bit more in the system log around that time. In particular, I would expect the following lines:
/rc.start_packages: Starting service nut
/rc.start_packages: Restarting/Starting all packages.
I expect that the packages are being restarted in response to a DHCP address event on one of your wan interfaces. This is normal and can be ignored. Sorry for causing you alarm.
It's a bit unfortunate, but normal because pfSense does not know what packages require restart when an IP address changes, so it restarts them all. See /etc/rc.newwanip (and newwanipv6) for detail.
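For anyone who wants to confirm this on their own box, here is a rough Python sketch that checks whether each "parent process died" entry in a saved copy of the system log sits next to an rc.start_packages restart. The function names, the 10-second window, and the placeholder year are my own assumptions, and it only understands entries laid out like the ones posted in this thread (`May 5 17:48:06 <process> <pid> <message>`).

```python
import re
from datetime import datetime, timedelta

# Assumed entry layout: "May 5 17:48:06 <process> <pid> <message>"
ENTRY = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+(\d+)\s+(.*)$")


def parse(line, year=2000):
    """Return (timestamp, process, message), or None if the line does not match.

    Syslog entries carry no year, so `year` is only a placeholder.
    """
    m = ENTRY.match(line.strip())
    if not m:
        return None
    stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return stamp, m.group(2), m.group(4)


def explained_by_package_restart(lines, window_seconds=10):
    """For each upsmon 'parent process died' entry, report whether
    rc.start_packages logged something within window_seconds of it."""
    entries = [e for e in map(parse, lines) if e]
    restarts = [t for t, _, msg in entries if "/rc.start_packages:" in msg]
    window = timedelta(seconds=window_seconds)
    results = []
    for stamp, process, message in entries:
        if process == "upsmon" and "parent process died" in message:
            near = any(abs(stamp - r) <= window for r in restarts)
            results.append((stamp, near))
    return results
```

Feed it the lines of an exported log; every `(timestamp, True)` pair is an upsmon message that coincides with a package restart and can most likely be ignored, while a `False` would deserve a closer look.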
-
-
@dennypage said in NUT package:
@kevindd992002 said in NUT package:
-
It looks like I receive the message after reboot but not all the time. It happens randomly and it happens for both my pfsense boxes which have the same exact UPS, with the same exact pfsense version and set of settings.
-
System logs around the time of the latest issue occurrence:
It looks like it shut itself down and then went back up.
Ah... now I get it...
There should be a bit more in the system log around that time. In particular, I would expect the following lines:
/rc.start_packages: Starting service nut
/rc.start_packages: Restarting/Starting all packages.
I expect that the packages are being restarted in response to a DHCP address event on one of your wan interfaces. This is normal and can be ignored. Sorry for causing you alarm.
It's a bit unfortunate, but normal because pfSense does not know what packages require restart when an IP address changes, so it restarts them all. See /etc/rc.newwanip (and newwanipv6) for detail.
Bingo!
May 5 17:47:40 php-fpm 379 /rc.start_packages: Starting service nut
May 5 17:48:01 php-fpm 379 /rc.start_packages: Starting service nut
May 5 17:48:06 php-fpm 83711 /rc.start_packages: Starting service nut
May 5 17:48:13 php-fpm 22562 /rc.start_packages: Starting service nut
May 5 17:48:48 php-fpm 379 /rc.start_packages: Starting service nut
May 7 17:33:27 php-fpm 379 /rc.start_packages: Starting service nut

May 5 17:47:39 php-fpm 379 /rc.start_packages: Restarting/Starting all packages.
May 5 17:48:00 php-fpm 379 /rc.start_packages: Restarting/Starting all packages.
May 5 17:48:06 php-fpm 83711 /rc.start_packages: Restarting/Starting all packages.
May 5 17:48:13 php-fpm 22562 /rc.start_packages: Restarting/Starting all packages.
May 5 17:48:48 php-fpm 379 /rc.start_packages: Restarting/Starting all packages.
May 7 17:33:27 php-fpm 379 /rc.start_packages: Restarting/Starting all packages.
The weird thing is that my WAN interface has a static IP. I have an OpenVPN interface though, does that count? Although I know that this error started happening even before configuring OpenVPN.
-
-
@dennypage said in NUT package:
@kevindd992002 Slightly concerning is that it does not show any self test result...
Batteries are generally rated for 3 years, so you may want to have a look at your battery health. You can initiate a battery test via nut, but you will have to use the command line. Log into the system and use
upscmd -l ups
to see what commands are available. Look for commands that begin "test.battery..." Start with a quick test if available, then proceed to a deep test.
WARNING: if the battery is defective, running these tests can cause the UPS to cut power to the load (your pfSense box). Use at your own risk!
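If the command list is long, a small Python helper can pick out the battery test commands. This is only a sketch: the function name is made up, and it assumes `upscmd -l` prints one `name - description` pair per line, which may differ between NUT versions and drivers.

```python
def battery_test_commands(upscmd_listing):
    """Return the command names starting with 'test.battery' found in the
    text printed by `upscmd -l <ups>` (assumed 'name - description' lines)."""
    names = []
    for line in upscmd_listing.splitlines():
        # Keep only the part before the first " - " separator.
        name = line.split(" - ", 1)[0].strip()
        if name.startswith("test.battery"):
            names.append(name)
    return names
```

The listing itself could be captured with something like `subprocess.run(["upscmd", "-l", "ups"], capture_output=True, text=True).stdout`. The same warning applies to whichever test command you then decide to run.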
Doing a quick test yielded a "done and passed" result for the last test result field. However, when I tried doing a deep test, the CLI returned an OK message but the UPS did nothing. Does that mean that the command is not supported by my UPS? I'm still receiving low battery message notifications in my email regarding this UPS.
-
@kevindd992002 said in NUT package:
The weird thing is that my WAN interface has a static IP. I have an OpenVPN interface though, does that count? Although I know that this error started happening even before configuring OpenVPN.
Do you have IPv6 perhaps?
-
@kevindd992002 said in NUT package:
Doing a quick test yielded a "done and passed" result for the last test result field. However, when I tried doing a deep test, the CLI returned an OK message but the UPS did nothing. Does that mean that the command is not supported by my UPS? I'm still receiving low battery message notifications in my email regarding this UPS.
Have to try it with your other UPS, but my guess would be that the UPS is declining to initiate the deep test (which usually includes a battery calibration) because it has a low battery condition. Usually a deep test will not initiate on anything other than a completely full battery.
Best suggestion that I could make is to completely power off the UPS and restart it from the ground up. If that doesn't fix it, I would say that it's likely that the UPS or battery is defective.
-
Nope, no IPv6. I made sure to disable it wherever I can.
-
@kevindd992002 I think rc.newwanip* can be invoked in several circumstances. Look in the system log for messages immediately preceding the Restarting/Starting all packages message.
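If you have exported the log to a file, a few lines of Python can pull out that context for you. This is just a sketch; the function name and the default of five preceding lines are arbitrary choices of mine.

```python
from collections import deque


def lines_before(lines, marker="Restarting/Starting all packages", count=5):
    """Yield (preceding_lines, matching_line) for every line that contains
    marker, where preceding_lines holds up to `count` earlier lines."""
    recent = deque(maxlen=count)
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            yield list(recent), line
        recent.append(line)
```

Pass it an open file handle or a list of lines. Whatever shows up just ahead of each match (for example an rc.newwanip entry naming an interface) is the likely trigger.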
-
@dennypage said in NUT package:
@kevindd992002 I think rc.newwanip* can be invoked in several circumstances. Look in the system log for messages immediately preceding the Restarting/Starting all packages message.
May 10 13:08:43 php-fpm 14757 /rc.newwanip: Resyncing OpenVPN instances for interface CCTV.
May 10 13:08:43 php-fpm 14757 /rc.newwanip: Creating rrd update script
May 10 13:08:43 upsd 45275 User local-monitor@::1 logged into UPS [ups]
May 10 13:08:43 php-fpm 378 /rc.start_packages: Restarting/Starting all packages.
May 10 13:08:43 radiusd 48725 Signalled to terminate
May 10 13:08:43 radiusd 48725 Exiting normally
May 10 13:08:43 php-fpm 378 /rc.start_packages: Stopping service nut
May 10 13:08:43 upsmon 72305 Signal 15: exiting
May 10 13:08:43 upsd 45275 User local-monitor@::1 logged out from UPS [ups]
May 10 13:08:43 upsd 45275 mainloop: Interrupted system call
May 10 13:08:43 upsd 45275 Signal 15: exiting
May 10 13:08:43 usbhid-ups 24821 Signal 15: exiting
May 10 13:08:43 php-fpm 378 /rc.start_packages: Starting service nut
May 10 13:08:43 upsmon 29518 Startup successful
May 10 13:08:43 php-fpm 40655 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.168.40.1 - Restarting packages.
May 10 13:08:43 check_reload_status Starting packages
You're right. What's the best approach to solve this though? It looks like even my OPT interfaces are interpreted as WANs. All my interfaces (WAN and LAN) have static IPs, so I'm not sure why there is an IP change event there.
-
@kevindd992002 I don't believe you can prevent this from occurring. Problem is, pfSense doesn't know what each package is doing with the various interfaces or if they support dynamic discovery of interface changes (most don't). The only way for pfSense to ensure everything is functioning correctly is to restart the packages.
-
I'm seeing this in my logs
php-fpm 40655 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 82.127.34.254 -> 82.127.34.254 - Restarting packages.
I was asking myself: what changed? Why was this line triggered?
For me, "82.127.34.254" is identical to "82.127.34.254". My WAN is using an RFC 1918 IP (DHCP using an upstream router) - the real WAN IP - 82.127.34.254 - doesn't change, but still, packages get restarted.
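To spot these "no real change" restarts without reading every entry, the old and new addresses can be pulled out of the message and compared. A minimal Python sketch follows; the function name is invented, and the pattern assumes the message wording stays exactly as quoted above.

```python
import re

# Matches the rc.newwanip message text quoted in this thread.
CHANGE = re.compile(
    r"IP change or dynamic WAN reconnection - (\S+) -> (\S+) - Restarting packages"
)


def address_change(message):
    """Return (old, new, changed) parsed from an rc.newwanip restart message,
    or None if the message does not have the expected wording."""
    m = CHANGE.search(message)
    if not m:
        return None
    old, new = m.group(1), m.group(2)
    return old, new, old != new
```

A result whose last element is `False` is a restart where the address stayed the same, so the trigger was an interface event rather than an actual IP change.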
For a long time I really thought this was completely unnecessary, but now I know that processes like unbound or openvpn don't like it when an interface to which they are bound goes up and down, even when the IP on that interface stays the same.
-
@dennypage said in NUT package:
@kevindd992002 I don't believe you can prevent this from occurring. Problem is, pfSense doesn't know what each package is doing with the various interfaces or if they support dynamic discovery of interface changes (most don't). The only way for pfSense to ensure everything is functioning correctly is to restart the packages.
I see. Do you happen to know why the interface IP changes for no reason, in the first place?
-
@kevindd992002 In your case it looks like a change of state for OpenVPN interfaces. I'm sorry I don't know much beyond this. If you want to explore more, I'd suggest asking in the General pfSense Questions or OpenVPN groups.