node_exporter is not working properly on 23.05
-
Hello, we have a fleet of pfSense+ 22.05 Netgate 7100 devices, and recently had to install a new one that runs 23.05. This new device cannot run node_exporter without a ton of errors.
I installed node_exporter through the pfSense built-in package manager, and on this new device I am having the identical issue to this other user. All the other hosts in the fleet have no issues with node_exporter, just this new one.
Essentially the same error message as the other user:
msg="collector failed" name=uname err="cannot allocate memory"
Also similar to that user, there is no issue with free memory: this is a 7100 device and has over 5GB free. I have the following collectors enabled; note that the uname collector isn't even available in the list:
To note: I've previously had to use "extra flags" to disable certain collectors that do not show up in the collector list (ex: --no-collector.zfs). Why do only some of the collectors show up in that list? I am guessing I could use the same tactic for uname, but I didn't have to do that on 22.05, and the uname collector should work fine on BSD.
-
Just to confirm though, does disabling uname using extra flags prevent the errors?
-
Having same issue.
Running 23.05-RELEASE on an Intel N5105 Celeron machine.
For me, disabling uname with the extra flags stops the errors.
-
Ok I replicated that here and opened a bug: https://redmine.pfsense.org/issues/14452
-
Just updated to pfSense CE 2.7.0 and am having the same issue (I wasn't on 2.6.0).
You can see in the logs when the error started to occur. This is running on a PCEngines APU2.
Below is from the web scrape page.
# TYPE node_scrape_collector_success gauge
node_scrape_collector_success{collector="boottime"} 1
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="exec"} 1
node_scrape_collector_success{collector="filesystem"} 1
node_scrape_collector_success{collector="loadavg"} 1
node_scrape_collector_success{collector="meminfo"} 1
node_scrape_collector_success{collector="netdev"} 1
node_scrape_collector_success{collector="os"} 0
node_scrape_collector_success{collector="textfile"} 1
node_scrape_collector_success{collector="time"} 1
node_scrape_collector_success{collector="uname"} 0
node_scrape_collector_success{collector="zfs"} 0
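For anyone who wants to check this on their own box, here is a rough Python sketch that pulls the failing collectors out of scrape output like the above and builds a matching "extra flags" string. The function names are made up for illustration, and the flags simply follow the --no-collector.NAME pattern used in this thread. Whether it is a good idea to disable every failing collector (os, for example) is an assumption you should check for your own setup.

```python
import re

# Matches entries such as: node_scrape_collector_success{collector="uname"} 0
# Works whether the metrics are one per line or flattened onto a single line.
METRIC_RE = re.compile(
    r'node_scrape_collector_success\{collector="([^"]+)"\}\s+(\d+(?:\.\d+)?)'
)


def failed_collectors(scrape_text):
    """Return the sorted names of collectors whose success gauge is 0."""
    return sorted(
        name
        for name, value in METRIC_RE.findall(scrape_text)
        if float(value) == 0
    )


def disable_flags(collectors):
    """Build an "extra flags" string that disables the given collectors."""
    return " ".join(f"--no-collector.{name}" for name in collectors)


# Shortened sample taken from the scrape output in this thread.
sample = """
# TYPE node_scrape_collector_success gauge
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="os"} 0
node_scrape_collector_success{collector="uname"} 0
node_scrape_collector_success{collector="zfs"} 0
"""

failed = failed_collectors(sample)
print(failed)
print(disable_flags(failed))
```

You would paste the resulting string into the package's extra flags field, after trimming it to the collectors you actually want off.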
-
The error is related to uname, not meminfo.
Adding "--no-collector.uname" in extra flags has stopped the errors. I have not seen any impact on data gathering, and no Grafana dashboards are affected.