net-snmpd stops responding after a certain amount of time
-
Hi,
After a certain time (1-2 weeks), my snmpd daemon (0.1.5_2 net-snmp-5.7.3_18) is no longer reachable. The GUI shows it as started, but it no longer responds to requests.
After restarting the daemon via the GUI, the service is still unavailable. On the CLI, the daemon appears to be fully loading a CPU core:
[2.4.4-RELEASE][admin@X-FW01A]/usr/local: ps auwwx | grep snmp
root 20340 100.0 0.2 31404 26640 - R 09:18 1:06.54 /usr/local/sbin/snmpd -LF 0-4 d -p /var/run/net_snmpd.pid -M /usr/share/snmp/mibs/:/usr/local/share/snmp/mibs -C -c /var/etc/netsnmpd.conf,/var/etc/netsnmpd-users.conf
After rebooting the whole pfSense (2.4.4-RELEASE-p2) instance, the daemon starts correctly, is reachable, and the CPU load looks fine:
[2.4.4-RELEASE][admin@X-FW01A]/root: ps auwwx | grep snmp
root 59265 0.0 0.1 21164 14184 - S 09:32 0:00.03 /usr/local/sbin/snmpd -LF 0-4 d -p /var/run/net_snmpd.pid -M /usr/share/snmp/mibs/:/usr/local/share/snmp/mibs -C -c /var/etc/netsnmpd.conf,/var/etc/netsnmpd-users.conf
root 3263 0.0 0.0 6564 2460 0 S+ 09:32 0:00.00 grep snmp
[2.4.4-RELEASE][admin@X-FW01A]/root:
When I try to debug the cause, the program doesn't really seem to start. I suspect I am not calling it the right way:
[2.4.4-RELEASE][admin@X-FW01A]/usr/local: /usr/local/sbin/snmpd -f -D all -Le -M /usr/share/snmp/mibs/:/usr/local/share/snmp/mibs -C -c /var/etc/netsnmpd.conf
trace: main(): snmpd.c, 862: snmpd/main: optind 3, argc 10
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all"
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all,-Le"
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all,-Le,-M"
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all,-Le,-M,/usr/share/snmp/mibs/:/usr/local/share/snmp/mibs"
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all,-Le,-M,/usr/share/snmp/mibs/:/usr/local/share/snmp/mibs,-C"
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all,-Le,-M,/usr/share/snmp/mibs/:/usr/local/share/snmp/mibs,-C,-c"
trace: netsnmp_ds_set_string(): default_store.c, 294: netsnmp_ds_set_string: Setting APP:2 = "all,-Le,-M,/usr/share/snmp/mibs/:/usr/local/share/snmp/mibs,-C,-c,/var/etc/netsnmpd.conf"
trace: main(): snmpd.c, 883: snmpd/main: port spec: all,-Le,-M,/usr/share/snmp/mibs/:/usr/local/share/snmp/mibs,-C,-c,/var/etc/netsnmpd.conf
logging:register: registering log type 3 with pri 7
Log handling defined - disabling stderr
[2.4.4-RELEASE][admin@X-FW01A]/usr/local:
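One possible reading of that trace: the "port spec: all,-Le,-M,..." line suggests option parsing stopped at "all" (optind 3) and everything after it was taken as listening addresses, so -Le, -M, -C and -c never took effect. Assuming net-snmp expects debug tokens attached directly to the flag (-DALL rather than -D all), a foreground debug run might look like the sketch below. The names debug_snmpd and SNMPD_BIN are hypothetical and introduced only for illustration; the paths are the ones from the ps output above.

```shell
# Sketch only: assumes -D takes its token list attached (-DALL), so the
# remaining options are parsed as options and not as listening addresses.
# SNMPD_BIN and debug_snmpd are hypothetical names, not part of pfSense.
SNMPD_BIN="${SNMPD_BIN:-/usr/local/sbin/snmpd}"

debug_snmpd() {
    # -f: stay in the foreground; -Le: log to stderr
    "$SNMPD_BIN" -f -DALL -Le \
        -M /usr/share/snmp/mibs/:/usr/local/share/snmp/mibs \
        -C -c /var/etc/netsnmpd.conf,/var/etc/netsnmpd-users.conf
}
```

Stopping the service in the GUI before calling debug_snmpd should avoid two instances competing for the same listening address.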
The config files:
[2.4.4-RELEASE][admin@X-FW01A]/root: cat /var/etc/netsnmpd.conf
agentaddress udp:x.x.x.x:
engineIDType 1
[snmp] tsmUseTransportPrefix no
sysLocation xxx
sysContact hostmaster@xxx
sysName x-fw01a
sysDescr pfSense x-fw01a
interface_fadeout 300
interface_replace_old no
ignoreDisk /dev
ignoreDisk /var/dhcpd/dev
includeAllDisks 20%
rwuser -s usm "xxx" noauth
rocommunity xxx
iquerySecName "xxx"
agentSecName "xxx"
master agentx
[2.4.4-RELEASE][admin@X-FW01A]/root: cat /var/etc/netsnmpd-users.conf
createUser "xxx" SHA "xxx" AES "xxx"
[2.4.4-RELEASE][admin@X-FW01A]/root:
Does anyone have suggestions for further debugging?
Cheers,
Helge
-
In that first bit of output, snmpd is using 100% CPU. So something has it really stuck chewing through CPU time.
You might be able to use truss to attach to the process and see what it's doing. For example, in that output the PID is 20340, so you'd run truss -fp 20340 and capture some of that output. Since it's stuck in a loop, it will probably dump a lot of output very quickly. You can usually press Ctrl-C to break out of that.
-
Nice. I will give it a try the next time it appears.
-
Hi @jimp, you were right: it's some kind of loop. This output is generated roughly once per second, with different memory addresses:
truss-snmpd-loop.txt
It's the same behavior directly after restarting snmpd. But after rebooting it looks like this:
truss-snmpd-after-reboot.txt
and it repeats when there is no active snmpget.
Unfortunately, I can do almost nothing with it :-(
Cheers
Helge
-
Hmm, nothing really noteworthy in there. Is there a client polling it every second? That might explain why it's repeating that often.
I'd expect things to be repeating a heck of a lot more than once per second if it's consuming 100% CPU.
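One way to check how fast the process is really looping is to count the system calls in a capture of known length. This is a rough sketch, assuming the capture was saved to a file and that `truss -f` prefixes each line with "PID:" followed by the call name and an opening parenthesis. The helper name syscall_counts is hypothetical.

```shell
# Count how often each system call appears in a saved truss capture.
# Reads the named files, or stdin if none are given.
# syscall_counts is a hypothetical helper name.
syscall_counts() {
    awk '
        {
            line = $0
            # drop the "PID:" prefix that truss -f is assumed to add
            sub(/^[ \t]*[0-9]+:[ \t]*/, "", line)
            # count only lines that start with something like "name("
            if (match(line, /^[A-Za-z_][A-Za-z0-9_]*\(/)) {
                count[substr(line, 1, RLENGTH - 1)]++
            }
        }
        END { for (name in count) printf "%d %s\n", count[name], name }
    ' "$@" | sort -k1,1nr -k2,2
}
```

Dividing the largest counts by the capture duration in seconds gives a calls-per-second figure. A handful per second would fit a normal idle timer loop, while a true busy loop would be expected to show far more.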
-
@jimp actually, there is absolutely no SNMP traffic while truss -fp shows this repeating output. I watched it with tcpdump -i vmx4 port 161, where vmx4 is the management interface.
Strange.
-
I went back to bsnmpd, although it has no IPv6 support.