SG1100 memory starvation - Unbound not restarting
-
I may have found the starting event to this issue. I see that the ports on the firewall went down and thereby initiated a bunch of process restarts...I'm thinking that's what killed Unbound. OOM killer got to it.
cat system.log | grep check_reload_status Oct 6 17:55:38 790CCV-FW check_reload_status[666]: Syncing firewall Oct 6 17:56:02 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 10 22:33:14 nyc-fw1-inet check_reload_status[666]: Syncing firewall Oct 11 20:35:30 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port3 Oct 11 20:35:34 nyc-fw1-inet check_reload_status[666]: updating dyndns WAN_DHCP Oct 11 20:35:34 nyc-fw1-inet check_reload_status[666]: Restarting IPsec tunnels Oct 11 20:35:34 nyc-fw1-inet check_reload_status[666]: Restarting OpenVPN tunnels/interfaces Oct 11 20:35:34 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:35:36 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port3 Oct 11 20:35:37 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port2 Oct 11 20:35:39 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:35:40 nyc-fw1-inet check_reload_status[666]: rc.newwanip starting mvneta0.4090 Oct 11 20:35:40 nyc-fw1-inet check_reload_status[666]: Restarting IPsec tunnels Oct 11 20:35:40 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port2 Oct 11 20:35:50 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:35:54 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:35:57 nyc-fw1-inet check_reload_status[666]: updating dyndns wan Oct 11 20:35:58 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:36:06 nyc-fw1-inet check_reload_status[666]: Restarting IPsec tunnels Oct 11 20:36:13 nyc-fw1-inet check_reload_status[666]: updating dyndns lan Oct 11 20:36:13 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:36:13 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:36:16 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:36:22 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:36:30 nyc-fw1-inet check_reload_status[666]: Starting packages Oct 11 20:36:31 nyc-fw1-inet check_reload_status[666]: Reloading filter Oct 11 20:36:32 nyc-fw1-inet check_reload_status[666]: Syncing firewall Oct 11 20:36:33 nyc-fw1-inet check_reload_status[666]: Syncing firewall Oct 11 20:56:36 nyc-fw1-inet check_reload_status[666]: Syncing firewall
EDIT: Stopping pfBlocker completely brought Unbound back from the dead.
-
Which ports are these on the SG1100 ??
./system.log:Oct 11 20:35:30 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port3 ./system.log:Oct 11 20:35:30 nyc-fw1-inet kernel: e6000sw0port3: link state changed to DOWN ./system.log:Oct 11 20:35:36 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port3 ./system.log:Oct 11 20:35:37 nyc-fw1-inet kernel: e6000sw0port3: link state changed to UP ./system.log:Oct 11 20:35:37 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port2 ./system.log:Oct 11 20:35:37 nyc-fw1-inet kernel: e6000sw0port2: link state changed to DOWN ./system.log:Oct 11 20:35:40 nyc-fw1-inet kernel: e6000sw0port2: link state changed to UP ./system.log:Oct 11 20:35:40 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port2
-
@michmoor I don’t see that in the docs. Is there an ISP modem that booted? pfSense would see that as a disconnect.
-
@SteveITS no ISP modem that booted. Two link down events from what appears to be two ports but nothing on the WAN or LAN side was restarted. Maybe these are internal ports as the 1100 has a switch? I dunno
-
Port3 is WAN port2 is LAN. By default at least.
-
So based on my syslog, both WAN and LAN went down?
-
Yes, that's what's logged there. Since it's actually the switch driver reporting that it pretty much has to have happened.
-
@stephenw10 hmm than i suspect something is wrong with the unit.
One port directly connects to a cable modem.
The other port directly connects to a switch.
They both go down at the same time?Is there anything else I can check to monitor hardware health?
-
I assume both those things didn't reboot?
It could have been pfSense driving the switch ports to re-link of course. A change to the internal switch config perhaps.
-
@stephenw10
The modem , pfsense and switch are on the same PDU. So it wasn't a loss of power. By some sort of freak act there could be a fault on two outlets but highly unlikely.
No configuration changes have taken place on this unit since 10/6 according to Configuration History.
Im back to thinking its perhaps a faulty unit. Replacement here would be a real PITA. Hoping there are some other diagnostics i can perform. -
Is there anything logged running up to that? Some interface change etc?
-
This is the only thing i see prior
Oct 11 20:16:00 nyc-fw1-inet sshguard[69504]: Exiting on signal. Oct 11 20:16:00 nyc-fw1-inet sshguard[77474]: Now monitoring attacks. Oct 11 20:35:30 nyc-fw1-inet check_reload_status[666]: Linkup starting $e6000sw0port3 Oct 11 20:35:30 nyc-fw1-inet kernel: e6000sw0port3: link state changed to DOWN Oct 11 20:35:31 nyc-fw1-inet php-fpm[20159]: /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp) Oct 11 20:35:31 nyc-fw1-inet php-fpm[20159]: /rc.linkup: DEVD Ethernet detached event for wan Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down Oct 11 20:35:33 nyc-fw1-inet syslogd: sendto: Network is down
I do see that the LAN side had a Hotplug event as well. Looking at the timestamps the LAN side event happened more or less at the same time as the WAN side.
To me this indicates either- As part of any link-status event, pfSense restarts the internal switch ports
- There was some weird failure on both LAN and WAN side which i honestly don't see happening.
- Other cause not yet known.
][admin@nyc-fw1-inet.moore.lan]/var/log: cat system.log | grep "Hotplug" Oct 11 20:35:31 nyc-fw1-inet php-fpm[20159]: /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp) Oct 11 20:35:38 nyc-fw1-inet php-fpm[55571]: /rc.linkup: Hotplug event detected for LAN(lan) dynamic IP address (4: 192.168.70.254, 6: track6) Oct 11 20:35:39 nyc-fw1-inet php-fpm[17116]: /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp) Oct 11 20:35:54 nyc-fw1-inet php-fpm[20159]: /rc.linkup: Hotplug event detected for LAN(lan) dynamic IP address (4: 192.168.70.254, 6: track6)