Router running very slow and hot.

Stewart

I'm trying to figure out why this router is running so poorly. Below, you'll see that the CPU is churning and the load is far too high. The only services I have running are apinger, bsnmpd, cron, dhcpd, dnsmasq, and ntpd.

last pid: 40609;  load averages:  4.10,  4.65,  4.21 
173 processes: 7 running, 139 sleeping, 9 zombie, 18 waiting
CPU:  3.2% user, 28.5% nice, 67.7% system,  0.6% interrupt,  0.0% idle
Mem: 491M Active, 314M Inact, 540M Wired, 468K Cache, 209M Buf, 589M Free
Swap:

  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
  290 root          124   20  6908K  1380K CPU1    1   9:58 39.45% /usr/local/sbin/check_reload_status
82421 root          122   20   157M 44620K RUN     0   0:00 27.29% /usr/local/bin/php -f /etc/rc.filter_configure_sync
   18 root          171 ki-6     0K    16K RUN     1  59:56  9.86% [idlepoll]
41945 root           44    0   149M 47160K accept  1   0:23  0.39% /usr/local/bin/php{php}
36696 root           44    0   154M 46768K accept  0   0:23  0.39% /usr/local/bin/php{php}
   12 root          -32    -     0K   288K WAIT    0   0:22  0.20% [intr{swi4: clock}]
   11 root          171 ki31     0K    32K RUN     0  22:52  0.00% [idle{idle: cpu0}]
   11 root          171 ki31     0K    32K RUN     1   8:07  0.00% [idle{idle: cpu1}]
    0 root          -16    0     0K   112K sched   0   1:01  0.00% [kernel{swapper}]

It may have something to do with the following in the logs:

Sep 23 14:43:02 	check_reload_status: updating dyndns WANGW
Sep 23 14:43:02 	check_reload_status: Restarting ipsec tunnels
Sep 23 14:43:02 	check_reload_status: Restarting OpenVPN tunnels/interfaces
Sep 23 14:43:02 	check_reload_status: Reloading filter
Sep 23 14:43:13 	check_reload_status: updating dyndns WANGW
Sep 23 14:43:13 	check_reload_status: Restarting ipsec tunnels
Sep 23 14:43:13 	check_reload_status: Restarting OpenVPN tunnels/interfaces
Sep 23 14:43:13 	check_reload_status: Reloading filter
Sep 23 14:43:17 	check_reload_status: updating dyndns WANGW
Sep 23 14:43:18 	check_reload_status: Restarting ipsec tunnels
Sep 23 14:43:18 	check_reload_status: Restarting OpenVPN tunnels/interfaces
Sep 23 14:43:18 	check_reload_status: Reloading filter
Sep 23 14:43:47 	check_reload_status: updating dyndns WANGW
Sep 23 14:43:47 	check_reload_status: Restarting ipsec tunnels
Sep 23 14:43:47 	check_reload_status: Restarting OpenVPN tunnels/interfaces
Sep 23 14:43:47 	check_reload_status: Reloading filter
Sep 23 14:44:07 	check_reload_status: updating dyndns WANGW
Sep 23 14:44:07 	check_reload_status: Restarting ipsec tunnels
Sep 23 14:44:07 	check_reload_status: Restarting OpenVPN tunnels/interfaces
Sep 23 14:44:07 	check_reload_status: Reloading filter
Sep 23 14:44:16 	check_reload_status: updating dyndns WANGW
Sep 23 14:44:16 	check_reload_status: Restarting ipsec tunnels
Sep 23 14:44:16 	check_reload_status: Restarting OpenVPN tunnels/interfaces
Sep 23 14:44:16 	check_reload_status: Reloading filter
Sep 23 14:44:23 	check_reload_status: updating dyndns WANGW
Sep 23 14:44:23 	check_reload_status: Restarting OpenVPN tunnels/interfaces
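
To put a number on how often the reload cycle fires, the log can be reduced to the gaps between consecutive "Reloading filter" entries. This is only a rough sketch: the function name reload_gaps and the regular expression are my own, it assumes the syslog layout shown above, and it ignores logs that cross midnight.

```python
import re
from datetime import datetime

# Matches lines like: "Sep 23 14:43:02 <tab>check_reload_status: Reloading filter"
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2})\s+(\S+?):\s+(.*)$")


def reload_gaps(log_text, message="Reloading filter"):
    """Return the seconds between consecutive log entries equal to `message`.

    Hypothetical helper: only the time of day is parsed, so a log that
    spans midnight would give a negative gap.
    """
    times = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(3).strip() == message:
            times.append(datetime.strptime(m.group(1), "%H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# Three entries copied from the log above.
sample = """\
Sep 23 14:43:02 \tcheck_reload_status: Reloading filter
Sep 23 14:43:13 \tcheck_reload_status: Reloading filter
Sep 23 14:43:18 \tcheck_reload_status: Reloading filter
"""
print(reload_gaps(sample))  # [11, 5]
```

Gaps of a few seconds mean the filter is being reloaded almost continuously, which would fit the load shown in top.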

There aren't any ipsec or OpenVPN tunnels. ipsec is turned off. My config is:

Version 	2.1.4-RELEASE (amd64)
built on Fri Jun 20 15:48:47 EDT 2014
FreeBSD 8.3-RELEASE-p16

Platform 	nanobsd (4g)
NanoBSD Boot Slice 	pfsense0 / da0s1 (rw)

CPU Type 	AMD G-T40E Processor
Current: 125 MHz, Max: 1000 MHz
2 CPUs: 1 package(s) x 2 core(s)

I would appreciate any help.

muswellhillbilly

Obviously there is an OpenVPN session attempting to run, judging from the logs you've posted. Do you have anything configured in the OpenVPN section of the admin console?

Stewart

OpenVPN is empty. There's nothing in there. So, I checked the Gateway Logs and see this:

Sep 23 14:44:06 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
Sep 23 14:49:13 	apinger: ALARM: WANGW(x.x.x.x) *** loss ***
Sep 23 14:49:42 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
Sep 23 14:50:02 	apinger: ALARM: WANGW(x.x.x.x) *** delay ***
Sep 23 14:50:11 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
Sep 23 14:50:17 	apinger: ALARM: WANGW(x.x.x.x) *** delay ***
Sep 23 14:50:59 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
Sep 23 14:51:14 	apinger: ALARM: WANGW(x.x.x.x) *** delay ***

over and over and over again. So I decided to ping the gateway, which sits right next to the router. Over 1200 pings, the round-trip time ranged from a minimum of 0.343 ms (not bad) to a maximum of 33339.712 ms (not great), with an average of 3416.742 ms. Would that be causing it?
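
For anyone repeating this test, a saved ping run can be summarized with a few lines of Python. This is a sketch under assumptions: summarize_rtts is a made-up helper, the 100 ms "slow" threshold is arbitrary, the address 192.0.2.1 is a documentation placeholder, and the sample replies are invented apart from the minimum and maximum quoted above.

```python
import re

# ping prints one "time=<value> ms" field per reply.
TIME_RE = re.compile(r"time=([\d.]+)\s*ms")


def summarize_rtts(ping_output, slow_ms=100.0):
    """Summarize round-trip times (in ms) found in saved ping output.

    Hypothetical helper; `slow_ms` is an arbitrary cut-off for a reply
    that is too slow for a directly attached gateway.
    """
    rtts = [float(m.group(1)) for m in TIME_RE.finditer(ping_output)]
    if not rtts:
        return None
    return {
        "count": len(rtts),
        "min": min(rtts),
        "avg": sum(rtts) / len(rtts),
        "max": max(rtts),
        "slow": sum(1 for r in rtts if r > slow_ms),
    }


# Invented sample replies, for illustration only.
sample = """\
64 bytes from 192.0.2.1: icmp_seq=0 ttl=64 time=0.343 ms
64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=0.512 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=33339.712 ms
"""
print(summarize_rtts(sample))
```

The count of slow replies is often more telling than the average, because one 33-second reply drags the mean up even if most replies are fine.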

doktornotor

If you get 33 seconds to a nexthop a meter away from your router, there's something extremely wrong with your network.

Stewart

Yup. Just trying to figure out if it's a port on the router, the modem, or the cable.

Stewart

So, turning off polling on the NICs and rebooting seemed to make everything normal, except that when I took a look this morning I saw:

23235 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re2 -nWb -f link
27310 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re2 -nWb -f link
30121 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re1 -nWb -f link
24314 root           97    0 10012K  1512K RUN     0   0:01  6.88% netstat -I re2 -nWb -f link
25948 root           76    0 10012K  1512K RUN     1   0:01  6.79% netstat -I re2 -nWb -f link
32639 root           76    0 10012K  1512K RUN     0   0:01  6.15% netstat -I re2 -nWb -f link
28704 root           76    0 10012K  1512K RUN     0   0:01  5.96% netstat -I re2 -nWb -f link
31140 root           97    0 10012K  1512K RUN     0   0:01  5.76% netstat -I re2 -nWb -f link
40091 root           76    0 10012K  1512K RUN     1   0:01  4.88% netstat -I re1 -nWb -f link
42659 root           76    0 10012K  1512K RUN     1   0:01  4.69% netstat -I re2 -nWb -f link
49808 root           76    0 10012K  1512K RUN     0   0:01  4.05% netstat -I re1 -nWb -f link
77245 root           76    0   145M 32740K accept  1   0:10  3.66% /usr/local/bin/php
51647 root           76    0 10012K  1512K RUN     1   0:00  3.27% netstat -I re1 -nWb -f link
52801 root           76    0 10012K  1512K RUN     1   0:00  3.27% netstat -I re2 -nWb -f link
69619 root           63    0   147M 33036K accept  1   0:16  0.29% /usr/local/bin/php
   11 root          171 ki31     0K    32K RUN     0 115:43  0.00% [idle{idle: cpu0}]
   11 root          171 ki31     0K    32K RUN     1  70:03  0.00% [idle{idle: cpu1}]
   12 root          -32    -     0K   288K WAIT    0   4:47  0.00% [intr{swi4: clock}]

Is there any way to figure out why there are so many instances of netstat running?

Edit - I wanted to add that they seem to keep exiting and respawning. Each time top refreshes, it shows a new set of PIDs.
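
One way to see what keeps spawning short-lived processes like these is to capture the parent PID of each one and group by it. The sketch below assumes a snapshot taken with `ps -axo pid,ppid,command`; the helper name parents_of is made up, and the PIDs and parent command in the sample are invented for illustration, not taken from this router.

```python
from collections import Counter


def parents_of(ps_output, needle="netstat"):
    """Group processes whose command contains `needle` by parent PID.

    Expects text in the layout of `ps -axo pid,ppid,command`.
    Returns (ppid, child_count, parent_command) tuples, busiest first.
    """
    commands = {}
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header and anything that is not "pid ppid command".
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        pid, ppid, cmd = int(parts[0]), int(parts[1]), parts[2]
        commands[pid] = cmd
        rows.append((pid, ppid, cmd))
    counts = Counter(ppid for _, ppid, cmd in rows if needle in cmd)
    return [(ppid, n, commands.get(ppid, "?")) for ppid, n in counts.most_common()]


# Invented snapshot: the parent process here is purely hypothetical.
sample = """\
  PID  PPID COMMAND
  100     1 /path/to/some_parent
  201   100 netstat -I re2 -nWb -f link
  202   100 netstat -I re2 -nWb -f link
  203   100 netstat -I re1 -nWb -f link
"""
print(parents_of(sample))  # [(100, 3, '/path/to/some_parent')]
```

Python may not be installed on the router itself, so it is probably easiest to save the ps output to a file and run this on another machine. Because the processes are so short-lived, several snapshots may be needed before the parent shows up.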

Edit 2 - Never mind, this is a separate issue of my own making. The original problem was resolved by turning off polling.

Harvy66

"/usr/local/sbin/check_reload_status"

Wow, 40% CPU and 10 minutes of CPU time?

heper

Perhaps you should start by updating to a supported version.
2.1.4 isn't exactly the latest stable.

doktornotor

Well, with the horrible polling option enabled, 2.1.x vs. 2.2.x vs. 2.3.x makes no difference at all. The option shouldn't exist in the GUI.