Router running very slow and hot.



  • I'm trying to figure out why this router is running so poorly.  Below, you'll see that the CPU is churning and the load is far too high.  The only services I have running are apinger, bsnmpd, cron, dhcpd, dnsmasq, and ntpd.

    last pid: 40609;  load averages:  4.10,  4.65,  4.21 
    173 processes: 7 running, 139 sleeping, 9 zombie, 18 waiting
    CPU:  3.2% user, 28.5% nice, 67.7% system,  0.6% interrupt,  0.0% idle
    Mem: 491M Active, 314M Inact, 540M Wired, 468K Cache, 209M Buf, 589M Free
    Swap:
    
      PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
      290 root          124   20  6908K  1380K CPU1    1   9:58 39.45% /usr/local/sbin/check_reload_status
    82421 root          122   20   157M 44620K RUN     0   0:00 27.29% /usr/local/bin/php -f /etc/rc.filter_configure_sync
       18 root          171 ki-6     0K    16K RUN     1  59:56  9.86% [idlepoll]
    41945 root           44    0   149M 47160K accept  1   0:23  0.39% /usr/local/bin/php{php}
    36696 root           44    0   154M 46768K accept  0   0:23  0.39% /usr/local/bin/php{php}
       12 root          -32    -     0K   288K WAIT    0   0:22  0.20% [intr{swi4: clock}]
       11 root          171 ki31     0K    32K RUN     0  22:52  0.00% [idle{idle: cpu0}]
       11 root          171 ki31     0K    32K RUN     1   8:07  0.00% [idle{idle: cpu1}]
        0 root          -16    0     0K   112K sched   0   1:01  0.00% [kernel{swapper}]
    
    

    It may have something to do with the following in the logs:

    Sep 23 14:43:02 	check_reload_status: updating dyndns WANGW
    Sep 23 14:43:02 	check_reload_status: Restarting ipsec tunnels
    Sep 23 14:43:02 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Sep 23 14:43:02 	check_reload_status: Reloading filter
    Sep 23 14:43:13 	check_reload_status: updating dyndns WANGW
    Sep 23 14:43:13 	check_reload_status: Restarting ipsec tunnels
    Sep 23 14:43:13 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Sep 23 14:43:13 	check_reload_status: Reloading filter
    Sep 23 14:43:17 	check_reload_status: updating dyndns WANGW
    Sep 23 14:43:18 	check_reload_status: Restarting ipsec tunnels
    Sep 23 14:43:18 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Sep 23 14:43:18 	check_reload_status: Reloading filter
    Sep 23 14:43:47 	check_reload_status: updating dyndns WANGW
    Sep 23 14:43:47 	check_reload_status: Restarting ipsec tunnels
    Sep 23 14:43:47 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Sep 23 14:43:47 	check_reload_status: Reloading filter
    Sep 23 14:44:07 	check_reload_status: updating dyndns WANGW
    Sep 23 14:44:07 	check_reload_status: Restarting ipsec tunnels
    Sep 23 14:44:07 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Sep 23 14:44:07 	check_reload_status: Reloading filter
    Sep 23 14:44:16 	check_reload_status: updating dyndns WANGW
    Sep 23 14:44:16 	check_reload_status: Restarting ipsec tunnels
    Sep 23 14:44:16 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    Sep 23 14:44:16 	check_reload_status: Reloading filter
    Sep 23 14:44:23 	check_reload_status: updating dyndns WANGW
    Sep 23 14:44:23 	check_reload_status: Restarting OpenVPN tunnels/interfaces
    

    There aren't any ipsec or OpenVPN tunnels.  ipsec is turned off.  My config is:

    Version 	2.1.4-RELEASE (amd64)
    built on Fri Jun 20 15:48:47 EDT 2014
    FreeBSD 8.3-RELEASE-p16
    
    Platform 	nanobsd (4g)
    NanoBSD Boot Slice 	pfsense0 / da0s1 (rw)
    
    CPU Type 	AMD G-T40E Processor
    Current: 125 MHz, Max: 1000 MHz
    2 CPUs: 1 package(s) x 2 core(s)
    

    I would appreciate any help.
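
For anyone triaging a similar log, here is a rough Python sketch that measures how often the "Reloading filter" event fires in the excerpt above. It only parses pasted log text; the regular expression, the function names (`event_times`, `intervals`), and the assumption that all lines fall on the same day are mine, not anything pfSense provides.

```python
import re

# Matches system-log lines in the shape pasted above. The tab after the
# timestamp is covered by \s+. The date is ignored on purpose: this sketch
# assumes every line falls on the same day.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d\d):(\d\d):(\d\d)\s+check_reload_status:\s+(.*)$"
)


def event_times(lines, message):
    """Return seconds since midnight for each log line carrying `message`."""
    times = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group(4) == message:
            hours, minutes, seconds = (int(m.group(i)) for i in (1, 2, 3))
            times.append(hours * 3600 + minutes * 60 + seconds)
    return times


def intervals(times):
    """Return the gaps, in seconds, between consecutive event times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# A few lines copied from the log excerpt in the post.
sample = [
    "Sep 23 14:43:02 \tcheck_reload_status: Restarting ipsec tunnels",
    "Sep 23 14:43:02 \tcheck_reload_status: Reloading filter",
    "Sep 23 14:43:13 \tcheck_reload_status: updating dyndns WANGW",
    "Sep 23 14:43:13 \tcheck_reload_status: Reloading filter",
    "Sep 23 14:43:18 \tcheck_reload_status: Reloading filter",
    "Sep 23 14:43:47 \tcheck_reload_status: Reloading filter",
]

gaps = intervals(event_times(sample, "Reloading filter"))
print(gaps)
```

Gaps of a few seconds to half a minute, repeated indefinitely, suggest that something keeps re-triggering the reload sequence rather than a one-off configuration change.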



  • Obviously there is an OpenVPN session attempting to run, judging from the logs you've posted. Do you have anything configured in the OpenVPN section of the admin console?



  • OpenVPN is empty.  There's nothing in there.  So, I checked the Gateway Logs and see this:

    Sep 23 14:44:06 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
    Sep 23 14:49:13 	apinger: ALARM: WANGW(x.x.x.x) *** loss ***
    Sep 23 14:49:42 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
    Sep 23 14:50:02 	apinger: ALARM: WANGW(x.x.x.x) *** delay ***
    Sep 23 14:50:11 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
    Sep 23 14:50:17 	apinger: ALARM: WANGW(x.x.x.x) *** delay ***
    Sep 23 14:50:59 	apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***
    Sep 23 14:51:14 	apinger: ALARM: WANGW(x.x.x.x) *** delay ***
    

    over and over and over again.  So I decided to ping the GW, which sits right next to the router.  Over 1200 pings, the round-trip time ranged from a minimum of 0.343 ms (not bad) to a maximum of 33339.712 ms (not great), with an average of 3416.742 ms.  Would that be causing it?
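
To put a number on "over and over", the gateway log above can be tallied with a small Python sketch. It counts how many alarms were raised and canceled per type. The regular expression and the name `summarize_alarms` are illustrative assumptions based only on the line shape pasted in this post, not on any apinger documentation.

```python
import re
from collections import Counter

# Assumed line shape, taken from the pasted gateway log:
#   "<timestamp> apinger: ALARM: WANGW(x.x.x.x) *** delay ***"
#   "<timestamp> apinger: alarm canceled: WANGW(x.x.x.x) *** delay ***"
ALARM_RE = re.compile(
    r"apinger:\s+(ALARM|alarm canceled):\s+(\w+)\(.*?\)\s+\*\*\*\s+(\w+)\s+\*\*\*"
)


def summarize_alarms(lines):
    """Count (state, kind) pairs, where state is 'raised' or 'canceled'."""
    counts = Counter()
    for line in lines:
        m = ALARM_RE.search(line)
        if not m:
            continue  # skip anything that is not an apinger alarm line
        state = "raised" if m.group(1) == "ALARM" else "canceled"
        counts[(state, m.group(3))] += 1
    return counts


# The eight lines from the gateway log in the post.
gateway_log = [
    "Sep 23 14:44:06 \tapinger: alarm canceled: WANGW(x.x.x.x) *** delay ***",
    "Sep 23 14:49:13 \tapinger: ALARM: WANGW(x.x.x.x) *** loss ***",
    "Sep 23 14:49:42 \tapinger: alarm canceled: WANGW(x.x.x.x) *** delay ***",
    "Sep 23 14:50:02 \tapinger: ALARM: WANGW(x.x.x.x) *** delay ***",
    "Sep 23 14:50:11 \tapinger: alarm canceled: WANGW(x.x.x.x) *** delay ***",
    "Sep 23 14:50:17 \tapinger: ALARM: WANGW(x.x.x.x) *** delay ***",
    "Sep 23 14:50:59 \tapinger: alarm canceled: WANGW(x.x.x.x) *** delay ***",
    "Sep 23 14:51:14 \tapinger: ALARM: WANGW(x.x.x.x) *** delay ***",
]

counts = summarize_alarms(gateway_log)
for (state, kind), n in sorted(counts.items()):
    print(f"{state:9s} {kind:6s} {n}")
```

Eight state changes in about seven minutes is a flapping gateway monitor, and each change lines up plausibly with another round of check_reload_status activity in the system log.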


  • If you get 33 seconds to a next hop a meter away from your router, there's something extremely wrong with your network.



  • Yup.  Just trying to figure out if it's a port on the router, the modem, or the cable.



  • So, turning off polling on the NICs and rebooting seemed to make everything normal, except that when I took a look this morning I saw:

    23235 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re2 -nWb -f link
    27310 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re2 -nWb -f link
    30121 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re1 -nWb -f link
    24314 root           97    0 10012K  1512K RUN     0   0:01  6.88% netstat -I re2 -nWb -f link
    25948 root           76    0 10012K  1512K RUN     1   0:01  6.79% netstat -I re2 -nWb -f link
    32639 root           76    0 10012K  1512K RUN     0   0:01  6.15% netstat -I re2 -nWb -f link
    28704 root           76    0 10012K  1512K RUN     0   0:01  5.96% netstat -I re2 -nWb -f link
    31140 root           97    0 10012K  1512K RUN     0   0:01  5.76% netstat -I re2 -nWb -f link
    40091 root           76    0 10012K  1512K RUN     1   0:01  4.88% netstat -I re1 -nWb -f link
    42659 root           76    0 10012K  1512K RUN     1   0:01  4.69% netstat -I re2 -nWb -f link
    49808 root           76    0 10012K  1512K RUN     0   0:01  4.05% netstat -I re1 -nWb -f link
    77245 root           76    0   145M 32740K accept  1   0:10  3.66% /usr/local/bin/php
    51647 root           76    0 10012K  1512K RUN     1   0:00  3.27% netstat -I re1 -nWb -f link
    52801 root           76    0 10012K  1512K RUN     1   0:00  3.27% netstat -I re2 -nWb -f link
    69619 root           63    0   147M 33036K accept  1   0:16  0.29% /usr/local/bin/php
       11 root          171 ki31     0K    32K RUN     0 115:43  0.00% [idle{idle: cpu0}]
       11 root          171 ki31     0K    32K RUN     1  70:03  0.00% [idle{idle: cpu1}]
       12 root          -32    -     0K   288K WAIT    0   4:47  0.00% [intr{swi4: clock}]
    
    

    Is there any way to figure out why there are so many instances of netstat running?

    Edit - I wanted to add that it seems they keep closing and reloading.  Each time top reloads it's a new set of PIDs.

    Edit 2 - Never mind, this is a separate issue of my own making.  The original problem was resolved by turning off polling.
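
As a first step toward answering the "why so many netstat instances" question, the pasted top output can be grouped by command line. The Python sketch below does only that, under the assumption that each line has the eleven columns shown in the earlier top header (PID through COMMAND); the name `group_by_command` is illustrative. It does not identify the parent process; for that, the parent PID of one of the netstat processes (for example from `ps -o ppid`) would be the thing to look at.

```python
from collections import defaultdict


def group_by_command(top_lines):
    """Group top(1) process lines by command.

    Returns {command: (process_count, total_wcpu_percent)}.
    Assumes the column layout PID USERNAME PRI NICE SIZE RES STATE C TIME
    WCPU COMMAND, with the command taking up the rest of the line.
    """
    counts = defaultdict(int)
    wcpu = defaultdict(float)
    for line in top_lines:
        fields = line.split(None, 10)  # keep the whole command as one field
        if len(fields) < 11 or not fields[0].isdigit():
            continue  # skip headers, blank lines, and anything malformed
        command = fields[10].strip()
        counts[command] += 1
        wcpu[command] += float(fields[9].rstrip("%"))
    return {cmd: (counts[cmd], round(wcpu[cmd], 2)) for cmd in counts}


# Four lines copied from the top output in the post.
sample_top = [
    "23235 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re2 -nWb -f link",
    "27310 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re2 -nWb -f link",
    "30121 root           76    0 10012K  1512K RUN     1   0:01  6.98% netstat -I re1 -nWb -f link",
    "77245 root           76    0   145M 32740K accept  1   0:10  3.66% /usr/local/bin/php",
]

summary = group_by_command(sample_top)
for command, (count, total) in sorted(summary.items()):
    print(f"{count:3d} procs  {total:6.2f}% WCPU  {command}")
```

Seeing the same command line repeated with fresh PIDs on every refresh points at something polling interface counters in a loop rather than at netstat itself.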



  • "/usr/local/sbin/check_reload_status"

    Wow, 40% CPU and 10 minutes of CPU time?



  • Perhaps you should start by updating to a supported version.
    2.1.4 isn't exactly the latest stable.


  • Well, with the horrible polling option enabled, 2.1.x vs. 2.2.x vs. 2.3.x makes no difference. The option shouldn't exist in the GUI.

