Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    High Load Troubleshooting - RCC-VE 2440 2.2.5

    Scheduled Pinned Locked Moved Hardware
    11 Posts 3 Posters 2.7k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • C
      cmb
      last edited by

      @luckman212:

      I have tried looking at output from "top -aSH" as well as "systat -vmstat 1" and I honestly don't know how to track down the culprit (nothing stands out).

      Post the output of those and we can tell you. Otherwise no way to even guess.

      1 Reply Last reply Reply Quote 0
      • luckman212L
        luckman212 LAYER 8
        last edited by

        sorry- I had to wait for the problem to occur again
        here is some output:

        top -aSH

        last pid: 42450;  load averages:  0.10,  0.07,  0.02
        122 processes: 3 running, 93 sleeping, 26 waiting
        CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
        Mem: 40M Active, 73M Inact, 150M Wired, 167M Buf, 3636M Free
        Swap: 8192M Total, 8192M Free
        
          PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
           11 root     155 ki31     0K    32K RUN     1 179:29 100.00% [idle{idle: cpu1}]
           11 root     155 ki31     0K    32K CPU0    0 178:18 100.00% [idle{idle: cpu0}]
         3828 root      52    0   233M 32824K accept  1   0:00   0.59% php-fpm: pool lighty (php-fpm)
            0 root     -16    0     0K   288K swapin  1   0:36   0.00% [kernel{swapper}]
           12 root     -92    -     0K   416K WAIT    0   0:31   0.00% [intr{irq259: igb1:que}]
           12 root     -92    -     0K   416K WAIT    0   0:22   0.00% [intr{irq256: igb0:que}]
           15 root     -16    -     0K    16K -       0   0:13   0.00% [rand_harvestq]
            5 root     -16    -     0K    16K pftm    1   0:10   0.00% [pf purge]
           12 root     -60    -     0K   416K WAIT    0   0:10   0.00% [intr{swi4: clock}]
        22002 root      20    0 12456K  2172K select  0   0:09   0.00% /usr/local/sbin/apinger -c /var/etc/apinger.conf
        86962 root      20    0 18984K  2688K select  0   0:09   0.00% /usr/local/libexec/nut/usbhid-ups -a ups
        91411 root      20    0 18984K  2688K select  0   0:08   0.00% /usr/local/libexec/nut/usbhid-ups -a ups
           12 root     -92    -     0K   416K WAIT    1   0:08   0.00% [intr{irq257: igb0:que}]
           12 root     -92    -     0K   416K WAIT    1   0:08   0.00% [intr{irq260: igb1:que}]
        71277 root      20    0 55628K  6108K select  0   0:06   0.00% sshd: admin@pts/0 (sshd)
           16 root     -68    -     0K    64K -       1   0:05   0.00% [usb{usbus0}]
        50575 root      20    0 14540K  1992K select  0   0:04   0.00% /usr/sbin/powerd -b hadp -a hadp -n hadp
           12 root     -88    -     0K   416K WAIT    0   0:04   0.00% [intr{irq22: ehci0}]
        90883 root      20    0 18844K  2488K select  1   0:03   0.00% /usr/local/sbin/upsd -u root
        48457 root      52   20 17136K  2436K wait    0   0:03   0.00% /bin/sh /var/db/rrd/updaterrd.sh
        10588 root      20    0 28344K 18104K select  1   0:03   0.00% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid{ntpd}
           20 root      16    -     0K    16K syncer  1   0:02   0.00% [syncer]
           16 root     -72    -     0K    64K -       1   0:02   0.00% [usb{usbus0}]
        27214 root      20    0 48692K  6036K kqread  0   0:02   0.00% /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
          244 root      44    0   229M 21852K kqread  0   0:02   0.00% php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
           12 root     -72    -     0K   416K WAIT    1   0:01   0.00% [intr{swi1: netisr 1}]
        74277 root      20    0 17476K  4056K pause   1   0:01   0.00% /bin/tcsh
        97371 uucp      20    0 18832K  2520K nanslp  0   0:01   0.00% /usr/local/sbin/upsmon ups@localhost
            7 root     -16    -     0K    16K psleep  0   0:01   0.00% [pagedaemon]
           14 root      -8    -     0K    48K -       0   0:01   0.00% [geom{g_event}]
           12 root     -88    -     0K   416K WAIT    1   0:01   0.00% [intr{irq268: ahci0}]
           16 root     -68    -     0K    64K -       0   0:01   0.00% [usb{usbus0}]
        17661 root      20    0 16804K  2200K bpf     0   0:01   0.00% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
           62 root      -8    -     0K    16K mdwait  1   0:01   0.00% [md0]
           18 root      20    -     0K    32K sdflus  0   0:01   0.00% [bufdaemon{/ worker}]
        38529 root      20    0 14532K  2228K select  0   0:01   0.00% /usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.conf -m syslog
        22306 root      20    0 28328K  3004K piperd  0   0:00   0.00% rrdtool -
        65334 root      20    0 14656K  2336K select  1   0:00   0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /var/etc/syslog.conf
           18 root     -16    -     0K    32K psleep  0   0:00   0.00% [bufdaemon{bufdaemon}]
           19 root     -16    -     0K    16K vlruwt  0   0:00   0.00% [vnlru]
        84285 proxy     20    0 12448K  2124K kqread  0   0:00   0.00% /usr/sbin/ftp-proxy
            4 root     -16    -     0K    32K -       0   0:00   0.00% [cam{doneq0}]
        99065 unbound   20    0 47036K 14404K kqread  1   0:00   0.00% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
        90103 root      20    0 21988K  3232K CPU1    1   0:00   0.00% top -aSH
        99065 unbound   20    0 47036K 14404K kqread  1   0:00   0.00% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
            0 root       8    0     0K   288K -       1   0:00   0.00% [kernel{thread taskq}]
        27598 root      20    0 16664K  2300K nanslp  0   0:00   0.00% /usr/sbin/cron -s
         2007 root      20    0 14560K  2184K nanslp  0   0:00   0.00% /usr/libexec/getty al.Pc ttyv0
            9 root     -16 ki-1     0K    16K pollid  0   0:00   0.00% [idlepoll]
            4 root     -16    -     0K    32K -       0   0:00   0.00% [cam{scanner}]
            0 root       8    0     0K   288K -       0   0:00   0.00% [kernel{ffs_trim taskq}]
            1 root      52    0  9112K   792K wait    1   0:00   0.00% [init]
          276 root      20    0 13160K  4476K select  0   0:00   0.00% /sbin/devd -q
        
        

        systat -vmstat 1

            2 users    Load  0.08  0.09  0.03                  Dec  8 01:04
        
        Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
                Tot   Share      Tot    Share    Free           in   out     in   out
        Act  125912    9412  1200760    10336 3713544  count            
        All  205768   14056  1261620    59092          pages            
        Proc:                                                            Interrupts
          r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        ioflt  1645 total
          1          51      1688 4850 8767   98   26 5786    915 cow         uart1 3
                                                             3299 zfod     12 ehci0 22
        19.8%Sys   0.4%Intr 17.1%User  0.0%Nice 62.8%Idle         ozfod  1128 cpu0:timer
        |    |    |    |    |    |    |    |    |    |           %ozfod    35 igb0:que 0
        ==========>>>>>>>>>                                       daefr    10 igb0:que 1
                                                31 dtbuf     1758 prcfr       igb0:link
        Namei     Name-cache   Dir-cache     85468 desvn     3227 totfr    31 igb1:que 0
           Calls    hits   %    hits   %      2778 numvn          react    11 igb1:que 1
            4545    4535 100       2   0       492 frevn          pdwak       igb1:link
                                                               17 pdpgs       ahci0 268
        Disks   md0   da0  ada0 pass0 pass1                       intrn   418 cpu1:timer
        KB/t   0.00  0.00  0.00  0.00  0.00                158452 wire
        tps      12     0     0     0     0                 44712 act
        MB/s   0.00  0.00  0.00  0.00  0.00                 76524 inact
        %busy     0     0     0     0     0                       cache
                                                          3713560 free
                                                           175472 buf
        
        
        1 Reply Last reply Reply Quote 0
        • luckman212L
          luckman212 LAYER 8
          last edited by

          I think it's hard to capture what I'm seeing here in a screen full of text.  So I made a small screen recording showing the dashboard with a systat -vmstat 0.2 running on top of it.  You can see there are these quick "surges" of activity and the dashboard CPU stays pretty steady around 20% even though the system is 99%+ idle.  This is what I am trying to track down if you have any other ideas

          pfsense225.mp4.zip

          1 Reply Last reply Reply Quote 0
          • luckman212L
            luckman212 LAYER 8
            last edited by

            After 24h the Processor RRD shows almost no CPU usage…

            even though dashboard shows high load avgs … 0.11, 0.16, 0.10

            mem usage seems normal, state table very small etc.
            Not sure what's going on??

            1 Reply Last reply Reply Quote 0
            • D
              doktornotor Banned
              last edited by

              I think the dashboard is a CPU load generator, not exactly something usable for serious monitoring.

              1 Reply Last reply Reply Quote 0
              • luckman212L
                luckman212 LAYER 8
                last edited by

                Yes I know that the php scripts etc on the dashboard cause some load but 20% continuous is not something I'm seeing on the other systems I've set up.  Also, I can observe the same cpu spikes with systat (see video above) so something is definitely going on here.

                1 Reply Last reply Reply Quote 0
                • C
                  cmb
                  last edited by

                  0.1 is not a high load average at all, that's very low. Anything under 1 is generally normal, and values above that can be specious.

                  Brief small spikes of 20% that are so trivial the 1 minute average is under 0.5% total CPU usage are also normal. The RRD stats gathering for instance will chew a bit of CPU for a fraction of a second once a minute to pull the stats. Other things will do similar depending on what you're running. You're not seeing 20% continuous. Everything there looks perfectly normal.

                  1 Reply Last reply Reply Quote 0
                  • luckman212L
                    luckman212 LAYER 8
                    last edited by

                    It's weird though because of many many other systems that I've set up, I've not seen one that idles at 20% on the dashboard. They are all at 0-2%.  I understand spikes to 20% here and there when pages are loaded or background tasks are run but to have it sit steadily at that level is something I have not seen.  Where is the code that generates the load % on the dashboard? Maybe somehow it's not giving accurate readings? I mean in this case if you look at the RRD graph it certainly doesn't seem to correlate with what the dash is showing…

                    1 Reply Last reply Reply Quote 0
                    • C
                      cmb
                      last edited by

                      Maybe your dashboard widget isn't actually updating? The initial load is always higher but falls off quickly as it updates. If it stays at exactly the same value, it's definitely not updating, which would be your browser's AJAX. Try clearing cache, or a diff browser or machine, see if that's any different.

                      1 Reply Last reply Reply Quote 0
                      • luckman212L
                        luckman212 LAYER 8
                        last edited by

                        I had thought that too but if you look at the video I posted, you can see the dash value is actually updating, it's just staying steadily elevated. I think I am missing the FreeBSD mojo to dig into what is causing this… interrupts??

                        1 Reply Last reply Reply Quote 0
                        • First post
                          Last post
                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.