Weird 2.2 behaviour… CPU 99%, traffic slow, SquidGuard not working



  • Hi All,

    Has anyone experienced, or got any ideas as to the cause of, this, please?

    At some point after the 2.2 update - I only noticed a week or more afterwards, so potentially some package updated in the meantime too - my firewall started behaving really strangely:

    • CPU is at 99% much of the time and almost never below 50%, compared to previous norms of 2~5%
    • top shows 30% user, 60% system usage but the top few processes (squid, php-fpm, suricata) only claim around 10% between them
    • SquidGuard isn't started; when I try to restart it, the UI says started but then returns it to not started status
    • SquidGuard isn't working (confirmed by adding a block to one of the ACLs and restarting the firewall, sites were not blocked)

    Basic details - I can't get these as JPEG at the moment to attach them, so key points as text:

    • System is 2.2-RELEASE (i386) Thu Jan 22 14:04:25 CST 2015
    • Latest version
    • Celeron dual core CPU (G550T) and 8GB RAM
    • WAN connection 75 Mbps, single gigabit LAN connection in use
    • packages: freeradius2 2.2.12_1/2.2.6_3, OpenVPN client export 1.2.15, squid3 3.4.10_2, squidguard-squid3 1.4_4, suricata 2.0.6

    Any ideas, especially on how to work out what is using all that CPU time, greatly appreciated!

    Thanks,

    Jeff



  • $ top



  • That sounds about the same as my issue at
    https://forum.pfsense.org/index.php?topic=87569.msg482663

    Try my solution listed in there.  It can't hurt anything, in any case.



  • Thanks SnowGhost, was worth a shot but sadly no joy.  Checked powerd settings (in fact it wasn't even enabled, but I tried adaptive for the heck of it) and powered down.  Restarted 5 minutes later and it's just the same as before, 100% CPU showing.

    However, top -SH gives a result that doesn't quite agree:

    last pid: 11472;  load averages:  2.85,  1.42,  0.75              up 0+00:13:34  20:40:47
    174 processes: 6 running, 148 sleeping, 20 waiting
    CPU: 33.9% user,  0.0% nice, 65.7% system,  0.0% interrupt,  0.4% idle
    Mem: 705M Active, 276M Inact, 130M Wired, 82M Buf, 2283M Free
    Swap: 8192M Total, 8192M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
       11 root       155 ki31     0K    16K RUN     0   8:50  63.96% idle{idle: cpu0}
       11 root       155 ki31     0K    16K RUN     1   8:34  62.60% idle{idle: cpu1}
    77971 proxy       52    0 87048K 41872K kqread  0   0:06   6.79% squid
    96051 root        20    0   690M   638M uwait   0   0:19   0.29% suricata{FlowManagerThre
    88512 root        27    0 31380K 19796K accept  1   0:00   0.20% php-fpm
        0 root       -16    0     0K   144K swapin  0   0:45   0.00% kernel{swapper}
    96051 root        20    0   690M   638M nanslp  0   0:25   0.00% suricata{suricata}
    96051 root        20    0   690M   638M uwait   1   0:13   0.00% suricata{Detect1}
        4 root       -16    -     0K    16K -       0   0:04   0.00% cam{scanner}
        0 root       -92    0     0K   144K -       0   0:04   0.00% kernel{em0 que}
       12 root       -76    -     0K   160K WAIT    0   0:02   0.00% intr{swi0: uart}
       12 root       -88    -     0K   160K WAIT    0   0:01   0.00% intr{irq19: atapci0+}
       12 root       -60    -     0K   160K WAIT    0   0:01   0.00% intr{swi4: clock}
    96051 root        20    0   690M   638M bpf     1   0:01   0.00% suricata{RxPcapem01}
    96051 root        20    0   690M   638M uwait   0   0:01   0.00% suricata{Detect2}
        4 root       -16    -     0K    16K -       1   0:01   0.00% cam{doneq0}
        0 root       -92    0     0K   144K -       0   0:01   0.00% kernel{em1 que}
    57169 root        20    0 10300K  2008K select  0   0:01   0.00% syslogd
        5 root       -16    -     0K     8K pftm    0   0:00   0.00% pf purge
       16 root       -68    -     0K    64K -       1   0:00   0.00% usb{usbus1}
       16 root       -68    -     0K    64K -       1   0:00   0.00% usb{usbus0}
       15 root       -16    -     0K     8K -       0   0:00   0.00% rand_harvestq
    96051 root        20    0   690M   638M uwait   0   0:00   0.00% suricata{Detect3}
        0 root         8    0     0K   144K -       1   0:00   0.00% kernel{thread taskq}
    49274 root        20    0 10176K  1696K select  1   0:00   0.00% powerd
    87933 root        20    0 17176K 17208K select  1   0:00   0.00% ntpd{ntpd}
    32466 nobody      20    0 11400K  3624K select  0   0:00   0.00% dnsmasq
       13 root       -16    -     0K    16K sleep   1   0:00   0.00% ng_queue{ng_queue0}
    

    So 60% of that 100% CPU usage indicated in the web UI is apparently idle time!

    Neither figure explains the 40%-odd of CPU that is being used on an almost-idle system with two cores, or the snail-like performance of the WAN traffic at times.
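
    To put numbers on that mismatch, here is a minimal Python sketch (not part of pfSense; the function names are made up for illustration). It pulls the idle figure from the summary line and averages the WCPU column of the two idle threads, using lines copied from the output above. As far as I understand it, WCPU is a weighted average over time while the summary line reflects only the latest sampling interval, so the two need not agree; treat that reading as an assumption.

```python
import re

# Lines copied from the top -SH output posted above.
SUMMARY = "CPU: 33.9% user,  0.0% nice, 65.7% system,  0.0% interrupt,  0.4% idle"
IDLE_THREADS = [
    "   11 root       155 ki31     0K    16K RUN     0   8:50  63.96% idle{idle: cpu0}",
    "   11 root       155 ki31     0K    16K RUN     1   8:34  62.60% idle{idle: cpu1}",
]


def summary_idle(line):
    """Return the idle percentage from top's 'CPU:' summary line."""
    match = re.search(r"([\d.]+)% idle", line)
    if match is None:
        raise ValueError("no idle field found in summary line")
    return float(match.group(1))


def thread_idle(lines):
    """Average the WCPU column (the only '%' field) of the idle threads."""
    values = [float(re.search(r"([\d.]+)%", line).group(1)) for line in lines]
    return sum(values) / len(values)


snapshot_idle = summary_idle(SUMMARY)
weighted_idle = thread_idle(IDLE_THREADS)

print(f"summary line idle:        {snapshot_idle:.1f}%")
print(f"idle threads, mean WCPU:  {weighted_idle:.2f}%")
print(f"busy implied by threads:  {100 - weighted_idle:.2f}%")
```

    The summary line says the box is almost fully busy, while the idle threads imply a bit over a third busy, which matches the "40%-odd" estimate.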

    At this rate, I think I may have to do some selective config backups and rebuild from a bare machine.  Disappointing.


  • Netgate Administrator

    Do you have any Layer7 filtering running?

    Steve



  • Hi,

    I have the same issue.
    The upgrade from pfSense 2.1 to 2.2.1 led to 100% CPU usage (30% user, 70% system). Load average: 2.43/2.45/2.67.

    I have squid, snort and HAProxy installed.
    HW: AMD G series T40E APU, 1 GHz dual core 64bit.

    Any suggestions?


  • Netgate Administrator

    Check what is using your CPU cycles. Log in to a console and at the command line run 'top -aSH'. Leave it running for a few seconds and check the processes at the top of the table.
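
    If the output has been pasted or saved somewhere, a rough Python sketch like the one below can rank the threads for you. It is only an illustration: the function name `busiest` is made up, the sample rows are taken from the output posted earlier in this thread, and the parsing assumes each row has a single WCPU percentage followed by the command.

```python
import re

# A few rows from the top -SH output posted earlier in this thread.
SAMPLE = """\
   11 root       155 ki31     0K    16K RUN     0   8:50  63.96% idle{idle: cpu0}
77971 proxy       52    0 87048K 41872K kqread  0   0:06   6.79% squid
96051 root        20    0   690M   638M uwait   0   0:19   0.29% suricata{FlowManagerThre
88512 root        27    0 31380K 19796K accept  1   0:00   0.20% php-fpm
"""

# PID at the start, then the first "<number>%" token (WCPU), then the command.
ROW = re.compile(r"\s*(\d+)\s+.*?\s([\d.]+)%\s+(.+)$")


def busiest(text, limit=3):
    """Return up to `limit` (wcpu, command) pairs, highest first, skipping idle threads."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header or blank line
        wcpu, command = float(match.group(2)), match.group(3).strip()
        if command.startswith("idle"):
            continue  # the idle threads are not real work
        rows.append((wcpu, command))
    return sorted(rows, reverse=True)[:limit]


top_threads = busiest(SAMPLE)
for wcpu, command in top_threads:
    print(f"{wcpu:6.2f}%  {command}")
```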

    Steve

