PFsense SLow and High CPU



  • Sorry for the code

    Here is the output:

    ps auxw | grep '[p]hp'
    root      54636 18.2  1.4 82632 28488  -  S    10:50AM    2:34.58 /usr/local/bin/php
    root      79377 17.1  1.4 82632 28484  -  R    11:15AM    2:05.14 /usr/local/bin/php
    root      85143 15.0  1.4 82632 28480  -  S    11:31AM    0:32.80 /usr/local/bin/php
    root      15517 10.8  1.7 86728 35704  -  S    11:25AM    1:27.37 /usr/local/bin/php
    root      77242  7.2  1.6 86728 33032  -  D    11:37AM    0:04.15 /usr/local/bin/php -f /etc/rc.prunecaptiveportal test
    root      56754  7.0  1.4 82632 28488  -  S    10:57AM    1:19.90 /usr/local/bin/php
    root      70124  7.0  1.8 86824 36256  -  R    11:31AM    0:22.09 php-fpm: pool lighty (php-fpm)
    root      39322  6.8  1.5 86824 30476  -  R    11:31AM    0:21.20 php-fpm: pool lighty (php-fpm)
    root      80658  6.6  1.7 86824 35824  -  S    11:31AM    0:22.52 php-fpm: pool lighty (php-fpm)
    root       8330  3.6  1.8 86728 36020  -  S    10:32AM    1:43.94 /usr/local/bin/php
    root      85706  3.5  1.4 82632 28476  -  S    11:37AM    0:06.11 /usr/local/bin/php
    root      26757  3.2  1.4 82632 28480  -  S    11:26AM    0:44.44 /usr/local/bin/php
    root      20387  0.8  1.5 86824 31508  -  S    11:31AM    0:25.87 php-fpm: pool lighty (php-fpm)
    root      77335  0.4  1.4 82632 28476  -  S    11:37AM    0:06.43 /usr/local/bin/php
    root      12429  0.1  1.4 82632 28484  -  I    11:30AM    0:06.67 /usr/local/bin/php
    root      87715  0.1  1.4 82632 28484  -  I    10:53AM    0:39.54 /usr/local/bin/php
    root        253  0.0  0.9 78632 18512  -  Ss   Fri01AM    0:21.82 php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
    root       4965  0.0  1.4 82632 28496  -  I    11:12AM    0:57.50 /usr/local/bin/php
    root      24411  0.0  1.8 86728 36220  -  I    10:32AM    2:14.04 /usr/local/bin/php
    root      56709  0.0  0.1 10076  1704  -  S    Fri01AM    0:01.14 minicron: helper /usr/local/sbin/gmirror_status_check.php  (minic
    root      80479  0.0  1.7 86728 35716  -  I    10:49AM    0:38.18 /usr/local/bin/php
    root      82250  0.0  1.0 78536 20400  -  Is    9:37AM    0:00.37 /usr/local/bin/php
    root      83172  0.0  1.0 78536 20400  -  Is    9:37AM    0:00.39 /usr/local/bin/php
    root      83392  0.0  1.0 78536 20400  -  Is    9:37AM    0:00.40 /usr/local/bin/php
    root      83704  0.0  1.0 78536 20400  -  Is    9:37AM    0:00.40 /usr/local/bin/php
    root      83948  0.0  1.0 78536 20400  -  Is    9:37AM    0:00.42 /usr/local/bin/php
    root      84168  0.0  1.0 78536 20400  -  Is    9:37AM    0:00.44 /usr/local/bin/php
    root      84514  0.0  1.8 86728 36268  -  I     9:37AM    2:44.40 /usr/local/bin/php
    root      84694  0.0  1.8 86728 35960  -  I     9:37AM    2:29.28 /usr/local/bin/php
    root      84917  0.0  1.8 86728 36208  -  I     9:37AM    1:35.46 /usr/local/bin/php
    root      85437  0.0  1.7 86728 35724  -  I     9:37AM    2:37.51 /usr/local/bin/php
    
    


  • any chance you are getting attacked? what do the rrd graphs show for em1 & em2 ? are those wan's or lan's ?



  • em1= wan fiber Colt
    em2= customer network


  • Banned

    Well, you have a shitton of php processes running for god knows what reason.

    • Restart webConfigurator from console
    • Restart PHP-FPM from console
    • Limit the GUI processes if not done yet (System - Advanced - Max Processes).

    Also look at the dashboard whether the GW monitor does not show some totally BS figured, like 109% packet loss. If it does, kill apinger.



    • Restart webConfigurator from console

    Whats the right command to do it?

    • Restart PHP-FPM from console

    Whats the right command to do it?

    • Limit the GUI processes if not done yet (System - Advanced - Max Processes).

    It's set to 2, do i have to change it ?


  • Banned

    The right command is to use the fine console menu that's there (both via SSH and on physical console).



  • Hi,

    I've restart the 2 still 100% cpu

    last pid: 70334;  load averages:  7.58,  6.90,  7.28    up 3+11:37:44  12:41:02
    172 processes: 9 running, 135 sleeping, 28 waiting
    CPU: 38.2% user,  0.2% nice, 34.7% system, 26.9% interrupt,  0.0% idle
    Mem: 111M Active, 211M Inact, 119M Wired, 74M Buf, 1531M Free
    Swap: 4096M Total, 4096M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
       12 root       -92    -     0K   224K WAIT    0 235:41  24.27% [intr{irq259:
    49598 root        45    0 86728K 35944K RUN     0   2:07  19.68% /usr/local/bin
    93820 root        39    0 86728K 35680K accept  1   1:03  15.97% /usr/local/bin
       12 root       -92    -     0K   224K WAIT    0 239:00  11.67% [intr{irq262:
    10691 root        29    0 86728K 35716K accept  1   1:03  11.38% /usr/local/bin
     5449 root        46    0 82632K 28476K accept  1   0:05   9.57% /usr/local/bin
    41372 root        42    0 86728K 35724K accept  1   2:20   7.76% /usr/local/bin
    49355 root        74    0 86824K 33060K RUN     0   0:27   7.28% php-fpm: pool
    14204 root        74    0 86824K 33940K RUN     1   0:30   6.88% php-fpm: pool
    54828 root        52    0 86824K 33828K piperd  0   0:27   6.40% php-fpm: pool
    96397 root        52    0 86728K 33052K sysctl  0   0:04   6.40% /usr/local/bin
    48718 root        52    0 82632K 28496K accept  0   0:36   5.37% /usr/local/bin
     2200 root        52    0 86824K 35476K piperd  1   0:35   5.08% php-fpm: pool
    45926 root        25    0 17248K  6644K kqread  1   0:03   4.39% /usr/local/sbi
    29995 nobody      22    0 11400K  4020K select  1   2:35   3.47% /usr/local/sbi
    27091 root        26    0 82632K 28492K accept  0   0:09   2.49% /usr/local/bin
    54198 admin-itp   20    0 11384K  2724K CPU1    1   0:00   0.59% top -aSH
    [/code]
    
    

  • Banned

    The output with truncated paths is completely useless.



  • Sorry

    last pid: 78757;  load averages:  7.10,  7.19,  7.14                                                                                                           up 3+11:53:35  12:56:53
    167 processes: 9 running, 132 sleeping, 26 waiting
    CPU: 38.8% user,  0.0% nice, 44.4% system, 16.2% interrupt,  0.6% idle
    Mem: 143M Active, 202M Inact, 119M Wired, 85M Buf, 1509M Free
    Swap: 4096M Total, 4096M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
    26911 root        49    0 82632K 28484K accept  1   2:01  25.59% /usr/local/bin/php
       12 root       -92    -     0K   224K CPU0    0 241:42  21.78% [intr{irq262: em2:rx 0}]
    90617 root        33    0 82632K 28484K CPU0    0   1:48  17.77% /usr/local/bin/php
       12 root       -92    -     0K   224K CPU0    0 238:40  14.06% [intr{irq259: em1:rx 0}]
    14204 root        75    0 99112K 50224K RUN     1   1:40  11.47% php-fpm: pool lighty (php-fpm)
     2200 root        75    0   101M 50364K RUN     1   1:50  10.69% php-fpm: pool lighty (php-fpm)
    93820 root        40    0 86728K 35728K accept  1   1:24  10.16% /usr/local/bin/php
    54828 root        75    0   101M 51096K RUN     0   1:42   9.57% php-fpm: pool lighty (php-fpm)
    49355 root        52    0 99112K 47252K nanslp  1   1:40   9.38% php-fpm: pool lighty (php-fpm)
     5449 root        27    0 82632K 28484K accept  1   1:40   7.76% /usr/local/bin/php
    10691 root        33    0 86728K 35732K accept  1   1:26   5.66% /usr/local/bin/php
    48718 root        52    0 82632K 28496K accept  1   1:55   1.46% /usr/local/bin/php
    29995 nobody      21    0 11400K  4020K select  1   2:54   1.27% /usr/local/sbin/dnsmasq --all-servers --rebind-localhost-ok --stop-dns-rebind --dns-forward-max=5000 --cache-size=100
    54198 admin-itp   21    0 11384K  2908K CPU1    1   0:11   1.17% top -aSH
    89404 root        26    0 82632K 28488K accept  0   0:42   0.49% /usr/local/bin/php
       12 root       -92    -     0K   224K WAIT    0  12:20   0.39% [intr{irq260: em1:tx 0}]
    47366 root        20    0 13148K  5484K kqread  0   0:06   0.20% /usr/local/sbin/lighttpd -f /var/etc/lighty-test-CaptivePortal.conf
    

    We also have a lot of these logs, don't know if it's linked:

    Sep 28 12:52:42	lighttpd[47366]: (connections.c.137) (warning) close: 21 Connection reset by peer
    Sep 28 12:52:51	lighttpd[47366]: (connections.c.137) (warning) close: 13 Connection reset by peer
    Sep 28 12:53:05	lighttpd[47366]: (connections.c.137) (warning) close: 13 Connection reset by peer
    Sep 28 12:53:15	lighttpd[47366]: (connections.c.137) (warning) close: 15 Connection reset by peer
    Sep 28 12:53:46	lighttpd[47366]: (connections.c.137) (warning) close: 15 Connection reset by peer
    Sep 28 12:53:49	lighttpd[47366]: (connections.c.137) (warning) close: 18 Connection reset by peer
    Sep 28 12:54:11	lighttpd[47366]: (connections.c.137) (warning) close: 18 Connection reset by peer
    Sep 28 12:54:35	lighttpd[47366]: (connections.c.137) (warning) close: 21 Connection reset by peer
    Sep 28 12:54:57	lighttpd[47366]: (connections.c.137) (warning) close: 14 Connection reset by peer
    Sep 28 12:55:02	lighttpd[47366]: (connections.c.137) (warning) close: 16 Connection reset by peer
    Sep 28 12:55:15	lighttpd[47366]: (connections.c.137) (warning) close: 14 Connection reset by peer
    Sep 28 12:55:33	lighttpd[47366]: (connections.c.137) (warning) close: 14 Connection reset by peer
    Sep 28 12:55:54	lighttpd[47366]: (connections.c.137) (warning) close: 13 Connection reset by peer
    Sep 28 12:55:57	lighttpd[47366]: (connections.c.137) (warning) close: 14 Connection reset by peer
    Sep 28 12:56:21	lighttpd[47366]: (connections.c.137) (warning) close: 22 Connection reset by peer
    Sep 28 12:56:23	lighttpd[47366]: (connections.c.137) (warning) close: 18 Connection reset by peer
    Sep 28 12:57:05	lighttpd[47366]: (connections.c.137) (warning) close: 16 Connection reset by peer
    Sep 28 12:57:18	lighttpd[47366]: (connections.c.137) (warning) close: 14 Connection reset by peer
    

  • Banned

    Disable the Captive portal thing.



  • Do you thin kthe Platform is not powerfull enough ? ( we have 100-150 users)



  • also might try

    sysctl dev.em.0.debug=1
    ```from console

  • Banned

    karlzre - Apparently not in your case when idiots/broken mobile shit keeps hammering the CP over and over again. (Other than that, you told us exactly NOTHING about your "platform").  And yeah, there's also something very broken about the NICs. Did you enable some nonsense like polling there?



  • also might try
    Code: [Select]
    sysctl dev.em.0.debug=1
    from console

    dev.em.0.debug: -1 -> -1
    
    
    Intel(R) Atom(TM) CPU N270 @ 1.60GHz
    Current: 800 MHz, Max: 1600 MHz
    2 CPUs: 1 package(s) x 1 core(s) x 2 HTT threads
    

    Memory usage
    13% of 2006 MB
    SWAP usage
    0% of 4096 MB
    Disk usage
    / (ufs): 2% of 23G
    /var/run (ufs in RAM): 3% of 3.4M



  • Did you enable some nonsense like polling there?

    Enable device polling is disabled.


  • Banned

    @karlzre:

    Intel(R) Atom(TM) CPU N270 @ 1.60GHz
    Current: 800 MHz, Max: 1600 MHz
    2 CPUs: 1 package(s) x 1 core(s) x 2 HTT threads
    

    This CPU is just old slow POS, but that thing is running at half the freq it's spec'ed for with 100% CPU usage?  :o Try fiddling with the powerd settings.

    Also, post some information about the NICs.



  • @doktornotor:

    This CPU is just old slow POS, but that thing is running at half the freq it's spec'ed for with 100% CPU usage?  :o Try fiddling with the powerd settings.

    Also, post some information about the NICs.

    I have a bit more power behind my box and Ive run into the same thing.. Is powerd on minimum capable of causing this issue?

    my machine gets loaded up with 60mbits of torrent/netflix traffic (that is my connection speed) from time to time and I see that [kernel{em0 taskq}] and [intr{swi5: fast taskq}] tasks taking up the majority of the available CPU.. Polling is turned off..

    This machine as built from spare parts that I had laying around is it possible that im just running into a limitation with the hardware?

    IC2D 2.13ghz, Asus OEM shitty mobo, 3gb ram, Intel Pro/1000 Legacy (external), Realtek 8169 (Internal)


  • Banned

    @SplineZ:

    I have a bit more power behind my box and Ive run into the same thing.. Is powerd on minimum capable of causing this issue?

    You need to fiddle with whatever settings are available there. People here mentioned boxes where CPU is never run at full freq unless powerd is enabled somehow. Sure if set @minimum, that's a good candidate.



  • You may also want to look at https://forum.pfsense.org/index.php?topic=59028.0 to remove those messages from the log and help cut down the noise.  Looks like they may not mean anything.



  • @doktornotor:

    @SplineZ:

    I have a bit more power behind my box and Ive run into the same thing.. Is powerd on minimum capable of causing this issue?

    You need to fiddle with whatever settings are available there. People here mentioned boxes where CPU is never run at full freq unless powerd is enabled somehow. Sure if set @minimum, that's a good candidate.

    Interestingly enough, I switched it to adaptive and the overall load on the machine has lowered a HUGE amount. Lets say from 6.0 to under 0.7-0.5..  Kind of ironic that the feature meant to save energy is actually causing an increase… (in this situation)

    Seriously, Turn on the profile "minimum" and max out your connection, your load will raise slowly.. turn it to Adaptive and watch it lower almost immediately. "[kernel{em0 taskq}]" is the task using all the cpu..

    
    With "Minimum" power setting on powerd
    
    last pid: 11607;  load averages:  3.36,  1.79,  0.99 
      PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
        0 root          -92    0     0K   192K CPU0    0 647:46  79.49% [kernel{em0 taskq}]
       12 root          -56    -     0K   272K WAIT    1 432:38  51.46% [intr{swi5: fast taskq}]
    87987 root          100   20 25160K  6464K CPU0    0   0:04  26.27% /sbin/pfctl -ss
       11 root          155 ki31     0K    32K RUN     1 390.8H  15.97% [idle{idle: cpu1}]
       11 root          155 ki31     0K    32K RUN     0 385.3H  11.18% [idle{idle: cpu0}]
    
    With "HiAdaptive" power setting on powerd
    
    last pid: 61673;  load averages:  0.53,  1.23,  0.89   
         PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
       11 root          155 ki31     0K    32K RUN     1 390.9H  97.56% [idle{idle: cpu1}]
       11 root          155 ki31     0K    32K CPU0    0 385.3H  83.98% [idle{idle: cpu0}]
        0 root          -92    0     0K   192K -       0 648:19  10.16% [kernel{em0 taskq}]
       12 root          -56    -     0K   272K WAIT    1 433:00   5.66% [intr{swi5: fast taskq}]
       12 root          -92    -     0K   272K WAIT    1  58:56   0.98% [intr{irq19: em0 uhci1}]
       12 root          -92    -     0K   272K WAIT    0  40:54   0.78% [intr{irq16: re0 uhci3}]
    
    

    No other changes were made on the machine other then fiddling with the powerd settings..  I never would have thought to look there.. Kudos and thanks!



  • You're misunderstanding what're being reported. Load is relative. If your CPU is running at 100mhz, then it's not hard to reach 75% usage. The reason "load" is high when using minimum is because the CPU is being down-clocked, you are saving power.


Log in to reply