Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    NanoBSD CPU usage high on MultiWan, PHP processes abound

    Scheduled Pinned Locked Moved Hardware
    26 Posts 5 Posters 4.8k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • B
      bryan.paradis
      last edited by

      1. Add the combine 10s lines to each alarm that don't have them currently in /etc/inc/gwlb.inc
      2. killall apinger
      3. System -> Routing -> WAN Gateway -> Save -> Apply Changes should regenerate your apinger conf and start apinger again if you killed it above.
      /etc/inc/gwlb.inc
      
      
      ## These parameters can be overridden in a specific alarm configuration
      alarm default {
      	command on "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' "
      	command off "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' "
      	combine 10s
      }
      
      ## "Down" alarm definition.
      ## This alarm will be fired when target doesn't respond for 30 seconds.
      alarm down "down" {
      	time {$apinger_default['down']}s
      }
      
      ## "Delay" alarm definition.
      ## This alarm will be fired when responses are delayed more than 200ms
      ## it will be canceled, when the delay drops below 100ms
      alarm delay "delay" {
      	delay_low {$apinger_default['latencylow']}ms
      	delay_high {$apinger_default['latencyhigh']}ms
      	combine 10s
      }
      
      ## "Loss" alarm definition.
      ## This alarm will be fired when packet loss goes over 20%
      ## it will be canceled, when the loss drops below 10%
      alarm loss "loss" {
      	percent_low {$apinger_default['losslow']}
      	percent_high {$apinger_default['losshigh']}
      	combine 10s
      }
      

      With zero combine lines on Loss and Delay it will act on it every failing probe, in this case every second by default. This means those processes will spawn every second. I have set the combine to 10s which is 1/10th of the load but you may see fit to change it further.

      Can you post your Status -> System Logs -> Gateway ? It would help to see how often you are seeing the alarms generated.

      1 Reply Last reply Reply Quote 0
      • A
        arriflex
        last edited by

        @bryan.paradis:

        Could you make your terminal wider so we can see the full command also if you paste the full output of all processes.

        Trying killing apinger and testing again? Do you see the same issue?

        With apinger running, default gateway settings (WAN_STUDIO_DHCP is my secondary WAN) and saturated upstream on secondary WAN:

        USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
        root      10 23.0  0.0     0     8  ??  RL   26Sep13 214570:47.75 [idle]
        root   31622 20.0  0.5  3352  1124  ??  RNs  Sat02PM 101:23.76 /usr/local/sbin/check_reload_status
        root   46807 20.0  6.9 29612 16756  ??  RN    8:53AM   0:00.82 /usr/local/bin/php -f /etc/rc.newipsecdns
        root   47089 20.0  6.8 28552 16500  ??  RN    8:53AM   0:00.79 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   46727 19.0  6.6 28604 15868  ??  RN    8:53AM   0:00.72 /usr/local/bin/php -f /etc/rc.dyndns.update WAN_STUDIO_DHCP
        root   47057 19.0  6.9 29612 16724  ??  RN    8:53AM   0:00.82 /usr/local/bin/php -f /etc/rc.openvpn WAN_STUDIO_DHCP
        root      11  9.9  0.0     0   104  ??  WL   26Sep13 2586:06.04 [intr]
        root   45930  5.0  0.0     0     0  ??  Z<    8:52AM   0:01.14 <defunct>root       9  1.0  0.0     0     8  ??  DL   26Sep13 135:44.08 [pagedaemon]
        root       0  0.0  0.0     0    72  ??  DLs  26Sep13  31:40.50 [kernel]
        root       1  0.0  0.0  1888   104  ??  RLs  26Sep13   0:10.52 /sbin/init --
        root       2  0.0  0.0     0     8  ??  DL   26Sep13   0:09.09 [g_event]
        root       3  0.0  0.0     0     8  ??  DL   26Sep13   4:19.84 [g_up]
        root       4  0.0  0.0     0     8  ??  DL   26Sep13   8:59.75 [g_down]
        root       5  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto]
        root       6  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto returns]
        root       7  0.0  0.0     0     8  ??  DL   26Sep13   8:34.62 [pfpurge]
        root       8  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [xpt_thrd]
        root      12  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [ng_queue]
        root      13  0.0  0.0     0     8  ??  DL   26Sep13 1070:47.50 [yarrow]
        root      14  0.0  0.0     0    64  ??  DL   26Sep13   4:21.51 [usb]
        root      15  0.0  0.0     0     8  ??  DL   26Sep13   0:00.44 [vmdaemon]
        root      16  0.0  0.0     0     8  ??  DL   26Sep13   0:31.04 [idlepoll]
        root      17  0.0  0.0     0     8  ??  DL   26Sep13   0:00.34 [pagezero]
        root      18  0.0  0.0     0     8  ??  DL   26Sep13   1:45.12 [bufdaemon]
        root      19  0.0  0.0     0     8  ??  DL   26Sep13   1:42.96 [vnlru]
        root      20  0.0  0.0     0     8  ??  DL   26Sep13  13:54.07 [syncer]
        root      21  0.0  0.0     0     8  ??  DL   26Sep13   1:54.32 [softdepflush]
        root      44  0.0  0.0     0     8  ??  DL   26Sep13   2:31.01 [md0]
        root      49  0.0  0.0     0     8  ??  DL   26Sep13  42:32.58 [md1]
        root     320  0.0  0.3  3936   672  ??  Is   26Sep13   0:00.11 /sbin/devd
        root    7105  0.0  3.4 13036  8252  ??  SN   Tue12AM   1:50.67 /usr/sbin/tcpdump -s 256 -v -S -l -n -e -ttt -i pflog0
        root    7316  0.0  0.2  3264   580  ??  SN   Tue12AM   3:04.83 logger -t pf -p local0.info
        root   10090  0.0  0.4  5284   940  ??  Is   26Sep13   0:00.02 /usr/sbin/sshd
        root   10260  0.0  0.2  3480   584  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
        root   12250  0.0  0.8  8036  2008  ??  Ss    8:45AM   0:00.39 sshd: admin@pts/0 (sshd)
        root   15511  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   17614  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   19285  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   19966  0.0  0.5  3264  1264  ??  Ss    8:51AM   0:00.05 /usr/local/sbin/apinger -c /var/etc/apinger.conf
        root   20286  0.0  0.5  4556  1236  ??  I     8:51AM   0:00.02 /usr/local/bin/rrdtool -
        root   22304  0.0  0.2  3384   500  ??  Is   26Sep13   0:06.30 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.conf
        root   22716  0.0  0.0  3276     0  ??  IWs  -         0:00.00 /usr/local/sbin/qstats -p /var/run/qstats.pid
        root   27886  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   28812  0.0  0.3  3324   824  ??  INs  Wed08AM   0:00.17 dhclient: vr2 [priv] (dhclient)
        root   28889  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   29800  0.0  0.3  3412   784  ??  Ss   26Sep13  68:59.23 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -f /var/etc/syslog.conf
        root   30621  0.0  0.2  3356   404  ??  Ss   26Sep13   0:36.08 /usr/sbin/cron -s
        root   31959  0.0  0.0  3352     0  ??  IWN  -         0:00.00 check_reload_status: Monitoring daemon of check_reload_status
        root   33099  0.0  1.0  6980  2444  ??  S    26Sep13  12:24.38 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
        root   33128  0.0  0.1 24436   344  ??  Is   26Sep13   0:00.61 /usr/local/bin/php
        root   33841  0.0  0.0 24436     0  ??  IWs  -         0:00.00 /usr/local/bin/php
        _dhcp  35784  0.0  0.4  3324   952  ??  SNs  Wed08AM   0:07.59 dhclient: vr2 (dhclient)
        dhcpd  42173  0.0  2.4 11456  5856  ??  Ss   Fri02PM   0:34.93 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.p
        root   42891  0.0  3.0 30708  7248  ??  I     8:50AM   0:19.96 /usr/local/bin/php
        root   45336  0.0  0.3  1512   768  ??  IN    8:52AM   0:00.00 sleep 60
        root   49780  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
        root   49928  0.0  0.1  3264   232  ??  I    26Sep13   0:29.68 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
        root   50095  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /etc/rc.expireaccounts
        root   50711  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.expireaccounts  (minicron)
        root   50841  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /etc/rc.update_alias_url_data
        root   50977  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.update_alias_url_data  (minicron)
        root   53184  0.0  0.3  3480   832  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
        root   57050  0.0  0.8  8036  2036  ??  Ss    8:46AM   0:00.41 sshd: admin@pts/1 (sshd)
        root   59543  0.0  0.5  3324  1128  ??  Is   Fri12PM   0:00.01 dhclient: vr1 [priv] (dhclient)
        _dhcp  64959  0.0  0.5  3324  1268  ??  Ss   Fri12PM   0:03.86 dhclient: vr1 (dhclient)
        root   67059  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   67915  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d key.lan -p /var/run/dnsmasq.pid -h /var/et
        root   68923  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.93 <defunct>nobody 69584  0.0  1.0  5512  2328  ??  S    Fri12PM   0:51.65 [dnsmasq]
        root   71162  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   73060  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   74771  0.0  0.2  3644   588  ??  IN   Fri12PM   1:31.40 /bin/sh /var/db/rrd/updaterrd.sh
        root   77144  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   78118  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   79237  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   81062  0.0  0.9 24436  2116  ??  I     8:51AM   0:00.00 /usr/local/bin/php
        root   89469  0.0  2.6  6280  6300  ??  SNs  Fri06AM   0:28.27 /usr/local/bin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
        root   90300  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.22 <defunct>root   52851  0.0  0.0  3720     0  u0  IWs  -         0:00.00 login [pam] (login)
        root   53321  0.0  0.0  3644     0  u0  IW   -         0:00.00 -sh (sh)
        root   54058  0.0  0.2  3644   440  u0  I+   26Sep13   0:00.02 /bin/sh /etc/rc.initial
        root   12597  0.0  0.0  3644     0   0  IWs  -         0:00.00 /bin/sh /etc/rc.initial
        root   23302  0.0  0.0  4696     0   0  IW   -         0:00.00 /bin/tcsh
        root   56804  0.0  0.8  3660  1828   0  S+    8:46AM   0:01.38 top
        root   47353  0.0  0.5  3404  1244   1  R+    8:53AM   0:00.01 ps aux
        root   57546  0.0  0.0  3644     0   1  IWs  -         0:00.00 /bin/sh /etc/rc.initial
        root   60441  0.0  0.9  4696  2164   1  S     8:46AM   0:00.07 /bin/tcsh</defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct> 
        

        killall apinger:

        USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
        root      10 91.0  0.0     0     8  ??  RL   26Sep13 214577:21.24 [idle]
        root      11  9.9  0.0     0   104  ??  WL   26Sep13 2586:40.74 [intr]
        root       0  0.0  0.0     0    72  ??  DLs  26Sep13  31:40.88 [kernel]
        root       1  0.0  0.0  1888   104  ??  ILs  26Sep13   0:10.52 /sbin/init --
        root       2  0.0  0.0     0     8  ??  DL   26Sep13   0:09.09 [g_event]
        root       3  0.0  0.0     0     8  ??  DL   26Sep13   4:19.84 [g_up]
        root       4  0.0  0.0     0     8  ??  DL   26Sep13   8:59.76 [g_down]
        root       5  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto]
        root       6  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto returns]
        root       7  0.0  0.0     0     8  ??  DL   26Sep13   8:34.64 [pfpurge]
        root       8  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [xpt_thrd]
        root       9  0.0  0.0     0     8  ??  DL   26Sep13 135:45.19 [pagedaemon]
        root      12  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [ng_queue]
        root      13  0.0  0.0     0     8  ??  DL   26Sep13 1070:52.75 [yarrow]
        root      14  0.0  0.0     0    64  ??  DL   26Sep13   4:21.52 [usb]
        root      15  0.0  0.0     0     8  ??  DL   26Sep13   0:00.44 [vmdaemon]
        root      16  0.0  0.0     0     8  ??  DL   26Sep13   0:31.04 [idlepoll]
        root      17  0.0  0.0     0     8  ??  DL   26Sep13   0:00.34 [pagezero]
        root      18  0.0  0.0     0     8  ??  DL   26Sep13   1:45.13 [bufdaemon]
        root      19  0.0  0.0     0     8  ??  DL   26Sep13   1:42.96 [vnlru]
        root      20  0.0  0.0     0     8  ??  DL   26Sep13  13:54.14 [syncer]
        root      21  0.0  0.0     0     8  ??  DL   26Sep13   1:54.32 [softdepflush]
        root      44  0.0  0.0     0     8  ??  DL   26Sep13   2:31.02 [md0]
        root      49  0.0  0.0     0     8  ??  DL   26Sep13  42:32.68 [md1]
        root     320  0.0  0.3  3936   672  ??  Is   26Sep13   0:00.11 /sbin/devd
        root    1746  0.0  0.3  1512   772  ??  IN    9:01AM   0:00.00 sleep 60
        root    7105  0.0  3.5 13036  8328  ??  SN   Tue12AM   1:50.80 /usr/sbin/tcpdump -s 256 -v -S -l -n -e -ttt -i pflog0
        root    7316  0.0  0.2  3264   580  ??  SN   Tue12AM   3:05.05 logger -t pf -p local0.info
        root   10090  0.0  0.4  5284   940  ??  Is   26Sep13   0:00.02 /usr/sbin/sshd
        root   10260  0.0  0.2  3480   584  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
        root   12250  0.0  0.8  8036  2008  ??  Ss    8:45AM   0:00.48 sshd: admin@pts/0 (sshd)
        root   15511  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   17614  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   19285  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   22304  0.0  0.2  3384   500  ??  Is   26Sep13   0:06.30 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.conf
        root   22716  0.0  0.0  3276     0  ??  IWs  -         0:00.00 /usr/local/sbin/qstats -p /var/run/qstats.pid
        root   27886  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   28812  0.0  0.3  3324   824  ??  INs  Wed08AM   0:00.17 dhclient: vr2 [priv] (dhclient)
        root   28889  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   29800  0.0  0.3  3412   784  ??  Ss   26Sep13  68:59.77 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -f /var/etc/syslog.conf
        root   30621  0.0  0.2  3356   536  ??  Is   26Sep13   0:36.08 /usr/sbin/cron -s
        root   31622  0.0  0.5  3352  1124  ??  INs  Sat02PM 101:46.00 /usr/local/sbin/check_reload_status
        root   31959  0.0  0.0  3352     0  ??  IWN  -         0:00.00 check_reload_status: Monitoring daemon of check_reload_status
        root   33099  0.0  1.0  6980  2436  ??  S    26Sep13  12:24.40 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
        root   33128  0.0  0.1 24436   344  ??  Is   26Sep13   0:00.61 /usr/local/bin/php
        root   33841  0.0  0.0 24436     0  ??  IWs  -         0:00.00 /usr/local/bin/php
        _dhcp  35784  0.0  0.4  3324   960  ??  SNs  Wed08AM   0:07.60 dhclient: vr2 (dhclient)
        dhcpd  42173  0.0  2.5 11456  5968  ??  Is   Fri02PM   0:34.97 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.p
        root   42891  0.0  2.9 30708  7036  ??  I     8:50AM   0:19.96 /usr/local/bin/php
        root   49780  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
        root   49928  0.0  0.1  3264   244  ??  I    26Sep13   0:29.68 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
        root   50095  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /etc/rc.expireaccounts
        root   50711  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.expireaccounts  (minicron)
        root   50841  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /etc/rc.update_alias_url_data
        root   50977  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.update_alias_url_data  (minicron)
        root   53184  0.0  0.3  3480   832  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
        root   57050  0.0  0.8  8036  2020  ??  Ss    8:46AM   0:00.46 sshd: admin@pts/1 (sshd)
        root   59543  0.0  0.5  3324  1128  ??  Is   Fri12PM   0:00.01 dhclient: vr1 [priv] (dhclient)
        _dhcp  64959  0.0  0.5  3324  1268  ??  Ss   Fri12PM   0:03.87 dhclient: vr1 (dhclient)
        root   67059  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   67915  0.0  0.3  3264   724  ??  Is   Fri12PM   0:01.09 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d key.lan -p /var/run/dnsmasq.pid -h /var/et
        root   68923  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.93 <defunct>nobody 69584  0.0  1.0  5512  2392  ??  S    Fri12PM   0:51.72 [dnsmasq]
        root   71162  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   72926  0.0  8.1 32628 19536  ??  IN    8:54AM   0:01.13 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   73060  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   74771  0.0  0.3  3644   640  ??  IN   Fri12PM   1:31.58 /bin/sh /var/db/rrd/updaterrd.sh
        root   77144  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   78118  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   79237  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
        root   81062  0.0  0.9 24436  2116  ??  I     8:51AM   0:00.00 /usr/local/bin/php
        root   89469  0.0  2.6  6280  6300  ??  SNs  Fri06AM   0:28.32 /usr/local/bin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
        root   90300  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.22 <defunct>root   52851  0.0  0.0  3720     0  u0  IWs  -         0:00.00 login [pam] (login)
        root   53321  0.0  0.0  3644     0  u0  IW   -         0:00.00 -sh (sh)
        root   54058  0.0  0.2  3644   440  u0  I+   26Sep13   0:00.02 /bin/sh /etc/rc.initial
        root   12597  0.0  0.0  3644     0   0  IWs  -         0:00.00 /bin/sh /etc/rc.initial
        root   23302  0.0  0.0  4696     0   0  IW   -         0:00.00 /bin/tcsh
        root   56804  0.0  0.8  3660  1828   0  S+    8:46AM   0:02.90 top
        root    2131  0.0  0.5  3404  1240   1  R+    9:01AM   0:00.01 ps aux
        root   57546  0.0  0.0  3644     0   1  IWs  -         0:00.00 /bin/sh /etc/rc.initial
        root   60441  0.0  0.9  4696  2248   1  S     8:46AM   0:00.09 /bin/tcsh</defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct> 
        
        1 Reply Last reply Reply Quote 0
        • A
          arriflex
          last edited by

          @bryan.paradis:

          Can you post your Status -> System Logs -> Gateway ? It would help to see how often you are seeing the alarms generated.

          This is a few minutes of my log utilizing the default Gateway settings with apinger running and saturated upstream bandwidth on the secondary WAN; seems to fire around three times per minute:

          Mar 3 09:06:30	apinger: Starting Alarm Pinger, apinger(17477)
          Mar 3 09:06:40	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:06:45	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:06:59	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:03	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:14	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:21	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:32	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:38	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:48	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:07:55	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:06	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:12	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:22	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:28	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:39	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:45	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:08:55	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:01	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:11	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:18	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:29	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:34	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:45	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:09:51	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:01	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:07	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:17	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:23	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:33	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:40	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:50	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:10:56	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:11:07	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          Mar 3 09:11:13	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
          
          1 Reply Last reply Reply Quote 0
          • B
            bryan.paradis
            last edited by

            @arriflex:

            @bryan.paradis:

            Could you make your terminal wider so we can see the full command also if you paste the full output of all processes.

            Trying killing apinger and testing again? Do you see the same issue?

            With apinger running, default gateway settings (WAN_STUDIO_DHCP is my secondary WAN) and saturated upstream on secondary WAN:

            USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
            root      10 23.0  0.0     0     8  ??  RL   26Sep13 214570:47.75 [idle]
            root   31622 20.0  0.5  3352  1124  ??  RNs  Sat02PM 101:23.76 /usr/local/sbin/check_reload_status
            root   46807 20.0  6.9 29612 16756  ??  RN    8:53AM   0:00.82 /usr/local/bin/php -f /etc/rc.newipsecdns
            root   47089 20.0  6.8 28552 16500  ??  RN    8:53AM   0:00.79 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   46727 19.0  6.6 28604 15868  ??  RN    8:53AM   0:00.72 /usr/local/bin/php -f /etc/rc.dyndns.update WAN_STUDIO_DHCP
            root   47057 19.0  6.9 29612 16724  ??  RN    8:53AM   0:00.82 /usr/local/bin/php -f /etc/rc.openvpn WAN_STUDIO_DHCP
            root      11  9.9  0.0     0   104  ??  WL   26Sep13 2586:06.04 [intr]
            root   45930  5.0  0.0     0     0  ??  Z<    8:52AM   0:01.14 <defunct>root       9  1.0  0.0     0     8  ??  DL   26Sep13 135:44.08 [pagedaemon]
            root       0  0.0  0.0     0    72  ??  DLs  26Sep13  31:40.50 [kernel]
            root       1  0.0  0.0  1888   104  ??  RLs  26Sep13   0:10.52 /sbin/init --
            root       2  0.0  0.0     0     8  ??  DL   26Sep13   0:09.09 [g_event]
            root       3  0.0  0.0     0     8  ??  DL   26Sep13   4:19.84 [g_up]
            root       4  0.0  0.0     0     8  ??  DL   26Sep13   8:59.75 [g_down]
            root       5  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto]
            root       6  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto returns]
            root       7  0.0  0.0     0     8  ??  DL   26Sep13   8:34.62 [pfpurge]
            root       8  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [xpt_thrd]
            root      12  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [ng_queue]
            root      13  0.0  0.0     0     8  ??  DL   26Sep13 1070:47.50 [yarrow]
            root      14  0.0  0.0     0    64  ??  DL   26Sep13   4:21.51 [usb]
            root      15  0.0  0.0     0     8  ??  DL   26Sep13   0:00.44 [vmdaemon]
            root      16  0.0  0.0     0     8  ??  DL   26Sep13   0:31.04 [idlepoll]
            root      17  0.0  0.0     0     8  ??  DL   26Sep13   0:00.34 [pagezero]
            root      18  0.0  0.0     0     8  ??  DL   26Sep13   1:45.12 [bufdaemon]
            root      19  0.0  0.0     0     8  ??  DL   26Sep13   1:42.96 [vnlru]
            root      20  0.0  0.0     0     8  ??  DL   26Sep13  13:54.07 [syncer]
            root      21  0.0  0.0     0     8  ??  DL   26Sep13   1:54.32 [softdepflush]
            root      44  0.0  0.0     0     8  ??  DL   26Sep13   2:31.01 [md0]
            root      49  0.0  0.0     0     8  ??  DL   26Sep13  42:32.58 [md1]
            root     320  0.0  0.3  3936   672  ??  Is   26Sep13   0:00.11 /sbin/devd
            root    7105  0.0  3.4 13036  8252  ??  SN   Tue12AM   1:50.67 /usr/sbin/tcpdump -s 256 -v -S -l -n -e -ttt -i pflog0
            root    7316  0.0  0.2  3264   580  ??  SN   Tue12AM   3:04.83 logger -t pf -p local0.info
            root   10090  0.0  0.4  5284   940  ??  Is   26Sep13   0:00.02 /usr/sbin/sshd
            root   10260  0.0  0.2  3480   584  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
            root   12250  0.0  0.8  8036  2008  ??  Ss    8:45AM   0:00.39 sshd: admin@pts/0 (sshd)
            root   15511  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   17614  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   19285  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   19966  0.0  0.5  3264  1264  ??  Ss    8:51AM   0:00.05 /usr/local/sbin/apinger -c /var/etc/apinger.conf
            root   20286  0.0  0.5  4556  1236  ??  I     8:51AM   0:00.02 /usr/local/bin/rrdtool -
            root   22304  0.0  0.2  3384   500  ??  Is   26Sep13   0:06.30 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.conf
            root   22716  0.0  0.0  3276     0  ??  IWs  -         0:00.00 /usr/local/sbin/qstats -p /var/run/qstats.pid
            root   27886  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   28812  0.0  0.3  3324   824  ??  INs  Wed08AM   0:00.17 dhclient: vr2 [priv] (dhclient)
            root   28889  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   29800  0.0  0.3  3412   784  ??  Ss   26Sep13  68:59.23 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -f /var/etc/syslog.conf
            root   30621  0.0  0.2  3356   404  ??  Ss   26Sep13   0:36.08 /usr/sbin/cron -s
            root   31959  0.0  0.0  3352     0  ??  IWN  -         0:00.00 check_reload_status: Monitoring daemon of check_reload_status
            root   33099  0.0  1.0  6980  2444  ??  S    26Sep13  12:24.38 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
            root   33128  0.0  0.1 24436   344  ??  Is   26Sep13   0:00.61 /usr/local/bin/php
            root   33841  0.0  0.0 24436     0  ??  IWs  -         0:00.00 /usr/local/bin/php
            _dhcp  35784  0.0  0.4  3324   952  ??  SNs  Wed08AM   0:07.59 dhclient: vr2 (dhclient)
            dhcpd  42173  0.0  2.4 11456  5856  ??  Ss   Fri02PM   0:34.93 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.p
            root   42891  0.0  3.0 30708  7248  ??  I     8:50AM   0:19.96 /usr/local/bin/php
            root   45336  0.0  0.3  1512   768  ??  IN    8:52AM   0:00.00 sleep 60
            root   49780  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
            root   49928  0.0  0.1  3264   232  ??  I    26Sep13   0:29.68 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
            root   50095  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /etc/rc.expireaccounts
            root   50711  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.expireaccounts  (minicron)
            root   50841  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /etc/rc.update_alias_url_data
            root   50977  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.update_alias_url_data  (minicron)
            root   53184  0.0  0.3  3480   832  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
            root   57050  0.0  0.8  8036  2036  ??  Ss    8:46AM   0:00.41 sshd: admin@pts/1 (sshd)
            root   59543  0.0  0.5  3324  1128  ??  Is   Fri12PM   0:00.01 dhclient: vr1 [priv] (dhclient)
            _dhcp  64959  0.0  0.5  3324  1268  ??  Ss   Fri12PM   0:03.86 dhclient: vr1 (dhclient)
            root   67059  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   67915  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d key.lan -p /var/run/dnsmasq.pid -h /var/et
            root   68923  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.93 <defunct>nobody 69584  0.0  1.0  5512  2328  ??  S    Fri12PM   0:51.65 [dnsmasq]
            root   71162  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   73060  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   74771  0.0  0.2  3644   588  ??  IN   Fri12PM   1:31.40 /bin/sh /var/db/rrd/updaterrd.sh
            root   77144  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   78118  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   79237  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   81062  0.0  0.9 24436  2116  ??  I     8:51AM   0:00.00 /usr/local/bin/php
            root   89469  0.0  2.6  6280  6300  ??  SNs  Fri06AM   0:28.27 /usr/local/bin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
            root   90300  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.22 <defunct>root   52851  0.0  0.0  3720     0  u0  IWs  -         0:00.00 login [pam] (login)
            root   53321  0.0  0.0  3644     0  u0  IW   -         0:00.00 -sh (sh)
            root   54058  0.0  0.2  3644   440  u0  I+   26Sep13   0:00.02 /bin/sh /etc/rc.initial
            root   12597  0.0  0.0  3644     0   0  IWs  -         0:00.00 /bin/sh /etc/rc.initial
            root   23302  0.0  0.0  4696     0   0  IW   -         0:00.00 /bin/tcsh
            root   56804  0.0  0.8  3660  1828   0  S+    8:46AM   0:01.38 top
            root   47353  0.0  0.5  3404  1244   1  R+    8:53AM   0:00.01 ps aux
            root   57546  0.0  0.0  3644     0   1  IWs  -         0:00.00 /bin/sh /etc/rc.initial
            root   60441  0.0  0.9  4696  2164   1  S     8:46AM   0:00.07 /bin/tcsh</defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct> 
            

            killall apinger:

            USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
            root      10 91.0  0.0     0     8  ??  RL   26Sep13 214577:21.24 [idle]
            root      11  9.9  0.0     0   104  ??  WL   26Sep13 2586:40.74 [intr]
            root       0  0.0  0.0     0    72  ??  DLs  26Sep13  31:40.88 [kernel]
            root       1  0.0  0.0  1888   104  ??  ILs  26Sep13   0:10.52 /sbin/init --
            root       2  0.0  0.0     0     8  ??  DL   26Sep13   0:09.09 [g_event]
            root       3  0.0  0.0     0     8  ??  DL   26Sep13   4:19.84 [g_up]
            root       4  0.0  0.0     0     8  ??  DL   26Sep13   8:59.76 [g_down]
            root       5  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto]
            root       6  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [crypto returns]
            root       7  0.0  0.0     0     8  ??  DL   26Sep13   8:34.64 [pfpurge]
            root       8  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [xpt_thrd]
            root       9  0.0  0.0     0     8  ??  DL   26Sep13 135:45.19 [pagedaemon]
            root      12  0.0  0.0     0     8  ??  DL   26Sep13   0:00.00 [ng_queue]
            root      13  0.0  0.0     0     8  ??  DL   26Sep13 1070:52.75 [yarrow]
            root      14  0.0  0.0     0    64  ??  DL   26Sep13   4:21.52 [usb]
            root      15  0.0  0.0     0     8  ??  DL   26Sep13   0:00.44 [vmdaemon]
            root      16  0.0  0.0     0     8  ??  DL   26Sep13   0:31.04 [idlepoll]
            root      17  0.0  0.0     0     8  ??  DL   26Sep13   0:00.34 [pagezero]
            root      18  0.0  0.0     0     8  ??  DL   26Sep13   1:45.13 [bufdaemon]
            root      19  0.0  0.0     0     8  ??  DL   26Sep13   1:42.96 [vnlru]
            root      20  0.0  0.0     0     8  ??  DL   26Sep13  13:54.14 [syncer]
            root      21  0.0  0.0     0     8  ??  DL   26Sep13   1:54.32 [softdepflush]
            root      44  0.0  0.0     0     8  ??  DL   26Sep13   2:31.02 [md0]
            root      49  0.0  0.0     0     8  ??  DL   26Sep13  42:32.68 [md1]
            root     320  0.0  0.3  3936   672  ??  Is   26Sep13   0:00.11 /sbin/devd
            root    1746  0.0  0.3  1512   772  ??  IN    9:01AM   0:00.00 sleep 60
            root    7105  0.0  3.5 13036  8328  ??  SN   Tue12AM   1:50.80 /usr/sbin/tcpdump -s 256 -v -S -l -n -e -ttt -i pflog0
            root    7316  0.0  0.2  3264   580  ??  SN   Tue12AM   3:05.05 logger -t pf -p local0.info
            root   10090  0.0  0.4  5284   940  ??  Is   26Sep13   0:00.02 /usr/sbin/sshd
            root   10260  0.0  0.2  3480   584  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
            root   12250  0.0  0.8  8036  2008  ??  Ss    8:45AM   0:00.48 sshd: admin@pts/0 (sshd)
            root   15511  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   17614  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   19285  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   22304  0.0  0.2  3384   500  ??  Is   26Sep13   0:06.30 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.conf
            root   22716  0.0  0.0  3276     0  ??  IWs  -         0:00.00 /usr/local/sbin/qstats -p /var/run/qstats.pid
            root   27886  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.15 <defunct>root   28812  0.0  0.3  3324   824  ??  INs  Wed08AM   0:00.17 dhclient: vr2 [priv] (dhclient)
            root   28889  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   29800  0.0  0.3  3412   784  ??  Ss   26Sep13  68:59.77 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -f /var/etc/syslog.conf
            root   30621  0.0  0.2  3356   536  ??  Is   26Sep13   0:36.08 /usr/sbin/cron -s
            root   31622  0.0  0.5  3352  1124  ??  INs  Sat02PM 101:46.00 /usr/local/sbin/check_reload_status
            root   31959  0.0  0.0  3352     0  ??  IWN  -         0:00.00 check_reload_status: Monitoring daemon of check_reload_status
            root   33099  0.0  1.0  6980  2436  ??  S    26Sep13  12:24.40 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
            root   33128  0.0  0.1 24436   344  ??  Is   26Sep13   0:00.61 /usr/local/bin/php
            root   33841  0.0  0.0 24436     0  ??  IWs  -         0:00.00 /usr/local/bin/php
            _dhcp  35784  0.0  0.4  3324   960  ??  SNs  Wed08AM   0:07.60 dhclient: vr2 (dhclient)
            dhcpd  42173  0.0  2.5 11456  5968  ??  Is   Fri02PM   0:34.97 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.p
            root   42891  0.0  2.9 30708  7036  ??  I     8:50AM   0:19.96 /usr/local/bin/php
            root   49780  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
            root   49928  0.0  0.1  3264   244  ??  I    26Sep13   0:29.68 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
            root   50095  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /etc/rc.expireaccounts
            root   50711  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.expireaccounts  (minicron)
            root   50841  0.0  0.0  3264     0  ??  IWs  -         0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /etc/rc.update_alias_url_data
            root   50977  0.0  0.0  3264     0  ??  IW   -         0:00.00 minicron: helper /etc/rc.update_alias_url_data  (minicron)
            root   53184  0.0  0.3  3480   832  ??  Is   26Sep13   0:00.23 /usr/local/sbin/sshlockout_pf 15
            root   57050  0.0  0.8  8036  2020  ??  Ss    8:46AM   0:00.46 sshd: admin@pts/1 (sshd)
            root   59543  0.0  0.5  3324  1128  ??  Is   Fri12PM   0:00.01 dhclient: vr1 [priv] (dhclient)
            _dhcp  64959  0.0  0.5  3324  1268  ??  Ss   Fri12PM   0:03.87 dhclient: vr1 (dhclient)
            root   67059  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   67915  0.0  0.3  3264   724  ??  Is   Fri12PM   0:01.09 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d key.lan -p /var/run/dnsmasq.pid -h /var/et
            root   68923  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.93 <defunct>nobody 69584  0.0  1.0  5512  2392  ??  S    Fri12PM   0:51.72 [dnsmasq]
            root   71162  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.14 <defunct>root   72926  0.0  8.1 32628 19536  ??  IN    8:54AM   0:01.13 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   73060  0.0  0.0     0     0  ??  Z<   Sat02PM   0:00.95 <defunct>root   74771  0.0  0.3  3644   640  ??  IN   Fri12PM   1:31.58 /bin/sh /var/db/rrd/updaterrd.sh
            root   77144  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   78118  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   79237  0.0  0.0 32628     0  ??  IWN  -         0:00.00 /usr/local/bin/php -f /etc/rc.filter_configure_sync
            root   81062  0.0  0.9 24436  2116  ??  I     8:51AM   0:00.00 /usr/local/bin/php
            root   89469  0.0  2.6  6280  6300  ??  SNs  Fri06AM   0:28.32 /usr/local/bin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
            root   90300  0.0  0.0     0     0  ??  Z<   Sat02PM   0:01.22 <defunct>root   52851  0.0  0.0  3720     0  u0  IWs  -         0:00.00 login [pam] (login)
            root   53321  0.0  0.0  3644     0  u0  IW   -         0:00.00 -sh (sh)
            root   54058  0.0  0.2  3644   440  u0  I+   26Sep13   0:00.02 /bin/sh /etc/rc.initial
            root   12597  0.0  0.0  3644     0   0  IWs  -         0:00.00 /bin/sh /etc/rc.initial
            root   23302  0.0  0.0  4696     0   0  IW   -         0:00.00 /bin/tcsh
            root   56804  0.0  0.8  3660  1828   0  S+    8:46AM   0:02.90 top
            root    2131  0.0  0.5  3404  1240   1  R+    9:01AM   0:00.01 ps aux
            root   57546  0.0  0.0  3644     0   1  IWs  -         0:00.00 /bin/sh /etc/rc.initial
            root   60441  0.0  0.9  4696  2248   1  S     8:46AM   0:00.09 /bin/tcsh</defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct></defunct> 
            

            Good that is as expected

            @arriflex:

            @bryan.paradis:

            Can you post your Status -> System Logs -> Gateway ? It would help to see how often you are seeing the alarms generated.

            This is a few minutes of my log utilizing the default Gateway settings with apinger running and saturated upstream bandwidth on the secondary WAN; seems to fire around three times per minute:

            Mar 3 09:06:30	apinger: Starting Alarm Pinger, apinger(17477)
            Mar 3 09:06:40	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:06:45	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:06:59	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:03	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:14	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:21	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:32	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:38	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:48	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:07:55	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:06	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:12	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:22	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:28	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:39	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:45	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:08:55	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:01	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:11	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:18	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:29	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:34	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:45	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:09:51	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:01	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:07	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:17	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:23	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:33	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:40	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:50	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:10:56	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:11:07	apinger: ALARM: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            Mar 3 09:11:13	apinger: alarm canceled: WAN_STUDIO_DHCP(208.67.220.220) *** down ***
            

            Crazy its the down alarm. Wasn't expecting that but makes sense for the alarm actions. Do you run any traffic shaping? Can you ping the same target as apinger and do the same load on upstream and post results?

            Are you WAN connections the same? Do you see the same problem with saturated upstream on your main connection?

            1 Reply Last reply Reply Quote 0
            • A
              arriflex
              last edited by

              @bryan.paradis:

              Crazy its the down alarm. Wasn't expecting that but makes sense for the alarm actions. Do you run any traffic shaping? Can you ping the same target as apinger and do the same load on upstream and post results?

              Are you WAN connections the same? Do you see the same problem with saturated upstream on your main connection?

              I deleted my limiters when I started noticing issues, so no traffic shaping for these tests.

              I'm not sure I understand the ping test you're asking for. Do you want me to just post ping results sent from the secondary WAN during upstream saturation?

              WAN connections are not the same but come from the same company; I do see the same issue on both. Primary is a 20/2 cable connection from the same company as the secondary with a 60/5. However, interestingly the modems are assigned different gateways by the cable company's DHCP. I'd love to load balance the upstream so I can get this Backblaze initial seed over with, but I suspect the Backblaze client doesn't open multiple connections.

              1 Reply Last reply Reply Quote 0
              • B
                bryan.paradis
                last edited by

                @arriflex:

                @bryan.paradis:

                Crazy its the down alarm. Wasn't expecting that but makes sense for the alarm actions. Do you run any traffic shaping? Can you ping the same target as apinger and do the same load on upstream and post results?

                Are you WAN connections the same? Do you see the same problem with saturated upstream on your main connection?

                I deleted my limiters when I started noticing issues, so no traffic shaping for these tests.

                I'm not sure I understand the ping test you're asking for. Do you want me to just post ping results sent from the secondary WAN during upstream saturation?

                WAN connections are not the same but come from the same company; I do see the same issue on both. Primary is a 20/2 cable connection from the same company as the secondary with a 60/5. However, interestingly the modems are assigned different gateways by the cable company's DHCP. I'd love to load balance the upstream so I can get this Backblaze initial seed over with, but I suspect the Backblaze client doesn't open multiple connections.

                Yes just do a ping or winmtr to somewhere as you saturate. It would be good to see exactly what is going on. What are you using to saturate it anyway? I cannot for the life of me reproduce this here.

                1 Reply Last reply Reply Quote 0
                • A
                  arriflex
                  last edited by

                  I'm using the Mac Backblaze client to saturate the upstream bandwidth. Backblaze is a cloud based backup host. The Mac is connected to the Alix router through two unmanaged switches, and a firewall rule sends it through the secondary WAN. Everything else on the network goes through the primary WAN using the catchall default firewall rule.

                  Interestingly, pinging Google and OpenDNS servers using the configurator's tool using the secondary WAN as the source is unsurprisingly ugly (>3000ms and plenty of packet loss); however, pinging the same servers from the Mac that's being routed through the firewall to the secondary WAN show reasonable pings and no packet loss.

                  1 Reply Last reply Reply Quote 0
                  • B
                    bryan.paradis
                    last edited by

                    @arriflex:

                    I'm using the Mac Backblaze client to saturate the upstream bandwidth. Backblaze is a cloud based backup host. The Mac is connected to the Alix router through two unmanaged switches, and a firewall rule sends it through the secondary WAN. Everything else on the network goes through the primary WAN using the catchall default firewall rule.

                    Interestingly, pinging Google and OpenDNS servers using the configurator's tool using the secondary WAN as the source is unsurprisingly ugly (>3000ms and plenty of packet loss); however, pinging the same servers from the Mac that's being routed through the firewall to the secondary WAN show reasonable pings and no packet loss.

                    Can you post your /var/etc/apinger.conf ?

                    1 Reply Last reply Reply Quote 0
                    • A
                      arriflex
                      last edited by

                      Why would the pfSense pingtimes from the secondary WAN be so much crazier than the ones from a LAN side client routed through the secondary WAN? Is the Mac providing priority to those packets before they get to the router?

                      This is the apinger.conf I'm using right now that is keeping php from going crazy on me:

                      # pfSense apinger configuration file. Automatically Generated!
                      
                      ## User and group the pinger should run as
                      user "root"
                      group "wheel"
                      
                      ## Mailer to use (default: "/usr/lib/sendmail -t")
                      #mailer "/var/qmail/bin/qmail-inject"
                      
                      ## Location of the pid-file (default: "/var/run/apinger.pid")
                      pid_file "/var/run/apinger.pid"
                      
                      ## Format of timestamp (%s macro) (default: "%b %d %H:%M:%S")
                      #timestamp_format "%Y%m%d%H%M%S"
                      
                      status {
                      	## File where the status information should be written to
                      	file "/var/run/apinger.status"
                      	## Interval between file updates
                      	## when 0 or not set, file is written only when SIGUSR1 is received
                      	interval 5s
                      }
                      
                      ########################################
                      # RRDTool status gathering configuration
                      # Interval between RRD updates
                      rrd interval 60s;
                      
                      ## These parameters can be overridden in a specific alarm configuration
                      alarm default {
                      	command on "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' "
                      	command off "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' "
                      	combine 10s
                      }
                      
                      ## "Down" alarm definition.
                      ## This alarm will be fired when target doesn't respond for 30 seconds.
                      alarm down "down" {
                      	time 10s
                      }
                      
                      ## "Delay" alarm definition.
                      ## This alarm will be fired when responses are delayed more than 200ms
                      ## it will be canceled, when the delay drops below 100ms
                      alarm delay "delay" {
                      	delay_low 200ms
                      	delay_high 500ms
                      }
                      
                      ## "Loss" alarm definition.
                      ## This alarm will be fired when packet loss goes over 20%
                      ## it will be canceled, when the loss drops below 10%
                      alarm loss "loss" {
                      	percent_low 10
                      	percent_high 20
                      }
                      
                      target default {
                      	## How often the probe should be sent
                      	interval 1s
                      
                      	## How many replies should be used to compute average delay
                      	## for controlling "delay" alarms
                      	avg_delay_samples 10
                      
                      	## How many probes should be used to compute average loss
                      	avg_loss_samples 50
                      
                      	## The delay (in samples) after which loss is computed
                      	## without this delays larger than interval would be treated as loss
                      	avg_loss_delay_samples 20
                      
                      	## Names of the alarms that may be generated for the target
                      	alarms "down","delay","loss"
                      
                      	## Location of the RRD
                      	#rrd file "/var/db/rrd/apinger-%t.rrd"
                      }
                      target "208.67.222.222" {
                      	description "WAN"
                      	srcip "174.134.xxx.xx"
                      	alarms override "loss","delay","down";
                      	rrd file "/var/db/rrd/WAN-quality.rrd"
                      }
                      
                      alarm loss "WAN_STUDIO_DHCPloss" {
                      	percent_low 30
                      	percent_high 40
                      }
                      alarm delay "WAN_STUDIO_DHCPdelay" {
                      	delay_low 4000ms
                      	delay_high 5000ms
                      }
                      alarm down "WAN_STUDIO_DHCPdown" {
                      	time 120s
                      }
                      target "208.67.220.220" {
                      	description "WAN_STUDIO_DHCP"
                      	srcip "174.135.xxx.xx"
                      	interval 20s
                      	alarms override "WAN_STUDIO_DHCPloss","WAN_STUDIO_DHCPdelay","WAN_STUDIO_DHCPdown";
                      	rrd file "/var/db/rrd/WAN_STUDIO_DHCP-quality.rrd"
                      }
                      

                      Whereas, this one is more the default configuration which causes the extra processes to take over:

                      # pfSense apinger configuration file. Automatically Generated!
                      
                      ## User and group the pinger should run as
                      user "root"
                      group "wheel"
                      
                      ## Mailer to use (default: "/usr/lib/sendmail -t")
                      #mailer "/var/qmail/bin/qmail-inject"
                      
                      ## Location of the pid-file (default: "/var/run/apinger.pid")
                      pid_file "/var/run/apinger.pid"
                      
                      ## Format of timestamp (%s macro) (default: "%b %d %H:%M:%S")
                      #timestamp_format "%Y%m%d%H%M%S"
                      
                      status {
                      	## File where the status information should be written to
                      	file "/var/run/apinger.status"
                      	## Interval between file updates
                      	## when 0 or not set, file is written only when SIGUSR1 is received
                      	interval 5s
                      }
                      
                      ########################################
                      # RRDTool status gathering configuration
                      # Interval between RRD updates
                      rrd interval 60s;
                      
                      ## These parameters can be overridden in a specific alarm configuration
                      alarm default {
                      	command on "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' "
                      	command off "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' "
                      	combine 10s
                      }
                      
                      ## "Down" alarm definition.
                      ## This alarm will be fired when target doesn't respond for 30 seconds.
                      alarm down "down" {
                      	time 10s
                      }
                      
                      ## "Delay" alarm definition.
                      ## This alarm will be fired when responses are delayed more than 200ms
                      ## it will be canceled, when the delay drops below 100ms
                      alarm delay "delay" {
                      	delay_low 200ms
                      	delay_high 500ms
                      }
                      
                      ## "Loss" alarm definition.
                      ## This alarm will be fired when packet loss goes over 20%
                      ## it will be canceled, when the loss drops below 10%
                      alarm loss "loss" {
                      	percent_low 10
                      	percent_high 20
                      }
                      
                      target default {
                      	## How often the probe should be sent
                      	interval 1s
                      
                      	## How many replies should be used to compute average delay
                      	## for controlling "delay" alarms
                      	avg_delay_samples 10
                      
                      	## How many probes should be used to compute average loss
                      	avg_loss_samples 50
                      
                      	## The delay (in samples) after which loss is computed
                      	## without this delays larger than interval would be treated as loss
                      	avg_loss_delay_samples 20
                      
                      	## Names of the alarms that may be generated for the target
                      	alarms "down","delay","loss"
                      
                      	## Location of the RRD
                      	#rrd file "/var/db/rrd/apinger-%t.rrd"
                      }
                      target "208.67.222.222" {
                      	description "WAN"
                      	srcip "174.134.xxx.xx"
                      	alarms override "loss","delay","down";
                      	rrd file "/var/db/rrd/WAN-quality.rrd"
                      }
                      
                      target "208.67.220.220" {
                      	description "WAN_STUDIO_DHCP"
                      	srcip "174.135.xxx.xx"
                      	alarms override "loss","delay","down";
                      	rrd file "/var/db/rrd/WAN_STUDIO_DHCP-quality.rrd"
                      }
                      
                      
                      1 Reply Last reply Reply Quote 0
                      • B
                        bryan.paradis
                        last edited by

                        I am not sure but this is getting interesting! May be some sort of priority thing. I won't be able to do too much more work as I am flying across the country to do some IT work.

                        1 Reply Last reply Reply Quote 0
                        • A
                          arriflex
                          last edited by

                          Safe travels. Feel free to assume that I'm making some rookie mistake and float it out there for me to check.

                          I rebooted the router and sent another client into the secondary WAN to test the ping through it and found the same erratic behaviour that performing the ping directly from the secondary WAN interface on the router was causing. Then I went back to the original client that was saturating that WAN and had previously not had issue with pings to find that now it was also consistent with the poor ping performance.

                          I am unable to replicate the state now where I had good ping times from the client that was saturating the upstream while seeing bad ones from the router on that interface. The good news is that they are all inline with each other at least!

                          arri

                          1 Reply Last reply Reply Quote 0
                          • B
                            bryan.paradis
                            last edited by

                            @arriflex:

                            Safe travels. Feel free to assume that I'm making some rookie mistake and float it out there for me to check.

                            I rebooted the router and sent another client into the secondary WAN to test the ping through it and found the same erratic behaviour that performing the ping directly from the secondary WAN interface on the router was causing. Then I went back to the original client that was saturating that WAN and had previously not had issue with pings to find that now it was also consistent with the poor ping performance.

                            I am unable to replicate the state now where I had good ping times from the client that was saturating the upstream while seeing bad ones from the router on that interface. The good news is that they are all inline with each other at least!

                            arri

                            Indeed. I missed two flights. I have hate airlines. Canceled that contract. Horrible day.

                            1. Raise ICMP priority
                            2. Lower interval or use combine to lower alarm repetitions
                            3. Traffic shape so you leave a bit of space for the ICMP
                            4. Try setting the down time higher.
                            1 Reply Last reply Reply Quote 0
                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              You might try a 2.1.1 snapshot that has many fixes for various things in it.
                              https://doc.pfsense.org/index.php/2.1.1_New_Features_and_Changes
                              The are still pre-release snapshots though so new bugs may be added.  ;)

                              Steve

                              1 Reply Last reply Reply Quote 0
                              • P
                                phil.davis
                                last edited by

                                The Mac is connected to the Alix router through two unmanaged switches, and a firewall rule sends it through the secondary WAN.

                                Maybe the rule is only for TCP/UDP and ICMP ping from the Mac is still going out the default gateway? That would explain the ping being so good. You could traceroute from Mac to an external host to see which way the ICMP is really going.
                                It does sound like the link is so saturated that the apinger monitoring is struggling to see any decent numbers and is deciding the link is bad.

                                As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                                If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                                1 Reply Last reply Reply Quote 0
                                • A
                                  arriflex
                                  last edited by

                                  @phil.davis:

                                  The Mac is connected to the Alix router through two unmanaged switches, and a firewall rule sends it through the secondary WAN.

                                  Maybe the rule is only for TCP/UDP and ICMP ping from the Mac is still going out the default gateway? That would explain the ping being so good. You could traceroute from Mac to an external host to see which way the ICMP is really going.
                                  It does sound like the link is so saturated that the apinger monitoring is struggling to see any decent numbers and is deciding the link is bad.

                                  You are correct, I noticed before the reboot that my rule was limited to TCP/UDP, and updated it to "any." That explains the new consistency. Nice catch!

                                  It's working fine with the ludicrous numbers for monitoring delays and decreasing the number of checks that gateway does. If I find the time, I'll give a snapshot a try as this is the box for doing that on.

                                  arri

                                  1 Reply Last reply Reply Quote 0
                                  • B
                                    bryan.paradis
                                    last edited by

                                    @arriflex:

                                    @phil.davis:

                                    The Mac is connected to the Alix router through two unmanaged switches, and a firewall rule sends it through the secondary WAN.

                                    Maybe the rule is only for TCP/UDP and ICMP ping from the Mac is still going out the default gateway? That would explain the ping being so good. You could traceroute from Mac to an external host to see which way the ICMP is really going.
                                    It does sound like the link is so saturated that the apinger monitoring is struggling to see any decent numbers and is deciding the link is bad.

                                    You are correct, I noticed before the reboot that my rule was limited to TCP/UDP, and updated it to "any." That explains the new consistency. Nice catch!

                                    It's working fine with the ludicrous numbers for monitoring delays and decreasing the number of checks that gateway does. If I find the time, I'll give a snapshot a try as this is the box for doing that on.

                                    arri

                                    I am not sure if its going to change anything. Turn off gateway monitor on that wan and then test with backblaze and ping out and see if you see a difference. There could be something going on where the added load on the box is degrading the connection further or something. If you get the same loss/latency without apinger running on that WAN then you are dealing with normal saturation? This will need to be fixed using one of the things listed in my above post.

                                    1 Reply Last reply Reply Quote 0
                                    • A
                                      arriflex
                                      last edited by

                                      @bryan.paradis:

                                      I am not sure if its going to change anything. Turn off gateway monitor on that wan and then test with backblaze and ping out and see if you see a difference. There could be something going on where the added load on the box is degrading the connection further or something. If you get the same loss/latency without apinger running on that WAN then you are dealing with normal saturation? This will need to be fixed using one of the things listed in my above post.

                                      I finally got around to trying this for you. With the gateway monitor off on the secondary wan, and fully saturated upstream traffic out of it, ping times from both the configurator page using the secondary wan gateway and from the client routed through the firewall (this time all traffic not just tcp/udp;) are from two to five seconds.

                                      I think it's clearly just  a congested interface the way Backblaze saturates it through their SSL tunnel.

                                      1 Reply Last reply Reply Quote 0
                                      • B
                                        bryan.paradis
                                        last edited by

                                        @arriflex:

                                        @bryan.paradis:

                                        I am not sure if its going to change anything. Turn off gateway monitor on that wan and then test with backblaze and ping out and see if you see a difference. There could be something going on where the added load on the box is degrading the connection further or something. If you get the same loss/latency without apinger running on that WAN then you are dealing with normal saturation? This will need to be fixed using one of the things listed in my above post.

                                        I finally got around to trying this for you. With the gateway monitor off on the secondary wan, and fully saturated upstream traffic out of it, ping times from both the configurator page using the secondary wan gateway and from the client routed through the firewall (this time all traffic not just tcp/udp;) are from two to five seconds.

                                        I think it's clearly just  a congested interface the way Backblaze saturates it through their SSL tunnel.

                                        Is that same, worse or better than results with apinger on? Would have been interesting if apinger perpetuates the problem with the highly increased load.

                                        1 Reply Last reply Reply Quote 0
                                        • A
                                          AIMS-Informatique
                                          last edited by

                                          Having the same CPU rising behaviour on our PCEngine ALIX with Nano Intall (2.1.4 stable).

                                          In France, we experiment loads of trouble over xDSL connections. Mainly loss, caused even by a bad synchro or by a user that get the line to saturate because of big downloads / uploads.
                                          This causes pf to experience a hight CPU load when GW is considered as offline by PF.

                                          We did the trick of gateway polling in "Routing->Gateways->Edit gateway" :

                                          - Advanced->Packet Loss Thresholds = 20% / 40% (default 10% / 20%)
                                            - Probe Interval = 5s (default = 1s)
                                            - Down = 60s (default = 10s)

                                          For what we experienced so far with those values is a better responsivness of PHP UI, and RRD graph shows a fall of CPU load. Still having those settings for test for few hours on PF that are experimenting DSL sync difficulties.

                                          It looks good so far, and looks like increasing apinger tests and faillure decision, gives the ALIX more time to execute what it has to execute, and CPU graph falls dramatically (so far…).

                                          Sounds to be a good and quick idea to play with the values above.

                                          1 Reply Last reply Reply Quote 0
                                          • First post
                                            Last post
                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.