Does multi-WAN not allow DHCP on both WAN interfaces?



  • Hi all,

    I'm having difficulty getting a dual-WAN setup working on my pfSense box.
    The original WAN is assigned to em4 (named WAN) and the second WAN to em5 (named WAN2). Each is attached to a GPON from a different ISP, so the two cannot share the same gateway.

    1. Plug ISP1 into WAN: all good.
    2. Plug ISP2 into WAN2 (with WAN still connected): no good. The link LED on em5 goes off after a few seconds, and /tmp/em5_output says "dhclient already running… giving up".
    3. After plugging in WAN2, the system load becomes fairly high. In "top" I can see "check_reload_status" using 100% CPU time, and most operations stop working.
    4. Unplug WAN2: the pfSense box appears to work again, but the system load does not go down.

    
    [2.2.1-RELEASE][root@test]/var/log: ps uaxwwww
    USER      PID  %CPU %MEM    VSZ   RSS TT  STAT STARTED      TIME COMMAND
    root       11 100.8  0.0      0    32  -  RL    6:11AM 566:18.17 [idle]
    root      305 100.0  0.1  19032  2828  -  RNs   6:12AM 316:39.29 /usr/local/sbin/check_reload_status
    root        0   0.0  0.0      0   352  -  DLs   6:11AM   1:32.29 [kernel]
    root        1   0.0  0.0   9472   852  -  ILs   6:11AM   0:00.08 /sbin/init --
    root        2   0.0  0.0      0    16  -  DL    6:11AM   0:00.00 [crypto]
    root        3   0.0  0.0      0    16  -  DL    6:11AM   0:00.00 [crypto returns]
    root        4   0.0  0.0      0    32  -  DL    6:11AM   0:00.00 [cam]
    root        5   0.0  0.0      0    16  -  DL    6:11AM   0:11.02 [pf purge]
    root        6   0.0  0.0      0    16  -  DL    6:11AM   0:00.00 [sctp_iterator]
    root        7   0.0  0.0      0    16  -  DL    6:11AM   0:00.02 [enc_daemon0]
    root        8   0.0  0.0      0    16  -  DL    6:11AM   0:00.64 [pagedaemon]
    root        9   0.0  0.0      0    16  -  DL    6:11AM   0:00.00 [vmdaemon]
    root       10   0.0  0.0      0    16  -  DL    6:11AM   0:00.00 [audit]
    root       12   0.0  0.0      0   256  -  WL    6:11AM   2:01.48 [intr]
    root       13   0.0  0.0      0    32  -  DL    6:11AM   0:00.00 [ng_queue]
    root       14   0.0  0.0      0    48  -  DL    6:11AM   0:00.33 [geom]
    root       15   0.0  0.0      0    16  -  DL    6:11AM   0:05.95 [rand_harvestq]
    root       16   0.0  0.0      0   128  -  DL    6:11AM   0:02.66 [usb]
    root       17   0.0  0.0      0    16  -  DL    6:11AM   0:00.90 [acpi_thermal]
    root       18   0.0  0.0      0    16  -  DL    6:11AM   0:00.08 [acpi_cooling1]
    root       19   0.0  0.0      0    16  -  DL    6:11AM   0:00.00 [pagezero]
    root       20   0.0  0.0      0    16  -  DL    6:11AM   0:00.03 [idlepoll]
    root       21   0.0  0.0      0    16  -  DL    6:11AM   0:00.10 [bufdaemon]
    root       22   0.0  0.0      0    16  -  DL    6:11AM   0:00.99 [syncer]
    root       23   0.0  0.0      0    16  -  DL    6:11AM   0:00.10 [vnlru]
    root       52   0.0  0.0      0    16  -  DL    6:11AM   0:00.25 [md0]
    root       57   0.0  0.0      0    16  -  DL    6:11AM   0:02.68 [md1]
    root      307   0.0  0.1  19032  2384  -  IN    6:12AM   0:00.00 check_reload_status: Monitoring daemon of check_reload_status
    root      319   0.0  0.2  13164  4384  -  Is    6:12AM   0:00.44 /sbin/devd -q
    root     1510   0.0  0.1  14756  2312  -  Is    7:09AM   0:00.02 /usr/local/sbin/sshlockout_pf 15
    root     3851   0.0  0.1  14664  2440  -  Ss    7:21AM   0:02.47 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /var/etc/syslog.conf
    root     7728   0.0  0.1   8312  1960  -  IN    1:37PM   0:00.00 sleep 60
    root     8053   0.0  0.3  32428  5216  -  Is    6:12AM   0:00.01 /usr/sbin/sshd
    root     8504   0.0  0.1  14756  2216  -  Is    6:12AM   0:00.00 /usr/local/sbin/sshlockout_pf 15
    root     9788   0.0  1.5 233404 30720  -  Ss    8:21AM   0:00.21 php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
    root    12062   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    12211   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    12243   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    12452   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    12896   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    13004   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    13057   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    13219   0.0  0.4  55728  7060  -  S     1:37PM   0:00.00 /var/bandwidthd/bandwidthd
    root    13871   0.0  0.9  28172 18076  -  Is    1:37PM   0:00.03 /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
    root    17160   0.0  0.3  55632  6036  -  Ss    1:07PM   0:00.19 sshd: root@pts/0 (sshd)
    root    19398   0.0  0.1  16812  2328  -  Ss    6:12AM   0:04.23 /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
    root    19780   0.0  0.1  18788  2360  -  Is    6:12AM   0:00.05 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.conf
    root    46844   0.0  0.1  12464  2304  -  Ss    7:52AM   0:02.20 /usr/local/sbin/apinger -c /var/etc/apinger.conf
    root    46870   0.0  0.2  28352  3604  -  I     7:52AM   0:00.11 rrdtool -
    root    51739   0.0  0.1  14756  2312  -  Is    7:35AM   0:00.02 /usr/local/sbin/sshlockout_pf 15
    root    52743   0.0  0.4  50796  7232  -  S     8:21AM   0:00.57 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
    root    55486   0.0  0.1  14548  2052  -  Ss    6:12AM   0:03.17 /usr/sbin/powerd -b hadp -a hadp -n hadp
    dhcpd   63710   0.0  0.6  24820 12660  -  Ss    1:01PM   0:00.04 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0
    root    68486   0.0  0.1   8312  1960  -  IN    1:37PM   0:00.00 sleep 60
    root    69753   0.0  0.1  14704  2388  -  Is    1:37PM   0:00.00 dhclient: em4 [priv] (dhclient)
    root    74747   0.0  0.1  16672  2408  -  Ss    6:12AM   0:00.10 /usr/sbin/cron -s
    _dhcp   76226   0.0  0.1  14704  2524  -  Is    1:37PM   0:00.00 dhclient: em4 (dhclient)
    root    78679   0.0  0.1  12412  1976  -  Is    6:12AM   0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
    root    79327   0.0  0.1  12412  1984  -  I     6:12AM   0:00.02 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
    root    79440   0.0  0.1  12412  1976  -  Is    6:12AM   0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts
    root    79664   0.0  0.1  12412  1988  -  I     6:12AM   0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts  (minicron)
    root    79958   0.0  0.1  12412  1976  -  Is    6:12AM   0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data
    root    80491   0.0  0.1  12412  1988  -  I     6:12AM   0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data  (minicron)
    root    80572   0.0  2.3 237500 45984  -  I     1:37PM   0:00.02 php-fpm: pool lighty (php-fpm)
    root    81898   0.0  0.1  14756  2280  -  Is    6:12AM   0:00.00 /usr/local/sbin/sshlockout_pf 15
    unbound 82235   0.0  0.7  42928 13324  -  Ss    1:37PM   0:00.03 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
    root    86616   0.0  0.1  17144  2696  -  IN    1:37PM   0:00.01 /bin/sh /var/db/rrd/updaterrd.sh
    root    27083   0.0  0.1  43576  2728 u0  Is    7:48AM   0:00.02 login [pam] (login)
    root    27116   0.0  0.1  17144  2776 u0  I     7:48AM   0:00.01 -sh (sh)
    root    27299   0.0  0.1  17144  2660 u0  I+    7:48AM   0:00.01 /bin/sh /etc/rc.initial
    root    81814   0.0  0.1  43576  2696 v0  Is    6:12AM   0:00.01 login [pam] (login)
    root    82272   0.0  0.1  17144  2724 v0  I     6:12AM   0:00.00 -sh (sh)
    root    82606   0.0  0.1  17144  2628 v0  I+    6:12AM   0:00.00 /bin/sh /etc/rc.initial
    root    14426   0.0  0.1  18816  2404  0  R+    1:38PM   0:00.00 ps uaxwwww
    root    26450   0.0  0.1  17144  2764  0  Is    1:07PM   0:00.00 -sh (sh)
    root    26641   0.0  0.1  17144  2664  0  I     1:07PM   0:00.00 /bin/sh /etc/rc.initial
    root    30266   0.0  0.2  17484  3828  0  S     1:07PM   0:00.04 /bin/tcsh
    
    

    This box is newly set up and only 1-2 clients connect during tests, so those clients cannot be causing such a high load.
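
    For anyone who wants to pick the busy processes out of a long listing like the one above, here is a minimal Python sketch. It assumes the 11-column "ps uaxwwww" layout shown in the dump; the function name busy_processes and the 50% threshold are my own illustrative choices, not anything pfSense provides. It skips the [idle] kernel thread, which only accounts for unused CPU time.

```python
# Sample rows copied from the ps listing above (header plus three processes).
PS_SAMPLE = """\
USER      PID  %CPU %MEM    VSZ   RSS TT  STAT STARTED      TIME COMMAND
root       11 100.8  0.0      0    32  -  RL    6:11AM 566:18.17 [idle]
root      305 100.0  0.1  19032  2828  -  RNs   6:12AM 316:39.29 /usr/local/sbin/check_reload_status
root    46844   0.0  0.1  12464  2304  -  Ss    7:52AM   0:02.20 /usr/local/sbin/apinger -c /var/etc/apinger.conf
"""


def busy_processes(ps_text, threshold=50.0):
    """Return (pid, cpu, command) for rows at or above threshold, skipping [idle].

    Hypothetical helper: assumes the first line is the ps header and that
    every data row has the 11 whitespace-separated columns shown above.
    """
    rows = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)      # COMMAND keeps its own spaces
        if len(fields) < 11:
            continue                       # blank or truncated line
        pid, cpu, command = int(fields[1]), float(fields[2]), fields[10]
        if command == "[idle]":
            continue                       # idle thread is not real load
        if cpu >= threshold:
            rows.append((pid, cpu, command))
    return rows


print(busy_processes(PS_SAMPLE))
```

    On the sample rows this leaves only check_reload_status (PID 305), which matches what "top" showed.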

    But looking at /tmp/em5_output, does it really mean that pfSense doesn't allow DHCP on both WANs?
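
    One way to see which interfaces currently have a DHCP client attached is to look for the "dhclient: <interface>" process titles in the ps output. A small Python sketch, assuming the process-title format seen in the listing above (the helper name dhclient_interfaces is hypothetical):

```python
import re

# Rows copied from the ps listing above: two dhclient processes and cron.
DHCLIENT_ROWS = """\
root    69753   0.0  0.1  14704  2388  -  Is    1:37PM   0:00.00 dhclient: em4 [priv] (dhclient)
_dhcp   76226   0.0  0.1  14704  2524  -  Is    1:37PM   0:00.00 dhclient: em4 (dhclient)
root    74747   0.0  0.1  16672  2408  -  Ss    6:12AM   0:00.10 /usr/sbin/cron -s
"""


def dhclient_interfaces(ps_text):
    """Return the sorted, de-duplicated interface names that have a dhclient.

    Assumption: dhclient sets its process title to "dhclient: <ifname>",
    as it does in the listing this thread shows.
    """
    return sorted(set(re.findall(r"dhclient: (\w+)", ps_text)))


print(dhclient_interfaces(DHCLIENT_ROWS))
```

    In the listing above only em4 shows up. That listing was captured after WAN2 had been unplugged, so the absence of em5 there does not by itself explain the "already running" message.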



  • After a reboot the situation changed a bit: check_reload_status is no longer using 100% CPU.
    Before the reboot I unplugged WAN and left WAN2 plugged in. After the reboot, WAN2 went through a few UP/DOWN events and then... got a DHCP offer from ISP2! I then plugged WAN back in, and nothing went wrong.

    I thought the story would end here, but not really...

    After setting up dual-WAN load balancing and failover, I tested failover by unplugging and replugging WAN: it worked flawlessly. WAN2 is another matter: I get the same issue, with em5 disconnecting every few seconds. Watching /tmp/em5_output, I can see that it does receive a DHCP offer from my ISP2 GPON, but a few seconds later it disconnects without showing any error. If I leave everything plugged in and reboot, everything comes up. So now I know that if I unplug WAN2 again, I have to reboot again... :-\

    Something I observed while the system is dealing with WAN2: the pfSense GUI takes much longer to respond to my clicks. As long as WAN2 has no link flapping, GUI access is extremely fast. I checked from the console and did not see anything consuming system resources.

    Forgot to mention: my setup is a Celeron 1037U + 2GB DDR3 + 6 x Intel 82583V GbE NICs, with the 2.2.1 nanobsd version installed on a 2GB CF card.

