Does Multi-WAN not allow DHCP on both WAN interfaces?
-
Hi all,
I am having difficulty getting a dual WAN setup working on my pfSense box.
The original WAN is assigned to em4 (named WAN) and the second WAN is assigned to em5 (named WAN2). Both are attached to GPON from 2 different ISPs (hence it's impossible for them to have the same gateway).
1. Plug ISP1 into WAN: all good.
2. Plug ISP2 into WAN2 (with WAN still connected): no good. The LED of em5 goes off after a few seconds, and /tmp/em5_output says "dhclient already running… giving up".
3. After plugging in WAN2, the system load gets fairly high. Using "top" I can see that "check_reload_status" is eating 100% CPU time, and most operations won't work.
4. Unplug WAN2, pfSense box apparently working again, but system load won't go down.

[2.2.1-RELEASE][root@test]/var/log: ps uaxwwww
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 11 100.8 0.0 0 32 - RL 6:11AM 566:18.17 [idle]
root 305 100.0 0.1 19032 2828 - RNs 6:12AM 316:39.29 /usr/local/sbin/check_reload_status
root 0 0.0 0.0 0 352 - DLs 6:11AM 1:32.29 [kernel]
root 1 0.0 0.0 9472 852 - ILs 6:11AM 0:00.08 /sbin/init --
root 2 0.0 0.0 0 16 - DL 6:11AM 0:00.00 [crypto]
root 3 0.0 0.0 0 16 - DL 6:11AM 0:00.00 [crypto returns]
root 4 0.0 0.0 0 32 - DL 6:11AM 0:00.00 [cam]
root 5 0.0 0.0 0 16 - DL 6:11AM 0:11.02 [pf purge]
root 6 0.0 0.0 0 16 - DL 6:11AM 0:00.00 [sctp_iterator]
root 7 0.0 0.0 0 16 - DL 6:11AM 0:00.02 [enc_daemon0]
root 8 0.0 0.0 0 16 - DL 6:11AM 0:00.64 [pagedaemon]
root 9 0.0 0.0 0 16 - DL 6:11AM 0:00.00 [vmdaemon]
root 10 0.0 0.0 0 16 - DL 6:11AM 0:00.00 [audit]
root 12 0.0 0.0 0 256 - WL 6:11AM 2:01.48 [intr]
root 13 0.0 0.0 0 32 - DL 6:11AM 0:00.00 [ng_queue]
root 14 0.0 0.0 0 48 - DL 6:11AM 0:00.33 [geom]
root 15 0.0 0.0 0 16 - DL 6:11AM 0:05.95 [rand_harvestq]
root 16 0.0 0.0 0 128 - DL 6:11AM 0:02.66 [usb]
root 17 0.0 0.0 0 16 - DL 6:11AM 0:00.90 [acpi_thermal]
root 18 0.0 0.0 0 16 - DL 6:11AM 0:00.08 [acpi_cooling1]
root 19 0.0 0.0 0 16 - DL 6:11AM 0:00.00 [pagezero]
root 20 0.0 0.0 0 16 - DL 6:11AM 0:00.03 [idlepoll]
root 21 0.0 0.0 0 16 - DL 6:11AM 0:00.10 [bufdaemon]
root 22 0.0 0.0 0 16 - DL 6:11AM 0:00.99 [syncer]
root 23 0.0 0.0 0 16 - DL 6:11AM 0:00.10 [vnlru]
root 52 0.0 0.0 0 16 - DL 6:11AM 0:00.25 [md0]
root 57 0.0 0.0 0 16 - DL 6:11AM 0:02.68 [md1]
root 307 0.0 0.1 19032 2384 - IN 6:12AM 0:00.00 check_reload_status: Monitoring daemon of check_reload_status
root 319 0.0 0.2 13164 4384 - Is 6:12AM 0:00.44 /sbin/devd -q
root 1510 0.0 0.1 14756 2312 - Is 7:09AM 0:00.02 /usr/local/sbin/sshlockout_pf 15
root 3851 0.0 0.1 14664 2440 - Ss 7:21AM 0:02.47 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /var/etc/syslog.conf
root 7728 0.0 0.1 8312 1960 - IN 1:37PM 0:00.00 sleep 60
root 8053 0.0 0.3 32428 5216 - Is 6:12AM 0:00.01 /usr/sbin/sshd
root 8504 0.0 0.1 14756 2216 - Is 6:12AM 0:00.00 /usr/local/sbin/sshlockout_pf 15
root 9788 0.0 1.5 233404 30720 - Ss 8:21AM 0:00.21 php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
root 12062 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 12211 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 12243 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 12452 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 12896 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 13004 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 13057 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 13219 0.0 0.4 55728 7060 - S 1:37PM 0:00.00 /var/bandwidthd/bandwidthd
root 13871 0.0 0.9 28172 18076 - Is 1:37PM 0:00.03 /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
root 17160 0.0 0.3 55632 6036 - Ss 1:07PM 0:00.19 sshd: root@pts/0 (sshd)
root 19398 0.0 0.1 16812 2328 - Ss 6:12AM 0:04.23 /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
root 19780 0.0 0.1 18788 2360 - Is 6:12AM 0:00.05 /usr/sbin/inetd -wW -R 0 -a 127.0.0.1 /var/etc/inetd.conf
root 46844 0.0 0.1 12464 2304 - Ss 7:52AM 0:02.20 /usr/local/sbin/apinger -c /var/etc/apinger.conf
root 46870 0.0 0.2 28352 3604 - I 7:52AM 0:00.11 rrdtool -
root 51739 0.0 0.1 14756 2312 - Is 7:35AM 0:00.02 /usr/local/sbin/sshlockout_pf 15
root 52743 0.0 0.4 50796 7232 - S 8:21AM 0:00.57 /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
root 55486 0.0 0.1 14548 2052 - Ss 6:12AM 0:03.17 /usr/sbin/powerd -b hadp -a hadp -n hadp
dhcpd 63710 0.0 0.6 24820 12660 - Ss 1:01PM 0:00.04 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0
root 68486 0.0 0.1 8312 1960 - IN 1:37PM 0:00.00 sleep 60
root 69753 0.0 0.1 14704 2388 - Is 1:37PM 0:00.00 dhclient: em4 [priv] (dhclient)
root 74747 0.0 0.1 16672 2408 - Ss 6:12AM 0:00.10 /usr/sbin/cron -s
_dhcp 76226 0.0 0.1 14704 2524 - Is 1:37PM 0:00.00 dhclient: em4 (dhclient)
root 78679 0.0 0.1 12412 1976 - Is 6:12AM 0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
root 79327 0.0 0.1 12412 1984 - I 6:12AM 0:00.02 minicron: helper /usr/local/bin/ping_hosts.sh (minicron)
root 79440 0.0 0.1 12412 1976 - Is 6:12AM 0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts
root 79664 0.0 0.1 12412 1988 - I 6:12AM 0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts (minicron)
root 79958 0.0 0.1 12412 1976 - Is 6:12AM 0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data
root 80491 0.0 0.1 12412 1988 - I 6:12AM 0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data (minicron)
root 80572 0.0 2.3 237500 45984 - I 1:37PM 0:00.02 php-fpm: pool lighty (php-fpm)
root 81898 0.0 0.1 14756 2280 - Is 6:12AM 0:00.00 /usr/local/sbin/sshlockout_pf 15
unbound 82235 0.0 0.7 42928 13324 - Ss 1:37PM 0:00.03 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
root 86616 0.0 0.1 17144 2696 - IN 1:37PM 0:00.01 /bin/sh /var/db/rrd/updaterrd.sh
root 27083 0.0 0.1 43576 2728 u0 Is 7:48AM 0:00.02 login [pam] (login)
root 27116 0.0 0.1 17144 2776 u0 I 7:48AM 0:00.01 -sh (sh)
root 27299 0.0 0.1 17144 2660 u0 I+ 7:48AM 0:00.01 /bin/sh /etc/rc.initial
root 81814 0.0 0.1 43576 2696 v0 Is 6:12AM 0:00.01 login [pam] (login)
root 82272 0.0 0.1 17144 2724 v0 I 6:12AM 0:00.00 -sh (sh)
root 82606 0.0 0.1 17144 2628 v0 I+ 6:12AM 0:00.00 /bin/sh /etc/rc.initial
root 14426 0.0 0.1 18816 2404 0 R+ 1:38PM 0:00.00 ps uaxwwww
root 26450 0.0 0.1 17144 2764 0 Is 1:07PM 0:00.00 -sh (sh)
root 26641 0.0 0.1 17144 2664 0 I 1:07PM 0:00.00 /bin/sh /etc/rc.initial
root 30266 0.0 0.2 17484 3828 0 S 1:07PM 0:00.04 /bin/tcsh

Since this box is newly set up and only 1-2 clients were connected during the tests, it's not possible that those clients are causing such a high load.
But looking at /tmp/em5_output, does it really mean that pfSense doesn't allow DHCP on both WANs?
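
For anyone who wants to cross-check which interfaces actually have a DHCP client running, here is a rough Python sketch that scans saved `ps uax` output for dhclient process titles. It is only an illustration: the function name and the three-line sample are made up for this example (the sample rows are copied from the listing above), and it assumes dhclient has already renamed itself to the "dhclient: <interface>" title form, so a client that still shows its raw command line would be missed.

```python
import re
from collections import defaultdict

# Matches the process title form seen in the listing, e.g.
# "dhclient: em4 [priv] (dhclient)". This is an assumption about the
# title format, taken from the ps output above.
DHCLIENT_TITLE = re.compile(r"dhclient: (\S+)")


def dhclient_pids_by_interface(ps_text):
    """Map interface name -> list of dhclient PIDs found in `ps uax` text."""
    found = defaultdict(list)
    for line in ps_text.splitlines():
        # Columns: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and anything malformed
        match = DHCLIENT_TITLE.search(fields[10])
        if match:
            found[match.group(1)].append(int(fields[1]))
    return dict(found)


# Hypothetical sample: three rows copied from the ps listing in this post.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 69753 0.0 0.1 14704 2388 - Is 1:37PM 0:00.00 dhclient: em4 [priv] (dhclient)
_dhcp 76226 0.0 0.1 14704 2524 - Is 1:37PM 0:00.00 dhclient: em4 (dhclient)
root 305 100.0 0.1 19032 2828 - RNs 6:12AM 316:39.29 /usr/local/sbin/check_reload_status
"""

if __name__ == "__main__":
    print(dhclient_pids_by_interface(SAMPLE))
```

In the listing above (taken after WAN2 was unplugged) only em4 shows up, so at that moment there was no dhclient for em5 at all. That does not by itself explain the "already running" message, but it is a quick way to compare before and after plugging in WAN2.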
-
After a reboot the situation changed a bit: check_reload_status is no longer using 100% CPU.
Before the reboot I unplugged WAN and left WAN2 plugged in. After the reboot, WAN2 went through a few UP/DOWN events and then... got a DHCP offer from ISP2! Then I plugged WAN back in, and nothing went wrong! I thought the story should end here, but not really...
After setting up dual WAN load balancing + failover, I did a test: unplugging and re-plugging WAN to test failover works flawlessly. But when it comes to WAN2, no luck, I get the same issue: em5 disconnects every few seconds. I kept watching /tmp/em5_output and found that it does receive a DHCP offer from my ISP2 GPON, but a few seconds later it disconnects without any error showing up. If I keep everything plugged in and reboot again, everything comes up. So now I know that if I unplug WAN2 again, I have to reboot again.... :-\
Something I observed while the system is dealing with WAN2: the pfSense GUI takes much longer to respond to my clicks. As long as WAN2 isn't flapping, GUI access is extremely fast. I checked from the console and did not see anything hogging system resources.
Forgot to mention: my setup is a Celeron 1037U + 2GB DDR3 + 6 x Intel 82583V GbE NICs, plus a 2GB CF card with the 2.2.1 nanobsd version installed.