July 9th build: no more Internet, pfSense's DHCP client fails to get DHCP resp.



  • This morning, I updated the firmware to the July 9th build (i386). Afterwards, pfSense failed to connect to the Internet. The WAN interface showed no IP address or 0.0.0.0 in the dashboard. WAN is configured as plain DHCP (plus allowing some address in the 192.168.100.* range to pass through to get to the cable modem's WebGUI). WAN link was physically up. Rebooted cable modem, no change. Rebooted pfSense, no change.

    DHCP logs indicated that pfSense did several DHCP requests, but received no answer. System log showed one unexpected line:

    php: rc.bootup: The command '/sbin/route change -inet default '109.90.144.1'' returned exit code '1', the output was 'route: writing to routing socket: No such process route: writing to routing socket: Network is unreachable change net default: gateway 109.90.144.1: Network is unreachable'

    The IP address above is that of the ISP's gateway, as it has been before the firmware upgrade/reboot.

    I then reverted back to the previous build (via the "restore backup" function in the Firmware menu). Internet access came up back again immediately. This seems to confirm that there's issue with the latest build…


  • Rebel Alliance Developer Netgate

    That's a bit odd since not much at all changed yesterday.

    There were no other errors in the GUI anywhere? No notifications?


  • Rebel Alliance Developer Netgate

    Updated my ALIX and I didn't have any problems receiving DHCP (or DHCPv6).



  • @Klaws:

    DHCP logs indicated that pfSense did several DHCP requests, but received no answer. System log showed one unexpected line:

    php: rc.bootup: The command '/sbin/route change -inet default '109.90.144.1'' returned exit code '1', the output was 'route: writing to routing socket: No such process route: writing to routing socket: Network is unreachable change net default: gateway 109.90.144.1: Network is unreachable'

    Please post (say) six lines before that message and six lines after. The attempt to change default route suggests something was received to trigger that (unless you have a static default route configured and setting the default route is failing because your WAN interface doesn't have a suitable IP address).

    Is dhclient not restarting after exhausting its configuration attempts?



  • I do not anything suspicious in the logs, except the mentioned error message:

    Jul 10 07:49:57 php: rc.start_packages: Restarting/Starting all packages. 
    Jul 10 07:49:57 syslogd: kernel boot file is /boot/kernel/kernel 
    Jul 10 07:49:57 syslogd: exiting on signal 15 
    Jul 10 07:49:57 php: rc.bootup: Cannot reinstall packages: >>> Unable to communicate with www.pfsense.com. Please verify DNS and interface configuration, and that pfSense has functional Internet connectivity. 
    Jul 10 07:49:57 php: rc.bootup: Attempting to reinstall all packages 
    Jul 10 07:49:57 php: rc.bootup: Creating rrd update script 
    Jul 10 07:49:54 php: rc.dyndns.update: DynDNS (ks.kicks-ass.org) There was an error trying to determine the public IP for interface - wan(em0). Probably interface is not a WAN interface. 
    Jul 10 07:49:51 check_reload_status: Updating all dyndns 
    Jul 10 07:49:50 php: rc.bootup: The command '/sbin/route change -inet default '109.90.144.1'' returned exit code '1', the output was 'route: writing to routing socket: No such process route: writing to routing socket: Network is unreachable change net default: gateway 109.90.144.1: Network is unreachable' 
    Jul 10 07:49:50 php: rc.bootup: ROUTING: setting default route to 109.90.144.1 
    Jul 10 07:49:49 kernel: em1: link state changed to UP 
    Jul 10 07:49:49 check_reload_status: Linkup starting em1 
    Jul 10 07:49:45 kernel: pflog0: promiscuous mode enabled 
    Jul 10 07:49:45 php: rc.bootup: Resyncing OpenVPN instances. 
    Jul 10 07:48:33 kernel: em0: link state changed to UP 
    Jul 10 07:48:33 check_reload_status: Linkup starting em0 
    
    

    here a few lines from the DHCP log:

    
    Jul 10 08:00:06 dhcpd: Sending on Socket/fallback/fallback-net 
    Jul 10 08:00:06 dhcpd: Sending on BPF/em1/00:22:4d:7a:cb:f9/192.168.225.0/24 
    Jul 10 08:00:06 dhcpd: Listening on BPF/em1/00:22:4d:7a:cb:f9/192.168.225.0/24 
    Jul 10 08:00:06 dhcpd: Wrote 17 leases to leases file. 
    Jul 10 08:00:06 dhcpd: Wrote 0 new dynamic host decls to leases file. 
    Jul 10 08:00:06 dhcpd: Wrote 0 deleted host decls to leases file. 
    Jul 10 08:00:06 dhcpd: For info, please visit https://www.isc.org/software/dhcp/ 
    Jul 10 08:00:06 dhcpd: All rights reserved. 
    Jul 10 08:00:06 dhcpd: Copyright 2004-2013 Internet Systems Consortium. 
    Jul 10 08:00:06 dhcpd: Internet Systems Consortium DHCP Server 4.2.5-P1 
    Jul 10 08:00:06 dhcpd: For info, please visit https://www.isc.org/software/dhcp/ 
    Jul 10 08:00:06 dhcpd: All rights reserved. 
    Jul 10 08:00:06 dhcpd: Copyright 2004-2013 Internet Systems Consortium. 
    Jul 10 08:00:06 dhcpd: Internet Systems Consortium DHCP Server 4.2.5-P1 
    Jul 10 08:00:00 dhclient: Starting delete_old_states() 
    Jul 10 08:00:00 dhclient: FAIL 
    Jul 10 08:00:00 dhclient[7003]: No working leases in persistent database - sleeping. 
    Jul 10 08:00:00 dhclient[7003]: No DHCPOFFERS received. 
    Jul 10 07:59:57 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 3 
    Jul 10 07:59:41 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 16 
    Jul 10 07:59:26 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 15 
    Jul 10 07:59:19 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 7 
    Jul 10 07:59:13 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 6 
    Jul 10 07:59:08 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 5 
    Jul 10 07:59:05 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 3 
    Jul 10 07:59:03 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 2 
    Jul 10 07:59:03 dhclient[7003]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1 
    
    

    However, I doscovered the following line much further down in the DHCP log:

    
    Jul 10 07:55:49 dhclient[57830]: bound to 192.168.100.10 -- renewal in 30 seconds. 
    
    

    Apparently, there has been some trouble with the modem or the ISP (the modem enables it's own DHCP server and hands out 192.168.100.* addresses when it's WAN side goes down).

    It might have been a coincidence - the ISP going down at the time between firmware download and pfSense restart. And coming back online exactly when I revert to the old firmware…well, sort of hard to swallow, but possible.

    I noted a somewhat similar symptom in this post: http://forum.pfsense.org/index.php/topic,64305.0.html

    Whatever: I didn't experience this issue with the July 10th build.



  • Well, happend again this morning after the firmware update (to the July 12th build). Reboot of pfSense didn't help. Hoever, before I attempted the reboot, I noticed that the Interfaces Status page showed 0/0 in/out packets! The DHCP log however stated that the DHCP client had performed DHCP queries on em0 (which is the WAN interface).

    
    Jul 13 09:08:15	dhclient[8873]: No DHCPOFFERS received.
    Jul 13 09:08:06	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 9
    Jul 13 09:07:48	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 18
    Jul 13 09:07:37	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 11
    Jul 13 09:07:27	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 10
    Jul 13 09:07:20	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 7
    Jul 13 09:07:16	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 4
    Jul 13 09:07:14	dhclient[8873]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 2
    Jul 13 09:07:11	dhclient[8873]: DHCPREQUEST on em0 to 255.255.255.255 port 67
    Jul 13 09:07:09	dhclient[8873]: DHCPREQUEST on em0 to 255.255.255.255 port 67
    Jul 13 09:07:07	dhclient[8873]: DHCPREQUEST on em0 to 255.255.255.255 port 67
    
    

    Reboot didn't help, same result.

    I then tried the DHCP button on the Interfaces Status page. However, as I clicked the button, I heard the click of the relay cutting power to the modem to force a modem reboot (a watchdog process on the pfSense box which power cycles the modem if the internet is not reachable for 10 minutes or so). I had to wait for the modem to boot up again. Then I tried the DHCP button on the Interfaces Status page again. Immediately, pfSense got a WAN address via DHCP and everything was fine again (and still is). I then tried to reboot pfSense to see what happens, expecting that it would again not receive a DHCP response. I was wrong, everthing came back online after the reboot.

    Here's the output of the Interface Status page (WAN panel only) when Internet was at no-go:

    
    WAN interface (em0)  
    Status up  
    DHCP  up      
    MAC address 00:22:4d:7a:cb:f5  
    Gateway IPv4 109.90.144.1  
    IPv6 Link Local fe80::222:4dff:fe7a:cbf5    
    ISP DNS servers 127.0.0.1
    
    Media 1000baseT <full-duplex>In/out packets 0/0 (0 bytes/0 bytes)  
    In/out packets (pass) 0/0 (0 bytes/0 bytes)  
    In/out packets (block) 0/0 (0 bytes/0 bytes)  
    In/out errors 0/0  
    Collisions 0</full-duplex> 
    

    The DHCP button read "release", which is also strage since it had gotten no DHCP offer. I think I had to click it twice to get things going. The gateway address came, according to the logs, not from the DHCP server, but was the previously used gateway (before the firmware update).



  • More on this: yesterday evening (so pfSense was probably running the June 18th build), I had a power outage. After power was restored, pfSense booted up again (I heard the beeps), but appararently failed to connect to the Internet. I found this out on the next morning (I guess the Internet connection was down for the whole nicht). A reboot solved the problem.

    The DHCP logs indicate that since the power restoration rebbot, dhclient hadn't done anything, not even tried. So this time, the "Internet failure" appears to have an other cause than that from my prior reports (where dhclient tried, but didn't get the answer). So I checked the System General logs, and this time I found something suspicous:

    php: rc.bootup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf em0 > /tmp/em0_output 2> /tmp/em0_error_output' returned exit code '1', the output was ''

    Hm. Any ideas, anyone?



  • Some more observations.

    Here's an excerpt from the System log where the Internet connection didn't come up:

    
    Jul 19 08:15:24 kernel: em0: link state changed to UP 
    Jul 19 08:15:24 check_reload_status: Linkup starting em0 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:22 php: rc.bootup: Could not find IPv4 gateway for interface (wan). 
    Jul 19 08:15:21 php: rc.bootup: Adding TFTP nat rules 
    Jul 19 08:15:21 kernel: pflog0: promiscuous mode enabled 
    Jul 19 08:15:21 php: rc.bootup: Resyncing OpenVPN instances. 
    Jul 19 08:15:21 php: rc.bootup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf em0 > /tmp/em0_output 2> /tmp/em0_error_output' returned exit code '1', the output was '' 
    Jul 19 08:15:10 sshlockout[10581]: sshlockout/webConfigurator v3.0 starting up 
    
    

    And here's the coresponding part for a situation where Internet connection worked:

    
    Jul 18 08:53:51 kernel: pflog0: promiscuous mode enabled 
    Jul 18 08:53:51 php: rc.bootup: Resyncing OpenVPN instances. 
    Jul 18 08:53:51 check_reload_status: rc.newwanip starting em0 
    Jul 18 08:53:50 kernel: em0: link state changed to UP 
    Jul 18 08:53:50 check_reload_status: Linkup starting em0 
    Jul 18 08:53:47 sshlockout[9917]: sshlockout/webConfigurator v3.0 starting up 
    
    

    There's the possibility that the em0 interface didn't come up right away because the modem was rebooting (at least in the case where pfSense and modem rebooted simultaneously after a power outage, which happend in the evening of the 18th). What about 08:14 on the 19th, which is the one I included in this post?

    Judging from tne modem's log, it looks like there had been a power outage this orning as well:

    
     Fri Jul 19 08:14:57 2013	 2436694064	 Critical (3)	 Booting up the cable modem
    
    

    Okay, now I can report a real issue:

    **Situation: WAN is configured as DHCP.
    When pfSense boots up and the WAN link is DOWN, dhclient dies and does not recover.

    That's clearly not how it should work.**