WAN DHCP stopped working



  • I'm running 2.4.4-p1 on a Netgate 4220, and it has been stable and reliable for months (and for years before that on an earlier Netgate appliance). My configuration is:

    cable -> arris sb8200 -> netgate -> lgs308 -> lan

    Recently, we had a major Xfinity outage, but when the cable service came back online all was fine. Several days later, though, we had a power failure and things have not worked properly since then. Of course, I have done all the standard troubleshooting including powering off all network and computer components and repowering in the appropriate order, etc., but that has not helped. All indications are that all hardware is in good working order.

    Details: My WAN gets its IP via DHCP, and that no longer seems to be happening (the pfSense dashboard first shows 0.0.0.0 for the WAN IP and then, a few minutes later, 'n/a'). If I plug my Linux box directly into the cable modem, it quickly gets an IP and a gateway and works fine. On pfSense, I tried changing the IPv4 configuration type to manual and entering those values directly (the ones handed to the Linux box when directly connected), but that proved ineffective.

    I had a similar problem over a year ago, but I believe it resolved itself. The current problem has persisted for several days.

    I have full ssh access to the pfsense box so I can view log files of interest, etc. Any suggestions of how to debug this issue would be appreciated.

    -Phil


  • Global Moderator

    Hello,

    "we had a power failure and things have not worked properly since then." Have you tried running a filesystem check? If not, here is how:

    This kind of issue is commonly caused by the appliance losing power abruptly (rather than being shut down gracefully from the WebGUI, SSH, or console menu), which can lead to filesystem errors. Running fsck as described in the steps that follow repairs those errors so the system can get back up and running.
    Running a file system consistency check:

    1. Reboot your pfSense firewall and boot into Single User Mode by pressing '2' at the loader menu. It will boot to a question asking for a path to the shell, just press return to reach the # prompt.
    2. At the # prompt, run the following command:
      /sbin/fsck -y /
      Run the fsck command at least 6 times, and keep repeating it until no errors are reported, even if fsck claims the filesystem has been marked "clean".
    3. Reboot by running: /sbin/reboot
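
A minimal sketch of step 2, for anyone who would rather not retype the command: a small shell function that runs a command a fixed number of times and prints each exit status. The name `repeat_cmd` is hypothetical, and the sketch assumes the single-user prompt is a Bourne-style shell. It does not decide for you whether the filesystem is clean, so the fsck output still needs to be read.

```shell
# Run a command a fixed number of times, printing its exit status each time.
# Usage: repeat_cmd COUNT COMMAND [ARGS...]
# (repeat_cmd is a hypothetical helper name, not part of pfSense or FreeBSD.)
repeat_cmd() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "pass $i of $count: $*"
        "$@"
        echo "exit status: $?"
        i=$((i + 1))
    done
}

# At the single-user # prompt, the command from step 2 would be run as:
#   repeat_cmd 6 /sbin/fsck -y /
```

If any pass still reports errors, run it again as the steps above describe.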


  • I wasn't holding out much hope for this because I've got the firewall and the cable modem on a UPS and running NUT to do graceful shutdowns when needed, but, when booting into single user mode I see the usual stream of events logged to the console and then:

    Trying to mount root from ufs:/dev/big-long-fname [rw]...
    random: unblocking device.

    and then nothing. I've repeated the process several times, but it always hangs here. Rebooting to multiuser works OK, but seems to take longer than expected (and doesn't get a WAN IP).



  • OK dhcp experts: this should help. Recap: when connected directly to my linux box, the cable modem provides an IP and Gateway via DHCP and all works. When my netgate firewall is connected (via the same cable) there is no joy. From a command shell on the linux box I have:

    phil@phils-i5 ~ $ sudo dhclient -v eth0
    [sudo] password for phil: 
    Internet Systems Consortium DHCP Client 4.2.4
    Copyright 2004-2012 Internet Systems Consortium.
    All rights reserved.
    For info, please visit https://www.isc.org/software/dhcp/
    
    Listening on LPF/eth0/40:xx:xx:xx:xx:56
    Sending on   LPF/eth0/40:xx:xx:x:xx:56
    Sending on   Socket/fallback
    DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3 (xid=0x5e8b331)
    DHCPREQUEST of 66.xx.xx.xxx on eth0 to 255.255.255.255 port 67 (xid=0x31b3e805)
    DHCPOFFER of 66.xx.xx.xx from 66.xx.xx.1
    DHCPACK of 66.xx.xx.xxx from 66.xx.xx.1
    bound to 66.xx.xx.xxx -- renewal in 162983 seconds.
    

    And, all is good. Then, I rewire to the pfsense box and repeat to get:

    [2.4.4-RELEASE][admin@pfSense.local.lan]/root: dhclient  igb0
    dhclient already running, pid: 88871.
    exiting.
    [2.4.4-RELEASE][admin@pfSense.local.lan]/root: kill -9 88871
    [2.4.4-RELEASE][admin@pfSense.local.lan]/root: dhclient igb0
    DHCPREQUEST on igb0 to 255.255.255.255 port 67
    DHCPREQUEST on igb0 to 255.255.255.255 port 67
    DHCPREQUEST on igb0 to 255.255.255.255 port 67
    DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 6
    DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 8
    DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 18
    DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 19
    DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 10
    No DHCPOFFERS received.
    Trying recorded lease 73.xxx.xx.xxx
    bound: renewal in 41948 seconds.
    

    I'm hoping that, with this evidence, the solution will be obvious to someone...
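
The two transcripts above differ in one important way: the Linux run ends with a DHCPACK from the server, while the pfSense run gets no offers and falls back to a recorded lease, so its final "bound" line does not mean the server answered. A rough sketch of a helper that makes this distinction on saved dhclient output follows. The function name `classify_dhclient` and its result labels are hypothetical; the matched strings are taken from the transcripts above.

```shell
# Classify dhclient output read from stdin.
# (classify_dhclient and its labels are hypothetical, for illustration only.)
classify_dhclient() {
    log=$(cat)
    case $log in
        *DHCPACK*)                  echo "acked" ;;      # server confirmed a lease
        *"Trying recorded lease"*)  echo "fallback" ;;   # reused an old lease, no server reply
        *"No DHCPOFFERS received"*) echo "no-offers" ;;  # nothing answered at all
        *)                          echo "unknown" ;;
    esac
}

# Example, assuming the output was saved to a file first (hypothetical name):
#   classify_dhclient < dhclient-output.txt
```

Run against the pfSense transcript, this reports "fallback", which is consistent with the WAN never receiving a working address.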



  • @pwest

    Reading lots of on and off-point posts, the issue of MAC spoofing came up repeatedly so I first tried putting in a WAN MAC of eb:ef:eb:ef:eb:ef, but that did not work. Then I tried the MAC from my linux box and bam: I immediately got an IP and all seems to work.

    So, four years with pfsense on netgate hardware with two different ISPs and I've never spoofed my MAC. Now, all of a sudden it seems to be needed--what's up with that?
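
One possible reason the first made-up address failed: in eb:ef:eb:ef:eb:ef the first octet (0xeb) has its least-significant bit set, which marks the address as multicast, and a multicast address is not valid as an interface's own address. Whether that is actually why it was rejected here is an assumption. The sketch below checks that bit; `mac_cast_type` is a hypothetical helper name, and 02:00:00:00:00:01 is just an illustrative locally administered address.

```shell
# Print "multicast" or "unicast" for a colon-separated MAC address,
# based on the least-significant bit of the first octet.
# (mac_cast_type is a hypothetical helper name.)
mac_cast_type() {
    first=${1%%:*}                       # first octet, e.g. "eb"
    if [ $(( 0x$first & 1 )) -eq 1 ]; then
        echo "multicast"
    else
        echo "unicast"
    fi
}

mac_cast_type eb:ef:eb:ef:eb:ef   # prints "multicast"
mac_cast_type 02:00:00:00:00:01   # prints "unicast"
```

Any spoofed WAN MAC should come out as "unicast" in this check before it is worth trying.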



  • Some DOCSIS (cable) ISPs, such as Comcast/Xfinity, enable a filter in their modem configuration so that the modem only forwards traffic for the first MAC address it learns after a power cycle or reboot, presumably to prevent a single customer from receiving multiple public IP addresses by connecting several devices behind a switch to the modem.

    Have you tried fully power-cycling the modem when switching the connection between your Linux host and FreeBSD?



  • @nkaminski said in WAN DHCP stopped working:

    Have you tried fully power-cycling the modem when switching the connection between your Linux host and FreeBSD?

    Yes, as I mentioned in the OP: " I have done all the standard troubleshooting including powering off all network and computer components and repowering in the appropriate order, etc., but that has not helped."
    As it stands, it appears that Xfinity is not responding to the native MAC associated with my 4220. This weekend, I'll try to take some time to experiment with some different MAC values to see a) whether this problem persists and, if so, b) whether Xfinity is rejecting this one unique MAC or a whole vendor-specific MAC block (as suggested by several posts).
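
For the experiment described above, it may help to sort each test address by whether it shares the vendor prefix (the first three octets) with the native MAC. A small sketch under that assumption follows; `mac_oui` and `same_oui` are hypothetical helper names, and the addresses in the usage comment are placeholders rather than real values from this setup.

```shell
# Print the vendor prefix (first three octets) of a MAC address, lowercased.
# (mac_oui and same_oui are hypothetical helper names.)
mac_oui() {
    echo "$1" | tr 'A-Z' 'a-z' | cut -d: -f1-3
}

# Succeed (exit status 0) if two MAC addresses share the same vendor prefix.
same_oui() {
    [ "$(mac_oui "$1")" = "$(mac_oui "$2")" ]
}

# Example with placeholder addresses:
#   same_oui 00:11:22:aa:bb:cc 00:11:22:dd:ee:ff && echo "same vendor block"
```

If test addresses that share the native prefix fail while others work, that would point toward a vendor-block rejection rather than a single-address one.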

