[WORKAROUND] Gateways offline or can't dhcp or dhcp6 with new level of Comcast modem



  • I'm posting this here because I'm not sure it's just dhcp related.

    Situation:
    pfSense 2.2.4-RELEASE running on a Netgate APU2 with an internal mSATA SSD boot device.
    Up until now it's been working just great with ipv4 and ipv6 via a SB6141 CM with Comcast as the provider on a 150/20Mbps plan.

    Changed ancient TV gear, got new Xfinity X1 stuff, came with a new tier of service 250/25.

    I couldn't get the new speed tier to work on the SB6141; it turns out Comcast requires 16-channel bonding and the SB6141 only does 8. I bought a new SB6183 (Comcast supported) and started having the issues described below, so, in case the modem was the problem, I returned it for a Netgear CM500-100NAS. Nope, it's not the CM, it's the router: the MacBook again works flawlessly when directly connected. No settings on the router were changed initially before working out the below, and several full restores were performed from known-good 2.2.3-RELEASE full backups just in case.

    Everything works fine at the new speed tier (tested via Speedtest) with a directly connected MacBook, with both IPv4 and IPv6 addresses assigned.

    The pfSense router does not work with the following symptoms (after MANY hours of testing various permutations, reboots of APU2 and CM etc.):

    The router picks up a 192.168.100.10 address from the CM while the router is booting and getting connected. It then gets an EXPIRE and uses the last good OFFER received/saved. If the APU2 is reset, or anything else forces a DHCP retry, it never receives an OFFER, the gateway appears offline, and Internet access is down. It currently has a valid address etc. copied from the MacBook when that was up, entered as static and then switched back to DHCP.

    The address range from Comcast appears to have changed from 158.N.N.N to 76.N.N.N.

    IPv6 is totally non-functional, since the CM hands out no "private" IPv6 equivalent to "jump start" the process. If IPv6 is enabled, the IPv4 connection never comes up either, with the same inaccessible-gateway problem.

    For anything other than the "jumpstart" hack mentioned above, the router boot hangs for a long time at WAN configuration on the console.

    After the jumpstart hack, the address has to be set to static or the gateway goes offline again later.

    There is never a DHCPOFFER in the log, only a DHCPACK:

    dhclient entries from a successful “jumpstart”:

    Aug  1 20:08:17 sobek dhclient: PREINIT
    Aug  1 20:08:17 sobek dhclient[2906]: DHCPREQUEST on re1 to 255.255.255.255 port 67
    Aug  1 20:08:19 sobek dhclient[2906]: DHCPREQUEST on re1 to 255.255.255.255 port 67
    Aug  1 20:08:22 sobek dhclient[2906]: DHCPREQUEST on re1 to 255.255.255.255 port 67
    Aug  1 20:08:30 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 2
    Aug  1 20:08:32 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 4
    Aug  1 20:08:36 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 11
    Aug  1 20:08:47 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 21
    Aug  1 20:09:08 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 10
    Aug  1 20:10:40 sobek dhclient: PREINIT
    Aug  1 20:10:40 sobek dhclient[6531]: DHCPREQUEST on re1 to 255.255.255.255 port 67
    Aug  1 20:10:41 sobek dhclient[6531]: DHCPACK from 96.120.17.21
    Aug  1 20:10:41 sobek dhclient: REBOOT
    Aug  1 20:10:41 sobek dhclient: Starting add_new_address()
    Aug  1 20:10:41 sobek dhclient: ifconfig re1 inet 76.30.21.151 netmask 255.255.252.0 broadcast 255.255.255.255
    Aug  1 20:10:41 sobek dhclient: New IP Address (re1): 76.30.21.151
    Aug  1 20:10:41 sobek dhclient: New Subnet Mask (re1): 255.255.252.0
    Aug  1 20:10:41 sobek dhclient: New Broadcast Address (re1): 255.255.255.255
    Aug  1 20:10:41 sobek dhclient: New Routers (re1): 76.30.20.1
    Aug  1 20:10:41 sobek dhclient: Adding new routes to interface: re1
    Aug  1 20:10:41 sobek dhclient: /sbin/route add default 76.30.20.1
    Aug  1 20:10:41 sobek dhclient: Creating resolv.conf
    Aug  1 20:10:41 sobek dhclient[6531]: bound to 76.30.21.151 -- renewal in 113507 seconds.
    Aug  1 20:13:47 sobek dhclient[8533]: exiting.
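
    To check a longer log for the same pattern (REQUEST and DISCOVER going out, an ACK coming back, but never an OFFER), the message types can be tallied with a short script. This is only a minimal sketch: it assumes syslog-style lines shaped like the excerpt above, and the function name `count_dhcp_messages` and the regex are illustrative, not part of dhclient or pfSense.

```python
import re
from collections import Counter

# Assumption: dhclient lines look like the excerpt above, e.g.
# "... dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 2"
DHCP_MSG = re.compile(r"dhclient(?:\[\d+\])?: (DHCP[A-Z]+)\b")

def count_dhcp_messages(lines):
    """Count how often each DHCP message type appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = DHCP_MSG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# A few lines copied from the log above.
sample = [
    "Aug  1 20:08:17 sobek dhclient: PREINIT",
    "Aug  1 20:08:17 sobek dhclient[2906]: DHCPREQUEST on re1 to 255.255.255.255 port 67",
    "Aug  1 20:08:30 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 2",
    "Aug  1 20:08:32 sobek dhclient[2906]: DHCPDISCOVER on re1 to 255.255.255.255 port 67 interval 4",
    "Aug  1 20:10:40 sobek dhclient[6531]: DHCPREQUEST on re1 to 255.255.255.255 port 67",
    "Aug  1 20:10:41 sobek dhclient[6531]: DHCPACK from 96.120.17.21",
]

counts = count_dhcp_messages(sample)
for msg_type in ("DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK"):
    print(msg_type, counts[msg_type])
```

    A DHCPOFFER count of zero next to a non-zero DHCPDISCOVER count matches the symptom described here: the discovers go out, but no offer is ever logged.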

    This is driving me bananas. Needless to say Comcast is no help as from their PoV all is working for anything but the router.

    Please Help. Be Gentle I'm an old UNIX hand but no pfSense expert. ;-)



    • Do a fresh install of pfSense 2.1.5
    • Install new firmware on the SB6183, if available
    • Power this device off and power it back on after 30 seconds
    • Connect it to the pfSense WAN port and set up DHCP there

    Now you should be able to get an Internet connection.



  • @daplumber:

    I'm posting this here because I'm not sure it's just dhcp related.

    OK, a cascading setup with double NAT.

    First, give pfSense a static IP on the Comcast CM.
    Second, boot the devices in sequence: let the CM finish booting first, then boot pfSense.

    IPv6 will work if the CM can act as a DHCP6 server and answer pfSense's DHCP6 client requests.



  • I'm sorry I wasn't clear: I no longer have the SB6183, I'm using the Netgear CM500-100NAS. Both are supported by Comcast for the 250/25Mbps tier of service and indeed work perfectly with a directly connected MacBook. While directly connected with the MacBook I verified on the CM and asked the Comcast tech. to force an update to latest FW.

    @Blue Kobold:

    • Why? What's different/broken in 2.2 that requires reverting to 2.1? Bear in mind that 2.2 has been working perfectly before. The ONLY change was the CM and the Comcast tier of service.
    • How do I recover all of the settings I was using previously esp. Local DHCP and DNS entries other than laboriously by hand?

    @hda

    • The CM is a straight bridge and only hands out the 192.168.100.N addresses before the cable side is connected. It then sends the EXPIRE. I'm not too sure what's happening on the cable side, but I understand it to be IPv6 with IPv4 transparently tunneled to save on v4 addresses. The v4 address handed out by Comcast's DHCP server is a "real" address, i.e. not NAT, although obviously not fixed and coming from a pool.

    @ALL
    I was reading that there was an issue with dhclient using a value of 16 instead of 128 for the TTL on the DHCP packets. Could this be the issue? It's conceivable that Comcast's 76.N.N.N is more hops away than the previous 158.N.N.N and that different subnets are used for different tiers of service.



  • @daplumber:

    I'm sorry I wasn't clear:

    Bridge between which of your devices ?
    Why won't you have the real public IPv4 on the pfSense-WAN ?
    You know, pfSense is designed to manage the public side of your premises.
    Doesn't Comcast do dual-stack IPv4&v6 ? are you on a CGNAT ?



  • @hda:

    @daplumber:

    I'm sorry I wasn't clear:

    Bridge between which of your devices ?
    Why won't you have the real public IPv4 on the pfSense-WAN ?
    You know, pfSense is designed to manage the public side of your premises.
    Doesn't Comcast do dual-stack IPv4&v6 ? are you on a CGNAT ?

    The cable modem (CM) is in bridge mode, i.e. not acting as NAT or router. pfSense gets a real IPv4 and IPv6 address on its WAN port. The coax side of the CM is not visible, but the CM diagnostic page says it is IPv6 only. The CM only acts as a DHCP server itself before the cable side is up; as soon as it is up, the CM sends the expire to force DHCP against the "real" Comcast servers.
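
    To make the distinction concrete, a lease can be classified as coming from the modem's own temporary DHCP server or from upstream, and the logged gateway can be checked against the logged netmask. This is a rough sketch only: the /24 width for the modem's 192.168.100.N range is an assumption, and `lease_origin` and `gateway_on_link` are made-up helper names.

```python
import ipaddress

# Assumption: the modem's temporary pool is 192.168.100.0/24
# (the thread only says "192.168.100.N").
MODEM_NET = ipaddress.ip_network("192.168.100.0/24")

def lease_origin(addr):
    """Return 'modem' for an address from the CM's own pool, else 'upstream'."""
    return "modem" if ipaddress.ip_address(addr) in MODEM_NET else "upstream"

def gateway_on_link(addr, netmask, gateway):
    """Check that the gateway lies inside the subnet implied by addr/netmask."""
    network = ipaddress.ip_interface(f"{addr}/{netmask}").network
    return ipaddress.ip_address(gateway) in network

# Values taken from the dhclient log earlier in the thread.
print(lease_origin("192.168.100.10"))
print(lease_origin("76.30.21.151"))
print(gateway_on_link("76.30.21.151", "255.255.252.0", "76.30.20.1"))
```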



  • I understand the specific case now. You need advice from a Comcast comrade ;)



  • You're rebooting the cable modem each time you change a device on its LAN port?  Cable plants only allow one or two (depending on the ISP) MAC addresses.

    I remember past complaints of this same thing here by other Comcast customers.  Can you switch interfaces and try the opposite or one of the other interface(s) on your pfSense box as the WAN?

    And yea- don't install anything older than the present stable release. Never (usually) recommended.



  • What's different/broken in 2.2 that requires reverting to 2.1?

    If it doesn't run on one version, there's no harm in trying one older and one newer version,
    so that you can determine whether the failure is specific to that one version!



  • @chpalmer:

    You're rebooting the cable modem each time you change a device on its LAN port?  Cable plants only allow one or two (depending on the ISP) MAC addresses.

    I remember past complaints of this same thing here by other Comcast customers.  Can you switch interfaces and try the opposite or one of the other interface(s) on your pfSense box as the WAN?

    And yea- don't install anything older than the present stable release. Never (usually) recommended.

    Yes I'm rebooting the CM (power cycle) every time I change the MAC of the device attached to it, and yes, that's required.

    I've tried modifying the MAC address and using the normally OPT1 (re0) interface as the WAN instead, just in case of a sudden Port problem, no difference.

    As I said, the current stable release was working fine, as were previous 2.1 releases, with no configuration changes.



  • WORKAROUND - Root cause still unknown

    A clean install of 2.2.4-RELEASE worked, but a restore of the saved config.xml file borked it exactly the same way. After carefully restoring individual parts of the backed-up config.xml, only "system" caused problems. That still leaves one heck of a lot of stuff to redo manually, including certificates, users, packages, and so on. I'd still like to know what the fsck in that XML file causes this, because to the best of my ability I've configured the missing pieces the same as before from my notes. Maybe I'll try to diff the new config.xml with the "bad" one and see if anything jumps out…
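
    One way to do that diff without wading through the whole file is to compare only the direct children of the "system" section. The following is a minimal sketch, not a tested pfSense tool: `section_diff` is a made-up helper, and the XML fragments below are invented placeholders rather than real config contents.

```python
import xml.etree.ElementTree as ET

def section_diff(good_root, bad_root, section="system"):
    """Return {tag: (good, bad)} for direct children of `section` that differ.

    Each side is a list of serialized child elements, or None if the tag is absent.
    """
    def children(root):
        node = root.find(section)
        found = {}
        if node is not None:
            for child in node:
                # Serialize the whole child so nested elements are compared too.
                text = ET.tostring(child, encoding="unicode").strip()
                found.setdefault(child.tag, []).append(text)
        return found

    good, bad = children(good_root), children(bad_root)
    return {
        tag: (good.get(tag), bad.get(tag))
        for tag in sorted(set(good) | set(bad))
        if good.get(tag) != bad.get(tag)
    }

# Invented fragments for illustration only; real files would be loaded with
# ET.parse("config-good.xml").getroot() (file names are hypothetical).
good = ET.fromstring(
    "<pfsense><system><hostname>sobek</hostname><optionA>1</optionA></system></pfsense>"
)
bad = ET.fromstring(
    "<pfsense><system><hostname>sobek</hostname><optionA>2</optionA><optionB/></system></pfsense>"
)

for tag, (good_value, bad_value) in section_diff(good, bad).items():
    print(tag, good_value, bad_value)
```

    Whitespace differences inside nested elements will show up as differences with this approach, so it is best used as a first pass to narrow down which keys to inspect by hand.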



  • The dhclient logs look like what you'd see if the link to the modem were flapping. I can't think of anything relevant to that that'd be in the system portion of the config though. Is there some difference in interface configuration that maybe you didn't bring across? Maybe no MAC spoofing?



  • @cmb:

    The dhclient logs look like what you'd see if the link to the modem were flapping. I can't think of anything relevant to that that'd be in the system portion of the config though. Is there some difference in interface configuration that maybe you didn't bring across? Maybe no MAC spoofing?

    Nope. I didn't take screenshots, but the interface page should be identical as I entered it from my notes. If you can tell me the relevant keys, I'll compare the config.xml files.

    I tried it both with and without MAC spoofing, it made no difference. All a MAC change requires is a reboot of the CM.



  • Is the WAN actually link flapping in that case? See re0 link down/up messages in the system log, see the modem and/or WAN NIC losing its link light?
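
    A quick way to answer that from a saved system log is to pull out the link state changes for the WAN interface. This sketch assumes the kernel logs them in the usual FreeBSD form "re1: link state changed to DOWN"; the sample lines are invented for illustration and are not from the poster's log, and `link_events` is a hypothetical helper name.

```python
import re

# Assumption: kernel messages look like "kernel: re1: link state changed to DOWN".
LINK_EVENT = re.compile(r"\b(\w+): link state changed to (UP|DOWN)\b")

def link_events(lines, iface):
    """Return the ordered list of 'UP'/'DOWN' events logged for one interface."""
    events = []
    for line in lines:
        match = LINK_EVENT.search(line)
        if match and match.group(1) == iface:
            events.append(match.group(2))
    return events

# Invented sample lines, for illustration only.
sample_log = [
    "Aug  1 20:08:10 sobek kernel: re1: link state changed to DOWN",
    "Aug  1 20:08:12 sobek kernel: re1: link state changed to UP",
    "Aug  1 20:08:15 sobek kernel: re0: link state changed to UP",
    "Aug  1 20:08:17 sobek dhclient: PREINIT",
]

events = link_events(sample_log, "re1")
print(events, "flaps:", events.count("DOWN"))
```

    Repeated DOWN/UP pairs clustered around the dhclient PREINIT entries would point at a flapping link; none at all would point away from it.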



  • @cmb:

    Is the WAN actually link flapping in that case? See re0 link down/up messages in the system log, see the modem and/or WAN NIC losing its link light?

    Apologies for being unclear. No, by those tests it was not flapping. I usually have a small switch between re1 and the CM to prevent exactly this and dhcpv6 annoyances. I tested both with and without the switch in the middle and the result was the same. (re1 - WAN on the APU2 - and the CM Ethernet port are the only things on that switch.)