Comcast IPv6 works for 1-2 days, then stops routing

JKnott

I started with tier 2, as I don't waste my time with tier 1. I was able to show them the problem, as well as the senior tech who came to my home. The problem was getting the network guys to accept they had a problem. Initially they refused to do anything, as I had my own router. This despite the fact I had identified the failing system. What it took to convince them was the senior tech had the same failure with his own modem/firewall and computer. He then took them to the head end I'm connected to and tried 4 different CMTS and only the one I was connected to and had identified had failed. The network guys then got off their butts and fixed the problem.

BTW, shortly before I had this problem, I was doing some work for that same ISP, in a few different head ends, though not mine, so I had a bit more knowledge of the situation than the average customer. This was in addition to my using Wireshark to identify the failing systems and decades of experience in telecom, computers and networks. I doubt an ordinary customer would have had much luck with this sort of problem, when even the ISP's techs & support don't fully understand the way things work.

STS-134

@jknott Yep, it's frustrating because their CS agents aren't properly trained. They'll rattle off everything about "demarcation" but don't even know where their own responsibilities begin and end. I told one CS agent that a phone company refusing to support my case because "you're using your own telephone" doesn't apply so long as that telephone is sending the proper tones and pulses to the system. If it's sending the proper tones and the system isn't doing what it's supposed to, it's a telephone system issue.

It seems like testing over at Comcast is lacking. They initially gave me a Comcast Business Router, which had a very similar bug where they delegate IPv6 blocks to routers but it never routes those addresses, only its own /64 block. This was over a year ago, mind you, and they apparently STILL haven't fixed the problem: https://forums.businesshelp.comcast.com/conversations/ipv6/can-not-get-internal-ipv6-traffic-to-route-with-the-cga4131com/5fe0a62cc5375f08cd960e81

I told them that I wanted to go back to a Cisco Business Wireless Gateway modem (I just disable the WiFi) because that one routes IPv6 properly, but it seems their latest firmware update even screwed that up.

I'm wondering how long I should give engineering to analyze my packet capture logs and hopefully reproduce and actually fix the issue before I bug them again.

JKnott

@sts-134

I remember those days when people were advised to keep one phone company phone, just in case. As for your captures, do they show the failure?

I have attached my capture that shows the failure, from packet #29, where the Status Message says no prefix is available:

bootup_capture.pcapng

That error message should have told the network guys exactly where to look, but they refused to do anything, until the senior tech demonstrated the problem was only on the CMTS I was connected to. As I said, I doubt many other customers could have provided that sort of detail.

STS-134

@jknott I suppose you could say they show the failure. They show my router coming up and requesting a /59 via DHCPv6-PD, and the modem replying with a /59 prefix. Unlike in your case, where you had a "No Prefix Available" status code, I get a Success status code, so the modem did in fact acknowledge that the process of delegating the address block was successful.

But then the logs show me running an IPv6 ping test from the router's WAN port (on the modem's /64 subnet) to google.com and the replies coming back. After that, they show me running an IPv6 ping test from one of the delegated addresses, but those packets appear to disappear into a black hole and no replies ever come back.

I also gave them two traceroutes, one from the router's WAN port (successful) and the other from one of my VLANs on one of the delegated IPv6 prefixes (packets go as far as the cable modem and then just stop).
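
To make the two test cases above concrete, here is a minimal sketch using Python's standard `ipaddress` module. The prefixes are made-up values from the 2001:db8::/32 documentation range, not my real ones, and `classify` is just a hypothetical helper name. It only shows how a /59 splits into /64s and how to tell whether a ping source sits on the modem's /64 or inside the delegated block.

```python
import ipaddress

# Hypothetical values from the documentation range, not real addresses.
delegated = ipaddress.ip_network("2001:db8:0:20::/59")  # block handed out via DHCPv6-PD
wan_subnet = ipaddress.ip_network("2001:db8:0:1::/64")  # the modem's own /64

# A /59 contains 2**(64 - 59) = 32 separate /64 subnets, one per LAN or VLAN.
vlan_subnets = list(delegated.subnets(new_prefix=64))

def classify(source: str) -> str:
    """Say which side of the router a ping source address belongs to."""
    addr = ipaddress.ip_address(source)
    if addr in wan_subnet:
        return "wan"
    if addr in delegated:
        return "delegated"
    return "other"

print(len(vlan_subnets))                 # number of /64s in the /59
print(classify("2001:db8:0:1::5"))       # source on the modem's /64
print(classify("2001:db8:0:25::10"))     # source inside the delegated /59
```

In my case the "wan" sources get replies and the "delegated" sources do not.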

JKnott

@sts-134

As I mentioned above, try pinging from elsewhere and see if it appears at your WAN interface. If it doesn't, it's not a pfSense problem.

STS-134

@jknott
I tried pinging a server on the cable modem's /64 subnet from one of the router's delegated addresses, using pfSense's ping tool, and it's failing.

What seems to be happening is that the cable modem is sending an ICMPv6 Redirect frame with both Target Address and Destination Address equal to pfSense's delegated address. The source MAC of this ICMPv6 Redirect frame is the MAC address of the cable modem and the destination MAC address of this ICMPv6 Redirect frame is the MAC address of the server.

The cable modem is then sending a Neighbor Solicitation frame asking for the IPv6 address of pfSense (presumably that it saw during the ping attempt?) and no reply to this Neighbor Solicitation is ever received.
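
For anyone wanting to check the same thing in their own capture, here is a rough sketch of pulling the two addresses out of a raw ICMPv6 Redirect message (type 137 in RFC 4861) with only the Python standard library. As I read RFC 4861, a Redirect whose Target Address equals its Destination Address is telling the host that the destination is an on-link neighbor, which would fit the Neighbor Solicitation that follows. The function names and the sample address are hypothetical, and the input is assumed to be the ICMPv6 message bytes with the IPv6 header already stripped.

```python
import ipaddress
import struct

ICMPV6_REDIRECT = 137  # Redirect message type defined in RFC 4861

def parse_redirect(icmp_payload: bytes):
    """Return (target, destination) from a raw ICMPv6 Redirect message."""
    if len(icmp_payload) < 40:
        raise ValueError("too short for a Redirect message")
    # Type (1 byte), Code (1), Checksum (2), Reserved (4), network byte order.
    msg_type, _code, _checksum, _reserved = struct.unpack("!BBHI", icmp_payload[:8])
    if msg_type != ICMPV6_REDIRECT:
        raise ValueError(f"not a Redirect (type {msg_type})")
    target = ipaddress.IPv6Address(icmp_payload[8:24])
    destination = ipaddress.IPv6Address(icmp_payload[24:40])
    return target, destination

def destination_claimed_on_link(icmp_payload: bytes) -> bool:
    """True when Target == Destination, i.e. the sender treats it as a neighbor."""
    target, destination = parse_redirect(icmp_payload)
    return target == destination

# Synthetic message built for illustration, using a documentation address.
addr = ipaddress.IPv6Address("2001:db8:0:20::1")
sample = struct.pack("!BBHI", ICMPV6_REDIRECT, 0, 0, 0) + addr.packed + addr.packed
print(destination_claimed_on_link(sample))
```

If that reading is right, the modem is behaving as though the delegated address were on its own link rather than something it has to route to pfSense.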

JKnott

@sts-134

Try the pings from outside your prefix. As I mentioned, I get 2 connections through my cable modem and used 1 for testing. I have also tethered to my cell phone. The idea is to completely isolate the system. In fact, for much of my testing I used a data tap and separate computer running Wireshark, rather than using Packet Capture. When you're using something to test itself, you can sometimes get erroneous results.

If you want, you can open a chat to me, to pass me your addresses and I can try pinging them.

STS-134

@jknott
Ping to server on cable modem's /64 from computer attached to cell phone: successful
Ping to pfSense's interface on /59 delegated to it by cable modem: no reply

But this is getting interesting. I don't have a data tap, however when I run the packet capture function on pfSense's WAN port, I can see the incoming ping packets from the cell phone. I cannot see any ping replies being sent back. So either pfSense is failing to log its own replies and the cable modem has a one-way (outbound) routing problem, or it's a pfSense issue. Do I need to enable anything to make sure pfSense replies to IPv6 ping packets sent to its interface addresses for internal VLANs?

JKnott

@sts-134

Well, start analyzing those captures. You can also try running them on the LAN interface, to see if they're arriving there. You will want to ping to some device on the LAN, not the interface though. As my link shows, making a data tap is easy with a managed switch. A few years back I bought a cheap 5 port switch just for that purpose. Are there any floating rules that might interfere?

STS-134

@jknott I don't see how floating rules could possibly be the problem, given that it works for a few days before it breaks.

This did seem to start when I updated from 2.4.5-p1 to 21.02, which of course broke IPv6. I then went back to 2.4.5-p1 and loaded my configuration file that I took from before the upgrade, but IPv6 never worked properly after that. Comcast does claim that they pushed an update to the cable modem at around the same time, so I thought it definitely had to do with that. I wonder if it's possible that the configuration reload after the reinstall didn't set something properly?

JKnott

@sts-134

I'm just tossing out ideas of things to consider. Is there anyone else here on Comcast with the same problem? How do the packet captures compare when it's working vs when it's not? Given it fails after 2-3 days, it might be something with the lease time, if it's that long. Have you captured the DHCPv6 sequence? You'll find the lease times in one of the reply XID packets. What happens if you disconnect/reconnect the WAN cable?

STS-134

@jknott No, disconnecting and reconnecting the cable does not cause the behavior to change. Even a full reboot does not seem to fix the issue. The DHCPv6 packets contain a Preferred lifetime of 86400 and Valid lifetime of 172800.
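
For reference, a quick conversion of those two lifetimes, assuming the values are in seconds as DHCPv6 lifetimes are. `describe` is just a hypothetical helper for printing.

```python
from datetime import timedelta

# Lifetimes as reported in the DHCPv6 reply, in seconds.
preferred = timedelta(seconds=86400)
valid = timedelta(seconds=172800)

def describe(lifetime: timedelta) -> str:
    """Format a lifetime as whole hours and whole days."""
    hours = lifetime.total_seconds() / 3600
    return f"{hours:.0f} h ({hours / 24:.0f} d)"

print("preferred:", describe(preferred))
print("valid:    ", describe(valid))
```

So the preferred lifetime is one day and the valid lifetime is two days. That is in the same range as the 1-2 days it takes to fail, though the arithmetic alone doesn't show the lease is the cause.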

STS-134

@jknott Update: been working for about 5 days now. Was trying to get more packet capture logs and noticed something strange: IPv6 pings were failing from devices behind the pfSense, but succeeding from pfSense's ping tool from the interface associated with their VLAN. This was unlike in the past, where both seemed to succeed or fail together.

Digging into why the pings would succeed from pfSense itself but fail for devices behind the router, I looked into the firewall rules. Eventually I ended up removing a rule that was blocking IPv6 traffic, if it was sent to fe80::/10 (actually what I did was I had a rule that blocked all traffic sent to any "private address", in the sense that I had a rule at the end of the chain for traffic on that VLAN that passed all traffic sent NOT to a private address, and fe80::/10 was on the list of private addresses). Well, once I removed fe80::/10 from the definition of "private address", things actually started working, and have now been working for 5 days straight.

I'm still trying to figure out why it ever worked for so many months (actually 2+ years) with this rule in place, if that was actually the problem. It should also be noted that when traffic was refusing to route before, I never was able to get pings through, even from pfSense's "ping" tool when I selected individual VLANs as the source address, so I also wonder if Comcast actually fixed something. It's possible that there were two simultaneous issues here.

JKnott

@sts-134

Yeah, blocking link local addresses would cause problems, as IPv6 relies on them for so much.

My rule for private addresses includes the RFC1918 blocks and all ULA. As link local doesn't pass through a router, there's no need to block it.
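
As a rough illustration of that kind of list, here is a sketch using Python's standard `ipaddress` module. The list name and helper function are hypothetical, not anything taken from pfSense. It holds the three RFC1918 blocks plus the ULA range fc00::/7 and deliberately leaves out fe80::/10.

```python
import ipaddress

# RFC1918 blocks plus IPv6 ULA. Link local (fe80::/10) is intentionally absent.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("fc00::/7"),
]

def in_private_list(address: str) -> bool:
    """True if the address falls in one of the listed private blocks."""
    addr = ipaddress.ip_address(address)
    return any(addr.version == net.version and addr in net for net in PRIVATE_NETS)

for candidate in ("192.168.1.10", "fd00::1", "fe80::1", "2001:db8::1"):
    link_local = ipaddress.ip_address(candidate).is_link_local
    print(candidate, "private:", in_private_list(candidate), "link-local:", link_local)
```

With a list like this, a rule that blocks "private" destinations leaves traffic to fe80:: addresses alone.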

STS-134

@jknott Yeah, I should have known that. But I simply looked at the table of "private addresses" and blindly added them all to a rule.

Do you have a clue why it would have worked for so long before failing? Why it even worked at all (for approximately 2 days) prior to February of this year? How could it have worked at all if IPv6 needs link-local addresses in order to operate?

JKnott

@sts-134

No idea.

STS-134

@jknott How do you block inter-VLAN traffic in your setup? With IPv4, you can just block RFC1918 addresses, but for IPv6, they're public, and since (for Comcast) they are subject to change, I've had to create rules to block access to every other VLAN I don't want each VLAN to have access to, i.e. reject traffic to "LAN net".

JKnott

@sts-134

You have to specifically allow routing between VLANs. So, just create rules to pass what you want.

STS-134

@jknott I don't think that works. Specifically allowing routing you want means rejecting traffic by default (when it reaches the end of your chain of rules). But if you're rejecting anything unknown, then you have no internet access, since "local" IPv6 addresses are public ones and there's no way to distinguish those from internet addresses. If Comcast changes my address block by giving my router a new block via DHCPv6-PD, my old addresses then become external internet addresses and should be routable from any VLAN that has access to the internet.
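
To show the ordering I mean, here is a toy first-match model in Python. It is not how pfSense evaluates rules internally, just a sketch under my own assumptions: the function names are hypothetical and the prefixes come from the 2001:db8::/32 documentation range. The block rules for the other VLANs are built from whatever prefix is currently delegated, followed by a pass-everything rule, with an implicit reject at the end.

```python
import ipaddress

def build_rules(local_nets, own_net):
    """Block every other local subnet, then pass everything else."""
    rules = [("block", net) for net in local_nets if net != own_net]
    rules.append(("pass", ipaddress.ip_network("::/0")))
    return rules

def evaluate(rules, destination: str) -> str:
    """Return the action of the first matching rule; reject if none match."""
    addr = ipaddress.ip_address(destination)
    for action, net in rules:
        if addr in net:
            return action
    return "block"

# Hypothetical delegated prefix; three of its /64s are used as VLANs.
delegated = ipaddress.ip_network("2001:db8:0:20::/59")
vlans = list(delegated.subnets(new_prefix=64))[:3]
rules = build_rules(vlans, own_net=vlans[0])

print(evaluate(rules, "2001:db8:0:21::5"))      # another VLAN
print(evaluate(rules, "2001:4860:4860::8888"))  # an internet address
```

If the delegated prefix changes, the block entries have to be rebuilt from the new prefix, otherwise they keep pointing at addresses that are no longer mine.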

JKnott

@sts-134

Here's an example. This is for my test LAN, but would be exactly the same on a VLAN.

I had to create both those rules to allow IPv4 & IPv6 from my test LAN to anywhere else. If I hadn't created those, I wouldn't be able to reach anything beyond the test LAN.