HE Tunnel will not come back up
-
Setup: SG-5100 with connections to both Verizon and Comcast/Xfinity.
I had some packet loss on the Comcast connection, but they eventually fixed it. However, the HE tunnel never came back up. I cannot ping the gateway. I have another HE tunnel on Verizon and that one is fine.
I double checked everything. I even rebuilt the tunnel. I see nothing in the logs. I ran a packet capture and only see traffic going Out. I contacted HE and they said they see traffic both ways.
This Tunnel has been working for years.
Help??
-
Hi,
A test on the he.net side:
Go to the IPv6 tunnel settings page of your https://www.tunnelbroker.net account.
Test: is the Client IPv4 address your actual WAN IP?
Another test:
Manually assign some random IP. It should show an error.
Reassign your real WAN IPv4. It should be good again.
If your WAN IPv4 is dynamic, check that the WAN IPv4 "dyndns" update is working.
Disable the GIF interface, and enable it again.
Double-check the two IPv6 endpoints against your he.net settings page.
Check that the IPv4 tunnel destination matches the GIF remote address.
-
@Gertjan Thank you for the hints and help. The HE trick about changing the IP is one I didn't know.
So I ran through everything:
- Changed the IP on HE and it failed; changed it back and it was good
- Confirmed the IP is correct on HE and in Dynamic DNS
- Confirmed the Tunnel IPv6 addresses are correct in GIF
- Confirmed the GIF remote address
- Disabled and re-enabled
It really feels like something is blocking the replies (or they're not being routed correctly somewhere).
I'm not sure if it's HE, Comcast, or pfSense.
-
Alright I did another test. Since I have two tunnels, one on each provider, I swapped the config at HE. The bad tunnel on Comcast I moved to Verizon, and the good tunnel on Verizon I moved to Comcast.
So this eliminates Comcast as a problem. It must be either pfSense or HE. Now to figure out which it is.
-
Are they using the same IPv4 tunnel IP on the he.net side?
-
@gertjan As far as I can tell, yes
-
@a4ehusker
Check the IPv6 routes and, if you can, post them here:
Diagnostics / Routes
-
Seems correct:
Internet6:
Destination              Gateway                  Flags  Netif  Expire
default                  2001:470:CCCC:DDDD::1    UGS    gif1
::1                      link#8                   UH     lo0
2001:470:AAAA:BBBB::1    link#13                  UH     gif0
2001:470:AAAA:BBBB::2    link#13                  UHS    lo0
2001:470:CCCC:DDDD::1    link#14                  UH     gif1
2001:470:CCCC:DDDD::2    link#14                  UHS    lo0
gif0 is the bad tunnel
gif1 is the good tunnel
-
@a4ehusker
Where is the gateway for gif0? There should be a UGS (Up + Gateway + Static) route for gif0, I suppose, if it's another tunnel.
-
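The UGS check suggested above can be done by eye, but for longer tables a small script helps. This is only a rough sketch in Python: `gateway_routes` and `SAMPLE` are made-up names for illustration, and the sample text is the (anonymized) table posted earlier in this thread, assumed to be whitespace-separated like `netstat -rn` output.

```python
def gateway_routes(netstat_text):
    """Return {netif: [destinations]} for routes whose flags include U, G and S."""
    routes = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        # Data rows have at least: destination, gateway, flags, netif.
        if len(fields) < 4 or fields[0] in ("Destination", "Internet6:"):
            continue
        destination, _gateway, flags, netif = fields[:4]
        if all(flag in flags for flag in "UGS"):
            routes.setdefault(netif, []).append(destination)
    return routes


# Hypothetical sample: the routing table from this thread, header included.
SAMPLE = """\
Internet6:
Destination              Gateway                 Flags  Netif
default                  2001:470:CCCC:DDDD::1   UGS    gif1
::1                      link#8                  UH     lo0
2001:470:AAAA:BBBB::1    link#13                 UH     gif0
2001:470:AAAA:BBBB::2    link#13                 UHS    lo0
2001:470:CCCC:DDDD::1    link#14                 UH     gif1
2001:470:CCCC:DDDD::2    link#14                 UHS    lo0
"""

routes = gateway_routes(SAMPLE)
print(routes)            # {'gif1': ['default']}
print("gif0" in routes)  # False
```

With the table from this thread, only gif1 carries a gateway route, which matches the observation that gif0 has no UGS entry.
-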
@kiokoman I'm not sure. Since it's down, the Gateway Group is set to the one that's up.
But if I force the route for the bad tunnel, it does show up as the gateway for default... but it still doesn't work. I can see all the traffic going out but zero traffic coming in. I'm still not sure whose problem it is, but I do lean towards HE. But since they said they saw traffic both ways on their end, I don't know if I have enough to have them check anything else.
-
@a4ehusker
MTU maybe? For my PPPoE connection I have MTU set to 1472 and MSS to 1440, for example.
-
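For anyone wondering where a value like 1472 comes from: a 6in4 (gif) tunnel wraps each IPv6 packet in an IPv4 header, which is 20 bytes when it carries no options. The sketch below only does that subtraction. `tunnel_mtu` is a made-up helper name, and the WAN MTUs of 1500 (Ethernet) and 1492 (PPPoE) are the usual defaults, assumed here rather than taken from anyone's actual config. The MSS is left out on purpose, since the right number depends on how the firewall's MSS field is defined.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options, added by 6in4 encapsulation


def tunnel_mtu(wan_mtu):
    """Largest IPv6 packet that fits in one IPv4 packet on a link with wan_mtu."""
    if wan_mtu <= IPV4_HEADER_BYTES:
        raise ValueError("WAN MTU too small to carry a 6in4 tunnel")
    return wan_mtu - IPV4_HEADER_BYTES


# Assumed typical WAN MTUs; check your own interface before relying on these.
for name, wan_mtu in [("Ethernet", 1500), ("PPPoE", 1492)]:
    print(name, wan_mtu, "->", tunnel_mtu(wan_mtu))
```

Under those assumptions the PPPoE case gives 1472, the same tunnel MTU mentioned above.
-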
@kiokoman Well this tunnel was working, and the second tunnel is working fine... so I doubt it but I guess it won't hurt to try.
-
@gertjan I've figured out the issue.
I have a modem (Netgear CM1150V) that allows LAGG/LACP connections. It was broken until they released a firmware update, which I noticed shortly after the ping issue with Comcast. Once Comcast fixed my line, I decided to set that up. I had it set up with a prior modem, but never this one due to the firmware bug.
So I undid the LAGG/LACP connection, and just made it failover, and suddenly the HE tunnel came back up! I do not know why it was not working. I'm not sure if the modem has a bug with sending back reply packets - but given that IPv4 works fine otherwise and the tunnel runs over IPv4 I think the issue is in pfSense.
Where do I submit a bug report over this?
(This was driving me crazy because it made no sense, but now it totally does!)
-
@a4ehusker I am having this same issue. HE tunnel works fine on a Netgear CM1200 with LAGG/LACP off. The moment I turn LAGG/LACP on in the Netgear, and then migrate my WAN connection to the LAGG connection on my pfSense box, the HE tunnel drops.
I've run pfSense continuously since 2011 and HE tunnels since 2009, so I'm pretty familiar with both. I hypothesize the issue is either the modem dropping protocol 41 with LAGG enabled, or a bug in pfSense encapsulating gif/6to4 over LAGG.
I'm running 21.02.2-RELEASE on a Netgate SG-8860. I also swapped the Netgear CM1200 with a Netgear CM1100 - same behavior. Again, Tunnel works fine without LAGG/LACP enabled, then 100% packet loss with LAGG/LACP enabled. I ran packet capture on gif0 and lagg0 and can see the traffic going to Tunnelbroker but zero replies/return traffic from HE.net.
I contacted HE.net support and they don't see any of the traffic coming in.
As soon as I disable LAGG on the modem and on pfsense, with zero other changes and of course no physical layer changes, tunnel pops right back up 0% packet loss no routing issues whatsoever.
Grr……..
-
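The "traffic goes out, nothing comes back" observation can be checked mechanically if the capture is available as raw IPv4 packets. The sketch below is illustrative only: `count_6in4` and `fake_ipv4_header` are made-up names, the addresses are documentation ranges rather than real endpoints, and it assumes each packet starts at the IPv4 header (no Ethernet framing). It counts packets with IP protocol number 41, which is what a gif/6in4 tunnel uses, by direction.

```python
import socket
import struct

PROTO_IPV6_IN_IPV4 = 41  # IANA protocol number used by 6in4 / gif tunnels


def count_6in4(packets, local_ip):
    """Count protocol-41 packets sent from and delivered to local_ip."""
    local = socket.inet_aton(local_ip)
    sent = received = 0
    for raw in packets:
        if len(raw) < 20 or raw[0] >> 4 != 4:
            continue  # not a complete IPv4 header
        if raw[9] != PROTO_IPV6_IN_IPV4:
            continue  # some other protocol (TCP, UDP, ICMP, ...)
        src, dst = raw[12:16], raw[16:20]
        if src == local:
            sent += 1
        elif dst == local:
            received += 1
    return sent, received


def fake_ipv4_header(src, dst, proto):
    """Build a minimal 20-byte IPv4 header for the demo (checksum left at 0)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, proto, 0,
                       socket.inet_aton(src), socket.inet_aton(dst))


# Hypothetical capture: three tunnel packets out, one unrelated TCP packet in.
WAN, HE = "192.0.2.10", "198.51.100.1"
packets = [fake_ipv4_header(WAN, HE, 41) for _ in range(3)]
packets.append(fake_ipv4_header(HE, WAN, 6))

print(count_6in4(packets, WAN))  # (3, 0)
```

A result like (3, 0) on the WAN side is the pattern described in this thread: protocol 41 leaves the firewall but no protocol 41 replies arrive.
-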
@akghetto INTERESTING! I'm glad I am not the only one experiencing this.
I used to run a Motorola MB8600 with HE & LAGG without issues, but that was a year and a half ago. So it might be the modem (since the commonality is Netgear), or maybe something changed with pfSense.
-
@a4ehusker I opened a bug report with pfSense. Since I had the issue down as either pfSense or the Netgear, they closed the bug but gave me a pointer to try to isolate it further. The bug feedback was:
"Not enough evidence here to conclude that it's a bug in FreeBSD or pfSense. You could test it further by not enabling LAGG on the modem, but setting the pfSense end to use a passive LAGG style such as failover which does not require any special setting on the modem."
I followed this advice and got zero packet drops on the tunnelbroker gateway. IPv6 tunnel traffic routes correctly, with 0% packet loss over the tunnel. So I strongly suspect the Netgear as the culprit.
Since I bought this modem brand new just two weeks ago it comes with 90 days of support. I've opened a ticket tonight with Netgear along with all my tcpdumps and troubleshooting steps, including isolating it to the modem. I'll let you know what develops.
-
@akghetto Awesome! I wonder if not using a LACP type connection on pfSense would eliminate the issue. Still, as I said, I've used a LAGG with a different vendor, so I kinda figured it was something with Netgear.
My modem is a year old (it took them that long to fix the bug where a LAGG connection would freeze after 24 hours). Hope you get some answers!
-
@a4ehusker Well, Netgate support basically stinks. They called me over the phone to confirm the problem, said they'd follow-up, and never did. I'm at my 30-day return window tomorrow so I'll be sending this back to Amazon as defective. Stinkage.
-
@akghetto Ahhhhh yuck. That sucks, but honestly not surprising. Well at least others will have an answer here about what not to do.
-
Netgate or Netgear.. What did you buy from netgate - seems from reading this thread you bought a netgear modem?
I don't see how you would have gotten an 8860 recently?? Via Amazon?
Confused..