he.net problem tunnel, one works the other does not
-
I'm hoping someone can help me diagnose a problem I'm having with a he.net tunnel. I have a dual WAN system, set up according to the pfsense docs for dual WAN, including ipv6. One of the tunnels, the primary ipv6 gateway, works fine and has for years. I just discovered that the secondary tunnel is only partially working.
The problem tunnel was set up in 2015 and worked at the time. When I added a second WAN connection in 2016, this tunnel became the backup ipv6 tunnel and has been little used since. I only discovered the problem a couple of weeks ago. This second tunnel works fine for pings and tracert, but not for http/https. I initially suspected asymmetric routing but I think I've ruled that out as the problem. I also suspected an MTU problem because this is a pppoe link and MTU seemed to be an issue that others have had with pppoe. I have working DNS on both tunnels.
If I run a simple command, such as curl -6 ip6.me/api/, through the problem tunnel, the connection times out. In a packet capture on that interface, I can see the initial 3 way handshake complete, then the http request goes out but no response comes back. After a few retries it gives up. If I do this but capture the traffic on the other tunnel's interface I get no packets, which I think eliminates asymmetric routing as the problem (not positive on this point).
If I run a trace with the same command on the working tunnel's interface everything looks basically the same except the response comes back as expected and of course the source address is that of the working interface.
I tried changing the MTU on the Hurricane Electric website advanced page as low as 1280 but that hasn't made any difference. Ifconfig shows MTU on the gif interfaces for both tunnels to be 1280. I've also tried changing the MTU and MSS on the interface to various lower values with no success. The curl command only returns a few hundred bytes, less than the MTU so I'm not sure the MTU is a problem.
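For anyone following the MTU reasoning, here is a minimal sketch of the arithmetic, not a diagnosis. It assumes the PPPoE link has the common MTU of 1492 (not measured on this connection) and uses a hypothetical 400-byte response as a stand-in for "a few hundred bytes". The header sizes are the standard ones for 6in4 encapsulation.

```python
# Worked MTU/MSS arithmetic for a 6in4 (gif) tunnel carried over PPPoE.
IPV4_HEADER = 20  # outer IPv4 header added by 6in4 encapsulation (no options)
IPV6_HEADER = 40  # fixed IPv6 header
TCP_HEADER = 20   # TCP header without options


def max_tunnel_mtu(wan_mtu: int) -> int:
    """Largest IPv6 packet that fits in one unfragmented IPv4 packet on the WAN."""
    return wan_mtu - IPV4_HEADER


def tcp_mss(ipv6_mtu: int) -> int:
    """TCP MSS for a given IPv6 MTU (MTU minus IPv6 and base TCP headers)."""
    return ipv6_mtu - IPV6_HEADER - TCP_HEADER


PPPOE_MTU = 1492      # assumption: common PPPoE default, not measured on this link
GIF_MTU = 1280        # value ifconfig reports for both gif interfaces
RESPONSE_BYTES = 400  # hypothetical stand-in for "a few hundred bytes"

print("max tunnel MTU over PPPoE:", max_tunnel_mtu(PPPOE_MTU))
print("MSS at gif MTU 1280:", tcp_mss(GIF_MTU))
print("response fits in one segment:", RESPONSE_BYTES <= tcp_mss(GIF_MTU))
```

Under those assumptions a gif MTU of 1280 sits well below what the PPPoE link could carry, which is consistent with the doubt that MTU is the cause here.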
The backup WAN is a pppoe DSL connection while the working WAN is a cable connection. I don't have any problems with ipv4, so this seems to be only related to the he.net tunnel on the pppoe WAN. Both he.net tunnels are set up the same way. I am running these over a routed /48 because I have several vlans and needed more than one /64. The internal vlans are using DHCPv6 and use the problem tunnel's ipv6 /48 as the basis for the ipv6 addresses. The working tunnel uses NPt to translate the addresses and this seems to work without issue.
I tried setting up another tunnel to replace the problematic one, the new one using NPt and kept the DHCPv6 addressing the same as before, but had the same problems, which follow the pppoe connection.
Any ideas what I could be missing? I have a feeling it's something obvious, but haven't hit on the solution yet.
-
Like you, I've set up a tunnel ages ago.
I had to make my ISP router answer pings, or forward pings to pfSense and have it reply. Otherwise my WAN IP isn't accepted.
Basic checks are ok? For example, on the 'tunnelbroker' side, the "Client IPv4 Address:" IS your WAN IP?
I've no experience with multiple WAN's so, are you sure the GIF setup uses the correct outgoing WAN interface, the one with the "Client IPv4 Address:" set up to the correct IP ?
What happens when you temporary go back to one WAN ? The 'simple' setup.
Also, if issues arise, always check https://www.tunnelbroker.net/status.php and the he.net user forum. Tunnels can have issues once in a while.
I'm not using NPt, I just carve out a first /64 and assign it to my LAN, a second /64 for my next LAN, etc.
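To illustrate the carving, here is a small sketch using Python's standard ipaddress module. It assumes the /64s come out of a routed /48, and the prefix 2001:db8:1234::/48 is a documentation placeholder, not anyone's real allocation. The LAN names are made up as well.

```python
import ipaddress
from itertools import islice

# Placeholder routed prefix from the IPv6 documentation range (assumption).
ROUTED_48 = ipaddress.ip_network("2001:db8:1234::/48")


def first_lan_prefixes(routed, count):
    """Return the first `count` /64 networks inside the routed prefix."""
    return list(islice(routed.subnets(new_prefix=64), count))


# Hypothetical interface names, one /64 each.
for name, prefix in zip(["LAN", "LAN2", "LAN3"], first_lan_prefixes(ROUTED_48, 3)):
    print(name, prefix)
```

A /48 holds 65536 such /64 networks, so there is plenty of room for one per LAN or VLAN without NPt.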
I never had to use VLANs myself; physical NICs are not an issue these days, and we all love 'many ports'.
@wbond said in he.net problem tunnel, one works the other does not:
curl -6 ip6.me/api/
Strange.
I would use an option in that command to bind it to a specific IPv6 interface, see https://www.freebsd.org/cgi/man.cgi?query=curl
Or do you policy route IPv6 traffic?
-
@gertjan thanks for the reply.
I'm certain that the pppoe tunnel worked years ago when I set it up as I only had one WAN at the time. When I added the second WAN it was 5x faster so the new WAN became the default gateway and the pppoe WAN became the backup and barely used.
I have checked that the client IPv4 address is correct for both tunnels. I'm using the dynamic DNS method to keep those IPv4 addresses updated. If the IPv4 address is wrong the gateway status goes offline. What I forgot to mention in the original post is that the gateway status for the problem tunnel is online, which makes sense because pings are working.
I have checked the GIF status multiple times thinking something has to be wrong, but they're both set up the same. At first, I thought the problem had to be something wrong in the GIF settings, but I can't see anything wrong there. I will check again.
If I bring the working tunnel down, or unplug the ethernet cable from the cable modem, then pfsense will switch the IPv6 gateway to the pppoe tunnel as expected, because it thinks it's online. When this happens the behavior doesn't change: pings and tracert work but http/https does not. In this case the IPv4 gateway switches over and works fine.
Sorry, I should have mentioned the curl command was being run from a Windows machine, and yes using policy routing. If I run the curl command directly on the pfsense box's command line it just hangs until I get a 504 gateway time-out, curl --interface gif0 ip6.me/api/. If I run the curl command on gif1 it returns the result almost immediately.
Thanks again.
-
@wbond said in he.net problem tunnel, one works the other does not:
If I bring the working tunnel down
Not related to your question, but, in this case, your tunnel will fail, right ? The GIF setup is 'hard coded' to an interface name, probably your main '5x faster' WAN.
@wbond said in he.net problem tunnel, one works the other does not:
pfsense box's command line it just hangs until I get a 504 gateway time-out, curl --interface gif0 ip6.me/api/
There you have your error : the GIF tunnel doesn't work.
I have :
[2.5.2-RELEASE][root@pfsense.right-here.net]/root: curl --interface gif0 ip6.me/api/
IPv6,2001:470:1f12:xxx::2,Remaining fields reserved for future use,,,
where 2001:470:1f12:xxx::2 is my local tunnel IP.
-
If I bring the cable tunnel (GIF1) down, the pppoe tunnel (GIF0) becomes the default gateway; because pings are working, pfsense thinks the pppoe tunnel is working ok. In this case I lose IPv6 http/https connectivity, but IPv6 pings and tracert still work. When I bring the cable tunnel back up it becomes the default gateway again and everything works normally. Not sure if this answers your question.
Yes, the error is the GIF0 tunnel doesn't work with http/https, but pings/tracert do work. The part I haven't figured out is why is it not working for http/https? The GIF1 tunnel works as expected and passes any IPv6 tests I've thrown at it.
thanks
-
Below is a packet capture from the gif0 interface when trying to run "curl --interface gif0 ip6.me/api/" on the webgui diagnostics > command prompt. The 2001:470 address is the client end of the IPv6 tunnel. This creates the connection, and creates a pair of states on the firewall, but then the connection times out with a "504 gateway time-out".
All the while dpinger is able to ping a google ipv6 address (2001:4860:4860::8844) through the same interface, so the gateway shows online. The same curl command on gif1 completes without any error.
I'm not good at reading packet captures, but I don't see any reason why it fails. It has something to do with the pppoe connection, I think, but I don't understand why. If I create a completely new tunnel it will fail in the same way on the same connection. The ISP is Centurylink in case it matters. The firewall is running on a Supermicro motherboard with 4 igb ports, so it's fairly decent hardware.
Any ideas of what else I could check would be most welcome.
17:36:23.962116 AF IPv6 (28), length 84: (flowlabel 0x4bdcc, hlim 64, next-header TCP (6) payload length: 40) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [S], cksum 0x6f27 (correct), seq 1841097079, win 65228, options [mss 1220,nop,wscale 7,sackOK,TS val 419389964 ecr 0], length 0
17:36:24.016422 AF IPv6 (28), length 84: (flowlabel 0x04ecc, hlim 54, next-header TCP (6) payload length: 40) 2001:4838:0:1b::201.80 > 2001:470:xxxx:yyyy::2.30098: Flags [S.], cksum 0x3a7c (correct), seq 2521845452, ack 1841097080, win 65535, options [mss 1440,nop,wscale 6,sackOK,TS val 737747318 ecr 419389964], length 0
17:36:24.016478 AF IPv6 (28), length 76: (flowlabel 0x4bdcc, hlim 64, next-header TCP (6) payload length: 32) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [.], cksum 0x66fa (correct), seq 1, ack 1, win 515, options [nop,nop,TS val 419390018 ecr 737747318], length 0
17:36:24.016619 AF IPv6 (28), length 150: (flowlabel 0x4bdcc, hlim 64, next-header TCP (6) payload length: 106) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [P.], cksum 0xa1d0 (correct), seq 1:75, ack 1, win 515, options [nop,nop,TS val 419390018 ecr 737747318], length 74: HTTP, length: 74
GET /api/ HTTP/1.1
Host: ip6.me
User-Agent: curl/7.76.1
Accept: */*
17:36:24.380784 AF IPv6 (28), length 150: (flowlabel 0x4bdcc, hlim 64, next-header TCP (6) payload length: 106) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [P.], cksum 0xa063 (correct), seq 1:75, ack 1, win 515, options [nop,nop,TS val 419390383 ecr 737747318], length 74: HTTP, length: 74
GET /api/ HTTP/1.1
Host: ip6.me
User-Agent: curl/7.76.1
Accept: */*
17:36:24.910777 AF IPv6 (28), length 150: (flowlabel 0x4bdcc, hlim 64, next-header TCP (6) payload length: 106) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [P.], cksum 0x9e51 (correct), seq 1:75, ack 1, win 515, options [nop,nop,TS val 419390913 ecr 737747318], length 74: HTTP, length: 74
GET /api/ HTTP/1.1
Host: ip6.me
User-Agent: curl/7.76.1
Accept: */*
17:36:25.025399 AF IPv6 (28), length 84: (flowlabel 0x04ecc, hlim 54, next-header TCP (6) payload length: 40) 2001:4838:0:1b::201.80 > 2001:470:xxxx:yyyy::2.30098: Flags [S.], cksum 0x368c (correct), seq 2521845452, ack 1841097080, win 65535, options [mss 1440,nop,wscale 6,sackOK,TS val 737748326 ecr 419389964], length 0
17:36:25.025444 AF IPv6 (28), length 76: (hlim 64, next-header TCP (6) payload length: 32) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [.], cksum 0x62c0 (correct), seq 75, ack 1, win 514, options [nop,nop,TS val 419391027 ecr 737747318], length 0
17:36:25.770775 AF IPv6 (28), length 150: (flowlabel 0x4bdcc, hlim 64, next-header TCP (6) payload length: 106) 2001:470:xxxx:yyyy::2.30098 > 2001:4838:0:1b::201.80: Flags [P.], cksum 0x9af5 (correct), seq 1:75, ack 1, win 515, options [nop,nop,TS val 419391773 ecr 737747318], length 74: HTTP, length: 74
GET /api/ HTTP/1.1
Host: ip6.me
User-Agent: curl/7.76.1
Accept: */*
-
I'm revisiting this because I haven't found a solution. Previous posts left out that I'm using pfsense 2.5.2.
In the packet capture I think the 3 way handshake and state creation rule out asymmetric routing. Is that true? What else could prevent a response to the curl request from coming back? The packet size of a response that works on the other GIF connection is only 357 bytes, which is well below the MSS of 1220, so I think that rules out an MTU problem on the pppoe WAN.
Both asymmetric routing and MTU problems have seemingly been causes of others having similar problems so I'm trying to rule those out. I'm not sure if the problem is something in my pfsense configuration, but that seems to me the most likely problem.
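One way to read the gif0 capture more mechanically is to count packets that repeat the same flags and sequence numbers. The sketch below does that on an abridged copy of the capture header lines (timestamp, endpoints, flags and seq only, addresses masked as in the post). The regular expression and function name are illustrative and assume this abridged layout, not raw tcpdump output.

```python
import re
from collections import Counter

# Header lines from the gif0 capture above, abridged to timestamp, endpoints,
# flags and sequence numbers. Addresses are masked exactly as in the post.
CLIENT = "2001:470:xxxx:yyyy::2.30098"
SERVER = "2001:4838:0:1b::201.80"
CAPTURE = f"""\
17:36:23.962116 {CLIENT} > {SERVER}: Flags [S], seq 1841097079
17:36:24.016422 {SERVER} > {CLIENT}: Flags [S.], seq 2521845452
17:36:24.016478 {CLIENT} > {SERVER}: Flags [.], seq 1
17:36:24.016619 {CLIENT} > {SERVER}: Flags [P.], seq 1:75
17:36:24.380784 {CLIENT} > {SERVER}: Flags [P.], seq 1:75
17:36:24.910777 {CLIENT} > {SERVER}: Flags [P.], seq 1:75
17:36:25.025399 {SERVER} > {CLIENT}: Flags [S.], seq 2521845452
17:36:25.025444 {CLIENT} > {SERVER}: Flags [.], seq 75
17:36:25.770775 {CLIENT} > {SERVER}: Flags [P.], seq 1:75
"""

# Matches the abridged line layout used in CAPTURE (an assumption of this sketch).
LINE_RE = re.compile(
    r"^\S+ (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\], seq (?P<seq>\d+(?::\d+)?)"
)


def find_repeats(capture):
    """Count packets per (sender, flags, seq) and keep those seen more than once."""
    seen = Counter()
    for line in capture.splitlines():
        m = LINE_RE.match(line)
        if m:
            seen[(m["src"], m["flags"], m["seq"])] += 1
    return {key: n for key, n in seen.items() if n > 1}


for (src, flags, seq), count in find_repeats(CAPTURE).items():
    print(f"{src} sent [{flags}] seq {seq} {count} times")
```

Besides the repeated GET, the far end sends its SYN-ACK a second time about a second later. A server normally repeats a SYN-ACK when it has not seen the final ACK of the handshake, so one possible reading is that packets after the initial SYN are being lost on the way out, rather than the reply being lost on the way back. That is only a reading of this one capture, not a confirmed cause.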
-
@wbond I don't have a 2nd IP to test with, and HE will only let you bring up a 2nd tunnel from another IP.
Are you trying to bring up a 2nd tunnel with the same /48 or a different /48? Are you going to the same HE pop?
-
@johnpoz thanks for the reply.
They are two different /48's on different HE pops, Denver and Fremont. You are correct in that you need a 2nd IP to bring up a second tunnel. I tried creating a new tunnel in another pop and had the same result. The problem follows the pppoe / DSL connection. I don't think it could be an HE.net problem, but don't really see how it could be a Centurylink problem either. This leads me back to my pfsense configuration being the likely problem, but nothing I've tried changes the behavior.
I have a backup firewall box and think I'll try starting from scratch with a single WAN on the pppoe connection and see if that behaves correctly.
-
I have a backup firewall box and think I'll try starting from scratch with a single WAN on the pppoe connection and see if that behaves correctly.
On my backup firewall I deleted all references to the second WAN interface, 2nd GIF, extra gateways, etc., leaving only the pppoe connection and one HE net connection. The behavior is the same, pings and traceroute work but http/https does not. I think this rules out the multi WAN configuration as being the problem, but still no closer to figuring out what the problem is.
-
@wbond
Just one WAN and still issues ?
Most of us have one WAN, and it works.
I was using pppoe myself in the past, but that isn't possible any more, as my ISP forces me to use their xDSL router.
Can you switch your modem to router mode and see if the issue disappears, and is thus related to 'pppoe'?
No firewall rules on WAN are needed to make this work.
Exception: your real WAN IPv4 must reply to pings from the tunnel server to get your WAN IPv4 accepted as a tunnel endpoint.
-
@gertjan yes, issues even with just the one WAN.
I currently have the DSL modem in bridge mode. I had not considered switching it, but that is worth trying, thanks for the idea.
Yes, the only related rule on real WAN is the rule allowing pings from HE.net.
-
@wbond
Ok, great.
The dull thing about "ISP router in router mode" is: it should work, as my he IPv6 tunnels are all up right now @work and also the one @home.
I'm using the he.net POP in Paris.
Note: he.net is supplying me with IPv6, as my ISP doesn't know what that is (@work), or, @home, they just supply one /64, so none are available for the LANs.
Btw: always check the tunnel status. And if in doubt, the forum on he.net.