V6 Quality drop problem



  • Hi,

    I've been running 2.1-BETA0 for a while now, with a v6 tunnel from he.net up for a month or so.  When I first configured v6, it was extremely fast.  As of about Tuesday, August 14th, something changed that has affected either the actual RTT through the tunnel, or the RRD "Quality" reporting for v6.  I'm trying to determine which is the case.

    1. Has anyone else noticed a quality change with Tunnelbroker/he.net, with a significant uptick in RTT (ms) beginning around that date?

    2. As I've been updating pfSense sporadically (when I'm not traveling), I also have to ask whether there was an update to the RRD graphing that changed during that time.

    I am in the Dallas, TX area and connect to the Equinix tunnel broker endpoint here in town.  I use Time Warner cable for my local Internet service and haven't seen a change for non-v6 traffic/quality during that time.

    I'm looking for ideas on what might have happened here, so any info/discussion is welcome.  I've also sent an email to HE.net to see if anything has changed at the Equinix endpoint or with routing.

    The graphs posted below show the v6 and v4 figures respectively.


    Thanks!
    Treffin



  • The values in the RRD are what they are, nothing has changed there. HE.net's service can be a little hit and miss at times, so it may just be greater load on that endpoint. It's also possible they moved that endpoint IP somewhere else; it's possible your ISP had an issue with a peering and had to drop it temporarily; it's possible that for business reasons they or your ISP took down a peering and traffic now has to take a much longer path. There are numerous potential causes. Try pinging the v4 IP that's the endpoint of the tunnel and see how it responds. From what I've seen, it's always right in line with the ping times on v6 within the tunnel.



  • Thanks cmb!  I did some research and found the following via ping and traceroute:

    [2.1-BETA0][admin@storm.xxx.xxx]/root(1): ping 216.218.224.42
    PING 216.218.224.42 (216.218.224.42): 56 data bytes
    64 bytes from 216.218.224.42: icmp_seq=0 ttl=55 time=50.247 ms
    64 bytes from 216.218.224.42: icmp_seq=1 ttl=55 time=50.202 ms
    64 bytes from 216.218.224.42: icmp_seq=2 ttl=55 time=50.382 ms
    64 bytes from 216.218.224.42: icmp_seq=3 ttl=55 time=47.988 ms
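    Averaging those four samples (a quick sketch in Python, with the times hard-coded from the ping output above):

```python
# Mean RTT of the four ping samples above (ms, copied from the output).
samples = [50.247, 50.202, 50.382, 47.988]
avg = sum(samples) / len(samples)
print(f"avg rtt = {avg:.2f} ms")  # prints: avg rtt = 49.70 ms
```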

    So that looks fairly close to what I'm seeing recently in the RRD graphs.  It certainly wasn't that slow in the beginning, so it appears something has changed with the routing/peering between TW and HE.  Here is the traceroute, which looks ugly, especially towards the bottom.  It appears the traffic may be forwarded to LAX, then PHX, and back to DFW, or at least it's being shoved through those core router networks.

    dtuc-mac-e:~ dt$ traceroute 216.218.224.42
    traceroute to 216.218.224.42 (216.218.224.42), 64 hops max, 52 byte packets
     1  storm (10.0.1.1)  1.567 ms  0.642 ms  0.562 ms
     2  cpe-xx-xx-xx-x.tx.res.rr.com (x.x.x.x)  10.844 ms  21.627 ms  13.379 ms
     3  24.164.210.253 (24.164.210.253)  9.240 ms  8.634 ms  9.440 ms
     4  tge1-4.dllatx40-tr02.texas.rr.com (24.175.38.32)  25.671 ms  20.352 ms  23.774 ms
     5  be26.dllatx10-cr02.texas.rr.com (24.175.36.216)  20.657 ms  28.513 ms  24.039 ms
     6  24.175.49.8 (24.175.49.8)  20.945 ms  23.499 ms  24.511 ms
     7  ae-2-0.cr0.hou30.tbone.rr.com (66.109.6.108)  21.874 ms  23.194 ms  23.427 ms
     8  ae-0-0.pr0.dfw10.tbone.rr.com (66.109.6.181)  21.586 ms
        107.14.17.141 (107.14.17.141)  20.466 ms
        ae-0-0.pr0.dfw10.tbone.rr.com (66.109.6.181)  21.870 ms
     9  tengigabitethernet2-1.ar4.dal2.gblx.net (64.211.60.81)  19.162 ms
        66.109.9.214 (66.109.9.214)  15.513 ms  17.511 ms
    10  64.209.105.42 (64.209.105.42)  57.022 ms  61.205 ms  56.050 ms
    11  10gigabitethernet1-3.core1.lax2.he.net (72.52.92.122)  58.119 ms  58.108 ms  64.109 ms
    12  10gigabitethernet2-3.core1.phx2.he.net (184.105.222.85)  62.391 ms  61.794 ms  73.904 ms
    13  10gigabitethernet5-3.core1.dal1.he.net (184.105.222.78)  56.680 ms  63.593 ms  57.502 ms
    14  tserv1.dal1.he.net (216.218.224.42)  52.135 ms  52.867 ms  50.681 ms
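    Incidentally, the detour can be read straight out of the router hostnames. Here's a small sketch that pulls the PoP city codes out of the labels (the hop names are copied from the trace above; the `CITY` map is my own guess at what the codes mean, not anything official):

```python
import re

# Resolved router hostnames copied from the traceroute above.
hops = [
    "tge1-4.dllatx40-tr02.texas.rr.com",
    "be26.dllatx10-cr02.texas.rr.com",
    "ae-2-0.cr0.hou30.tbone.rr.com",
    "ae-0-0.pr0.dfw10.tbone.rr.com",
    "tengigabitethernet2-1.ar4.dal2.gblx.net",
    "10gigabitethernet1-3.core1.lax2.he.net",
    "10gigabitethernet2-3.core1.phx2.he.net",
    "10gigabitethernet5-3.core1.dal1.he.net",
    "tserv1.dal1.he.net",
]

# Guessed meanings of the PoP codes embedded in the labels.
CITY = {"dllatx": "Dallas", "hou": "Houston", "dfw": "Dallas",
        "dal": "Dallas", "lax": "Los Angeles", "phx": "Phoenix"}

def pop_code(hostname):
    """Return the first dotted label that, with everything from the
    first digit onward stripped, matches a known PoP code."""
    for label in hostname.split("."):
        base = re.sub(r"\d.*$", "", label)
        if base in CITY:
            return base
    return None

path = [CITY[pop_code(h)] for h in hops if pop_code(h)]
print(" > ".join(path))
```

    The printed sequence makes the zig-zag explicit: Dallas out to Los Angeles and Phoenix before landing back in Dallas at the tunnel server.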

    In any case, it does seem that the problem is HE<=>TW related, now that I'm into it a bit further.  Thanks for the input!

    Treffin



  • Wow yeah, that's one heck of a path. Judging by the latency, I suspect that is accurate. I'm in Austin at the moment and I have basically the exact same connectivity and latency to that .42 host as you have, about 50 ms, +/- 5 ms. I'm going ATX > DAL > LA > PHX > DAL. Terrible path…

    One of our developers is about 40 miles outside of Chicago, to get to the Chicago HE.net endpoint, his traffic goes to NYC and back. At home I'm about 300 miles away from Chicago using the same one, and my latency is about the same if not a little better than his. Not always the best routing in the world on those, unfortunately...

