IPv6 connectivity from LAN is lost after PPPoE reconnect



  • Hi,

    I have had a dual-stack configuration with Deutsche Telekom that has been working (!) for years now. See the configuration below.

    What I see is that every now and then, while my LAN networks still get IPv6 addresses, outbound IPv6 communication stops working. After watching this for a while, it seems to be connected to PPPoE reconnects: once PPPoE has reconnected, IPv4 works like a charm but IPv6 does not.

    What does work: when I go to the LAN interface configuration and change the IPv6 prefix ID of the tracked interface from e.g. 0 to 5, the IPv6 address on the pfSense LAN interface disappears and comes back after a while (but, to the best of my knowledge, it is still the same address, so the prefix ID remains 0 and does not change to 5). BUT suddenly IPv6 from the clients behind the LAN works again.

    Do you guys have any idea? At first I thought that the prefix delegation (done via DHCPv6 on the WAN interface) was screwed up or that the new prefixes were not attached to the LAN interface. But since the above workaround works without my having to disconnect/reconnect the PPPoE, it looks as if the prefixes are reaching the WAN interface just fine.

    I am at a complete loss here. I am not sure whether this started with 2.4.5, but that is an update I made recently, and I have been able to "reproduce" this since then. It might have been there before. By "reproduce" I mean that this only seems to happen if the PPPoE reconnect is triggered by a real loss of connectivity or by the carrier. If I hit disconnect/reconnect manually, it does not happen. Maybe it has something to do with the monitor scripts?

    What "troubles" me is that the LAN IPv6 addresses on the pfSense interfaces (I have three LAN interfaces with different VLANs) do not change. It appears as if everything is working, but something in the routing engine or pf is messed up.

    General setup information:

    Two WAN connections.

    • One dual-stack (IPv4 & IPv6) Deutsche Telekom connection with dynamic IP
    • One IPv4 with static IPv4

    Log file showing tonight's PPPoE link loss. Maybe you can see something in it that I failed to recognize:
    log.txt

    Configuration for the main WAN and the LAN interfaces:

    1807b3b2-4d99-4b93-b3d0-9e357055e712-image.png

    1bfd9152-1826-4db0-9700-f29952459921-image.png

    9b3b2220-b138-44f0-8d8b-a72753976c27-image.png

    a24f0900-1c11-4cd7-8a75-c2d63c2ca6f6-image.png

    Would be very grateful for any hint on how to debug this.

    Regards,
    JP



  • Hi JP!

    Were you able to resolve your issue in the meantime?

    I can confirm everything you have written with the dual stack configuration of "Deutsche Telekom". I had it running for years and this year I noticed the same effects as you describe.

    If you try to reach the IPv6 world directly from the pfSense, it works all the time, even after the clients won't get through anymore.

    If someone from the development team wants to have a look at this: which log files are relevant?

    All the best!



  • @mphilippi

    Do a packet capture to see what's actually going out.



  • I believe I can explain that (I have the same problem): Your clients generate their addresses via SLAAC. When the WAN prefix changes, radvd announces the new prefix and the clients generate a new IP address with it. Unfortunately, the old prefix doesn't get deprecated and the clients still use it until the preferred lifetime is over, but the ISP of course doesn't route the old prefix any more for you.

    You can e.g. check the IP addresses, including the lifetime information, on Windows with "netsh interface ipv6 show addresses". In this situation it shows 2 preferred public and 2 preferred temporary IPv6 addresses, one each with the old and the new prefix. The next time you notice the problem, go to an affected client and explicitly specify the correct (new) source address (on Windows: ping -S) when trying to ping something on the Internet. I expect that this will work.
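
    To pick the right source address from that list, the addresses can be compared against the prefix the router currently holds. A minimal Python sketch of that comparison follows; the addresses and the prefix are made-up values from the documentation range 2001:db8::/32, and split_by_prefix is a hypothetical helper, not a pfSense or Windows tool.

```python
import ipaddress


def split_by_prefix(addresses, current_prefix):
    """Split client addresses into those inside and outside the current prefix."""
    net = ipaddress.ip_network(current_prefix)
    usable, stale = [], []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        # Link-local addresses (fe80::/10) are never routed to the Internet, skip them.
        if addr.is_link_local:
            continue
        (usable if addr in net else stale).append(addr)
    return usable, stale


# Hypothetical example values, not taken from a real client.
client_addresses = [
    "fe80::1234",
    "2001:db8:aaaa:1::10",  # formed from the old prefix
    "2001:db8:bbbb:1::10",  # formed from the new prefix
]
usable, stale = split_by_prefix(client_addresses, "2001:db8:bbbb:1::/64")
print("use as ping source:", [str(a) for a in usable])
print("no longer routed:  ", [str(a) for a in stale])
```

    Any address that ends up in the second list is a candidate for the broken source address.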

    I haven't found a setting that avoids this problem. The only workaround I have found so far is to decrease the "Default preferred lifetime" so that the clients deprecate the address faster. But ideally radvd would just deprecate the old prefix (i.e. announce it with preferred lifetime 0). I never do a manual disconnect/reconnect (and I don't want to reconnect now to try it out ;) ), but @j-koopmann says that it doesn't happen then, so I assume that radvd does exactly that in this situation. Maybe you can check this, @mphilippi, since you are currently experimenting anyway.
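
    Why announcing the old prefix with preferred lifetime 0 would help can be sketched with the lifetime update rule from RFC 4862, section 5.5.3: the preferred lifetime of an existing SLAAC address is taken straight from the RA, while the valid lifetime cannot be pushed below two hours by an unauthenticated RA. The Python below is only an illustration of that rule; apply_ra is a hypothetical name, and the sketch leaves out the authenticated-RA case and the check that drops options whose preferred lifetime exceeds the valid lifetime.

```python
TWO_HOURS = 7200  # seconds; floor that RFC 4862 section 5.5.3 applies to the valid lifetime


def apply_ra(remaining_valid, advertised_valid, advertised_preferred):
    """Return (valid, preferred) lifetimes of an existing SLAAC address after an RA."""
    if advertised_valid > TWO_HOURS or advertised_valid > remaining_valid:
        valid = advertised_valid      # normal refresh
    elif remaining_valid <= TWO_HOURS:
        valid = remaining_valid       # a short advertised lifetime is ignored
    else:
        valid = TWO_HOURS             # cut down, but never below two hours
    # The preferred lifetime is always reset to the advertised value.
    return valid, advertised_preferred


# Old prefix announced with lifetime 0: the address is deprecated at once
# (preferred lifetime 0) but stays valid for up to two more hours.
print(apply_ra(remaining_valid=86000, advertised_valid=0, advertised_preferred=0))

# Old prefix still announced with the pfSense defaults: the address is fully refreshed.
print(apply_ra(remaining_valid=86000, advertised_valid=86400, advertised_preferred=14400))
```

    A deprecated address remains usable for existing connections, but clients prefer a non-deprecated address for new ones, which is the behavior wanted here.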



  • @HG said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    I haven't found a setting that avoids this problem.

    On the WAN page, select Do not allow PD/Address release.



  • @JKnott Thanks, I already know that this is your favorite solution, but this doesn't help because the prefix still changes. ;) I'm using this setting as well, but it doesn't make any difference here, as it has no effect with many ISPs like mine, especially with PPPoE connections.

    Edit: As we are talking about connection interruptions here, this setting doesn't make any difference anyway: when the connection is interrupted, no release can be sent, so the effect is exactly the same as with the "Do not allow PD/Address release" setting enabled.



  • Hi guys,

    @mphilippi I have not resolved the issue in the sense that I know what happened or changed anything. It has not happened to me for weeks now (so it is very hard to debug and, @JKnott, hence not easy to take a packet capture). So I am not sure whether the underlying problem has vanished or I have just been lucky for weeks.

    @HG: I am aware of what you are saying and have noticed this in the past. For whatever reason, in the past this has NOT led to the loss of IPv6 connectivity (even if my Mac had multiple IPv6 addresses, one of which was the old one that could not be routed anymore, a new ping6 would still work like a charm).

    I am pretty sure I disconnected and reconnected LAN/WAN in these scenarios and as such would hope that in this case SLAAC would not result in the old prefix being negotiated in addition to the current one. And still in my scenario ipv6 was gone until I changed the Prefix ID in the LAN interface settings and waited for a few seconds....



  • Thanks for the replies!

    I've seen a loss of IPv6 connectivity not only with the "home user plan" with a changing WAN prefix but also with the "business plan" with a static WAN prefix. Both with the same provider (Telekom).

    I just checked the "Do not allow PD/Address release" option for both pfSense boxes and will report whether this has fixed the issue.

    The clients get their addresses either via SLAAC/unmanaged or DHCPv6/managed, depending on the VLAN. Normally it takes one day for the clients on the changing prefix plan to lose the connection (the prefix has changed then, too). With the static prefix plan, it takes a week or two and the prefix remains the same, of course.

    @HG Thanks for your in-depth answer. I will try your method as soon as the clients lose IPv6 connection again. I suspect that this will be the case tomorrow.



  • @j-koopmann said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    It has not happened to me for weeks now (so it is very hard to debug and, @JKnott, hence not easy to take a packet capture).

    It would be hard with the pfSense Packet Capture, but not with a managed switch and Wireshark. I configured a 5 port switch as a data tap and leave a computer running Wireshark up for as long as it takes.



  • @HG said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    You can e.g. check the IP addresses, including the lifetime information, on Windows with "netsh interface ipv6 show addresses". In this situation it shows 2 preferred public and 2 preferred temporary IPv6 addresses, one each with the old and the new prefix. The next time you notice the problem, go to an affected client and explicitly specify the correct (new) source address (on Windows: ping -S) when trying to ping something on the Internet. I expect that this will work.

    I haven't found a setting that avoids this problem. The only workaround I have found so far is to decrease the "Default preferred lifetime" so that the clients deprecate the address faster. But ideally radvd would just deprecate the old prefix (i.e. announce it with preferred lifetime 0). I never do a manual disconnect/reconnect (and I don't want to reconnect now to try it out ;) ), but @j-koopmann says that it doesn't happen then, so I assume that radvd does exactly that in this situation. Maybe you can check this, @mphilippi, since you are currently experimenting anyway.

    As expected, the link with the dynamic prefix does not work with IPv6 anymore.
    The WAN prefix changed this night and the pfSense uses a new address with the new prefix (ping from the webgui works), however the clients still receive the old WAN prefix.

    The lifetime on the client resets itself every few seconds to 86400 and starts counting down again.

        inet6 2003:f1:6704:a81:****:****:****:****/64 scope global dynamic mngtmpaddr noprefixroute 
           valid_lft 86397sec preferred_lft 14397sec
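
    To watch the lifetimes over time instead of reading them by eye, output in the format above can be parsed with a short Python sketch. parse_lifetimes is a hypothetical helper, and the sample text uses made-up addresses from the documentation range 2001:db8::/32 together with the lifetime values shown above.

```python
import re

# Matches e.g. "valid_lft 86397sec preferred_lft 14397sec" and the "forever" variant.
LIFETIME_RE = re.compile(
    r"valid_lft (\d+|forever)(?:sec)? preferred_lft (\d+|forever)(?:sec)?"
)


def parse_lifetimes(ip_output):
    """Return (address, valid_lft, preferred_lft) tuples from `ip -6 addr show` text."""
    results, current = [], None
    for line in ip_output.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "inet6":
            current = words[1]  # remember the address until its lifetime line arrives
            continue
        match = LIFETIME_RE.search(line)
        if match and current is not None:
            results.append((current, match.group(1), match.group(2)))
            current = None
    return results


# Hypothetical sample in the same layout as the client output.
sample = """\
    inet6 2001:db8:6704:a81::10/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 86397sec preferred_lft 14397sec
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever
"""
for address, valid, preferred in parse_lifetimes(sample):
    print(address, valid, preferred)
```

    If the valid lifetime of the old-prefix address jumps back up between two runs, something on the link is still advertising that prefix.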
    

    The pfSense has got the new prefix. IPv6 address for this interface:

    IPv6 Address
        2003:f1:670a:c981:****:****:****:****
    Subnet mask IPv6
        64
    


  • @mphilippi said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    The WAN prefix changed this night and the pfSense uses a new address with the new prefix (ping from the webgui works), however the clients still receive the old WAN prefix.

    The problem is that the clients don't change their prefix when an RA with a new prefix is sent out. I don't see how pfSense can do anything to get around that.



  • @JKnott said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    @mphilippi said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    The WAN prefix changed this night and the pfSense uses a new address with the new prefix (ping from the webgui works), however the clients still receive the old WAN prefix.

    The problem is that the clients don't change their prefix when an RA with a new prefix is sent out. I don't see how pfSense can do anything to get around that.

    Do you know why the lifetime of the IP is resetting itself every few seconds? Whenever I issue a new "ip addr show eth0" command on the client, the lifetime is always around 86400s.



  • @mphilippi

    Not offhand. RAs are sent out frequently, and they are what tells the client what the prefix is. With SLAAC, the lifetime of privacy addresses is about a week, but the consistent address shouldn't change, other than the prefix.



  • @mphilippi, do you still have only one IP (with the old prefix) or multiple, with old and new prefix?

    My next step would definitely be to do a network capture on your client and filter for Router advertisements. As the lifetime is always reset, probably something still sends RAs with the old prefix. 86400/14400 are the default lifetime values on pfSense.
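
    What to look for in such a capture is the Prefix Information option (option type 3) inside each Router Advertisement (ICMPv6 type 134), which carries the prefix together with its valid and preferred lifetimes. A minimal Python sketch of decoding it is below; it assumes you already have the raw ICMPv6 message bytes (everything after the IPv6 header), parse_ra_prefixes is a hypothetical helper, and the sample packet is hand-built with made-up values.

```python
import ipaddress
import struct


def parse_ra_prefixes(icmpv6):
    """Return (prefix, valid_lifetime, preferred_lifetime) per Prefix Information option."""
    if len(icmpv6) < 16 or icmpv6[0] != 134:  # 134 = Router Advertisement
        return []
    prefixes, offset = [], 16  # options start after the 16-byte RA header
    while offset + 2 <= len(icmpv6):
        opt_type = icmpv6[offset]
        opt_len = icmpv6[offset + 1] * 8  # option length is given in units of 8 bytes
        if opt_len == 0 or offset + opt_len > len(icmpv6):
            break  # malformed option, stop parsing
        if opt_type == 3 and opt_len == 32:  # 3 = Prefix Information
            prefix_len, _flags, valid, preferred = struct.unpack_from(
                "!BBII", icmpv6, offset + 2
            )
            address = ipaddress.IPv6Address(icmpv6[offset + 16:offset + 32])
            prefixes.append((f"{address}/{prefix_len}", valid, preferred))
        offset += opt_len
    return prefixes


# Hand-built sample RA: header fields, then one Prefix Information option.
header = struct.pack("!BBHBBHII", 134, 0, 0, 64, 0, 1800, 0, 0)
option = struct.pack("!BBBBIII", 3, 4, 64, 0xC0, 86400, 14400, 0)
option += ipaddress.IPv6Address("2001:db8:aaaa:1::").packed
print(parse_ra_prefixes(header + option))
```

    The source address of the IPv6 packet carrying the RA (a link-local address) then tells you which device is sending it.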

    A little story: I once had a strange situation (actually very similar to yours) in which something in my LAN still advertised an old prefix. A network capture on a client showed that it was a second pfSense that I have (as a VPN gateway). It was only a client in the LAN (getting its own IP via SLAAC from the RAs of the main pfSense), with RAs theoretically disabled (you don't even see the "DHCPv6 Server & RA" UI when you don't have a static IP on the interface), yet it sent out RAs for the prefix it had itself received via RAs. :D At least at that time, pfSense had a default configuration that sent out RAs on LAN even if you didn't see it in the UI. So the solution was to give the LAN interface a static IP to enable the "DHCPv6 Server & RA" UI, disable the RAs there, and switch the LAN interface back to SLAAC. :D



  • @HG said in IPv6 connectivity from LAN is lost after PPPoE reconnect:

    @mphilippi, do you still have only one IP (with the old prefix) or multiple, with old and new prefix?

    Only one IP with the old prefix.

    I'll do the network capture, but there's only one pfSense running per site (changing prefix plan and business static prefix plan of the provider). The sites are not interconnected via VPN.
    Since others report the same problem in combination with my provider, I still believe it is a pfSense bug.

    EDIT:
    To make the IPv6 connection work again, it is sufficient to go to System -> Routing and click Save without changing anything. After that, the clients receive the new address and show both addresses in the output of the ip addr command. However, the old one is no longer renewed every few seconds and will time out after 24h.
    But this will only work until a new prefix is issued by the provider...



  • Ok, I ran Wireshark on a client, and it revealed that the router advertisements come from the pfSense at an interval of 10-15s. The RA still contains the old prefix.



  • OK, so we know that it definitely comes from the pfSense. That's indeed very strange. But it can't be a general issue, because it works here. I have the problem described earlier, that the clients still use the old IP address for some time, but the pfSense starts announcing the new prefix as soon as the reconnection happens, e.g. after the DSL sync was lost. There must be something different in your scenario/configuration...



  • @HG

    You also have to look at how often DHCPv6-PD executes and whether it does after PPPoE comes up. I have a capture of DHCPv6 at boot up and the first renewal and it's over 22 hours between them.

