IPv6 changes in 2.2.5
-
Yeah it's been completely broken for about a day now. Certainly not related to upgrade.
Thanks! I'll stop trying to troubleshoot. :D
-
I too am having problems with IPv6 in 2.2.5. My previously rock solid IPv6 connection is now disconnecting after only a few hours.
Different from others in this thread, I'm not on PPPoE. I just have a normal Comcast connection.
In that case, the new code I contributed to 2.2.5 cannot be to blame, as it is PPPoE specific.
You could usefully try:
ps -auwwx | grep -E -e '(dhcp6c|rtsold)'
clog /var/log/dhcpd.log | grep dhcp6c | tail -n 40
As explained above, the first command displays details of any running dhcp6c and rtsold processes, whilst the second displays the last 40 lines of the available dhcp6c-related messages.
You should have one dhcp6c process for every interface that uses DHCP6 and/or DHCP-PD. Likely, this will only be your WAN interface.
I have seen dhcp6c disappear once on my WAN interface in 2.2.5, apparently dying silently, so I have no idea whether that was a one-off or not.
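The one-process-per-interface check can be scripted. This is a minimal sketch, assuming the interface name is the last word of each dhcp6c command line (as the pkill pattern used elsewhere in this thread implies); count_dhcp6c is a hypothetical helper name, not a pfSense command.

```shell
# count_dhcp6c: hypothetical helper, not part of pfSense.
# Reads `ps -auwwx` style output on stdin and prints "<interface> <count>"
# for each interface with a dhcp6c process. Assumes the interface name is
# the last word of the dhcp6c command line; grep's own line is skipped.
count_dhcp6c() {
    awk '/dhcp6c/ && !/grep/ { n[$NF]++ }
         END { for (i in n) print i, n[i] }' | sort
}
```

Run it as ps -auwwx | count_dhcp6c. A count above 1 for an interface suggests a duplicate dhcp6c, and no output at all suggests dhcp6c has died.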
-
…
There is a nasty 'immediately after boot' issue with IPv6 PPPoE that I never got round to characterising and reporting, which is not caused by the new code in 2.2.5 but might interact negatively with it. On my system at least, with the PPPoE parent interface on a vlan, the MAC address of the PPPoE interface is random for the first connection after boot, resulting in a random link local address for the PPPoE interface. Once the system is booted, this settles down to a predictable MAC address based on the system hardware and PPPoE works reliably. It is possible that the new code in 2.2.5 allows the random link local address to be used for IPV6CP, but the predictable link local address for dhcp6c, which will likely result in strange IPv6 brokenness.
The workaround for this issue is to Disconnect the PPPoE interface in Status -> Interfaces, then Connect it again. All will then work properly until the system is next booted.
...
I "suffer" from this phenomenon too…
Could you tell whether the sysctls net.inet6.ip6.use_tempaddr or net.inet6.ip6.prefer_tempaddr (under System: Advanced: System Tunables) are related, and whether changing them should be a fix?
I tested and changing the value to '0' has no effect.
-
@hda:
On my system at least, with the PPPoE parent interface on a vlan, the MAC address of the PPPoE interface is random for the first connection after boot, resulting in a random link local address for the PPPoE interface. Once the system is booted, this settles down to a predictable MAC address based on the system hardware and PPPoE works reliably.
I "suffer" from this phenomenon too…
Is your PPPoE parent interface a vlan or a physical interface?
@hda:
Could you tell whether the sysctls net.inet6.ip6.use_tempaddr or net.inet6.ip6.prefer_tempaddr (under System: Advanced: System Tunables) are related, and whether changing them should be a fix?
I tested and changing the value to '0' has no effect.
Those two sysctls control IPv6 Privacy Extensions (see RFC 4941 for details). I'm pretty certain they are not involved here, especially as their default value under FreeBSD is 0. If these sysctls were set to 1, the link local address would always be generated using privacy extensions.
I instrumented interface_ppps_configure() and have verified that the MAC addresses of the parent interface (the VLAN) and the parent of the parent interface (the physical interface) are set as I expect, and that the ngctl msg <parent interface>: setautosrc 1 call made by interface_ppps_configure() has succeeded. I've also verified the two sysctls you mention are set to 0, as I expect.
I added a six second delay in interface_ppps_configure() just before the call to invoke mpd5 if the system is booting, but that didn't correct the behaviour.
I have a suspicion that the IPv6CP code in mpd5 is responsible for this problem. The current code arguably goes against the spirit of section 4.1 of RFC 5072, which notes (emphasis added):
The non-zero value of the tentative interface identifier SHOULD be chosen such that the value is unique to the link and, preferably, consistently reproducible across initializations of the IPV6CP finite state machine (administrative Close and reOpen, reboots, etc.). The rationale for preferring a consistently reproducible unique interface identifier to a completely random interface identifier is to provide stability to global scope addresses (see Appendix A) that can be formed from the interface identifier.
On boot, /etc/rc.bootup calls interfaces_configure() in /etc/inc/interfaces.inc. This walks through the configured interfaces, initialising them sequentially. It may well be that the WAN interface is configured before any other interfaces on the machine are up, which is significant as CreateInterfaceID() in ipv6cp.c calls GetEther(NULL, &hwaddr) in an attempt to discover a hardware address to base the interface identifier on before falling back to a random interface identifier.
If you look at the definition of GetEther() in util.c, it ignores interfaces that only have point-to-point or loopback addresses even if they have a MAC address (i.e. an EUI-48). This failure to use an available EUI-48 violates section 4.1 of RFC 5072 as the RFC requires the use of any EUI-64 or EUI-48 on the machine before falling back to other sources of uniqueness and then a random interface identifier. CreateInterfaceID() should use MAC addresses of interfaces that have no addresses at all.
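For reference, the RFC 5072 section 4.1 derivation from an EUI-48 inserts ff:fe between the two halves of the MAC address and inverts the universal/local bit. A minimal sketch of that arithmetic follows; mac_to_linklocal is a hypothetical helper, and the MAC address used to exercise it is a made-up example from the documentation range, not taken from any system in this thread.

```shell
# mac_to_linklocal MAC: hypothetical helper, not mpd5 or pfSense code.
# Prints the fe80:: link local address built from a modified EUI-64.
# Runs in a subshell (note the parentheses) so the IFS change stays local.
mac_to_linklocal() (
    IFS=:
    set -- $1                      # split the MAC into six octets
    [ $# -eq 6 ] || exit 1         # reject anything that is not six octets
    # Invert the universal/local bit (0x02) of the first octet.
    first=$(printf '%02x' $(( 0x$1 ^ 2 )))
    # Insert ff:fe between the third and fourth octets.
    # Leading zeros in each group are kept rather than compressed.
    printf 'fe80::%s%s:%sff:fe%s:%s%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
)
```

For example, mac_to_linklocal 00:00:5e:00:53:71 prints fe80::0200:5eff:fe00:5371. The b371 and b370 endings reported later in the thread would be consistent with interface identifiers taken from two NICs with adjacent MAC addresses.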
I suspect the fix will require changing CreateInterfaceID() to follow the RFC more closely, basing the interface ID on the first valid value from:
- the MAC address(es) of any interface used as part of the PPP link (if any - PPPoE is not necessarily in use)
- the MAC address(es) of any other interface on the machine
- GetEther() (as now - though this is arguably redundant as all MAC addresses should be considered by the first two steps)
- randomness (as now)
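The proposed order can be sketched as follows. This is only an illustration of the selection logic, not mpd5 code: first_usable_mac is a hypothetical helper, candidates are passed in priority order, and the string "random" stands in for the random fallback.

```shell
# first_usable_mac MAC...: hypothetical helper, not mpd5 code.
# Prints the first argument that is a well-formed, non-zero MAC address,
# or the word "random" when no candidate qualifies.
first_usable_mac() {
    for mac in "$@"; do
        # Skip the all-zero address, which identifies nothing.
        if [ "$mac" = "00:00:00:00:00:00" ]; then
            continue
        fi
        # Accept only six colon-separated hex octets.
        if printf '%s\n' "$mac" | grep -Eqi '^([0-9a-f]{2}:){5}[0-9a-f]{2}$'; then
            printf '%s\n' "$mac"
            return 0
        fi
    done
    echo random
}
```

Passing the PPP link's parent MAC addresses first and the remaining interfaces afterwards gives the ordering in the list above.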
-
Is your PPPoE parent interface a vlan or a physical interface?
Yes, a physical interface in my case.
…
I suspect the fix will require changing to CreateInterfaceID() to follow the RFC more closely,...
OK, I will continue my procedure, ever since 2.2.x, to disconnect/connect once every time after a reboot/boot of pfSense (until your recommendation will lead to code change).
Doing so, my IPv6 ISP lease will be continued every hour and not terminated after 2 hours (remarkably, the IPv4 has no problems with hourly re-lease).
Thank you for the clear insight and references in the case.
-
Thank you for the clear insight and references in the case.
David_W, many many thanks from myself too for the advice you have shared here and in the Zen forums. I think I have pfSense working with IPv6 and Zen, although I need to go through everything to be one thousand percent sure. I certainly would not have got this far without your taking the time to discuss your setup.
-
@hda:
I suspect the fix will require changing to CreateInterfaceID() to follow the RFC more closely,…
OK, I will continue my procedure, ever since 2.2.x, to disconnect/connect once every time after a reboot/boot of pfSense (until your recommendation will lead to code change).
Doing so, my IPv6 ISP lease will be continued every hour and not terminated after 2 hours (remarkably, the IPv4 has no problems with hourly re-lease).
I've pretty much confirmed my earlier supposition about CreateInterfaceID(): you can work round the problem by temporarily assigning a bogon IPv4 address to the PPPoE parent interface at boot.
Install the Shellcmd package and create a new entry:
| Field | Contents |
| Command | sh -c 'ifconfig igb0 inet 192.0.2.248/31 alias > /dev/null 2>&1 ; sleep 120 ; ifconfig igb0 inet 192.0.2.248/31 -alias > /dev/null 2>&1' >/dev/null 2>/dev/null & |
| Shellcmd type | earlyshellcmd |
| Description | Temporarily assign a bogon (RFC 5737) IPv4 address to an interface to ensure sane IPv6CP interface identifier allocation immediately after boot - https://forum.pfsense.org/index.php?topic=101967.0 |
You will need to change igb0 (twice) to the interface that is normally used to set the interface identifier. You should be able to recognise this interface from its MAC address.
When you reboot, the IPv6 WAN address should be the same on the initial connection as on subsequent connections.
When I have the time, I will open a redmine bug about this issue.
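The earlyshellcmd one-liner above is easier to follow when written out. This is a sketch under stated assumptions: temp_bogon and run are hypothetical helper names, and DRY_RUN=1 only prints the commands so the sequence can be checked without touching an interface. Unlike the one-liner, it does not discard output or put itself in the background.

```shell
# run: hypothetical helper. Prints the command when DRY_RUN=1, else runs it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# temp_bogon IFACE [SECONDS]: add the RFC 5737 address as an alias, wait,
# then remove it again. Mirrors the one-liner, with 120 s as the default.
temp_bogon() {
    run ifconfig "$1" inet 192.0.2.248/31 alias
    run sleep "${2:-120}"
    run ifconfig "$1" inet 192.0.2.248/31 -alias
}
```

To get the same behaviour as the Shellcmd entry, call it as temp_bogon igb0 >/dev/null 2>&1 & with igb0 replaced by your own interface.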
-
…
When you reboot, the IPv6 WAN address should be the same on the initial connection as on subsequent connections.
Yes, confirmed as a first step after reboot. Great temporary solution, thanks David.
Will report again after 1 and 2 hr uptime. (Alix-box on 2.2.6)
-
Well, maybe half a solution :)
No IPv6 lease renewal after 1 hr and not after the 2hr limit, then connection is gone as usual in such case (for IPv6 only).
Then:
Do Status-Interfaces PPPoE-Disconnect then Diagnostic-Command Prompt(ps ax | grep dhcp6c) :: still the dhcp6c PID there ! (bad)
Do Diagnostic-Command Prompt(kill -9 $PID) :: OK
Do Status-Interfaces PPPoE-Connect :: Diagnostic-Command Prompt(ps ax | grep dhcp6c) - new PID dhcp6c, but address change from WAN-MAC to LAN-MAC !?
Check back after 1 & 2 hr if lease renewal is reported in System Logs.
-
1 hr, no lease renewal. Wait for definite 2hr…. No and lost IPv6 as expected.
-
@hda:
Well, maybe half a solution :)
No IPv6 lease renewal after 1 hr and not after the 2hr limit, then connection is gone as usual in such case (for IPv6 only).
Then:
Do Status-Interfaces PPPoE-Disconnect then Diagnostic-Command Prompt(ps ax | grep dhcp6c) :: still the dhcp6c PID there ! (bad)
Do Diagnostic-Command Prompt(kill -9 $PID) :: OK
Do Status-Interfaces PPPoE-Connect :: Diagnostic-Command Prompt(ps ax | grep dhcp6c) - new PID dhcp6c, but address change from WAN-MAC to LAN-MAC !?
Check back after 1 & 2 hr if lease renewal is reported in System Logs.
The temporary work-round I posted earlier today only fixes the random IPv6CP interface identifier after boot. That problem is now clearly characterised. Leave the temporary work-round in place pending a more permanent solution. I'm thinking of implementing this work-round in interfaces_configure() as the next stage, because reimplementing mpd5's CreateInterfaceID() to follow section 4.1 of RFC 5072 more closely is a longer term project.
The ongoing symptoms you are describing relate to the issue we were discussing earlier in the thread. I suspect I've already identified the fix:
If I was to change /etc/inc/interfaces.inc, I would suggest changing interface_configure() so that it does not call interface_dhcpv6_configure() for a PPPoE interface, allowing ppp-ipv6 to call interface_dhcpv6_configure() unconditionally once mpd5 signals IPv6 link up.
Edit to add: On further reflection, I'd make this change for all PPP interfaces, not just PPPoE.
I tried hard not to change /etc/inc/interfaces.inc, because my local systems run with the RFC 4638 patch, which changes /etc/inc/interfaces.inc in various places. I hope that the pull request to merge the RFC 4638 patch into pfSense 2.3 will be looked at soon.
I suspect that dhcp6c is somehow starting twice on your system following initial boot-up, presumably from a race condition. I know someone e-mailed me some debugging output a while back, and I can't remember whether it was you. In any event, I wonder if you would be kind enough to reboot and send me the output of the debugging commands I gave earlier in the thread (PM preferred):
clog /var/log/ppp.log | grep -A 1 -E -e 'IPV6CP: LayerUp' | tail -n 2
ifconfig pppoe0 inet6 | grep -E -e '( fe80::|nd6)'
ps -auwwx | grep -E -e '(dhcp6c|rtsold)'
clog /var/log/dhcpd.log | grep dhcp6c | tail -n 40
Your interface appears from your screen shots to be pppoe1 - you will need to make the appropriate substitution.
If the expected lease renewal doesn't happen or you note there is more than one dhcp6c process running on the pppoe1 interface, try:
/usr/local/sbin/ppp-ipv6 pppoe0 down ; pkill -xf '^.*dhcp6c.*pppoe0$' ; sleep 2 ; /usr/local/sbin/ppp-ipv6 pppoe0 up
Again, if the problem is with pppoe1, make the three substitutions.
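To avoid missing one of the three substitutions, the command line can be generated from the interface name. A minimal sketch; ppp_ipv6_restart_cmd is a hypothetical helper that only prints the command and changes nothing itself.

```shell
# ppp_ipv6_restart_cmd IFACE: hypothetical helper that prints the restart
# command from this post with IFACE substituted in all three places.
ppp_ipv6_restart_cmd() {
    printf "/usr/local/sbin/ppp-ipv6 %s down ; pkill -xf '^.*dhcp6c.*%s\$' ; sleep 2 ; /usr/local/sbin/ppp-ipv6 %s up\n" "$1" "$1" "$1"
}
```

Review the printed line, then run it from a root shell, for example with eval "$(ppp_ipv6_restart_cmd pppoe1)".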
-
…
I suspect that dhcp6c is somehow starting twice on your system following initial boot-up, presumably from a race condition. I know someone e-mailed me some debugging output a while back, and I can't remember whether it was you. In any event, I wonder if you would be kind enough to reboot and send me the output of the debugging commands I gave earlier in the thread (PM preferred):
...
I can confirm all of this and have sent you the data. Thanks so far. :)
EDIT:
/usr/local/sbin/ppp-ipv6 pppoe0 down ; pkill -xf '^.*dhcp6c.*pppoe0$' ; sleep 2 ; /usr/local/sbin/ppp-ipv6 pppoe0 up
This does the job from the console shell as root, but not from the GUI command prompt as admin. I ran it immediately after a reboot: it keeps the addresses the same (..:b371), makes an entry in system.log for rc.newwanipv6 and, yes, after 1 hr there is a renewal of the lease, as I expect and know to work until the next (re)boot. Thanks again David !
-
@hda:
/usr/local/sbin/ppp-ipv6 pppoe0 down ; pkill -xf '^.*dhcp6c.*pppoe0$' ; sleep 2 ; /usr/local/sbin/ppp-ipv6 pppoe0 up
This does the job from the console shell as root, but not from the GUI command prompt as admin. I ran it immediately after a reboot: it keeps the addresses the same (..:b371), makes an entry in system.log for rc.newwanipv6 and, yes, after 1 hr there is a renewal of the lease, as I expect and know to work until the next (re)boot. Thanks again David !
A fairly trivial patch to /etc/inc/interfaces.inc will hopefully fix both problems discussed in this thread.
Install System Patches (if you haven't done so already) and create a patch as follows:
| Field | Contents |
| Description | PPP IPv6 fixes |
| URL/Commit ID | https://dl.dropboxusercontent.com/u/107909287/pfSense%202.2%20patches/2.2.6-RELEASE-ppp-ipv6.patch |
| Patch Contents | (leave blank) |
| Path Strip Count | 1 |
| Base Directory | / |
| Ignore Whitespace | (checked) |
| Auto Apply | (unchecked) |
Press the Save button, then, if necessary, press 'Fetch' next to the patch. At this point, the option to 'Apply' should appear, so press 'Apply'.
If you have the RFC 4638 patch applied, you should install this patch before that patch. If the RFC 4638 patch is already installed, revert the RFC 4638 patch, apply this one, then apply the RFC 4638 patch again.
Once you've installed this patch, do a Disconnect and Reconnect on the affected interface in Status -> Interfaces. Test that IPv6 works correctly.
If IPv6 works correctly, set the shellcmd entry from my earlier work round to 'disabled' (rather than 'earlyshellcmd'), as this patch incorporates a version of that work round (the temporary bogon IPv4 address is set on the first configured parent interface of a PPPoE connection and removed once the PPP connection is up). Before you reboot, take a note of the IPv6 link local address for the PPPoE interface, so that you can check after rebooting that the IPv6 addressing is the same as on subsequent connections. Please do not disconnect and reconnect for several hours, as the key test for this patch is that IPv6 keeps working correctly without any manual intervention. If IPv6 addressing is not as you wish on the first connection, you can re-enable the shellcmd, reboot and try again.
If IPv6 does not work correctly with this patch installed, revert the patch and do another Disconnect / Reconnect to resume normal operation.
I'd be grateful if you can update this thread on whether this patch works and whether it is sufficient to ensure IPv6 addressing is the same on the first connect after reboot as on subsequent connections. I'm hoping to get these issues to the point where I can submit pull requests soon, as they seem to be affecting quite a few people.
Anyone else seeing either of these issues in 2.2.5 or 2.2.6 is welcome to try this patch.
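To compare the link local address before and after a reboot, the address can be pulled out of the ifconfig output. A minimal sketch, assuming the usual FreeBSD ifconfig layout of "inet6 ADDRESS%SCOPE prefixlen ..."; linklocal_of is a hypothetical helper name.

```shell
# linklocal_of: hypothetical helper. Reads `ifconfig IFACE inet6` output on
# stdin and prints each fe80:: address without its %scope suffix.
linklocal_of() {
    awk '$1 == "inet6" && $2 ~ /^fe80:/ { sub(/%.*/, "", $2); print $2 }'
}
```

Run ifconfig pppoe0 inet6 | linklocal_of before rebooting, note the result, and run it again once the first connection after boot is up. The two values should match if the patch is working as intended.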
-
…
I'd be grateful if you can update this thread on whether this patch works and whether it is sufficient to ensure IPv6 addressing is the same on the first connect after reboot as on subsequent connections...
Applied your patch OK.
Tested Status:Interfaces Disconnect(~10sec)/Connect(~25sec) ADSL OK.
IPv6 still good, [both IPv6(linklocal, address) changed from the WAN-fe80::EUI-MAC (initial boot) to the LAN-fe80::EUI-MAC]
Disabled the shellcommand OK.
Rebooted the ALIX-on-2.2.6 (i386/32) OK.
Looked Status:Interfaces WAN-PPPoE addresses OK.
Tested browser IPv6 from a LAN host OK.
Inspected System.Log for the 1hr lease renewal OK.
It works !
So far so good for me :)
Then:
Tested Status:Interfaces Disconnect(~10sec)/Connect(~20sec) ADSL OK.
IPv6 still good, but both IPv6(linklocal, address) changed from the WAN-fe80::…b371 (WAN initial boot) to the LAN-fe80::...b370 (LAN).
Thanks great solution.
-
@hda:
I'd be grateful if you can update this thread on whether this patch works and whether it is sufficient to ensure IPv6 addressing is the same on the first connect after reboot as on subsequent connections…
Applied your patch OK.
Tested Status:Interfaces Disconnect(~10sec)/Connect(~25sec) ADSL OK.
IPv6 still good, [both IPv6(linklocal, address) changed from the WAN-fe80::EUI-MAC (initial boot) to the LAN-fe80::EUI-MAC]
Disabled the shellcommand OK.
Rebooted the ALIX-on-2.2.6 (i386/32) OK.
Looked Status:Interfaces WAN-PPPoE addresses OK.
Tested browser IPv6 from a LAN host OK.
Inspected System.Log for the 1hr lease renewal OK.
It works !
So far so good for me :)
Then:
Tested Status:Interfaces Disconnect(~10sec)/Connect(~20sec) ADSL OK.
IPv6 still good, but both IPv6(linklocal, address) changed from the WAN-fe80::…b371 (WAN initial boot) to the LAN-fe80::...b370 (LAN).
That sounds pretty encouraging.
I was hoping that the interface identifier on boot would be the same as on reconnection (in your case ending :b371), but, now I think about it, the approach I used in the version of the patch you tried to select an interface for the temporary bogon IPv4 address is clearly wrong.
When I get the time, I'll change the patch to assign a temporary IPv4 address to each interface that will have an IPv4 address when pfSense has booted up, start the PPP connection, then remove those temporary addresses when the PPP connection has started. This should result in the same interface identifier on boot as on subsequent reconnections, and it will also work with non-PPPoE connections.
-
Running PPPoE from clean boot now for 24hrs OK without interrupts (no need for Disconnect/Connect). Patch by David_W recommended.
-
Hi David,
Is this fix going to get included for 2.3? I have the same issue and am running 2.3 right now and could help you test.
Thanks,
Robbert
-
@rrijkse:
Is this fix going to get included for 2.3? I have the same issue and am running 2.3 right now and could help you test.
My intention is to rework the patch as described a few posts ago, so that the interface ID immediately after reboot is the same as for subsequent reconnections. Once that is done, I'll rebase the patch on 2.3, post a link here and submit pull requests for RELENG_2_2 (2.2.x) and master (2.3).
/etc/inc/interfaces.inc is essentially identical in 2.2.x and 2.3, other than PPTP support having been removed from 2.3.
I will try to get this done as soon as possible, but I'm rather busy at the moment. M_Devil has suggested that the 2.2.6 patch posted earlier in this thread will apply OK on 2.3. Ordinarily applying a patch from one branch will not work on another, but it seems this will work in the interim.
-
Thanks, there is no rush. Like I said, I have a 2.3 install ready, so let me know when you have a pull request ready for 2.3 and I'll try it out. I have had IPv6 disabled for a while since it is not stable enough, and I'd rather apply the fix when it's fully ready since I'm running something close to production right now.
Robbert
-
I've now reworked the patch as outlined above. If you (and anyone else using this patch) can revert the patch, refetch it (the location is unchanged), apply it and test it again, I'd be grateful. In particular, I'd be grateful for confirmation that the IPv6 address of your PPP interface(s) is the same immediately after boot as on subsequent connections (Disconnect then Connect in Status->Interfaces).
If this version of the patch solves the issue with the previous version, I'll sort out a pull request in the hope of these fixes being merged into pfSense in due course.