IPv6 DHCP-PD – radvd dies after interface reset - dhcpv6 does not reaquire addr
-
PROBLEM: radvd service is stopped after WAN interface bounce. DHCPv6 can no longer acquire IPv6 addressing …. WAN IP or PD for LAN.
This problem is reproducible.
Note: In the GUI, the radvd service is not showing as 'stopped', it is not there at all.
2.1-BETA1 (amd64)
built on Sun Mar 3 08:47:18 EDT 2013
FreeBSD 8.3-RELEASE-p6
17:41:04 kernel: re1: link state changed to DOWN
17:41:06 php: : DEVD Ethernet detached event for wan
17:41:14 apinger: ALARM: WAN_DHCP6(fe80::201:5cff:fe24:9301) *** down ***
17:41:24 php: : The command '/sbin/route change -inet default dynamic6' returned exit code '68', the output was 'route: bad address: dynamic6'
17:41:24 dhcp6c[94030]: client6_init: failed initialize control message authentication
17:41:24 dhcp6c[94030]: client6_init: skip opening control port
17:41:36 php: : The command '/usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid re0' returned exit code '1'
17:41:55 php: : The command '/sbin/route change -inet default dynamic6' returned exit code '68', the output was 'route: bad address: dynamic6'
17:41:56 dhcp6c[49425]: dhcp6_ctl_authinit: failed to open /usr/local/etc/dhcp6cctlkey: No such file or directory
17:43:49 php: : Could not find IPv6 gateway for interface(wan).
17:43:58 php: : Shutting down Router Advertisment daemon cleanly
17:44:13 dhcp6c[16033]: client6_recvadvert: XID mismatch
17:44:17 dhcp6c[6748]: client6_init: skip opening control port
17:44:17 dhcp6c[16033]: client6_recvadvert: XID mismatch
–---
After this, the WAN interface is back up, IPv4 only. IPv6 never recovers.
-
You mean it does never try to reconnect with dhcp6?
-
@ermal:
You mean it does never try to reconnect with dhcp6?
It tries, but fails. It will not reacquire IPv6 addressing after the WAN interface drops and comes back up. Neither the WAN IPV6 global address or the PD for the LAN.
The "client6_recvadvert: XID mismatch" error just keeps repeating every couple of seconds.
radvd also stops and "disappears" as noted in this bug: http://redmine.pfsense.org/issues/2878
I can reproduce this problem on demand, just by restarting the cable modem.
–--
Log sequence as follows:
17:41:24 dhcp6c[94030]: dhcp6_ctl_authinit: failed to open /usr/local/etc/dhcp6cctlkey: No such file or directory
17:41:24 dhcp6c[94030]: client6_init: failed initialize control message authentication
17:41:24 dhcp6c[94030]: client6_init: skip opening control port
17:46:34 dhcp6c[16033]: client6_recvadvert: XID mismatch <<<– This keeps repeating.
-
I originally submitted the bug 2878, but I may have having a slightly different problem than the one you've described. My WAN ipv6 global address seems to be sticking around, but as you said RADVD disappears. I tried ping through the web interface and ping6 through ssh and they are still working from the firewall on the wan interface.
Checking /var/etc/radvd.conf just has #Automatically generated, do not edit.
I have on track WAN interface for my internal interface with ipv6 prefix id of 0.
I should also note that in the dashboard, WAN_DHCP6 in the gateways section shows an fe80 loopback address, while WAN (DHCP) in the interfaces section show a 2001: external address
-
I originally submitted the bug 2878, but I may have having a slightly different problem than the one you've described. My WAN ipv6 global address seems to be sticking around, but as you said RADVD disappears. I tried ping through the web interface and ping6 through ssh and they are still working from the firewall on the wan interface.
I had that happen yesterday. WAN DHCPv6 did a renew, the WAN IPv6 address was still there. But, RADVD disappeared and there was no IPv6 address on the LAN.
In that instance, I was not receiving the "XID mismatch" errors. RADVD just died after renewal. So, I'd call it another sighting of your bug 2878.
-
This is the same thing that happens to me. On a reboot the WAN (fxp1) IPV6 ip stays constant for about 30 minutes, then gets a change and doesn't seem to refresh the local interface. This was in my system log.
Mar 15 23:31:12 php: : rc.newwanipv6: Informational is starting fxp1.
Mar 15 23:31:17 php: : rc.newwanipv6: on (IP address: 2001:558:xxxxxxxx) (interface: wan) (real interface: fxp1).
Mar 15 23:31:18 php: : The command '/usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid sk0' returned exit code '1', the output was 'Internet Systems Consortium DHCP Server 4.2.4-P2 Copyright 2004-2012 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Wrote 0 leases to leases file. No subnet6 declaration for sk0 (fe80:2::xxxxxxx). ** Ignoring requests on sk0. If this is not what you want, please write a subnet6 declaration in your dhcpd.conf file for the network segment to which interface sk0 is attached. ** Not configured to listen on any interfaces! If you did not get this software from ftp.isc.org, please get the latest from ftp.isc.org and install that before requesting help. If you did get this software from ftp.isc.org and have not yet read the README, please read it before requesting help. If you intend to request help from the dhcp-server@isc.org mailiFYI, the /etc/dhcpdv6.conf file is empty.
Not sure if this means I should be adding something in dhcpd.conf. I can't access the dhcp6 configuration for a local lan (sk0) because I am using track interface for my ipv6 configuration.
My guess would be a change to /etc/rc.newwanipv6 10 days ago.
-
I had that happen yesterday. WAN DHCPv6 did a renew, the WAN IPv6 address was still there. But, RADVD disappeared and there was no IPv6 address on the LAN.
OK, this problem is pretty serious and very real.
Right on cue, 48 hours later, the WAN dhcpv6 did a renew and BAM! IPv6 addressing assigned to the LAN (DHCP-PD) is gone and radvd has disappeared.
Is there anything I can provide to help resolve this issue?
** Edit **
Just saw something. Both times this has happened, there was this syslog message just before everything went to crap.
"pfSense package system has detected an ip change 2001:558:6020:xxxx:35ac:547f:afad:d3f -> 2001:558:6020:xxxx:4cff:12d7:4aa5:fc41 … Restarting packages."
When in reality, there was NO change. Before the 'event', the address was 2001:558:6020:xxxx:4cff:12d7:4aa5:fc41 and it still is.
So, the question is … What is triggering this? Why did pfsense erroneously think it had changed? Is the WAN address stored in a file somewhere and is compared when dhcpv6 renews? I'm thinking that file is not being updated and every time there is a renewal, it thinks it has changed.
Thoughts?
-
Just saw something. Both times this has happened, there was this syslog message just before everything went to crap.
"pfSense package system has detected an ip change 2001:558:6020:xxxx:35ac:547f:afad:d3f -> 2001:558:6020:xxxx:4cff:12d7:4aa5:fc41 … Restarting packages."
When in reality, there was NO change. Before the 'event', the address was 2001:558:6020:xxxx:4cff:12d7:4aa5:fc41 and it still is.
So, the question is … What is triggering this? Why did pfsense erroneously think it had changed? Is the WAN address stored in a file somewhere and is compared when dhcpv6 renews? I'm thinking that file is not being updated and every time there is a renewal, it thinks it has changed.
I've noticed something that doesn't look right that may explain the above.
The current WAN address from ifconfig:
inet6 2001:558:6020:14a:3117:6e4b:xxxx:b6bf prefixlen 128
The contents of /var/db/wan_ipv6:
2001:558:6020:14a:4cff:12d7:4aa5:fc41 <<<–-- OLD ... the previous address
The contents of /var/db/wan_cacheipv6:
2001:558:6020:14a:4cff:12d7:4aa5:fc41 <<<–-- OLD ... the previous address
So, the next time it renews (and gets the same address renewed), one of those files will be checked and pfsense will erroneously think the address changed. But, that is stale information.
Can somebody confirm that the expectation is that the /var/db file(s) should be the current address and not the last, no longer valid address?
Is my logic wrong?
Thanks
-
Do you have on the logs any line similar to:
rc.newwanipv6: on (IP address: {$curwanipv6}) (interface: {$interface}) (real interface: {$interface_real}). -
@ermal:
Do you have on the logs any line similar to:
rc.newwanipv6: on (IP address: {$curwanipv6}) (interface: {$interface}) (real interface: {$interface_real}).There is:
Mar 16 18:57:19 php: : rc.newwanipv6: Informational is starting re1.
Mar 16 18:57:24 php: : rc.newwanipv6: on (IP address: 2001:558:6020:14a:4cff:12d7:4aa5:fc41) (interface: wan) (real interface: re1).This ran during a scheduled dhcpv6 renewal. Since the time stamp on that one, the WAN IP had changed after reboot and there is no syslog to reflect that it ran. And the /var/db files were not updated.
For now, I've manually edited those 2 /var/db files to reflect the current address. Will see if it survives the next renewal without issue.
–-
A secondary issue is that almost every time I reboot, the WAN IPv6 address and the PD for LAN change. It's like the DUID is changing with each restart. This should be constant.
I've found some older discussions about the DUID changing here:
http://www.dslreports.com/forum/r27854661-IPv6-Seeing-two-different-LAN-side-ranges
http://forum.pfsense.org/index.php/topic,55161.msg306484.html#msg306484 -
@ermal:
Do you have on the logs any line similar to:
rc.newwanipv6: on (IP address: {$curwanipv6}) (interface: {$interface}) (real interface: {$interface_real}).Previously I had
rc.newwanipv6: on (IP address: 2001:558:xxxxxxxx) (interface: wan) (real interface: fxp1).Not sure if you need any more info.
Looks like you have made some changes to dual stack (rc.linkup) and radvd, but the latest build I see is from yesterday March 17th. I will wait for a new build later today, then do the update to let you know how it goes. Ironically, I have not lost my ipv6 lan address in the past 12 hours.
-
Yeah made some changes which should improve the situation.
Test with newer snapshots. -
@ermal:
Yeah made some changes which should improve the situation.
Test with newer snapshots.It gotten worse with the 20130318-1652 build. I can get the WAN interface IPv6 address, but cannot get the PD for LAN.
RADVD doesn't even start at all.
After upgrade, getting this on the dashboard:
[ There were error(s) loading the rules: pfctl: DIOCADDALTQ: Device busy - The line in question reads [0]: ]This message is constant: dhcp6c[16530]: client6_recvadvert: XID mismatch
–-----
Reverted back to:
2.1-BETA1 (amd64)
built on Sun Mar 17 12:19:16 EDT 2013
FreeBSD 8.3-RELEASE-p6... and everything came back.
-
It would have been helpful having the system logs rather than just it does not work!
This message is constant: dhcp6c[16530]: client6_recvadvert: XID mismatch
This might be not important since you might be seeing other clients reply/advertisments.
-
Here's what I saw after applying the 20130318-1652 build.
-
IPv4 was working fine.
-
IPv6 - The WAN obtained an address. There was never PD assigned to the LAN.
-
There are dozens of "client6_recvadvert: XID mismatch" messages.
-
I never saw RADVD start. Never saw it in a 'stopped' state, it just never appeared. There is not a single syslog message about it. Possibly it doesn't get called until there is IPV6 address on the LAN?
-
The dashboard kept flashing the following:
There were error(s) loading the rules: pfctl: DIOCADDALTQ: Device busy - The line in question reads 0 -
I attempted to revert back to previous configuration from GUI. Firmware –> Restore Full Backup.
This failed twice (as in didn't do anything, no syslog message) -
I then downloaded and installed the Sun Mar 17 12:19:16 build. When it came back up, system was normal.
Here's syslogs that appear to be anywhere relevant to the issue.
No indication that rc.newwanipv6 ever ran.
21:15:40 syslogd: kernel boot file is /boot/kernel/kernel
21:15:41 php: : Restarting/Starting all packages.21:15:54 dhcp6c[15951]: client6_recvadvert: XID mismatch
21:15:57 php: : Checking for and disabling any rules dependent upon disabled preprocessors for WAN…
21:17:46 dhcp6c[15951]: client6_recvadvert: XID mismatch21:35:41 php: : rc.newwanip: Informational is starting ovpns1.
21:35:42 apinger: Starting Alarm Pinger, apinger(23340)
21:35:42 apinger: bind socket: Can't assign requested address
21:35:45 dhcp6c[16042]: client6_recvadvert: XID mismatch21:35:48 php: : The command '/sbin/route change -inet6 default fe80::201:5cff:fe24:9301' returned exit code '1', the output was 'route: writing to routing socket: Network is unreachable route: writing to routing socket: Network is unreachable change net default: gateway fe80::201:5cff:fe24:9301: Network is unreachable'
21:35:58 php: : Checking for and disabling any rules dependent upon disabled preprocessors for WAN…
21:36:27 kernel: re1: promiscuous mode enabled
21:36:37 dhcp6c[16042]: client6_recvadvert: XID mismatch
-
-
Ok after a lot of fighting with it i finally think that tomorrow snapshot will behave ok.
The problem was that multiple dhcp6c were running at same time.Test with tomorrows snapshot and let me know.
-
Thanks for all the work you have put into it. I updated to ….
2.1-BETA1 (amd64)
built on Tue Mar 19 16:39:04 EDT 2013... and I'm pulling IPv6 addressing. A couple of issues/observations.
- RADVD - It comes up and stops and I have to manually start it.
Mar 19 19:29:51 pfsense radvd[57759]: version 1.9.1 started
Mar 19 19:29:54 pfsense radvd[57888]: our AdvManagedFlag on re0 doesn't agree with fe80::201:5cff:fe24:9301 <<–- Confirmed Gateway addr
Mar 19 19:29:57 pfsense radvd[57888]: our AdvManagedFlag on re0 doesn't agree with fe80::201:5cff:fe24:9301I have rebooted twice and same. It's stopped, but will manually start OK.
- Cannot pass IPv6 traffic from the LAN to the Internet
- The LAN is assigned the PD
- The hosts are seeing the RA. I've captured with Wireshark, on the hosts, and all looks correct.
- The WAN IPv6 gateway is up
- The default route on the hosts and pfsense is there.
However, IPv6 traffic from the LAN cannot access the Internet. My SWAG is that the error in #1 above has prevented the CMTS from inserting the LAN prefix into it's routing table, so I can't get anywhere (I've seen something like this before with other products).
If if go back to the Sun Mar 17 12:19:16 EDT 2013 build, everything starts working.
I've very sorry, but I don't know what else to add. ????
- /var/db/wan_ipv6 and /var/db/wan_cacheipv6
Neither reflects the current IPv6 WAN address. Is that an issue? It appeared to be a conflict when the address renews. rc.newwanipv6 is erroneously triggered thinking the WAN address has changed, when in reality it has not.
-
I am getting the same types of issues with the March 19th 16:35 build.
Mar 20 08:02:53 php: : rc.newwanipv6: Informational is starting fxp1.
Mar 20 08:02:58 php: : rc.newwanipv6: Failed to update wan IPv6, restarting…
Mar 20 08:02:58 dhcp6c[29010]: check_exit: exiting
Mar 20 08:03:02 php: : rc.newwanipv6: Informational is starting fxp1.
Mar 20 08:03:07 php: : rc.newwanipv6: Failed to update wan IPv6, restarting…It appears I can no longer use ping6 on the lan side, probably due to radvd not starting. I see radvd in the services section, but when I go to start it I get this error.
Mar 20 08:04:38 kernel: pid 89651 (radvd), uid 0: exited on signal 11 (core dumped) -
Thank you for the continued work on this. I've updated to the latest Mar 21 04:30:58 EDT 2013.
A few notes on this build:
- In the system log I am seeing quite frequently… dhcp6c[34635]: client6_recvadvert: XID mismatch
- Ping6 works fine from the firewall, but not from any clients on the local lan.
- radvd does not start automatically. However, I can start it now without it giving a core dump. When I do start it, I can ping from the local lan as well. :)
-
I updated to:
2.1-BETA1 (amd64)
built on Thu Mar 21 17:04:35 EDT 2013Appears to working OK. Only observations are:
-
I had to manually start radvd. Once started, there were no errors. IPv6 connectivity is good.
-
/var/db/wan_ipv6 and /var/db/wan_cacheipv6 don't have the current WAN IPv6 address. Problem or not???
So, I'll wait 48 hours and see what happens at dhcpv6 renewal time. That's when I have been loosing the LAN PD and have to reboot.
-