radvd: can't join ipv6-allrouters on hn1

bimmerdriver

I just started up my 2.6.0 Development system and updated it.

The radvd problem seems to be back:

Apr 15 16:18:46	radvd	58420	can't join ipv6-allrouters on hn1

It may be related to the loss of IPv6, because that is also happening on my system. I will see if there are any other interesting messages in the log.

bimmerdriver

Here are some additional log messages:

Apr 15 16:31:38	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:31:36	radvd	18058	hn1 received RS or RA on hn1 but hn1 is not ready and setup_iface failed
Apr 15 16:31:36	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:31:32	radvd	18058	hn1 received RS or RA on hn1 but hn1 is not ready and setup_iface failed
Apr 15 16:31:32	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:31:28	radvd	18058	hn1 received RS or RA on hn1 but hn1 is not ready and setup_iface failed
Apr 15 16:31:28	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:31:22	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:31:06	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:30:50	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:30:50	radvd	18058	resuming normal operation
Apr 15 16:30:50	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:30:50	radvd	18058	warning: (/var/etc/radvd.conf:24) AdvRDNSSLifetime <= 2*MaxRtrAdvInterval would allow stale DNS servers to be deleted faster
Apr 15 16:30:50	radvd	18058	warning: AdvRDNSSLifetime <= 2*MaxRtrAdvInterval would allow stale DNS servers to be deleted faster
Apr 15 16:30:50	radvd	18058	attempting to reread config file
Apr 15 16:30:46	radvd	18058	can't join ipv6-allrouters on hn1
Apr 15 16:30:46	radvd	18058	resuming normal operation

The message "resuming normal operation" coincided with restarting the WAN interface.

A Former User

Try rebooting. I noticed that these messages occur if changes are made on the RA page and radvd is restarted. Likely an IPV6_LEAVE_GROUP is needed prior to restarting radvd.

bimmerdriver

@rschell said in radvd: can't join ipv6-allrouters on hn1:

Try rebooting. I noticed that these messages occur if changes are made on the RA page and radvd is restarted. Likely an IPV6_LEAVE_GROUP is needed prior to restarting radvd.

That seemed to help. It's running now and IPv6 is staying up.

At the moment, I have three other pfSense routers and one OPNsense router connected to the same bridged port of the modem. Around the same time that this pfSense instance went nuts, IPv4 and IPv6 went down on all of the other routers. I had to restart the WAN interface on each of them to get them back up. I've never seen anything like that happen.

A Former User

@bimmerdriver
I think the radvd issue was triggered by flapping the WAN interface, which triggers a radvd reload. On reload, radvd is still a member of the multicast group, so the IPV6_JOIN_GROUP request is rejected because the membership already exists.

A reboot is a workaround until we can teach radvd not to issue repeated IPV6_JOIN_GROUP requests. A prior workaround, calling IPV6_LEAVE_GROUP before IPV6_JOIN_GROUP, was removed by the maintainer in a recent commit on Feb 4 (0a5a889d2b60...).
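
For anyone following along, the leave-before-join workaround mentioned above can be sketched in C roughly like this. It is only an illustration, not radvd's actual code: the function names `fill_all_routers_mreq` and `rejoin_all_routers` are made up for this example, and it assumes an already-open IPv6 socket and a known interface index.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: describe membership in ff02::2 (the all-routers
 * group) on the interface with the given index. Returns 0 on success. */
int fill_all_routers_mreq(struct ipv6_mreq *mreq, unsigned int ifindex)
{
    memset(mreq, 0, sizeof(*mreq));
    mreq->ipv6mr_interface = ifindex;
    return inet_pton(AF_INET6, "ff02::2", &mreq->ipv6mr_multiaddr) == 1 ? 0 : -1;
}

/* Hypothetical helper: drop any stale membership, then join again.
 * The result of the leave is deliberately ignored, because it simply
 * fails when the socket was not a member in the first place. */
int rejoin_all_routers(int sock, unsigned int ifindex)
{
    struct ipv6_mreq mreq;

    if (fill_all_routers_mreq(&mreq, ifindex) != 0)
        return -1;

    (void)setsockopt(sock, IPPROTO_IPV6, IPV6_LEAVE_GROUP, &mreq, sizeof(mreq));
    return setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq));
}
```

A caller would pass the daemon's existing ICMPv6 socket and the result of `if_nametoindex("hn1")`. The leave only has an effect on the same socket that made the original join.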

bimmerdriver

@rschell It was stable overnight. Not sure what triggered this. Since the instance of pfsense 2.6.0 was so hung up after I updated to it, I thought it was the trigger that affected the other routers, but who knows. I will keep my eye on it.

A Former User

One solution to the reload problem is to modify /etc/inc/services.inc by replacing line 423

			sigkillbypid("{$g['varrun_path']}/radvd.pid", "HUP");

with

			log_error(gettext("Shutting down Router Advertisement daemon to restart it cleanly"));
			killbypid("{$g['varrun_path']}/radvd.pid");
			@unlink("{$g['varrun_path']}/radvd.pid");
			mwexec("/usr/local/sbin/radvd -p {$g['varrun_path']}/radvd.pid -C {$g['varetc_path']}/radvd.conf -m syslog");

This works in my case, where changing RA settings triggered a restart via a HUP signal that then failed.

A Former User

@rschell said in radvd: can't join ipv6-allrouters on hn1:

One solution to the reload problem is to modify /etc/inc/services.inc by replacing line 423

Or would a better solution be to modify RADVD to correctly process a "HUP" signal on a restart?

A Former User

Looks like the RADVD port was updated today (5/8/21) to avoid the "join ipv6-allrouters" log messages on duplicate joins. Tested the port and it appears to have solved the "HUP" restart issue for now. Builds after Version 2.6.0.a.20210508.0100 should incorporate RADVD 2.19_2, so changing /etc/inc/services.inc is no longer necessary.

A Former User

The kernel has now been updated to return a more correct error code that aligns with the change in RADVD.

~~Well, I spoke too soon; the revised patch is checking for the wrong error code.~~

~~In my case, EINVAL also needs to be trapped (edited to point to the necessary additional error code).~~
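
The error-code handling discussed in the last few posts could be sketched in C along these lines. This is a guess at the general shape, not the actual RADVD patch: the function names are hypothetical, and treating EADDRINUSE as the duplicate-join code is an assumption (only EINVAL is named in this thread).

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical helper: decide whether a failed IPV6_JOIN_GROUP should be
 * treated as "already a member" instead of being logged as an error.
 * ASSUMPTION: EADDRINUSE and EINVAL are the codes a kernel may return
 * for a duplicate join; verify against the target kernel. */
bool join_error_is_duplicate(int err)
{
    return err == EADDRINUSE || err == EINVAL;
}

/* Hypothetical helper: join the group described by mreq and report
 * success when the socket was already a member. Returns 0 on success
 * or duplicate join, -1 on any other failure (errno is left set). */
int join_group_tolerant(int sock, const struct ipv6_mreq *mreq)
{
    if (setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP, mreq, sizeof(*mreq)) == 0)
        return 0;
    return join_error_is_duplicate(errno) ? 0 : -1;
}
```

One caveat with this approach: EINVAL is a fairly generic error, so accepting it silently could also hide a genuinely malformed request.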