Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes
-
@marcosm said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
Thanks @bmeeks.
A good test would be to upgrade to 25.03-BETA since that includes the fixes and try to reproduce.
Forgot to add, another Gigaclear user is running BSD-14 without seeing these issues when plugged directly to ONT. If it’s an upstream issue, wouldn’t their 14 kernel be impacted?
-
Depends on how recently it's been updated I suppose. If the commits are indeed related to the issue then I'd expect it to be resolved in an updated FreeBSD system.
-
@marcosm another GC customer is on 25.03. It doesn’t fix the issue. We have a real problem here impacting every Gigaclear pfSense customer in GBR.
BTW - Reached out to TAC. Closed as I'm a TAC Lite customer. SIGH. I pay a yearly subscription for pfSense and still receive ZERO support when reporting a bug despite being a paying customer.
-
I can recreate this.
pfSense does not respond to incoming Neighbor Solicitation messages when the source IPv6 GUA is outside any local subnet and not link-local.
In this case, it leads to the ISP ceasing IPv6 traffic after approximately 5 minutes of no response to multiple solicitation requests. DHCPv6 re-solicitation restarts this timer; interface restart temporarily resolves.
pfSense appears to be filtering or not processing Neighbor Solicitations when the source IP address is neither link-local nor within the target's subnet, even though the target address field inside the ICMPv6 packet is the link-local address. This behavior may be exacerbated by multi-WAN configurations where the interface is not the default route.
When the source IP is a known subnet or link-local, pfSense does correctly process the Neighbor Solicitation and respond.
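To make the three cases above concrete, here is a minimal Python sketch that classifies a Neighbor Solicitation source address the way this post describes them. It is only an illustration of the reported behaviour, not pfSense or FreeBSD code. The function name `classify_ns_source` and the LAN prefix `2001:db8:1::/64` (a documentation prefix) are assumptions made for the example; `2a02:fb8::11` is the ISP source address seen in the captures in this thread.

```python
import ipaddress

def classify_ns_source(src: str, local_prefixes: list[str]) -> str:
    """Classify an NS source address relative to the receiving host.

    local_prefixes is a hypothetical list of prefixes configured on
    the host's own interfaces.
    """
    addr = ipaddress.IPv6Address(src)
    if addr.is_link_local:
        return "link-local"
    if any(addr in ipaddress.IPv6Network(p) for p in local_prefixes):
        return "on-link subnet"
    # Neither link-local nor inside a local subnet: the case reported to fail.
    return "off-link"

local = ["2001:db8:1::/64"]  # assumed LAN prefix, for illustration only
for src in ("fe80::1", "2001:db8:1::10", "2a02:fb8::11"):
    print(src, "->", classify_ns_source(src, local))
```

According to the report above, the first two classes get a Neighbor Advertisement back, while the "off-link" class is the one that goes unanswered.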
-
I believe the root cause is that no GUA is provided for the WAN, so it is left with only a link-local address and a delegated prefix. The key point is that it has no GUA bound to its interface.
When pfSense attempts to respond with a neighbor advertisement, it picks a LAN GUA address and not the WAN link-local address.
A way to recreate this is to have two IPv6 WANs, ensure there is no GUA on WAN1, and not delegate any IPv6 subnets from WAN1 to any LAN. In other words, give every LAN a subnet outside any delegated range of WAN1 but inside the delegated range of WAN2. That makes WAN1 somewhat pointless, but that is OK for this test.
On WAN1, when a neighbor solicitation is received from a GUA address for the WAN1 link-local address, pfSense attempts to reply with an advertisement, but instead of using the link-local address of WAN1 as the source, it picks one of the LAN interface addresses (remember, these have nothing to do with WAN1) and sends the advertisement from this source IP.
What happens to that neighbor advertisement varies depending on other firewall, routing, and static route rules; I've seen it attempt to go out WAN2, for example. Hence WAN1 never sees it.
When a WAN only has a link-local IPv6 address, pfSense should be using this as the source address of neighbor advertisement messages, not picking a random LAN interface as the source address.
I believe this to be the root cause and a bug.
A workaround is to add an IP Alias on the WAN for a /128 GUA from somewhere within the /48 delegated prefix. However, there should not be a need for a router to have a GUA with IPv6.
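As a rough illustration of the workaround, the following Python sketch picks a /128 from inside a delegated /48 and checks that it really falls within the prefix. The prefix `2001:db8:1234::/48` is a documentation prefix standing in for the real delegation, and `alias_in_prefix` and `host_index` are hypothetical names for this example. Which address to pick is arbitrary; it should just not collide with an address already in use on a LAN.

```python
import ipaddress

def alias_in_prefix(delegated: str, host_index: int = 1) -> ipaddress.IPv6Interface:
    """Return a /128 interface address taken from the delegated prefix.

    host_index selects the n-th address of the prefix (hypothetical knob).
    """
    net = ipaddress.IPv6Network(delegated)
    return ipaddress.IPv6Interface(f"{net[host_index]}/128")

delegated = "2001:db8:1234::/48"  # assumed example prefix
alias = alias_in_prefix(delegated)
print(alias)                                          # candidate WAN IP Alias
print(alias.ip in ipaddress.IPv6Network(delegated))   # sanity check
```

The resulting address is what would be entered as the IP Alias on the WAN; the sketch does not configure anything on pfSense itself.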
-
@Irata said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
However, there should not be a need for a router to have a GUA with IPv6.
I'm by no means an IPv6 expert, but what about this: if you want to use (Open)VPN over IPv6 to connect to your pfSense OpenVPN server, you'll need a WAN GUA ...
-
@Gertjan Well of course, but IPv6 routers don't need a GUA and they don't need to be VPN servers.
If you did want to run a VPN server in this setup, you'd just address it to a LAN address which has been delegated a GUA /64. Then set up the firewall to allow OpenVPN to that specific IPv6 GUA.
However, this is not related to this issue at all.
-
It's not at all clear to me why the neighbor solicitation doesn't get a reply.
Our ruleset includes the rule
pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}
which ought to be enough for it to be allowed.
So far I've been unable to reproduce such issues. It'd be interesting to look at the pflog output to see where and why the packet is dropped. (I.e. is the NS dropped on the way in, or the NA on the way out? Due to what rule?)
As for the OPNSense commit it's ... well, leaving the editorialising aside, it seems to be trying to do three different things at once, at least two of which it does wrong. (Hint: look at where pd->hdr is copied out of the packet for ICMPv6. It'll never contain anything other than 0 for nd->nd_ns_target or mld->mld_addr. At least it'll always be 0 so it's mostly harmless, but it is still wrong.)
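For readers not fluent in ICMPv6 type numbers, this small Python sketch spells out what the `icmp6-type {1,2,135,136}` list in the rule above covers. The type names come from RFC 4443 and RFC 4861. The helper `parse_type_list` is a toy written for this example and is not how pf parses its ruleset.

```python
# ICMPv6 type numbers as assigned in RFC 4443 and RFC 4861.
ICMP6_TYPES = {
    1: "Destination Unreachable",
    2: "Packet Too Big",
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
}

def parse_type_list(rule: str) -> list[int]:
    """Extract the numbers inside the icmp6-type {...} list (toy parser)."""
    inner = rule.split("icmp6-type", 1)[1].strip().strip("{}")
    return [int(t) for t in inner.split(",")]

rule = "pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}"
for t in parse_type_list(rule):
    print(t, ICMP6_TYPES.get(t, "unknown"))
```

Both NS (135) and NA (136) are in the list, which is why the rule ought to let the exchange through in either direction.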
-
@kprovost said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
It's not at all clear to me why the neighbor solicitation doesn't get a reply.
Our ruleset includes the rule
pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}
which ought to be enough for it to be allowed.
So far I've been unable to reproduce such issues. It'd be interesting to look at the pflog output to see where and why the packet is dropped. (I.e. is the NS dropped on the way in, or the NA on the way out? Due to what rule?)
As for the OPNSense commit it's ... well, leaving the editorialising aside, it seems to be trying to do three different things at once, at least two of which it does wrong. (Hint: look at where pd->hdr is copied out of the packet for ICMPv6. It'll never contain anything other than 0 for nd->nd_ns_target or mld->mld_addr. At least it'll always be 0 so it's mostly harmless, but it is still wrong.)
My ISP does not offer native IPv6, so I do not have an IPv6 configuration to test with. I was simply providing the user with some information from the older FreeBSD bug report that may or may not be relevant to his situation.
I agree the linked thread contained a lot of back and forth, hence the use of quotes around the word "discussion". I periodically check in at the other *sense forum as a guest to monitor for any potential Suricata issues that might be applicable to my pfSense package. That's how I encountered the linked threads.
-
Could it be a routing issue? Because it requires responding to a GUA address from a link-local address? This appears to be what other implementations allow, plus the Juniper switch itself. In all the captures I've taken, I've never once seen pfSense send a Neighbor solicitation/advertisement to a GUA from its link-local source address.
If I do both these things, it resolves the issue:
- Add a static route to IPv6 router GUA on WAN
- Add a Virtual IP/Alias to the WAN, picking any /128 from within the delegated /48 from the WAN
I think that static route is an important clue. If I just add the static route, then pfSense picks some GUA from any LAN (which could be outside of the delegated /48) for the Neighbor Advertisement. Again, it refuses to respond to the GUA from its link-local address; pfSense will do anything but use its link-local address, even if that means picking some unrelated LAN GUA that it should not be sending out this WAN link.
Then adding the IP Alias stops the random picking of a LAN GUA and it now picks the Virtual IP.
Without the Virtual IP, I can't Ping that GUA from the WAN link-local address, so would the same logic be blocking these Neighbor Advertisements too?
I trawled through days of syslogs and I don't see any Neighbor Advertisements/Solicitations hitting a block rule. Nothing showed up.
Sadly, the specification for Neighbor Solicitations/Advertisements is ambiguous here, but it can be read as allowing a link-local address to respond to a GUA, and that's what others have implemented.
Having searched internet forums for the Virtual IP/static route trick, I've found several examples of this issue with pfSense and several ISPs around the globe, and interestingly they post a very similar workaround.
-
@Irata said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
I believe the root cause is that no GUA is provided for the WAN, so it is left with only a link-local address and a delegated prefix. The key point is that it has no GUA bound to its interface.
The WAN interface doesn't need a GUA. All it needs is a link local address. Even if it has a GUA, that address will have nothing to do with the delegated prefix.
-
@Gertjan said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
I'm by no means an IPv6 expert, but what about this: if you want to use (Open)VPN over IPv6 to connect to your pfSense OpenVPN server, you'll need a WAN GUA ...
Use the interface address on the LAN side.
-
My v6 just failed.
Apr 7 19:53:53 dhcp6c 20400 got an expected reply, sleeping.
Apr 7 19:53:53 dhcp6c 20400 removing an event on igb0, state=RENEW
Apr 7 19:53:53 dhcp6c 20400 script "/var/etc/dhcp6c_wan_dhcp6withoutra_script.sh" terminated
Apr 7 19:53:53 dhcp6c 7146 dhcp6c renew, no change - bypassing update on igb0
Apr 7 19:53:53 dhcp6c 20400 executes /var/etc/dhcp6c_wan_dhcp6withoutra_script.sh
Apr 7 19:53:53 dhcp6c 20400 update a prefix 2a06:61c1:ba35::/48 pltime=86400, vltime=86400
Apr 7 19:53:53 dhcp6c 20400 update an IA: PD-0
Apr 7 19:53:53 dhcp6c 20400 nameserver[1] 2a02:fb8:3:300::3
Apr 7 19:53:53 dhcp6c 20400 nameserver[0] 2a02:fb8:2:200::3
Apr 7 19:53:53 dhcp6c 20400 dhcp6c Received INFO
Apr 7 19:53:53 dhcp6c 20400 get DHCP option DNS, len 32
Apr 7 19:53:53 dhcp6c 20400 IA_PD prefix: 2a06:61c1:ba35::/48 pltime=86400 vltime=86400
Apr 7 19:53:53 dhcp6c 20400 get DHCP option IA_PD prefix, len 25
Apr 7 19:53:53 dhcp6c 20400 IA_PD: ID=0, T1=43200, T2=69120
Apr 7 19:53:53 dhcp6c 20400 get DHCP option IA_PD, len 41
Apr 7 19:53:53 dhcp6c 20400 unknown or unexpected DHCP6 option opt_20, len 0
Apr 7 19:53:53 dhcp6c 20400 get DHCP option opt_20, len 0
Apr 7 19:53:53 dhcp6c 20400 DUID: 00:02:00:00:05:83:34:63:3a:36:64:3a:35:38:3a:38:37:3a:31:37:3a:66:38:00:00:00
Apr 7 19:53:53 dhcp6c 20400 get DHCP option server ID, len 26
Apr 7 19:53:53 dhcp6c 20400 DUID: 00:01:00:01:2f:75:c3:80:64:62:66:21:6b:74
Apr 7 19:53:53 dhcp6c 20400 get DHCP option client ID, len 14
Apr 7 19:53:53 dhcp6c 20400 receive reply from fe80::4e6d:58ff:fe87:1017%igb0 on igb0
Apr 7 19:53:53 dhcp6c 20400 send renew to ff02::1:2%igb0
Apr 7 19:53:53 dhcp6c 20400 set IA_PD
Apr 7 19:53:53 dhcp6c 20400 set IA_PD prefix
Apr 7 19:53:53 dhcp6c 20400 set option request (len 4)
Apr 7 19:53:53 dhcp6c 20400 set elapsed time (len 2)
Apr 7 19:53:53 dhcp6c 20400 set server ID (len 26)
Apr 7 19:53:53 dhcp6c 20400 set client ID (len 14)
Apr 7 19:53:53 dhcp6c 20400 a new XID (5c85c4) is generated
Apr 7 19:53:53 dhcp6c 20400 Sending Renew
Apr 7 19:53:53 dhcp6c 20400 reset a timer on igb0, state=RENEW, timeo=0, retrans=9557
Apr 7 19:53:53 dhcp6c 20400 IA timeout for PD-0, state=ACTIVE
5 minutes after this, the connection dies. This is despite assigning a WAN VIP and a static route to the first v6 hop.
-
The only way I can have v6 connectivity last >24h05m00s is by:
route -6 add -host 2a02:fb8::11 -interface igb0
ndp -s 2a02:fb8::11 4e:6d:58:87:10:17 permanent
Without adding the permanent ndp entry, it defaults to 24 hours, then 5 minutes after the logs you see above, v6 dies. Once I add the permanent ndp entry I can go past 24 hours.
-
When a WAN only has a link-local IPv6 address, pfSense should be using this as the source address of neighbor advertisement messages, not picking a random LAN interface as the source address.
I tested this on 25.03 and pfSense always sent NAs sourced from the correct link-local address. This sounds more like there may be some routing/NAT issue. There's a default rule that allows any source/destination, though, so it should go through regardless.
Regarding https://redmine.pfsense.org/issues/16123#note-3
I see the ISP device sending NS for pfSense's link-local address without a response from pfSense:
138 2025-04-04 01:51:01.778710 0.020680 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
145 2025-04-04 01:51:04.730835 0.972840 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
152 2025-04-04 01:51:07.730711 0.359749 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
209 2025-04-04 01:51:11.150835 0.015817 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
214 2025-04-04 01:51:14.138711 1.003684 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
While this is happening, check that pfSense has joined the respective multicast group with
ifmcstat -i igc0 -f inet6
If it has joined as expected then it should show something like:
inet6 fe80::6662:66ff:fe21:6b74%igc0 scopeid 0x3
mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
...
group ff02::1:ff21:6b74%igc0 scopeid 0x3 mode exclude
mcast-macaddr 33:33:ff:21:6b:74
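For anyone wondering where the group and MAC address in that output come from, here is a small Python sketch that derives them from the link-local address. It follows the standard mappings: the solicited-node group is ff02::1:ff00:0/104 plus the low 24 bits of the unicast address, and the Ethernet multicast MAC is 33:33 plus the low 32 bits of the group. The function names are made up for this example.

```python
import ipaddress

def solicited_node_group(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group for a unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Return the Ethernet MAC (33:33 + low 32 bits) for an IPv6 group."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)

group = solicited_node_group("fe80::6662:66ff:fe21:6b74")
print(group)                 # ff02::1:ff21:6b74
print(multicast_mac(group))  # 33:33:ff:21:6b:74
```

These match the destination of the ISP's Neighbor Solicitations in the capture, so if the group is missing from the `ifmcstat` output, the NIC would likely not be handed those frames at all.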
A similar issue which has since been fixed exists in old versions:
https://redmine.pfsense.org/issues/13423