Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes
-
Your problem may be related to this FreeBSD bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280701.
That bug report is lengthy as there is a lot of "discussion" in the thread. I'm not sure that everything discussed in the thread is fully fixed yet in all the FreeBSD branches. I am pretty sure the OPNsense guys did some custom patching of their own to partially address this. At one time there was a long-running thread on their forum site about the issue. I think this became a problem there first probably because IPv6 is more widely deployed and used in European countries where OPNsense is based. IPv6 is not as widely deployed in the U.S. where pfSense is based.
Here are a couple of links to Github referencing the custom patches on the OPNsense side:
https://github.com/opnsense/src/issues/242
https://github.com/opnsense/src/commit/2640600509de99cf6d75d588a6c27461d15809e6
The changes required are in the guts of FreeBSD (inside the pf code), and thus cannot be fixed without releasing an updated FreeBSD kernel. I am not sure if all of the required fixes are in pfSense Plus now as that source is proprietary. I do see some icmpv6 fixes in the FreeBSD kernel source used for pfSense CE (specifically in the tree for the not-yet-available 2.8 DEVEL snapshot):
https://github.com/pfsense/FreeBSD-src/commit/5ab1e5f7e5585558a73b723f07528977a82cee82
-
@bmeeks Many thanks. Yes, I remember this original issue. I'll raise a Redmine later this week to see if I can start some formal investigation on this. I know of multiple pfSense users impacted by this here.
-
Thanks @bmeeks.
A good test would be to upgrade to 25.03-BETA since that includes the fixes and try to reproduce.
-
@marcosm I'll update it when I'm in front of the firewall in late May - hopefully GA is finalised by then. Though, I looked at the release notes for 25.03 and don't see many v6 / RA fixes at all?
-
The release notes for pfSense don't usually include upstream changes/fixes.
-
@marcosm said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
Thanks @bmeeks.
A good test would be to upgrade to 25.03-BETA since that includes the fixes and try to reproduce.
Forgot to add, another Gigaclear user is running BSD-14 without seeing these issues when plugged directly to ONT. If it’s an upstream issue, wouldn’t their 14 kernel be impacted?
-
Depends on how recently it's been updated I suppose. If the commits are indeed related to the issue then I'd expect it to be resolved in an updated FreeBSD system.
-
@marcosm another GC customer is on 25.03. Doesn’t fix the issue. We have a real problem here impacting every Gigaclear pfsense customer in GBR.
BTW - Reached out to TAC. Closed as I'm a TAC Lite customer. SIGH. I pay a yearly subscription for pfSense and still receive ZERO support when reporting a bug despite being a paying customer.
-
I can recreate this.
pfSense does not respond to incoming Neighbor Solicitation messages when the source IPv6 GUA is outside any local subnet and not link-local.
In this case, it leads to the ISP ceasing IPv6 traffic after approximately 5 minutes of no response to multiple solicitation requests. DHCPv6 re-solicitation restarts this timer; an interface restart temporarily resolves it.
pfSense appears to be filtering or not processing Neighbor Solicitations when the source IP address is neither link-local nor within the target's subnet, even though the target address field inside the ICMPv6 packet is the link-local address. This behavior may be exacerbated by multi-WAN configurations where the interface is not the default route.
When the source IP is a known subnet or link-local, pfSense does correctly process the Neighbor Solicitation and respond.
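To make the three cases above concrete, here is a minimal Python sketch that sorts a Neighbor Solicitation source address into "link-local", "on-link" or "off-link". The function name and the 2001:db8 documentation prefixes are made up for illustration; it only models the classification described above, not what pfSense does internally.

```python
import ipaddress

def classify_ns_source(src, local_prefixes):
    """Classify an NS source address relative to the prefixes we know locally.

    Hypothetical helper: 'local_prefixes' stands in for the subnets
    configured on the firewall's interfaces.
    """
    addr = ipaddress.IPv6Address(src)
    if addr.is_link_local:  # fe80::/10
        return "link-local"
    for prefix in local_prefixes:
        if addr in ipaddress.IPv6Network(prefix):
            return "on-link"
    # A GUA outside every local subnet: the case that goes unanswered here.
    return "off-link"

# Example prefixes from the documentation range, not from any real setup.
local = ["2001:db8:1::/64"]
for source in ("fe80::1", "2001:db8:1::10", "2001:db8:ffff::1"):
    print(source, "->", classify_ns_source(source, local))
```

Only the last category ("off-link") matches the solicitations that I see going unanswered.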
-
I believe the root cause is that no GUA is provided for the WAN, so it is left with only a link-local address and a delegated prefix. The key point is that it has no GUA bound to its interface.
When pfSense attempts to respond with a neighbor advertisement, it picks a LAN GUA address and not the WAN link-local address.
A way to recreate this is to have two IPv6 WANs, ensure there is no GUA on WAN1, and not delegate any IPv6 subnets from WAN1 to any LAN. In other words, give all LANs a subnet outside any delegated range of WAN1 but inside the delegated range of WAN2. That makes WAN1 somewhat pointless, but that is OK for this test.
On WAN1, when a neighbor solicitation is received from a GUA address for the WAN1 link-local address, pfSense attempts to reply with an advertisement, but instead of using the link-local address of WAN1 as the source, it picks one of the LAN interface addresses (remember, these have nothing to do with WAN1) and sends the advertisement from this source IP.
What happens to that neighbor advertisement varies depending on other firewall, routing and static route rules; I've managed to see it attempt to go out of WAN2, for example. Hence WAN1 never sees it.
When a WAN only has a link-local IPv6 address, pfSense should be using this as the source address of neighbor advertisement messages, not picking a random LAN interface as the source address.
I believe this to be the root cause and a bug.
A workaround is to add an IP Alias on the WAN for a /128 GUA from somewhere within the /48 delegated prefix. However, there should not be a need for a router to have a GUA with IPv6.
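For anyone wanting to try the workaround, here is a small Python sketch of how one might derive a /128 from a delegated prefix. It is only an illustration: the function name, the subnet index and the 2001:db8:abcd::/48 documentation prefix are all hypothetical, and which /64 you take the address from is your own choice.

```python
import ipaddress

def alias_from_prefix(delegated, subnet_index, host_id=1):
    """Return a /128 taken from the subnet_index-th /64 of a delegated prefix.

    Hypothetical helper for illustration; nothing here is pfSense-specific.
    """
    net = ipaddress.IPv6Network(delegated)
    if net.prefixlen > 64:
        raise ValueError("delegated prefix must be /64 or shorter")
    if not 0 <= subnet_index < 2 ** (64 - net.prefixlen):
        raise ValueError("subnet index outside the delegated prefix")
    if not 0 <= host_id < 2 ** 64:
        raise ValueError("host id must fit in the lower 64 bits")
    # The subnet bits sit directly above the 64-bit interface identifier.
    base = int(net.network_address) + (subnet_index << 64)
    addr = ipaddress.IPv6Address(base + host_id)
    return ipaddress.IPv6Interface(f"{addr}/128")

# Documentation prefix standing in for the ISP's delegated /48.
alias = alias_from_prefix("2001:db8:abcd::/48", 0xFF)
print(alias)
print(alias.ip in ipaddress.IPv6Network("2001:db8:abcd::/48"))
```

The resulting address is what would be entered as the IP Alias on the WAN.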
-
@Irata said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
However, there should not be a need for a router to have a GUA with IPv6.
I'm by no means an IPv6 expert, but what about this: if you want to use (Open)VPN over IPv6 to connect to your pfSense OpenVPN server, you'll need a WAN GUA...
-
@Gertjan Well of course, but IPv6 routers don't need a GUA and they don't need to be VPN servers.
If you did want to run a VPN server in this setup, you'd just address it to a LAN address which has been delegated a GUA /64. Then set up the firewall to allow OpenVPN to that specific IPv6 GUA.
However, this is not related to this issue at all.
-
It's not at all clear to me why the neighbor solicitation doesn't get a reply.
Our ruleset includes
pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}
which ought to be enough for it to be allowed.
So far I've been unable to reproduce such issues. It'd be interesting to look at the pflog output to see where and why the packet is dropped. (I.e. is the NS dropped on the way in, or the NA on the way out? Due to what rule?)
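For readers not familiar with the ICMPv6 type numbers in that rule, here is a short Python sketch that pulls them out of the rule text and names them. The parsing helper is purely illustrative and is not how pf reads its ruleset; the type names are the standard ones (RFC 4443 for 1 and 2, RFC 4861 for 135 and 136).

```python
import re

# Standard ICMPv6 type names for the four types in the rule.
ICMPV6_TYPES = {
    1: "Destination Unreachable",
    2: "Packet Too Big",
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
}

def icmp6_types_in_rule(rule):
    """Extract the numeric icmp6-type list from a pf rule string (illustrative only)."""
    match = re.search(r"icmp6-type\s*\{([^}]*)\}", rule)
    if not match:
        return []
    return [int(item) for item in match.group(1).split(",") if item.strip()]

rule = "pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}"
for number in icmp6_types_in_rule(rule):
    print(number, ICMPV6_TYPES.get(number, "unknown"))
```

So both the NS (135) and the NA (136) are covered by the rule, in either direction.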
As for the OPNSense commit it's ... well, leaving the editorialising aside, it seems to be trying to do three different things at once, at least two of which it does wrong. (Hint: look at where pd->hdr is copied out of the packet for ICMPv6. It'll never contain anything other than 0 for nd->nd_ns_target or mld->mld_addr. At least it'll always be 0 so it's mostly harmless, but it is still wrong.)
-
@kprovost said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
It's not at all clear to me why the neighbor solicitation doesn't get a reply.
Our ruleset includes
pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}
which ought to be enough for it to be allowed.
So far I've been unable to reproduce such issues. It'd be interesting to look at the pflog output to see where and why the packet is dropped. (I.e. is the NS dropped on the way in, or the NA on the way out? Due to what rule?)
As for the OPNSense commit it's ... well, leaving the editorialising aside, it seems to be trying to do three different things at once, at least two of which it does wrong. (Hint: look at where pd->hdr is copied out of the packet for ICMPv6. It'll never contain anything other than 0 for nd->nd_ns_target or mld->mld_addr. At least it'll always be 0 so it's mostly harmless, but it is still wrong.)
My ISP does not offer native IPv6, so I do not have an IPv6 configuration to test with. I was simply providing the user with some information from the older FreeBSD bug report that may or may not be relevant to his situation.
I agree the linked thread contained a lot of back and forth, hence the use of quotes around the word "discussion". I periodically check in at the other *sense forum as a guest to monitor for any potential Suricata issues that might be applicable to my pfSense package. That's how I encountered the linked threads.
-
Could it be a routing issue? Because it requires responding to a GUA address from a link-local address? This appears to be what other implementations allow, plus the Juniper switch itself. In all the captures I've taken, I've never once seen pfSense send a Neighbor solicitation/advertisement to a GUA from its link-local source address.
If I do both these things, it resolves the issue:
- Add a static route to IPv6 router GUA on WAN
- Add a Virtual IP/Alias to the WAN, picking any /128 from within the delegated /48 from the WAN
I think that static route is an important clue. If I just add the static route, then pfSense picks some GUA from any LAN (which could be outside of the delegated /48) for the Neighbor Advertisement. Again, it refuses to respond to the GUA from its link-local address; pfSense will do anything but use its link-local address, even if that means picking some unrelated LAN GUA that it should not be sending out of this WAN link.
Then adding the IP Alias stops the random picking of a LAN GUA and it now picks the Virtual IP.
Without the Virtual IP, I can't Ping that GUA from the WAN link-local address, so would the same logic be blocking these Neighbor Advertisements too?
I trawled through days of syslogs and I don't see any Neighbor Advertisements/Solicitations hitting a block rule. Nothing showed up.
Sadly, the specification for Neighbor Solicitations/Advertisements is ambiguous here, but it can be read as allowing a link-local address to respond to a GUA, and that's what others have implemented.
Having searched internet forums for the Virtual IP/static route trick, I've found several examples of this issue with pfSense and several ISPs around the globe, and interestingly they post a very similar workaround.
-
@Irata said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
I believe the root cause is that no GUA is provided for the WAN, so it is left with only a link-local address and a delegated prefix. The key point is that it has no GUA bound to its interface.
The WAN interface doesn't need a GUA. All it needs is a link local address. Even if it has a GUA, that address will have nothing to do with the delegated prefix.
-
@Gertjan said in Gigaclear & ip6 - lose of connectivity after *exactly* 5 minutes:
I'm by no means an IPv6 expert, but what about this: if you want to use (Open)VPN over IPv6 to connect to your pfSense OpenVPN server, you'll need a WAN GUA...
Use the interface address on the LAN side.
-
My v6 just failed.
Apr 7 19:53:53 dhcp6c 20400 got an expected reply, sleeping.
Apr 7 19:53:53 dhcp6c 20400 removing an event on igb0, state=RENEW
Apr 7 19:53:53 dhcp6c 20400 script "/var/etc/dhcp6c_wan_dhcp6withoutra_script.sh" terminated
Apr 7 19:53:53 dhcp6c 7146 dhcp6c renew, no change - bypassing update on igb0
Apr 7 19:53:53 dhcp6c 20400 executes /var/etc/dhcp6c_wan_dhcp6withoutra_script.sh
Apr 7 19:53:53 dhcp6c 20400 update a prefix 2a06:61c1:ba35::/48 pltime=86400, vltime=86400
Apr 7 19:53:53 dhcp6c 20400 update an IA: PD-0
Apr 7 19:53:53 dhcp6c 20400 nameserver[1] 2a02:fb8:3:300::3
Apr 7 19:53:53 dhcp6c 20400 nameserver[0] 2a02:fb8:2:200::3
Apr 7 19:53:53 dhcp6c 20400 dhcp6c Received INFO
Apr 7 19:53:53 dhcp6c 20400 get DHCP option DNS, len 32
Apr 7 19:53:53 dhcp6c 20400 IA_PD prefix: 2a06:61c1:ba35::/48 pltime=86400 vltime=86400
Apr 7 19:53:53 dhcp6c 20400 get DHCP option IA_PD prefix, len 25
Apr 7 19:53:53 dhcp6c 20400 IA_PD: ID=0, T1=43200, T2=69120
Apr 7 19:53:53 dhcp6c 20400 get DHCP option IA_PD, len 41
Apr 7 19:53:53 dhcp6c 20400 unknown or unexpected DHCP6 option opt_20, len 0
Apr 7 19:53:53 dhcp6c 20400 get DHCP option opt_20, len 0
Apr 7 19:53:53 dhcp6c 20400 DUID: 00:02:00:00:05:83:34:63:3a:36:64:3a:35:38:3a:38:37:3a:31:37:3a:66:38:00:00:00
Apr 7 19:53:53 dhcp6c 20400 get DHCP option server ID, len 26
Apr 7 19:53:53 dhcp6c 20400 DUID: 00:01:00:01:2f:75:c3:80:64:62:66:21:6b:74
Apr 7 19:53:53 dhcp6c 20400 get DHCP option client ID, len 14
Apr 7 19:53:53 dhcp6c 20400 receive reply from fe80::4e6d:58ff:fe87:1017%igb0 on igb0
Apr 7 19:53:53 dhcp6c 20400 send renew to ff02::1:2%igb0
Apr 7 19:53:53 dhcp6c 20400 set IA_PD
Apr 7 19:53:53 dhcp6c 20400 set IA_PD prefix
Apr 7 19:53:53 dhcp6c 20400 set option request (len 4)
Apr 7 19:53:53 dhcp6c 20400 set elapsed time (len 2)
Apr 7 19:53:53 dhcp6c 20400 set server ID (len 26)
Apr 7 19:53:53 dhcp6c 20400 set client ID (len 14)
Apr 7 19:53:53 dhcp6c 20400 a new XID (5c85c4) is generated
Apr 7 19:53:53 dhcp6c 20400 Sending Renew
Apr 7 19:53:53 dhcp6c 20400 reset a timer on igb0, state=RENEW, timeo=0, retrans=9557
Apr 7 19:53:53 dhcp6c 20400 IA timeout for PD-0, state=ACTIVE
5 minutes after this the connection dies. This is despite assigning a WAN VIP & static route to the first v6 hop.
So fed up with this.. :/
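For scale, here is a rough Python sketch that pulls the DHCPv6 timers out of two of the log lines above and converts them to hours. The helper name is made up, and it only does arithmetic on values already in the log. All of them are far longer than five minutes, which at least seems consistent with the timeout sitting on the ISP side, as suggested earlier in the thread, rather than with the lease expiring.

```python
import re

# Two lines copied from the dhcp6c log above.
LOG = """\
Apr 7 19:53:53 dhcp6c 20400 update a prefix 2a06:61c1:ba35::/48 pltime=86400, vltime=86400
Apr 7 19:53:53 dhcp6c 20400 IA_PD: ID=0, T1=43200, T2=69120
"""

def dhcp6_timers(log):
    """Return the first T1/T2/pltime/vltime values (in seconds) found in the log text."""
    found = {}
    for key in ("T1", "T2", "pltime", "vltime"):
        match = re.search(rf"\b{key}=(\d+)", log)
        if match:
            found[key] = int(match.group(1))
    return found

for name, seconds in dhcp6_timers(LOG).items():
    print(f"{name}: {seconds} s = {seconds / 3600:g} h")
```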