Gigaclear & ip6 - loss of connectivity after *exactly* 5 minutes
-
@marcosm another GC customer is on 25.03, and it doesn't fix the issue. We have a real problem here affecting every Gigaclear pfSense customer in GBR.
BTW, I reached out to TAC. The ticket was closed because I'm a TAC Lite customer. SIGH. I pay a yearly subscription for pfSense and still receive ZERO support when reporting a bug.
-
I can recreate this.
pfSense does not respond to incoming Neighbor Solicitation messages when the source IPv6 GUA is outside any local subnet and not link-local.
In this case, it leads to the ISP ceasing IPv6 traffic after approximately 5 minutes of no response to multiple solicitation requests. DHCPv6 re-solicitation restarts this timer; restarting the interface temporarily resolves it.
pfSense appears to be filtering or not processing Neighbor Solicitations when the source IP address is neither link-local nor within the target's subnet, even though the target address field within the ICMPv6 packet is the link-local address. This behavior may be exacerbated by multi-WAN configurations where the interface is not the default route.
When the source IP is a known subnet or link-local, pfSense does correctly process the Neighbor Solicitation and respond.
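To make the three source-address cases concrete, here is a minimal Python sketch. It is not pfSense code: the function name and the prefix list are made up for illustration, and the two sample addresses are the ISP router's link-local address and GUA as they appear in logs later in this thread.

```python
import ipaddress

def classify_ns_source(source, on_link_prefixes):
    """Classify the source address of a Neighbor Solicitation.

    Returns "link-local", "on-link" or "off-link".
    on_link_prefixes is a hypothetical list of prefixes configured
    on the receiving interface.
    """
    addr = ipaddress.IPv6Address(source)
    if addr.is_link_local:
        return "link-local"
    for prefix in on_link_prefixes:
        if addr in ipaddress.IPv6Network(prefix):
            return "on-link"
    return "off-link"

# A WAN that holds only a link-local address has no on-link GUA prefixes.
wan_prefixes = []
print(classify_ns_source("fe80::4e6d:58ff:fe87:1017", wan_prefixes))  # link-local
print(classify_ns_source("2a02:fb8::11", wan_prefixes))               # off-link
```

The "off-link" case is the one that goes unanswered in the behaviour described above.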
-
I believe the root cause of this is when no GUA is provided for the WAN, so it is left with only a link-local address and a delegated prefix. But the key here is, it has no GUA bound to its interface.
When pfSense attempts to respond with a Neighbor Advertisement, it picks a LAN GUA address and not the WAN link-local address.
A way to recreate this is to have two IPv6 WANs, ensure there is no GUA on WAN1, and not delegate any IPv6 subnets from WAN1 to any LAN. In other words, give every LAN a subnet outside any delegated range of WAN1 but inside the delegated range of WAN2. This makes WAN1 somewhat pointless, but that is OK for this test.
On WAN1, when a Neighbor Solicitation is received from a GUA for the WAN1 link-local address, pfSense attempts to reply with an advertisement, but instead of using the link-local address of WAN1 as the source, it picks one of the LAN interface addresses (remember, these have nothing to do with WAN1) and sends the advertisement from that source IP.
What happens to that Neighbor Advertisement varies depending on other firewall, routing and static route rules; I've seen it attempt to go out of WAN2, for example. Hence WAN1 never sees it.
When a WAN only has a link-local IPv6 address, pfSense should be using this as the source address of Neighbor Advertisement messages, not picking a random LAN interface as the source address.
I believe this to be the root cause and a bug.
A workaround is to add an IP Alias on the WAN for a /128 GUA from somewhere within the /48 delegated prefix. However, there should not be a need for a router to have a GUA with IPv6.
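As a rough illustration of "a /128 from somewhere within the /48", the Python sketch below derives one candidate address from a delegated prefix. The helper name, the documentation prefix 2001:db8:1234::/48 and the choice of the address at index 1 are all assumptions made for the example; any address inside the delegated range would fit the description above.

```python
import ipaddress

def alias_candidate(delegated_prefix, host_index=1):
    """Return a /128 interface address taken from inside a delegated prefix.

    host_index is an arbitrary choice for illustration only.
    """
    net = ipaddress.IPv6Network(delegated_prefix)
    addr = net[host_index]  # the Nth address of the prefix
    return ipaddress.IPv6Interface(f"{addr}/128")

# 2001:db8:1234::/48 is a documentation prefix, not a real delegation.
candidate = alias_candidate("2001:db8:1234::/48")
print(candidate)  # 2001:db8:1234::1/128
```

Substitute the /48 actually delegated on the affected WAN before using the result as the IP Alias.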
-
@Irata said in Gigaclear & ip6 - loss of connectivity after *exactly* 5 minutes:
However, there should not be a need for a router to have a GUA with IPv6.
I'm by no means an IPv6 expert, but what about this: if you want to use (Open)VPN over IPv6 to connect to your pfSense OpenVPN server, you'll need a WAN GUA...
-
@Gertjan Well of course, but IPv6 routers don't need a GUA and they don't need to be VPN servers.
If you did want to run a VPN server in this setup, you'd just address it to a LAN address which has been delegated a GUA /64. Then set up the firewall to allow OpenVPN to that specific IPv6 GUA.
However, this is not related to this issue at all.
-
It's not at all clear to me why the neighbor solicitation doesn't get a reply.
Our ruleset includes
pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}
, which ought to be enough for it to be allowed.
So far I've been unable to reproduce such issues. It'd be interesting to look at the pflog output to see where and why the packet is dropped. (I.e. is the NS dropped on the way in, or the NA on the way out? Due to what rule?)
As for the OPNsense commit it's ... well, leaving the editorialising aside, it seems to be trying to do three different things at once, at least two of which it does wrong. (Hint: look at where pd->hdr is copied out of the packet for ICMPv6. It'll never contain anything other than 0 for nd->nd_ns_target or mld->mld_addr. At least it'll always be 0 so it's mostly harmless, but it is still wrong.)
-
@kprovost said in Gigaclear & ip6 - loss of connectivity after *exactly* 5 minutes:
It's not at all clear to me why the neighbor solicitation doesn't get a reply.
Our ruleset includes
pass quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136}
, which ought to be enough for it to be allowed.
So far I've been unable to reproduce such issues. It'd be interesting to look at the pflog output to see where and why the packet is dropped. (I.e. is the NS dropped on the way in, or the NA on the way out? Due to what rule?)
As for the OPNsense commit it's ... well, leaving the editorialising aside, it seems to be trying to do three different things at once, at least two of which it does wrong. (Hint: look at where pd->hdr is copied out of the packet for ICMPv6. It'll never contain anything other than 0 for nd->nd_ns_target or mld->mld_addr. At least it'll always be 0 so it's mostly harmless, but it is still wrong.)
My ISP does not offer native IPv6, so I do not have an IPv6 configuration to test with. I was simply providing the user with some information from the older FreeBSD bug report that may or may not be relevant to his situation.
I agree the linked thread contained a lot of back and forth, hence the use of quotes around the word "discussion". I periodically check in at the other *sense forum as a guest to monitor for any potential Suricata issues that might be applicable to my pfSense package. That's how I encountered the linked threads.
-
Could it be a routing issue? Because it requires responding to a GUA address from a link-local address? This appears to be what other implementations allow, plus the Juniper switch itself. In all the captures I've taken, I've never once seen pfSense send a Neighbor solicitation/advertisement to a GUA from its link-local source address.
If I do both these things, it resolves the issue:
- Add a static route to IPv6 router GUA on WAN
- Add a Virtual IP/Alias to the WAN, picking any /128 from within the delegated /48 from the WAN
I think that static route is an important clue. If I just add the static route, pfSense picks some GUA from any LAN (which could be outside the delegated /48) as the source of its Neighbor Advertisement. Again, it refuses to respond to the GUA from its link-local address; pfSense will do anything but use its link-local address, even if that means picking some unrelated LAN GUA that it should not be sending out of this WAN link.
Adding the IP Alias then stops the random picking of a LAN GUA, and it picks the Virtual IP instead.
Without the Virtual IP, I can't ping that GUA from the WAN link-local address, so could the same logic be blocking these Neighbor Advertisements too?
I trawled through days of syslogs and I don't see any Neighbor Advertisements/Solicitations hitting a block rule. Nothing showed up.
Sadly, the specification for Neighbor Solicitations/Advertisements is ambiguous here, but it can be read as allowing a link-local address to respond to a GUA, and that's what others have implemented.
Having searched internet forums for the Virtual IP/static route trick, I've found several examples of this issue with pfSense and several ISPs around the globe and, interestingly, they post a very similar workaround.
-
@Irata said in Gigaclear & ip6 - loss of connectivity after *exactly* 5 minutes:
I believe the root cause of this is when no GUA is provided for the WAN, so it is left with only a link-local address and a delegated prefix. But the key here is, it has no GUA bound to its interface.
The WAN interface doesn't need a GUA. All it needs is a link-local address. Even if it has a GUA, that address will have nothing to do with the delegated prefix.
-
@Gertjan said in Gigaclear & ip6 - loss of connectivity after *exactly* 5 minutes:
I'm by no means an IPv6 expert, but what about this: if you want to use (Open)VPN over IPv6 to connect to your pfSense OpenVPN server, you'll need a WAN GUA...
Use the interface address on the LAN side.
-
My v6 just failed.
Apr 7 19:53:53 dhcp6c 20400 got an expected reply, sleeping.
Apr 7 19:53:53 dhcp6c 20400 removing an event on igb0, state=RENEW
Apr 7 19:53:53 dhcp6c 20400 script "/var/etc/dhcp6c_wan_dhcp6withoutra_script.sh" terminated
Apr 7 19:53:53 dhcp6c 7146 dhcp6c renew, no change - bypassing update on igb0
Apr 7 19:53:53 dhcp6c 20400 executes /var/etc/dhcp6c_wan_dhcp6withoutra_script.sh
Apr 7 19:53:53 dhcp6c 20400 update a prefix 2a06:61c1:ba35::/48 pltime=86400, vltime=86400
Apr 7 19:53:53 dhcp6c 20400 update an IA: PD-0
Apr 7 19:53:53 dhcp6c 20400 nameserver[1] 2a02:fb8:3:300::3
Apr 7 19:53:53 dhcp6c 20400 nameserver[0] 2a02:fb8:2:200::3
Apr 7 19:53:53 dhcp6c 20400 dhcp6c Received INFO
Apr 7 19:53:53 dhcp6c 20400 get DHCP option DNS, len 32
Apr 7 19:53:53 dhcp6c 20400 IA_PD prefix: 2a06:61c1:ba35::/48 pltime=86400 vltime=86400
Apr 7 19:53:53 dhcp6c 20400 get DHCP option IA_PD prefix, len 25
Apr 7 19:53:53 dhcp6c 20400 IA_PD: ID=0, T1=43200, T2=69120
Apr 7 19:53:53 dhcp6c 20400 get DHCP option IA_PD, len 41
Apr 7 19:53:53 dhcp6c 20400 unknown or unexpected DHCP6 option opt_20, len 0
Apr 7 19:53:53 dhcp6c 20400 get DHCP option opt_20, len 0
Apr 7 19:53:53 dhcp6c 20400 DUID: 00:02:00:00:05:83:34:63:3a:36:64:3a:35:38:3a:38:37:3a:31:37:3a:66:38:00:00:00
Apr 7 19:53:53 dhcp6c 20400 get DHCP option server ID, len 26
Apr 7 19:53:53 dhcp6c 20400 DUID: 00:01:00:01:2f:75:c3:80:64:62:66:21:6b:74
Apr 7 19:53:53 dhcp6c 20400 get DHCP option client ID, len 14
Apr 7 19:53:53 dhcp6c 20400 receive reply from fe80::4e6d:58ff:fe87:1017%igb0 on igb0
Apr 7 19:53:53 dhcp6c 20400 send renew to ff02::1:2%igb0
Apr 7 19:53:53 dhcp6c 20400 set IA_PD
Apr 7 19:53:53 dhcp6c 20400 set IA_PD prefix
Apr 7 19:53:53 dhcp6c 20400 set option request (len 4)
Apr 7 19:53:53 dhcp6c 20400 set elapsed time (len 2)
Apr 7 19:53:53 dhcp6c 20400 set server ID (len 26)
Apr 7 19:53:53 dhcp6c 20400 set client ID (len 14)
Apr 7 19:53:53 dhcp6c 20400 a new XID (5c85c4) is generated
Apr 7 19:53:53 dhcp6c 20400 Sending Renew
Apr 7 19:53:53 dhcp6c 20400 reset a timer on igb0, state=RENEW, timeo=0, retrans=9557
Apr 7 19:53:53 dhcp6c 20400 IA timeout for PD-0, state=ACTIVE
5 minutes after this, the connection dies. This is despite assigning a WAN VIP and a static route to the first v6 hop.
-
The only way I can have v6 connectivity last >24h05m00s is by:
route -6 add -host 2a02:fb8::11 -interface igb0
ndp -s 2a02:fb8::11 4e:6d:58:87:10:17 permanent
Without adding the permanent ndp entry, it defaults to 24 hours, then 5 minutes after the logs you see above, v6 dies. Once I add the permanent ndp entry I can go past 24 hours.
-
When a WAN only has a link-local IPv6 address, pfSense should be using this as the source address of Neighbor Advertisement messages, not picking a random LAN interface as the source address.
I tested this on 25.03 and pfSense always sent NAs sourced with the correct link-local address. This sounds more like there may be some routing/NAT issue. There's a default rule that allows any source/destination, though, so it should go through regardless.
Regarding https://redmine.pfsense.org/issues/16123#note-3
I see the ISP device sending NS for pfSense's link-local address without a response from pfSense:
138 2025-04-04 01:51:01.778710 0.020680 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
145 2025-04-04 01:51:04.730835 0.972840 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
152 2025-04-04 01:51:07.730711 0.359749 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
209 2025-04-04 01:51:11.150835 0.015817 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
214 2025-04-04 01:51:14.138711 1.003684 2a02:fb8::11 ff02::1:ff21:6b74 ICMPv6 86 Neighbor Solicitation for fe80::6662:66ff:fe21:6b74 from 4e:6d:58:87:10:17
While this is happening, check that pfSense has joined the respective multicast group with
ifmcstat -i igc0 -f inet6
If it has joined as expected then it should show something like:
inet6 fe80::6662:66ff:fe21:6b74%igc0 scopeid 0x3
mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
...
group ff02::1:ff21:6b74%igc0 scopeid 0x3 mode exclude
mcast-macaddr 33:33:ff:21:6b:74
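For anyone wanting to work out which group and MAC to look for on their own interface, the following Python sketch derives the solicited-node multicast group and its Ethernet multicast MAC from a unicast address. The function names are made up for illustration; the mapping itself (ff02::1:ff00:0/104 plus the low 24 bits of the address, and 33:33 plus the low 32 bits of the group) is the standard one, and the results match the ifmcstat output quoted here. Strip any %interface scope suffix before passing an address in.

```python
import ipaddress

def solicited_node_group(address):
    """Solicited-node multicast group: ff02::1:ff00:0/104 plus the low 24 bits."""
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def multicast_mac(group):
    """Ethernet MAC for an IPv6 multicast group: 33:33 plus the low 32 bits."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)

group = solicited_node_group("fe80::6662:66ff:fe21:6b74")
print(group)                 # ff02::1:ff21:6b74
print(multicast_mac(group))  # 33:33:ff:21:6b:74
```

If the group computed for the WAN link-local address is missing from the ifmcstat listing, the NS sent to that group would not be expected to reach the NDP code at all.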
A similar issue, which has since been fixed, exists in old versions:
https://redmine.pfsense.org/issues/13423
-
When this is happening I see this:
vtnet0.3:
inet6 fe80::1c1e:54ff:fe8a:705%vtnet0.3 scopeid 0x19
mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
group ff02::1:ff00:1%vtnet0.3 scopeid 0x19 mode exclude
mcast-macaddr 33:33:ff:00:00:01
group ff01::1%vtnet0.3 scopeid 0x19 mode exclude
mcast-macaddr 33:33:00:00:00:01
group ff02::2:1861:20ce%vtnet0.3 scopeid 0x19 mode exclude
mcast-macaddr 33:33:18:61:20:ce
group ff02::2:ff18:6120%vtnet0.3 scopeid 0x19 mode exclude
mcast-macaddr 33:33:ff:18:61:20
group ff02::1%vtnet0.3 scopeid 0x19 mode exclude
mcast-macaddr 33:33:00:00:00:01
group ff02::1:ff8a:705%vtnet0.3 scopeid 0x19 mode exclude
mcast-macaddr 33:33:ff:8a:07:05
The only clue I have is that, just like others, I must add a static route to the GUA of the ISP router to get pfSense to reply out of the correct WAN interface, and I have to manually add a permanent ndp entry for that device too. I am running multi-WAN. I also don't understand why the ndp table is not being updated for that ISP device, even when I ping that device.
It's really strange, but pfSense definitely does not send NAs without the above encouragement. I can recreate this issue so easily, every time without fail.
The only other oddity I've noticed is that running:
ndp -s 2a02:fb8::32 4a:5a:0d:5a:f2:b7
gives this sometimes:
ndp: delete: cannot locate 2a02:fb8::32
But I'm not trying to delete, I'm trying to add. Running the command just after adding the static route, however, makes it work.
-
@Irata Would you share the output of uname -a?
-
FreeBSD pfSense 15.0-CURRENT FreeBSD 15.0-CURRENT #0 plus-RELENG_25_03-n256497-da24eca0fcd2: Mon Apr 14 19:32:49 UTC 2025 root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/obj/amd64/ILoDLiJx/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/sources/FreeBSD-src-plus-RELENG_25_03/amd64.amd64/sys/pfSense amd64
Thought I should add some info of my setup:
I have 4 WAN interfaces, all dual-stack IPv4 and IPv6, and all with a delegated /48 subnet. I use the IPv6 prefix of the first WAN for all LAN interfaces, and I use NPT to translate those to the other 3 WAN prefixes if used. I use a failover group, with each WAN interface in a different tier.
The ISP link causing this issue is not the first interface, so I am not using that /48 prefix on the LAN. This troublesome ISP link is the only one where IPv6 is native on the WAN interface and the ISP sends NS from a GUA. On the other IPv6 interfaces, either the link is an L2TP tunnel or the ISP uses a link-local address for its Neighbor Solicitations/Advertisements.
It appears others have a much simpler setup than mine and still see the issue.
If I shut down my pfSense VM and start up a simple OpenWrt VM, everything is stable and the IPv6 interface just works for this ISP; NAs are sent correctly to the GUA. All I do is swap one VM for another VM running a different OS; no other network changes are made.
To me, the only common factor when NAs are not sent (without some tinkering) is that I'm using pfSense and the ISP sends NS from a GUA. This does appear to be typical behaviour from Juniper switches, as I also see this issue on a different test bed.
I have also just discovered, during this issue on this version of pfSense, that opening Diagnostics->NDP Table in the pfSense UI causes the UI to lock up and eventually crash. It sends me to a crash report, but that just says
Crash report begins. Anonymous machine information:
amd64
15.0-CURRENT
FreeBSD 15.0-CURRENT #0 plus-RELENG_25_03-n256497-da24eca0fcd2: Mon Apr 14 19:32:49 UTC 2025 root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/obj/amd64/ILoDLiJx/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/sources/FreeB
Crash report details:
No PHP errors found.
No FreeBSD crash data found.
I'm unsure if this is related.
-
I was able to reproduce the issue. The behavior is intended and a workaround exists - see:
https://redmine.pfsense.org/issues/16146
I expect this to work with the ISP Gigaclear as well.
-
@marcosm thank you!
(5 minutes later) Confirmed: it works!
-
@ahxcjay Check after 24 hours, as even with this change I still see the NDP entry for the first hop going stale, and it may get deleted after 24 hours.
-
@Irata yes - 2h53m left…