pfSense does not reply to NS sent by ISP router, ISP does not respond to DHCPv6 request as a result


  • @cnrd

    Well, I don't know what else to tell you. It works fine for me and many others. But when I look at your capture and see all those unanswered solicitations, I don't think the problem is with pfSense. You have to find out why the ISP is not responding to them. I see 14 lines of them and not a single response. While I don't see a router solicitation, I do see the solicit XID to a multicast address. Why no response to that?

    What is your WAN config? Mine says DHCP6.

    Also, in your first post you said "When pfSense sends out a DHCPv6 request, my ISP sends out an NS, which pfSense never replies to <-- This seems to be what causes the problem.", but I don't see that in the packet capture.


  • @JKnott yeah I don't really know what else there is to try either.

    WAN is set to DHCPv6.

    The NS from the ISP comes right after the XID in the latest capture.

    All I can say is that it works in other OSes because they reply to that NS.

    I don't really know what to ask my ISP about, as I haven't found any documentation/RFC showing that what they are doing is out of spec.

    Anyway, thanks for trying :-) Since the same thing is happening upstream in FreeBSD, I'll probably try over there.


  • @cnrd

    You might also mention who your ISP is. Someone else might have experience with them.


  • @JKnott My ISP is Gigabit.dk

    As I wanted to test my theory that it would not reply to global addresses, I hand-crafted two different packets using scapy.

    working.pcap
    not-working.pcap

    The only difference between these two packets is that one uses a global IP as the src, while the other uses a link-local address.

    I'm going to open a bug-report with those two minimal examples.
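
    For readers without the attachments, here is a rough sketch of how such a pair of Neighbor Solicitations could be built. It is not the scapy script used for the captures above: it uses only the Python standard library, and the addresses and MAC passed in the usage note are placeholders (documentation prefix, made-up link-local addresses), not the values from the pcaps.

```python
import ipaddress
import struct

ICMPV6 = 58      # IPv6 Next Header value for ICMPv6
NS_TYPE = 135    # ICMPv6 type: Neighbor Solicitation


def checksum(data: bytes) -> int:
    """Internet checksum (one's-complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ns(src: str, dst: str, target: str, src_mac: str) -> bytes:
    """Return the IPv6 header plus ICMPv6 NS (no Ethernet header)."""
    src_b = ipaddress.IPv6Address(src).packed
    dst_b = ipaddress.IPv6Address(dst).packed
    tgt_b = ipaddress.IPv6Address(target).packed
    mac_b = bytes.fromhex(src_mac.replace(":", ""))
    # Source link-layer address option: type 1, length 1 (in units of 8 bytes)
    option = struct.pack("!BB", 1, 1) + mac_b
    # NS body: type, code, checksum placeholder, reserved, target, option
    icmp = struct.pack("!BBHI", NS_TYPE, 0, 0, 0) + tgt_b + option
    # The checksum covers an IPv6 pseudo-header followed by the ICMPv6 message
    pseudo = src_b + dst_b + struct.pack("!I3xB", len(icmp), ICMPV6)
    icmp = icmp[:2] + struct.pack("!H", checksum(pseudo + icmp)) + icmp[4:]
    # IPv6 header: version 6, payload length, next header, hop limit 255
    header = struct.pack("!IHBB", 6 << 28, len(icmp), ICMPV6, 255)
    return header + src_b + dst_b + icmp
```

    Calling `build_ns("fe80::1", "fe80::2", "fe80::2", "02:00:00:00:00:01")` and then the same call with `"2001:db8::1"` as the source gives two 72-byte packets that differ only in the source address field and the checksum that depends on it.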

  • LAYER 8 Netgate

    Not sure this is a bug.

    How is an IPv6 host supposed to source a packet to a GUA unicast address from a link-local address? Link-local is link-local, not GUA. The host has no idea that the GUA address is on the next hop. If it is not, the router receiving the packet should not forward the packet if it is sourced from the link-local address. The host cannot source the packet from a GUA address because DHCP6 has not occurred yet (It doesn't have one).

    I tried to find something hard in the RFCs that states this but came up empty.

    macOS to pfSense:
    
    Ping in the link-local context works:
    $ ping6 -S fe80::183d:38c9:7896:973b%vlan0 fe80::1:1%vlan0
    PING6(56=40+8+8 bytes) fe80::183d:38c9:7896:973b%vlan0 --> fe80::1:1%vlan0
    16 bytes from fe80::1:1%vlan0, icmp_seq=0 hlim=64 time=0.168 ms
    16 bytes from fe80::1:1%vlan0, icmp_seq=1 hlim=64 time=0.151 ms
    16 bytes from fe80::1:1%vlan0, icmp_seq=2 hlim=64 time=0.142 ms
    16 bytes from fe80::1:1%vlan0, icmp_seq=3 hlim=64 time=0.225 ms
    ^C
    --- fe80::1:1%vlan0 ping6 statistics ---
    4 packets transmitted, 4 packets received, 0.0% packet loss
    round-trip min/avg/max/std-dev = 0.142/0.171/0.225/0.032 ms
    
    Ping link-local to GUA fails:
    $ ping6 -S fe80::183d:38c9:7896:973b%vlan0 2001:470:beef:1::1
    PING6(56=40+8+8 bytes) fe80::183d:38c9:7896:973b%vlan0 --> 2001:470:beef:1::1
    ^C
    --- 2001:470:beef:1::1 ping6 statistics ---
    5 packets transmitted, 0 packets received, 100.0% packet loss
    
    Ping GUA to GUA works:
    $ ping6 -S 2001:470:beef:1:8444:5b18:abab:96f0 2001:470:beef:1::1
    PING6(56=40+8+8 bytes) 2001:470:beef:1:8444:5b18:abab:96f0 --> 2001:470:beef:1::1
    16 bytes from 2001:470:beef:1::1, icmp_seq=0 hlim=64 time=0.201 ms
    16 bytes from 2001:470:beef:1::1, icmp_seq=1 hlim=64 time=0.203 ms
    16 bytes from 2001:470:beef:1::1, icmp_seq=2 hlim=64 time=0.211 ms
    ^C
    --- 2001:470:beef:1::1 ping6 statistics ---
    3 packets transmitted, 3 packets received, 0.0% packet loss
    round-trip min/avg/max/std-dev = 0.201/0.205/0.211/0.004 ms
    

    Is this the same ISP that expected its customers to periodically send a Router Solicitation even though the RFCs explicitly state one MUST NOT do that except in certain instances like an interface reconfiguration, link down/up, etc?


  • @Derelict No, they are sending out RAs as expected. They had a problem where the RA packets were thrown away, but that was fixed by their HW vendor.

    I have been speaking to one of their internal IT guys. They have been very helpful and tried changing the config of their routers so that they would send the NS from a link-local address. That fixed this problem, but unfortunately broke the DHCP hand-out.

    I tried to find something hard in the RFCs that states this but came up empty.

    I know that it sounds weird, but as Linux supports it and there is nothing in the RFC stating that it's wrong, I can't really see why it shouldn't be okay. I can see your argument, and it does make sense, but if it's not disallowed by the RFC, then someone (in this case the HW vendor of my ISP) will choose to do it.

    How is an IPv6 host supposed to source a packet to a GUA unicast address from a link-local address?

    Here is how debian does it:
    2 0.000005 fe80::541e:337d:38c9:11ee 2a00:7660::248 ICMPv6 86 Neighbor Advertisement fe80::541e:337d:38c9:11ee (sol, ovr) is at ac:16:2d:94:bb:d3

    It is the responsibility of the receiver (router) not to forward that off the local link.


  • @cnrd said in pfSense does not reply to NS sent by ISP router, ISP does not respond to DHCPv6 request as a result:

    I have been speaking to one of their internal IT guys. They have been very helpful and tried changing the config of their routers so that they would send the NS from a link-local address. That fixed this problem, but unfortunately broke the DHCP hand-out.

    One thing I've noticed is that ISP techs are not fully up to speed on IPv6. A couple of years ago, I had a problem with my ISP where I could not reach the Internet with IPv6. After testing on my own, I had determined the problem was not on my network. I called 2nd level support (I don't waste my time with 1st) and had to talk him through how DHCPv6-PD works and how the WAN address is not used for routing. He was then able to verify the problem was elsewhere. But when he tried to get the network guys to work on it, they refused because I had my own router, even though a neighbour had the same problem and he was only using the ISP's router. I had even determined the failing system, by host name, at the head end, by examining the DHCPv6-PD sequence with Wireshark. I then had a senior tech come to my home and again explained how things worked. He tried, with his own computer and modem, and it failed for him too. He then took his computer to the office and tried with 4 different systems and only the one I was connected to failed. The network guys finally accepted the problem and resolved it.

    I was able to work my way through this because I have decades of experience with telecom, computers and networks. An average customer wouldn't have a hope.

    Bottom line, you can't always count on the ISP's techs to fully understand what they're doing.


  • @JKnott can't really reveal who I talked to, but trust me, not an average tech.


  • @cnrd I have been following this thread with great interest.

    I use the same ISP, and have exactly the same issue, in that the NS coming in is being ignored, and the DHCP6 requests in turn keep being ignored by Gigabit.

    Funny thing is that this worked for me for several years, until it stopped, around the time there was a large outage in connection with some infra upgrades being done at the time. It's possibly related to those infrastructure changes.

    Did you have any luck with support? I have a support case registered a couple of weeks ago for the same thing, and I think I will point them to this thread, unless you found a workaround on the BSD/pfSense side.


  • @abw Unfortunately not. Yeah it seems like the upgrade of hardware caused the problem.


  • @cnrd Just had a breakthrough, and now have the full /48 running with DHCPv6 on pfSense.
    My problem was due to having a user defined MAC address on the WAN interface.

    I've always had a user defined MAC address configured on the WAN interface (Interfaces > WAN > MAC Address), so as to ensure my static IPv4 address over several hardware changes.

    After reading everything you had tried, I wanted to test the theory that this user-defined address might have an impact on pfSense/FreeBSD's ability to respond to the NS.

    It seems that this is the root cause of my issue, because when I removed it and let it default to the hardware MAC, DHCPv6 and the PD came up immediately.
    Of course I no longer had an IPv4 address, but I just called up the ISP and they changed the static DHCPv4 allocation to my new MAC for me on the spot.

    Now, with the hardware MAC in use, the NS works perfectly.

    So my current root cause theory is that, whatever mechanism pfSense uses to "spoof" the MAC address, does not appear to propagate to the part of FreeBSD that is responsible for NS responses. Only a theory though.

    I really hope this helps you.

    By the way, I only have the following checked in the WAN configuration now:

    Prefix Delegation Size: /48
    Send IPv6 Prefix Hint: SET

    Reboot pfSense, and that's it.


  • @abw Huh, weird. Any chance you can try to capture the NS/NA handshake? Maybe they have a different router setup where you are connected.

    If you cannot capture the handshake, could you try to figure out what IP the router is presenting in its packets?

    Thanks!


  • @abw said in pfSense does not reply to NS sent by ISP router, ISP does not respond to DHCPv6 request as a result:

    So my current root cause theory is that, whatever mechanism pfSense uses to "spoof" the MAC address, does not appear to propagate to the part of FreeBSD that is responsible for NS responses. Only a theory though.

    There's a bit in the MAC to designate a universal or locally assigned address. Perhaps that's what's causing the problem. You should be able to check that with a packet capture.


  • @JKnott Thanks for the idea!

    My old statically coded MAC address OUI was bc:05:43 which came from a FritzBox I had for many years.
    The new OUI, which works, is 00:1f:29, which is registered to HP.

    Both have the second LSB of the first octet set to zero (UAA), so I guess that's not it.
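
    The bit test itself is easy to script. The helper below is only an illustrative sketch (the function name is made up); it looks at nothing but the first octet, so it works on a bare OUI as well as on a full MAC address.

```python
def is_locally_administered(mac: str) -> bool:
    """True if the U/L bit (0x02 in the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


# The two OUIs mentioned above are both universally administered (UAA).
for oui in ("bc:05:43", "00:1f:29"):
    print(oui, "locally administered" if is_locally_administered(oui) else "UAA")
```

    An address starting with, say, 02 would come back as locally administered.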

    In my working configuration I had also set the tunable net.inet6.icmp6.nd6_onlink_ns_rfc4861 to 1, but I have not yet retested with it set to zero to determine if it's actually necessary.

    I still need to experiment a little more to isolate the fix.

  • LAYER 8 Netgate

    This is why I always tell people it is better to just call the ISP and get them to do whatever it is they need to do instead of relying on hacks like MAC address spoofing.

    It is almost always better to fix the problem correctly than to paper over it so it can come back and bite you in unexpected ways at some unexpected time in the future. NAT reflection and, without a doubt, any routing asymmetry fall into this category. As does correcting misapplied public addresses instead of using RFC1918 (and a random choice at that).

    Was the captured negotiation that was failing using the hardware or the spoofed MAC address?


  • @Derelict There are two different people in this thread using the same ISP: @abw and me. @abw seems to have his problem fixed. I, however, have never used a spoofed MAC, but still have problems.

  • LAYER 8 Netgate

    @abw said in pfSense does not reply to NS sent by ISP router, ISP does not respond to DHCPv6 request as a result:

    My old statically coded MAC address OUI was bc:05:43

    That is not a locally administered MAC address. A locally administered address has the second-least-significant bit of the first octet set to 1.

  • LAYER 8 Netgate

    @cnrd I understand that. I wasn't really talking to you but everyone in general.


  • @Derelict sorry I had a million things going on when I wrote that reply.

    It seems like I finally have it solved, thanks to @abw, who had the following system tunable set: net.inet6.icmp6.nd6_onlink_ns_rfc4861=1

    Setting this tunable, combined with some further action to make it take effect, allows pfSense to reply. In my case, just applying the tunable was not enough: restarting did not fix it, and saving and applying the WAN interface did not fix it. I had to change and apply the DUID before the tunable finally took effect.

    As stated here: https://www.freebsd.org/security/advisories/FreeBSD-SA-08:10.nd6.asc

    The solution described below causes IPv6 Neighbor Discovery Neighbor Solicitation messages from non-neighbors to be ignored.
    This can be re-enabled if required by setting the newly added net.inet6.icmp6.nd6_onlink_ns_rfc4861 sysctl to a non-zero value.

    I think a packet coming from a global address to a link-local address would be considered a non-neighbor.
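
    To make the "non-neighbor" idea concrete, here is a deliberately simplified model of that check. It is not the FreeBSD code: the function names and the prefix list are illustrative assumptions, and the real kernel logic has more cases (for example, solicitations from the unspecified address during DAD).

```python
import ipaddress


def is_neighbor(src: str, onlink_prefixes: list[str]) -> bool:
    """Simplified: link-local sources and sources inside a known on-link prefix count."""
    addr = ipaddress.IPv6Address(src)
    if addr.is_link_local:
        return True
    return any(addr in ipaddress.IPv6Network(p) for p in onlink_prefixes)


def accept_ns(src: str, onlink_prefixes: list[str],
              nd6_onlink_ns_rfc4861: bool = False) -> bool:
    """With the sysctl off, an NS from a non-neighbor is ignored."""
    return nd6_onlink_ns_rfc4861 or is_neighbor(src, onlink_prefixes)
```

    Before DHCPv6 completes, the WAN has only a link-local address, so the on-link prefix list is empty. In this model `accept_ns("2a00:7660::248", [])` is False, while the same call with `nd6_onlink_ns_rfc4861=True` is True, which matches what we are seeing.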


  • Ran a few more tests, and also concluded that the net.inet6.icmp6.nd6_onlink_ns_rfc4861 tunable is necessary, but the effect can only be seen after the VLTIME DHCP interval has expired.

    I disabled the 'RFC4861 tunable completely, and rebooted. Everything comes up just fine, but after a couple of hours, corresponding to the DHCP6 VLTIME, the WAN IPv6 lease fails to renew, and IPv6 RAs disappear from all interfaces (I'm using Track Interface on the relevant interfaces). This appears to indicate that this tunable is necessary.

    So, for my issue at least, it appears that the combination of NOT using a statically configured MAC AND using the 'RFC4861 tunable has fully resolved the issue with this ISP. It has been stable overnight, and looks very promising.

    @Derelict

    This is why I always tell people it is better to just call the ISP and get them to do whatever it is they need to do instead of relying on hacks like MAC address spoofing

    Absolutely agree with this in principle (keep it simple), however it is still a useful feature. Now that I have a responsive and technically proficient ISP, there is no need to spoof anything, however my previous ISP had no useful process for handling a MAC change if I wanted to retain the static IP I had. The instruction to register a new MAC address was to turn off the CPE for 24 hours, which was not acceptable for me, as I am hosting a number of services.

    Anecdotal for sure, however just want to point out that this was an invaluable feature for me dealing with hardware failure, and that there are likely other use cases, perhaps also depending on the support level, contactability and proficiency of the ISP involved.

    Was the captured negotiation that was failing using the hardware or the spoofed MAC address?

    The failing NS (i.e. the one that drew no response from pfSense) was sent to a multicast MAC, consisting of 33:33:FF and then the last 3 octets of my spoofed MAC address.
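
    That destination is what you would expect for a solicited-node multicast address: the group is ff02::1:ff plus the low 24 bits of the target IPv6 address, and the Ethernet destination is 33:33 plus the low 32 bits of the group. If the link-local address was derived from the MAC (an assumption on my part), its low 24 bits are the last 3 octets of the MAC. A small sketch, with illustrative function names:

```python
import ipaddress


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """ff02::1:ffXX:XXXX, where XX:XXXX are the low 24 bits of addr."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet destination: 33:33 followed by the low 32 bits of the group."""
    return ":".join(f"{b:02x}" for b in b"\x33\x33" + group.packed[-4:])
```

    For the link-local address from the ping example earlier in the thread, fe80::183d:38c9:7896:973b, this gives the group ff02::1:ff96:973b and the destination MAC 33:33:ff:96:97:3b.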

    @cnrd Thanks for not giving up, and for pushing this issue. Very happy that it's working for you now too.


  • @Derelict said in pfSense does not reply to NS sent by ISP router, ISP does not respond to DHCPv6 request as a result:

    This is why I always tell people it is better to just call the ISP and get them to do whatever it is they need to do instead of relying on hacks like MAC address spoofing.

    The problem with that is they might not know. I have decades of experience in telecom, computers and networks and I like to dig down into the details, further than my co-workers would ever go. As a result, I was often the person they'd come to with problems they couldn't solve. In my experience with my own ISP I'll often not even bother with first level support, as I know the problem is likely to be well beyond them and I've even had to educate 2nd level and senior support. There have been a couple of times when a problem got resolved solely because I was able to work through it on my own. I would then have a struggle convincing the support people, because they simply weren't capable of working at that level. One example of this, which I have described here, was a problem with my ISP's CMTS, which I had identified right down to the host name of the failing system in their head end office.

    While many support staff understand the basics of IP, they don't know the nitty-gritty details. This is particularly true with IPv6, as it's so new to them.


  • @cnrd said in pfSense does not reply to NS sent by ISP router, ISP does not respond to DHCPv6 request as a result:

    As stated here: https://www.freebsd.org/security/advisories/FreeBSD-SA-08:10.nd6.asc

    The solution described below causes IPv6 Neighbor Discovery Neighbor Solicitation messages from non-neighbors to be ignored.
    This can be re-enabled if required by setting the newly added net.inet6.icmp6.nd6_onlink_ns_rfc4861 sysctl to a non-zero value.

    I think a packet coming from a global address to a link-local address would be considered a non-neighbor.

    Here is what I read on Reddit:

    "II. Problem Description

    IPv6 routers may allow "on-link" IPv6 nodes to create and update the router's neighbor cache and forwarding information. A malicious IPv6 node sharing a common router but on a different physical segment from another node may be able to spoof Neighbor Discovery messages, allowing it to update router information for the victim node."

    Now, take a look at the packet containing the neighbour solicitation or advertisement and check the hop limit. It is 255. This is protection against that threat: a router decrements the hop limit whenever it forwards a packet, and 255 is the largest value the field can hold, so a packet that arrives with a hop limit of 255 cannot have passed through a router. This guarantees the packet originated on the local network. If it's any other number, the packet originated elsewhere and should be discarded.
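
    As a toy illustration of that hop-limit check (a sketch only; the function names are made up and this is not taken from any real stack):

```python
from typing import Optional

MAX_HOP_LIMIT = 255
ND_TYPES = {133, 134, 135, 136, 137}  # RS, RA, NS, NA, Redirect


def forward(hop_limit: int) -> Optional[int]:
    """Model one router hop: decrement, or drop (None) if it would reach 0."""
    if hop_limit <= 1:
        return None
    return hop_limit - 1


def accept_nd(icmp_type: int, hop_limit: int) -> bool:
    """Accept a Neighbor Discovery message only if it cannot have been routed."""
    if icmp_type not in ND_TYPES:
        raise ValueError("not a Neighbor Discovery message")
    return hop_limit == MAX_HOP_LIMIT
```

    An NS sent on the local link arrives with 255 and passes. The same NS after a single `forward()` arrives with 254 and is rejected, no matter what the remote sender put in the field.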