What is expected behaviour of pfSense if ISP edge router does not send periodic RA?

bimmerdriver

In some cases, my ISP edge router does not send periodic RA messages. (They understand this is a problem and they are trying to get the vendor to fix it.) The router does send an initial RA message in response to the RS when PD happens. What will pfSense do in this situation? Is it expected that IPv6 will stop working? I'm asking because the observed behaviour is that IPv6 does stop working some time (e.g., a couple of hours) after PD, and I'm wondering if this is the cause.

JKnott

It will fail. RAs have a lifetime. If it expires, without receiving another RA, then the computer no longer has a prefix for the network.

bimmerdriver

@jknott Thanks for the reply. Does anyone know what the exact mechanism is? Does the route get removed? Presumably there is a timer. How long is it? As I said, I'm trying to understand what's happening, so I can inform the ISP.

JKnott

When a computer loses the prefix, it no longer has a valid address and I would also expect the route to be deleted. You can examine the lifetime with Packet Capture or Wireshark. I just looked and my RAs show a lifetime of 60s from pfSense, so after a minute without another RA, it would fail. You can examine the lifetime in RAs from your ISP to see what it should be and whether you're exceeding it.

bimmerdriver

@jknott The ISP edge routers normally send RAs at intervals of 15 to 30 minutes. IIRC, the lifetime is consistent with that. Some of the edge routers are not sending RA messages, which as I said is a known problem. The only RA they send is in reply to the RS when PD happens. Sometimes ipv6 stops working after a couple of hours. After restarting the WAN (save / apply), it sometimes stays working indefinitely and only seems to be flaky if pfsense is rebooted or, for example, when it was upgraded to 2.4.4_2. It's not deterministic, which is why I'm wondering what the intended behaviour is supposed to be and whether this is the cause of ipv6 not staying up or if it's something else.

JKnott

That doesn't sound normal. I see RAs regularly on my network.

bimmerdriver

@jknott said in What is expected behaviour of pfSense if ISP edge router does not send periodic RA?:

That doesn't sound normal. I see RAs regularly on my network.

Sigh. Of course it's not normal. The ISP knows there is a problem, as I already stated.

I posted here to determine what the expected behaviour of pfsense should be if the edge router does not send unsolicited RA messages. As I said, an RA message is only sent in response to the RS at the time of PD, never unsolicited, yet for some reason the behaviour of pfsense is not consistent. It seems to lose ipv6 connectivity a couple of hours after a reboot or upgrade, but if the WAN interface is restarted (save/apply), it seems to keep working longer. This doesn't make sense. If pfsense requires unsolicited RA messages within the expiration time and they are not received, specifically what is it supposed to do? Why does it seem to only lose ipv6 connectivity a couple of hours later, but if the WAN is restarted, it works longer? If it's designed to delete the route (for example), that does not appear to be happening consistently. Maybe there is something else wrong.

Derelict

Easiest thing to do is probably pcap the RS/RA traffic and look at what is actually being sent.

Cox sends an RA every 4 seconds. I never have any issues. Not sure what this 15-30 minute stuff is all about.

The ones I receive are:
Router lifetime (s): 1800
Reachable time (ms): 3600000
Retrans timer (ms): 0

Router Lifetime
         16-bit unsigned integer.  The lifetime associated
         with the default router in units of seconds.  The
         field can contain values up to 65535 and receivers
         should handle any value, while the sending rules in
         Section 6 limit the lifetime to 9000 seconds.  A
         Lifetime of 0 indicates that the router is not a
         default router and SHOULD NOT appear on the default
         router list.  The Router Lifetime applies only to
         the router's usefulness as a default router; it
         does not apply to information contained in other
         message fields or options.  Options that need time
         limits for their information include their own
         lifetime fields.

Reachable Time
          32-bit unsigned integer.  The time, in
          milliseconds, that a node assumes a neighbor is
          reachable after having received a reachability
          confirmation.  Used by the Neighbor Unreachability
          Detection algorithm (see Section 7.3).  A value of
          zero means unspecified (by this router).

Retrans Timer 
          32-bit unsigned integer.  The time, in
          milliseconds, between retransmitted Neighbor
          Solicitation messages.  Used by address resolution
          and the Neighbor Unreachability Detection algorithm
          (see Sections 7.2 and 7.3).  A value of zero means
          unspecified (by this router).

My initial assumption would be that pfSense (FreeBSD) would obey the RFC (RFC 4861). Only way to know is to pcap whatever the ISP is actually sending and take a look.
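If you want to pull those three fields out of a captured RA without squinting at Wireshark, here is a minimal Python sketch. It assumes you already have the raw ICMPv6 payload (starting at the Type byte) from your pcap; the names `RouterAdvert` and `parse_ra` are made up for illustration, and the sample packet just reuses the values I quoted above.

```python
import struct
from typing import NamedTuple


class RouterAdvert(NamedTuple):
    """Fixed-header fields of an ICMPv6 Router Advertisement (RFC 4861, 4.2)."""
    cur_hop_limit: int
    managed: bool           # M flag
    other: bool             # O flag
    router_lifetime_s: int  # seconds; 0 means "not a default router"
    reachable_time_ms: int
    retrans_timer_ms: int


def parse_ra(icmpv6: bytes) -> RouterAdvert:
    """Parse the first 16 bytes of an ICMPv6 message as an RA header."""
    if len(icmpv6) < 16:
        raise ValueError("too short for a Router Advertisement")
    # Type, Code, Checksum, Cur Hop Limit, Flags,
    # Router Lifetime, Reachable Time, Retrans Timer
    typ, _code, _cksum, hop, flags, lifetime, reachable, retrans = struct.unpack(
        "!BBHBBHII", icmpv6[:16]
    )
    if typ != 134:
        raise ValueError(f"ICMPv6 type {typ} is not a Router Advertisement")
    return RouterAdvert(
        hop, bool(flags & 0x80), bool(flags & 0x40), lifetime, reachable, retrans
    )


# Hypothetical sample built from the values quoted above (checksum left at 0).
sample = struct.pack("!BBHBBHII", 134, 0, 0, 64, 0, 1800, 3600000, 0)
print(parse_ra(sample))
```

The options that follow the fixed header (prefix information and so on) carry their own lifetimes and are not decoded here.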

bimmerdriver

@derelict Thanks for your reply. This ISP has made some possibly questionable implementation decisions in their network.

First, the DHCP before RA feature was tested on this network. Their edge routers will not respond to an RS until after the DHCP solicit/advertise and DHCP request/reply sequence is complete. After that, the edge router will respond to an RS with an RA. I just fired up wireshark and captured some packets. The router lifetime in the RA is 4500 seconds (75 minutes), the reachable time is 0 and the retrans timer is 100 ms. These values are also used in the unsolicited RA messages, which leads to another interesting implementation decision.

Second, the time between the unsolicited RA messages ranges from approximately 15 minutes to approximately 30 minutes. I determined this by capturing RA messages over several hours. This is longer than usual, but according to RFC 2461 (and its successor, RFC 4861), the maximum allowed MaxRtrAdvInterval is 1800 seconds, so they are operating within the allowable limit.
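To reason about when the default router entry should time out, here is a rough Python sketch. It assumes the host follows RFC 4861 section 6.3.4, where each RA from the same router resets the invalidation timer to the advertised Router Lifetime; whether pfSense/FreeBSD behaves exactly this way is the open question in this thread. The function name `default_router_expiry` is my own.

```python
def default_router_expiry(ra_arrivals_s, router_lifetime_s):
    """Return the time (in seconds) at which the default router entry lapses.

    ra_arrivals_s: arrival times of RAs from one router, in seconds.
    router_lifetime_s: Router Lifetime carried in each RA (assumed constant).
    Returns None if no RA was ever received.
    """
    expiry = None
    for t in sorted(ra_arrivals_s):
        if expiry is not None and t > expiry:
            break  # the entry had already timed out before this RA arrived
        expiry = t + router_lifetime_s  # each RA resets the timer
    return expiry


# Only the solicited RA at PD time, with the 4500 s lifetime from my capture.
print(default_router_expiry([0], 4500))

# Unsolicited RAs arriving every 30 minutes keep pushing the expiry out.
print(default_router_expiry([0, 1800, 3600], 4500))
```

Under that assumption, a router that only ever answers the one RS would drop off the default router list 75 minutes after PD, while 15 to 30 minute unsolicited RAs refresh the entry well before it lapses.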

I also looked at the DHCP reply. T1 and T2 in the IA for PD are 300 and 480. The preferred lifetime and valid lifetime in the IA Prefix are 600 and 900, respectively.

The above are from my router which is working properly.
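For reference, the DHCP timers above can be laid out as a timeline, measured from the moment the reply is received. This is only a sketch of how I read the T1/T2 and lifetime semantics in the DHCPv6 spec, assuming no renew or rebind ever succeeds; `pd_timeline` is a hypothetical helper, not anything pfSense provides.

```python
def pd_timeline(t1, t2, preferred, valid):
    """Return (seconds, event) pairs in time order for a delegated prefix."""
    events = [
        (t1, "T1: client starts Renew with the server that delegated the prefix"),
        (t2, "T2: client starts Rebind with any available server"),
        (preferred, "preferred lifetime ends: prefix becomes deprecated"),
        (valid, "valid lifetime ends: prefix must no longer be used"),
    ]
    return sorted(events)


# Values from the DHCP reply captured on my working router.
for seconds, event in pd_timeline(300, 480, 600, 900):
    print(f"{seconds:>4} s  {event}")
```

With these values the delegated prefix needs to be renewed every few minutes, so it runs on a much shorter clock than the 4500 second router lifetime in the RA.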

Apparently on some fiber networks, the unsolicited RA messages are not being sent at all. This is a known problem that they are working with the router vendor on. I'm trying to help someone else figure out why pfsense is behaving as I described above. Based on these timers, I would think it should work for 75 minutes (or whatever the prefix lifetime is) until the prefix expires, then it should stop working. However, it seems to fail once after initially getting a prefix, then if the interface is restarted, it keeps working. I don't understand why it would keep working if prefix expiration is causing it to stop working after the interface is started. Maybe something else is going on. I don't have packets captured from this network, but I'll try to get some.