Netgate Discussion Forum

    25.03.b.20250507.1611 crash

    Plus 25.03 Development Snapshots
    27 Posts 3 Posters 641 Views
    • P
      pst @stephenw10

      @stephenw10 said in 25.03.b.20250507.1611 crash:

      <5>gif0: loop detected
      <5>gif0: loop detected
      <5>gif0: loop detected

      and regarding these, I managed to track down the root cause...

      The loop detected indications appeared when I put my computer to sleep... What? 🤔

      Packet tracing on the gif interface revealed this in wireshark:

      7e5520d1-310c-4fab-92bb-ded35b1b5bf7-image.png

      After further digging I realised that one of my Hyper-V guests (an Ubuntu instance) seems to be calling home every time it gets a suspend indication, but using the link-local address. I'm not sure if that's an issue in the Hyper-V guest or the Hyper-V server sending packets using the link-local address.

      After updating the LAN rules to filter out _private6_ addresses I no longer see gif0 screaming about loop detections.
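As a rough illustration of what such a rule matches, here is a minimal Python sketch that flags IPv6 source addresses which should not be forwarded into the tunnel. The prefix list is an assumption (link-local fe80::/10 plus unique local fc00::/7); the actual contents of the _private6_ alias on a given install may differ, and `must_not_enter_tunnel` is a hypothetical helper name, not anything from pfSense.

```python
import ipaddress

# Assumed prefixes for illustration only; not read from pfSense.
NON_ROUTABLE_V6 = (
    ipaddress.ip_network("fe80::/10"),  # link-local
    ipaddress.ip_network("fc00::/7"),   # unique local (ULA)
)


def must_not_enter_tunnel(src: str) -> bool:
    """Return True if src is an IPv6 address inside one of the assumed prefixes."""
    addr = ipaddress.ip_address(src)
    if addr.version != 6:
        return False
    return any(addr in net for net in NON_ROUTABLE_V6)
```

A source such as `fe80::1` would be flagged, while a global address from the routed tunnelbroker prefix would pass.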

      All is well, for now...

      • P
        pst @pst

        @stephenw10 but obviously there had to be multiple reasons for these "loop detected" messages. I discovered one additional cause, which seems more related to the inner workings of pfSense.

        The scenario is as follows

        1. I put the computer to sleep
        2. a TCP retransmission is received on the gif interface aimed for the now sleeping computer
        3. after three seconds (timer expiry?) an ICMPv6 Destination Unreachable (Address Unreachable) is generated by pfSense
        4. this ICMPv6 packet is what triggers the "gif0: loop detected" (the timing in syslog and packet trace matches)

        The information in the "destination unreachable" looks fine to me, so there is no obvious reason why it could be interpreted as "looped". I can upload the pcap if anyone is interested?
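For anyone wanting to repeat the timing comparison from step 4, here is a minimal Python sketch that pairs each syslog timestamp with the nearest packet timestamp. Everything in it is an assumption for illustration: `match_events` is a hypothetical helper, timestamps are plain seconds as floats, and the one-second tolerance is arbitrary.

```python
from bisect import bisect_left


def match_events(log_times, packet_times, tolerance=1.0):
    """Pair each log timestamp with the nearest packet timestamp.

    Returns a list of (log_time, packet_time) tuples; packet_time is None
    when no packet lies within `tolerance` seconds of the log entry.
    """
    packets = sorted(packet_times)
    pairs = []
    for t in log_times:
        i = bisect_left(packets, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = packets[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda p: abs(p - t), default=None)
        if nearest is not None and abs(nearest - t) <= tolerance:
            pairs.append((t, nearest))
        else:
            pairs.append((t, None))
    return pairs
```

Feeding it the times of the "gif0: loop detected" lines and the times of the ICMPv6 packets from the capture shows which log entries have a packet close enough to be the likely trigger.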

        • stephenw10S
          stephenw10 Netgate Administrator

          Hmm, how exactly are you using the gif tunnel(s) there?

          • P
            pst @stephenw10

            @stephenw10 gif0 is the only gif tunnel I have, it is a tunnelbroker.net connection that provides IPv6 to the LAN (where the sleepy computer resides) and a number of VLANs.

            Those LAN/VLANs all have static IPv6 configuration (/64) in the routed /48 tunnelbroker subnet. The router mode is set to Assisted with DHCPv6 servers running.

            In addition to tunnelbroker.net I also have one VLAN that is configured with IPv6 by Tracking my ISP WAN which uses DHCPv6.

            • stephenw10S
              stephenw10 Netgate Administrator

              Hmm, curious. Yes, hard to see how that could create any sort of loop on any interface.

              Is that client that goes to sleep attached directly to pfSense? Such that the link state could change when it goes into standby?

              • P
                pst @stephenw10

                @stephenw10 said in 25.03.b.20250507.1611 crash:

                Is that client that goes to sleep attached directly to pfSense?

                No, there's an unmanaged switch in between.

                • P
                  pst @stephenw10

                  @stephenw10 said in 25.03.b.20250507.1611 crash:

                  Is that client that goes to sleep attached directly to pfSense?

                  I changed my network setup and have now tested with a direct connection between Sleepyhead and the pfSense and the pfSense behaviour is different for the same scenario:

                  I can still see the reception of the TCP retransmissions, but pfSense does not respond with ICMPv6 Destination Unreachable after a timeout like previously. It just seems to drop the packet, which eventually leads to a TCP reset from the other end. No ICMP == no gif0: loop detected in this scenario.

                  This all makes sense I guess, considering the amount of work pfSense does when it detects the LAN/igb1 going down at the point of going to sleep. It knows the LAN client is unavailable and acts accordingly.

                  So, a switch is required between pfSense and the LAN client to trigger the "loop detected" scenario.

                  • P
                    pst @pst

                    I found an easier way of recreating the issue, saving me having to put the computer to sleep: all I need to do is start a file transfer (wget for example) in one of the Hyper-V guests that resides on the LAN (and is therefore also reachable over the gif interface), and then just pause the Hyper-V guest. After a short while pfSense will answer incoming packets on the gif destined for the sleeping guest with ICMPv6 Destination Unreachable, and the corresponding "gif0: loop detected" is added to the syslog.
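To pick those replies out of a capture, here is a minimal Python sketch that decodes the first two bytes (type and code) of an ICMPv6 message. The numeric values come from RFC 4443; `describe_icmpv6` is a hypothetical helper, and it assumes the input is the ICMPv6 message itself with the IPv6 header already stripped.

```python
# ICMPv6 type 1 (Destination Unreachable) codes as defined in RFC 4443.
DEST_UNREACH_CODES = {
    0: "no route to destination",
    1: "communication administratively prohibited",
    2: "beyond scope of source address",
    3: "address unreachable",
    4: "port unreachable",
    5: "source address failed ingress/egress policy",
    6: "reject route to destination",
}


def describe_icmpv6(message: bytes) -> str:
    """Describe an ICMPv6 message from its type and code bytes."""
    if len(message) < 2:
        raise ValueError("need at least the type and code bytes")
    icmp_type, code = message[0], message[1]
    if icmp_type == 1:
        reason = DEST_UNREACH_CODES.get(code, f"unknown code {code}")
        return f"Destination Unreachable ({reason})"
    return f"other ICMPv6 type {icmp_type}"
```

The packets described here would be type 1, code 3, i.e. "Destination Unreachable (address unreachable)".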

                    As I monitor the situation I can also see that the remaining "loop detected" messages are triggered by Wi-Fi attached phones, which have a tendency to wake up and go back to sleep much more regularly.

                    I think I have now found all the scenarios that trigger "gif0: loop detected". One was down to my misconfiguration of firewall rules. I leave it to you to find out why the pfSense-generated "ICMPv6 Destination Unreachable" packets are regarded as "looped".

                    • stephenw10S
                      stephenw10 Netgate Administrator

                      Hmm, interesting. I'd assume you still saw those loop warnings in 24.11?

                      I don't believe it's actually related to the crash though TBH. Have to wait on that.

                      • P
                        pst @stephenw10

                        @stephenw10 said in 25.03.b.20250507.1611 crash:

                        I don't believe it's actually related to the crash though

                        I agree, it is a side track, and I doubt it is actually 25.03-related either. I didn't run the gif in 24.11 so I have no history. If the beta config.xml is compatible with 24.11 I could try and load it and see if the pfSense behaviour has changed.

                        • stephenw10S
                          stephenw10 Netgate Administrator

                          The config version has changed so it will complain if you try to load a 25.03 config into 24.11. It might work. 😉 It depends what you actually have configured.

                          • P
                            pst @stephenw10

                            @stephenw10 yes, I loaded the current 25.03 config into 24.11 earlier and there were some warnings and a few errors. Most things seemed to work well though, including the GIF tunnel, but there were a lot more "loop detected" messages than in the beta. They seemed to be triggered by scenarios other than just the beta's "ICMPv6 Destination Unreachable (Address Unreachable)". Not sure how much can be read into that considering the "invalid" config file (and I really don't feel like manually setting up the current config in 24.11!)

                            • stephenw10S
                              stephenw10 Netgate Administrator

                              Ah, OK. So not a regression then. Yup, I think we can safely say it's unrelated to the crash.

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.