Netgate Discussion Forum

    XG-7100 kernel panic boot loop

    Official Netgate® Hardware
    13 Posts 2 Posters 1.7k Views
    • H
      humbug
      last edited by humbug

      While running a WAN bandwidth test, I suddenly lost all WAN connectivity but could still reach the web UI. After rebooting, I see a kernel panic loop when booting multi-user, but I can start single-user mode. I captured the boot logs and panic dumps from the serial console here: https://pastebin.com/xpQzJUHR

      I'm unsure how to read this, and am hoping for suggestions on what might be wrong.

      • DerelictD
        Derelict LAYER 8 Netgate
        last edited by

        How did you reboot it?

        In single-user mode, run /sbin/fsck -y / about four times, then reboot and see if that fixes it.

        Chattanooga, Tennessee, USA
        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
        Do Not Chat For Help! NO_WAN_EGRESS(TM)

        • H
          humbug @Derelict
          last edited by

          @derelict Thanks, that resolved the kernel panic. I could not find a reboot option in the GUI, and I had forgotten to save a local copy of the pfSense manual to check whether one was listed.

          I was hoping a single press of the power button would do a clean ACPI shutdown, but after I waited a while and it didn't shut down, I did an 8-second forced shutdown from the power button.

          • DerelictD
            Derelict LAYER 8 Netgate
            last edited by

            Diagnostics > Reboot

            • H
              humbug
              last edited by

              Ah, there it is! Interestingly, I can replicate the WAN lockup. Every time I run Comcast's speed test (speedtest.xfinity.com), the WAN goes down, and the only fix I've found so far is a reboot.

              • DerelictD
                Derelict LAYER 8 Netgate
                last edited by Derelict

                Anything in the logs? Can you expand on what "goes down" actually means?

                • H
                  humbug
                  last edited by

                  Shortly after the speed test finishes, if I jump to Status > Gateways, I can see packet loss climb to 100%, and the status changes to offline for both WAN_DHCP and WAN_DHCP6.

                  In the gateway logs:

                  Aug 5 18:51:16 dpinger WAN_DHCP <IP>: Alarm latency 70660us stddev 120945us loss 21%
                  Aug 5 18:51:16 dpinger WAN_DHCP6 <IPV6>%lagg0.4090: Alarm latency 75036us stddev 121650us loss 21%
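
For anyone reading alarm lines like these, a small Python sketch can pull out the fields and convert the microsecond values to milliseconds. The regular expression is inferred only from the two lines quoted above, so the field layout is an assumption rather than dpinger's documented format, and `parse_alarm` is a hypothetical helper name.

```python
import re

# Pattern inferred from the two alarm lines quoted in this post (assumption:
# other dpinger messages may be laid out differently).
ALARM_RE = re.compile(
    r"dpinger\s+(?P<gateway>\S+)\s+(?P<target>\S+): Alarm "
    r"latency (?P<latency_us>\d+)us "
    r"stddev (?P<stddev_us>\d+)us "
    r"loss (?P<loss_pct>\d+)%"
)


def parse_alarm(line):
    """Return the alarm fields as a dict, or None if the line does not match."""
    m = ALARM_RE.search(line)
    if m is None:
        return None
    return {
        "gateway": m.group("gateway"),
        "target": m.group("target"),
        # The log reports microseconds; milliseconds are easier to read.
        "latency_ms": int(m.group("latency_us")) / 1000,
        "stddev_ms": int(m.group("stddev_us")) / 1000,
        "loss_pct": int(m.group("loss_pct")),
    }


# First alarm line from the post; <IP> is the redaction used in the original.
SAMPLE = (
    "Aug 5 18:51:16 dpinger WAN_DHCP <IP>: "
    "Alarm latency 70660us stddev 120945us loss 21%"
)

if __name__ == "__main__":
    print(parse_alarm(SAMPLE))
```

Read this way, the first line is roughly 71 ms average latency with a standard deviation of about 121 ms and 21% loss at the moment the alarm fired.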
                  
                  • DerelictD
                    Derelict LAYER 8 Netgate
                    last edited by

                    OK the next thing I would do is run a packet capture and determine if the pings are going out the interface or not.

                    Diagnostics > Packet Capture

                    Interface: WAN
                    Protocol: ICMP
                    Host Address: Whatever the gateway monitoring address is
                    Count: 10000 (or so)

                    Then run your speed test again.

                    Then, after it fails, stop and look at the capture. My guess is you will see the packets leaving and there being no response, which means something upstream is dying, not the XG-7100.

                    Feel free to download the pcap file and I will send you a place to upload it so I can look at it and provide interpretation if you like.
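
As a rough sketch of the check described above, the Python below pairs ICMP echo requests with replies and reports which requests went unanswered. It assumes the capture has already been reduced to simple records (time, direction, ICMP identifier, sequence number); the `Icmp` type, both function names, and the sample data are hypothetical and are not taken from any real capture.

```python
from typing import NamedTuple, Optional


class Icmp(NamedTuple):
    time: float   # seconds since the capture started
    kind: str     # "request" or "reply"
    ident: int    # ICMP identifier
    seq: int      # ICMP sequence number


def unanswered_requests(packets):
    """Return echo requests that never got a reply with the same (ident, seq)."""
    pending = {}
    for p in packets:
        key = (p.ident, p.seq)
        if p.kind == "request":
            pending[key] = p
        elif p.kind == "reply":
            pending.pop(key, None)
    return sorted(pending.values(), key=lambda p: p.time)


def last_request_time(packets) -> Optional[float]:
    """Return the timestamp of the final echo request, or None if there is none."""
    times = [p.time for p in packets if p.kind == "request"]
    return max(times) if times else None


# Made-up example: requests 1 and 2 are answered, request 3 is not.
EXAMPLE = [
    Icmp(0.00, "request", 7, 1),
    Icmp(0.02, "reply", 7, 1),
    Icmp(0.50, "request", 7, 2),
    Icmp(0.53, "reply", 7, 2),
    Icmp(1.00, "request", 7, 3),
]

if __name__ == "__main__":
    print("unanswered:", unanswered_requests(EXAMPLE))
    print("last request at:", last_request_time(EXAMPLE))
```

Unanswered requests at the end of the capture would point upstream. If every request has a reply and the requests simply stop, the pings are no longer leaving the interface, which points back at the firewall.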

                    • H
                      humbug
                      last edited by

                      I ran the packet capture a few times to make sure I hadn't cut the traces short or made some other mistake. The captures do not show any ICMP requests without a reply: it is a cleanly ordered request/reply sequence until no more requests show up and the gateway status shows 'Offline'.

                      Other data points:

                      • Rebooting the cable modem does not restore the WAN connection, but rebooting the 7100 does.
                      • The WAN goes offline shortly after the upload portion of the bandwidth test, but not always. Upload speeds are ~30 Mbps. Lowering the LAN port connection to 100 Mbps to throttle download speeds doesn't change the results.
                      • DerelictD
                        Derelict LAYER 8 Netgate
                        last edited by

                        What ports are you using for WAN and LAN? The default ETH1 and ETH2?

                        • H
                          humbug @Derelict
                          last edited by

                          ETH1 for WAN, ETH4 for LAN.

                          • H
                            humbug
                            last edited by

                            I was able to reproduce the lockups with other high-bandwidth downloads. Usually the web UI also stops responding, and I need to reboot the system from the serial console.

                            I shut off Suricata on the LAN interface (which was configured for Inline mode) and haven't seen any crashes so far, but I need more time and testing to confirm that this resolves my issue.

                            • H
                              humbug
                              last edited by

                              The problem causing the WAN interface to go down is Suricata crashing. If I have it in inline mode on either the LAN or the WAN interface, then any stream that saturates my WAN bandwidth for ~30 MB or more crashes Suricata.

                              Stopping and restarting the process manually restores the WAN gateway.

                              Does anyone have experience running Suricata in inline mode on the 7100? I know there are warnings about inline mode, and I do not know what to expect with Denverton hardware.

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.