Netgate Discussion Forum

    SG-5100 lost all ix ports after power outage

    Official Netgate® Hardware
    17 Posts 5 Posters 1.6k Views
    • brians

      We had an extended power outage last night at our office, and when power came back the SG-5100 would not boot.

      I connected a console cable and found it waiting for WAN and LAN assignment, so I assigned them to igb0 and igb1 respectively. After startup, ix0, ix1, ix2, and ix3 are missing. The link LEDs still light up on cables plugged into these ports, however.

      After some troubleshooting, I ended up resetting to factory defaults and reloading the config, and the issue is still the same.
      The version I was running at the time was 22.05 without ZFS. The only image I had handy was 22.01, which I reflashed from USB (using ZFS now, which is irrelevant in this case), with the same result: all ix ports missing. I was able to load the 22.05 config onto 22.01 with no issues. Last week we had another extended outage and I noticed it didn't boot, but a power cycle fixed it and I didn't think much of it, other than that it might be a filesystem issue and that I should upgrade to ZFS soon.

      To get operational again, I moved a dedicated management LAN from its own port on ix0 onto a VLAN on igb1, and after a bit of manual work I am up and running for now. If I reboot the router, it prompts for WAN and LAN assignment again.

      The SG-5100 is about 4 years old.

      Any ideas? Are the ix ports or the controller defective?

      I just ordered a 6100 as a replacement anyway. Fortunately they were in stock, so I didn't have to settle for a 4100 :)

      When I replace it with the 6100 I can troubleshoot the hardware more. Does Netgate offer any sort of out-of-warranty repair for these?

      • stephenw10 Netgate Administrator

        The ix NICs are in the SoC in the 5100 so if those get fried I expect more things to have failed.

        Can you check the boot log for errors attaching the driver?

        Do the NICs appear at all in the output of pciconf -lv?

        Out of warranty repair is likely to be uneconomical on that unfortunately.

        Steve

        • brians @stephenw10

          @stephenw10
          Here is the section of the boot log pertaining to the ix0 interface. I don't see any reference to ix1, ix2, or ix3; maybe it gives up if it doesn't see the first one.

          AHCI v1.31 with 1 6Gbps ports, Port Multiplier supported
          ahcich8: <AHCI channel> at channel 7 on ahci1
          ahciem1: <AHCI enclosure management bridge> on ahci1
          xhci0: <Intel Denverton USB 3.0 controller> mem 0xdff80000-0xdff8ffff irq 19 at device 21.0 on pci0
          xhci0: 32 bytes context size, 64-bit DMA
          usbus0 on xhci0
          usbus0: 5.0Gbps Super Speed USB v3.0
          pcib6: <ACPI PCI-PCI bridge> irq 16 at device 22.0 on pci0
          pci6: <ACPI PCI bus> on pcib6
          pci6: <network, ethernet> at device 0.0 (no driver attached)
          pci6: <network, ethernet> at device 0.1 (no driver attached)
          pcib7: <ACPI PCI-PCI bridge> at device 23.0 on pci0
          pci7: <ACPI PCI bus> on pcib7
          ix0: <Intel(R) X553 L (1GbE)> mem 0xdf400000-0xdf5fffff,0xdf604000-0xdf607fff irq 16 at device 0.0 on pci7
          ix0: Hardware initialization failed
          ix0: IFDI_ATTACH_PRE failed 5
          device_attach: ix0 attach returned 5
          ix0: <Intel(R) X553 L (1GbE)> mem 0xdf200000-0xdf3fffff,0xdf600000-0xdf603fff irq 17 at device 0.1 on pci7
          ix0: Hardware initialization failed
          ix0: IFDI_ATTACH_PRE failed 5
          device_attach: ix0 attach returned 5
          pci0: <simple comms> at device 24.0 (no driver attached)
          uart2: <Intel Denverton UART> port 0xe080-0xe087 mem 0xdff9d000-0xdff9d0ff irq 16 at device 26.0 on pci0
          uart2: Using 1 MSI message
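
For anyone triaging a similar log, here is a minimal sketch that pulls the attach failures and the unclaimed network devices out of saved boot-log text. The function name `summarize` and the `BOOT_LOG` constant are illustrative only and are not part of pfSense; the sample lines are copied from the log above. The return code 5 is looked up in the standard errno table, where it corresponds to EIO.

```python
import errno
import re

# Sample lines copied from the boot log in this thread.
BOOT_LOG = """\
pci6: <network, ethernet> at device 0.0 (no driver attached)
pci6: <network, ethernet> at device 0.1 (no driver attached)
ix0: Hardware initialization failed
ix0: IFDI_ATTACH_PRE failed 5
device_attach: ix0 attach returned 5
ix0: Hardware initialization failed
ix0: IFDI_ATTACH_PRE failed 5
device_attach: ix0 attach returned 5
pci0: <simple comms> at device 24.0 (no driver attached)
"""

ATTACH_FAIL = re.compile(r"^device_attach: (\w+) attach returned (\d+)$")
NO_DRIVER = re.compile(
    r"^(pci\d+): <([^>]*)> at device (\d+\.\d+) \(no driver attached\)$"
)


def summarize(log):
    """Return (attach failures, unclaimed network devices) found in the log."""
    failures = []
    unclaimed = []
    for raw in log.splitlines():
        line = raw.strip()
        m = ATTACH_FAIL.match(line)
        if m:
            code = int(m.group(2))
            name = errno.errorcode.get(code, "unknown")
            failures.append((m.group(1), code, name))
            continue
        m = NO_DRIVER.match(line)
        # Only network-class devices are of interest here.
        if m and "network" in m.group(2):
            unclaimed.append((m.group(1), m.group(3), m.group(2)))
    return failures, unclaimed


failures, unclaimed = summarize(BOOT_LOG)
for dev, code, name in failures:
    print(f"{dev}: attach failed with error {code} ({name})")
for bus, slot, desc in unclaimed:
    print(f"{bus} device {slot}: no driver attached ({desc})")
```

Run against the sample, this reports two attach failures with error 5 (EIO) and two unclaimed ethernet functions on pci6. Both failures are logged under the name ix0, so counting failure lines is more reliable than counting distinct interface names.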
          

          Here is the output of pciconf -lv. I notice the entries none6, none7, none8, and none9 near the bottom; none8 and none9 are identified as X553 1GbE.

          igb0@pci0:3:0:0:	class=0x020000 card=0x0000ffff chip=0x15338086 rev=0x03 hdr=0x00
              vendor     = 'Intel Corporation'
              device     = 'I210 Gigabit Network Connection'
              class      = network
              subclass   = ethernet
          igb1@pci0:4:0:0:	class=0x020000 card=0x0000ffff chip=0x15338086 rev=0x03 hdr=0x00
              vendor     = 'Intel Corporation'
              device     = 'I210 Gigabit Network Connection'
              class      = network
              subclass   = ethernet
          none6@pci0:6:0:0:	class=0x020000 card=0x00008086 chip=0x13068086 rev=0x11 hdr=0x00
              vendor     = 'Intel Corporation'
              class      = network
              subclass   = ethernet
          none7@pci0:6:0:1:	class=0x020000 card=0x00008086 chip=0x13068086 rev=0x11 hdr=0x00
              vendor     = 'Intel Corporation'
              class      = network
              subclass   = ethernet
          none8@pci0:8:0:0:	class=0x020000 card=0x00008086 chip=0x15e58086 rev=0x11 hdr=0x00
              vendor     = 'Intel Corporation'
              device     = 'Ethernet Connection X553 1GbE'
              class      = network
              subclass   = ethernet
          none9@pci0:8:0:1:	class=0x020000 card=0x00008086 chip=0x15e58086 rev=0x11 hdr=0x00
              vendor     = 'Intel Corporation'
              device     = 'Ethernet Connection X553 1GbE'
              class      = network
              subclass   = ethernet
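
A minimal sketch of how this check could be automated, assuming the `pciconf -lv` output has been captured as text. The helper names `parse_pciconf` and `unclaimed_ethernet` are hypothetical, and the sample is trimmed from the output above. It relies on two things visible in that output: devices without a driver are named `noneN`, and ethernet devices carry class `0x0200xx`.

```python
import re

# Trimmed sample from the pciconf -lv output in this thread.
PCICONF_LV = """\
igb0@pci0:3:0:0:\tclass=0x020000 card=0x0000ffff chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
none6@pci0:6:0:0:\tclass=0x020000 card=0x00008086 chip=0x13068086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
none8@pci0:8:0:0:\tclass=0x020000 card=0x00008086 chip=0x15e58086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection X553 1GbE'
"""

HEADER = re.compile(
    r"^(?P<name>\w+)@(?P<selector>pci\d+:\d+:\d+:\d+):\s+"
    r"class=(?P<cls>0x[0-9a-f]+)\s+card=0x[0-9a-f]+\s+chip=(?P<chip>0x[0-9a-f]+)"
)
PROP = re.compile(r"^\s+(?P<key>\w+)\s+=\s+'?(?P<value>[^']*)'?$")


def parse_pciconf(text):
    """Parse header lines and the optional 'device' description into dicts."""
    devices = []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            devices.append({**m.groupdict(), "device": None})
            continue
        m = PROP.match(line)
        if m and devices and m.group("key") == "device":
            devices[-1]["device"] = m.group("value")
    return devices


def unclaimed_ethernet(devices):
    """Ethernet-class devices that no driver has attached to."""
    return [
        d for d in devices
        if d["cls"].startswith("0x0200") and d["name"].startswith("none")
    ]


devices = parse_pciconf(PCICONF_LV)
for d in unclaimed_ethernet(devices):
    print(d["selector"], d["chip"], d["device"] or "(no description)")
```

In the sample, none6 has no `device` description at all, while none8 is still described as an X553; that difference is a hint that the two pairs of ports are failing in different ways.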
          
          • stephenw10 Netgate Administrator

            Hmm, well that's.... odd! Two of them appear to have somehow changed device ID (chip=0x13068086 where 0x15e48086 is expected). And the others are failing to respond to the driver...
            They should appear as:

            ix0@pci0:6:0:0:	class=0x020000 card=0x00008086 chip=0x15e48086 rev=0x11 hdr=0x00
                vendor     = 'Intel Corporation'
                device     = 'Ethernet Connection X553 1GbE'
                class      = network
                subclass   = ethernet
            ix1@pci0:6:0:1:	class=0x020000 card=0x00008086 chip=0x15e48086 rev=0x11 hdr=0x00
                vendor     = 'Intel Corporation'
                device     = 'Ethernet Connection X553 1GbE'
                class      = network
                subclass   = ethernet
            ix2@pci0:8:0:0:	class=0x020000 card=0x00008086 chip=0x15e58086 rev=0x11 hdr=0x00
                vendor     = 'Intel Corporation'
                device     = 'Ethernet Connection X553 1GbE'
                class      = network
                subclass   = ethernet
            ix3@pci0:8:0:1:	class=0x020000 card=0x00008086 chip=0x15e58086 rev=0x11 hdr=0x00
                vendor     = 'Intel Corporation'
                device     = 'Ethernet Connection X553 1GbE'
                class      = network
                subclass   = ethernet
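
To make the difference explicit, here is a small sketch that compares the `chip=` values from the two listings in this thread. The names `EXPECTED`, `OBSERVED`, `split_chip`, and `changed_ids` are illustrative. It assumes the usual `pciconf` layout for the `chip` field: device ID in the upper 16 bits and vendor ID in the lower 16 bits (for example, `0x15338086` is device `0x1533` from vendor `0x8086`, Intel).

```python
# chip= values keyed by PCI selector, taken from the two listings in this thread.
EXPECTED = {
    "pci0:6:0:0": 0x15E48086,
    "pci0:6:0:1": 0x15E48086,
    "pci0:8:0:0": 0x15E58086,
    "pci0:8:0:1": 0x15E58086,
}
OBSERVED = {
    "pci0:6:0:0": 0x13068086,
    "pci0:6:0:1": 0x13068086,
    "pci0:8:0:0": 0x15E58086,
    "pci0:8:0:1": 0x15E58086,
}


def split_chip(chip):
    """Split a pciconf chip value into (vendor_id, device_id)."""
    return chip & 0xFFFF, chip >> 16


def changed_ids(expected, observed):
    """List (selector, expected_device_id, observed_device_id) for mismatches."""
    changed = []
    for selector, exp in expected.items():
        obs = observed.get(selector)
        if obs is not None and obs != exp:
            changed.append((selector, split_chip(exp)[1], split_chip(obs)[1]))
    return changed


for selector, exp_dev, obs_dev in changed_ids(EXPECTED, OBSERVED):
    print(f"{selector}: device ID {exp_dev:#06x} -> {obs_dev:#06x}")
```

Only the two functions at bus 6 differ (0x15e4 became 0x1306); the two at bus 8 still report the expected ID even though the driver fails to initialize them.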
            
            • brians @stephenw10

              @stephenw10 OK, so it is probably a hardware issue, I gather.

              • stephenw10 Netgate Administrator @brians

                It looks like it. I'm just trying to see if we've ever seen anything like that previously.
                I assume it has been through a complete power cycle since the outage?

                • brians @stephenw10

                  @stephenw10 Yes, several power cycles.

                  When we get the 6100 next week I will be able to do more experiments on it.

                  • bmeeks

                    Was your extended power outage due to weather, perhaps? If so, I would suspect hardware damaged by a transient surge. Ethernet cabling can make a dandy antenna for picking up EMF surges caused by nearby lightning strikes. I've had switch ports destroyed that way in the past.

                    • brians @bmeeks

                      It was a scheduled outage due to building maintenance.

                      • stephenw10 Netgate Administrator

                        When you are able, there is something we might try: rewriting the NIC EEPROM contents in case they have somehow been corrupted. That's about the only thing I could imagine changing the PCI device IDs like that. The driver is not complaining about the EEPROM checksum, which it usually would, but it could be that it doesn't get that far.

                        • brians @stephenw10

                          @stephenw10 I just received the 6100 today and restored my config to it, so the SG-5100 is now free for whatever you want to try regarding the EEPROM.

                          • jimp Rebel Alliance Developer Netgate

                            You can try to reset the NIC EEPROM on the 5100 as follows:

                            • Connect to the serial console
                            • Power on the device
                            • Hit Esc or F12 to get the boot menu and choose Enter Setup
                            • Go into the BIOS settings under Advanced > CSM Configuration
                            • Change Network to a different value such as UEFI
                            • Save/exit and reboot

                            That will nudge it to rewrite the NIC EEPROM and then it should boot (but perhaps slower).

                            If the NICs work again after that, then go back into the BIOS and change that same setting back to Legacy or whatever it was on yours to start with.

                            If that does not help, then it probably is a hardware failure.

                            Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                            Need help fast? Netgate Global Support!

                            Do not Chat/PM for help!

                            • brians @jimp

                              @jimp I tried that and it did not fix the problem. Thanks for your assistance.

                              I can continue to use it as a test/backup router since it still has two working ports.

                              • jimp Rebel Alliance Developer Netgate @brians

                                Oh well, it was worth a shot.

                                Out of curiosity, was the error in the system log still the same as before?

                                • brians @jimp

                                  @jimp Yes, the error was the same.

                                  • joekislo @brians

                                    In case somebody finds this thread in the future: there do appear to be issues with some Netgate hardware and hard power-downs.

                                    As part of our final testing for a new datacenter deployment, we perform a hard failover of all equipment by removing power from each of the redundant pairs. After we removed power from our 1537, the unit booted back up and ix0 and ix1 were no longer registered. The two onboard copper ports and the expansion slot ports all worked.

                                    pciconf -lv showed the PCI devices but, as with the OP, they were registered as none@pci. We tried the EEPROM rewrite and that didn't help. Netgate support collected a full status dump of the device, then issued us an RMA for a replacement.

                                    Hopefully this saves somebody some time; this hardware appears to be sensitive to unclean shutdowns.

                                    • stephenw10 Netgate Administrator

                                      Hmm, that's weird. I've seen SFP ports behave oddly across a power cycle, especially depending on whether the module is connected before or after boot. But the ix driver was patched to still allow it to attach.

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.