Netgate Discussion Forum

    SG-5100 takes over 20 minutes to boot after eMMC failure

    Official Netgate® Hardware
    105 Posts 18 Posters 21.5k Views
    • H
      haraldinho
      last edited by

      I purchased the Transcend 430S M.2 512 GB SATA III 3D NAND. Works perfectly except the boot issue. I am working with Netgate support to see if I can get the issue resolved. Will report back if I have a solution.

      • H
        haraldinho @haraldinho
        last edited by

        On advice of Netgate Support I did a clean reinstall with UFS (was on ZFS). The problem remains exactly the same. I feel the problem is the eMMC memory that is toast but still is getting poked by the 5100. I expected it to be disabled after installing my M.2 SSD, but it is clearly not.

        Loads of these error messages in the logs:

        mmcsd0: Error indicated: 1 Timeout
        mmcsd0: Error indicated: 1 Timeout
        sdhci_pci0-slot0: Controller timeout
        sdhci_pci0-slot0: ============== REGISTER DUMP ==============
        sdhci_pci0-slot0: Sys addr: 0x05840000 | Version:  0x00001002
        sdhci_pci0-slot0: Blk size: 0x00005200 | Blk cnt:  0x00000010
        sdhci_pci0-slot0: Argument: 0x00000010 | Trn mode: 0x00000033
        sdhci_pci0-slot0: Present:  0x1fff0206 | Host ctl: 0x00000025
        sdhci_pci0-slot0: Power:    0x0000000b | Blk gap:  0x00000080
        sdhci_pci0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
        sdhci_pci0-slot0: Timeout:  0x0000000d | Int stat: 0x00000001
        sdhci_pci0-slot0: Int enab: 0x01ff003b | Sig enab: 0x01ff003a
        sdhci_pci0-slot0: AC12 err: 0x00000000 | Host ctl2:0x0000000c
        sdhci_pci0-slot0: Caps:     0x546ec8b2 | Caps2:    0x80000007
        sdhci_pci0-slot0: Max curr: 0x00000000 | ADMA err: 0x00000000
        sdhci_pci0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
        sdhci_pci0-slot0: ===========================================
        mmcsd0: Error indicated: 1 Timeout
        mmcsd0: Error indicated: 1 Timeout
        sdhci_pci0-slot0: Controller timeout
        sdhci_pci0-slot0: ============== REGISTER DUMP ==============
        sdhci_pci0-slot0: Sys addr: 0x05840000 | Version:  0x00001002
        sdhci_pci0-slot0: Blk size: 0x00005200 | Blk cnt:  0x00000010
        sdhci_pci0-slot0: Argument: 0x00000200 | Trn mode: 0x00000033
        sdhci_pci0-slot0: Present:  0x1fff0206 | Host ctl: 0x00000025
        sdhci_pci0-slot0: Power:    0x0000000b | Blk gap:  0x00000080
        sdhci_pci0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
        sdhci_pci0-slot0: Timeout:  0x0000000d | Int stat: 0x00000001
        sdhci_pci0-slot0: Int enab: 0x01ff003b | Sig enab: 0x01ff003a
        sdhci_pci0-slot0: AC12 err: 0x00000000 | Host ctl2:0x0000000c
        sdhci_pci0-slot0: Caps:     0x546ec8b2 | Caps2:    0x80000007
        sdhci_pci0-slot0: Max curr: 0x00000000 | ADMA err: 0x00000000
        sdhci_pci0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
        sdhci_pci0-slot0: ===========================================
        mmcsd0: Error indicated: 1 Timeout
        mmcsd0: Error indicated: 1 Timeout
        mmcsd0: Error indicated: 1 Timeout
        mmcsd0: Error indicated: 1 Timeout
        mmcsd0: Error indicated: 1 Timeout
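
For anyone who wants a number rather than "loads", here is a minimal sketch for counting these lines. It is not an official Netgate tool. The two patterns are copied from the excerpt above, and the function name `count_mmc_timeouts` is made up for this example. It reads log text from stdin, so it can be fed from `dmesg` or from a saved log file.

```shell
#!/bin/sh
# Count eMMC timeout lines in kernel log text read from stdin.
# Patterns are taken from the log excerpt in this post; the function
# name is hypothetical.
count_mmc_timeouts() {
    grep -c -e 'mmcsd0: Error indicated: 1 Timeout' \
            -e 'sdhci_pci0-slot0: Controller timeout'
}

# Usage on the box itself (dmesg prints the kernel message buffer):
#   dmesg | count_mmc_timeouts
```

A count that keeps growing between checks would suggest the system is still trying to talk to the eMMC.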
        

        Now I am surfing the internet to see if somebody succeeded in actually getting this memory disabled.

        Harald

        • G
          gabacho4 Rebel Alliance
          last edited by

          @haraldinho very unfortunate to hear indeed. So far my eMMC isn’t giving me any issues that I know of but, based on the data, it looks like I’m due to have them eventually. I have 2 M.2 SSDs on their way to this and the remote location. Crazy that what was a 600 dollar device can be crippled by something like this. Not being able to restart a device and having to rely on someone to physically power it back on is nuts. Gonna have to start considering my options if a real fix cannot be found.

          • stephenw10S
            stephenw10 Netgate Administrator @haraldinho
            last edited by

            @haraldinho You can disable it completely in pfSense and prevent those errors.
            Create the file /boot/loader.conf.local and add to it:

            hint.mmcsd.0.disabled="1"
            

            But that doesn't help with the slow boot in the BIOS.
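
If you would rather not edit the file by hand, something along these lines should add the hint without duplicating it on repeat runs. Treat it as a sketch: `ensure_loader_hint` is a hypothetical helper name, not a pfSense command, and it assumes you run it as root from a POSIX shell.

```shell
#!/bin/sh
# Append the loader hint to a file only if that exact line is not there yet.
# ensure_loader_hint is a hypothetical helper, not part of pfSense.
ensure_loader_hint() {
    file=$1
    line='hint.mmcsd.0.disabled="1"'
    # -F fixed string, -x whole-line match, -q no output
    if [ -f "$file" ] && grep -Fxq "$line" "$file"; then
        return 0
    fi
    # If the file exists and does not end in a newline, add one first
    # so the hint does not get glued onto the previous line.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
}

# On the firewall, as root:
#   ensure_loader_hint /boot/loader.conf.local
```

Any other lines already in the file are left alone.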

            Steve

            • G
              gabacho4 Rebel Alliance
              last edited by

              Since these are Lanner devices, have they been contacted regarding BIOS updates or anything? Seems they might be the last real hope since it’s their hardware, maybe?

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                Yes, we are working with them on a BIOS update now. It isn't to address this specifically but hopefully could do so.

                Steve

                • G
                  gabacho4 Rebel Alliance
                  last edited by

                  That makes me happy. Like the others, I would have assumed that whatever internal BIOS check takes place would identify, or could be told, that the eMMC wasn’t installed/working and be pointed to the M.2 SSD. That was my intended mitigation, based on that assumption, some day in the future should the internal memory ever die. Here’s to hoping they can make it happen.

                  Is this an issue that potentially afflicts EVERY Netgate router with internal memory and the M.2 expansion capability?

                  • hayescompatibleH
                    hayescompatible @stephenw10
                    last edited by

                    @stephenw10 said in SG-5100 takes over 20 minutes to boot after eMMC failure:

                    Yes, we are working with them on a BIOS update now. It isn't to address this specifically but hopefully could do so.

                    Steve

                    That sounds promising, I'm hopeful this problem will be addressed.

                    Honestly, this is what's preventing me from getting either of the 4100 or 6100 BASE models—if those don't gracefully handle failures of the eMMC either, then it's pretty much a certainty that the only way to ensure long-term reliability is to spring for the MAX versions or install an SSD yourself… in which case, there should be more prominent warnings made about the effects of an eMMC failure, how to monitor its health, more warnings in pfSense when enabling options that could potentially shorten its lifespan, etc.

                    • H
                      haraldinho @stephenw10
                      last edited by

                      @stephenw10 Hi Steve, it is really good to hear that there is a BIOS update in the works for the 5100. I would really appreciate if this specific issue could be taken along in the update. It does not seem to be a big thing to do and would really uncripple the devices of people with a dead eMMC. @stephenw10 is this something you can take up with the firmware folks and report back here?

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        I can certainly look into it.

                        • H
                          haraldinho @stephenw10
                          last edited by

                          @stephenw10 Hi Stephen, I thought to give this little hack a try and hey presto! My boot issues are gone!! I can do a warm reboot now without any issues! Speed is also good!

                          So a happy camper here! The only drawback to this solution is that you have to SSH into your box to do it and it will be wiped with a clean install. So it would still be best to solve it in BIOS or with some setting in pfSense where you can disable eMMC on Netgate boxes suffering from the issue.

                          Thanks Stephen!

                          • G
                            gabacho4 Rebel Alliance @haraldinho
                            last edited by gabacho4

                            @haraldinho what is the hack to which you refer?

                            EDIT: do you mean creating the file that Stephen mentioned earlier? If so, that is a great interim fix while they hopefully get something more permanent and formal put in place.

                            • H
                              haraldinho @gabacho4
                              last edited by

                              @gabacho4 the forum indentation is a bit confusing indeed. I replied to the post from @stephenw10 with the 'hack' and it then shows that post as a reply to Stephen's post, but also as a new post at the bottom of the whole topic, with just a small reference to the fact that it is a reply to an earlier message. Anyway, the life-saving configuration change is this:

                              Create the file /boot/loader.conf.local and add to it:
                              hint.mmcsd.0.disabled="1"

                              It is as simple as that :-)

                              The only thing I am not sure about is what happens to this setting when you do a regular upgrade to a new version of pfsense. @stephenw10 does it get preserved?

                              • hayescompatibleH
                                hayescompatible @haraldinho
                                last edited by

                                @haraldinho said in SG-5100 takes over 20 minutes to boot after eMMC failure:

                                @gabacho4 the forum indentation is a bit confusing indeed. I replied to the post from @stephenw10 with the 'hack' and it then shows that post as a reply to Stephen's post, but also as a new post at the bottom of the whole topic, with just a small reference to the fact that it is a reply to an earlier message. Anyway, the life-saving configuration change is this:

                                Create the file /boot/loader.conf.local and add to it:
                                hint.mmcsd.0.disabled="1"

                                It is as simple as that :-)

                                The only thing I am not sure about is what happens to this setting when you do a regular upgrade to a new version of pfsense. @stephenw10 does it get preserved?

                                I'll have to try that later when I'm able to reboot my 5100. Sounds promising.

                                • G
                                  gabacho4 Rebel Alliance @hayescompatible
                                  last edited by

                                  Just to make sure I understand the hack right - it’s merely telling pfSense to ignore the first of the storage options being provided by the BIOS therefore causing pfSense to turn to the M.2 right away rather than trying the eMMC. Correct? If so, this seems like something Netgate could code in as part of the pfSense config and make a toggle for in the GUI (“ignore eMMC at boot”) and we’d all happily be on our way. If coded, it would be persistent across any installation/upgrade or config restoration. Could it be that simple?

                                  • H
                                    haraldinho @gabacho4
                                    last edited by haraldinho

                                    @gabacho4 That's how I understand what this setting does indeed. And I guess this setting could be done in pfsense, however, I feel that it is a design flaw of these Lanner boxes and should be fixed in BIOS. But I would be happy with either ;-)

                                    • G
                                      gabacho4 Rebel Alliance @haraldinho
                                      last edited by

                                      @haraldinho fair enough point. I wonder if ALL of their devices with eMMC suffer from this. If so, the coding with GUI would make sense unless they can get BIOS updates for all the devices. If the 5100 is alone in suffering from this, then you’re right; a final fix from Lanner should be secured. Ultimately I just need a device that I can rely on 100%. I’m pretty spooked right now considering my very remote router could become (if not already) very dangerous to reboot or update. Yikes!

                                      • hayescompatibleH
                                        hayescompatible @haraldinho
                                        last edited by

                                        @haraldinho said in SG-5100 takes over 20 minutes to boot after eMMC failure:

                                        @gabacho4 That's how I understand what this setting does indeed. And I guess this setting could be done in pfsense, however, I feel that it is a design flaw of these Lanner boxes and should be fixed in BIOS. But I would be happy with either ;-)

                                        The boot loader hint didn't work for me but I didn't think it would. My 5100 still appears to hang after I issue a reboot from the webConfigurator. The power light stays lit (good) but so do the lights on the igb0 and igb1 ports. They do this on a cold boot too but after a few seconds, they clear and then the boot process resumes. Still seems to be a BIOS issue.

                                        FWIW, even after adding the boot loader hint that @stephenw10 suggested, pfSense/FreeBSD still sees the eMMC (via dmesg):

                                        sdhci_pci0: <Intel Denverton eMMC 5.0 Controller> mem 0xdff9a000-0xdff9afff,0xdff99000-0xdff99fff irq 16 at device 28.0 on pci0
                                        mmc0: <MMC/SD bus> on sdhci_pci0
                                        

                                        Ideally there'd be a way to disable that completely in the BIOS.

                                        • H
                                          haraldinho @hayescompatible
                                          last edited by

                                          @hayescompatible Hmm, that's a bummer. I must say I also have that same line still in my logs after the hack. So there is no difference there between our setups.

                                          NOTE
                                          After you make the modification, you need to power cycle your box once more so that the line gets loaded and takes effect. After that power cycle, warm reboots from the web configurator work again for me.

                                          Some things you can check:

                                          • If you disable "Quiet mode" in BIOS and do a boot, do you see a difference in your logs with and without the hack?

                                          For me all the references to timeouts like the one below went away in the logs (next to the fact that I can now issue a warm reboot from the webConfigurator).

                                          mmcsd0: Error indicated: 1 Timeout
                                          mmcsd0: Error indicated: 1 Timeout
                                          sdhci_pci0-slot0: Controller timeout
                                          sdhci_pci0-slot0: ============== REGISTER DUMP ==============
                                          sdhci_pci0-slot0: Sys addr: 0x06040000 | Version:  0x00001002
                                          sdhci_pci0-slot0: Blk size: 0x00005200 | Blk cnt:  0x00000010
                                          sdhci_pci0-slot0: Argument: 0x00000010 | Trn mode: 0x00000033
                                          sdhci_pci0-slot0: Present:  0x1fff0206 | Host ctl: 0x00000025
                                          sdhci_pci0-slot0: Power:    0x0000000b | Blk gap:  0x00000080
                                          sdhci_pci0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
                                          sdhci_pci0-slot0: Timeout:  0x0000000d | Int stat: 0x00000001
                                          sdhci_pci0-slot0: Int enab: 0x01ff003b | Sig enab: 0x01ff003a
                                          sdhci_pci0-slot0: AC12 err: 0x00000000 | Host ctl2:0x0000000c
                                          sdhci_pci0-slot0: Caps:     0x546ec8b2 | Caps2:    0x80000007
                                          sdhci_pci0-slot0: Max curr: 0x00000000 | ADMA err: 0x00000000
                                          sdhci_pci0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
                                          sdhci_pci0-slot0: ===========================================
                                          
                                          • Do you see the last line from the snippet below in your logs (not sure why the characters are all repeated, but that's not the issue here)? It indicates the file is loaded during boot.
                                          Loading /boot/defaults/loader.conf
                                          Loading /boot/defaults/loader.conf
                                          Loading /boot/device.hints
                                          Loading /boot/loader.conf
                                          HLLooaaddiinngg  //bboooott//llooaaddeerr..ccoonnff..llooccaall
                                          
                                          
                                          • if you ssh into your box, and enter the below command, does it then show the exact text of the hack?
                                          cat /boot/loader.conf.local
                                          

                                          It should show:

                                          hint.mmcsd.0.disabled="1"
                                          
                                          • check character set issues
                                            Double check the quote symbols. I spent many hours debugging 'wrong quote symbols' issues in the past (" and ' and `). Also carefully check the other spellings. If you copy-pasted the line, try deleting the file and recreating it by typing it manually.
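
The last two checks above can be rolled into one small script. This is only a sketch under my own assumptions: `check_hint_file` is a made-up helper name, and "character set issue" is interpreted here as "any byte outside printable ASCII and whitespace", which is what curly quotes from a copy-paste would show up as.

```shell
#!/bin/sh
# Check a loader.conf.local-style file for the exact hint line and for
# stray non-ASCII bytes (for example curly quotes from a copy-paste).
# check_hint_file is a hypothetical helper used only in this sketch.
check_hint_file() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "cannot read $file"
        return 2
    fi
    # In the C locale, anything outside printable ASCII and whitespace
    # is treated as suspect.
    if LC_ALL=C grep -q '[^[:print:][:space:]]' "$file"; then
        echo "non-ASCII bytes found in $file"
        return 1
    fi
    # The hint must appear as a whole line, with straight double quotes.
    if ! grep -Fxq 'hint.mmcsd.0.disabled="1"' "$file"; then
        echo "hint line not found in $file"
        return 1
    fi
    echo "ok"
}

# On the firewall:
#   check_hint_file /boot/loader.conf.local
```

If it reports non-ASCII bytes, deleting the file and retyping the line by hand is probably the quickest fix.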

                                          On my box, the file did not exist. I created it using vi with these commands:

                                          cd /boot
                                          sudo vi loader.conf.local
                                          

                                          I typed the hack line manually, no copy-paste. I had to google how vi works, but it is not that complicated for this simple modification. Just get vi into insert mode by pressing i, type the line, hit escape to get vi into command mode, enter :w to save the file, then hit escape again followed by :q to quit. Then do a reboot. But who am I telling, perhaps you are a Linux guru :-D

                                          • stephenw10S
                                            stephenw10 Netgate Administrator @haraldinho
                                            last edited by

                                            @haraldinho said in SG-5100 takes over 20 minutes to boot after eMMC failure:

                                            The only thing I am not sure about is what happens to this setting when you do a regular upgrade to a new version of pfsense. @stephenw10 does it get preserved?

                                            Yes, loader.conf.local is preserved across a firmware upgrade.

                                            @hayescompatible said in SG-5100 takes over 20 minutes to boot after eMMC failure:

                                            FWIW, even after adding the boot loader hint that @stephenw10 suggested, pfSense/FreeBSD still sees the eMMC (via dmesg):
                                            sdhci_pci0: <Intel Denverton eMMC 5.0 Controller> mem 0xdff9a000-0xdff9afff,0xdff99000-0xdff99fff irq 16 at device 28.0 on pci0
                                            mmc0: <MMC/SD bus> on sdhci_pci0

                                            That's the mmc bus. mmcsd0 is the first mmc device on that bus which is what has been disabled.

                                            If there was an issue with the bus or the controller you could disable those instead. Disabling only the eMMC storage device itself is the minimum you need to do to stop it trying to read it.
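
To make the three levels concrete, here is a rough sketch that prints a hint line for each one. Only the `device` line comes from this thread. The `bus` and `controller` lines are assumptions: they reuse the same `hint.<driver>.<unit>.disabled` pattern with the `mmc0` and `sdhci_pci0` names from the dmesg output quoted above, and nobody here has tested them on a 5100. The function name `mmc_disable_hint` is hypothetical.

```shell
#!/bin/sh
# Print a loader hint for the chosen level of the eMMC stack.
# "device" is the line given in this thread. "bus" and "controller" are
# ASSUMPTIONS built from the same hint pattern and the mmc0 / sdhci_pci0
# names seen in dmesg; they are untested here.
mmc_disable_hint() {
    case "$1" in
        device)     echo 'hint.mmcsd.0.disabled="1"' ;;
        bus)        echo 'hint.mmc.0.disabled="1"' ;;
        controller) echo 'hint.sdhci_pci.0.disabled="1"' ;;
        *)          echo "usage: mmc_disable_hint device|bus|controller" >&2
                    return 1 ;;
    esac
}

# Example: show the line you would put in /boot/loader.conf.local
#   mmc_disable_hint device
```

Starting with `device` and only moving up a level if errors persist keeps the change as small as possible.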

                                            Steve

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.