Netgate Discussion Forum

    Another Netgate with storage failure, 6 in total so far

    Official Netgate® Hardware
    284 Posts 36 Posters 41.7k Views
      andrew_cb @ltctech

      @ltctech Those values indicate imminent failure of your onboard storage.
      From what I have seen, the Pre EOL value always says 0x01, so it seems to be a useless indicator - the onboard eMMC might not support it.

      I would suggest installing an SSD ASAP before your device stops working, and export a backup of your configuration (Diagnostics > Backup & Restore).

      Netgate has chosen not to publish instructions for installing an SSD in the 4100 and 6100.

      Make sure you get an M.2 NVMe drive with a B+M key (2 notches). Double-check that it is NVMe: B+M-key NVMe drives are hard to find, and most B+M-key drives are SATA.

      This video shows how to disassemble the device.

      Clear the onboard storage

      Reinstall pfSense

        dane_h @andrew_cb

        @andrew_cb I'm thankful I stumbled onto this thread. My 6100 is not a MAX model, and it's over 3 years old. Purchased right after they came out.

        I'm just going to replace the storage immediately as it's clear it's very probably EOL. I'm comfortable with the process you've outlined, but I'm not clear on exactly which NVMe M.2 2280 SSD to get. Can you please link one from Amazon that would work?

        Would this KingSpec work? I saw higher up that this is what someone else bought, although there are several.

          stephenw10 Netgate Administrator

          That drive would not work because it's m.2 SATA. It must be an NVMe drive for 4100 or 6100.

            punting_packets @dane_h

            @dane_h I used an Intel Optane 16GB drive https://www.ebay.co.uk/itm/395684843954

              dane_h @punting_packets

              @punting_packets That appears to be the same as this one on Amazon.

                punting_packets @dane_h

                @dane_h Yep, I'd agree.

                  Cabledude

                  These eMMC topics have been here for a while. I’ve had two SG-1100 units fail on eMMC, one of which I fixed using an old SSD in a USB enclosure.

                  It was enough for me to order a Max model (SG-2100 with 128GB SSD preinstalled by Netgate), just to stay on the safe side of things.

                  Pete
                  Home: SG-2100 + UniFi + Synology. SG-1100 retired
                  Parents: SG-1100 + UniFi + Synology
                  Testing: SG-1100 w/ 120GB SSD via ext USB (eMMC dead). Works great

                    dstaylor @dane_h

                    @dane_h I ordered and installed this one, up and running. Price about the same.

                    https://www.amazon.com/dp/B08TTDQ5WH?ref=fed_asin_title&th=1

                      andrew_cb @dane_h

                      @dane_h I have not used this drive, but it should work: https://www.amazon.com/KingSpec-256GB-Performance-Internal-Ultrabook

                      edit: I checked and it's the same one @dstaylor linked lol

                        dennypage @dstaylor

                        @dstaylor FWIW, I use the same one.

                          ltctech @andrew_cb

                          @andrew_cb
                          I am beyond frustrated with Netgate. The whole point of buying Netgate, as opposed to using cheap Mini PCs and installing pfSense, was to avoid these kinds of surprises.

                          This particular unit is installed at an office with no IT staff on the other side of the world. We might have to send them a new unit and getting it swapped out may be a challenge.

                          We'll have to audit the other units that we have deployed (thankfully stateside) and see which ones are eMMC and which are SSD.
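An audit like this could be partly scripted. The sketch below is only an illustration under stated assumptions: it takes the disk names each unit reports (for example, collected by hand from `sysctl -n kern.disks` on each box) and sorts them by the usual FreeBSD device-name prefixes (`mmcsd` for eMMC, `nvd`/`nda` for NVMe, `ada` for SATA, `da` for USB/SCSI). The hostnames and the `inventory` structure are hypothetical, and the prefix mapping should be checked against what your units actually show.

```python
import re

# Assumed mapping from FreeBSD disk device names to storage type.
# Order matters only for readability; fullmatch keeps "nda0" from matching "da".
PATTERNS = [
    (r"mmcsd\d+\w*", "eMMC"),      # also covers names like mmcsd0boot0
    (r"(nvd|nda)\d+", "NVMe"),
    (r"ada\d+", "SATA"),
    (r"da\d+", "USB/SCSI"),
]


def classify(device):
    """Return the storage type suggested by a disk device name."""
    for pattern, kind in PATTERNS:
        if re.fullmatch(pattern, device):
            return kind
    return "unknown"


def audit(inventory):
    """Map each host to the sorted set of storage types it reports.

    inventory: dict of hostname -> space-separated disk names.
    """
    return {
        host: sorted({classify(d) for d in disks.split()})
        for host, disks in inventory.items()
    }


def emmc_only(inventory):
    """Hosts whose only detected storage is eMMC (candidates for an SSD)."""
    return sorted(h for h, kinds in audit(inventory).items() if kinds == ["eMMC"])


# Hypothetical inventory, as if pasted from each unit.
inventory = {
    "office-a": "mmcsd0",
    "office-b": "nvd0 mmcsd0",
    "office-c": "ada0",
}
print(audit(inventory))
print("eMMC only:", emmc_only(inventory))
```

Units that list both an SSD and `mmcsd0` still have the onboard eMMC fitted, so which device pfSense is actually installed on needs a separate check.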

                            Cabledude

                            @andrew_cb and @stephenw10 Some questions:
                            #1 Is every eMMC-equipped Netgate prone to this, or have there been only a limited number of occurrences?
                            #2 Does the eMMC production series have any influence, or is it simply more writes = issue?
                            A good friend of mine is running a 4100 base. He believes he’s fine regarding the eMMC issue because he doesn’t do much logging. I don’t believe he even checks his eMMC health periodically; he’s not concerned about it.

                              SteveITS Galactic Empire @Cabledude

                              @Cabledude said in Another Netgate with storage failure, 6 in total so far:

                              doesn’t do much logging

                              It's very relative, hence my list of mitigating settings above. The default deny rules log. Is pfSense behind an ISP router that blocks incoming traffic? Suricata logs HTTP requests. Some people leave the dashboard open, which logs every web request for each widget update. pfBlocker DNSBL logs DNS requests, and a few feeds like UT1 are gigantic.

                              My 2100 at home is from October 2020 and it shows 10% used:
                              eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x01
                              eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x01
                              eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
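For anyone reading these values on their own unit, here is a rough decoder. It is a sketch, not an official tool: the code-to-percentage mapping follows the JEDEC eMMC 5.x definition of these EXT_CSD fields (0x01 = 0-10% of estimated life used, up to 0x0A = 90-100%, 0x0B = exceeded), and the function names are made up for illustration. As noted earlier in the thread, the Pre EOL field may not be meaningful on every part.

```python
import re

# Pre-EOL codes as defined for EXT_CSD_PRE_EOL_INFO in the eMMC 5.x spec.
PRE_EOL = {
    0x01: "normal",
    0x02: "warning (80% of reserved blocks consumed)",
    0x03: "urgent (90% of reserved blocks consumed)",
}

# Matches lines such as:
# eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x01
LINE_RE = re.compile(r"\[(EXT_CSD_[A-Z_]+)\]:\s*(0x[0-9a-fA-F]+)")


def life_used(code):
    """Translate a life-time estimation code into a wear range."""
    if 0x01 <= code <= 0x0A:
        return f"{(code - 1) * 10}-{code * 10}% used"
    if code == 0x0B:
        return "exceeded estimated life"
    return "not defined"


def decode(report):
    """Decode the health lines found in a block of report text."""
    result = {}
    for name, value in LINE_RE.findall(report):
        code = int(value, 16)
        if name == "EXT_CSD_PRE_EOL_INFO":
            result[name] = PRE_EOL.get(code, "not defined")
        elif name.startswith("EXT_CSD_DEVICE_LIFE_TIME_EST"):
            result[name] = life_used(code)
    return result


sample = """
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x01
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x01
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
"""
for field, meaning in decode(sample).items():
    print(field, "->", meaning)
```

Each code covers a 10% band, so 0x01 means somewhere between 0% and 10% used rather than exactly 10%.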

                              eMMC (as a technology) has fewer "disk writes per day" than SSD. It is also usually much smaller, so writing (completely making up a number here) 5 GB per day has far more impact on an 8 GB eMMC than on a 128 GB SSD. That, overall, is the point of this thread.
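To put rough numbers on the size effect, here is a back-of-the-envelope sketch using the same made-up 5 GB per day. The 3000 program/erase cycles and the write amplification of 1.0 are placeholder assumptions, not figures for any Netgate part; the absolute years are therefore meaningless, and only the ratio between the two capacities is the point.

```python
def drive_fills_per_day(daily_writes_gb, capacity_gb):
    """How many times per day the whole device is effectively rewritten."""
    return daily_writes_gb / capacity_gb


def years_to_wear_out(daily_writes_gb, capacity_gb,
                      pe_cycles=3000, write_amplification=1.0):
    """Idealized lifetime: total writable data divided by the daily write volume.

    pe_cycles and write_amplification are hypothetical placeholders.
    """
    total_writable_gb = capacity_gb * pe_cycles / write_amplification
    return total_writable_gb / daily_writes_gb / 365


DAILY_GB = 5  # the made-up figure from the post above

for capacity in (8, 128):
    fills = drive_fills_per_day(DAILY_GB, capacity)
    years = years_to_wear_out(DAILY_GB, capacity)
    print(f"{capacity:>3} GB: {fills:.4f} fills/day, ~{years:.1f} idealized years")

ratio = years_to_wear_out(DAILY_GB, 128) / years_to_wear_out(DAILY_GB, 8)
print(f"128 GB lasts about {ratio:.0f}x longer than 8 GB for the same writes")
```

With equal per-cell endurance assumed, the lifetime scales linearly with capacity, so the 128 GB device comes out 16 times ahead whatever cycle count is plugged in.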

                              Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                              When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
                              Upvote 👍 helpful posts!

                                stephenw10 Netgate Administrator

                                Yup, the larger the drive, the fewer writes each individual 'bit' sees for a fixed total of drive writes. So larger drives are less affected.

                                  arri

                                  Wildly speculating about the 4100: it appears that enough damage to the eMMC can brick the boxes too!
                                  One of mine won't POST now, which precludes installing NVMe at all. The LEDs on the board indicate activity on one flash drive after the reset indicator flickers, but there is no console output and it never gets past the orange circle of death. This is even after pulling the CMOS battery, NVMe, etc.

                                    stephenw10 Netgate Administrator

                                    Of the confirmed eMMC failures we've seen, most do not fail like that. In fact, I don't think we've seen a single failure that presented like that in person. One user here on the forum reported that removing the eMMC chip allowed the unit to boot from NVMe, but that is so far unconfirmed. So it could be some other failure.

                                      arri @stephenw10

                                      @stephenw10 Figures I'd be a unicorn. To be fair, I suspect once a unit is deemed a brick, they probably seldom make it back to your bench from customers unless they're in the short window.

                                        stephenw10 Netgate Administrator

                                        Indeed if it fails to POST entirely it's difficult to confirm any sort of cause.

                                          arri @stephenw10

                                          @stephenw10 But hey, thanks for the reminder this box is now into the "can't hurt to try" zone where I get to play with solder! Now to go find my magnifier and a certain flash chip ;)

                                            andrew_cb

                                            This morning, I was surprised to find that my threads in /r/pfsense and /r/netgate have been deleted. Fortunately, I still have screenshots of an interesting post from kphillips-netgate (@kphillips), in which he says:

                                            ...I think it's important to err on the side of letting people discuss things without overbearing moderation unless it becomes necessary...

                                            Interesting. Can you find out why both Reddit threads were deleted and who made that decision?

                                            ...[I] process RMA support tickets for devices every day...
                                            ...Beware of confirmation bias...

                                            These are good points to keep in mind as they tie into his next statement:

                                            I haven't seen any particularly unusual numbers of RMAs for any particularly product in our lineup.

                                            What specifically is considered an RMA? Is this all RMA claims, or only claims that have been accepted under warranty? What is the ratio of approved RMAs to RMAs denied because they are past the 1-year warranty? It appears that a significant number of posts are about devices that failed after the 1-year warranty, and many users have had their warranty claim rejected or did not bother contacting Netgate since they were out of warranty. This would significantly alter the number and ratio of RMA claims.

                                            ...I'm not trying to admonish or belittle anybody here...
                                            ...sometime's it's totally by accident and I'm not trying to "blame shift" here...

                                            I am glad to hear that you personally are not trying to admonish, belittle, or blame shift, but...

                                            ...You will only see people who ran into issues...

                                            It seems that others do not share your view of giving the user the benefit of the doubt. In all the posts made by users who encountered storage failure, what we do see is Netgate consistently blaming the user for causing the storage failure, and never apologizing or showing empathy.

                                            We have a page outlining many of the packages that need an SSD here (linked to the page).

                                            I have repeatedly mentioned that this page is NOT linked anywhere. The only way you can be aware of its existence is by searching or following a link from someone. It is impossible for a purchaser to be aware of a) storage wear caused by high writes, b) what packages and settings can cause high writes, and c) the decision criteria for choosing a MAX model. If anyone can show me a direct link to the "Supported pfSense Plus Packages" page on the Netgate website, I will be forever humbled.

                                            I run a 6100 as my edge for work at Netgate with on eMMC (no NVME SSD installed) and it gets worked.....HARD. It's my only 6100 I have and I use it for new release building, bug testing, package testing, and much more. It has been in continuous operation for about 3 years with little to know downtime. [screenshots showing eMMC health at 0x05, 0x06, 0x01, and a manufacturing date of 06/2022]

                                            These VERY interesting data points are provided to us by a Netgate employee. Remember what another Netgate employee @jwt said in post #34?

                                            ...the principle difference between eMMC and NVMe or SSD device is the amount of flash present on a typical eMMC .vs SSD or NVMe drive...

                                            Let us analyze these statements further.

                                            jwt asserts that there is no significant difference between eMMC, SSD, or NVMe other than the total amount of flash. Okay, so then eMMC is not a technologically inferior storage medium.

                                            Now, kphillips helpfully provides some evidence of how he has a 6100 with eMMC storage and he has worked it "HARD" for 3 years and yet the storage wear is only at 50-60%.

                                            This is where things start getting confusing...

                                            If the statements made by jwt are correct and the data supplied by kphillips is true and representative of the durability of Netgate devices with eMMC storage, then why are so many users experiencing eMMC failure in under 3 years? And why is there a presumption that the user is at fault for causing the failure? And why does Netgate never express concern about these "rare" incidents? If kphillips were to create a post about his 6100 dying under these conditions, I wonder what kind of response he would receive? Is kphillips trying to prove that the 16GB of onboard eMMC storage actually is fine when "worked HARD"?

                                            On the other hand, if kphillips' data is simply anecdotal and does not represent what a user can expect from the onboard eMMC storage, then it is irrelevant to this discussion.

                                            So, which is it?
                                            Is eMMC endurance similar to SSD/NVMe, so that Netgate devices with eMMC should be durable enough to handle package usage as kphillips demonstrated, and the failed devices were simply equipped with faulty eMMC? In that case the users did nothing wrong, and it is simply tough luck that the failure occurred after the 1-year warranty expired;
                                            --OR--
                                            Is the 16GB of onboard eMMC storage a known weak point that needs to be clearly identified, with the user given abundant warnings on the product page, on the package manager page, whenever installing a package, and on the settings page of each package?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.