Netgate Discussion Forum

    Another Netgate with storage failure, 6 in total so far

    Official Netgate® Hardware
    309 Posts 41 Posters 109.5k Views 38 Watching
    • M Offline
      michmoor LAYER 8 Rebel Alliance @andrew_cb
      last edited by

      @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

      Would it be correct to assume that you learned these issues the hard way and have experienced storage failures in the past before switching to only MAX versions?

      I have been a pfSense user for quite some time. I am on the forums here and on Reddit. The countless tales of unreliable eMMC storage are as old as time, so once I went the MSP route I knew, based on other users' experiences, what not to do.

      Should there be a warning in the marketing? I don't know...eMMC may work really well depending on the deployment. Arista 7050CX3 switches have eMMC storage. Enterprise-grade vendor putting in crappy storage. Then again, there isn't heavy writing to the storage on a switch, but I am just trying to illustrate that putting these parts in a networking device isn't uncommon. As I mentioned, I have shoved 1100s in a corner at a cafe with no issues for years. I also tune the logging down significantly.

      Firewall: NetGate,Palo Alto-VM,Juniper SRX
      Routing: Juniper, Arista, Cisco
      Switching: Juniper, Arista, Cisco
      Wireless: Unifi, Aruba IAP
      JNCIP,CCNP Enterprise

      • S Offline
        SteveITS Galactic Empire @andrew_cb
        last edited by

        @andrew_cb it’s not a product page but I think you’re asking for https://www.netgate.com/supported-pfsense-plus-packages

        FWIW I don’t recall that we’ve ever had storage failure at any of our clients. Obviously, situations/setups can differ.

        Also maybe useful for readers:
        https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-lifetime.html

        Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to reboot, or more depending on packages, CPU, and/or disk speed.
        Upvote 👍 helpful posts!

        • GertjanG Offline
          Gertjan @andrew_cb
          last edited by Gertjan

          @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

          My main gripe is the complete lack of information, warnings, or disclaimers prior to purchasing and during general usage. and there is no way for a reasonable person to know about the risks with the onboard eMMC storage until it is too late.

          Very true.
          To reuse (not your) words: within 10 years, it will be known that "eMMC" isn't the best choice for very write-active OSes. eMMC will probably join Realtek on the "don't use these - period" list.

          Btw, what is general usage ?
          In the past, for a firewall, it was this (example) :

          Using username "root".
          Authenticating with public key "rsa-key-20230516-pfsense"
          Passphrase for key "rsa-key-20230516-pfsense":
          Netgate 4100 - Serial: 2014221462 - Netgate Device ID: e57dfbeef5a2527a
          
          *** Welcome to Netgate pfSense Plus 24.11-RELEASE (amd64) on pfSense ***
          
            Current Boot Environment: 24.11-Release
               Next Boot Environment: 24.11-Release
          
           WAN (wan)     -> ix3    -> v4/DHCP4: 192.168.10.4/24
                                      v6/DHCP6: 2a01:beef:907:a600:92ec:77ff:fe29:392a/64
           LAN (lan)     -> igc0   -> v4: 192.168.1.1/24
                                      v6/t6: 2a01:beef:907:a6eb:92ec:77ff:fe29:392c/64
           IDRAC (opt1)  -> igc2   -> v4: 192.168.100.1/24
           PORTAL (opt2) -> igc1   -> v4: 192.168.2.1/24
           VPNS (opt3)   -> ovpns1 -> v4: 192.168.3.1/24
          
           0) Logout / Disconnect SSH            9) pfTop
           1) Assign Interfaces                 10) Filter Logs
           2) Set interface(s) IP address       11) Restart GUI
           3) Reset admin account and password  12) PHP shell + Netgate pfSense Plus tools
           4) Reset to factory defaults         13) Update from console
           5) Reboot system                     14) Disable Secure Shell (sshd)
           6) Halt system                       15) Restore recent configuration
           7) Ping host                         16) Restart PHP-FPM
           8) Shell
          
          Enter an option:
          

          and from then on the system was idle - doing close to nothing (edit : wrong : it makes stats in the background)

          These days, the new normal (example) :

          [Image: 0952bd78-dc29-4699-b814-6c7028745b2e-image.png]

          and some of us wonder who picked the colors ... me, I wonder where and how all this info is stored.
          Before, with our extremely dumb ISP router with 16 Kbytes of (bios ?) vram, I didn't bother. That router contained no or very few settings and what the heck, the ISP replaces it after one phone call.
          But I didn't have these sophisticated stats either.

          It all boils down to : what, where and when is all this backed up ? Where is it stored ?
          Look for it, and you'll see the time stamps of all those files, their sizes ... and then you start to dig it : "that is the price to pay" these days : useful, less useful or pure gadget, it all needs megabytes to store its stuff.
          That stuff gets rewritten. All the time.

          And yeah, being here on this forum for a while, and you know this :
          You want a Netgate device with hot-swappable dual (because RAID 1 !) old iron Seagate red label platter-based drives .... Like my NAS. No new-tech SSD drives which force me to count write cycles.
          Ok, it will consume 0,10 Kwh for sure.
          So, ok, plan B : where is it stored ? And can I replace it without doing SMD-like soldering ?
          Maybe the real question : is it repairable ?
          (edit : I also want a double power unit - as all my servers - re edit : wait : Dual HA WAN/LAN ?!).

          Anyway, @andrew_cb, keep us posted. As I said elsewhere : you can probably add/replace that broken eMMC. You wind up having the MAX, so you can throw your 4100 back in business and come back in ... 10 years ?

          No "help me" PM's please. Use the forum, the community will thank you.
          Edit : and where are the logs ??

          • bmeeksB Offline
            bmeeks
            last edited by

            Besides the other points raised about eMMC wear caused by logging and/or the data archiving of the fancy Dashboard graph widgets, another thing to consider is that if you have a ZFS install you are automatically going to experience much more background disk writes from ZFS as compared to the old UFS.

            ZFS makes regular writes to the disk as part of its normal operation. And it makes quite a bit more of those than UFS does. That's where the resiliency of ZFS comes from. But that resiliency has a price on eMMC or cheaper SSD devices. Just ZFS by itself probably adds a small incremental boost to eMMC wear, but combine that with extensive application and graph widget logging and things can escalate fast.

            • A Offline
              andrew_cb @bmeeks
              last edited by

              @bmeeks said in Another Netgate with storage failure, 6 in total so far:

              Besides the other points raised about eMMC wear caused by logging and/or the data archiving of the fancy Dashboard graph widgets, another thing to consider is that if you have a ZFS install you are automatically going to experience much more background disk writes from ZFS as compared to the old UFS.

              ZFS makes regular writes to the disk as part of its normal operation. And it makes quite a bit more of those than UFS does. That's where the resiliency of ZFS comes from. But that resiliency has a price on eMMC or cheaper SSD devices. Just ZFS by itself probably adds a small incremental boost to eMMC wear, but combine that with extensive application and graph widget logging and things can escalate fast.

              So just enabling some basic logging/dashboard features, combined with ZFS (which has been the default filesystem for a while now) is enough to shorten the lifespan of a Netgate with eMMC storage?

              How is someone supposed to know about these issues? (@bmeeks I know you're not a Netgate employee)

              Reading through an example of eMMC longevity calculations from here yields some concerning numbers:

              Workload description
              84% Sequential write, 16%Random write
              Chunk Size IOs Distribution: 30%: 4KB, 27%: 16KB, 42%: Mix of 8KB, 32KB-256KB, 1%: 512KB
              eMMC Cache on
              specific eMMC device specs (from datasheet):
              MLC device
              physical capacity = 0.0074(TB) for 8GB device
              endurance cycle = 3000 for MLC
              Write Amplification Factor (WAF):
              WAF = 4.5 (estimated from the workload description above with simulation)
              TBW = physical capacity * endurance cycle / WAF
              TBW = 0.0074(TB) * 3000(cycles) / 4.5(WAF) ~= 5.0 TBW
              5.0 Terabytes is the total amount of data written to the device during its lifetime of use, depending on the workload.
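The TBW formula in the excerpt above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `tbw_terabytes` is made up for illustration, and the inputs are the example datasheet values quoted above, not measurements from any Netgate device.

```python
def tbw_terabytes(physical_capacity_tb: float, endurance_cycles: int, waf: float) -> float:
    """Terabytes written over the device lifetime.

    TBW = physical capacity * endurance cycles / write amplification factor
    """
    if waf <= 0:
        raise ValueError("write amplification factor must be positive")
    return physical_capacity_tb * endurance_cycles / waf


# Values from the quoted example: 8GB MLC device, 3000 cycles, WAF of 4.5.
example_tbw = tbw_terabytes(0.0074, 3000, 4.5)
print(f"{example_tbw:.2f} TBW")  # prints "4.93 TBW", which the excerpt rounds to ~5.0
```

Because WAF sits in the denominator, a workload with a higher write amplification than the assumed 4.5 shrinks the usable TBW proportionally.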
              

              Taking the 5.0TB writable and calculating for 3 years of life gives us:
              5.0TB / 1095 days = 4.57GB per day.
              4.57GB / 24 hours = 190MB per hour.
              190MB / 60 minutes = 3.17MB per minute.
              3.17MB per minute = 53KB per second.

              So you must write no more than an average of 53KB per second in order to get 3 years of life. Writing at 79KB per second would use up the budget in 2 years, and 159KB per second in 1 year.

              Being generous and increasing the numbers 10-fold still only leaves a maximum of 530KB per second to last 3 years.
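For anyone who wants to redo this budget with their own numbers, here is a minimal Python sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 KB = 1000 bytes) and 365-day years; the function name is hypothetical. The allowed average rate scales inversely with the target lifetime.

```python
SECONDS_PER_DAY = 86_400


def max_write_rate_kb_per_s(tbw_tb: float, lifetime_years: float) -> float:
    """Average write rate (KB/s) that spends tbw_tb terabytes over lifetime_years."""
    total_bytes = tbw_tb * 1e12
    lifetime_seconds = lifetime_years * 365 * SECONDS_PER_DAY
    return total_bytes / lifetime_seconds / 1e3


# 5.0 TBW is the figure from the example eMMC calculation above.
budgets = {years: max_write_rate_kb_per_s(5.0, years) for years in (1, 2, 3)}
for years, rate in budgets.items():
    print(f"{years} year(s): {rate:.0f} KB/s")
# 1 year(s): 159 KB/s
# 2 year(s): 79 KB/s
# 3 year(s): 53 KB/s
```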

              The purpose of buying a device instead of building your own is that the manufacturer is supposed to take care of choosing the correct components so that you do not have to. If the expectation is that a user must spend countless hours and years of testing and research in order to understand how to get the device to work properly, then it is far easier and more cost-effective to simply purchase a competitor's device and just pay the yearly fees.

              Now I may sound bitter (and I am at the moment), but I genuinely want to provide feedback to Netgate that they can use to make changes so that others do not experience the same challenges that I am currently experiencing with device failures (and the grim prospect that more will likely fail).

              With all the complexities of changing the behavior of pfSense and various packages, I believe the easiest way to avoid future problems is to either change to a more robust storage medium in the BASE versions or at least make the limitations of eMMC storage abundantly clear.

              The product pages make no differentiation between the BASE and MAX versions other than increased capacity. eMMC and NVMe are mentioned, but I suspect that very few people are aware of the critical differences. After all, if Netgate chose eMMC and no further details are given, then it must not be important, right? A blurb that explains eMMC vs NVME, along with a table listing use cases/packages where the MAX version is recommended, would be a huge benefit to both users and Netgate and a great way to upsell the MAX version.

              • bmeeksB Offline
                bmeeks @andrew_cb
                last edited by

                @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                With all the complexities of changing the behavior of pfSense and various packages, I believe the easiest way to avoid future problems is to either change to a more robust storage medium in the BASE versions or at least make the limitations of eMMC storage abundantly clear.

                I definitely agree with you here. eMMC technology was initially way cheaper than NVMe drives, and that likely drove the decision to use that more than any other factor. I think the disparity has decreased some with the proliferation of NVMe choices now, and NVMe seems a more solid and long-term reliable solution.

                • M Offline
                  michmoor LAYER 8 Rebel Alliance @andrew_cb
                  last edited by

                  @andrew_cb

                  I agree with all your points, to be frank. I personally held the belief that offering a BASE version of any model up to the 6100 is silly, especially when viewed from the price perspective, where it's a ~100 bucks difference. For an extra $100 you get better storage. If the price difference is negligible, then why offer the Base to begin with?
                  In my opinion, you should pay more for better performance in terms of CPU or memory. Look at the 4200 as an example. Base and MAX are exactly the same in terms of specs except storage. I find it hard to believe the NVMe is an additional $100, but even if true, I still have the same performance. I can see why there are people who would select the Base SKU, but Netgate is doing more harm than good when a cheap, unreliable drive goes bad in their own flagship product, which happens quite often if forum posts are to be believed. Just offer the MAX version, which in reality is the Base version. That's it. One SKU per product. They do it with the 8200.

                  Depending on the deployment, I would go white box. Grab the pfSense+ license, which is very cost effective, and deploy a Dell or HP 1RU system, which would be far more reliable and more robust.

                  • A Offline
                    andrew_cb
                    last edited by andrew_cb

                    Imagine you work for a busy company that is trying a different brand of delivery trucks for its fleet that operates 24/7. The delivery trucks work well and employees like them, so the company buys many more over the next few years. At first, there were no available options, but the brochure for recent models lists two axle options: BASE or MAX. The only difference mentioned is that the BASE axle is 6-lug and the MAX axle is 8-lug.

                    Recently, the factory-equipped axles have begun failing much sooner than those of other truck brands and previous delivery trucks you have owned. Now, a sixth truck is stuck at the side of the road waiting for a tow, missing another important delivery and losing your customer's confidence.

                    You begin researching axle failures and this truck model. You find that the BASE model axles are similar to the axles used on passenger vehicles. They are fine for driving around town with light loads, but highway driving, adding additional equipment, or carrying additional weight causes the axles to wear internally and fail prematurely. Changing to the MAX axle requires a truck to be out of service for 2 days. You find that other companies using these trucks are having the same problem, and the manufacturer even recommends removing the spare tire, jack, radio, passenger seat, bumpers, and mud flaps to reduce weight and extend the life of the BASE axle.

                    You might think, "Ridiculous!" One of the main reasons we bought these trucks is that they support a wide variety of aftermarket accessories that allow the trucks to be customized to significantly improve their functionality, as shown in the glossy brochures.

                    You wonder, "Since these are sold as delivery trucks, why would the manufacturer use inferior axles without making it clear that the MAX axle option is what most customers will need to use the trucks for anything but the lightest of tasks?"

                    The MAX model was a $1000 option that seemed unnecessary at the time, but each premature axle failure costs you $10,000 in towing charges, labor, equipment rentals, customer goodwill, and vehicle repairs. Replacing the axles before they fail will cost $7000 of lost revenue, parts, and labor per truck. You look out your office at the 40 trucks in your loading yard and wonder how you will get through this situation.

                    • A Offline
                      andrew_cb
                      last edited by andrew_cb

                      So, things keep getting worse. I put together some scripts to run mmc and parse the health data. I will just let the data speak for itself:

                      [Image: netgate_health_blurred.png]

                      Of 33 devices, 10 are over 100% Type A wear, and 8 are over 100% Type B wear.
                      Strangely, all are reporting Pre-EOL of 0x01.
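The scripts themselves are not shown in this thread, so the following is only a rough Python sketch of how such health data could be decoded. It assumes the text comes from `mmc extcsd read` (mmc-utils) and that the relevant lines look like the `SAMPLE_OUTPUT` below; both the sample text and the function names are illustrative assumptions, not the scripts used above. The value meanings follow the JEDEC eMMC 5.x definitions as I understand them: life time estimates count in 10% steps (0x0B means the estimate has been exceeded), while Pre-EOL reflects consumption of reserved blocks, so it can still read 0x01 (normal) on a device whose wear estimate is already past 100%.

```python
import re

# Assumed sample of the relevant lines; real output contains many more fields.
SAMPLE_OUTPUT = """\
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x0b
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x03
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
"""

FIELD_RE = re.compile(
    r"\[EXT_CSD_(DEVICE_LIFE_TIME_EST_TYP_[AB]|PRE_EOL_INFO)\]:\s*(0x[0-9a-fA-F]+)"
)

# Pre-EOL tracks reserved-block consumption, not the wear estimate.
PRE_EOL_STATES = {1: "normal", 2: "warning", 3: "urgent"}


def describe_life_time(value: int) -> str:
    """Translate a life time estimation register value into a wear range."""
    if 1 <= value <= 10:
        return f"{(value - 1) * 10}-{value * 10}% of estimated life used"
    if value == 11:
        return "exceeded estimated life"
    return "not defined"


def parse_health(text: str) -> dict:
    """Extract and decode the three health fields from the given text."""
    raw = {name: int(value, 16) for name, value in FIELD_RE.findall(text)}
    return {
        "type_a": describe_life_time(raw.get("DEVICE_LIFE_TIME_EST_TYP_A", 0)),
        "type_b": describe_life_time(raw.get("DEVICE_LIFE_TIME_EST_TYP_B", 0)),
        "pre_eol": PRE_EOL_STATES.get(raw.get("PRE_EOL_INFO", 0), "not defined"),
    }


health = parse_health(SAMPLE_OUTPUT)
for field, meaning in health.items():
    print(f"{field}: {meaning}")
```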

                      That's a failure rate of 30%, and if we include the 6 devices that have already failed, that brings it up to a 40% failure rate.
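The percentages can be reproduced as below. This sketch assumes the 6 already-failed devices are not part of the 33 surveyed units, so they are added to both the numerator and the denominator; if they were counted differently, the second figure changes.

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100 * part / whole


surveyed = 33          # devices with readable health data
over_100_type_a = 10   # devices past 100% Type A wear
already_failed = 6     # assumed to be in addition to the 33 surveyed

worn_rate = percent(over_100_type_a, surveyed)
combined_rate = percent(over_100_type_a + already_failed, surveyed + already_failed)

print(f"worn only: {worn_rate:.1f}%")          # worn only: 30.3%
print(f"worn + failed: {combined_rate:.1f}%")  # worn + failed: 41.0%
```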

                      There are hundreds of discussions about storage failures in Netgate devices. It seems that most are personal users who are willing to accept this and install an SSD, but for a business with dozens of devices, this is simply unacceptable for a 2-year-old device.

                      Okay, what if a user wants to install an SSD in their existing 4100 or 6100? No problem, right, since the 6100 product page clearly states:

                      Physical Expansion Card Slots: 2x m.2 (Key-B slot) with dual-SIM (LTE, Wi-Fi, or NVMe) (PCIe, USB 2.0, USB 3.0)
                      

                      But wait, there are NO published instructions for installing an SSD, and Netgate staff say it is not possible/supported/recommended!

                      The warranty is only 1 year. Does Netgate even track the failure rate of devices after the warranty runs out? Clearly, a 30-40% failure rate cannot be normal or acceptable.

                      The 6100 has only been out for 3.5 years, and the 4100 has only been out for 3 years. Why are so many users experiencing storage failures? There are even posts of 4200's with eMMC failure - these are 9 months old at the most (and have no way to monitor the storage)!
                      Either:

                      • Users are using the product wrong (according to what?).
                      • The BASE version is inferior and not capable of doing what is advertised.

                      Using the information that is reasonably prominent to a purchaser, the only conclusion is that "The BASE version is inferior and not capable of doing what is advertised." and further, is unfit for anything other than using the default settings.

                      Multiple years of experience with hardware failures using pfSense, notes tucked away in package documentation, documentation unintuitively filed under "troubleshooting", and the differences between eMMC and SSD are not things that a regular purchaser would be aware of.

                      The whole point of buying a device from Netgate is to AVOID having to meticulously research hardware specs, particularly for obscure things like eMMC storage device lifetime which is not generally available.
                      It is very misleading to offer the BASE version when it can only do 10% of the advertised features.

                      The 8200 and 8300 only come in a single version and only have NVMe storage. Why is this? They run the same software, the same packages, the same default config. Are they doing something special that requires more than eMMC can handle?

                      So, to summarize:

                      • There is no mention or warning about the limitations of eMMC storage in the BASE version.
                      • The product page makes no recommendation to get the MAX version to use the advertised features.
                      • The product page misleadingly states "No artificial limits or add-ons required to make your system fully functional". This does not apply to the BASE version, since anything more than the default configuration risks premature storage failure.
                      • The product pages make no mention that the BASE version cannot be upgraded to the MAX version. When the return period runs out 30 days after purchase, the user is stuck with an expensive device that cannot be upgraded and cannot be used to its full, advertised potential.
                      • Failure rates of the BASE versions can be 30-40% or more, depending on the packages used.

                      How can I contact someone at Netgate to discuss this further? I think this is a serious issue and the product pages and documentation need to be updated to clearly distinguish the limitations of the BASE versions and prevent further confusion and premature device failure.

                      • keyserK Offline
                        keyser Rebel Alliance @andrew_cb
                        last edited by

                        @andrew_cb I fully agree with and understand your situation.
                        I luckily discovered the issue 3 years back before my devices died of wear-out, and installed an SSD myself.
                        I created a thread (https://forum.netgate.com/topic/170128/emmc-write-endurance) on this forum, clearly identifying the potential problem and encouraging people to dial down the write intensity of packages and firewall rules. At that time pfBlockerNG had an issue causing it to write in an endless loop, so the figures were really bad at the time. But even after that was corrected, it is still a BIG problem on basic installs.

                        But Netgate kept the eMMC models around and have still not opted into setting up RAM disk as default on those devices (which is needed now).
                        So I have been expecting this to turn into a bigger problem at some point.

                        Not that it helps you or other customers that have dead devices, but I fully agree with you, and you have my sympathy with your current situation :-(

                        Love the no fuss of using the official appliances :-)

                        • M Offline
                          mcury Rebel Alliance @keyser
                          last edited by

                          I got an SG-4100 (not the MAX) and the first thing I did was install an NVMe.
                          Since I was a SG-3100 user for a long time, I was already aware of the eMMC lifespan, but I'm pretty sure that new users won't be aware of this.

                          One suggestion to Netgate would be, give the user more options in the shop, with a warning and a link to the docs.

                          Cheaper variant: SG-4200 with eMMC storage (Read about eMMC lifespan here).
                          20 bucks more expensive than cheaper variant: SG-4200 with a 128GB nvme (not enterprise nvme).
                          SG-4200 MAX (enterprise nvme).

                          This would help users during the variant selection, more options for buyers and a warning so users can be prepared in case they get the emmc only variant.

                          dead on arrival, nowhere to be found.

                          • GertjanG Offline
                            Gertjan @keyser
                            last edited by

                            @keyser said in Another Netgate with storage failure, 6 in total so far:

                            I luckily discovered the issue 3 years back before my devices died of wear-out, and installed an SSD myself.
                            I created a thread (https://forum.netgate.com/topic/170128/emmc-write-endurance) on this forum, clearly identifying

                            👍
                            That was one of the forum posts I read and used when I had to decide which 4100 to take.
                            The elephant mentioned over there (== ZFS) wasn't listed here as a package. I found out what 'ZFS' does for a living ...... and I had my answer straight away.

                            @andrew_cb : Great write-up. It will give future potential Netgate appliance buyers very useful info (if they look for it ...).

                            • bmeeksB Offline
                              bmeeks @Gertjan
                              last edited by

                              @Gertjan said in Another Netgate with storage failure, 6 in total so far:

                              The elephant mentioned overthere (== ZFS) wasn't listed here as a package. I found out what 'ZFS' does for a living ...... and I had my answer straight away.

                              I believe ZFS is most definitely a strong underlying root cause of the increased wear. It does quite a bit of background disk writes as part of its resiliency processing. Add on heavy logging with a package or two and you can greatly accelerate the wear.

                              I'm still running UFS on the two Netgate devices I manage. I just have them each on a UPS.

                              • keyserK Offline
                                keyser Rebel Alliance @bmeeks
                                last edited by keyser

                                @bmeeks said in Another Netgate with storage failure, 6 in total so far:

                                @Gertjan said in Another Netgate with storage failure, 6 in total so far:

                                I believe ZFS is most definitely a strong underlying root cause of the increased wear. It does quite a bit of background disk writes as part of its resiliency processing. Add on heavy logging with a package or two and you can greatly accelerate the wear.

                                I'm still running UFS on the two Netgate devices I manage. I just have them each on a UPS.

                                It definitely is since ZFS's write algorithm is both time and allocation triggered. It will always allocate new blocks rather than used blocks for writes. This causes SSDs to rewrite far more blockpages - that would otherwise be considered "static" - over time because of the way they do wear leveling. It's not a HUGE issue, but specifically for lots of logging it will up the write amplification quite noticeably.

                                However - given HOW prone pfSense boxes are to boot failures on UFS after power outages/hard shutdowns, it's a tradeoff WELL WORTH making. Then come all the other features like boot environments, optional mirroring and fault handling in upgrades.... I see no setups where I would not opt for ZFS and then either get an SSD or enable RAMDISK.

                                • stephenw10S Offline
                                  stephenw10 Netgate Administrator
                                  last edited by

                                  Running UFS with ramdisks enabled reduces drive write to near zero and I have yet to see a UFS corruption issue with that.

                                  But it also restricts what you can run, especially on smaller systems without RAM to spare. And you do lose some logs etc. in the event of a reboot, which can make troubleshooting tricky.

                                  But on older systems running from SD card or (gasp) CF it's the only real option IMO.

                                  • M Offline
                                    michmoor LAYER 8 Rebel Alliance @andrew_cb
                                    last edited by

                                    @andrew_cb
                                    Brutal…hard to ignore your data points. Good job on providing context.

                                    • A Offline
                                      andrew_cb
                                      last edited by andrew_cb

                                      Just scrolling through the Official Netgate Hardware forum has these definite storage failures (and there are even more threads that might be storage-related):

                                      • 4 days ago - 6100 with failed eMMC
                                      • 6 days ago - 4200 with failed eMMC
                                      • 8 days ago - failed NVMe on a 6100 MAX
                                      • 14 days ago - 2100 MAX reporting 48% health
                                      • 67 days ago - 1100 with failed eMMC

                                      In this thread @SteveITS lists suggestions for reducing storage wear that mirror what is being said by both Netgate staff and other users:

                                      • https://www.netgate.com/supported-pfsense-plus-packages lists which packages "require" or recommend SSD over eMMC <- Many packages do not specify that they require/recommend SSD
                                      • turn off logging of the default block rules <- why is this on by default if it can be problematic?
                                      • turn off logging of the bogon rules <- again, why is this on by default?
                                      • turn off Suricata logging of HTTP requests <- there is NO documentation for configuring Suricata
                                      • turn off pfBlocker DNSBL logging <- this is not mentioned on the pfBlocker setup page
                                      • create a "don't log" rule for IGMP <- this started occurring in 24.03 due to correcting a logging bug. Redmine and Forum discussion. Again, this can create a lot of logging, so why is it enabled by default?
                                      • don't view the dashboard 24x7 (each widget logs the web server request to update the widget) <- Along with similar suggestions to disable various RRD graphs, this is just getting silly. How can anyone possibly know this will cause an issue?
                                      • use RAM disk <- this requires additional planning and setup to compensate for the loss of persistent logging, and also consumes memory.
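
                                      To put rough numbers on why every suggestion above targets write volume, here is a back-of-envelope sketch in Python. Every figure in it (an 8 GB eMMC, 3,000 program/erase cycles, a write amplification factor of 10, and both daily write volumes) is an assumed placeholder, not a measurement from any Netgate device, so substitute values from your own hardware.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                             write_amplification):
    """Rough flash lifetime: total endurance divided by effective daily writes."""
    # Total data the flash can absorb before wearing out (idealized model).
    total_endurance_gb = capacity_gb * pe_cycles
    # Small, frequent log writes cost more flash writes than the host issues.
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_endurance_gb / flash_writes_gb_per_day / 365


# All values below are illustrative assumptions, not vendor specifications.
CAPACITY_GB = 8
PE_CYCLES = 3000
WRITE_AMPLIFICATION = 10

for label, daily_gb in [("heavier logging (assumed)", 5.0),
                        ("reduced logging (assumed)", 0.5)]:
    years = estimated_lifetime_years(CAPACITY_GB, PE_CYCLES, daily_gb,
                                     WRITE_AMPLIFICATION)
    print(f"{label}: {daily_gb} GB/day -> about {years:.1f} years")
```

                                      In this simplified model, lifetime is inversely proportional to daily writes, so cutting log volume by a factor of ten stretches the estimate by the same factor. Real devices will differ depending on filesystem, wear leveling, and the actual flash part.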

                                      Curiously, the Hardware Sizing document does not mention storage at all. It even specifically mentions Snort and Suricata, but says nothing about storage. This seems like a logical place to mention storage write and storage space usage considerations, but unfortunately, it is another missed opportunity.

                                      Now, let us look at the sacred Supported pfSense Plus Packages page. Only HAProxy and NtopNG say "Requires SSD/HDD", and Snort and Suricata say "SSD/HDD strongly recommended".
                                      This would imply that the other packages are safe to use with the onboard eMMC storage, right?

                                      Just to be sure, let us look at the pfBlockerNG documentation page:
                                      Hmm, not much detail there and certainly no mention of storage issues.

                                      What about Status Traffic Totals? Nothing there either.

                                      Maybe some other popular packages will say something.
                                      Arpwatch? Not listed.
                                      Zabbix? Not listed.

                                      The switch to ZFS could very well be accelerating eMMC wear-out, which might explain why this issue seems to have become much more common in the past 2-3 years. We have SG-3100s that are still running with no issues, possibly because they only support UFS. We had a 7100 fail to boot due to a corrupted filesystem that required using the serial console to repair. After that, we reinstalled all other UFS devices with ZFS.

                                      Again, if I buy a truck that clearly states it can haul 20,000 lbs as a standard feature, I should be able to install a trailer hitch and go. I should not have to worry about upgrading the engine, braking system, fuel pump, transmission, or suspension to haul the advertised 20,000 lbs!

                                      I don't understand Netgate's and some community members' attitude on this issue: somehow people are using their Netgate device wrong by trying to utilize the advertised features, and they should just accept these failures and install an SSD or buy a new device.

                                      I can understand this from CE users on third-party hardware who aren't paying Netgate anything, but anyone who purchases a device from Netgate surely must expect more than the sudden death in 1-2 years of devices that cost several hundred dollars (or even thousands) each.

                                      The oft-repeated suggestion to "support the project" does not apply here, as no amount of pfSense licenses or TAC subscriptions will solve the inherent eMMC limitations of white-labelled Silicom hardware.

                                      For all the pfSense power users here, how can we get Netgate's attention and bring about some kind of change?

                                      • michmoor LAYER 8 Rebel Alliance @andrew_cb

                                        @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                        For all the pfSense power users here, how can we get Netgate's attention and bring about some kind of change?

                                        I'm going to paraphrase a bit from where I heard this statement, but essentially it goes "It's hard to tell someone they are doing something wrong when they are making money".
                                        I would bet Vegas money that the Base version of the SKUs is very profitable compared to the Max. I'm also willing to bet they are aware of the eMMC flaws. At the end of the day (granted, I'm cynical by nature), I don't think this will move the needle much. Netgate has offered eMMC storage for a very long time. I do believe a disclaimer is needed to assist those making a purchase decision.

                                        @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                        I don't understand Netgate's and some community members' attitude on this issue: somehow people are using their Netgate device wrong by trying to utilize the advertised features, and they should just accept these failures and install an SSD or buy a new device.

                                        Agree with you here as well. The suggestions essentially boil down to "don't use the software as intended". I can't really add much to your analysis and your grievance, but I do hope that 2025 produces some changes.


                                        • Gertjan @andrew_cb

                                          @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                          I can understand this from CE users on third-party hardware who aren't

                                          .... aware of this situation, as most, may I say nearly all, in the early pfSense adoption process, in the beginning, use a VM or some "saved from the landfill" PC, slide in an extra network card, install pfSense, and before you know it, it's years later.
                                          Because of this, the CE users can't be hit by this issue: they don't use a Netgate appliance, so most probably no eMMC.

                                          And before you think otherwise: I'm not 'against' you. I'd say you've made some very valid points.

                                          @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                          how can we get Netgate's attention ...

                                          This is just me, "yet another user", saying it, but I'm pretty sure your posts have been read by 'them'.

                                          No "help me" PM's please. Use the forum, the community will thank you.
                                          Edit : and where are the logs ??

                                          • SteveITS Galactic Empire @andrew_cb

                                            @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                            How can anyone possibly know this will cause an issue?

                                            I was just listing "lower the amount of disk writing" suggestions.

                                            To play devil's advocate I would suggest none of these things "cause" premature wear, at least by themselves. ZFS wasn't a feature, or at least, not the default, when the 2100 and I think 1100 were released. So it could well be a combination of all these things interacting with new defaults.

                                            Personally I don't think it would have occurred to me to keep the dashboard visible all day until I saw posts about it, in a thread about the web server logs. Perhaps it can have a checkbox to auto-update in the background like the traffic graphs do.

                                            I would guess the logging is on by default because it avoids/answers a lot of "why can't I connect" questions. Package documentation I would think is up to the individual package maintainers, and often done via forum post. Some of the doc pages are pretty outdated.

                                            An SSD is also significantly faster in terms of saving, upgrading, etc. since I/O is faster.

                                            The amount of disk space used by pfSense is typically relatively small, so size isn't really a factor unless you download large lists or data such as the UT1 list, which extracts to over 1 GB each time it updates. A larger SSD, though, would have more writing capacity, I'd expect, due to more unused sectors.
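
                                            As a rough illustration of the capacity point, assuming identical per-cell endurance and ideal wear leveling that spreads writes across the whole device (both simplifications), the time to wear out scales linearly with drive size for the same workload. The capacities, cycle count, and daily write figure below are hypothetical, not specifications of any Netgate part.

```python
def lifetime_years(capacity_gb, pe_cycles, flash_writes_gb_per_day):
    """Idealized wear-out time when writes are spread evenly over all cells."""
    return capacity_gb * pe_cycles / flash_writes_gb_per_day / 365


# Assumed values for illustration only.
PE_CYCLES = 3000
FLASH_WRITES_GB_PER_DAY = 20

for capacity_gb in (8, 32, 128):
    years = lifetime_years(capacity_gb, PE_CYCLES, FLASH_WRITES_GB_PER_DAY)
    print(f"{capacity_gb:>3} GB device: about {years:.1f} years at "
          f"{FLASH_WRITES_GB_PER_DAY} GB/day of flash writes")
```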

                                            I don't know that anyone here is trying to dispute your POV, or your frustration. In terms of contacting Netgate, other than the replies above, if you're a partner you have contact info. If not then you could try sales or support, I don't know. It sounds like an SSD would fit more for your usage scenarios, so I guess the question/goal is to help others or new customers who don't know about wear issues.

                                            Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                                            When upgrading, allow 10-15 minutes to reboot, or more depending on packages, CPU, and/or disk speed.
                                            Upvote 👍 helpful posts!

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.