Netgate Discussion Forum

    Another Netgate with storage failure, 6 in total so far

    Official Netgate® Hardware
    305 Posts 38 Posters 81.3k Views
    • A
      andrew_cb @w0w
      last edited by

      @w0w said in Another Netgate with storage failure, 6 in total so far:

      I would also note that if the minimum eMMC size were 16GB, we probably wouldn't be having this discussion right now.

      I think you meant to say "if the minimum eMMC size were NOT 16GB, we probably wouldn't be having this discussion right now."
      And I agree: our 7100s, which come with 32GB of eMMC, seem to last about twice as long as our 4100s and 6100s. Silicom offers larger eMMC sizes on several models, so just increasing the minimum eMMC to 32 or 64GB would likely reduce this problem significantly.

      Actually eMMC is going away from phones. UFS3.1 is a next level. But this is a bit off topic.

      That is interesting to know!

      You can include it in the product description, but that falls under marketing.

      And today's marketing trend is: never tell the customer something they didn't ask about.

      This is the #1 issue that is causing this whole problem. A lack of any useful information, but when the storage fails, everyone is quick to blame the user for not knowing.

      Documentation, however, should probably contain footnotes and explanations. Or, as I already mentioned, perhaps every setting or checkbox that could potentially generate a large number of logs should have a footnote or a note for users explaining the consequences.

      I completely agree. I think both you and I have mentioned this several times.

      • S
        SteveITS Galactic Empire @andrew_cb
        last edited by

        @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

        I think you meant to say "if the minimum eMMC size were NOT 16GB"

        The 1100 and 2100 base units have 8 GB.

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

        • w0wW
          w0w @andrew_cb
          last edited by w0w

          @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

          I think you meant to say "if the minimum eMMC size were NOT 16GB, we probably wouldn't be having this discussion right now."

          Exactly!
          I would even rephrase it to say that 32GB would likely be the minimum sufficient for something else to fail first, such as the power supply.

          • w0wW
            w0w
            last edited by

            emmc_health.widget.php

            <?php
            require_once("functions.inc");
            require_once("guiconfig.inc");
            
            // Function to retrieve eMMC health data
            function get_emmc_health() {
                $cmd = "/usr/local/bin/mmc extcsd read /dev/mmcsd0rpmb | egrep 'LIFE|EOL'";
                $output = shell_exec($cmd);
                
                if (!$output) {
                    return ["status" => "error", "message" => "Failed to retrieve eMMC health data."];
                }
                
                preg_match('/LIFE_A\s+:\s+(0x[0-9A-F]+)/i', $output, $matchA);
                preg_match('/LIFE_B\s+:\s+(0x[0-9A-F]+)/i', $output, $matchB);
                
                $lifeA = isset($matchA[1]) ? hexdec($matchA[1]) * 10 : null;
                $lifeB = isset($matchB[1]) ? hexdec($matchB[1]) * 10 : null;
                
                if (is_null($lifeA) || is_null($lifeB)) {
                    return ["status" => "error", "message" => "Invalid eMMC health data."];
                }
                
                return ["status" => "ok", "lifeA" => $lifeA, "lifeB" => $lifeB];
            }
            
            $data = get_emmc_health();
            
            // Determine color class based on wear level
            function get_color_class($value) {
                if ($value < 70) {
                    return "success"; // Green
                } elseif ($value < 90) {
                    return "warning"; // Yellow
                } else {
                    return "danger"; // Red
                }
            }
            
            // Send email notification if wear level is critical
            function send_emmc_alert($lifeA, $lifeB) {
                global $config;
                
                $subject = "[pfSense] eMMC Wear Level Warning";
                $message = "Warning: eMMC wear level is high!\n\n" .
                           "Life A: {$lifeA}%\nLife B: {$lifeB}%\n\n" .
                           "Consider replacing the storage device.";
                
                if ($lifeA >= 90 || $lifeB >= 90) {
                    notify_via_smtp($subject, $message);
                }
            }
            
            if ($data["status"] === "ok") {
                send_emmc_alert($data["lifeA"], $data["lifeB"]);
            }
            ?><div class="panel panel-default">
                <div class="panel-heading">
                    <h3 class="panel-title">eMMC Disk Health</h3>
                </div>
                <div class="panel-body">
                    <?php if ($data["status"] === "error"): ?>
                        <div class="alert alert-danger"><?php echo $data["message"]; ?></div>
                    <?php else: ?>
                        <table class="table">
                            <tr>
                                <th>Life A</th>
                                <td class="bg-<?php echo get_color_class($data['lifeA']); ?>"> <?php echo $data['lifeA']; ?>%</td>
                            </tr>
                            <tr>
                                <th>Life B</th>
                                <td class="bg-<?php echo get_color_class($data['lifeB']); ?>"> <?php echo $data['lifeB']; ?>%</td>
                            </tr>
                        </table>
                    <?php endif; ?>
                </div>
            </div>
            
            1. Place the Widget File

            Make sure your widget file (e.g., emmc_health.widget.php) is located in:

            /usr/local/www/widgets/widgets/

            2. Register the Widget in widgets/widgets.inc

            Edit the file:

            /usr/local/www/widgets/widgets.inc

            Add the following line to register the widget:

            $widgets["emmc_health"] = "eMMC Disk Health";

            This ensures the widget appears in the dashboard widget selection menu.

            3. Ensure Permissions

            Run the following command to set the correct permissions:

            chmod 644 /usr/local/www/widgets/widgets/emmc_health.widget.php

            4. Reload the Dashboard

            Go to Status → Dashboard in the pfSense web UI.

            Click on "+" (Add Widget) at the top-right.

            Find "eMMC Disk Health" in the list and add it.

            5. Verify the Widget

            Ensure that the widget loads correctly and displays the expected values.

            I don't know if this will work, but this is the code that ChatGPT put together with me in 15 minutes.

            • A
              andrew_cb @w0w
              last edited by andrew_cb

              @w0w Thanks for doing this!

              I tried out the script and it needed a few modifications to make it work for me. I also added a function to automatically install mmc-utils if needed.
              The widgets.inc file does not need to be modified; the dashboard automatically picks up the file as long as its name ends with '.widget.php'.

              Here are the revised instructions:

              Code for emmc_health.widget.php:

              <?php
              require_once("functions.inc");
              require_once("guiconfig.inc");
              
              // Function to retrieve eMMC health data
              function get_emmc_health() {
              
                  $cmd = "/usr/local/sbin/mmc extcsd read /dev/mmcsd0rpmb | egrep 'LIFE|EOL'";
                  $output = shell_exec($cmd);
                  
                  if (!$output) {
                      return ["status" => "error", "message" => "Failed to retrieve eMMC health data."];
                  }
              
                  // Explode the output into separate lines
                  $outputArray = explode("\n", $output);
                 
                  // Get the value of 'TYP_A' (SLC) wear
                  preg_match('/.*TYP_A]:\s+(0x[0-9A-F]+)/i', $outputArray[0], $matchA);
                  // Get the value of 'TYP_B' (MLC) wear
                  preg_match('/.*TYP_B]:\s+(0x[0-9A-F]+)/i', $outputArray[1], $matchB);
                  
                  // Convert the wear values from hex to decimal
                  $lifeA = isset($matchA[1]) ? hexdec($matchA[1]) * 10 : null;
                  $lifeB = isset($matchB[1]) ? hexdec($matchB[1]) * 10 : null;
                  
                  if (is_null($lifeA) || is_null($lifeB)) {
                      return ["status" => "error", "message" => "Invalid eMMC health data."];
                  }
                  
                  return ["status" => "ok", "lifeA" => $lifeA, "lifeB" => $lifeB];
              }
              
              // Determine color class based on wear level
              function get_color_class($value) {
                  if ($value < 70) {
                      return "success"; // Green
                  } elseif ($value < 90) {
                      return "warning"; // Yellow
                  } else {
                      return "danger"; // Red
                  }
              }
              
              // Send email notification if wear level is critical
              function send_emmc_alert($lifeA, $lifeB) {
                  global $config;
                  
                  $subject = "[pfSense] eMMC Wear Level Warning";
                  $message = "Warning: eMMC wear level is high!\n\n" .
                             "Life A: {$lifeA}%\nLife B: {$lifeB}%\n\n" .
                             "Consider replacing the storage device.";
                  
                  if ($lifeA >= 90 || $lifeB >= 90) {
                      notify_via_smtp($subject, $message);
                  }
              }
              
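              // Check for the mmc-utils binary and install it if missing
              function install_mmc_utils() {
                  if (file_exists("/usr/local/sbin/mmc")) {
                      return ["status" => "ok"];
                  }
                  // exec() fills $output with the command output and $code with the exit status
                  exec("pkg install -y mmc-utils", $output, $code);
                  if ($code !== 0) {
                      return ["status" => "error", "message" => "Failed to install mmc-utils."];
                  }
                  return ["status" => "ok"];
              }
              
              // Main program logic
              // Make sure mmc-utils is available, then get the eMMC health data
              $data = install_mmc_utils();
              if ($data["status"] === "ok") {
                  $data = get_emmc_health();
              }
              
              // If the health data was read successfully, send an email notification when wear is critical
              if ($data["status"] === "ok") {
                  send_emmc_alert($data["lifeA"], $data["lifeB"]);
              }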
              
              // Format the data into HTML for display in the widget
              ?><div class="panel panel-default">
                  <div class="panel-heading">
                      <h3 class="panel-title">eMMC Disk Health</h3>
                  </div>
                  <div class="panel-body">
                      <?php if ($data["status"] === "error"): ?>
                          <div class="alert alert-danger"><?php echo $data["message"]; ?></div>
                      <?php else: ?>
                          <table class="table">
                              <tr>
                                  <th>Type A Wear (Lower is better)</th>
                                  <td class="bg-<?php echo get_color_class($data['lifeA']); ?>"> <?php echo $data['lifeA']; ?>%</td>
                              </tr>
                              <tr>
                                  <th>Type B Wear (Lower is better)</th>
                                  <td class="bg-<?php echo get_color_class($data['lifeB']); ?>"> <?php echo $data['lifeB']; ?>%</td>
                              </tr>
                          </table>
                      <?php endif; ?>
                  </div>
              </div>
              
              
              1. Navigate to Diagnostics > File Editor.
                Paste the code for emmc_health.widget.php (above) into the editor.
                Paste the following path into the Path to file to be edited box and select Save (the file will automatically be created):
              /usr/local/www/widgets/widgets/emmc_health.widget.php
              
              2. Navigate to Diagnostics > Command Prompt and run the following command to set the file permissions:
              chmod 644 /usr/local/www/widgets/widgets/emmc_health.widget.php
              
              3. Navigate to System > Dashboard.
                Select the "+" button from the top-right.
                Select Emmc Health from the list.

              4. The Emmc Health widget will be added to the bottom of the page. Move it up top so it is easily visible.
                Select the Save button at the top-right to save the dashboard layout.
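
The widget code in this post reads the wear fields by line position in the egrep output. As a rough sketch of an alternative, the fields can be matched by label instead, so a change in line order does not break the parse. The function name `parse_emmc_wear` and the sample text are assumptions for illustration; the sample mimics the `TYP_A]:` / `TYP_B]:` format the regexes above expect, and the `PRE_EOL_INFO` line is assumed to follow the same pattern.

```php
<?php
// Parse the wear fields by label rather than by line position.
// parse_emmc_wear() is a hypothetical helper, not part of pfSense.
function parse_emmc_wear(string $output): ?array {
    $patterns = [
        "lifeA"  => '/LIFE_TIME_EST_TYP_A\]:\s*0x([0-9a-f]+)/i',
        "lifeB"  => '/LIFE_TIME_EST_TYP_B\]:\s*0x([0-9a-f]+)/i',
        "preEol" => '/PRE_EOL_INFO\]:\s*0x([0-9a-f]+)/i',
    ];
    $result = [];
    foreach ($patterns as $key => $pattern) {
        if (!preg_match($pattern, $output, $m)) {
            return null; // A missing field means the output is not usable
        }
        $result[$key] = hexdec($m[1]);
    }
    // Scale the life estimates by 10 to get a percentage, as the widget does
    $result["lifeA"] *= 10;
    $result["lifeB"] *= 10;
    return $result;
}

// Assumed sample output, for illustration only
$sample = "eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x03\n" .
          "eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x01\n" .
          "eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01\n";
print_r(parse_emmc_wear($sample));
```

Returning null for incomplete output lets the caller show the same "Invalid eMMC health data." error the widget already uses.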

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                Probably want some way to limit or suppress the number of alerts/emails. Those values never go back so you could end up with.... a lot!

                You might also argue that since it only does it when opening the dashboard an alert shown there might be better. Or maybe both.
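
Since the wear values only ever rise, one way to keep the number of emails down might be to alert once per wear level rather than on every dashboard load. This is a minimal sketch; `should_alert` and its parameters are hypothetical names, and where the last alerted level gets stored is left open.

```php
<?php
// Decide whether a new alert is warranted. Wear values never decrease,
// so alert on first crossing the threshold and again only if the level rises.
function should_alert(int $current, ?int $lastAlerted, int $threshold = 90): bool {
    if ($current < $threshold) {
        return false; // Below the critical level, nothing to report
    }
    return $lastAlerted === null || $current > $lastAlerted;
}

var_dump(should_alert(80, null));  // bool(false): below threshold
var_dump(should_alert(90, null));  // bool(true): first crossing
var_dump(should_alert(90, 90));    // bool(false): already reported at this level
var_dump(should_alert(100, 90));   // bool(true): level rose since the last alert
```

With values reported in 10% steps, this caps the alerts at a handful over the life of the device.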

                • A
                  andrew_cb @stephenw10
                  last edited by andrew_cb

                  @stephenw10 said in Another Netgate with storage failure, 6 in total so far:

                  Probably want some way to limit or suppress the number of alerts/emails. Those values never go back so you could end up with.... a lot!

                  You might also argue that since it only does it when opening the dashboard an alert shown there might be better. Or maybe both.

                  Good suggestions!
                  I was already thinking of using a temp file to store the health data and only updating it when it is older than a certain age. A similar thing could be done to set a flag/rate limiter for alerting.

                  Ideally, the health check would run as a cron job and store the latest data in a file so that it works in the background, and then the dashboard would read the file instead of having to run the check every time the dashboard is loaded.
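
The temp-file idea could look roughly like the sketch below. `cached_health`, the cache path, and the stand-in fetcher are all assumptions for illustration; in the widget the fetcher would be get_emmc_health(). Where the cache file lives matters, since each refresh is itself a write.

```php
<?php
// Return cached health data if the cache file is fresh; otherwise call
// $fetch, store its result, and return it. Hypothetical helper.
function cached_health(string $cacheFile, int $maxAgeSeconds, callable $fetch): array {
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAgeSeconds) {
        $cached = json_decode(file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached; // Fresh cache hit: no device query and no write
        }
    }
    $data = $fetch();
    file_put_contents($cacheFile, json_encode($data), LOCK_EX);
    return $data;
}

// Usage with a stand-in fetcher that counts how often it is called
$cacheFile = sys_get_temp_dir() . "/emmc_health_cache_demo.json";
@unlink($cacheFile);
$calls = 0;
$fetch = function () use (&$calls) {
    $calls++;
    return ["status" => "ok", "lifeA" => 30, "lifeB" => 10];
};
$first = cached_health($cacheFile, 3600, $fetch);   // cache miss: calls $fetch
$second = cached_health($cacheFile, 3600, $fetch);  // cache hit: $fetch is not called
echo "fetch calls: {$calls}\n";                     // prints "fetch calls: 1"
unlink($cacheFile);
```

A cron job could call the same function with a maximum age of 0 to force a refresh, leaving the dashboard to only read the file.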

                  • dennypageD
                    dennypage @stephenw10
                    last edited by

                    @stephenw10 said in Another Netgate with storage failure, 6 in total so far:

                    Probably want some way to limit or suppress the number of alerts/emails. Those values never go back so you could end up with.... a lot!

                    Each of which will trigger a write...

                    🤕

                    • w0wW
                      w0w @dennypage
                      last edited by

                      @dennypage

                      Yes, you are right 👍
                      This was just a sample to start with.
                      Here is another idea:

                      <?php
                      require_once("functions.inc");
                      require_once("guiconfig.inc");
                      
                      // Path for the timestamp file to limit email notifications
                      const NOTIFY_TIMESTAMP_FILE = "/var/db/emmc_health_notify_time";
                      const NOTIFY_INTERVAL = 2592000; // 30 days in seconds
                      
                      // Function to retrieve eMMC health data
                      function get_emmc_health() {
                          $cmd = "/usr/local/bin/mmc extcsd read /dev/mmcsd0rpmb | egrep 'LIFE|EOL'";
                          $output = shell_exec($cmd);
                          
                          if (!$output) {
                              return ["status" => "error", "message" => "Failed to retrieve eMMC health data."];
                          }
                          
                          preg_match('/LIFE_A\s+:\s+(0x[0-9A-F]+)/i', $output, $matchA);
                          preg_match('/LIFE_B\s+:\s+(0x[0-9A-F]+)/i', $output, $matchB);
                          
                          $lifeA = isset($matchA[1]) ? hexdec($matchA[1]) * 10 : null;
                          $lifeB = isset($matchB[1]) ? hexdec($matchB[1]) * 10 : null;
                          
                          if (is_null($lifeA) || is_null($lifeB)) {
                              return ["status" => "error", "message" => "Invalid eMMC health data."];
                          }
                          
                          return ["status" => "ok", "lifeA" => $lifeA, "lifeB" => $lifeB];
                      }
                      
                      $data = get_emmc_health();
                      
                      // Determine color class based on wear level
                      function get_color_class($value) {
                          if ($value < 70) {
                              return "success"; // Green
                          } elseif ($value < 90) {
                              return "warning"; // Yellow
                          } else {
                              return "danger"; // Red
                          }
                      }
                      
                      // Check if email notification should be sent
                      function should_send_email() {
                          if (!file_exists(NOTIFY_TIMESTAMP_FILE)) {
                              return true;
                          }
                          $last_sent = file_get_contents(NOTIFY_TIMESTAMP_FILE);
                          return (time() - (int)$last_sent) > NOTIFY_INTERVAL;
                      }
                      
                      // Send email notification if wear level is critical
                      function send_emmc_alert($lifeA, $lifeB) {
                          global $config;
                          
                          if (!should_send_email()) {
                              return;
                          }
                          
                          $subject = "[pfSense] eMMC Wear Level Warning";
                          $message = "Warning: eMMC wear level is high!\n\n" .
                                     "Life A: {$lifeA}%\nLife B: {$lifeB}%\n\n" .
                                     "Consider replacing the storage device.";
                          
                          if ($lifeA >= 90 || $lifeB >= 90) {
                              notify_via_smtp($subject, $message);
                              file_put_contents(NOTIFY_TIMESTAMP_FILE, time()); // Update last sent time
                          }
                      }
                      
                      // Ensure that email is sent only when eMMC is the boot disk and no RAM disk is used
                      function is_valid_environment() {
                          if (file_exists("/etc/rc.ramdisk")) {
                              return false; // RAM disk is enabled
                          }
                          $boot_disk = trim(shell_exec("mount | grep 'on / ' | awk '{print $1}'"));
                          return strpos($boot_disk, "mmcsd") !== false; // Ensure eMMC is the boot device
                      }
                      
                      if ($data["status"] === "ok" && is_valid_environment()) {
                          send_emmc_alert($data["lifeA"], $data["lifeB"]);
                      }
                      ?><div class="panel panel-default">
                          <div class="panel-heading">
                              <h3 class="panel-title">eMMC Disk Health</h3>
                          </div>
                          <div class="panel-body">
                              <?php if ($data["status"] === "error"): ?>
                                  <div class="alert alert-danger"><?php echo $data["message"]; ?></div>
                              <?php else: ?>
                                  <table class="table">
                                      <tr>
                                          <th>Life A</th>
                                          <td class="bg-<?php echo get_color_class($data['lifeA']); ?>"> <?php echo $data['lifeA']; ?>%</td>
                                      </tr>
                                      <tr>
                                          <th>Life B</th>
                                          <td class="bg-<?php echo get_color_class($data['lifeB']); ?>"> <?php echo $data['lifeB']; ?>%</td>
                                      </tr>
                                  </table>
                              <?php endif; ?>
                          </div>
                      </div>
                      

                      You can send it once a month. You can skip sending if eMMC is no longer the primary storage or if RAM disks are being used… Well, I don't need to explain to an experienced programmer how such issues can be handled. You could even store this data and the lock file for sending alerts on your own RAM disk.

                      <?php
                      require_once("functions.inc");
                      require_once("guiconfig.inc");
                      
                      // Define RAM disk path and ensure it exists
                      const RAMDISK_PATH = "/mnt/health/emmc_health_notify_time";
                      const RAMDISK_MOUNT_POINT = "/mnt/health";
                      const NOTIFY_INTERVAL = 2592000; // 30 days in seconds
                      
                      // Function to set up RAM disk if not already mounted
                      function setup_ramdisk() {
                          if (!is_dir(RAMDISK_MOUNT_POINT)) {
                              mkdir(RAMDISK_MOUNT_POINT, 0777, true);
                          }
                          
                          $mounted = trim(shell_exec("mount | grep ' " . RAMDISK_MOUNT_POINT . " '"));
                          
                          if (!$mounted) {
                              shell_exec("mdmfs -s 100M md " . RAMDISK_MOUNT_POINT);
                          }
                      }
                      
                      // Function to retrieve eMMC health data
                      function get_emmc_health() {
                          $cmd = "/usr/local/bin/mmc extcsd read /dev/mmcsd0rpmb | egrep 'LIFE|EOL'";
                          $output = shell_exec($cmd);
                          
                          if (!$output) {
                              return ["status" => "error", "message" => "Failed to retrieve eMMC health data."];
                          }
                          
                          preg_match('/LIFE_A\s+:\s+(0x[0-9A-F]+)/i', $output, $matchA);
                          preg_match('/LIFE_B\s+:\s+(0x[0-9A-F]+)/i', $output, $matchB);
                          
                          $lifeA = isset($matchA[1]) ? hexdec($matchA[1]) * 10 : null;
                          $lifeB = isset($matchB[1]) ? hexdec($matchB[1]) * 10 : null;
                          
                          if (is_null($lifeA) || is_null($lifeB)) {
                              return ["status" => "error", "message" => "Invalid eMMC health data."];
                          }
                          
                          return ["status" => "ok", "lifeA" => $lifeA, "lifeB" => $lifeB];
                      }
                      
                      $data = get_emmc_health();
                      
                      // Determine color class based on wear level
                      function get_color_class($value) {
                          if ($value < 70) {
                              return "success"; // Green
                          } elseif ($value < 90) {
                              return "warning"; // Yellow
                          } else {
                              return "danger"; // Red
                          }
                      }
                      
                      // Check if email notification should be sent
                      function should_send_email() {
                          if (!file_exists(RAMDISK_PATH)) {
                              return true;
                          }
                          $last_sent = file_get_contents(RAMDISK_PATH);
                          return (time() - (int)$last_sent) > NOTIFY_INTERVAL;
                      }
                      
                      // Send email notification if wear level is critical
                      function send_emmc_alert($lifeA, $lifeB) {
                          global $config;
                          
                          if (!should_send_email()) {
                              return;
                          }
                          
                          $subject = "[pfSense] eMMC Wear Level Warning";
                          $message = "Warning: eMMC wear level is high!\n\n" .
                                     "Life A: {$lifeA}%\nLife B: {$lifeB}%\n\n" .
                                     "Consider replacing the storage device.";
                          
                          if ($lifeA >= 90 || $lifeB >= 90) {
                              notify_via_smtp($subject, $message);
                              file_put_contents(RAMDISK_PATH, time()); // Update last sent time on RAM disk
                          }
                      }
                      
                      // Ensure that email is sent only when eMMC is the boot disk and no RAM disk is used
                      function is_valid_environment() {
                          if (file_exists("/etc/rc.ramdisk")) {
                              return false; // RAM disk is enabled
                          }
                          $boot_disk = trim(shell_exec("mount | grep 'on / ' | awk '{print $1}'"));
                          return strpos($boot_disk, "mmcsd") !== false; // Ensure eMMC is the boot device
                      }
                      
                      // Set up RAM disk if necessary
                      setup_ramdisk();
                      
                      if ($data["status"] === "ok" && is_valid_environment()) {
                          send_emmc_alert($data["lifeA"], $data["lifeB"]);
                      }
                      ?><div class="panel panel-default">
                          <div class="panel-heading">
                              <h3 class="panel-title">eMMC Disk Health</h3>
                          </div>
                          <div class="panel-body">
                              <?php if ($data["status"] === "error"): ?>
                                  <div class="alert alert-danger"><?php echo $data["message"]; ?></div>
                              <?php else: ?>
                                  <table class="table">
                                      <tr>
                                          <th>Life A</th>
                                          <td class="bg-<?php echo get_color_class($data['lifeA']); ?>"> <?php echo $data['lifeA']; ?>%</td>
                                      </tr>
                                      <tr>
                                          <th>Life B</th>
                                          <td class="bg-<?php echo get_color_class($data['lifeB']); ?>"> <?php echo $data['lifeB']; ?>%</td>
                                      </tr>
                                  </table>
                              <?php endif; ?>
                          </div>
                      </div>
                      
                      • A
                        andrew_cb @w0w
                        last edited by andrew_cb

                        Someone reported a dead 4200 today, killed by ntopng in 10 months. The user was unaware of any risks from running ntopng on 16GB of eMMC, and there is no way to monitor the eMMC on the 4200. Luckily the device is still under warranty, so it's being replaced under RMA.

                        https://www.reddit.com/r/PFSENSE/s/fzeuC0icCQ

                        • M
                          Mission-Ghost
                          last edited by

                          Based on what I've learned from this thread, I added a 256GB Samsung SSD to my 4200 today, replacing the built-in drive, and it's working fine. Netgate's instructions had me hopping around from place to place in the documentation, but they did the job.

                          I don't want foreseeable future problems, so thanks to everyone who contributed here. Hopefully this will lead to a longer life than this box might have otherwise had.

                          • A
                            andrew_cb @Mission-Ghost
                            last edited by

                            @Mission-Ghost I am glad you found this thread useful. A 256GB SSD should last a long time!

                            • A
                              andrew_cb
                              last edited by andrew_cb

                              One thing that has always stood out to me about my data has been the 8 devices with average write rates below 50KBps.

                              msedge_vwmqIilPr6.png

                              Today I checked our devices and confirmed that those 8 outliers are all running UFS and everything else is using ZFS.
                              Compared to the highest UFS rate, the ZFS rate is from 2.5x to 7.5x higher.

                              I also looked at some of the devices that have high storage wear. They are in smallish offices and are just doing basic functions. The only packages installed are Zabbix Agent and Zabbix Proxy. A few had the logging enabled for the default rules so I turned those off.

                              I tried to find a reason why all the devices using ZFS have such high average writes compared to the devices using UFS, but could find no explanation. We use a standardized configuration and nearly all devices are low-load, and just have the Zabbix packages. On most, the log entries for each category fit within the default 500 events shown. I copied a day's worth of general system log events into a text file - it was 38KB.
                              I went so far as to raise the update interval from 1 minute to 5 minutes for nearly all items in the Zabbix template, but that made no difference.

                              300KB/sec is 18MB/min, 1.1GB/hour, 25GB/day, 9.4TB/year, 18.8TB/2 years, 28.2TB/3 years. This is in the ballpark for the maximum write life of the storage. No wonder we are seeing so many failures at the 2-3 year mark!

                              By comparison, a device doing 50KB/sec would be at 4.7TB after 3 years and 9.4TB after 6 years.
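
The arithmetic above can be reproduced with a small shell sketch. The function name `tb_written` is hypothetical, and it assumes decimal units (1 TB = 1,000,000,000 KB) and 365-day years, so its results come out slightly higher than the rounded-down figures quoted in the post.

```shell
#!/bin/sh
# Convert a sustained write rate (KB/s) into total data written over N years.
# Assumptions: decimal units (1 TB = 1,000,000,000 KB) and 365-day years.
tb_written() {
    rate_kbps=$1
    years=$2
    awk -v r="$rate_kbps" -v y="$years" \
        'BEGIN { printf "%.2f\n", r * 86400 * 365 * y / 1000000000 }'
}

# Compare the two write rates discussed in the post.
for years in 1 2 3 6; do
    echo "${years}y: 300 KB/s = $(tb_written 300 "$years") TB, 50 KB/s = $(tb_written 50 "$years") TB"
done
```

At 300 KB/s this gives 9.46 TB after one year and 28.38 TB after three, while 50 KB/s takes six years to reach the same 9.46 TB.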

                              This could explain why our older 3100 and 7100 units on UFS have lasted 6-7 years with the eMMC still in good health, while many of our 4100s have failed or are near death after only 2 years.

                              In his thread "eMMC Write endurance", @keyser noted:

                              With ZFS, pfBlockerNG in default config with only 4 feeds loaded and NTopNG running, my box averages about 1 MB/s sustained write to the SSD.

                              I am only 700KBps lower (300KBps vs 1000KBps), yet I am not running pfBlockerNG or NTopNG.

                              I will need to dig in deeper with iostat, top, and systat to try and find the cause of the writes. At this point it would appear that ZFS itself is the major cause of the increased write activity compared to UFS.

                              • A
                                andrew_cb @stephenw10
                                last edited by

                                @stephenw10 said in Another Netgate with storage failure, 6 in total so far:

                                Hmm, not sure why the pkg isn't in the CE repo. I guess there wasn't much call for it at the time. Seems like we could add that pretty easily. Let me see....

                                Did you have any luck getting mmc-utils added to the CE repo?

                                • fireodoF
                                  fireodo @andrew_cb
                                  last edited by fireodo

                                  @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                  I will need to dig in deeper with iostat, top, and systat to try and find the cause of the writes.

                                  Hi,

                                  I got a reduction from ~19 GB written/day to 1.8 GB written/day by using these settings:

                                  zfs set sync=disabled zroot/tmp (pfSense/tmp)
                                  zfs set sync=disabled zroot/var (pfSense/var) (after reviewing my settings I saw that I had set it to disabled)
                                  

                                  and fine tuning:

                                  vfs.zfs.txg.timeout=120
                                  

                                  (The ZFS pool in my case is "zroot"; current installs use "pfSense".)
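
For anyone adapting the settings above to a different pool name, here is a minimal sketch that only prints the commands so they can be reviewed before running anything. The function name `zfs_write_tuning_cmds` and its defaults are assumptions for illustration, not part of pfSense.

```shell
#!/bin/sh
# Print (do not execute) the tuning commands for a given pool name.
# Assumption: the pool is "pfSense" on current installs and "zroot" on older ones.
zfs_write_tuning_cmds() {
    pool=${1:-pfSense}
    timeout=${2:-120}
    echo "zfs set sync=disabled ${pool}/tmp"
    echo "zfs set sync=disabled ${pool}/var"
    echo "sysctl vfs.zfs.txg.timeout=${timeout}"
}

# Review the output first; run the printed commands by hand only on a box
# where losing the most recent /tmp and /var writes on power failure is acceptable.
zfs_write_tuning_cmds pfSense 120

# To inspect the current values instead of changing them:
#   zfs get sync pfSense/tmp pfSense/var
#   sysctl vfs.zfs.txg.timeout
```

A value set with `sysctl` at runtime does not survive a reboot. On pfSense it would presumably need to be added as a system tunable to persist, which is an assumption here rather than something confirmed in this thread.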

                                  Remark: this is a private system for private use.

                                  Kettop Mi4300YL CPU: i5-4300Y @ 1.60GHz RAM: 8GB Ethernet Ports: 4
                                  SSD: SanDisk pSSD-S2 16GB (ZFS) WiFi: WLE200NX
                                  pfsense 2.8.0 CE
                                  Packages: Apcupsd, Cron, Iftop, Iperf, LCDproc, Nmap, pfBlockerNG, RRD_Summary, Shellcmd, Snort, Speedtest, System_Patches.

                                  • w0wW
                                    w0w @fireodo
                                    last edited by

                                    @fireodo
                                    A wonderful idea and discovery! It seems quite reasonable not to synchronize the tmp folder and to have a 2 minutes delay for transaction writes. It is a good alternative to RAM disks if they cannot be used for some reason.

                                    • fireodoF
                                      fireodo @w0w
                                      last edited by

                                      @w0w said in Another Netgate with storage failure, 6 in total so far:

                                      2 minutes delay

                                      PS. If you test, you can set the delay to greater values. The write rate will decrease, but you have a greater risk of losing data if a power failure occurs (it reduces the robustness of the ZFS filesystem).

                                      • w0wW
                                        w0w @fireodo
                                        last edited by

                                        @fireodo

                                        In the case of a firewall, I think it is acceptable.
                                        Most critical logs should be sent to an external syslog server, and I don't see any risks that could compromise the system. I can't think of any scenarios where this would be critical for pfSense, but I might be wrong. I'm not sure, but some major updates are also managed by boot environments (BE) and shouldn't be affected.

                                        • P
                                          Patch @andrew_cb
                                          last edited by Patch

                                          @andrew_cb said in Another Netgate with storage failure, 6 in total so far:

                                          it would appear that ZFS itself is the major cause of the increased write activity

                                          That is my understanding. ZFS results in significant write amplification but as a result is more robust on power failure.

                                          But I thought later installs of pfSense did not use ZFS for temporary files.

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            /var should be standard sync by default anyway, was yours not?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.