Netgate Discussion Forum

    Problems upgrading/installing 24.11 on SG-5100

    Problems Installing or Upgrading pfSense Software
    15 Posts 3 Posters 689 Views
    • L
      LamaZ
      last edited by LamaZ

      Greetings folks,

      I've tried some googling, and finally decided to see if anyone else has better google-fu than I do.

      After a seemingly smooth upgrade from 24.03 to 24.11, I thought all was well. Later that night the router locked up. DHCP died, SSH did not work even with a static address, and the local console was dead too. Only pings were working.

      So I went back and tried to do a fresh install of 24.11 from USB and failed. Then I tried a fresh USB install of 24.03 and also failed. I've even tried going back to the previous install media for 24.03 and am arriving at the same results.

      Install failed.
      
      [1/12] Extracting pciids-20240920: ..... done
      [2/12] Installing libpci-3.13.0...
      [2/12] Extracting libpci-3.13.0: .......... done
      [3/12] Installing e2fsprogs-libuuid-1.47.1...
      [3/12] Extracting e2fsprogs-libuuid-1.47.1: .......... done
      [4/12] Installing aws-sdk-php83-3.273.3...
      [4/12] Extracting aws-sdk-php83-3.273.3: .......... done
      [5/12] Installing zip-3.0_2...
      [5/12] Extracting zip-3.0_2: .......... done
      [6/12] Installing flashrom-1.3.0_4...
      [6/12] Extracting flashrom-1.3.0_4: .......... done
      [7/12] Installing pfSense-pkg-WireGuard-0.2.9...
      [7/12] Extracting pfSense-pkg-WireGuard-0.2.9: .......... done
      [8/12] Installing drm-515-kmod-5.15.160...
      [8/12] Extracting drm-515-kmod-5.15.160: .......... done
      [9/12] Installing pfSense-pkg-ipsec-profile-wizard-1.2.4...
      [9/12] Extracting pfSense-pkg-ipsec-profile-wizard-1.2.4: ......... done
      [10/12] Installing pfSense-pkg-Netgate_Firmware_Upgrade-23.05.01...
      [10/12] Extracting pfSense-pkg-Netgate_Firmware_Upgrade-23.05.01: .......... done
      [11/12] Installing pfSense-pkg-aws-wizard-0.12...
      [11/12] Extracting pfSense-pkg-aws-wizard-0.12: ....... done
      [12/12] Installing pfSense-plus-24.11...
      [12/12] Extracting pfSense-plus-24.11: ... done
      =====
      Message from drm-515-kmod-5.15.160:
      
      --
      The drm-515-kmod port can be enabled for amdgpu (for AMD
      GPUs starting with the HD7000 series / Tahiti) or i915kms (for Intel
      APUs starting with HD3000 / Sandy Bridge) through kld_list in
      /etc/rc.conf. radeonkms for older AMD GPUs can be loaded and there are
      some positive reports if EFI boot is NOT enabled.
      
      For amdgpu: kld_list="amdgpu"
      For Intel: kld_list="i915kms"
      For radeonkms: kld_list="radeonkms"
      
      Please ensure that all users requiring graphics are members of the
      "video" group.
      
      pfSense Post Installation setup
      mount_msdosfs: /dev/ada0p1: Invalid argument
      
      Error: Failed to run the post installation script.
      

      Any ideas? I'm going to try and go back to 23.x next.

      Attached log file:
      install-log.txt

      -LamaZ

      • L
        LamaZ
        last edited by

        Sooo frustrating. I went back to the 23.09.1 installer (pfSense-plus-memstick-serial-23.09.1-RELEASE-amd64.img), and after it finished and prompted me to reboot it STILL reboots into 24.11-RELEASE!

        -LamaZ

        • L
          LamaZ
          last edited by LamaZ

          How do I get out of 24.11 hell? I can't reboot out of it. I set the boot environment to 24.03 and it still forces me back. See below the screenshot of my Boot Environments taunting me.

          99c47d15-eb0b-461d-9a8d-99b2a8ffa1ec-image.png

          The play button option didn't work either:
          9f4b3833-6086-4ed1-b44e-3978e692f4a3-image.png
          -LamaZ

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            What are you booting from here?

            The installer should remove all ZFS BEs from the target drive. So possibly it installed to the eMMC and you are booting from the SSD?
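One way to check which boot environment is actually running, as a minimal sketch. It assumes `bectl list -H` prints tab-separated fields with the BE name first and the active flags second (`N` for the one booted now, `R` for the one selected for the next reboot). The helper names are made up for illustration.

```shell
# Print the boot environment that is active right now (flag N).
# Reads `bectl list -H` output on stdin.
be_active_now() {
    awk -F '\t' '$2 ~ /N/ { print $1 }'
}

# Print the boot environment selected for the next reboot (flag R).
# Reads `bectl list -H` output on stdin.
be_active_on_reboot() {
    awk -F '\t' '$2 ~ /R/ { print $1 }'
}
```

Typical use would be `bectl list -H | be_active_now`. Running `zpool status` afterwards should show which disk backs the pool you booted from, which helps tell an eMMC boot from an SSD boot.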

            Another possibility is that the eMMC has gone read-only. Though that is not the usual failure mode for it.

            Steve

            • L
              LamaZ @stephenw10
              last edited by LamaZ

              @stephenw10 Thanks for the tip. The eMMC died many years ago. I thought I disabled it in the BIOS. I'll double check. I definitely targeted the hard disk (/dev/ada0) with the installer.

              Edit: It is the M.2 hard drive. No luck so far.

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                AFAIK there is no way to disable the eMMC. It's a problem on the 5100.

                Perhaps the SSD has gone read-only? That would take a lot of writes, but it is a much more common failure mode for an SSD.

                • L
                  LamaZ @stephenw10
                  last edited by LamaZ

                  @stephenw10 OK, I could see something like ntopng writing a lot to disk. Question: how would I go about confirming that the SSD going into read-only mode is the culprit if I can't use SSH, the web GUI, or the console once it locks up? Is an rsyslog server going to be needed, or is there some other method to diagnose this?

                  To be clear, the thought is that the install/upgrade process has so many consecutive writes that the SSD goes into read-only mode. Did I get that right?

                  PS - Thanks for confirming that the eMMC cannot be disabled on the 5100. I was going crazy trying to figure out how to permanently disable it.

                  Thanks!

                  -LamaZ

                  • stephenw10S
                    stephenw10 Netgate Administrator
                    last edited by

                    First check the SMART data. It should have an estimated wear level value.

                    [24.11-RELEASE][admin@5100.stevew.lan]/root: smartctl -a /dev/ada0
                    smartctl 7.4 2023-08-01 r5530 [FreeBSD 15.0-CURRENT amd64] (local build)
                    Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
                    
                    === START OF INFORMATION SECTION ===
                    Device Model:     NT-32
                    Serial Number:    987032300377
                    LU WWN Device Id: 5 000000 000000000
                    Firmware Version: 1.095.06
                    User Capacity:    32,017,047,552 bytes [32.0 GB]
                    Sector Size:      512 bytes logical/physical
                    Rotation Rate:    Solid State Device
                    Form Factor:      2.5 inches
                    TRIM Command:     Available
                    Device is:        Not in smartctl database 7.3/5528
                    ATA Version is:   ACS-2 T13/2015-D revision 3
                    SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
                    Local Time is:    Tue Dec  3 00:58:51 2024 GMT
                    SMART support is: Available - device has SMART capability.
                    SMART support is: Enabled
                    
                    === START OF READ SMART DATA SECTION ===
                    SMART overall-health self-assessment test result: PASSED
                    
                    General SMART Values:
                    Offline data collection status:  (0x00)	Offline data collection activity
                    					was never started.
                    					Auto Offline Data Collection: Disabled.
                    Self-test execution status:      (   0)	The previous self-test routine completed
                    					without error or no self-test has ever 
                    					been run.
                    Total time to complete Offline 
                    data collection: 		(   32) seconds.
                    Offline data collection
                    capabilities: 			 (0x5b) SMART execute Offline immediate.
                    					Auto Offline data collection on/off support.
                    					Suspend Offline collection upon new
                    					command.
                    					Offline surface scan supported.
                    					Self-test supported.
                    					No Conveyance Self-test supported.
                    					Selective Self-test supported.
                    SMART capabilities:            (0x0003)	Saves SMART data before entering
                    					power-saving mode.
                    					Supports SMART auto save timer.
                    Error logging capability:        (0x01)	Error logging supported.
                    					General Purpose Logging supported.
                    Short self-test routine 
                    recommended polling time: 	 (   1) minutes.
                    Extended self-test routine
                    recommended polling time: 	 (   1) minutes.
                    SCT capabilities: 	       (0x0039)	SCT Status supported.
                    					SCT Error Recovery Control supported.
                    					SCT Feature Control supported.
                    					SCT Data Table supported.
                    
                    SMART Attributes Data Structure revision number: 0
                    Vendor Specific SMART Attributes with Thresholds:
                    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
                      1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
                      2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
                      3 Spin_Up_Time            0x0007   100   100   050    Pre-fail  Always       -       0
                      5 Reallocated_Sector_Ct   0x0013   100   100   050    Pre-fail  Always       -       0
                      7 Unknown_SSD_Attribute   0x000b   100   100   050    Pre-fail  Always       -       0
                      8 Unknown_SSD_Attribute   0x0005   100   100   050    Pre-fail  Offline      -       0
                      9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       46130
                     10 Unknown_SSD_Attribute   0x0013   100   100   050    Pre-fail  Always       -       0
                     12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       79
                    167 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       0
                    168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
                    169 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       262146
                    170 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       0
                    171 Unknown_Attribute       0x0032   000   000   000    Old_age   Always       -       0
                    172 Unknown_Attribute       0x0032   000   000   000    Old_age   Always       -       0
                    173 Unknown_Attribute       0x0012   134   134   000    Old_age   Always       -       3560591852425
                    175 Program_Fail_Count_Chip 0x0013   100   100   010    Pre-fail  Always       -       0
                    180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   020    Pre-fail  Always       -       69
                    187 Reported_Uncorrect      0x0032   000   000   000    Old_age   Always       -       0
                    192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       62
                    194 Temperature_Celsius     0x0022   075   075   030    Old_age   Always       -       25 (0 60 0 30 0)
                    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
                    231 Unknown_SSD_Attribute   0x0033   070   070   005    Pre-fail  Always       -       30
                    240 Unknown_SSD_Attribute   0x0013   100   100   050    Pre-fail  Always       -       0
                    241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       18383081362
                    242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       359301512
                    
                    SMART Error Log Version: 1
                    No Errors Logged
                    
                    SMART Self-test log structure revision number 1
                    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
                    # 1  Short offline       Completed without error       00%     18192         -
                    
                    SMART Selective self-test log data structure revision number 1
                     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
                        1        0        0  Not_testing
                        2        0        0  Not_testing
                        3        0        0  Not_testing
                        4        0        0  Not_testing
                        5        0        0  Not_testing
                    Selective self-test flags (0x0):
                      After scanning selected spans, do NOT read-scan remainder of disk.
                    If Selective self-test is pending on power-up, resume after 0 minute delay.
                    
                    The above only provides legacy SMART information - try 'smartctl -x' for more
                    

                    So there the 'Unused_Rsvd_Blk_Cnt_Tot' is above zero, which means it still has spare blocks to use in the event of others failing. Other drives often have more useful values.
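If you want to pull just those values out of the report, something like the following should work. It is only a sketch: the function names are invented here, and the attribute lookup assumes the 10-column attribute table layout shown in the output above.

```shell
# Print the overall health verdict (e.g. PASSED or FAILED!) from
# `smartctl -a` output on stdin.
smart_health() {
    sed -n 's/^[[:space:]]*SMART overall-health self-assessment test result:[[:space:]]*//p'
}

# Print the RAW_VALUE column of one attribute, looked up by name ($1).
# Assumes the 10-column attribute table layout; reads stdin.
smart_attr_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}
```

For example, `smartctl -a /dev/ada0 | smart_attr_raw Unused_Rsvd_Blk_Cnt_Tot` would print 69 for the output above, and `smartctl -a /dev/ada0 | smart_health` would print PASSED.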

                    • L
                      LamaZ @stephenw10
                      last edited by

                      @stephenw10 said in Problems upgrading/installing 24.11 on SG-5100:

                      smartctl -a /dev/ada0

                      You sir win! Here is my most un-favorite line in my output:

                      === START OF READ SMART DATA SECTION ===
                      SMART overall-health self-assessment test result: FAILED!
                      Drive failure expected in less than 24 hours. SAVE ALL DATA.
                      No failed Attributes found.
                      

                      Ordering another disk ASAP.
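For anyone who wants to catch this from a script instead of reading the report: smartctl also signals a failing disk through its exit status, which is a bitmask. The smartctl man page documents bit 3 (value 8) as set when the SMART status check returns "DISK FAILING". A minimal sketch, with a made-up function name:

```shell
# Succeeds (returns 0) when bit 3 (value 8) of a smartctl exit status is set,
# which the smartctl man page documents as "DISK FAILING".
#   $1 = exit status of a previous smartctl run
smart_disk_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}
```

It could be used as `smartctl -H /dev/ada0; smart_disk_failing $? && echo "replace the disk"`. Other bits can be set at the same time, so the function tests the bit and does not compare the whole status against 8.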

                      -LamaZ

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        Yikes! 😱

                        • L
                          LamaZ @stephenw10
                          last edited by

                          @stephenw10 I can't thank you enough. Seriously. I've been going stir crazy for the past couple of days.

                          Question, should we rename the title of this thread? I don't want to give other SG-5100 owners the wrong impression when they see the title.

                          -LamaZ

                          • stephenw10S
                            stephenw10 Netgate Administrator
                            last edited by

                            I think it's probably fine to leave it. Not many people come looking for problems before they encounter them. 😉

                            • PhizixP
                              Phizix @stephenw10
                              last edited by Phizix

                              @stephenw10 said in Problems upgrading/installing 24.11 on SG-5100:

                              I think it's probably fine to leave it. Not many people come looking for problems before they encounter them. 😉

                              Uhhh I do! That is how I ended up in this thread. I have an SG-5100 and wanted to see what others' experience has been upgrading to 24.11.

                              I suspect it will not be as much of an issue (I hope) since I went with a bit larger SSD. It shows about 1.5 years of life left.

                              @LamaZ - What size was your drive?

                              Phizix

                              • stephenw10S
                                stephenw10 Netgate Administrator
                                last edited by

                                I could change the title? The drive failure here isn't actually 5100 specific.

                                • PhizixP
                                  Phizix @stephenw10
                                  last edited by

                                  @stephenw10 - I totally agree, but that is what originally pulled me into the thread. I wouldn't bother changing the thread title.

                                  I also think that my wear rate is actually slower so I expect the 1.5 year estimate to be more like 3 years.
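As a rough illustration of where such an estimate can come from, here is a naive linear extrapolation. It is a sketch under stated assumptions, not how any particular tool computes it: the function name and both inputs are hypothetical, and it assumes wear accrues at a constant rate, which is exactly what a changing workload breaks.

```shell
# Naive linear estimate of remaining SSD life in years.
#   $1 = percent of rated life already used (hypothetical input)
#   $2 = power-on hours so far
# Assumes wear accrues at a constant rate, which real workloads rarely do.
ssd_years_left() {
    awk -v used="$1" -v hours="$2" 'BEGIN {
        if (used <= 0) { print "unknown"; exit }
        # hours per percent used, times percent remaining, in years
        # (8766 hours = 365.25 days)
        printf "%.1f\n", hours * (100 - used) / used / 8766
    }'
}
```

Under this model, halving the wear rate from now on doubles the remaining time for the same remaining percentage, which is the reasoning behind 1.5 years turning into roughly 3.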

                                  Phizix

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.