SG-1100 update from 23.09.1 to 24.03 keeps failing
-
@mrneutron said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
SG-1100s in fall 2019
That's 4.5 years ago... I think you're confusing "failing" with "reached end of write life" meaning "too much has been written to it." Do you have any of the "SSD" packages listed at https://www.netgate.com/supported-pfsense-plus-packages installed?
Solid state storage has a finite write life. eMMC write life is much shorter than an SSD's, and SSDs are far larger so can spread writes out more.
-
@SteveITS said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
@mrneutron said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
SG-1100s in fall 2019
That's 4.5 years ago...
So, are you saying that Netgate hardware cannot be expected to last more than 4 years?
Do you have any of the "SSD" packages listed at https://www.netgate.com/supported-pfsense-plus-packages installed?
Solid state storage has a finite write life. eMMC write life is much shorter than an SSD's, and SSDs are far larger so can spread writes out more.
I have never loaded any SSD packages.
The SG-1100 does not have an SSD, or have an option to add an SSD.
-
@mrneutron said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
SG-1100 does not have an SSD
I'm aware. I meant, a package like NtopNG or HAProxy or Snort/Suricata, as listed in the "Storage Requirements" column. I believe Squid used to be listed there while it was supported. I'm not trying to argue with you, just asking.
As for product life time, we've had clients with routers in service a long time. Disk life depends on what is written to disk. Everyone's use case varies. We typically turn off the logging of default block rules, and set a RAM disk, among other things. Some people like to log everything, and watch the dashboard all day which logs every web server request.
-
@mrneutron said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Which specific files will show the evidence?
root@pfSense-install:/tmp # ls /tmp
.ICE-unix   .font-unix        bsdinstall_log    recovered_config
.X11-unix   bsdinstall_boot   install-log.txt   tmp.GEB2CtRhSn
.XIM-unix   bsdinstall_etc    mnt_recovery
Check install-log.txt and bsdinstall_log from /tmp.
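A rough sketch of how those two files could be scanned from the installer shell. The function name `scan_log` and the keyword list are assumptions for illustration only; the installer may well report a failure with different wording, so an empty result does not prove the install succeeded.

```sh
#!/bin/sh
# Sketch: list lines in an installer log that look like failures.
# The keyword list below is a guess, not something the installer documents.
scan_log() {
    if [ -f "$1" ]; then
        # -i: ignore case, -n: prefix each hit with its line number,
        # -E: extended regular expression.
        grep -inE 'error|fail|cannot|denied' "$1" || echo "no matches in $1"
    else
        # install-log.txt is only created once installation actually starts.
        echo "missing: $1"
    fi
}

# Example:
#   scan_log /tmp/install-log.txt
#   scan_log /tmp/bsdinstall_log
```

A "missing" result for install-log.txt is itself informative, since it suggests the installer stopped before it began writing files.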
-
I just saw this thread on longevity of the emmc in the 1100.
https://forum.netgate.com/topic/170128/emmc-write-endurance
It looks like ZFS creates more disk writes than UFS.
But we needed to move to ZFS in order to upgrade to the 2023 versions of firmware, so we have ZFS now and need to deal with it, or watch them get eaten alive.
It sure seems like Netgate needs to roll some kind of endurance measure into the stock firmware (like a RAMDISK?) to prevent ZFS from wearing out the emmc prematurely.
It looks like the SG-2100 has an option for an SSD, so maybe it would be able to avoid getting eaten alive by the ZFS and the OS?
"Storage: 8 GB eMMC Flash onboard (or optional 128 GB M.2 SATA 2242 SSD)"
https://shop.netgate.com/products/2100-max-pfsense
-
You don't need to use ZFS, you can install the 1100 as UFS. RAMdisks can be enabled, as I said, and UFS with RAMdisks is the lowest drive write setup.
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
You don't need to use ZFS, you can install the 1100 as UFS. RAMdisks can be enabled, as I said, and UFS with RAMdisks is the lowest drive write setup.
But, with the upgrade to 23.03, the original UFS file system had an EFI partition that was too small to allow the upgrade, and Netgate really pushed moving to the ZFS file system, so I did.
see:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/upgrades-1100-2100.html?hsCtaTracking=821601fb-1bc2-40c2-a65d-a631bcb219c6%7C58f3d6cd-6fd1-4339-9a58-ddc137771e9f&utm_campaign=Foundation%3A%20pfSense%20Buyer%20Journey&utm_medium=email&_hsenc=p2ANqtz--GmK5dXE7vgA1GS5OzPAVHtgCngk0W1FTR_1DzWcO44Ln61FfJPsxbaSBLCk838-KAr59MaQso_lFGhB1EovOSU53itA&_hsmi=249746989&utm_content=249746989&utm_source=hs_email
-
Yes, there are advantages to using ZFS but I'm just pointing out that you don't have to. You can still install UFS if you want to.
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Yes, there are advantages to using ZFS but I'm just pointing out that you don't have to. You can still install UFS if you want to.
Thanks for pointing that out.
I had interpreted the Netgate ZFS recommendation as a requirement for upgrading to the 23.xx firmware. Now that I reread it, I see that the recommendation was just tacked on to the end of the reimage instructions, which were required to increase the size of the EFI partition.
Given my presumed emmc failure experience with the SG-1100, I'm hesitant to buy another one and end up with a lifespan shorter than other options.
My desire was for the SG-1100 to be more of an easy-to-use network appliance like a Linksys or a Netgear router.
The SG-1100 was behaving like that until I changed from UFS to ZFS (on Netgate's recommendation) in fall 2023, due to my 2019 SG-1100s having the small EFI partition.
-
@SteveITS said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
@mrneutron said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
SG-1100 does not have an SSD
I'm aware. I meant, a package like NtopNG or HAProxy or Snort/Suricata, as listed in the "Storage Requirements" column. I believe Squid used to be listed there while it was supported. I'm not trying to argue with you, just asking.
As for product life time, we've had clients with routers in service a long time. Disk life depends on what is written to disk. Everyone's use case varies. We typically turn off the logging of default block rules, and set a RAM disk, among other things. Some people like to log everything, and watch the dashboard all day which logs every web server request.
Ok, fair enough. I just wanted to be sure you were keeping the scope of the possible fixes/changes confined to the limitations of the SG-1100.
I'm really a very basic pfsense user. I've got an OpenVPN tunnel running from my home end to a remote end. That's about it beyond the normal firewall functions that come with pfsense. My desire was for the SG-1100 to be more of an easy-to-use network appliance like a Linksys or a Netgear router.
Having an SG-1100 unit that has presumed emmc failure after 4 years, requiring hardware replacement, isn't cool.
If Netgate knows that their emmc gets overly taxed, they should roll longevity precautions (like RAMDISK) into the pfsense OS by default and give us a checkbox to turn it on.
-
Have a look at System->Advanced->Miscellaneous->RAM Disk Settings
https://docs.netgate.com/pfsense/en/latest/config/advanced-misc.html#ram-disk-settings
-
@terryzb said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Have a look at System->Advanced->Miscellaneous->RAM Disk Settings
https://docs.netgate.com/pfsense/en/latest/config/advanced-misc.html#ram-disk-settings
@terryzb
Thanks! I'll check and see if my SG-1100 has that enabled.
-
Indeed there is a checkbox to turn it on. But perhaps a further question might be whether it should be enabled by default in some configurations. It does restrict what you can use. And it obviously uses RAM, which matters on the 1100 since it only has 1GB.
One potential option here is that the new Net Installer will install to any available drive including USB. It's possible to install to a USB drive and boot from that.
It does require manually changing the bootcmd env in uboot though.
-
Here's the log from the failed Online Network Installer attempt.
I can't see any big clues as to the failed partitioning.
root@pfSense-install:~ # cd /tmp
root@pfSense-install:/tmp # ls
.ICE-unix   .font-unix             bsdinstall_etc   recovered_config
.X11-unix   bsdinstall-tmp-fstab   bsdinstall_log   tmp.zf1blw5xTi
.XIM-unix   bsdinstall_boot        mnt_recovery
root@pfSense-install:/tmp # cat bsdinstall_log
DEBUG: Running installation step: auto
DEBUG: dialog.subr: DEBUG_SELF_INITIALIZE=[]
DEBUG: UNAME_S=[FreeBSD] UNAME_P=[aarch64] UNAME_R=[15.0-CURRENT]
DEBUG: common.subr: Successfully loaded.
DEBUG: f_include: file=[/usr/share/bsdconfig/dialog.subr]
DEBUG: dialog.subr: loading includes...
DEBUG: f_include: file=[/usr/share/bsdconfig/strings.subr]
DEBUG: strings.subr: Successfully loaded.
DEBUG: f_include: file=[/usr/share/bsdconfig/variable.subr]
DEBUG: variable.subr: loading includes...
DEBUG: f_include: file=[/usr/share/bsdconfig/dialog.subr]
DEBUG: f_include: file=[/usr/share/bsdconfig/strings.subr]
DEBUG: variable.subr: New variable VAR_CONFIG_FILE -> configFile
DEBUG: variable.subr: New variable VAR_DEBUG -> debug
DEBUG: variable.subr: New variable VAR_DEBUG_FILE -> debugFile
DEBUG: variable.subr: New variable VAR_DIRECTORY_PATH -> _directoryPath
DEBUG: variable.subr: New variable VAR_DOMAINNAME -> domainname
DEBUG: variable.subr: New variable VAR_EDITOR -> editor
DEBUG: variable.subr: New variable VAR_EXTRAS -> ifconfig_
DEBUG: variable.subr: New variable VAR_GATEWAY -> defaultrouter
DEBUG: variable.subr: New variable VAR_GROUP -> group
DEBUG: variable.subr: New variable VAR_GROUP_GID -> groupGid
DEBUG: variable.subr: New variable VAR_GROUP_MEMBERS -> groupMembers
DEBUG: variable.subr: New variable VAR_GROUP_PASSWORD -> groupPassword
DEBUG: variable.subr: New variable VAR_HOSTNAME -> hostname
DEBUG: variable.subr: New variable VAR_HTTP_DIR -> httpDirectory
DEBUG: variable.subr: New variable VAR_HTTP_FTP_MODE -> httpFtpMode
DEBUG: variable.subr: New variable VAR_HTTP_HOST -> httpHost
DEBUG: variable.subr: New variable VAR_HTTP_PATH -> _httpPath
DEBUG: variable.subr: New variable VAR_HTTP_PORT -> httpPort
DEBUG: variable.subr: New variable VAR_HTTP_PROXY -> httpProxy
DEBUG: variable.subr: New variable VAR_HTTP_PROXY_HOST -> httpProxyHost
DEBUG: variable.subr: New variable VAR_HTTP_PROXY_PATH -> _httpProxyPath
DEBUG: variable.subr: New variable VAR_HTTP_PROXY_PORT -> httpProxyPort
DEBUG: variable.subr: New variable VAR_IFCONFIG -> ifconfig_
DEBUG: variable.subr: New variable VAR_IPADDR -> ipaddr
DEBUG: variable.subr: New variable VAR_IPV6ADDR -> ipv6addr
DEBUG: variable.subr: New variable VAR_IPV6_ENABLE -> ipv6_activate_all_interfaces
DEBUG: variable.subr: New variable VAR_KEYMAP -> keymap
DEBUG: variable.subr: New variable VAR_MEDIA_TIMEOUT -> MEDIA_TIMEOUT
DEBUG: variable.subr: New variable VAR_MEDIA_TYPE -> mediaType
DEBUG: variable.subr: New variable VAR_NAMESERVER -> nameserver
DEBUG: variable.subr: New variable VAR_NETINTERACTIVE -> netInteractive
DEBUG: variable.subr: New variable VAR_NETMASK -> netmask
DEBUG: variable.subr: New variable VAR_NETWORK_DEVICE -> netDev
DEBUG: variable.subr: New variable VAR_NFS_HOST -> nfsHost
DEBUG: variable.subr: New variable VAR_NFS_PATH -> nfsPath
DEBUG: variable.subr: New variable VAR_NFS_SECURE -> nfs_reserved_port_only
DEBUG: variable.subr: New variable VAR_NFS_TCP -> nfs_use_tcp
DEBUG: variable.subr: New variable VAR_NFS_V3 -> nfs_use_v3
DEBUG: variable.subr: New variable VAR_NONINTERACTIVE -> nonInteractive
DEBUG: variable.subr: New variable VAR_NO_CONFIRM -> noConfirm
DEBUG: variable.subr: New variable VAR_NO_ERROR -> noError
DEBUG: variable.subr: New variable VAR_NO_INET6 -> noInet6
DEBUG: variable.subr: New variable VAR_PACKAGE -> package
DEBUG: variable.subr: New variable VAR_PKG_TMPDIR -> PKG_TMPDIR
DEBUG: variable.subr: New variable VAR_PORTS_PATH -> ports
DEBUG: variable.subr: New variable VAR_RELNAME -> releaseName
DEBUG: variable.subr: New variable VAR_SLOW_ETHER -> slowEthernetCard
DEBUG: variable.subr: New variable VAR_TRY_DHCP -> tryDHCP
DEBUG: variable.subr: New variable VAR_TRY_RTSOL -> tryRTSOL
DEBUG: variable.subr: New variable VAR_UFS_PATH -> ufs
DEBUG: variable.subr: New variable VAR_USER -> user
DEBUG: variable.subr: New variable VAR_USER_ACCOUNT_EXPIRE -> userAccountExpire
DEBUG: variable.subr: New variable VAR_USER_DOTFILES_CREATE -> userDotfilesCreate
DEBUG: variable.subr: New variable VAR_USER_GECOS -> userGecos
DEBUG: variable.subr: New variable VAR_USER_GID -> userGid
DEBUG: variable.subr: New variable VAR_USER_GROUPS -> userGroups
DEBUG: variable.subr: New variable VAR_USER_GROUP_DELETE -> userGroupDelete
DEBUG: variable.subr: New variable VAR_USER_HOME -> userHome
DEBUG: variable.subr: New variable VAR_USER_HOME_CREATE -> userHomeCreate
DEBUG: variable.subr: New variable VAR_USER_HOME_DELETE -> userHomeDelete
DEBUG: variable.subr: New variable VAR_USER_LOGIN_CLASS -> userLoginClass
DEBUG: variable.subr: New variable VAR_USER_PASSWORD -> userPassword
DEBUG: variable.subr: New variable VAR_USER_PASSWORD_EXPIRE -> userPasswordExpire
DEBUG: variable.subr: New variable VAR_USER_SHELL -> userShell
DEBUG: variable.subr: New variable VAR_USER_UID -> userUid
DEBUG: variable.subr: New variable VAR_ZFSINTERACTIVE -> zfsInteractive
DEBUG: variable.subr: VARIABLE_SELF_INITIALIZE=[1]
DEBUG: f_variable_set_defaults: Initializing defaults...
DEBUG: f_getvar: var=[debug] value=[1] r=0
DEBUG: f_getvar: var=[editor] value=[/usr/bin/ee] r=0
DEBUG: f_getvar: var=[hostname] value=[pfSense-install] r=0
DEBUG: f_getvar: var=[MEDIA_TIMEOUT] value=[300] r=0
DEBUG: f_getvar: var=[nfs_reserved_port_only] value=[NO] r=0
DEBUG: f_getvar: var=[nfs_use_tcp] value=[NO] r=0
DEBUG: f_getvar: var=[nfs_use_v3] value=[YES] r=0
DEBUG: f_getvar: var=[PKG_TMPDIR] value=[/var/tmp] r=0
DEBUG: f_getvar: var=[releaseName] value=[15.0-CURRENT] r=0
DEBUG: f_variable_set_defaults: Defaults initialized.
DEBUG: variable.subr: Successfully loaded.
DEBUG: f_include_lang: file=[/usr/libexec/bsdconfig/include/messages.subr] lang=[C.UTF-8]
DEBUG: dialog.subr: DIALOG_SELF_INITIALIZE=[1]
DEBUG: f_dialog_init: ARGV=[] GETOPTS_STDARGS=[dD:SX]
DEBUG: f_dialog_init: SECURE=[] USE_XDIALOG=[]
DEBUG: f_dialog_init: dialog(1) API initialized.
DEBUG: dialog.subr: Successfully loaded.
DEBUG: Began Installation at Tue Mar 12 07:19:10 UTC 2024
DEBUG: dialog.subr: DEBUG_SELF_INITIALIZE=[]
DEBUG: UNAME_S=[FreeBSD] UNAME_P=[aarch64] UNAME_R=[15.0-CURRENT]
DEBUG: common.subr: Successfully loaded.
DEBUG: f_debug_init: ARGV=[pfSense-config-restore] GETOPTS_STDARGS=[dD:]
DEBUG: f_debug_init: debug=[1] debugFile=[/tmp/bsdinstall_log]
DEBUG: Running installation step: pfSense-config-restore
DEBUG: dialog.subr: DEBUG_SELF_INITIALIZE=[]
DEBUG: UNAME_S=[FreeBSD] UNAME_P=[aarch64] UNAME_R=[15.0-CURRENT]
DEBUG: common.subr: Successfully loaded.
DEBUG: f_debug_init: ARGV=[pfSense-netconfig] GETOPTS_STDARGS=[dD:]
DEBUG: f_debug_init: debug=[1] debugFile=[/tmp/bsdinstall_log]
DEBUG: Running installation step: pfSense-netconfig
DEBUG: dialog.subr: DEBUG_SELF_INITIALIZE=[]
DEBUG: UNAME_S=[FreeBSD] UNAME_P=[aarch64] UNAME_R=[15.0-CURRENT]
DEBUG: common.subr: Successfully loaded.
DEBUG: f_debug_init: ARGV=[pfSense-sysinfo] GETOPTS_STDARGS=[dD:]
DEBUG: f_debug_init: debug=[1] debugFile=[/tmp/bsdinstall_log]
DEBUG: Running installation step: pfSense-sysinfo
DEBUG: f_getvar: var=[nonInteractive] value=[] r=1
DEBUG: smbios.system.maker=[Marvell]
DEBUG: smbios.system.product=[mvebu_armada-37xx]
DEBUG: smbios.system.version=[]
DEBUG: smbios.planar.maker=[Marvell]
DEBUG: smbios.planar.product=[mvebu_armada-37xx]
DEBUG: dialog.subr: DEBUG_SELF_INITIALIZE=[]
DEBUG: UNAME_S=[FreeBSD] UNAME_P=[aarch64] UNAME_R=[15.0-CURRENT]
DEBUG: common.subr: Successfully loaded.
DEBUG: f_debug_init: ARGV=[pfSense-disk-part] GETOPTS_STDARGS=[dD:]
DEBUG: f_debug_init: debug=[1] debugFile=[/tmp/bsdinstall_log]
DEBUG: Running installation step: pfSense-disk-part
DEBUG: dialog.subr: DEBUG_SELF_INITIALIZE=[]
DEBUG: UNAME_S=[FreeBSD] UNAME_P=[aarch64] UNAME_R=[15.0-CURRENT]
DEBUG: common.subr: Successfully loaded.
DEBUG: f_debug_init: ARGV=[umount] GETOPTS_STDARGS=[dD:]
DEBUG: f_debug_init: debug=[1] debugFile=[/tmp/bsdinstall_log]
DEBUG: Running installation step: umount
DEBUG: f_dialog_max_size: bsddialog --print-maxsize = [MaxSize: 53, 121]
DEBUG: f_getvar: var=[height] value=[8] r=0
DEBUG: f_getvar: var=[width] value=[57] r=0
root@pfSense-install:/tmp #
-
Does it not create the install-log.txt file then?
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Does it not create the install-log.txt file then?
I entered the ls command for the /tmp directory so we could see the contents.
There is no install-log.txt file.
Maybe that can give an indication of where in the upgrade process it all stopped?
-
Yes, exactly. It fails before it starts installing, which makes sense since it appears to fail while trying to partition the disk.
Ok so to boot the installer I assume you are running
run usbrecovery
at the marvell>> prompt? That should erase the eMMC but may fail to do so if the eMMC cannot be written.
You can confirm that by the number of disks uboot reports at boot. It should show 5 disks if the eMMC has been correctly wiped:
USB1: USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 1 for devices... 2 USB Device(s) found
       scanning usb for storage devices... 1 Storage Device(s) found
    18022   armada-3720-netgate-1100.dtb
    18022   armada-3720-sg1100.dtb
    12944   armada-3720-netgate-2100.dtb
    12944   armada-3720-sg2100.dtb
4 file(s), 0 dir(s)
845140 bytes read in 53 ms (15.2 MiB/s)
18022 bytes read in 22 ms (799.8 KiB/s)
## Starting EFI application at 07000000 ...
Card did not respond to voltage select!
Scanning disk sdhci@d0000.blk...
Disk sdhci@d0000.blk not ready
Scanning disk sdhci@d8000.blk...
Scanning disk usb_mass_storage.lun0...
Found 5 disks
If it shows 8 disks there after appearing to wipe the eMMC then it's likely become read-only.
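Since the boot text scrolls past quickly, it may be easier to save the serial session to a file and pull the count out afterwards. The following is only a sketch: the function names and the capture file name are hypothetical, and the 5-versus-8 interpretation is simply the rule of thumb stated above, not something taken from Netgate documentation.

```sh
#!/bin/sh
# Sketch: extract the disk count uboot printed in a saved serial capture.

# Print the number from the last "Found N disks" line in the given file.
found_disks() {
    # sed -n prints only matching lines, reduced to the number itself;
    # tail -n 1 keeps the most recent boot if the capture holds several.
    sed -n 's/^.*Found \([0-9][0-9]*\) disks.*$/\1/p' "$1" | tail -n 1
}

# Turn the count into a hint, using the rule of thumb from this thread.
interpret_disks() {
    case "$1" in
        5) echo "eMMC appears wiped" ;;
        8) echo "eMMC still has partitions; wipe may have failed" ;;
        *) echo "unexpected count: $1" ;;
    esac
}

# Example (serial-capture.txt is an assumed file name):
#   interpret_disks "$(found_disks serial-capture.txt)"
```

Most serial terminal programs can log a session to a file, which is all this needs as input.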
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Yes, exactly. It fails before it starts installing, which makes sense since it appears to fail while trying to partition the disk.
Ok so to boot the installer I assume you are running
run usbrecovery
at the marvell>> prompt? That should erase the eMMC but may fail to do so if the eMMC cannot be written.
Yes, I'm running run usbrecovery at the marvell>> prompt.
You can confirm that by the number of disks uboot reports at boot. It should show 5 disks if the eMMC has been correctly wiped:
If it shows 8 disks there after appearing to wipe the eMMC then it's likely become read-only.
I do recall seeing 8 disks in the serial text that flew by.
I turned on the RAMDISK last night.
I am seeing files (such as log files) being written with today's date/time.
I believe the /var/log directory is in the RAMDISK, so I guess that may explain why there are current log files being written today.
I guess I should turn off the RAMDISK again and see if new log file updates are possible?
[23.09.1-RELEASE][admin@tonka-gw1.duckdns.org]/var/log: ls -al | grep "May"
drwxr-xr-x  4 root wheel     59 May  9 09:50 .
-rw-------  1 root wheel 433505 May  9 10:06 auth.log
-rw-------  1 root wheel 337722 May  9 10:03 dhcpd.log
-rw-------  1 root wheel 512741 May  8 20:23 dhcpd.log.0
-rw-------  1 root wheel 243470 May  9 10:07 filter.log
-rw-------  1 root wheel 520284 May  9 09:50 filter.log.0
-rw-------  1 root wheel 521295 May  9 09:13 filter.log.1
-rw-------  1 root wheel 530007 May  9 08:36 filter.log.2
-rw-------  1 root wheel 523821 May  9 07:58 filter.log.3
-rw-------  1 root wheel 512765 May  9 07:20 filter.log.4
-rw-------  1 root wheel 515820 May  9 06:43 filter.log.5
-rw-------  1 root wheel 519551 May  9 06:06 filter.log.6
-rw-------  1 root wheel 208469 May  8 21:12 nginx.log
-rw-------  1 root wheel 177758 May  8 20:10 ntpd.log
-rw-------  1 root wheel  66845 May  9 10:06 openvpn.log
-rw-------  1 root wheel 512236 May  9 08:57 openvpn.log.0
-rw-------  1 root wheel 511737 May  8 23:55 openvpn.log.1
-rw-------  1 root wheel  23252 May  8 19:48 resolver.log
-rw-------  1 root wheel    970 May  9 10:05 system.log
-rw-------  1 root wheel 511106 May  9 07:21 system.log.0
-rw-r--r--  1 root wheel    394 May  9 10:05 utx.lastlogin
-rw-------  1 root wheel   2037 May  9 10:05 utx.log
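Fresh timestamps only prove the eMMC is writable if /var/log actually lives on it. One way to check, sketched here under assumptions: the function names are made up for illustration, and the guess that a pfSense RAM disk shows up in `df` as `tmpfs` or as a memory disk device like `/dev/md0` has not been verified against every release.

```sh
#!/bin/sh
# Sketch: report whether a path sits on a RAM-backed filesystem.
# Assumption: a RAM disk appears in df output as "tmpfs" or as a memory
# disk device such as /dev/md0; any other name is treated as real storage.

# Print the filesystem (first df column) that holds the given path.
# -P forces one line per filesystem, so line 2 is always the data row.
backing_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Classify a filesystem name as "ram" or "disk".
classify_backing() {
    case "$1" in
        tmpfs|/dev/md*) echo "ram" ;;
        *)              echo "disk" ;;
    esac
}

# Example:
#   classify_backing "$(backing_of /var/log)"
```

If this reports "ram", the new log entries say nothing about the eMMC, and a write test against a path outside the RAM disk would be the more meaningful check.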
-
Hmm, perhaps only the partition table area then. Interesting, I don't think we've seen that before.
If you can confirm that it still shows 8 disks after erasing the emmc, though, then something is preventing the erase.
The same applies if you run usbrecovery without a USB drive present, then reset, and it still boots from emmc.
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Hmm, perhaps only the partition table area then. Interesting, I don't think we've seen that before.
If you can confirm that it still shows 8 disks after erasing the emmc, though, then something is preventing the erase.
The same applies if you run usbrecovery without a USB drive present, then reset, and it still boots from emmc.
I just revised my previous message regarding the log files, because I turned on the RAMDISK last night and I just realized that could confuse the diagnosis.
I will try the update again and capture as much of the serial data as I can, so I can have a snapshot of what messages are going by.