SG-1100 update from 23.09.1 to 24.03 keeps failing
-
Does it not create the install-log.txt file then?
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Does it not create the install-log.txt file then?
I entered the ls command for the /tmp directory so we could see the contents.
There is no install-log.txt file.
Maybe that can give an indication of where in the upgrade process it all stopped?
-
Yes, exactly. It fails before it starts installing, which makes sense since the failure appears to happen while partitioning the disk.
Ok so to boot the installer I assume you are running
run usbrecovery
at the marvell>> prompt? That should erase the eMMC but may fail to do so if the eMMC cannot be written.
You can confirm that by the number of disks uboot reports at boot. It should show 5 disks if the eMMC has been correctly wiped:
USB1: USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 1 for devices... 2 USB Device(s) found
scanning usb for storage devices... 1 Storage Device(s) found
18022 armada-3720-netgate-1100.dtb
18022 armada-3720-sg1100.dtb
12944 armada-3720-netgate-2100.dtb
12944 armada-3720-sg2100.dtb
4 file(s), 0 dir(s)
845140 bytes read in 53 ms (15.2 MiB/s)
18022 bytes read in 22 ms (799.8 KiB/s)
## Starting EFI application at 07000000 ...
Card did not respond to voltage select!
Scanning disk sdhci@d0000.blk...
Disk sdhci@d0000.blk not ready
Scanning disk sdhci@d8000.blk...
Scanning disk usb_mass_storage.lun0...
Found 5 disks
If it shows 8 disks there after appearing to wipe the eMMC then it's likely become read-only.
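If the serial output scrolls by too fast to read, one option is to capture it to a file from the terminal program and pull out the count afterwards. This is only a rough sketch: the function names are made up for illustration, and the 5 versus 8 rule is simply the rule of thumb from the post above for this particular setup.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of pfSense or U-Boot.

# Print the disk count from the last "Found N disks" line in a captured
# U-Boot serial log read from stdin.
found_disk_count() {
    sed -n 's/.*Found \([0-9][0-9]*\) disks.*/\1/p' | tail -n 1
}

# Classify the count using the rule of thumb from the post above
# (5 = eMMC wiped, 8 = eMMC still holds its partitions).
classify_disk_count() {
    case "$1" in
        5) echo "wiped" ;;
        8) echo "not wiped (eMMC possibly read-only)" ;;
        *) echo "unknown" ;;
    esac
}
```

With the capture saved as, say, serial-capture.log (a placeholder name), it could be used as: classify_disk_count "$(found_disk_count < serial-capture.log)"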
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Yes, exactly. It fails before it starts installing, which makes sense since the failure appears to happen while partitioning the disk.
Ok so to boot the installer I assume you are running
run usbrecovery
at the marvell>> prompt? That should erase the eMMC but may fail to do so if the eMMC cannot be written.
Yes, I'm running run usbrecovery at the marvell>> prompt.
You can confirm that by the number of disks uboot reports at boot. It should show 5 disks if the eMMC has been correctly wiped.
If it shows 8 disks there after appearing to wipe the eMMC then it's likely become read-only.
I do recall seeing 8 disks in the serial text that flew by.
I turned on the RAMDISK last night.
I am seeing files (such as log files) being written with today's date/time.
I believe the /var/log directory is in the RAMDISK, so I guess that may explain why there are current log files being written today.
I guess I should turn off the RAMDISK again and see if new log file updates are possible?
[23.09.1-RELEASE][admin@tonka-gw1.duckdns.org]/var/log: ls -al | grep "May"
drwxr-xr-x  4 root wheel     59 May  9 09:50 .
-rw-------  1 root wheel 433505 May  9 10:06 auth.log
-rw-------  1 root wheel 337722 May  9 10:03 dhcpd.log
-rw-------  1 root wheel 512741 May  8 20:23 dhcpd.log.0
-rw-------  1 root wheel 243470 May  9 10:07 filter.log
-rw-------  1 root wheel 520284 May  9 09:50 filter.log.0
-rw-------  1 root wheel 521295 May  9 09:13 filter.log.1
-rw-------  1 root wheel 530007 May  9 08:36 filter.log.2
-rw-------  1 root wheel 523821 May  9 07:58 filter.log.3
-rw-------  1 root wheel 512765 May  9 07:20 filter.log.4
-rw-------  1 root wheel 515820 May  9 06:43 filter.log.5
-rw-------  1 root wheel 519551 May  9 06:06 filter.log.6
-rw-------  1 root wheel 208469 May  8 21:12 nginx.log
-rw-------  1 root wheel 177758 May  8 20:10 ntpd.log
-rw-------  1 root wheel  66845 May  9 10:06 openvpn.log
-rw-------  1 root wheel 512236 May  9 08:57 openvpn.log.0
-rw-------  1 root wheel 511737 May  8 23:55 openvpn.log.1
-rw-------  1 root wheel  23252 May  8 19:48 resolver.log
-rw-------  1 root wheel    970 May  9 10:05 system.log
-rw-------  1 root wheel 511106 May  9 07:21 system.log.0
-rw-r--r--  1 root wheel    394 May  9 10:05 utx.lastlogin
-rw-------  1 root wheel   2037 May  9 10:05 utx.log
-
Hmm, perhaps only the partition table area then. Interesting, I don't think we've seen that before.
If you can confirm that it still shows 8 disks after erasing the eMMC, though, then something is preventing the erase.
The same applies if you run usbrecovery without a USB drive present, then reset, and it still boots from eMMC.
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Hmm, perhaps only the partition table area then. Interesting, I don't think we've seen that before.
If you can confirm that it still shows 8 disks after erasing the eMMC, though, then something is preventing the erase.
The same applies if you run usbrecovery without a USB drive present, then reset, and it still boots from eMMC.
I just revised my previous message regarding the log files, because I turned on the RAMDISK last night and I just realized that could confuse the diagnosis.
I will try the update again and capture as much of the serial data as I can, so I can have a snapshot of what messages are going by.
-
I looked at the RAMDISK settings and they were off?!
I guess turning on the RAMDISK didn't take effect, even though I did restart, as advised by the pop-up message.
I just clicked the RAMDISK box to ON and then OFF and then rebooted to be sure the RAMDISK was off.
It shows it's off and the /var/log still has entries being written during the last 5 minutes.
[23.09.1-RELEASE][admin@redacted]/var/log: ls -la | grep "May"
drwxr-xr-x  4 root wheel     56 May  9 11:41 .
-rw-------  1 root wheel 400581 May  9 12:05 auth.log
-rw-------  1 root wheel 256298 May  9 12:06 dhcpd.log
-rw-------  1 root wheel 500703 May  9 12:06 filter.log
-rw-------  1 root wheel 549513 May  9 11:41 filter.log.0
-rw-------  1 root wheel 195573 May  9 12:03 nginx.log
-rw-------  1 root wheel 177772 May  9 11:57 ntpd.log
-rw-------  1 root wheel 295779 May  9 12:05 openvpn.log
-rw-------  1 root wheel  41370 May  9 11:40 resolver.log
-rw-------  1 root wheel 507949 May  9 12:03 system.log
-rw-r--r--  1 root wheel    394 May  9 12:02 utx.lastlogin
-rw-------  1 root wheel   2037 May  9 12:02 utx.log
-
@mrneutron The RAM disk setting requires a restart to turn on.
The log file will always have current timestamps. It's just a matter of where they are being saved.
If you are concerned about logs, you may want to put a number in that Log Directory field, for instance, to copy logs to eMMC storage every "n" hours. Otherwise they are copied during a clean halt/restart.
If it is on, it will show in the dashboard Disks widget.
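From a shell, a rough way to see where a given log path is actually being saved is to look up which mounted filesystem covers it. The sketch below reads fstab-format text as printed by mount -p. The function name is made up for illustration, and it is an assumption that a RAM-disk-backed /var would show a type other than zfs (for example tmpfs) on this system.

```shell
#!/bin/sh
# Sketch only: fs_type_for_path is a hypothetical helper name.

# Read fstab-format lines (device, mount point, type, options, ...) on stdin
# and print the filesystem type of the mount that covers the path in $1.
fs_type_for_path() {
    awk -v path="$1" '
        BEGIN { best_len = 0 }
        {
            mp = $2
            # A mount covers the path if it is "/", equals the path,
            # or is a parent directory of the path.
            if (mp == "/" || mp == path || index(path, mp "/") == 1) {
                # Prefer the longest mount point; on a tie the later entry wins.
                if (length(mp) >= best_len) { best_len = length(mp); best = $3 }
            }
        }
        END { if (best != "") print best }
    '
}
```

Usage would be something like: mount -p | fs_type_for_path /var/log/system.log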
-
@SteveITS said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
@mrneutron The RAM disk setting requires a restart to turn on.
The log file will always have current timestamps. It's just a matter of where they are being saved.
If you are concerned about logs, you may want to put a number in that Log Directory field, for instance, to copy logs to eMMC storage every "n" hours. Otherwise they are copied during a clean halt/restart.
If it is on, it will show in the dashboard Disks widget.
@SteveITS, right now, I'm trying to determine if my emmc chip is still working correctly. Thus, I'm trying to determine if any files are being successfully written to it.
-
@mrneutron Ah, I see. IIRC, with ZFS(?) you need to restart to "lose" files, since caching hides the problem until then. Try creating a file and restarting, to see if it's still there.
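A slightly more careful version of that test is to write a file with a unique token, reboot, and then check that the same token is still there. This is only a sketch: the function names and the .write-test file name are made up for illustration, and a successful sync does not prove the eMMC accepted the write, which is why the check has to happen after a reboot.

```shell
#!/bin/sh
# Sketch only: function names and the .write-test file name are hypothetical.

# Write a marker file with a unique token into directory $1 and print the token.
write_marker() {
    dir=$1
    token="persist-$(date +%s)-$$"
    printf '%s\n' "$token" > "$dir/.write-test" || return 1
    sync    # ask the kernel to flush pending writes to storage
    printf '%s\n' "$token"
}

# Return 0 if the marker in directory $1 still holds token $2.
# Meant to be run after the reboot.
check_marker() {
    [ -f "$1/.write-test" ] && [ "$(cat "$1/.write-test")" = "$2" ]
}
```

Before the reboot, run write_marker /root and note the token it prints. After the reboot, run check_marker /root with that token and look at the exit status.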
-
I used the touch command to create files in the / directory and /var/log directories.
Both files were gone after a normal reboot.
I guess this is more evidence that the emmc is read-only, now?
I read that the mount command is supposed to show if the mounted partitions are ro or rw. It doesn't show ro or rw for any of them.
[23.09.1-RELEASE][admin@redacted]/: mount
pfSense/ROOT/default on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
pfSense/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
pfSense/home on /home (zfs, local, noatime, nfsv4acls)
pfSense/var on /var (zfs, local, noatime, nfsv4acls)
pfSense/var/cache on /var/cache (zfs, local, noatime, noexec, nosuid, nfsv4acls)
pfSense/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
pfSense/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
pfSense/var/db on /var/db (zfs, local, noatime, noexec, nosuid, nfsv4acls)
pfSense/ROOT/default/cf on /cf (zfs, local, noatime, noexec, nosuid, nfsv4acls)
pfSense/ROOT/default/var_cache_pkg on /var/cache/pkg (zfs, local, noatime, noexec, nosuid, nfsv4acls)
pfSense/ROOT/default/var_db_pkg on /var/db/pkg (zfs, local, noatime, noexec, nosuid, nfsv4acls)
tmpfs on /var/run (tmpfs, local)
devfs on /var/dhcpd/dev (devfs)
-
I don't think mount would show it. They are not mounted read-only.
mount -p
should show it though if they were.
-
Huh...all the Linux examples I saw of the mount command had only mount (no -p).
Is BSD usage different?
Using mount -p they are all supposedly mounted rw, but they don't hold what I write to them.
[23.09.1-RELEASE][admin@redacted]/: mount -p
pfSense/ROOT/default / zfs rw,noatime,nfsv4acls 0 0
devfs /dev devfs rw 0 0
pfSense/tmp /tmp zfs rw,nosuid,noatime,nfsv4acls 0 0
pfSense/home /home zfs rw,noatime,nfsv4acls 0 0
pfSense/var /var zfs rw,noatime,nfsv4acls 0 0
pfSense/var/cache /var/cache zfs rw,noexec,nosuid,noatime,nfsv4acls 0 0
pfSense/var/tmp /var/tmp zfs rw,nosuid,noatime,nfsv4acls 0 0
pfSense/var/log /var/log zfs rw,noexec,nosuid,noatime,nfsv4acls 0 0
pfSense/var/db /var/db zfs rw,noexec,nosuid,noatime,nfsv4acls 0 0
pfSense/ROOT/default/cf /cf zfs rw,noexec,nosuid,noatime,nfsv4acls 0 0
pfSense/ROOT/default/var_cache_pkg /var/cache/pkg zfs rw,noexec,nosuid,noatime,nfsv4acls 0 0
pfSense/ROOT/default/var_db_pkg /var/db/pkg zfs rw,noexec,nosuid,noatime,nfsv4acls 0 0
tmpfs /var/run tmpfs rw 0 0
devfs /var/dhcpd/dev devfs rw 0 0
-
The -p switch causes it to print the output in fstab format and include verbose info.
https://man.freebsd.org/cgi/man.cgi?query=mount
Yes, so ZFS thinks it's all mounted read-write, but that doesn't help if the actual storage is read-only.
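For anyone who wants to pick the rw/ro flag out of that fstab-format output without reading every line, something like the following could work. It is a minimal sketch with a made-up function name, and it only reports the mount options. As noted above, rw here says nothing about whether the underlying storage still accepts writes.

```shell
#!/bin/sh
# Sketch only: mount_access_modes is a hypothetical helper name.

# Read fstab-format lines (as printed by `mount -p`) on stdin and print
# "mountpoint mode" per entry, where mode is rw, ro, or "?" if neither
# option appears in the options field.
mount_access_modes() {
    awk '
        NF >= 4 {
            mode = "?"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
            print $2, mode
        }
    '
}
```

Usage would be: mount -p | mount_access_modes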
-
Yeah, it sure looks like the emmc in my SG-1100 is no longer writable (upgradable).
It still boots in the configuration I had it set in, but I wonder how long before it goes non-functional?
"Certain types of disks, such as SSD and eMMC disks, may fail into a read only state where disk writes fail or are discarded, but data can still be read."
from: https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-lifetime.html
-
Yes unfortunately I have to agree. You can try installing to USB. I tested that yesterday.
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Yes unfortunately I have to agree. You can try installing to USB. I tested that yesterday.
You can run the pfsense OS off of a USB flash drive, plugged into the SG-1100?
-
Yes, the new Net Installer allows you to select another USB drive as the target.
-
@stephenw10 said in SG-1100 update from 23.09.1 to 24.03 keeps failing:
Yes, the new Net Installer allows you to select another USB drive as the target.
Can you use either USB port on the SG-1100? (like the USB3 port?)
Does loading into a USB flash drive automatically set the boot config to point to the USB flash drive as the boot drive?
If not, how do you manually edit the boot config?
-
Technically you could use either port, and the USB3 port would likely be much faster. However, when you run either
usbrecovery
or
usbboot
it boots the first USB device, and if you have two drives inserted that is the one in the USB3 port. That means that when you boot the installer, the target drive must be in the USB2 slot. That shouldn't matter; I'd expect to be able to move the drive after installing. However, I was unable to do so. The 1100 can be picky about booting USB drives though, so it could just be the drive I'm using.
The default boot device is not automatically changed. You need to set the bootcmd uboot env from the uboot prompt like:
setenv bootcmd 'run usbboot; run emmcboot;'
saveenv
Now in your case you may have an additional issue: the eMMC is still holding a ZFS filesystem and cannot be wiped. If that is the case, you'd need to either make sure the ZFS pool names are different or use UFS on the USB drive.
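One possible way to check for a pool name clash, from a shell where both the eMMC and the USB drive are visible, is to collect the names of the imported pools and the importable ones and look for repeats. This is a sketch under assumptions: the function names are made up, and it relies on zpool import (run with no arguments) printing a "pool: NAME" line for each pool it finds.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical helpers, not zpool subcommands.

# Extract pool names from `zpool import` listing text on stdin,
# which is assumed to contain "pool: NAME" lines.
importable_pool_names() {
    awk '$1 == "pool:" { print $2 }'
}

# Read pool names (one per line) on stdin and print any name
# that appears more than once.
duplicate_pool_names() {
    sort | uniq -d
}
```

Usage could look like: { zpool list -H -o name; zpool import | importable_pool_names; } | duplicate_pool_names
Any name it prints exists on more than one device.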
-