24.03 Update not booting
-
I might be hitting an issue with the EFI boot with an error "efibootmgr: efi variables not supported on this system. root? kldload efirt?"
Updating boot code...
/usr/local/sbin/../libexec/install-boot.sh -b auto -d /tmp/be_mount.1E54 -f zfs -s gpt -u ada0
gpart bootcode -b /tmp/be_mount.1E54/boot/pmbr -p /tmp/be_mount.1E54/boot/gptzfsboot -i 1 ada0
partcode written to ada0p1
bootcode written to ada0
No ESP partition found...skipping.
Done.
>>> Copying upgrade log...done.
>>> Unmounting upgraded boot environment...done.
>>> Activating auto-default-20231226103555_20240423111842 for the next boot only...done.
System is going to be upgraded. Rebooting in 10 seconds.
>>> Unlocking package pkg...done.
Success
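For anyone comparing their own upgrade output against this one, here is a minimal Python sketch that flags the boot-related lines. The marker strings are copied from the output and error quoted in this post; the function name and the short interpretations attached to each marker are assumptions for illustration, not official descriptions.

```python
# Marker strings come from the output quoted in this thread.
# The notes next to them are a hedged reading, not official documentation.
BOOT_MARKERS = {
    "partcode written to": "gpart wrote the partition boot code (gptzfsboot here)",
    "bootcode written to": "gpart wrote the disk boot code (pmbr here)",
    "No ESP partition found": "no EFI system partition was found, so that step was skipped",
    "efi variables not supported": "efibootmgr could not access EFI variables",
}

# Excerpt of the upgrade output from this post, used only as sample input.
SAMPLE_LOG = (
    "partcode written to ada0p1\n"
    "bootcode written to ada0\n"
    "No ESP partition found...skipping.\n"
    "Done.\n"
)


def find_boot_markers(log_text):
    """Return the marker strings that occur in log_text (case-insensitive)."""
    lowered = log_text.lower()
    return [marker for marker in BOOT_MARKERS if marker.lower() in lowered]


if __name__ == "__main__":
    for marker in find_boot_markers(SAMPLE_LOG):
        print(f"{marker}: {BOOT_MARKERS[marker]}")
```

If both "No ESP partition found" and the efibootmgr message show up, one possible reading is that the box boots in legacy (non-UEFI) mode, in which case the efibootmgr message may be unrelated to the hang.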
-
Not seeing that error there.
What do you see at the console?
What were you upgrading from?
What hardware is that?
If efibootmgr is not supported it can't add a new entry, but the existing UEFI variables should still boot.
Steve
-
Upgrading from 23.09.1 on my own firewall box (Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz).
I'll get into the console and get a screenshot of where it's hanging this evening.
-
Do you have the log showing the efibootmgr error?
-
@stephenw10 I don't have the log. The system hangs trying to boot 24.03; when I restart, it falls back to 23.09.1. I'll have to pull the box later and connect it to a monitor to see exactly where it's hanging.
-
Ah so the upgrade appears to complete successfully but the result just fails to boot?
-
Got a possible culprit
https://forum.netgate.com/topic/187655/24-03-install-failed-in-1-out-of-3
-
@stephenw10 Here's the console where it hangs when trying to boot. And yes, the upgrade completes successfully, but the boot hangs at this point.
-
Ah that's a drive or drive controller error. It's possible the new BE is using some bad part of the drive. Also possible the newer drivers in 24.03 are causing some problem there.
Do you know what drive/drive controller it's using?
-
@stephenw10 Intel Pro/1000 5.6.10 PCI-E
Samsung 256GB SSD.
I did try updating again and using another boot environment, same issue persists. I also got rid of some older BEs and tried that, it still fails. This was a problem with the development builds as well.
-
Can you connect a serial console and get a full log from the upgrade?
Do you see any errors in the 23.09.1 boot log from the drive or controller when they are detected?
-
@stephenw10 I actually ordered a console cable last night so I can troubleshoot during the install. I do have the full update log from the GUI during the update process.
I looked at the BootOS log and didn't see anything abnormal, although it's only the log for the 23.09.1 BE, there's nothing for 24.03.
Another interesting thing is that after the update, my 23.09.1 BE shows errors in the package manager. These packages should all be up to date.
-
It's because the repo branch is set to 24.03, so it's seeing the newer versions there. You can't install them because they are not compatible with 23.09.1. If you set the branch back, those should show as up to date again, unless there actually are updates.
-
Hi,
I am curious about your findings, as my system probably behaves the same. No clue whether it is also a driver issue, but it always worked fine with 23.09.1 and does not start with 24.03. (The old boot log shows a missing driver for my onboard WiFi, but this should certainly not prevent the new system from starting.)
Regards, Michael
-
@stephenw10 Ah, makes sense. Didn't realize it stayed on that branch after reverting back.
Here's the update log from the GUI for 24.03, if that helps: Update.txt
-
@stephenw10 Here's the boot log for 23.09.1: boot log.txt
-
@stephenw10 : In case it helps, here is also my dmesg.txt from 23.09.1.
-
OK, different drives and different drive controllers.
@Mike-moon Are you actually seeing the same identical error?
@hypnosis4u2nv Your boot log from 23.09.1 shows the same error:
ahci0: <Intel Sunrise Point-LP AHCI SATA controller> port 0xf090-0xf097,0xf080-0xf083,0xf060-0xf07f mem 0xdf628000-0xdf629fff,0xdf62c000-0xdf62c0ff,0xdf62b000-0xdf62b7ff irq 16 at device 23.0 on pci0
ahci0: AHCI v1.31 with 3 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
...
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <SAMSUNG MZMTE256HMHP-000MV EXT41M0Q> ACS-2 ATA SATA 3.x device
ada0: Serial Number S1F1NSAG782964
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 244198MB (500118192 512 byte sectors)
ahcich0: Timeout on slot 28 port 0
ahcich0: is 00000000 cs 00000000 ss 10000000 rs 10000000 tfd 40 serr 00000000 cmd 0000c217
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 10 90 2d 20 40 0b 00 00 00 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain
It seems likely to be the disk. Can you test a different drive?
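To check another boot log or dmesg for the same pattern, a rough Python sketch like the following could count the AHCI channel timeouts and CAM status errors. The regular expressions are modelled only on the lines quoted above, and the function and variable names are hypothetical; other controllers or drivers may log these events differently.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this thread (an assumption:
# other drivers may format these messages differently).
TIMEOUT_RE = re.compile(r"(ahcich\d+): Timeout on slot (\d+) port (\d+)")
CAM_RE = re.compile(r"\((\w+):(ahcich\d+):[\d:]+\): CAM status: ([^(\n]+)")

# Excerpt from the 23.09.1 boot log quoted above, used only as sample input.
SAMPLE_LOG = (
    "ahcich0: Timeout on slot 28 port 0\n"
    "(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 10 90 2d 20 40 0b 00 00 00 00 00\n"
    "(ada0:ahcich0:0:0:0): CAM status: Command timeout\n"
    "(ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain\n"
)


def scan_boot_log(text):
    """Count AHCI channel timeouts and CAM status errors found in text.

    Returns (timeouts, cam_errors):
      timeouts   maps channel name -> number of timeout lines
      cam_errors maps (device, status text) -> number of occurrences
    """
    timeouts = Counter(m.group(1) for m in TIMEOUT_RE.finditer(text))
    cam_errors = Counter(
        (m.group(1), m.group(3).strip()) for m in CAM_RE.finditer(text)
    )
    return timeouts, cam_errors


if __name__ == "__main__":
    timeouts, cam_errors = scan_boot_log(SAMPLE_LOG)
    for channel, count in timeouts.items():
        print(f"{channel}: {count} timeout(s)")
    for (device, status), count in cam_errors.items():
        print(f"{device}: {status} x{count}")
```

Running it against logs from both boot environments, if a 24.03 log can be captured over serial, would show whether the timeouts are more frequent there.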
-
@stephenw10 I don't have a spare drive. It wouldn't hurt to get another one to swap in, although I'll be starting with a fresh install when I do. It's interesting, though, that 23.09.1 continues to boot past the error while 24.03 hangs.
-
@stephenw10 : It seems that I do not have exactly the same error. My system seems to hang already in the bootloader.