UEFI VM upgrade failure
-
I'm seeing something similar after upgrading from 23.09.01 to 24.03 in a Proxmox VM on one of my systems (but not the other). During boot it sometimes freezes and pins one CPU core at 100% indefinitely. With enough reboots it gets past the freeze and boots normally. I believe the upgrade itself went fine; it just doesn't boot reliably now.
-
@mikebenna That's not the same error, I forked the topic.
What's the difference between those VMs?
Does one of them have a serial port by any chance?
https://redmine.pfsense.org/issues/15074
Steve
-
Yes, both have serial ports because I ran into that issue in the past. Good guess, though. :)
Working:
- CPU i3-N305
- PCI passthrough for both WAN NICs
- virtio driver for LAN NIC
- USB passthrough for UPS communications
Crashing:
- CPU i7-8700K
- PCI passthrough for one WAN NIC
- virtio driver for other WAN NIC
- virtio driver for LAN NIC
- No USB devices passed through
Boot drives are virtio on Ceph for both.
-
They're both UEFI I assume?
That boot error is a UEFI boot issue. It's hanging at the EFI framebuffer. We did see some problems with that in early 24.03 versions but not for months.
-
Actually, now that you mention it, I just realized the working system is not UEFI but the crashing one is UEFI. This is indeed sounding more and more UEFI-related.
Let me know if I can help further.
Mike
-
@stephenw10 I saw something like this with a CE install and the newest system patches yesterday. I restarted that VM in Proxmox again and everything was good. It's a UEFI installation with ZFS, but I don't remember what the screen was actually showing. I rebooted that host again today after upgrading Proxmox to the newest version, and there have been no problems so far for that pfSense VM.
-
@mikebenna said in UEFI VM upgrade failure:
the working system is not UEFI but the crashing one is UEFI.
Aha! OK what Proxmox version? I have a bunch of pfSense VMs in Proxmox that I've upgraded numerous times and some of them are UEFI. I haven't seen any issues for months there.
Any other details on the VM you can provide would probably help.
-
@stephenw10 said in UEFI VM upgrade failure:
Aha! OK what Proxmox version? [...]
Any other details on the VM you can provide would probably help.
Proxmox Virtual Environment 8.2.2. Also, I just rebooted the host to rule out anything lingering (it had been up for ~79 days). This time pfSense 24.03 hung 16 times in a row. On the 17th try, I stopped at the boot menu by pressing space, pressed 6 a few times to rotate through kernel versions (and then back to 1), then pressed 1 for multi-user boot and it booted fine. I wonder if the delay while I was in the menu caused different behavior.
Here's the 'summary' for the host in question before and after reboot (note kernel version numbers changed due to upgrade that was waiting for the reboot), followed by the hardware page for the VM in question:
Host summary page after reboot:
Hardware for the VM:
-
Hmm, newer Proxmox version than me....
Likely the same issue as this: https://redmine.pfsense.org/issues/14773
What CPU type are you using for the VM? Can you try kvm64 if you're not using that?
-
@stephenw10 said in UEFI VM upgrade failure:
What CPU type are you using for the VM? Can you try kvm64 if you're not using that?
That screen capture looks identical; good find. I was using CPU type 'host' (because of PCI pass-through it would never migrate to another CPU anyway). Now trying kvm64... wow, 5 reboots in a row all booted successfully. This seems to have worked around the problem!
Thank you!
ps: fwiw having two VMs set up in HA mode is awesome. Rebooting the main router repeatedly all day and nobody is noticing any downtime. :)
-
Ah nice. Also confirms an issue exists there. And still exists in 24.03-REL.
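For anyone wanting to check their own VMs for the combination that hung in this thread (UEFI firmware plus CPU type 'host'), here is a rough Python sketch. It assumes you paste in the "key: value" text printed by `qm config <vmid>` on the Proxmox host; the function names and the sample config are made up for illustration, and the defaults used when `bios` or `cpu` are absent are my assumption, so treat the result as a hint only.

```python
def parse_qm_config(text):
    """Parse 'key: value' lines as printed by `qm config <vmid>`."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as MAC addresses survive.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def is_uefi_with_host_cpu(config):
    """True when the VM uses OVMF (UEFI) firmware together with CPU type 'host'."""
    # Assumption: a missing 'bios' key means SeaBIOS, a missing 'cpu' key means kvm64.
    bios = config.get("bios", "seabios")
    cpu_field = config.get("cpu", "kvm64").split(",")[0]
    # The CPU type may be written as 'host' or as 'cputype=host'.
    cpu_type = cpu_field.split("=", 1)[-1]
    return bios == "ovmf" and cpu_type == "host"


def workaround_command(vmid):
    """Build the command that switches the VM to CPU type kvm64 (not executed here)."""
    return ["qm", "set", str(vmid), "--cpu", "kvm64"]


# Hypothetical sample, not taken from the VM discussed in this thread.
sample = """bios: ovmf
cores: 4
cpu: host
memory: 4096
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0
"""

if is_uefi_with_host_cpu(parse_qm_config(sample)):
    print("Candidate for the workaround:", " ".join(workaround_command(101)))
```

The sketch only prints the `qm set` command rather than running it, since the CPU type change takes effect on the next full stop and start of the VM and you will probably want to schedule that yourself.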