23.05 firmware upgrade crashed a 3100 and an 1100
-
@rlinnemann Also, in our cases it happened on a USB stick install. Both units had the small EFI partition. I've reinstalled on at least one other 2100 without issue, but that was my home unit
and I only had a 4 GB stick, so I used 22.05 rather than 23.01, which was current at the time.
-
@SteveITS also be careful about leaving USB sticks inserted: they can share the glabels that we use to identify the ESP on the root device, and can get in the way. I intend to make this more robust in the near future as well.
-
@SteveITS said in 23.05 firmware upgrade crashed a 3100 and an 1100:
@rlinnemann Ah, thanks for digging into it. The weirdest part is it would boot fine the first time.
This is a detail that I had not seen specified before, and it potentially explains something that was puzzling me about the loader failing to mount the zroot, which I didn't expect to have been upgraded at the point that the system first reboots. The zpool may be undergoing a backward-incompatible change after booting the new kernel for the first time, which would explain this behavior. I'll be looking further into that as well.
-
@rlinnemann Always fun to get new puzzle pieces. Info in the thread I mentioned above, and sort of hijacked: https://forum.netgate.com/topic/180432/certificate-verification-failed/7
-
@rlinnemann re: leaving the stick in, I know for sure my case a while back was not that since it happened after I shut it down and mounted it back on the wall. (What! It was working!) The case a week or two ago was a coworker on a different 2100 but I don’t think that was the issue there.
In my case I assumed it was dying because it didn’t boot after a power outage, though I didn’t even try to diagnose it because it had the small EFI. I also didn’t write down the error since I assumed it was dying.
-
@rlinnemann FWIW a coworker reinstalled the "dead" 2100 with the same 23.05 USB he used a couple weeks ago and it seems to be fine in very limited usage. He's restarted it several times.
-
-
@SteveITS I'd be very surprised if the loader failed to copy on a recovery install. The upgrade failures that we hit updating the EFI loader are mostly or entirely due to complications with live systems that may have arbitrary filesystems mounted, additional storage media that aliases labels, IO errors or out of space conditions, etc. I'm glad to hear the 2100 is back in action. Thanks for your feedback!
-
@rlinnemann Yeah me too for obvious wipe and repartition reasons. Yet it seems to have happened twice on two units, after the restore from GUI.
That (same) 23.01 stick was very probably created from a partner vault download but I don’t recall offhand if it was shortly after 23.01 released or later when I created the stick with it. Either way, time for a new stick.
-
@SteveITS just to clarify, did you restore from a 23.05 USB image or did you restore to 23.01 and upgrade?
-
@rlinnemann The most recent attempt was in that other thread but to recap in one spot with more detail, we had two scenarios. Both 2100s had the small EFI to begin with so needed a reinstall.
-
1) May 5. A client had a short power event. We found the UPS was defective, with 10 seconds of runtime. The router did not boot up afterwards. On the phone they said they unplugged it. I went there with a spare 2100 and a 23.01 USB install... that image was downloaded February 18, by the way; I looked now that I'm logged in. I didn't even try to troubleshoot it, I just reinstalled since I knew we had to. It booted up, I restored the config file in the web GUI, and then realized it didn't boot up after. Figured it was dead, and had about 15 minutes left, so put the new 2100 in place. I didn't really pay attention to the errors at the time.
-
2) June 6. A coworker went out to reinstall on a 2100 at a client's office to get it over with. He used the same USB I made since he didn't want to burn another. Installed, restored the config file in the web GUI, and it wouldn't boot, as above. Those were the logs and screen caps in the other thread. He reinstalled again, it failed the same way again, and he called me. Fortunately he could plug his laptop into the Comcast modem and download 23.05 to burn on a spare USB. He installed that and it was fine after.
In hindsight we think 2) was the same issue as 1). This week the same tech used his USB stick to install 23.05 on the router from 1) above and it seems fine after a few reboots.
I still have the "bad" USB; it's on my desk at work. I'm not sure how the USB stick or its image could cause a problem like this, but it hasn't been recycled yet.
When I reinstalled on my 2100 at home, a few months ago, I used 22.05 because I couldn't find an 8 GB stick at home. (edit: Etcher, btw, doesn't warn you about the space needed if you try to write the compressed image directly, it just fails x% of the way through)
Edit: I had found that other thread because there are very few search results for the "cannot open /boot/lua/loader.lua" error.
-
-