Best way to upgrade 2.5.2ce to 22.05 plus
-
@stephenw10 A quick check shows yes, the update step is now back on 22.01 with only a few minor packages to update for the switch. That is a bit more comforting now :)
Thanks for having a look!
-
No problem, thanks for pointing it out.
It should be possible to go directly to 22.05 once the quirks are resolved.
Steve
-
@stephenw10 Sadly, it seems something was really bonkers with the upgrade. I don't know if it had something to do with the system having already seen the 22.05 repo or something similar, but after selecting "pfSense Plus Upgrade", what should have been a quick update and reboot to 22.01 turned into a nightmare: a reboot, a "corrupt config.xml - 0 bytes" message, no config in /cf/conf and NO backups anymore in /cf/conf/backups. In fact the backups folder was erased completely. So something went very wrong there.
I don't know if it's related to the repo pointing to 22.05 before or if there's something else messy with switching from 2.6 to 22.01, but we got the "corrupt config" and "broken installation" twice after reboot. So we cut our losses while we were still in our maintenance window, did a reinstall of 2.6 on the machine (glad it was the standby node) and restored the config from our backup.
Would be nice if you could check if there's another thing that leaves upgrades from normal HW stranded with no config and wiped backups. The system also was still on 2.6 after the boot, so somehow it didn't even install the 22.01 kernel or meta packages. Very weird!
Cheers
\jens
-
Hmm, that's odd. I tested it here after reverting that and several systems upgraded no problem.
Do you have any logs from the upgrades?
-
@stephenw10 I had to check whether the logserver still received anything of interest, because as we had to abandon and reinstall, there are of course no logs on the device itself.
Otherwise I only found 2 messages from pkg-static before the reboot saying that -kernel and -rc were updated to 22.01, then a reboot by root. Just after rebooting, the system hung in a limbo state due to /cf/conf being completely empty, as if that ZFS dataset had been wiped clean. And as the system hung without configuration, there was of course no logging to the logserver :/
-
@jegr said in Best way to upgrade 2.5.2ce to 22.05 plus:
2 messages from pkg-static before the reboot that -kernel and -rc were updated to 22.01 then reboot by root
That was after the update to 22.01?
If that was during the upgrade to 22.05 that looks unexpected.
-
@stephenw10 That was indeed after selecting "pfSense Plus Upgrade" and hitting option 13 via SSH to upgrade. It showed around 9 or 10 packages to upgrade, downloaded them and then rebooted to complete it, but was then stuck with no config anymore.
-
Hmm, any history on those units? They were on 2.6 and failed at the upgrade to 22.01?
But they did 'see' the 22.05 repo for a short time? Were they ever set to use the 2.7 repo branch?
-
No history on that. Both boxes were freshly installed with 2.6 because of the ZFS changes beforehand (and had problems with the large filter set bug that was hotfixed with system_patches). So we installed both nodes from scratch, installed system_patches for the hotfixes and then did a restore of the configs. That worked very well and the cluster has been up and running well since then.
As for the upgrade, we registered both (actually only the primary one, as the secondary node still had active TAC Pro because we had to diagnose the 2.6/pf ruleset bug with support) and I switched the secondary to "Upgrade" as it's labelled in the Update menu. After switching, we had 22.05 as target instead of 22.01, but no package or update was triggered at that point.
That was when I got in contact here in the forums and you guys switched back to 22.01. Just before the upgrade I made sure the target was still 22.01 by switching back to stable (2.6) and back to upgrade (22.01 was shown). Then I got into the box via SSH and did "13" (upgrade) via console, got the list of ~12 packages that would be upgraded, and they were downloaded. Then the system did the 10s countdown and rebooted.
After waiting more than 10 minutes for it to come back (they're not the fastest booting, those UEFI things...) it was still missing, so I checked via the internal IPMI and found the console in a "broken" state. It looked pretty much like this:

FreeBSD/arm64 (Amnesiac) (ttyu0)

Config.xml is corrupted and is 0 bytes. Could not restore a previous backup.

 0) Logout (SSH only)                  9) pfTop
 1) Assign Interfaces                 10) Filter Logs
 2) Set interface(s) IP address       11) Restart webConfigurator
 3) Reset webConfigurator password    12) PHP shell + Netgate pfSense Plus tools
 4) Reset to factory defaults         13) Update from console
 5) Reboot system                     14) Enable Secure Shell (sshd)
 6) Halt system                       15) Restore recent configuration
 7) Ping host                         16) Restart PHP-FPM
 8) Shell

Enter an option:

So the menu was shown, but the "(Amnesiac)" in the first line already made it clear that it hadn't correctly booted up the pfSense core setup.
Checking /cf/conf showed only a text file but no content whatsoever. config.xml was gone, the backup folder nonexistent, other text files or rules.debug.old nowhere to be found.
After much fiddling with the box I got it manually configured on an interface and ssh started up, so I could get the last config.xml onto the box and rebooted. Afterwards the box booted up fine at first glance, but doing anything package or repo related was broken as hell. No packages were listed as installed, it couldn't find updates or update repos at all, etc.
I did a reset and refresh (as per the docs for repo problems) and any pkg-static magic I could think of, but the most I got was that the 3 packages installed by re-bootstrapping the repos/pkg package were shown as installed. No other package was shown at all: no FreeBSD base, no pfSense specifics, none. The whole installed package list was empty (until I bootstrapped, then it was the 3 pkgs from bootstrapping).
So somehow the OS forgot just about every installed package when switching the repos. And it seems that this particular problem, or something closely related, is also still happening even with Netgate boxes, as a German forum member describes almost the same problem after changing out his UPS. He thought he had shut the box down correctly, but after changing the UPS and repowering his Netgate 2100 the box came up exactly the same way as our secondary firewall HW: with the Amnesiac / corrupt config.xml menu, no configs on the device and broken package paths.
/topic/173402/hilfe-config-xml-is-corrupted-and-is-0-bytes-could-not-restore-a-previous-backup (German)
His device was also running on the new ZFS scheme, and after hitting the problem and getting the last config back on the device he had problems with the package repos, although he somehow got it back to running via console only, with the GUI still bonkers.
As his was also running on ZFS and 22.01 (not 22.05 yet but had the update path set to 22.05 for a bit), perhaps something strange is going on related to switching either repos or perhaps with the new ZFS datasets and the preparations for Boot Environments?
/cf/conf completely missing every single file and directory gives a bit of "snapshot or dataset gone wrong" vibe as it is a distinct dataset in the ZFS setup, so I'm leaning towards that and perhaps something with the /var/xy dataset going very wrong?
That's just an educated guess though, but I'm happy to help in debugging it as far as I'm able to without compromising the box/failover of our DC cluster.
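For anyone who wants to check a box for the state described above (empty /cf/conf, 0-byte config.xml, backup folder gone) before and after an upgrade, here is a minimal sketch. It assumes Python 3 is available where you run it (on the box or against a copied directory); the function name check_config_dir and the default backup folder name "backup" are my own illustrative choices, so adjust them to what your system actually uses. It only inspects files and is not an official pfSense tool.

```python
from pathlib import Path


def check_config_dir(conf_dir="/cf/conf", backup_name="backup"):
    """Return a list of problems found in a config directory.

    conf_dir and backup_name are assumptions taken from the paths
    mentioned in this thread; pass your own values if they differ.
    """
    conf = Path(conf_dir)
    problems = []

    # The whole directory (or dataset mountpoint) may be gone.
    if not conf.is_dir():
        return [f"{conf} does not exist or is not a directory"]

    # config.xml must exist and must not be empty (the "0 bytes" case).
    config_xml = conf / "config.xml"
    if not config_xml.is_file():
        problems.append(f"{config_xml} is missing")
    elif config_xml.stat().st_size == 0:
        problems.append(f"{config_xml} is 0 bytes")

    # The backup folder should exist and hold at least one entry.
    backups = conf / backup_name
    if not backups.is_dir():
        problems.append(f"{backups} is missing")
    elif not any(backups.iterdir()):
        problems.append(f"{backups} contains no backups")

    return problems


if __name__ == "__main__":
    issues = check_config_dir()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("config.xml and backup folder look sane")
```

Running it once before starting the upgrade and once after the reboot would at least show whether the directory was already in a bad state before the packages were touched.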
Cheers
\jens
-
@jegr said in Best way to upgrade 2.5.2ce to 22.05 plus:
/cf/conf completely missing every single file and directory gives a bit of "snapshot or dataset gone wrong" vibe
Mmm, I agree. Hard to see what else could cause something like that.
Were you able to try switching Boot Environments?
Steve
-
@stephenw10 No. When the system came up with the config again, it still showed 2.6 instead of 22.01.
That's why I thought something must have happened before the system could actually process all the update packages. /var/cache etc. or /cf/conf going missing could explain that. Whatever the case, it was a very strange thing. And as it wasn't yet on 22.01, no snapshot or BE was created (that I know of). There was also no possibility to choose one at boot time (I stopped the menu once to check).
-
Mmm, yes it wouldn't have had the boot menu option.
Hmm, OK, thanks, we are looking into this.
-
@stephenw10 If I can supply further information, I'd be happy to help.