issues upgrading Netgate 4100 from 23.09.1 to 24.03
-
Was previously running flawlessly with system update -> branch -> previous stable version -> 23.09.1
Decided to finally upgrade to 24.03.
Upgrade seemed to run without issues. Successfully rebooted into the web gui, but noticed the following alert:
Upgrade
- check-upgrade(1): unknown error @ 2024-11-19 ...
Troubleshooting
Service status is green for all services.
Interface status is green for all network interfaces.
/conf/upgrade_log.latest.txt doesn't contain any obvious errors
"pkg-static update -f" returns many errors...
...
pkg-static: An error occured while fetching package
Unable to update repository pfSense
Error updating repositories
Meanwhile, clients connected to the firewall are unable to reach the internet. Internal traffic between wan/lan/dmz is working fine.
On the main dashboard, under System Information -> Version it looks like the update was successful:
24.03-RELEASE (amd64)
The system is on the latest version
This makes it appear that the firewall was able to query the update servers on the internet.
I attempted to restore an old configuration, to no avail.
I suspect there's a subtle configuration error somewhere that worked in 23.09, but breaks 24.03.
I attempted disabling non-critical firewall rules to no avail. I also set ipv4 as the default/preferred, to no avail.
Thank you for any assistance.
-
I was able to restore connectivity to the internet by using the console option 'Restore recent configuration'
And then selecting the last configuration saved before the upgrade process was initiated.
If I diff the post-upgrade and pre-upgrade config files, there are very few diffs...
ipsec client profile version
1.2 -> 1.2.1
netgate firmware upgrade
23.05.00 -> 23.05.01
avahi version
2.2_4 -> 2.2_5
package repo path
pfSense-repo-23_09_1_rel -> pfSense-repo-24_03_rel
That's it.
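For anyone who wants to repeat a comparison like the one above, here is a minimal Python sketch using the standard difflib module. It only lists the lines that were added or removed between two config texts. The function name config_diff and the <repo> tag in the sample strings are made up for illustration, not taken from a real pfSense config.

```python
import difflib


def config_diff(old_text: str, new_text: str) -> list[str]:
    """Return only the removed (-) and added (+) lines between two texts."""
    diff = list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="pre-upgrade",
            tofile="post-upgrade",
            lineterm="",
            n=0,  # no context lines, changes only
        )
    )
    # The first two entries are the ---/+++ file headers; "@@" lines are hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Hypothetical sample data; a real run would read the two saved config files instead.
old = "<a>1</a>\n<repo>pfSense-repo-23_09_1_rel</repo>"
new = "<a>1</a>\n<repo>pfSense-repo-24_03_rel</repo>"
for line in config_diff(old, new):
    print(line)
```

With real files, the two strings would come from reading the pre-upgrade and post-upgrade config backups.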
If I revert to the new config: clients can't route to the internet
If I revert to the old config: everything works fine
Happy this is working, but would really like to be able to upgrade seamlessly to future updates, and to understand what broke in the upgrade.
The only substantive difference in the configs seems to be the firmware and package repo.
Any insight into which of these config changes could be the root of the issue?
thanks again.
-
Hmm, curious that restoring a config would correct that.
Do you have the actual errors pkg was showing?
Steve
-
@stephenw10 Thanks for the response.
Unfortunately I did the update in the web gui, and it seemed to proceed without error. After the reboot, everything appeared to start normally without error except for the alert:
"check-upgrade(1): unknown error"
/conf/upgrade_log.latest.txt didn't contain errors.
Subsequent console 'pkg' commands failed with 'error occurred while fetching package', but that seems to have been due to the post-install network issue.
I browsed the logs but didn't see any failures that would be related to external connectivity issues.
A couple of additional observations about the diffs between the two config files.
The file referenced in the new config
/usr/local/etc/pfSense/pkg/repos/pfSense-repo-24_03-rel.conf
doesn't seem to exist. Instead, it's:
/usr/local/etc/pfSense/pkg/repos/pfSense-repo-0000.conf
Also, the new config contains the empty block <qinqs></qinqs>, which the old config did not.
I could try reintroducing these changes one by one, to see which causes the connectivity issue?
thanks again.
-
Unfortunately the error reporting from pkg in 23.09.1 meant that error could have been a lot of things. It was improved a lot in 24.03. So that could just be expected during the upgrade.
The empty qinq block does nothing, and neither do other empty tags that might be added.
If faced with that I would have tried running:
pfSense-repoc -DN
That should have either updated the pkg repos or returned a useful error.
I doubt anything in the config actually caused the problem. When you restored a previous config it likely reloaded whatever was missing.
Possibly an unpopulated alias? Do you have pass rules from pfBlocker maybe?
-
@stephenw10 thanks again.
am not running pfBlocker.
Well.... I reinstalled the new config again as an experiment, and ran into the same issue (temporarily!).
The system booted with some banners saying that packages were installing and not to make any changes in the gui until complete.
Also, avahi and bandwidthd failed to start, so I started them manually.
The alerts icon showed red and the more verbose error "package reinstall process was ABORTED due to lack of internet connectivity".
Tested from clients and could not reach the internet.
However, I then rebooted the 4100 (still with the new config), and the same banners appeared, this time very briefly and the system started working as expected. Connectivity restored.
So, it does seem that the system can work with the new config generated for 24.03, but something does seem to be going on post-install (and post-config change), which took some config swapping and reboots to work out.
I'm pretty sure the config revert played a role because when the upgrade originally failed to work, I rebooted a number of times hoping it would resolve. Spent an hour before reading the upgrade troubleshooting docs and then finally posting here.
thank you.
-
Just to follow up. I continued to debug this and found that even when I used the old config, which 'resolved' the issue, the system was modifying/updating the config after successful reboot.
Then, external networking would fail on subsequent reboots.
While I'm not running pfBlocker, I did have a lot of legacy stuff in my config file, migrated from older generations of Netgate devices, including squidGuard and some VPNs and gateways that no longer existed and weren't accessible from the pfSense GUI.
After cleaning that up by editing the config and reloading into pfsense, rebooting and repeating, I do seem to have a config file that works and survives multiple system reboots.
I suspect the problem was some non-existent default gateways that weren't visible in the gui and perhaps got ignored in the past but were interfering with 24.03.
(I may need to reset to factory defaults and configure from scratch to clean up the rest of the 10+ years of config baggage.)
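In case it helps someone with similar baggage, here is a rough Python sketch of how one might look for gateway entries that point at interfaces that no longer exist. It assumes the config keeps assigned interfaces as child tags under <interfaces> and gateways as <gateway_item> entries with <interface> and <name> children. That layout is an assumption here, so check it against your own config.xml before trusting the result. The function name stale_gateways and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def stale_gateways(config_xml: str) -> list[str]:
    """List names of gateways whose interface is not under <interfaces>.

    Assumes (not verified against every pfSense version) a layout like:
      <interfaces><wan/>...</interfaces>
      <gateways><gateway_item><interface/><name/></gateway_item></gateways>
    """
    root = ET.fromstring(config_xml)
    interfaces_node = root.find("interfaces")
    known = set()
    if interfaces_node is not None:
        known = {child.tag for child in interfaces_node}

    stale = []
    for item in root.iterfind("gateways/gateway_item"):
        name = item.findtext("name", default="(unnamed)")
        iface = item.findtext("interface", default="")
        if iface not in known:
            stale.append(name)
    return stale


# Hypothetical sample, not taken from the actual config in this thread.
sample = (
    "<pfsense><interfaces><wan/><lan/></interfaces>"
    "<gateways>"
    "<gateway_item><interface>wan</interface><name>WAN_GW</name></gateway_item>"
    "<gateway_item><interface>opt3</interface><name>OLD_VPN_GW</name></gateway_item>"
    "</gateways></pfsense>"
)
print(stale_gateways(sample))
```

This only flags candidates to inspect by hand. It does not edit the config, and a flagged entry is not proof that it caused the routing failure.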
Happy it is working for now. I do get an odd popup, which disappears after a few seconds...
"Netgate pfSense Plus will automatically reboot in -1 seconds. Verifying will commit this temporary boot environment to be the next boot environment."
@stephenw10 thanks again for the tip regarding pfBlocker. That led me down the path of cleaning up the config.
best regards
-
@pfcharles said in issues upgrading Netgate 4100 from 23.09.1 to 24.03:
I do get an odd popup, which disappears after a few seconds...
You will see that if you log into the webgui as soon as it becomes available but before bootup has completed. As you are seeing, it should disappear when bootup completes.