25.07 upgrade on Netgate 4100 gets rolled back
-
It'd be helpful/preventative, I think, if the upgrade would do a quick check, "are there more than ___ backup config files in the directory?", before upgrading. Not sure if that should be more than the configured number, or more than 500, or what, but a few thousand is enough to cause a long (10m?) page load and eventual timeout loading the config history page as it tries to delete them. Perhaps the warning could link to a troubleshooting documentation page.
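As a rough illustration of what such a check could look like, here is a minimal shell sketch. It assumes the backups are the config-*.xml files in /cf/conf/backup. The function name `check_backup_count` and the 500-file threshold are made up for this example and are not part of pfSense.

```shell
#!/bin/sh
# Hypothetical pre-upgrade check: warn when the config backup directory
# holds more files than expected. Not an actual pfSense script.

check_backup_count() {
    dir="${1:-/cf/conf/backup}"   # backup directory discussed in this thread
    max="${2:-500}"               # assumed threshold, not a pfSense setting

    # Let find do the matching so the shell never expands a huge glob.
    count=$(find "$dir" -maxdepth 1 -type f -name 'config-*.xml' | wc -l | tr -d '[:space:]')

    if [ "$count" -gt "$max" ]; then
        echo "WARNING: $count backup files in $dir (threshold $max)"
        return 1
    fi
    echo "OK: $count backup files in $dir"
    return 0
}
```

Calling `check_backup_count` with no arguments uses those defaults. Passing a different directory or threshold, e.g. `check_backup_count /cf/conf/backup 30`, compares against whatever limit you consider sane.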
-
@SteveITS Indeed. When running a full update, such as a firmware update to 25.07, that could be an additional sanity check to perform, as could a check for old snapshots or for less than xy GB of free disk space. All three things (too many snapshots, too much disk space in use, and the file overflow) were problems we stumbled upon with multiple customers when upgrading their boxes. After the first ones, it was easy to spot on subsequent customers. Even my own homebrew box had the file overflow without me noticing; I just thought it strange that it used 3.4G of disk space when a normal installation would be around 2G without snaps. Only then did I remember: oh snap, I'm running pfB too, and hadn't added the hotfix for the file overflow that we were testing...
So perhaps those 3 cases would make for a few additional easy pre-flight checks for future updates :)
Cheers
-
@JeGr I think (?) it tries to check space but it's not uncommon to see posts about failed upgrades for space reasons. Maybe it needs a larger free space check.
We had one client with an old 2440 I recently upgraded through several versions successfully but it's at 94% full because of all the old files and I don't think I want to try 25.11, remotely. :-/
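For what a stricter free-space check might look like, here is a minimal shell sketch. The function name `check_free_space` and the 2 GiB default are assumptions for illustration only; I don't know what threshold the real upgrade code uses.

```shell
#!/bin/sh
# Hypothetical free-space check before an upgrade. Not an actual pfSense script.

check_free_space() {
    path="${1:-/}"              # filesystem to inspect
    min_kb="${2:-2097152}"      # assumed minimum: 2 GiB expressed in KiB

    # df -P gives one fixed-format line per filesystem; column 4 is the
    # available space, reported in KiB because of -k.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')

    if [ "$avail_kb" -lt "$min_kb" ]; then
        echo "WARNING: only $avail_kb KiB free on $path (want at least $min_kb KiB)"
        return 1
    fi
    echo "OK: $avail_kb KiB free on $path"
    return 0
}
```

On a box like the 94% full one above, `check_free_space / 2097152` would flag the problem before the upgrade starts rather than partway through.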
-
Hmm, I agree. Let me see what we can do here.
-
@stephenw10 What would help at upgrades? :-)
I have a 4200 and am having the same problem, presumably. I have pfblockerng installed.
I'm also seeing:
ld-elf.so.1: Shared object "libmd.so.7" not found, required by "pfSense-repoc"
I have a ticket open at Netgate and they want me to do a USB upgrade. That didn't feel right to me so I started searching and found this thread.
-
Any ideas on this? BTW, my memory usage: 30% of 3890 MiB on a 4200.
-
@vronp said in 25.07 upgrade on Netgate 4100 gets rolled back:
ld-elf.so.1: Shared object "libmd.so.7" not found, required by "pfSense-repoc"
That's different, see
https://forum.netgate.com/topic/198754/ld-elf.so.1-shared-object-libmd.so.7-not-found-required-by-pfsense-repoc
But too many old config files can be a problem also, sure. How much free disk space do you have?
-
@vronp said in 25.07 upgrade on Netgate 4100 gets rolled back:
I'm also seeing:
ld-elf.so.1: Shared object "libmd.so.7" not found, required by "pfSense-repoc"
That's just an ugly error; it should not prevent upgrading. If you run at the CLI:
pfSense-repoc-static -N
it should succeed as expected, and that's what the upgrade uses.
-
Thanks. I also discovered 15,000 files in /cf/conf/backup
It seems I need to clean that up. Is there a limit setting for backups there or is this the pfblockerng bug that was mentioned?
-
@vronp The default is 30, I believe. Fixed in 25.07:
https://docs.netgate.com/pfsense/en/latest/releases/25-07.html#configuration-backend
pfB just makes it worse by generating one per cron job (default per hour).
Diagnostics > Configuration History will time out while it tries to delete them all, just refresh every time it does. Or delete manually.
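If you go the manual route, something along these lines could do it from the shell. This is only a sketch: `prune_backups` is a made-up helper, it assumes the newest files by modification time are the ones worth keeping, and it only prints what it would remove unless you pass `--delete`. Take a config backup off the box first.

```shell
#!/bin/sh
# Hypothetical helper: keep the newest N config-*.xml backups, remove the rest.
# Not an actual pfSense script; dry-run by default.

prune_backups() {
    dir="${1:-/cf/conf/backup}"   # backup directory discussed in this thread
    keep="${2:-30}"               # default limit mentioned above
    mode="${3:-}"                 # pass --delete to actually remove files

    # ls -t lists newest first; tail skips the first $keep matching names,
    # so only older backups reach the loop. Other files are left alone.
    ls -t "$dir" | grep '^config-.*\.xml$' | tail -n "+$((keep + 1))" |
    while IFS= read -r name; do
        if [ "$mode" = "--delete" ]; then
            rm -f -- "$dir/$name"
        else
            echo "would delete $dir/$name"
        fi
    done
}
```

Run `prune_backups /cf/conf/backup 30` first to review the list, then repeat with `--delete` as the third argument once it looks right.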
-
@SteveITS
28% of 3890 MiB
28% of 4.6G (zfs)
I also just found 15,000 files in /cf/conf/backup
-
This is a bug. It should be limited to 30 backups there. The bug was that it was only pruning the backups when the user visited the Diag > Backup & Restore page. If you visit that page it will try to prune them. It might take a while if you have 15K files! It's fixed in 25.07.
-
@SteveITS Thank you. I'm going to try to run an upgrade again as I'm hoping that the problem described above (copied below) is the cause of my problem even though I only have 15,000 files in that directory.
"The box in question had 121,387 config-<timestamp>.xml files in /cf/conf/backup directory that accounted to around 1.5G in files. But it wasn't the disk space that were the problem but somehow the snapshot booted and wouldn't be able to access /cf/conf or cf/conf/backup because the process that tried to do something didn't succeed as the directory in question had too many files that broke some shell script magic."
-
Yeah it tries to parse all the files in that folder when it runs the config upgrade and has a really bad time! It should be fine after pruning them.
-
@stephenw10 Yep. The upgrade completed without a problem after clearing those files. Thanks all for the assistance!