[solved] 2.2.3 nanobsd - packages reinstall after upgrade totally screwed
-
tl;dr: Preferably before upgrade, go to Diagnostics - NanoBSD and
–------ OP below --------
After reboot, pretty much none of the packages reinstalled properly. Strongly suspect this commit:
https://redmine.pfsense.org/projects/pfsense/repository/revisions/4fabdca76a3f956df4363d7599cfa784848506ab
Jun 25 10:31:43 check_reload_status: Syncing firewall
Jun 25 10:30:06 php-fpm[44952]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:30:06 php-fpm[44952]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:30:06 php-fpm[44952]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:30:06 php-fpm[44952]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:30:06 php-fpm[44952]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:30:06 php-fpm[44952]: /rc.start_packages: Restarting/Starting all packages.
Jun 25 10:30:02 php-fpm[44952]: /pkg_mgr_install.php: Reference 1000 is going negative, not doing unreference.
Jun 25 10:30:02 check_reload_status: Starting packages
Jun 25 10:30:02 check_reload_status: Reloading filter
Jun 25 10:30:02 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: System Patches.
Jun 25 10:30:02 check_reload_status: Syncing firewall
Jun 25 10:29:56 check_reload_status: Syncing firewall
Jun 25 10:29:42 check_reload_status: Syncing firewall
Jun 25 10:29:34 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for System Patches .
Jun 25 10:29:30 check_reload_status: Syncing firewall
Jun 25 10:29:24 check_reload_status: Syncing firewall
Jun 25 10:29:22 check_reload_status: Syncing firewall
Jun 25 10:29:21 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: Shellcmd.
Jun 25 10:29:21 check_reload_status: Syncing firewall
Jun 25 10:29:05 check_reload_status: Syncing firewall
Jun 25 10:28:57 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for Shellcmd .
Jun 25 10:28:53 check_reload_status: Syncing firewall
Jun 25 10:28:45 check_reload_status: Syncing firewall
Jun 25 10:28:44 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: Cron.
Jun 25 10:28:44 check_reload_status: Syncing firewall
Jun 25 10:28:25 check_reload_status: Syncing firewall
Jun 25 10:28:16 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for Cron .
Jun 25 10:28:02 php: sshd: The command '/sbin/mount -u -w -o sync,noatime /cf' returned exit code '1', the output was 'mount: /dev/ufs/cf: Operation not permitted'
Jun 25 10:27:59 php-fpm[44952]: /pkg_mgr_install.php: [sshdcond] xmlrpc sync is ending.
Jun 25 10:27:59 php-fpm[44952]: /pkg_mgr_install.php: [sshdcond] xmlrpc sync is starting.
Jun 25 10:27:53 check_reload_status: Syncing firewall
Jun 25 10:27:37 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for SSHDCond .
Jun 25 10:27:21 check_reload_status: Syncing firewall
Jun 25 10:27:20 check_reload_status: Syncing firewall
Jun 25 10:27:19 check_reload_status: Syncing firewall
Jun 25 10:27:19 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:27:19 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: gwled.
Jun 25 10:27:18 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:27:02 check_reload_status: Syncing firewall
Jun 25 10:27:02 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:26:39 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for gwled .
Jun 25 10:26:23 check_reload_status: Syncing firewall
Jun 25 10:26:22 check_reload_status: Syncing firewall
Jun 25 10:26:21 check_reload_status: Syncing firewall
Jun 25 10:26:21 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:26:21 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: Service Watchdog.
Jun 25 10:26:20 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:26:04 check_reload_status: Syncing firewall
Jun 25 10:26:04 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:25:42 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for Service Watchdog .
Jun 25 10:25:25 check_reload_status: Syncing firewall
Jun 25 10:25:24 check_reload_status: Syncing firewall
Jun 25 10:25:24 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:25:24 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: Notes.
Jun 25 10:25:23 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:25:07 check_reload_status: Syncing firewall
Jun 25 10:25:07 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:24:45 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for Notes .
Jun 25 10:24:29 check_reload_status: Syncing firewall
Jun 25 10:24:27 check_reload_status: Syncing firewall
Jun 25 10:24:27 check_reload_status: Syncing firewall
Jun 25 10:24:27 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:24:26 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: RRD Summary.
Jun 25 10:24:26 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:24:10 check_reload_status: Syncing firewall
Jun 25 10:24:10 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:23:47 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for RRD Summary .
Jun 25 10:23:31 check_reload_status: Syncing firewall
Jun 25 10:23:29 check_reload_status: Syncing firewall
Jun 25 10:23:28 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:23:28 check_reload_status: Syncing firewall
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:16:39 php-fpm[70365]: /rc.start_packages: Restarting/Starting all packages.
Jun 25 10:16:33 check_reload_status: Starting packages
Jun 25 10:16:33 check_reload_status: Reloading filter
Jun 25 10:16:33 php-fpm[70365]: /pkg_mgr_install.php: Successfully installed package: System Patches.
Jun 25 10:16:33 check_reload_status: Syncing firewall
Jun 25 10:16:25 check_reload_status: Syncing firewall
Jun 25 10:16:22 php-fpm[70365]: /pkg_mgr_install.php: Beginning package installation for System Patches .
Jun 25 10:16:19 check_reload_status: Syncing firewall
Jun 25 10:16:17 check_reload_status: Syncing firewall
Jun 25 10:16:16 php-fpm[70365]: /pkg_mgr_install.php: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: XML error: no packagegui object found!
Jun 25 10:12:44 php-fpm[75330]: /rc.start_packages: Restarting/Starting all packages.
Also, this one line from there looks pretty bad:
The command '/sbin/mount -u -w -o sync,noatime /cf' returned exit code '1', the output was 'mount: /dev/ufs/cf: Operation not permitted'
Even after running Diagnostics -> Backup/Restore -> Reinstall packages, the GUI components of most packages (all except SSHDCond, Shellcmd and System Patches) were still either incomplete or totally missing: selecting something from the menu gives PHP errors, an empty GUI mockup, or 404 Not Found.
Reinstalling packages yet again one by one seems to fix this.
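For anyone who wants to triage a log like the one above without reading it line by line, here is a minimal Python sketch. It assumes the syslog text has been pasted into a string (newest entries first, as in the GUI log view) and only looks for the three message strings that appear in this post. The function name `summarize` and the sample excerpt are just for illustration. A package showing up as "unfinished" only means no success line was found in the excerpt, not that the install necessarily failed.

```python
import re

# Split wherever a syslog timestamp such as "Jun 25 10:28:44 " begins, so the
# parser copes with entries that were pasted onto a single line.
ENTRY_START = re.compile(r"(?=[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} )")
BEGIN = re.compile(r"Beginning package installation for (.+?) \.$")
DONE = re.compile(r"Successfully installed package: (.+?)\.$")
XML_ERROR = "XML error: no packagegui object found!"


def summarize(log_text):
    """Return which packages began/finished installing and the XML error count."""
    entries = [e.strip() for e in ENTRY_START.split(log_text) if e.strip()]
    begun, installed, xml_errors = set(), set(), 0
    for entry in entries:
        if XML_ERROR in entry:
            xml_errors += 1
        match = BEGIN.search(entry)
        if match:
            begun.add(match.group(1))
        match = DONE.search(entry)
        if match:
            installed.add(match.group(1))
    return {
        "unfinished": sorted(begun - installed),
        "installed": sorted(installed),
        "xml_errors": xml_errors,
    }


# Short excerpt copied from the log in this post, joined into one string.
SAMPLE = (
    "Jun 25 10:28:44 php-fpm[44952]: /pkg_mgr_install.php: Successfully installed package: Cron. "
    "Jun 25 10:28:16 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for Cron . "
    "Jun 25 10:27:37 php-fpm[44952]: /pkg_mgr_install.php: Beginning package installation for SSHDCond . "
    "Jun 25 10:27:19 php-fpm[44952]: /pkg_mgr_install.php: XML error: no packagegui object found!"
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
```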
-
I doubt that commit was directly related, as it'd likely break package reinstalls in all circumstances for everyone. They're no more problematic than they ever have been that I've seen (not that that's great, I know, something we'll be working on for 2.3 while switching to pkg).
There is some odd stuff there though. What packages do you have on there? 32 or 64 bit?
-
Alix. Executive summary after I fixed the mess:
Package Name - After reboot - After Reinstall packages - After reinstall one by one
Service Watchdog - Broken GUI - Broken GUI - Works
Notes - Broken GUI - Broken GUI - Works
Cron - Broken GUI - Broken GUI - Works
gwled - Broken GUI - Broken GUI - Works
RRD Summary - 404 not found - Broken GUI - Works
Shellcmd - Broken GUI - Works - Works
SSHDCond - Broken GUI - Works - Works
System Patches - 404 not found - Works - Works
A special case:
NUT - completely failed multiple times (fetch failures, inc file missing resulting in deinstall failed so reinstall subsequently got aborted, took about 5 attempts to get reinstalled)
P.S. Also wondering what's this:
Jun 25 10:30:02 php-fpm[44952]: /pkg_mgr_install.php: Reference 1000 is going negative, not doing unreference.
-
Reference 1000 is going negative, not doing unreference.
That is from util.inc function refcount_unreference($reference)
The shared memory location keeps count of the number of things that have the file system mounted RW while they do their install, write conf files, make changes or whatever. It should count up when a bunch of stuff is happening at once, then when it transitions from 1->0 (and running nanoBSD…) the file system gets set back to RO.
Going negative indicates that there is a mismatched extra call to /etc/rc.conf_mount_ro somewhere, or something similar. All code that needs the file system mounted RW for changes should call rc.conf_mount_rw near the start, then rc.conf_mount_ro after making all the changes. Whatever code paths, error conditions, or options an installer has, it always needs to make the same number of RW calls on the way in as RO calls on the way out.
Perhaps there is a package that has something mismatched in its installer, because I remember looking at all the base system RW and RO calls a while ago and confirming that they all seemed to be in happily-matched pairs.
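To make the counting concrete, here is a minimal Python sketch of the idea described above. It is not the pfSense code (that lives in PHP in util.inc and uses shared memory); the class name `MountRefCount`, the `remount` callback, and the choice to remount RW only on the 0->1 transition are all assumptions made for illustration.

```python
class MountRefCount:
    """Toy model of a reference-counted read-write/read-only mount switch."""

    def __init__(self, remount):
        self.count = 0          # number of callers that currently need RW
        self.remount = remount  # hypothetical callback, called with "rw" or "ro"
        self.warnings = []

    def mount_rw(self):
        # Assumption: only the first caller actually triggers the remount.
        if self.count == 0:
            self.remount("rw")
        self.count += 1

    def mount_ro(self):
        if self.count <= 0:
            # A mismatched extra RO call: refuse to go below zero.
            self.warnings.append("Reference is going negative, not doing unreference.")
            return
        self.count -= 1
        # Only the 1 -> 0 transition puts the file system back to read-only.
        if self.count == 0:
            self.remount("ro")


if __name__ == "__main__":
    calls = []
    mounts = MountRefCount(calls.append)
    mounts.mount_rw()   # installer A starts
    mounts.mount_rw()   # installer B starts while A is still running
    mounts.mount_ro()   # A finishes, file system stays RW
    mounts.mount_ro()   # B finishes, file system goes back to RO
    mounts.mount_ro()   # stray extra call, logged and ignored
    print(calls, mounts.warnings)
```

An installer with one early-return path that skips its RW call but still runs the RO call would produce exactly the stray third call in this example.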
-
The removal of the forcesync patch means it takes much longer to go rw->ro, with the diff on an ALIX most pronounced. That might be somehow related in that circumstance. I'm way too sleep deprived right now to look into that further at this instant, but might help others who are interested in digging into it.
-
i dont mean to hijack this thread but for me i use cron, lightsquid and squid3 and the first 2 reinstall fine but now squid3 wont go past extracting on a full install
Beginning package installation for squid3 .
Downloading package configuration file… done.
Saving updated package information... done.
Downloading squid3 and its dependencies...
Checking for package installation...
Downloading https://files.pfsense.org/packages/10/All/squid-3.4.10_2-i386.pbi ... (extracting)
-
@cmb:
The removal of the forcesync patch means it takes much longer to go rw->ro, with the diff on an ALIX most pronounced.
So, you removed what? This? Since vfs.forcesync seems to be still set to 1 on the upgraded system in System Tunables.
i dont mean to hijack this thread
Yeah, so please don't and post to the proper forum section.
-
lightsquid also same
Beginning package installation for Lightsquid .
Downloading package configuration file… done.
Saving updated package information... done.
Downloading Lightsquid and its dependencies...
Checking for package installation...
Downloading https://files.pfsense.org/packages/10/All/lightsquid-1.8_2-i386.pbi ... (extracting)
-
Dude, this thread is about nanobsd – which your box apparently is NOT. Plus, it's not about squid* failing to extract. Please leave it alone.
-
Yeah, that's the patch that was removed. At least the part of it that actually affected the filesystem.
I'd be interested to know if the problem could be repeated with the disk set to be permanently rw (Diag > NanoBSD, check the box and save)
The switch to RO is a crutch/safety net/etc to give people a warm and fuzzy about disk writes but in truth, aside from maybe a stray package here and there, things won't write to the disk willy-nilly so it's reasonably safe to do. All of the volatile things on NanoBSD are held in RAM disks anyhow. The real danger with a full install is the constant writes to things in /tmp and logs, and those happen in RAM on NanoBSD.
-
Yeah, that's the patch that was removed. At least the part of it that actually affected the filesystem.
Hmmm… Seems to have pretty bad performance impact on these poor Alix boxes (even with UDMA) :(
I'd be interested to know if the problem could be repeated with the disk set to be permanently rw (Diag > NanoBSD, check the box and save)
Will that stick/work on the post-upgrade reboot? I can do that if that's the case, still lots of boxes left to upgrade where I'd love to avoid this screw-up.
-
It varies from CF to CF – I have one CF that is really obnoxious to use with it, on the order of 45s-1m delays on each save. A different card is only ~3-4s, barely worse than without the patch.
If you set the flag in your config then it is consulted before any potential RO switch done at any time, essentially making conf_mount_ro() a no-op.
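As a rough illustration of what "consulted before any potential RO switch" means, here is a small Python sketch. The real conf_mount_ro() is PHP inside pfSense; the config key `always_rw` and the `remount` callback below are hypothetical names, not the actual ones.

```python
def conf_mount_ro(config, remount):
    """Switch to read-only unless the permanent read/write flag is set.

    `config` is a plain dict standing in for the firewall configuration and
    `remount` is a hypothetical callback that performs the actual remount.
    Returns True if a remount was requested, False if the call was a no-op.
    """
    if config.get("always_rw", False):  # hypothetical key name
        return False                    # flag set: leave the media mounted RW
    remount("ro")
    return True


if __name__ == "__main__":
    calls = []
    print(conf_mount_ro({"always_rw": True}, calls.append), calls)
    print(conf_mount_ro({}, calls.append), calls)
```

Because the check happens inside the function itself, every caller gets the no-op behaviour without needing to know about the setting.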
-
It varies from CF to CF – I have one CF that is really obnoxious to use with it, on the order of 45s-1m delays on each save. A different card is only ~3-4s, barely worse than without the patch.
Well, the cards are what PC Engines sells. This: http://www.pcengines.ch/cf2slc.htm
If you set the flag in your config then it is consulted before any potential RO switch done at any time, essentially making conf_mount_ro() a no-op.
Sounds good.
-
I have had no luck with those cards over the years. They've all died on me fairly soon.
The "good" card I'm using at the moment is a Sandisk. The one that is awful is a Kingston. Though I have another Kingston that is OK, so… shrug
Generally speaking, the faster the card the less likely you will be to see problems.
-
I have had no luck with those cards over the years. They've all died on me fairly soon.
Hmmm… interesting. Out of some ~20, only one is dead here so far. Granted, there's not much done with them. Those Alixes are replacements for shitty ISP-supplied junk, with writing done when new pfS versions come out, pretty much nothing to write home about in between.
Kingston, I won't touch. Can't even count how many SD cards have died on me in phones. The higher the class, the shittier the product. Some of them even DoA. >:( >:( >:(
-
I have 3 alix boxes and 4 full installs, with 1 of them running nanobsd, and in general I see some issues with package reinstalls.
The first alix with nanobsd had the cron package, and that reinstalled fine on reboot after upgrade.
The second nanobsd machine is a full PC with nanobsd on it that runs squid and cron only. On reboot cron installed fine but squid got stuck on extracting, so I aborted it and re-upgraded, and the next time it installed squid just fine and all errors vanished. I don't know why, but extracting gets stuck for some packages and works fine for others.
-
I'd be interested to know if the problem could be repeated with the disk set to be permanently rw (Diag > NanoBSD, check the box and save)
Well, good news is that this totally avoided any issue described in the OP on two Alix boxes… 8)
-
I'd be interested to know if the problem could be repeated with the disk set to be permanently rw (Diag > NanoBSD, check the box and save)
Well, good news is that this totally avoided any issue described in the OP on two Alix boxes… 8)
That is good news. I added a note to the changelog doc about that yesterday, I may add a note to the upgrade guide as well.
-
Yeah, that's the patch that was removed. At least the part of it that actually affected the filesystem.
Guys, it really sucks now.
Every config action is delayed 4 seconds now, this is very anti-productive. I'm using brand new SanDisk CF cards, 2015 model, on Supermicro A1SRi-2758F with a CF-to-SATA adapter.
/etc/rc.conf_mount_rw followed by /etc/rc.conf_mount_ro is also 4 to 5 seconds.
Previously it was working in an instant. I am using exclusively the NanoBSD version in all kinds of setups, Jetway systems, SuperMicro and various thin clients, and never had any boot problems whatsoever.
Can you please consider putting back the patch with a configurable option/system tunable? Because I definitely vote to keep using it.
I see the option of keeping it RW all the time, but NanoBSD exists exactly because of the super-great capability to keep the system RO, and we should really keep relying on that professional feature, as an extra security measure.
-
Yeah, that's the patch that was removed. At least the part of it that actually affected the filesystem.
Guys, it really sucks now.
Every config action is delayed 4 seconds now, this is very anti-productive. I'm using brand new SanDisk CF cards, 2015 model, on Supermicro A1SRi-2758F with a CF-to-SATA adapter.
/etc/rc.conf_mount_rw followed by /etc/rc.conf_mount_ro is also 4 to 5 seconds.
Previously it was working in an instant. I am using exclusively the NanoBSD version in all kinds of setups, Jetway systems, SuperMicro and various thin clients, and never had any boot problems whatsoever.
Can you please consider putting back the patch with a configurable option/system tunable? Because I definitely vote to keep using it.
I see the option of keeping it RW all the time, but NanoBSD exists exactly because of the super-great capability to keep the system RO, and we should really keep relying on that professional feature, as an extra security measure.
Consider yourself lucky with those 4 seconds. It's virtually minutes for some people, plain unusable without switching to permanent RW. Plus this - https://redmine.pfsense.org/issues/4803 – dunno how exactly this helped with filesystem corruption, appears the cure is worse than the disease.
-
Oh shit.
Put back the patch, please, please…
-
I have noticed that after the first reboot, packages don't seem to fully reinstall until I reboot a second time. Is that normal?
-
For most of my boxes packages just don't reinstall, some finally did and the others I had to roll back to 2.2.2 until this package reinstall issue is fixed
-
I have been bitten by this too. Config changes take about 5 minutes to complete. I have three nearly identical systems on which the 4GB nanobsd image is written on a brand new HP 8GB v221 USB stick. Since the 2.2.3 upgrade my systems are pretty well unusable from an administration standpoint.
Is there a workaround for this? The thread above is not very clear.
Thanks,
Bennett
-
Is there a workaround for this? The thread above is not very clear.
I don't know what's not very clear from the huge hint at the top of the very first post.
-
I'm not sure what you are seeing in the first post, but all I see is a bold faced advisory that explains nothing, and I read every post in the thread.
"tl;dr: Preferably before upgrade, go to Diagnostics - NanoBSD and"
AND nothing.
Might you grace me with a more fulsome explanation?
If making my flash permanently read/write is the workaround you are cryptically alluding to, I'm not keen on it.

-
AND nothing.
Your browser or network might block inline pictures. The next sentence is actually a screenshot of the setting you have to do, but for some reason it doesn't appear in your browser.
-
I don't want to start a new thread about this, but I've had to move back to 2.2.2 because of the 2-3 min config change time. So are some saying now that the change is to just set RW all the time? I could do this but I believed from others saying over the years that CF cards didn't like this. I'm wondering if USB sticks don't like this either. I know SSD have wear leveling so those probably don't apply.
Anyhow, is this now what must be done to run this normally or has anyone committed to patching this or at least fixing it next release?
Last question, is the broken limiter fixed in 2.2.3?
Thanks.
-
is the broken limiter fixed in 2.2.3?
As far as I remember reading from the changelogs, it is.
-
Thank you, Robi.
For whatever reason, adblock plus doesn't like the source of the inline image (tinypic.com).
-
Well, the cards are what PC Engines sells. This: http://www.pcengines.ch/cf2slc.htm
Do you have a recalled card by chance?
http://www.pcengines.ch/cfissue.htm
We don't have any 2G of those, but have several of the 4G version of same and they're all fine.
-
@cmb:
Do you have a recalled card by chance?
http://www.pcengines.ch/cfissue.htm
We don't have any 2G of those, but have several of the 4G version of same and they're all fine.
Hmm, don't think so… Just re-checked a couple of spare ones I have laying around, and they are all "code K" marked. (All of them were ordered at the same time, ~100 of them.)
-
Hmm, don't think so… Just re-checked a couple of spare ones I have laying around, and they are all "code K" marked. (All of them were ordered at the same time, ~100 of them.)
Could you double check a 'time /etc/rc.conf_mount_ro' (when it's rw mounted) on one of those?
And send a picture of the card?
-
@cmb:
Hmm, don't think so… Just re-checked a couple of spare ones I have laying around, and they are all "code K" marked. (All of them were ordered at the same time, ~100 of them.)
Could you double check a 'time /etc/rc.conf_mount_ro' (when it's rw mounted) on one of those?
$ time /etc/rc.conf_mount_ro
0.560u 0.517s 0:12.28 8.7% 2623+262k 0+4349io 5pf+0w
Mind you, this is with a previously completely unused card I just imaged and booted from on a desktop computer via USB card reader… It's a whole lot worse on the Alix boxes. >:(
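For repeating this measurement on several boxes, here is a small Python sketch that times a command by wall clock, which is the figure that matters here (the `0:12.28` field above, about 12 seconds elapsed). The script paths are the ones used in this thread; the helper names `time_command` and `time_rw_ro_cycle` are made up, and whether Python is available on a given box is an assumption.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (elapsed wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(argv, check=False)
    return time.perf_counter() - start, result.returncode


def time_rw_ro_cycle(rw="/etc/rc.conf_mount_rw", ro="/etc/rc.conf_mount_ro"):
    """Time one RW -> RO cycle; returns the two elapsed times in seconds."""
    rw_elapsed, _ = time_command([rw])
    ro_elapsed, _ = time_command([ro])
    return rw_elapsed, ro_elapsed


if __name__ == "__main__":
    # Harmless demo that works anywhere: time a no-op Python process.
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"no-op took {elapsed:.2f}s (exit code {code})")
    # On a NanoBSD box, as root, one would call time_rw_ro_cycle() instead.
```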
-
Mind you, this is with a previously completely unused card I just imaged and booted from on a desktop computer via USB card reader… It's a whole lot worse on the Alix boxes. >:(
Could you put that specific card in an ALIX to compare? I think it's more the card than anything to do with how fast the system is.
What's the 'time /etc/rc.conf_mount_ro' from rw like on one of your production systems?
-
@cmb:
Could you put that specific card in an ALIX to compare? I think it's more the card than anything to do with how fast the system is.
Mere unchecking of the "Keep media mounted read/write at all times." and clicking save took almost two minutes with the browser spinning and waiting for the change to get saved. Subsequent /etc/rc.conf_mount_rw; /etc/rc.conf_mount_ro took 30-45 seconds, tried 5 times and got tired of it.
I seriously don't have anything good to say about causing similar huge regressions on a bugfix release. Can as well get rid of the read-only mounts altogether, because it is plain not usable and breaks tons of stuff. Haven't seen a single complaint about the "harmful" patch for years.
-