Packages disappear overnight after system restore
-
This is the sequence to reproduce (tried 3 times):
- (8.00pm) freshly written CF card with pfSense 2.3.5-i386 (nanobsd)
- system restore of a perfectly working configuration
- at reboot, system works and package reinstallation occurs
- after 1h, all packages are restored (and displayed on menus) and everything is working
- from this point, no more GUI access until the next morning
- (8.00am) packages have disappeared from the menus; the package manager displays 'no package' and 'unable to retrieve package information'
- a top command from the console shell shows 'pkg-static update' still in the process list, consuming more than 90% CPU
- killing that process and then running option 13 (manual update) from the console gives:
Enter an option: 13
>>> Updating repositories metadata...
Updating pfSense-core repository catalogue...
pkg-static: Repository pfSense-core load error: access repo file(/root/var/db/pkg/repo-pfSense-core.sqlite) failed: No such file or directory
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-core/meta.txz: Authentication error
repository pfSense-core has no meta file, using default settings
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-core/packagesite.txz: Authentication error
Unable to update repository pfSense-core
Updating pfSense repository catalogue...
pkg-static: Repository pfSense load error: access repo file(/root/var/db/pkg/repo-pfSense.sqlite) failed: No such file or directory
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-pfSense_v2_3_5/meta.txz: Authentication error
repository pfSense has no meta file, using default settings
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-pfSense_v2_3_5/packagesite.txz: Authentication error
Unable to update repository pfSense
Error updating repositories!
embedded (unknown) - Netgate Device ID: 3b913ce448e6f8e803fd
- the system is still working perfectly, except that all packages are lost and I can't perform any system update (it displays 'up to date' even though I know there are some patches after 2.3.5)
- from the GUI I see the following 'general' notice :
Package reinstall process was ABORTED due to lack of internet connectivity @ 2018-08-23 23:22:23
(...of course there was NO lack of internet connectivity; I kept using the connection the whole time)
Does anybody have an idea?
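The "No such file or directory" lines above say the repository catalogue files are absent. A minimal sketch of how that could be checked from the console shell, assuming a POSIX shell. The function name `missing_catalogues` is hypothetical, and the directory is passed in explicitly because the error output shows /root/var/db/pkg, which may differ on other installs:

```shell
# missing_catalogues DIR REPO...
# Print the name of every repository whose catalogue file
# (DIR/repo-REPO.sqlite) is absent or empty.
# Exit status is 1 if at least one catalogue is missing, 0 otherwise.
missing_catalogues() {
    dir=$1
    shift
    rc=0
    for repo in "$@"; do
        if [ ! -s "$dir/repo-$repo.sqlite" ]; then
            echo "$repo"
            rc=1
        fi
    done
    return $rc
}

# Example, using the path and repository names from the output above:
#   missing_catalogues /root/var/db/pkg pfSense-core pfSense
# To see whether an update is still running:
#   ps ax | grep '[p]kg-static'
```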
-
What is in the system log between the time you know it was working and when it failed?
Also that authentication error implies you are behind a proxy which is denying access to fetch the requested URL.
-
Hi jimp, thanks for your reply.
I can't see that part of the log anymore; I didn't check it immediately as I was focusing on reproducing and observing the same behaviour I described.
While I'm pretty sure it didn't depend on a momentary local cause (it happened 3 times in the same way), I'm ready to perform another run to check it again.
BTW, I'm not behind a proxy; my pfSense box is connected to an upstream router via a CARP VIP (although only a single machine was powered on at that moment).
From what I noticed, I'd say the trouble is somehow generated by the logic around the package/repository system, taking place after packages had been reinstalled successfully.
In the meantime, while I'm preparing to rewrite the firmware again, could you speculate on why the pkg system could permanently impair itself?
Even if the pfSense box actually believed there was an internet outage, why can't I execute a 'pkg update' or 'pkg upgrade' any more?
I've seen similar issues over the last 2 years, and I'm sorry to say that these problems have never been fully addressed.
-
I have found where the problem originates!
Tracking the full process required almost 3hrs; this is why it had previously been scheduled to execute overnight.
1hr after the system restore, package descriptions are still displayed in the GUI menus. In the meantime, pkg is already hogging more than 80% of CPU time.
In the system log, there is no error but a notice:
"Updating pfSense-core repository catalogue..."
while on the console I see:
Waiting for Internet connection to update pkg metadata and finish package reinstallation
Updating pfSense-core repository catalogue...
repository pfSense-core has no meta file, using default settings
Unable to update repository pfSense-core
Updating pfSense repository catalogue...
repository pfSense has no meta file, using default settings
Unable to update repository pfSense
Error updating repositories!
ERROR!!! An error occurred on pkg execution (rc = 70) with parameters 'update -f':
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-core/meta.txz: Authentication error
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-core/packagesite.txz: Authentication error
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-core/meta.txz: Authentication error
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-pfSense_v2_3_5/meta.txz: Authentication error
pkg-static: https://pkg.pfsense.org/pfSense_v2_3_5_i386-pfSense_v2_3_5/packagesite.txz: Authentication error
After another hour, the notice appears in the log a second time, and the same output is repeated on the console.
Needless to say, the internet connection has never been lost and the pfSense box is still working perfectly.
Package descriptions are still visible in the GUI menu, but trying to open one gives a "No valid package defined" notice.
After another hour, the boot sound is heard (so the console messages are no longer visible) and in the system log I see:
rc.bootup: New alert found: Package reinstall process was ABORTED due to lack of internet connectivity
Next, for each package to be (re)installed, a line with:
/rc.start_packages: The OpenVPN Client Export Utility package is missing its configuration file and must be reinstalled.
Now, all the package items have been removed from the GUI, resulting in "packages disappeared" as described in the title of this topic.
So, the full debug trace is:
- the user restores a previously created configuration on a perfectly working pfSense box, which causes it to reboot
- the pfSense system tries to download and install the packages from the official repositories
- the system enters a loop of failed updates, consuming lots of CPU cycles
- the pkg system becomes impaired and can no longer be used for package download or system upgrade
- the system eventually realizes that the package entries in the GUI point to non-existent programs and deletes them
What has to be addressed:
- why the pfSense system thinks there is a loss of connectivity
- why that condition becomes a permanent fault
I'm looking forward to an answer from Netgate support.
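To compare runs, the fetch failures in captured console output can be tallied by error type. A rough sketch, assuming the output has been saved to a file and that failure lines keep the `pkg-static: <URL>: <error>` shape seen above; the function name `summarize_fetch_errors` is hypothetical:

```shell
# summarize_fetch_errors: read pkg console output on stdin and print
# "<count> <error text>" for each distinct fetch error, most frequent first.
# Only lines whose second ": "-separated field is an http(s) URL are counted,
# so "load error" lines about missing .sqlite files are ignored.
summarize_fetch_errors() {
    awk -F': ' '
        $1 == "pkg-static" && $2 ~ /^https?:\/\// { count[$3]++ }
        END { for (e in count) print count[e], e }
    ' | sort -rn
}

# Example (console.log is a hypothetical capture file):
#   summarize_fetch_errors < console.log
```

A run that shows only "Authentication error" and a run that shows only "Operation timed out" point at different layers, which makes the tally a quick first filter.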
-
Part of that sounds like an issue recently fixed on 2.4.4 snapshots.
I think what you're seeing is all just one problem, the fact that pkg can't reach the pfSense servers directly from the firewall.
Every reference I can find to "Authentication Error" points to it being behind a proxy of some sort that is not passing through the request. First, I'd check under System > Advanced, Miscellaneous tab and ensure there is no proxy information listed there, not even a username and password (which may be auto-filled by your browser).
There is a small chance it could also be from failing to verify the SSL certificate on the server, so that means you should check your firewall's clock/date to ensure it is accurate.
Failing that, see if you have /etc/ssl/cert.pem present; it should be a symbolic link to /usr/local/share/certs/ca-root-nss.crt. Check to see if that file is there, too. If /etc/ssl/cert.pem is missing, run:
ln -s /usr/local/share/certs/ca-root-nss.crt /etc/ssl/cert.pem
If /usr/local/share/certs/ca-root-nss.crt is missing, then somehow your installation is missing the root certificate package ca-root-nss.
If all else fails, you can edit /usr/local/etc/pkg/repos/pfSense.conf and change the https:// URLs to http:// to see if that makes a difference.
-
Hi jimp, and thank you for your kind support.
I performed all the checks you suggested:
- in System > Advanced > Miscellaneous there is no proxy setup
- date/time is accurate
- /etc/ssl/cert.pem exists as a symlink to /usr/local/share/certs/ca-root-nss.crt, that contains 158 certificates
- I modified the URLs in /usr/local/etc/pkg/repos/pfSense.conf (https to http) and executed option 13 from the shell, with this output:
>>> Updating repositories metadata...
Updating pfSense-core repository catalogue...
pkg-static: Repository pfSense-core load error: access repo file(/root/var/db/pkg/repo-pfSense-core.sqlite) failed: No such file or directory
pkg-static: http://pkg.pfsense.org/pfSense_v2_3_5_i386-core/meta.txz: Operation timed out
repository pfSense-core has no meta file, using default settings
pkg-static: http://pkg.pfsense.org/pfSense_v2_3_5_i386-core/packagesite.txz: Operation timed out
Unable to update repository pfSense-core
Updating pfSense repository catalogue...
pkg-static: Repository pfSense load error: access repo file(/root/var/db/pkg/repo-pfSense.sqlite) failed: No such file or directory
pkg-static: http://pkg.pfsense.org/pfSense_v2_3_5_i386-pfSense_v2_3_5/meta.txz: Operation timed out
repository pfSense has no meta file, using default settings
pkg-static: http://pkg.pfsense.org/pfSense_v2_3_5_i386-pfSense_v2_3_5/packagesite.txz: Operation timed out
Unable to update repository pfSense
Error updating repositories!
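For anyone repeating the certificate checks listed above, they could be scripted roughly as follows. This is only a sketch, assuming a POSIX shell with `readlink` available; the function name `check_ca_bundle` is hypothetical, and the default paths are the ones quoted in this thread:

```shell
# check_ca_bundle [LINK [TARGET]]
# Verify that TARGET (the CA bundle) exists and is non-empty, that LINK is a
# symlink pointing at TARGET, and report how many certificates it contains.
check_ca_bundle() {
    link=${1:-/etc/ssl/cert.pem}
    target=${2:-/usr/local/share/certs/ca-root-nss.crt}
    if [ ! -s "$target" ]; then
        echo "bundle missing: $target"
        return 1
    fi
    if [ ! -L "$link" ]; then
        echo "symlink missing: $link"
        return 1
    fi
    if [ "$(readlink "$link")" != "$target" ]; then
        echo "symlink points elsewhere: $link"
        return 1
    fi
    # Each PEM certificate starts with one BEGIN CERTIFICATE line.
    echo "ok: $(grep -c 'BEGIN CERTIFICATE' "$target") certificates"
}
```

Called without arguments it inspects the two default paths; a "symlink missing" result corresponds to the ln -s fix suggested earlier.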
-
Hi jimp,
some good news here. Your statement that "pkg can't reach the pfSense servers" pointed me in the right direction; I haven't understood it fully, but I found a way at least to unblock pkg.
In my case, it was due to a dual-stack IPv4/IPv6 issue: to solve it, I had to temporarily disable the network interface linked to the GIF port; removing IPv6 name resolution, the IPv6 default gateway, and the firewall rules that route IPv6 traffic was not enough.
I don't like reporting such imprecise test results, but since IPv6 connectivity was actually working, pinning this problem down will require more tests, and I wanted to share a quick workaround for everybody experiencing this kind of issue.
Let me know if this rings a bell.
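For anyone who wants to test whether one address family is the culprit before disabling an interface, a sketch along these lines might help. It assumes FreeBSD's fetch(1) with its -4 and -6 options; the function names and the `FETCH_CMD` override variable are hypothetical and exist only so the logic can be tried with a stub:

```shell
# probe URL FLAG: exit 0 if URL can be fetched using only the address family
# selected by FLAG (-4 or -6). Uses fetch(1) unless FETCH_CMD is set.
probe() {
    ${FETCH_CMD:-fetch} "$2" -q -T 10 -o /dev/null "$1"
}

# compare_families URL: probe URL over IPv4 and IPv6 and print a one-line verdict.
compare_families() {
    url=$1
    if probe "$url" -4; then v4=ok; else v4=fail; fi
    if probe "$url" -6; then v6=ok; else v6=fail; fi
    case "$v4/$v6" in
        ok/ok)   echo "both families work" ;;
        ok/fail) echo "IPv6 path fails, IPv4 works" ;;
        fail/ok) echo "IPv4 path fails, IPv6 works" ;;
        *)       echo "neither family works" ;;
    esac
}

# Example, with a URL taken from the output earlier in this thread:
#   compare_families http://pkg.pfsense.org/pfSense_v2_3_5_i386-core/meta.txz
```

If only the IPv6 probe fails, it may be worth trying `pkg-static -4 update`, assuming the installed pkg supports the -4 global option, as a less invasive test than disabling the GIF interface.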