23.05.1 -> 23.09 - PKG is broken - SG5100
-
Yessir
-
It could be a problem with the pkg database or the disk contents.
You can try running this from an ssh or console shell:
$ pkg check -s -a | & egrep -v 'Checking all packages|missing file'
That excludes a bunch of irrelevant "missing file" messages from man pages and the like which we don't include on purpose.
There may be an error or two for changes in expected files (like maybe linker.hints for the kernel), but it may also point at some other potential issues to investigate.
-
@jimp Thanks
The issue here is any command requiring pkg will fail, it is completely broken.
$ pkg check -s -a | & egrep -v 'Checking all packages|missing file'
ld-elf.so.1: Shared object "libssl.so.30" not found, required by "pkg"
-
@Dyspareunia said in 23.05.1 -> 23.09 - PKG is broken - SG5100:
@jimp Thanks
The issue here is any command requiring pkg will fail, it is completely broken.
$ pkg check -s -a | & egrep -v 'Checking all packages|missing file'
ld-elf.so.1: Shared object "libssl.so.30" not found, required by "pkg"
Use pkg-static then:
$ pkg-static check -s -a | & egrep -v 'Checking all packages|missing file'
-
@jimp Thanks, good call, here is the output.
$ pkg-static check -s -a | & egrep -v 'Checking all packages|missing file'
devcpu-data-intel-20230214: checksum mismatch for /boot/firmware/intel-ucode.bin
glib-2.76.1,2: checksum mismatch for /usr/local/lib/libgio-2.0.a
icu-72.1,1: checksum mismatch for /usr/local/share/icu/72.1/icudt72l.dat
libunistring-1.1: checksum mismatch for /usr/local/lib/libunistring.so.5.0.0
nettle-3.8.1: checksum mismatch for /usr/local/lib/libnettle.a
pfSense-base-23.05.1: checksum mismatch for /usr/local/share/pfSense/base.txz
py311-setuptools-63.1.0: checksum mismatch for /usr/local/lib/python3.11/site-packages/setuptools/gui-arm64.exe
python311-3.11.2_2: checksum mismatch for /usr/local/lib/python3.11/config-3.11/libpython3.11.a
python311-3.11.2_2: checksum mismatch for /usr/local/lib/python3.11/test/__pycache__/test_pyexpat.cpython-311.opt-1.pyc
python39-3.9.16_2: checksum mismatch for /usr/local/lib/python3.9/ctypes/test/__pycache__/test_struct_fields.cpython-39.opt-2.pyc
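For anyone wanting to see at a glance which packages are affected, here is a rough Python sketch that groups the "checksum mismatch" lines by package. It is not a pfSense or pkg tool. The function name group_mismatches and the SAMPLE constant are made up for illustration, and the sample lines are a subset copied from the output in this post.

```python
from collections import defaultdict

# A few lines copied from the pkg-static output in this thread (subset only).
SAMPLE = """\
devcpu-data-intel-20230214: checksum mismatch for /boot/firmware/intel-ucode.bin
glib-2.76.1,2: checksum mismatch for /usr/local/lib/libgio-2.0.a
python311-3.11.2_2: checksum mismatch for /usr/local/lib/python3.11/config-3.11/libpython3.11.a
python311-3.11.2_2: checksum mismatch for /usr/local/lib/python3.11/test/__pycache__/test_pyexpat.cpython-311.opt-1.pyc
"""

# Separator between the package name and the file path in each report line.
MARKER = ": checksum mismatch for "


def group_mismatches(text):
    """Map each package name to the list of files reported as mismatched."""
    by_pkg = defaultdict(list)
    for line in text.splitlines():
        pkg, sep, path = line.partition(MARKER)
        if sep:  # ignore lines that are not checksum-mismatch reports
            by_pkg[pkg].append(path.strip())
    return dict(by_pkg)


if __name__ == "__main__":
    for pkg, files in group_mismatches(SAMPLE).items():
        print(f"{pkg}: {len(files)} file(s)")
        for path in files:
            print(f"  {path}")
```

Mismatches spread across many unrelated packages, as here, would point more toward the disk or filesystem than toward any single bad package.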
-
Having a checksum mismatch on that many files that shouldn't change makes me worry for the condition of your disk and/or filesystem.
If you are running UFS (not ZFS), you might try a disk check using fsck: https://docs.netgate.com/pfsense/en/latest/troubleshooting/filesystem-check.html
If you are running ZFS, maybe try a scrub (also mentioned in the doc above).
Might not hurt to run a SMART test on the drive as well if possible. That would depend on the type of disk you have in the 5100.
-
@jimp Ok, thanks for the heads up.
This is the output of fsck. Do the 'incorrect block count' and 'unexpected soft update inconsistency' messages indicate a problem?
# fsck -fy /
** /dev/ufsid/5c79cf272cf85168 (NO WRITE)
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=240932 (680 should be 672)
CORRECT? no
INCORRECT BLOCK COUNT I=241079 (192 should be 184)
CORRECT? no
PARTIALLY TRUNCATED INODE I=331820
SALVAGE? no
INCORRECT BLOCK COUNT I=726088 (1856 should be 0)
CORRECT? no
** Phase 2 - Check Pathnames
UPDATE FILESYSTEM TO TRACK DIRECTORY DEPTH? no
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=722432 OWNER=root MODE=100666 SIZE=0 MTIME=Apr 30 10:41 2021 CLEAR? no
UNREF FILE I=722458 OWNER=root MODE=100666 SIZE=0 MTIME=Jan 19 22:45 2021 CLEAR? no
UNREF FILE I=725782 OWNER=root MODE=100666 SIZE=0 MTIME=Nov 13 23:05 2023 CLEAR? no
UNREF FILE I=726571 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726572 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726573 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726574 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726575 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726576 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726578 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726579 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726580 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726581 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726582 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726583 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726584 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726585 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726586 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726587 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726588 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726589 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726590 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726591 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726592 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726593 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726594 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726595 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726596 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726597 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726598 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
UNREF FILE I=726600 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:36 2023 RECONNECT? no CLEAR? no
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no
SUMMARY INFORMATION BAD
SALVAGE? no
BLK(S) MISSING IN BIT MAPS
SALVAGE? no
48658 files, 932777 used, 817029 free (12741 frags, 100536 blocks, 0.7% fragmentation)
I checked the SMART section of my UI, but it doesn't populate anything. I suspect that the SDMMC storage device doesn't support SMART?
I tried from the command line as well
# geom disk list
Geom name: mmcsd0
Providers:
1. Name: mmcsd0
   Mediasize: 7818182656 (7.3G)
   Sectorsize: 512
   Stripesize: 512
   Stripeoffset: 0
   Mode: r2w2e7
   descr: MMCHC M32508 5.2 SN 398280D7 MFG 06/2018 by 112 0x0000
   ident: 398280D7
   rotationrate: 0
   fwsectors: 0
   fwheads: 0
Geom name: mmcsd0boot0
Providers:
1. Name: mmcsd0boot0
   Mediasize: 4194304 (4.0M)
   Sectorsize: 512
   Stripesize: 512
   Stripeoffset: 0
   Mode: r0w0e0
   descr: MMCHC M32508 5.2 SN 398280D7 MFG 06/2018 by 112 0x0000
   ident: 398280D7
   rotationrate: 0
   fwsectors: 0
   fwheads: 0
Geom name: mmcsd0boot1
Providers:
1. Name: mmcsd0boot1
   Mediasize: 4194304 (4.0M)
   Sectorsize: 512
   Stripesize: 512
   Stripeoffset: 0
   Mode: r0w0e0
   descr: MMCHC M32508 5.2 SN 398280D7 MFG 06/2018 by 112 0x0000
   ident: 398280D7
   rotationrate: 0
   fwsectors: 0
   fwheads: 0
# smartctl -t short -d mmc /dev/mmcsd0
smartctl 7.3 2022-02-28 r5338 [FreeBSD 14.0-CURRENT amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/mmcsd0: Unknown device type 'mmc'
=======> VALID ARGUMENTS ARE: ata, scsi[+TYPE], nvme[,NSID], sat[,auto][,N][+TYPE], usbcypress[,X], usbjmicron[,p][,x][,N], usbprolific, usbsunplus, sntasmedia, sntjmicron[,NSID], sntrealtek, intelliprop,N[+TYPE], jmb39x[-q],N[,sLBA][,force][+TYPE], jms56x,N[,sLBA][,force][+TYPE], 3ware,N, hpt,L/M/N, cciss,N, areca,N/E, megaraid,N, atacam, auto, test <=======
Use smartctl -h to get a usage summary
I could be doing this wrong as well, I'm clearly not super versed with some of this stuff.
What does my path forward look like? Do I need to wipe the device and start fresh? I fear a lot of the config I have on this device may be lost. Do backups also capture things like pfBlocker feeds and certbot settings?
-
Those fsck errors are worrisome but there isn't enough detail to say if it's from an unclean shutdown/crash type FS issue or hardware. It's hard to tell on the mmc since it doesn't report data like SSDs do with SMART.
Was that fsck run from single user mode? It didn't appear to fix anything even though you used -y.
You might be able to use the reboot menu option to have it automatically run the fsck for you during boot.
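To illustrate the point about nothing being fixed: the posted output carries a "(NO WRITE)" marker on the device line, and every repair prompt was answered "no". As far as I know, fsck prints that marker when it cannot open the filesystem for writing (for example when the filesystem is mounted read-write), in which case -y has no effect. Below is a rough Python sketch that checks saved fsck output for both signs. The function name summarize_fsck is hypothetical, the prompt pattern is an assumption based on the output in this thread, and the sample is only an excerpt of it.

```python
import re

# Excerpt of the fsck output posted in this thread.
SAMPLE = """\
** /dev/ufsid/5c79cf272cf85168 (NO WRITE)
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=240932 (680 should be 672)
CORRECT? no
PARTIALLY TRUNCATED INODE I=331820
SALVAGE? no
** Phase 4 - Check Reference Counts
UNREF FILE I=726571 OWNER=root MODE=100600 SIZE=0 MTIME=Nov 14 10:35 2023 RECONNECT? no CLEAR? no
"""

# Assumed prompt shape: an upper-case word, "?", then the answer fsck chose.
PROMPT = re.compile(r"\b([A-Z]+)\? (yes|no)\b")


def summarize_fsck(text):
    """Report whether fsck ran without write access and how prompts were answered."""
    answers = [answer for _, answer in PROMPT.findall(text)]
    return {
        "read_only": "(NO WRITE)" in text,
        "declined": answers.count("no"),
        "accepted": answers.count("yes"),
    }


if __name__ == "__main__":
    print(summarize_fsck(SAMPLE))
```

A result with read_only set and zero accepted repairs would suggest the check was purely diagnostic and needs to be repeated from single user mode or at boot.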
-
@jimp said in 23.05.1 -> 23.09 - PKG is broken - SG5100:
Was that fsck run from single user mode? It didn't appear to fix anything even though you used -y.
No, I believe that requires I reboot into this mode using a console connection.
I WFH and need to keep things up for now, I'll look at this as soon as I can.
-
Before you do that you might want to get ahold of TAC and make sure you have installer media for 23.09 in case the filesystem damage is too severe to be recovered. There is a chance running fsck may make it worse in some ways even though it is repairing damage.
Hopefully it's just filesystem corruption, though.
Honestly you may be better off reinstalling 23.09 clean and going to ZFS as it is a lot more robust against these sorts of problems.
-
@jimp said in 23.05.1 -> 23.09 - PKG is broken - SG5100:
Honestly you may be better off reinstalling 23.09 clean and going to ZFS as it is a lot more robust against these sorts of problems.
Yeah way ahead of you, I contacted TAC yesterday and got the 23.09 image. I've been wanting to switch to ZFS for a while now as well as upgrade the storage on the device. Seems like now may be the time.
-
You should check the eMMC write life of the drive:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/disk-lifetime.html#emmc
On the 5100 if the eMMC fails entirely it can cause problems booting at all. So if it's close to the write-life limit you should fit an SSD as soon as possible.
Steve
-
@stephenw10 said in 23.05.1 -> 23.09 - PKG is broken - SG5100:
On the 5100 if the eMMC fails entirely it can cause problems booting at all. So if it's close to the write-life limit you should fit an SSD as soon as possible.
Thanks for this, I took a look and I'm now properly terrified.
# mmc extcsd read /dev/mmcsd0rpmb | egrep 'LIFE|EOL'
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x0b
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x0b
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
According to the link you gave me:
0x0b: The disk has used 100%-110% of its estimated life time.
I've ordered a new SSD on Amazon, will arrive in the next few days.
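For reference, a small Python sketch of how those hex values could be decoded. The 10%-per-step mapping and the Pre EOL meanings are my understanding of the eMMC EXT_CSD fields, with the 0x0b wording taken from the doc linked above, so treat the tables as assumptions and check them against that page. The names life_time_estimate, PRE_EOL and decode are made up for illustration.

```python
import re

# Output copied from the mmc command in this thread.
SAMPLE = """\
eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x0b
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x0b
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01
"""

# Assumed meanings of the Pre EOL byte (reserved-block consumption).
PRE_EOL = {0x01: "normal", 0x02: "warning", 0x03: "urgent"}

# Matches "[FIELD_NAME]: 0xNN" as printed in the sample above.
FIELD = re.compile(r"\[(EXT_CSD_\w+)\]:\s*(0x[0-9a-fA-F]+)")


def life_time_estimate(value):
    """Describe a life-time estimate byte, assuming 10% steps from 0x01 to 0x0b."""
    if 0x01 <= value <= 0x0B:
        return f"{(value - 1) * 10}%-{value * 10}% of estimated life used"
    return "not defined or reserved"


def decode(text):
    """Return {field name: human-readable meaning} for each EXT_CSD line found."""
    result = {}
    for name, raw in FIELD.findall(text):
        value = int(raw, 16)
        if "LIFE_TIME" in name:
            result[name] = life_time_estimate(value)
        elif "PRE_EOL" in name:
            result[name] = PRE_EOL.get(value, "not defined or reserved")
    return result


if __name__ == "__main__":
    for name, meaning in decode(SAMPLE).items():
        print(f"{name}: {meaning}")
```

Note that under these assumptions the Pre EOL value of 0x01 still reads as normal here, even though both life-time estimates are past 100%, so it seems worth looking at all three fields rather than just one.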
-
If you don't have a bunch of heavy packages installed I'd suggest enabling ram disks to prevent any significant further writes before the SSD is installed. That does require a reboot.
-
@stephenw10 said in 23.05.1 -> 23.09 - PKG is broken - SG5100:
If you don't have a bunch of heavy packages installed I'd suggest enabling ram disks to prevent any significant further writes before the SSD is installed. That does require a reboot.
Thanks, done!
-
@stephenw10 I am experiencing the same problems, but on a nearly new 4100 running ZFS. I tried installing 23.09 from a memstick (multiple times). Each time I assign the WAN and LAN IP addresses, but the device doesn't appear on the LAN and is not reachable.
-
You're seeing ssl library or pkg issues after a clean re-install?