SG-5100 Kernel Panic/Boot Loop
-
Good day, everyone.
I have an SG-5100 I received last week. Brand spankin' new. I booted it up on the workbench and did a config restore from my SG-3100. Everything was working fine, except I didn't have it connected to the network so it could pull down the packages to reinstall them. So I factory reset it again, racked it up in the closet, plugged it back in and powered it up so I could restore the config while it was connected and in place.
When nothing happened for a while, I pulled it back down and hooked it up to the console. This is what I saw.
https://pastebin.com/FerCZ94p
[snip]
Welcome to pfSense 2.4.4-RELEASE (Patch 3)...
No core dumps found.
...ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/ipsec /usr/local/lib/perl5/5.26/mach/CORE
32-bit compatibility ldconfig path: done.
External config loader 1.0 is now starting... mmcsd0s1 mmcsd0s1a mmcsd0s1b
Launching the init system...Updating CPU Microcode...
CPU: Intel(R) Atom(TM) CPU C3558 @ 2.20GHz (2200.07-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x506f1 Family=0x6 Model=0x5f Stepping=1
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4ff8ebbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x101<LAHF,Prefetch>
  Structured Extended Features=0x2294e283<FSGSBASE,TSCADJ,SMEP,ERMS,NFPUSG,MPX,PQE,RDSEED,SMAP,CLFLUSHOPT,PROCTRACE,SHA>
  Structured Extended Features3=0xac000400<IBPB,STIBP,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0x69<RDCL_NO>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
Done.
mode = 0100666, inum = 561816, fs = /
panic: ffs_valloc: dup alloc
cpuid = 3
KDB: enter: panic
[ thread pid 408 tid 100172 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
db:0:kdb.enter.default> textdump set
textdump set
db:0:kdb.enter.default> capture on
db:0:kdb.enter.default> run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo> show alllocks
No such command; use "help" to list available commands
db:1:lockinfo> show lockedvnods
Locked vnodes
vnode 0xfffff800076603b0: tag ufs, type VDIR
    usecount 1, writecount 0, refcount 4 mountedhere 0
    flags (VI_ACTIVE)
    v_object 0xfffff800076a90f0 ref 0 pages 1
    cleanbuf 0 dirtybuf 1
    lock type ufs: EXCL by thread 0xfffff80007720000 (pid 408, php-cgi, tid 100172)
    ino 561792, on dev ufsid/5d4890a17562fc55
vnode 0xfffff800077b5000: tag ufs, type VREG
    usecount 1, writecount 0, refcount 1
    flags (VI_ACTIVE)
    lock type ufs: EXCL by thread 0xfffff80007720000 (pid 408, php-cgi, tid 100172)
    ino 561816, on dev ufsid/5d4890a17562fc55
[snip]
More in the pastebin. At the end, it reboots and does it all over again. Did I brick it? Can I save it?
-
I think you just need to run
# /sbin/fsck -y /
The filesystem is not clean and is probably corrupted.
Boot into single-user mode and repeat that command until fsck neither finds nor fixes any problems. Do not stop when it claims to have cleaned the filesystem after fixing an issue.
-
That did it all right. I saw that fsck had marked it clean in the original console output, so I assumed that couldn't be the problem.
Thanks for the help!
Snaps for kiokoman.
Enter full pathname of shell or RETURN for /bin/sh: uhub0: 8 ports with 8 removable, self powered
# /sbin/fsck -y /
** /dev/ufsid/5d4890a17562fc55
USE JOURNAL? yes
** SU+J Recovering /dev/ufsid/5d4890a17562fc55
** Reading 33554432 byte journal from inode 4.
RECOVER? yes
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
WRITE CHANGES? yes
** 39 journal records in 2560 bytes for 48.75% utilization
** Freed 0 inodes (0 dirs) 2 blocks, and 2 frags.
***** FILE SYSTEM MARKED CLEAN *****
# /sbin/fsck -y /
** /dev/ufsid/5d4890a17562fc55
USE JOURNAL? yes
** SU+J Recovering /dev/ufsid/5d4890a17562fc55
Journal timestamp does not match fs mount time
** Skipping journal, falling through to full fsck
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=561816 OWNER=root MODE=100666
SIZE=0 MTIME=Aug 10 21:37 2019
CLEAR? yes
UNREF FILE I=561821 OWNER=root MODE=100666
SIZE=0 MTIME=Aug 10 21:27 2019
CLEAR? yes
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes
SUMMARY INFORMATION BAD
SALVAGE? yes
BLK(S) MISSING IN BIT MAPS
SALVAGE? yes
21873 files, 231050 used, 1521533 free (997 frags, 190067 blocks, 0.1% fragmentation)
***** FILE SYSTEM IS CLEAN *****
***** FILE SYSTEM WAS MODIFIED *****
# /sbin/fsck -y /
** /dev/ufsid/5d4890a17562fc55
USE JOURNAL? yes
** SU+J Recovering /dev/ufsid/5d4890a17562fc55
Journal timestamp does not match fs mount time
** Skipping journal, falling through to full fsck
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
21873 files, 231050 used, 1521533 free (997 frags, 190067 blocks, 0.1% fragmentation)
***** FILE SYSTEM IS CLEAN *****
# /sbin/fsck -y /
** /dev/ufsid/5d4890a17562fc55
USE JOURNAL? yes
** SU+J Recovering /dev/ufsid/5d4890a17562fc55
Journal timestamp does not match fs mount time
** Skipping journal, falling through to full fsck
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
21873 files, 231050 used, 1521533 free (997 frags, 190067 blocks, 0.1% fragmentation)
***** FILE SYSTEM IS CLEAN *****
# /sbin/fsck -y /
** /dev/ufsid/5d4890a17562fc55
USE JOURNAL? yes
** SU+J Recovering /dev/ufsid/5d4890a17562fc55
Journal timestamp does not match fs mount time
** Skipping journal, falling through to full fsck
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
21873 files, 231050 used, 1521533 free (997 frags, 190067 blocks, 0.1% fragmentation)
***** FILE SYSTEM IS CLEAN *****
# /sbin/fsck -y /
** /dev/ufsid/5d4890a17562fc55
USE JOURNAL? yes
** SU+J Recovering /dev/ufsid/5d4890a17562fc55
Journal timestamp does not match fs mount time
** Skipping journal, falling through to full fsck
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
21873 files, 231050 used, 1521533 free (997 frags, 190067 blocks, 0.1% fragmentation)
***** FILE SYSTEM IS CLEAN *****
# reboot
Aug 11 21:47:48 init: single user shell terminated.
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `bufdaemon' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 0 0 0 0 0 0 0 0 done
All buffers synced.
Uptime: 2m21s
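For anyone who lands here later: the "repeat until fsck stops fixing things" routine in the log above could be wrapped in a small sh function, roughly like the sketch below. This is only an illustration, not anything shipped with pfSense. The function name fsck_until_clean and the FSCK_CMD override are made up for this example, and the check relies on the exact banner strings seen in this thread ("FILE SYSTEM MARKED CLEAN", "FILE SYSTEM WAS MODIFIED", "FILE SYSTEM IS CLEAN"), which may differ on other versions. It does not look at fsck's exit status.

```sh
#!/bin/sh
# Sketch: rerun fsck until a pass reports a clean filesystem with no repairs.
# FSCK_CMD is a hypothetical override for dry runs; the default is the
# command used in this thread.

fsck_until_clean() {
    max=${1:-5}                           # maximum number of passes
    cmd=${FSCK_CMD:-"/sbin/fsck -y /"}
    i=1
    while [ "$i" -le "$max" ]; do
        # Output is captured, so it is printed only after each pass finishes.
        out=$($cmd 2>&1)
        printf '%s\n' "$out"
        case "$out" in
            *"FILE SYSTEM WAS MODIFIED"*|*"FILE SYSTEM MARKED CLEAN"*)
                # Journal recovery or a full check changed something: go again.
                echo "pass $i: fsck made repairs, running again"
                ;;
            *"FILE SYSTEM IS CLEAN"*)
                echo "pass $i: clean, no repairs made"
                return 0
                ;;
            *)
                echo "pass $i: unrecognized fsck output, stopping" >&2
                return 2
                ;;
        esac
        i=$((i + 1))
    done
    echo "still making repairs after $max passes" >&2
    return 1
}
```

In the single-user shell you would define the function and then call it with a pass limit, for example fsck_until_clean 6. Running the command by hand as shown in the log works just as well; the function only saves retyping.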
-
You didn't say how you powered it down prior to re-racking it. You should never just unplug the power. Always use the shutdown command in the pfSense menu. Simply removing power is just about guaranteed to cause disk corruption.
-
I'm ashamed to say that's very likely what I did wrong. I'm usually more conscientious than that. I did a legitimate shutdown and halt before I put it back in the closet this time. How embarrassing.
-
No harm since you repaired the file system. These little Netgate appliances are still PCs at heart, so you have to shut them down gracefully even though they do sort of resemble a network switch. They are actively writing to the disk even if it is solid-state, unlike, say, a typical firmware-based switch, which just reads from a ROM of some type.