SG-2440 kernel panic/loops while booting



  • We have an SG-2440 purchased in May 2016 from Netgate.

    We rebooted after a configuration change, and now it doesn't boot properly. We can connect to the console and watch it try to boot: the kernel makes some progress, but then there is some kind of kernel panic. I'm not sure how to capture the console output, and the keyboard doesn't seem to be active either.

    Are we encountering the Clock Signal Component Issue? https://www.netgate.com/blog/clock-signal-component-issue.html

    We're okay with re-installing the image, but we're not sure where to get it. The docs say to log into our portal account for the proper image, but there's nothing there except a notice that "You have no active subscriptions."

    Thank you.

    –Karl



  • Replying with information from the console.

    It looks like the UFS file system is corrupted: the panic message says "mangled entry".

    How can I fix it?

    SMP: AP CPU #1 Launched!
    Timecounter "TSC" frequency 1750043358 Hz quality 1000
    Trying to mount root from ufs:/dev/ufsid/574df39d930fea05 [rw]…
    WARNING: / was not properly dismounted
    Configuring crash dumps...
    Using /dev/label/swap0 for dump device.
    ** SU+J Recovering /dev/ufsid/574df39d930fea05
    ** Reading 30539776 byte journal from inode 4.
    ** Building recovery table.
    ** Resolving unreferenced inode list.
    ** Processing journal entries.
    ** 49 journal records in 2560 bytes for 61.25% utilization
    ** Freed 0 inodes (0 dirs) 1 blocks, and 0 frags.

    ***** FILE SYSTEM MARKED CLEAN *****
    Filesystems are clean, continuing...
    Mounting filesystems...

    [pfSense ASCII-art boot logo]

    Welcome to pfSense 2.3.1-RELEASE on the 'pfSense' platform...

    savecore: reboot after panic: ufs_dirbad: /: bad dir ino 11 at offset 512: mangled entry
    savecore: writing core to /var/crash/textdump.tar.0
    Creating symlinkpanic: ufs_dirbad: /: bad dir ino 11 at offset 512: mangled entry
    cpuid = 1
    KDB: enter: panic


  • Rebel Alliance Developer Netgate

    Boot to single user mode and run "fsck -y /" a few times until it does not find anything. Don't stop when it claims to be clean; it may take 3-5 times.
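
The advice above can be sketched as a small loop for the single-user shell. This is a minimal illustration, not an official Netgate procedure: `run_fsck_passes` and the `FSCK_CMD` variable are hypothetical names introduced here, and the default of five passes just takes the upper end of the "3-5 times" suggested in the post. Typing `fsck -y /` by hand several times does the same thing.

```shell
# Sketch only: repeat a forced filesystem check several times in a row.
# FSCK_CMD and run_fsck_passes are illustrative names, not pfSense tools.
FSCK_CMD="${FSCK_CMD:-fsck}"

run_fsck_passes() {
    passes="${1:-5}"   # assumption: default to 5, the top of the suggested 3-5 range
    target="${2:-/}"   # root filesystem unless another mount point is given
    i=1
    while [ "$i" -le "$passes" ]; do
        echo "=== fsck pass $i of $passes on $target ==="
        # -y answers "yes" to every repair prompt
        "$FSCK_CMD" -y "$target"
        i=$((i + 1))
    done
}
```

In single user mode one would run `run_fsck_passes 5 /`, read the output of the later passes to confirm that nothing more is being repaired, and then reboot. The loop deliberately ignores a "clean" report from any one pass, since the point of the advice is to keep checking past that.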



  • Thank you.  Recovered.

    When I logged in, there was a crash report waiting. Do the developers want to see it?

    –Karl


  • Rebel Alliance Developer Netgate

    No, it's a generic filesystem panic. There wouldn't be anything useful in the crash report since it was crashing due to corruption of the filesystem.



  • This tends to happen with UFS after power failures (it used to happen to me a fair bit at a few troublesome locations with bad power). I'm looking forward to the 2.4 release, which is due any day now; it will bring ZFS to the table and hopefully put this issue to rest.



  • @jimp:

    Boot to single user mode and run "fsck -y /" a few times until it does not find anything. Don't stop when it claims to be clean; it may take 3-5 times.

    Thanks!

    This got me all fixed in a hurry in the field today.
    The day started out with me having a really bad attitude toward SG-2440 equipment purchased from the pfSense Store.
    I usually build my own pfSense boxes out of PC hardware for clients, and they last for years without any issues
    other than keeping them up to date.

    This would have been the 4th SG-2440 that suddenly ate itself for lunch and failed.
    This box has not been modified or changed lately and is on clean, battery-backed power.
    So I wonder what might have caused this to happen suddenly.

    Thanks again & take care!

