XG-1541 Weird Boot Behavior and Changing FSCK Messages
-
We're running an XG-1541, and I came into the office to find it had not recovered from a scheduled reboot. When I got in this morning it was offline, and at the KVM it showed a blank screen: no messages, no response to input. I restarted it and noticed that the "Intel Boot Agent" screen reset numerous times before it moved on to the standard boot loader. The Intel Boot Agent message appears at the top, then "press "ctrl-s" to enter setup" appears below it; the setup prompt disappears, then the Intel Boot Agent message does, the screen goes black, and the whole thing repeats. It did this 5 or 6 times before continuing.
Once it got to the menu I went to the shell, ran FSCK, and got a bunch of errors. Not being a BSD/*NIX admin, I am unfamiliar with FSCK output and not sure how serious they are. I rebooted into single-user mode and ran it again to repair, and it marked the drive clean. I rebooted back to normal mode and again got multiple errors. The boot behavior was the same each time, resetting at the Intel Boot Agent message multiple times. The output from FSCK varies: if I run it several times in a row I will frequently get different messages. I've included a couple below:
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=4333827 OWNER=root MODE=100666
SIZE=0 MTIME=Sep 25 07:42 2019
CLEAR? no
** Phase 5 - Check Cyl groups
208838 files, 3777527 used, 16479844 free (462412 frags, 2002179 blocks, 2.3% fragmentation)
And run immediately after that I get:
SETTING DIRTY FLAG IN READ_ONLY MODE
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=4333845 (8 should be 0)
CORRECT? no
INCORRECT BLOCK COUNT I=4333846 (408 should be 0)
CORRECT? no
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=4333827 OWNER=root MODE=100666
SIZE=0 MTIME=Sep 25 07:42 2019
CLEAR? no
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no
SUMMARY INFORMATION BAD
SALVAGE? no
BLK(S) MISSING IN BIT MAPS
SALVAGE? no
208838 files, 3777470 used, 16479792 free (462408 frags, 2002173 blocks, 2.3% fragmentation)
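For anyone comparing the two runs, here is a rough Python sketch that pulls the numbers out of the two summary lines above and shows what changed between them. The regex and the `parse_summary` helper are made up for illustration, and the 8-fragments-per-block ratio is an assumption that happens to match these totals; it is not taken from the system's actual filesystem parameters.

```python
import re

# Matches the fsck summary line format seen in the output above.
SUMMARY = re.compile(
    r"(\d+) files, (\d+) used, (\d+) free "
    r"\((\d+) frags, (\d+) blocks, ([\d.]+)% fragmentation\)"
)

def parse_summary(line):
    """Return the integer counters from an fsck summary line, or None."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    files, used, free, frags, blocks = (int(g) for g in m.groups()[:5])
    return {"files": files, "used": used, "free": free,
            "frags": frags, "blocks": blocks}

first = parse_summary(
    "208838 files, 3777527 used, 16479844 free "
    "(462412 frags, 2002179 blocks, 2.3% fragmentation)"
)
second = parse_summary(
    "208838 files, 3777470 used, 16479792 free "
    "(462408 frags, 2002173 blocks, 2.3% fragmentation)"
)

# Difference between the second run and the first run, per counter.
delta = {key: second[key] - first[key] for key in first}
print(delta)

# Assumption: 8 fragments per block. With that ratio the "free" total
# equals frags + 8 * blocks in both runs.
FRAGS_PER_BLOCK = 8
for run in (first, second):
    print(run["free"] == run["frags"] + FRAGS_PER_BLOCK * run["blocks"])
```

The file count is the same in both runs, but the used and free counts both move, so the two passes did not see identical on-disk state. That fits the observation that the messages change from run to run, though it does not say by itself whether the drive is at fault.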
The drive is the Intel 53x and Pro 2500 Series SSD that shipped with the system. It still passes SMART tests, but to me this looks like it could be failing. The boot agent might be having trouble with the drive, causing it to loop temporarily.
I'm going to get a new drive and attempt to clone it. But before that, are there any other recommended diagnostics I should run?
-
I don't think it's anything serious, judging from the fsck log.
When you reboot to single-user mode, run
fsck -y /
multiple times, even if it tells you that the file system is clean.
Run it at least 5 or 6 times.
This could happen after a blackout, but there could be a worse problem from the KVM.
-
I'll try that tomorrow morning when I get in and can take it offline, thanks.
Is the stutter at the Intel boot agent normal for this model? I don't remember noticing it do so before.
-
No, that does not sound like the normal boot output.
If running fsck multiple times from single-user mode does not correct it, I would open a ticket with us. https://go.netgate.com
Steve
-
Yeah, no change. I booted single-user and ran FSCK about 20 times. It said clean each time, but when I rebooted and tried from the shell I got an SU+J error and a long list of incorrect block counts right away. I'll open a ticket, thanks everyone.