Crash Report Analysis - Correct procedure?



  • Having just had my first crash after installing and running pfs 2.0.2 for a week or so, I have a couple of questions:

    1. Is there a procedure for requesting help with / analysis of a crash report? (I'm surprised there is no specific forum area.)
    2. Is there any suggested reading to help with the analysis? (I'm new to BSD.)

    I have seen many people post the whole report, but it appears that only the early section is really relevant? In the hope that this is the case, here is the first part of mine, which seems to show nothing:

    Crash report begins.  Anonymous machine information:

    i386
    8.1-RELEASE-p6
    FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 17:53:00 EST 2011    root@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8

    Crash report details:

    Filename: /var/crash/bounds
    1

    Filename: /var/crash/info.0
    Dump header from device /dev/ad0s1b
      Architecture: i386
      Architecture Version: 1
      Dump Length: 66560B (0 MB)
      Blocksize: 512
      Dumptime: Sun Dec 23 23:07:51 2012
      Hostname: pfsense.home
      Magic: FreeBSD Text Dump
      Version String: FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 17:53:00 EST 2011
        root@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8
      Panic String:
      Dump Parity: 3941382271
      Bounds: 0
      Dump Status: good

    Filename: /var/crash/textdump.tar.0

    Thanks in advance for any help / suggestions



  • Matching the crash report by the IP you're posting from here, the root of the problem is:

    MCA: Bank 4, Status 0xb200000000040151
    MCA: Global Cap 0x0000000000000005, Status 0x0000000000000004
    MCA: Vendor "GenuineIntel", ID 0x652, APIC ID 0
    MCA: CPU 0 UNCOR PCC ICACHE L1 IRD error

    Fatal trap 28: machine check trap while in user mode

    That means the hardware informed the OS that it had an L1 cache error on the CPU, which is a hardware problem. The OS triggered a kernel panic to prevent the corruption or other bad behavior that would be inherent in running with cache errors. My guess is that the fault is in the CPU itself, though the motherboard could potentially induce such an error. That's the first time I've seen a report like that, and no one else has reported anything similar here, but if you Google that last line you'll find discussion of it with FreeBSD in general. It's definitely a hardware problem though.
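
    For anyone curious how the "UNCOR PCC ICACHE L1 IRD" text relates to the raw Status value, here is a minimal Python sketch. It assumes the standard Intel machine-check status layout (flag bits in the top byte, a 16-bit MCA error code at the bottom, and the compound cache-hierarchy code pattern 000F 0001 RRRR TTLL). The function name decode_mca_status and the dictionary keys are made up for illustration; this is not what the kernel itself runs.

```python
# Flag bit positions in a machine-check bank status value
# (assumed: Intel IA32_MCi_STATUS layout).
VAL, OVER, UC, EN, PCC = 63, 62, 61, 60, 57

# Sub-fields of a cache-hierarchy error code: 000F 0001 RRRR TTLL
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
CACHE_TYPE = {0: "ICACHE", 1: "DCACHE", 2: "GCACHE"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_mca_status(status):
    """Hypothetical helper: split a 64-bit bank status into readable fields."""
    def bit(n):
        return bool((status >> n) & 1)

    code = status & 0xFFFF  # architectural MCA error code (bits 15:0)
    decoded = {
        "valid": bit(VAL),
        "overflow": bit(OVER),
        "uncorrected": bit(UC),
        "enabled": bit(EN),
        "pcc": bit(PCC),  # processor context corrupt
        "mca_code": code,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "summary": None,
    }
    # Only the cache-hierarchy pattern is decoded here; bit 12 (F) is ignored.
    if code & 0xEF00 == 0x0100:
        request = REQUEST.get((code >> 4) & 0xF, "?")
        cache = CACHE_TYPE.get((code >> 2) & 0x3, "?")
        level = LEVEL[code & 0x3]
        decoded["summary"] = f"{cache} {level} {request}"
    return decoded


decoded = decode_mca_status(0xB200000000040151)
print(decoded["summary"], "uncorrected:", decoded["uncorrected"],
      "pcc:", decoded["pcc"])
```

    With the Bank 4 status from the report, the uncorrected and PCC flags come out set and the error code decodes as an L1 instruction cache error on an instruction fetch, which lines up with the kernel's own message.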



  • Thanks for the rapid response - much appreciated.

    Based upon your help (and lots of Googling) it seems the most likely culprits are the RAM, the power supply and the CPU itself.  I am working on the first two initially, as they are the easiest.

    Thanks again


  • Rebel Alliance Developer Netgate

    FYI: that is not a 2.0.2 crash report.

    Your kernel is this:

    Version String: FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 17:53:00 EST 2011
        root@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8

    Which is 2.0.1.

    So either it's still on 2.0.1, or the upgrade did not fully complete.

    Aside from other troubleshooting, I'd definitely schedule a config backup and reinstall.
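
    If you want to check this yourself on a saved report, a rough Python sketch like the one below pulls the fields out of the info.0 text so the Version String can be read directly. It assumes the "Key: value" layout shown in the report above, with continuation lines indented deeper than their key. The name parse_dump_header and the trimmed SAMPLE text are illustrative only, not part of pfSense or FreeBSD.

```python
# Trimmed copy of the header from the report above, used as sample input.
SAMPLE = """\
Dump header from device /dev/ad0s1b
  Architecture: i386
  Version String: FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 17:53:00 EST 2011
    root@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8
  Panic String:
  Dump Status: good
"""


def parse_dump_header(text):
    """Hypothetical helper: map each 'Key: value' line to a dict entry."""
    fields = {}
    key = None
    key_indent = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        # A line indented deeper than the last key continues that key's value.
        if key is not None and indent > key_indent:
            fields[key] = (fields[key] + " " + line.strip()).strip()
            continue
        # Split on the first colon only; values may contain colons themselves.
        name, sep, value = line.strip().partition(":")
        if not sep:
            continue  # e.g. the "Dump header from device ..." banner line
        key, key_indent = name, indent
        fields[key] = value.strip()
    return fields


header = parse_dump_header(SAMPLE)
print(header["Version String"])
print("Panic String empty:", header["Panic String"] == "")
```

    The build date and kernel path in the Version String are what identify the release; the mapping from that string to a pfSense version still has to come from someone who knows the builds.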



  • Sorry - simple typo - running 2.0.1

    Thanks


  • Rebel Alliance Developer Netgate

    OK, in that case once you have found/fixed the other issue(s), make sure to upgrade to 2.0.2.



  • Just a small update, in case it helps anyone else.  I appear to have solved the problem, or at least there have been no further crashes for a week.  The only changes were:

    1. Assigned a different interrupt to each network card. Previously they were all sharing IRQ 11; now they are on 9, 10 and 11.
    2. Increased RAM to 256MB. (It now very rarely, if ever, needs to swap out.)

    I have tried pushing everything to the limit - managed to get load averages above 15 - with continuously 0% idle for 20-30 minutes - with no apparent issues. To be really correct I should have changed only one thing to verify the true cause - but as it now works well I plan to leave it well alone.

    So all systems go - only one last question.  I am now running very happily with 2.0.1.

    Should I upgrade to 2.0.2?  I see some small issues posted, admittedly most of which would not affect me. I am always hesitant about upgrading anything to a new version if I don't need to.  Looking in the "what's new" I don't see anything that I desperately need.

    So - upgrade now or wait?


  • Netgate Administrator

    It's more about the security fixes than the extra features. Look at the release notes to see what applies to you.
    As JimP said recently, it looks like there will be a further update relatively soon to deal with the PPPoE DNS issue. You could wait for that if it applies to you.
    I have upgraded with no issues.

    Steve

