Unrecoverable machine check exception
-
I found some information that may be relevant, and it points to hardware. I have had Linux and Windows installed on this PC previously without any issues. From a Google search, I noticed that someone else had the same hardware issue on a Dell OptiPlex with an Intel Pro NIC installed.
I have upgraded the BIOS to the newest version. Could FreeBSD/pfSense have an incompatibility with this hardware, or is this truly a hardware-only issue that BSD detects?
MCA: Bank 3, Status 0xfe00000000800400
MCA: Global Cap 0x0000000000000c09, Status 0x0000000000000004
MCA: Vendor "GenuineIntel", ID 0x206a7, APIC ID 0
MCA: CPU 0 UNCOR PCC OVER internal timer error
MCA: Address 0x3fff806160dd
MCA: Misc 0x3ffff
panic: Unrecoverable machine check exception
cpuid = 0
KDB: enter: panic
-
MCA/MCE errors are 100% hardware. Nothing to do with the OS or any software.
$ mcelog --no-dmi --ascii --file mce.log
Hardware event. This is not a software error.
CPU 0 BANK 3
MISC 3ffff ADDR 3fff806160dd
MCG status:MCIP
STATUS fe00000000800400 MCGSTATUS 4
MCGCAP c09 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42
Not much more helpful. Looks like a CPU problem, but maybe a slight chance it's power/heat.
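For anyone who wants to read the raw values by hand, here is a minimal sketch of how the Status and ID fields from the log above break down. It assumes the architectural IA32_MCi_STATUS bit layout described in the Intel SDM (flag bits 63 down to 57, model-specific code in bits 31:16, MCA error code in bits 15:0) and the standard CPUID signature layout. The function names are made up for illustration and are not part of mcelog or FreeBSD.

```python
# Architectural flag bits of IA32_MCi_STATUS (assumed from the Intel SDM layout).
STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten (overflow)
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error codes."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


def decode_cpuid_signature(sig):
    """Return (family, model, stepping) from a CPUID signature such as 0x206a7."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    # The extended fields only apply to certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


if __name__ == "__main__":
    decoded = decode_mci_status(0xFE00000000800400)
    print(decoded["flags"], hex(decoded["mca_error_code"]))
    print(decode_cpuid_signature(0x206A7))
```

With the values from this thread, the flags include UC, PCC and OVER, which matches the "UNCOR PCC OVER" line, and the MCA error code comes out as 0x400, the encoding the kernel reports as "internal timer error". The ID 0x206a7 decodes to family 6, model 42, which agrees with the mcelog output.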
-
@jimp Thank you for sharing that information. I assume the Dell diagnostic test does not perform a deep enough test to identify the culprit. This leads me down the right path.
-
Linux and Windows work fine with this PC and with the other one that was crashing with pfSense. It must be an issue with this hardware's compatibility with FreeBSD.
-
Then you have something else wrong in your environment, maybe bad power. That error cannot come from software. It is generated in the hardware/BIOS.
-
Bad power would be odd since I have it connected to a surge suppressor with two Ubuntu servers that stay on 24/7 without any crashes. Anyway thanks for the response.
-
@mokfarg said in Unrecoverable machine check exception:
Dell Optiplex 790
If both were the same model, same vintage, they may have the same hardware issue. Bad capacitors hit in waves like that.
If you don't believe me about the errors being hardware, research Machine Check Exceptions: https://en.wikipedia.org/wiki/Machine-check_exception
I know you do not want to believe it's hardware, but there is literally no way for software to trigger those.
-
I appreciate the response, I truly do. I guess that is a possibility. The only piece of hardware that I have had in both PCs is an Intel NIC. I guess it could cause a kernel panic? I'll remove it and test; the crashes usually happen quickly.
-
Edit:
I replaced the Dell OptiPlex 790 completely with a known good one and got the same crashes, with the same error message to the letter. The only piece of hardware common to both machines was an Intel Pro 1000 NIC. After replacing the NIC, the issue is no longer present. I was incorrect in believing this issue was related to pfSense. pfSense helped me discover the bad hardware, as did jimp.
MCA: Bank 3, Status 0xfe00000000800400
MCA: Global Cap 0x0000000000000c09, Status 0x0000000000000004
MCA: Vendor "GenuineIntel", ID 0x206a7, APIC ID 0
MCA: CPU 0 UNCOR PCC OVER internal timer error
MCA: Address 0x3fff805ea790
MCA: Misc 0x3ffff
panic: Unrecoverable machine check exception
cpuid = 0
KDB: enter: panic