v2.4.4 crashing under load with MCA error
-
MCA: CPU 2 UNCOR PCC OVER internal timer error
Above is the error I'm getting. MCA = Machine Check Architecture.
Error occurs randomly, but seems to be replicable if I simply load a few web pages or videos at once from the one PC that is connected to it.
It's not overheating - ~100degF give or take.
This is on a Dell Optiplex 390. Actually, two identical boxes are producing the same exact error.
CPU: Intel(R) Core(TM) i3-2120 CPU @ 3.30GHz (3292.59-MHz K8-class CPU).
Origin="GenuineIntel" Id=0x206a7 Family=0x6 Model=0x2a Stepping=7
I'm not overclocking or anything like that. As far as I know these boxes have never been overclocked. (I got them barely used from a real-estate office.)
Also, using an Intel 4-port card, but that doesn't seem to be the issue as near as I can tell.
igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x3020-0x303f mem 0xe3420000-0xe343ffff,0xe3000000-0xe33fffff,0xe3450000-0xe3453fff irq 18 at device 0.0 on pci3
I've googled the error code to death.
Here are the things I've tried:
BIOS updated (Currently A14 - latest).
Reset BIOS defaults.
Tried two different PCs, identical Optiplex 390s with 4 GB RAM.
Ran thorough system diagnostics (RAM and processor) - no issues.
Tried full reinstall on both boxes.
Checked for bad capacitors on the motherboard - no sign of any issues.
Also, Win10 and Linux Mint ran just fine on both of these boxes for about a year.
Help?
-
MCA errors are hardware errors caught by FreeBSD; there is nothing pfSense can do about them. Talk to the FreeBSD devs, but it's likely you need new/different hardware.
-
MCA errors are almost exclusively hardware, though it could be some hardware issue that only FreeBSD/pfSense tickles.
Can you test it with FreeBSD 11.2?
There are usually a lot more lines of MCA output, and probably a kernel panic line. Do you have those?
Steve
-
Thanks very much for the response... here's the full MCA:
MCA: Misc 0x3ffff
MCA: Address 0x3fff80609d46
MCA: CPU 0 UNCOR PCC OVER internal timer error
MCA: Vendor "GenuineIntel", ID 0x206a7, APIC ID 0
MCA: Global Cap 0x0000000000000c07, Status 0x0000000000000004
MCA: Bank 3, Status 0xfe00000000800400
What are the odds that both identical boxes have the same hardware issue?
-
There was no panic string shown though? No crash report after rebooting?
Unless it's some common fault on that board, it's more likely an issue only FreeBSD is hitting, as I say. For example, the BIOS may configure the hardware correctly for Windows or Linux but pass different values to BSD.
Steve
-
$ mcelog --no-dmi --ascii --file mce.log
Hardware event. This is not a software error.
CPU 0 BANK 3
MISC 0 ADDR 0
MCG status:
STATUS fe00000000800400 MCGSTATUS 0
APICID 0 SOCKETID 0
You have a hardware problem.
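For anyone who wants to check that status word by hand, here is a rough Python sketch of how the Bank 3 value from the log breaks down. The bit positions are my reading of the IA32_MCi_STATUS layout in the Intel SDM machine-check chapter, and the function name decode_mci_status is made up for illustration; it is not part of mcelog or FreeBSD.

```python
# Flag bits in the upper part of IA32_MCi_STATUS (assumed from the Intel SDM).
FLAG_BITS = [
    (63, "VAL"),    # register contains a valid error
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error reporting was enabled
    (59, "MISCV"),  # MISC register is valid
    (58, "ADDRV"),  # ADDR register is valid
    (57, "PCC"),    # processor context corrupt
]

# Architectural MCA error code for "internal timer error" (assumed from the SDM).
INTERNAL_TIMER_CODE = 0x0400


def decode_mci_status(status):
    """Split a 64-bit bank status word into flags and error codes."""
    flags = [name for bit, name in FLAG_BITS if (status >> bit) & 1]
    mca_code = status & 0xFFFF             # bits 15:0
    model_code = (status >> 16) & 0xFFFF   # bits 31:16, model specific
    return {
        "flags": flags,
        "mca_error_code": mca_code,
        "model_specific_code": model_code,
        "internal_timer": mca_code == INTERNAL_TIMER_CODE,
    }


if __name__ == "__main__":
    # Status value copied from the "MCA: Bank 3" line in the log above.
    decoded = decode_mci_status(0xFE00000000800400)
    print(decoded["flags"])
    print(hex(decoded["mca_error_code"]), decoded["internal_timer"])
```

The UC, PCC and OVER flags it reports line up with the "UNCOR PCC OVER" that FreeBSD printed, and the low 16 bits match the internal timer error code, so the kernel message and the mcelog output agree with each other.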