Random Crashing
-
Just loaded up a pfsense box
HW:
AMD Athlon XP 3000+
1GB DDR
WD 250GB SATA HDD
The box is located in a rack in the server room and is staying cool.
It also passed an Inquisitor burn-in test.
I'm getting random restarts.
I see the crash dumps but nothing is jumping out at me (weak Linux skills).
Here is a link to the crash dump on paste bin
http://pastebin.com/YBJc3ghp
The only thing I saw was a DMA write delay right before it crashed, so I installed pfSense to a flash drive and tried that, and I'm still getting crashes.
Can anyone give me any ideas on what is going on?
Thanks,
Lothar863
-
I would normally suggest a bad hard drive, bad RAM, overheating or a bad PSU, in that order, but it seems like you've tested for those. It's an older box and bad caps on the board seem a likely suspect, but that too should be shown by a burn-in.
Which install type are you using? What NICs? (looks like 6x Realteks)
Any reason you're using 2.1.1 and not 2.1.2?
Any idea what this address is and why it keeps moving MACs?
<6>arp: 172.29.10.25 moved from 00:15:17:54:ee:00 to 00:15:17:54:ee:01 on re0
<6>arp: 172.29.10.25 moved from 00:15:17:54:ee:01 to 00:15:17:54:ee:00 on re0
Steve
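If it helps to see how often this is happening, here is a rough Python sketch that counts these "moved from" messages in a saved copy of the log. The function names `count_arp_moves` and `macs_seen` are just made up for illustration, and the regex only assumes the message format shown in the two lines above. Two MACs that differ only in the last octet often (not always) mean two ports of the same multi-port card are answering on the same segment.

```python
import re
from collections import Counter

# Matches kernel messages in the format quoted above, e.g.
#   <6>arp: 172.29.10.25 moved from 00:15:17:54:ee:00 to 00:15:17:54:ee:01 on re0
ARP_MOVE = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<iface>\w+)",
    re.IGNORECASE,
)


def count_arp_moves(lines):
    """Count MAC moves per (ip, interface) pair. Hypothetical helper."""
    moves = Counter()
    for line in lines:
        m = ARP_MOVE.search(line)
        if m:
            moves[(m.group("ip"), m.group("iface"))] += 1
    return moves


def macs_seen(lines):
    """Collect every MAC that an IP was seen moving from or to. Hypothetical helper."""
    seen = {}
    for line in lines:
        m = ARP_MOVE.search(line)
        if m:
            macs = seen.setdefault(m.group("ip"), set())
            macs.add(m.group("old").lower())
            macs.add(m.group("new").lower())
    return seen


if __name__ == "__main__":
    # The two log lines quoted in this post, used as sample input.
    sample = [
        "<6>arp: 172.29.10.25 moved from 00:15:17:54:ee:00 to 00:15:17:54:ee:01 on re0",
        "<6>arp: 172.29.10.25 moved from 00:15:17:54:ee:01 to 00:15:17:54:ee:00 on re0",
    ]
    print(count_arp_moves(sample))
    print(macs_seen(sample))
```

To run it against a real log, pass the lines of the saved dump text instead of `sample`.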
-
It's on 2.1.1 after the reload to the flash drive. Prior to the reload it was on 2.1.2.
I installed using a new image downloaded from the site and put onto a flash drive, then installed to another flash drive connected to the native ports on the back of the system.
The 172 address is a virtual NIC on another pfSense box at a remote location.
Not sure why the MAC would change.
I checked all the caps before building the box and there was no sign of swelling.
Current temp is about 20 C in the server room and 23 C on the CPU heat sink; the CPU reports 44F in the BIOS.
-
I was incorrect about the 172. That is going to an Intel PRO dual NIC card. The trunk may not be working properly on that card, or it was connected to the other port.
-
I am not familiar with "Inquisitor", but it seems to be a package of many different stress tests and "inquisitor burn in test" seems to reference specifically the CPU burn-in test. Have you actually done any specific memory tests?
-
Ah, good point. I just assumed it was a combined hardware test. Does it test the RAM and HD?
Which image are you using? Are you running a full install?
Steve
-
Default test is:
1800 sec CPU burn
19 step memory test
destructive HDD read/write test
and then repeat until stopped.
I also ran Memtest86+ and got no errors.
-
http://pastebin.com/YBJc3ghp
Another one. I have turned off the 172 and it is still crashing.
-
Nothing jumps out at me I'm afraid. :(
You may have to wait for someone who can read those crash dumps correctly.
The process that seems to be causing the problem is tcpdump, so perhaps it's trying to do something that your NICs don't support. Do you know which Realtek NICs they are?
Try disabling all the hardware offloading features in System: Advanced: Networking.
That's pretty much a guess but easy to try.
Steve
-
http://www.newegg.com/Product/Product.aspx?Item=N82E16833704011
These are the cards I put in