Fatal Trap 12 every few days…

GoldServe

Hrm, don't the system log files get overwritten on every boot?

eskild

My fw had a similar crash today! pfSense 2.0.

I have never experienced this before. Is there any way I can collect information from pfSense after a reboot that makes troubleshooting easier? Logs, coredumps?

I have used pfSense since Rel1 was in alpha stage, and this is the first time I had to physically reset the fw due to instability.

Please see the attached image.

pfSenseCrash-05dec2011.jpg_thumb

GoldServe

Can you type "bt" at the prompt?

eskild

bt: Command not found.

I have restarted the fw though as no traffic was possible after the crash.

GoldServe

You must have an embedded platform where the kernel does not allow "back trace".

I believe there is a way to change the kernel on the embedded platform. Someone more knowledgeable can chime in.

eskild

I do not have an embedded platform. I have a default x86 install: pfSense 2.0-RELEASE-pfSense (i386)

stephenw10

You can only run a back trace at the db> prompt, after a crash, not if:

@eskild:

I have restarted the fw though as no traffic was possible after the crash.

Steve

eskild

Thanks, I suspected that.
It might be good to evaluate features that capture system information the dev team can use for troubleshooting later.
I was surprised that after the boot there were no traces of the crash at all, and it was impossible to provide any hard evidence of what had happened.
I doubt that there are many firewalls that can be offline for a long time while consulting support. Most of us need to reboot and have the system back in service right away.

Just my two cents.

Back to the problem at hand. Is it possible that the crash is caused by a memory issue (RAM)? I have seen instabilities on other systems caused by failing RAM.

Thanks

jimp

We fixed the fact that some crashes do not automatically restart in 2.0.1/2.1, but it's an easy fix:

Edit /etc/ddb.conf and change

script kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset

to

script kdb.enter.default=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset

(So just change kdb.enter.panic to kdb.enter.default)

Then run:

/sbin/ddb /etc/ddb.conf

From that point on it should collect the debug data and reboot itself automatically, and also give you a crash report notice in the GUI that you can use to upload the data to our servers (or grab it from /var/crash yourself).
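
For anyone who prefers to script the edit, here is a rough sketch of the same change, assuming a POSIX shell and that /etc/ddb.conf contains the kdb.enter.panic script line shown above. The helper name switch_ddb_hook and the .bak backup file are made up for illustration and are not part of pfSense.

```shell
#!/bin/sh
# Sketch only: rename the ddb script hook from kdb.enter.panic to
# kdb.enter.default, keeping a backup of the original file.
# switch_ddb_hook is a hypothetical helper name, not a pfSense tool.
switch_ddb_hook() {
    conf="$1"
    # Keep a copy of the original so the change can be reverted.
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the hook name at the start of the script line;
    # the rest of the line (the ddb commands) is left untouched.
    sed 's/^script kdb\.enter\.panic=/script kdb.enter.default=/' "$conf.bak" > "$conf"
}

# On the firewall itself (as root), the two steps would then be:
#   switch_ddb_hook /etc/ddb.conf && /sbin/ddb /etc/ddb.conf
```

The function takes the file path as an argument so it can be tried on a copy of ddb.conf first. If something looks wrong afterwards, copying the .bak file back restores the original.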

From that panic it could be faulty hardware, but it's hard to say for sure. Usually if it's bad RAM the crashes would be in a different place every time, not in the exact same path. Though it could be a faulty NIC.

atul

Hello,

I am also getting the same error every few days, or sometimes more than once a day. I have recently upgraded from 1.2.3 to 2.0.1. But, the pfSense was crashing and restarting before the upgrade, so it is not "only" associated with 2.0.1 release.

I am attaching the entire crash log (long) that I was able to see on the GUI. I have sent it to pfSense team for further analysis.

Atul.

[Crash Report.txt](/public/imported_attachments/1/Crash Report.txt)

jimp

That crash is in code writing to the filesystem. There is very little likelihood of a problem in that code; it's been solid for years on FreeBSD.

More likely your HDD or storage media has issues, or it could be cabling/controller/DMA issues, but it's definitely storage.

atul

Thanks jimp. I will change the hard disk and check again.

Out of curiosity - how did you know that this is storage related?

Atul.