Kernel Panic
-
When a master syncs the config to a slave it makes several alterations in the process.
Just flipping a slave to a master in the XMLRPC settings is going to cause you problems because of this.
If you really want to swap their roles, restore the master's config to the slave box and vice versa.
When a CARP slave takes over automatically - if you have configured everything right - it can function that way indefinitely until you bring the master back up; you just can't make any changes to the slave's config that you want to keep. But that is all a topic for another thread. It's not the source of the panic - it's just a Bad Thing.
-
Good to know - I'll avoid doing that in the future.
-
Ok so I waited until I thought this snapshot would be okaish… Within 10 minutes of installing this my machine crashed:
2.0-BETA5 (i386)
built on Fri Feb 4 15:47:28 EST 2011
Pentium(R) Dual-Core CPU E5400 @ 2.70GHz
[2.0-BETA5][root@fw.home]/root(1): uname -a
FreeBSD fw.home 8.1-RELEASE-p2 FreeBSD 8.1-RELEASE-p2 #1: Fri Feb 4 15:45:20 EST 2011 sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.8 i386
Did not do anything special on the firewall when it died.
-
Sorry, I haven't had time to properly test the new snapshots until now. I loaded it up with 2.0-BETA5 (i386) built on Sun Feb 6 23:54:00 EST 2011. It crashed when trying to use OpenVPN. This is the P4 Dell machine that was described earlier.
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
KDB: stack backtrace:
db_trace_self_wrapper(c0eb72bd,ccc3ca90,c0a41f05,546,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(546,0,ffffffff,c145d504,ccc3cac8,...) at kdb_backtrace+0x29
_witness_debugger(c0eb97d5,ccc3cadc,4,1,0,...) at _witness_debugger+0x25
witness_warn(5,0,c0ef7b9a,14,c131b440,...) at witness_warn+0x20d
trap(ccc3cb68) at trap+0x19e
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0a611ca, esp = 0xccc3cba8, ebp = 0xccc3cbb8 ---
m_tag_delete(c341cd00,dedeadc0,c341cd00,c341cd00,ccc3cbf0,...) at m_tag_delete+0x5a
m_tag_delete_chain(c341cd00,0,c0e6e71f,0,c2ed91e0,...) at m_tag_delete_chain+0x3f
mb_dtor_mbuf(c341cd00,100,0,c0a42958,df,...) at mb_dtor_mbuf+0x35
uma_zfree_arg(c1d7e380,c341cd00,0,1e,ccc3cc84,...) at uma_zfree_arg+0x29
m_freem(c341cd00,4,c0e6e71f,b87,c2f4e000,...) at m_freem+0x43
lem_txeof(c2f52580,0,c0e6e71f,546,c2f525bc,...) at lem_txeof+0x158
lem_handle_rxtx(c2f4e000,1,c0eb8b8e,4f,c2edb8d8,...) at lem_handle_rxtx+0x60
taskqueue_run(c2edb8c0,c2edb8d8,c0ea5f47,0,c0eb21ed,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed5c,344,c131b440,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3b180,c2f525ec,ccc3cd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address= 0xdedeadc0
fault code= supervisor read, page not present
instruction pointer= 0x20:0xc0a611ca
stack pointer = 0x28:0xccc3cba8
frame pointer = 0x28:0xccc3cbb8
code segment= base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process= 0 (em0 taskq)
[thread]
Stopped at m_tag_delete+0x5a: movl 0(%ecx),%eax
db> bt
Tracing pid 0 tid 64050 td 0xc2f4bc80
m_tag_delete(c341cd00,dedeadc0,c341cd00,c341cd00,ccc3cbf0,...) at m_tag_delete+0x5a
m_tag_delete_chain(c341cd00,0,c0e6e71f,0,c2ed91e0,...) at m_tag_delete_chain+0x3f
mb_dtor_mbuf(c341cd00,100,0,c0a42958,df,...) at mb_dtor_mbuf+0x35
uma_zfree_arg(c1d7e380,c341cd00,0,1e,ccc3cc84,...) at uma_zfree_arg+0x29
m_freem(c341cd00,4,c0e6e71f,b87,c2f4e000,...) at m_freem+0x43
lem_txeof(c2f52580,0,c0e6e71f,546,c2f525bc,...) at lem_txeof+0x158
lem_handle_rxtx(c2f4e000,1,c0eb8b8e,4f,c2edb8d8,...) at lem_handle_rxtx+0x60
taskqueue_run(c2edb8c0,c2edb8d8,c0ea5f47,0,c0eb21ed,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed5c,344,c131b440,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3b180,c2f525ec,ccc3cd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
db>

It is kinda nice though that the pfSense now knows it crashed and asks me if I would like to send the error info to the developers.
We have detected a crash report. Click here for more information.
though the only thing the tab shows on the error report is this
Crash report begins. Anonymous machine information: i386 8.1-RELEASE-p2 FreeBSD 8.1-RELEASE-p2 #1: Sun Feb 6 23:47:16 EST 2011 sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_Dev.8 Crash report details: 2048
-
I don't have access to my pfSense box(es) to get crash info at the moment, but I can say I am in the same boat as LostInConfusion - still getting panics when I try to access a pfSense box through the VPN (though in my case it's IPsec, not OpenVPN). This is using the most recent snapshot. For what it's worth, the previously mentioned CARP lock issue probably wasn't at the root of the issue I am having, since I am not using any CARP functionality whatsoever with my setup…
-
It is kinda nice though that the pfSense now knows it crashed and asks me if I would like to send the error info to the developers.
We have detected a crash report. Click here for more information.
though the only thing the tab shows on the error report is this
Crash report begins. Anonymous machine information: i386 8.1-RELEASE-p2 FreeBSD 8.1-RELEASE-p2 #1: Sun Feb 6 23:47:16 EST 2011 sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_Dev.8 Crash report details: 2048
That one is a false positive.
The actual report would have done a lot more.
If it stops at a db> prompt, it isn't collecting the crash info automatically for whatever reason.
-
@jimp - any ideas about my crash? I assume it is related to this thread…
-
I'm in the same situation there as well - my pfSense box stops at the 'db>' prompt after a panic, generating similar information-free reports. Also, in contrast to what I believe other folks have reported, my pfSense box doesn't automatically reboot after a panic, but rather sits happily at the db> prompt, patiently - and indefinitely - awaiting my input. (Which is why I am loath to do any testing unless I'm within arm's reach of the particular box in question!) Is there a setting I've missed somewhere that enables this automatic rebooting, or is my pfSense box just that unhappy?
-
pwnell,
I don't know, it looks similar to another panic that was already reported over the weekend. Have you tried a new snapshot?
fasteddy,
What type of install are you on? full/nano? i386 or amd64? How much RAM and swap space?
If you can get a capture of the panic (even a picture of it) and the output of 'bt' at the debug prompt it would help.
-
I am on a full install, i386, 2GB RAM, 80GB Hard drive, 4GB swap space. The panic information, backtrace info, and general situation remain the same as my original post (on page 15 of this thread). http://forum.pfsense.org/index.php/topic,31721.msg170362.html#msg170362
-
I am on a full install, i386, 2GB RAM, 80GB Hard drive, 4GB swap space. The panic information, backtrace info, and general situation remain the same as my original post (on page 15 of this thread). http://forum.pfsense.org/index.php/topic,31721.msg170362.html#msg170362
Show me the output of:
sysctl debug.ddb
-
Here goes:
# sysctl debug.ddb
debug.ddb.capture.data:
debug.ddb.capture.bufsize: 49152
debug.ddb.capture.inprogress: 0
debug.ddb.capture.maxbufsize: 5242880
debug.ddb.capture.bufoff: 0
debug.ddb.scripting.unscript:
debug.ddb.scripting.scripts: lockinfo=show locks; show alllocks; show lockedvnods
kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
kdb.enter.witness=run lockinfo
debug.ddb.textdump.do_version: 1
debug.ddb.textdump.do_panic: 1
debug.ddb.textdump.do_msgbuf: 1
debug.ddb.textdump.do_ddb: 1
debug.ddb.textdump.do_config: 1
debug.ddb.textdump.pending: 0
-
That all looks normal, like it should be doing the textdumps…
What about:
cat /boot/kernel/pfsense_kernel.txt
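
For anyone who wants to repeat the check on the debug.ddb output above, here is a rough sketch. It only looks for the kdb.enter.panic script and for three steps that appear in that listing (textdump set, call doadump, reset). The function name check_panic_script is made up for illustration, and treating those three steps as the whole definition of "normal" is an assumption rather than anything official.

```shell
#!/bin/sh
# Sketch only: reads "sysctl debug.ddb"-style text on stdin and reports
# whether a kdb.enter.panic script is defined with the dump-and-reset steps.
# check_panic_script is a hypothetical helper name, not a pfSense tool.
check_panic_script() {
    # Keep only the line that defines the kdb.enter.panic script, if any.
    script=$(grep 'kdb\.enter\.panic=' || true)
    if [ -z "$script" ]; then
        echo "kdb.enter.panic script not found"
        return 1
    fi
    # Assumption: these three steps are what make the box dump and reboot.
    for step in "textdump set" "call doadump" "reset"; do
        case "$script" in
            *"$step"*) ;;
            *) echo "panic script lacks: $step"; return 1 ;;
        esac
    done
    echo "panic script looks normal"
}
```

On the firewall itself this would be run as: sysctl debug.ddb | check_panic_script. A passing check only says the script is configured; as this thread shows, a box can still end up sitting at the db> prompt.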
-
Doesn't have a whole lot to say:
# cat /boot/kernel/pfsense_kernel.txt
UP
#
-
It's enough :-)
That's the uniprocessor kernel. I'm running the SMP kernels. Perhaps that is the difference. Can you switch to an SMP kernel and see if the crashes still leave you at the debug prompt? There are instructions earlier in this thread for switching kernels. (You can also edit that text file and change it from UP to SMP, and the next update will use the SMP kernel instead.)
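
For anyone following along, the text-file edit could look roughly like this. It is only a sketch: switch_to_smp is a made-up helper name, and it assumes the file holds nothing but the kernel type on a single line, as the cat output above suggests. It changes the marker file only; the kernel itself is not replaced until the next update.

```shell
#!/bin/sh
# Sketch only: set the kernel marker file to SMP so the next update
# installs the SMP kernel. switch_to_smp is a hypothetical helper name.
# The optional argument overrides the path, so it can be tried on a copy.
switch_to_smp() {
    file=${1:-/boot/kernel/pfsense_kernel.txt}
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    # Assumption: the file contains just the kernel type, e.g. "UP".
    current=$(cat "$file")
    if [ "$current" = "SMP" ]; then
        echo "already SMP"
        return 0
    fi
    echo "SMP" > "$file"
    echo "changed $current to SMP"
}
```

Calling switch_to_smp with no argument edits /boot/kernel/pfsense_kernel.txt; passing a path to a scratch copy lets you try it without touching the real file.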
-
Installed the latest one available
2.0-BETA5 (i386)
built on Sun Feb 6 23:54:00 EST 2011
I have a 1 hour 15 minute uptime… Holding thumbs.
-
JimP,
Did you need me to do anything while I am here working on other projects? Anything I can do to help debug this?
-
No, Ermal is going to be the one having to look into the actual panics.
I was just trying to figure out why some people aren't getting the proper crash dumps like they should.
-
I've changed to the SMP kernel and updated to the latest snapshots as of 15 minutes ago, and my machine seems to be automatically rebooting and generating proper crash reports now. Thanks jimp!
Unfortunately, I am still able to crash my system at will the same way I have been from the get-go - by accessing the web GUI through the VPN. I sent the (now useful!) crash report via the web GUI. Is there anything I can/should do beyond that at this point?