Kernel Panic
-
I have pushed some patches that should solve even the carp panic, Slaygon.
So the newest snapshots should have the fix for you.
I would skip the next one coming and get the other one.
-
Just tried snap
2.0-BETA5 (i386)
built on Fri Jan 28 05:30:15 EST 2011
and it crashed. :)
The only one I had any luck with is the first kernel that was posted.
Just so I am clear… do the snaps have the correct kernel, or do I still need to download it from the separate link?
-
@ermal:
I have pushed some patches that should solve even the carp panic, Slaygon.
So newest snapshots should have the fix for you.
I would skip the next one coming and get the other one.
So the "Fri Jan 28 05:30:15 EST 2011" snapshot is not the one you recommend for the carp fixes, and I should wait for the next one instead?
I'm currently on "Fri Jan 28 00:53:50 EST 2011".
Oh, and my backup just spontaneously became the master again.
Excellent work btw. Much appreciated!
-
Yes, wait for the next one. The build was just restarted to pick up the patches he pushed.
-
@vito, please type "bt" at that prompt next time.
-
Loaded 2.0-BETA5 (i386) built on Fri Jan 28 05:30:15 EST 2011, running on a Dell Intel(R) Pentium(R) 4 CPU 2.40GHz, onboard gig NIC (em)
Enter an option:
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
KDB: stack backtrace:
db_trace_self_wrapper(c0eb7ccb,ccc3ca90,c0a421c5,546,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(546,0,ffffffff,c145df04,ccc3cac8,...) at kdb_backtrace+0x29
_witness_debugger(c0eba1e3,ccc3cadc,4,1,0,...) at _witness_debugger+0x25
witness_warn(5,0,c0ef8592,2d05a8c0,c131be40,...) at witness_warn+0x20d
trap(ccc3cb68) at trap+0x19e
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0a61478, esp = 0xccc3cba8, ebp = 0xccc3cbb8 ---
m_tag_delete(c2fef600,dedeadc0,c2fef600,c2fef600,ccc3cbf0,...) at m_tag_delete+0x48
m_tag_delete_chain(c2fef600,0,c0e6f12d,0,c2ed9720,...) at m_tag_delete_chain+0x3f
mb_dtor_mbuf(c2fef600,100,0,c0a42c18,df,...) at mb_dtor_mbuf+0x35
uma_zfree_arg(c1d7e380,c2fef600,0,72,ccc3cc84,...) at uma_zfree_arg+0x29
m_freem(c2fef600,4,c0e6f12d,b87,c2f4e000,...) at m_freem+0x43
lem_txeof(c2f52580,0,c0e6f12d,546,c2f525bc,...) at lem_txeof+0x158
lem_handle_rxtx(c2f4e000,1,c0eb959c,4f,c2edb8d8,...) at lem_handle_rxtx+0x60
taskqueue_run(c2edb8c0,c2edb8d8,c0ea6955,0,c0eb2bfb,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaf76a,344,c131be40,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3b440,c2f525ec,ccc3cd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdedeadc0
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a61478
stack pointer = 0x28:0xccc3cba8
frame pointer = 0x28:0xccc3cbb8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
[thread]
Stopped at m_tag_delete+0x48: movl 0(%ecx),%eax
db> bt
Tracing pid 0 tid 64050 td 0xc2f4bc80
m_tag_delete(c2fef600,dedeadc0,c2fef600,c2fef600,ccc3cbf0,...) at m_tag_delete+0x48
m_tag_delete_chain(c2fef600,0,c0e6f12d,0,c2ed9720,...) at m_tag_delete_chain+0x3f
mb_dtor_mbuf(c2fef600,100,0,c0a42c18,df,...) at mb_dtor_mbuf+0x35
uma_zfree_arg(c1d7e380,c2fef600,0,72,ccc3cc84,...) at uma_zfree_arg+0x29
m_freem(c2fef600,4,c0e6f12d,b87,c2f4e000,...) at m_freem+0x43
lem_txeof(c2f52580,0,c0e6f12d,546,c2f525bc,...) at lem_txeof+0x158
lem_handle_rxtx(c2f4e000,1,c0eb959c,4f,c2edb8d8,...) at lem_handle_rxtx+0x60
taskqueue_run(c2edb8c0,c2edb8d8,c0ea6955,0,c0eb2bfb,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaf76a,344,c131be40,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3b440,c2f525ec,ccc3cd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
db>
Happened instantly when I tried to OpenVPN in.
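For anyone comparing traces across snapshots, a backtrace like the one in this post can be boiled down to just its call chain. What follows is only a rough sketch in Python, not anything shipped with pfSense: the helper name frames and the regular expression are made up for illustration, and it assumes each frame ends in "at function+0xoffset" as in the output above. The sample lines are copied from this trace.

```python
import re

# Matches the "at name+0xoffset" suffix that ends each frame line.
# Assumption: frames look like the ddb output quoted in this thread.
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_][A-Za-z0-9_]*)\+0x([0-9a-f]+)")


def frames(trace_text):
    """Return (function, offset) pairs in the order they appear."""
    return [(m.group(1), int(m.group(2), 16))
            for m in FRAME_RE.finditer(trace_text)]


# A few frames copied from the backtrace in this post.
SAMPLE = """\
m_tag_delete(c2fef600,dedeadc0,c2fef600,c2fef600,ccc3cbf0,...) at m_tag_delete+0x48
m_tag_delete_chain(c2fef600,0,c0e6f12d,0,c2ed9720,...) at m_tag_delete_chain+0x3f
m_freem(c2fef600,4,c0e6f12d,b87,c2f4e000,...) at m_freem+0x43
lem_txeof(c2f52580,0,c0e6f12d,546,c2f525bc,...) at lem_txeof+0x158
"""

if __name__ == "__main__":
    for name, offset in frames(SAMPLE):
        print(f"{name}+{offset:#x}")
```

Two traces with the same list of function names are probably the same crash, which makes it easier to tell whether a new snapshot changed anything.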
-
It is a hard lock… I can't type anything in.
On reboot, I still do not see the crash files that jimp added.
-
As I said before, those won't be created for hard locks, only for panics/crashes.
-
Just saw this come out: "Fri Jan 28 13:06:21 EST 2011"…
Is this the one where the carp sync problems are addressed?
If so, it will take me a number of hours to get it applied and verified. The bug usually appears after 1-4 hours of active monitoring; however, I will not be able to start testing very soon. I expect it will be about 2 to 8 hours before I can start monitoring.
If anyone could start testing sooner, that would be great!
(for those that don't know, I run two DL-type HP servers with quad Intel gbit NICs in them)
And again, great work pf team! Really appreciated!
-
If the carp patches were also pushed to the amd64 version, 2.0-BETA5 (amd64)
built on Fri Jan 28 13:06:21 EST 2011, then it's not fixed for me. As soon as I make any change and it syncs, the backup firewall locks up or panics. It's not a hard lock though; it does bring me to a prompt, but I'm not seeing any crash files in /var/crash.
Andy
-
It's the amd64 version I am running too.
Not too good news there, Andy. Sorry to hear.
And usually, with the carp errors, there seems to be a freeze rather than a crash, which produces no trace dumps.
Sometimes, though very rarely, I get a debug prompt at which I can get a backtrace. Mostly the box just hangs.
I guess the devs would appreciate any input they can get hold of, so if you do get to a prompt and can type something in, give "bt" (as in backtrace) a go and see if you can provide any more info.
Cheerio.
-
I just tried on my amd64 VM, and when I force a manual panic (sysctl debug.kdb.panic=1) I get a textdump and no db> prompt. I'm not sure why someone on a current snapshot would still be left at a db> prompt when it crashes; the box should gather the info and reboot on its own.
That doesn't help with the hangs, though, but the hangs are a different problem (and a different thread :-)
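For checking whether a panic actually left anything behind after the reboot, here is a minimal sketch in Python. It is not part of pfSense and needs a machine with Python available. The function name list_crash_files is made up for illustration, and the sketch assumes the usual FreeBSD savecore naming in /var/crash (textdump.tar.N plus a matching info.N); adjust it if your files are named differently.

```python
from pathlib import Path


def list_crash_files(crash_dir="/var/crash"):
    """Return the sorted names of textdump/info files in crash_dir.

    Assumption: dumps are saved as textdump.tar.N with an info.N
    summary next to them. Other files in the directory are ignored.
    """
    directory = Path(crash_dir)
    if not directory.is_dir():
        return []
    wanted = ("textdump.tar.", "info.")
    return sorted(p.name for p in directory.iterdir()
                  if p.name.startswith(wanted))


if __name__ == "__main__":
    found = list_crash_files()
    if found:
        print("Crash files found:", ", ".join(found))
    else:
        print("No crash files found (expected after a hard lock).")
```

An empty result after a hard lock is consistent with what was said earlier in the thread: dumps are only written for panics/crashes.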
-
Here is what I get when carp tries to sync. This is running snapshot version amd64-20110128-0938.
Thanks,
Andy
-
That snap was before the last fixes went in. Try the newest one.
-
pfSense-Full-Update-2.0-BETA5-amd64-20110128-0938.tgz 28-Jan-2011 13:09 99M
That's the latest I see on the snapshots server. Am I looking in the wrong spot? http://snapshots.pfsense.org/FreeBSD_RELENG_8_1/amd64/pfSense_HEAD/updates/?C=M;O=D
-
It's probably still building right now. Give it some time…
-
Still problems on the latest snap…
2.0-BETA5 (i386)
built on Sat Jan 29 01:09:59 EST 2011
This time, I could type bt (no hard lock).
Here is the output (sorry, it had to be a cell pic).
-
One more patch went in after that build; a new build is running right now and will be ready in a few hours.
-
Sounds good, thanks!
-
Has anyone been running the latest snapshot? Does it still panic and all?