Kernel Panic
-
I had a thread started on this here from Dec 13, but noticed this one:
http://forum.pfsense.org/index.php/topic,31031.msg163019.html#msg163019
I just posted this pic on my thread.
EDIT: If this helps, I have multiple VLANs on em1 on this box. A few of the other firewalls with the problem have no VLANs on the interface.
-
Here is a screenshot of the debug output in the dev kernel. I cannot get to the console at the moment for any more info.
-
I have exactly the same kernel crashes in FreeBSD7 and 8. To me, it seems to be related to VLAN tags and NICs.
http://forums.freebsd.org/showthread.php?t=18676
(see last post #8)
-
Successfully grabbed the panic in developer mode
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
KDB: stack backtrace:
X_db_sym_numargs(c0eb7066,e302aa90,c0a41d45,546,0,...) at X_db_sym_numargs+0x146
kdb_backtrace(546,0,ffffffff,c145d1ac,e302aac8,...) at kdb_backtrace+0x29
witness_display_spinlock(c0eb957e,e302aadc,4,1,0,...) at witness_display_spinlock+0x75
witness_warn(5,0,c0ef792d,14,c131b140,...) at witness_warn+0x20d
trap(e302ab68) at trap+0x19e
alltraps(c336dc00,dedeadc0,c336dc00,c336dc00,e302abf0,...) at alltraps+0x1b
m_tag_delete_chain(c336dc00,0,c0e6e512,0,c2ed9bc0,...) at m_tag_delete_chain+0x3f
reallocf(c336dc00,100,0,c0a42798,df,...) at reallocf+0x8a5
uma_zfree_arg(c1d7e380,c336dc00,0,bc,e302ac84,...) at uma_zfree_arg+0x29
m_freem(c336dc00,4,c0e6e512,b87,c2f4e000,...) at m_freem+0x43
ed_probe_RTL80x9(c2f52580,0,c0e6e512,546,c2f525bc,...) at 0xc06ec448
ed_probe_RTL80x9(c2f4e000,1,c0eb8937,4f,c2edb918,...) at 0xc06efe10
taskqueue_run(c2edb900,c2edb918,c0ea5cf0,0,c0eb1f96,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,e302ad38,c0eaeb05,344,c131b140,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3afc0,c2f525ec,e302ad38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xe302ad70, ebp = 0 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdedeadc0
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a60fe8
stack pointer = 0x28:0xe302aba8
frame pointer = 0x28:0xe302abb8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
[thread]
Stopped at m_tag_delete+0x48: movl 0(%ecx),%eax
db>
[/thread]
-
Looks like it may be in the Intel driver… fun.
-
So, is it something you guys can fix, or is it something with FreeBSD?
-
Hopefully it's fixable, though lem is legacy em, so it's some rather old/early em chipsets. Might explain why so few people have hit it.
-
If it is an Intel problem, then why would I be receiving the same problem on my Soekris net5501-70 board? That uses a VIA VT6105M chip.
-
Without a backtrace from there it's impossible to say.
-
Hopefully it's fixable, though lem is legacy em, so it's some rather old/early em chipsets. Might explain why so few people have hit it.
Jim,
where do you see that it is an old legacy NIC/driver? (Not questioning, just curious.) :)
Just an FYI, I had no problem with the Oct snaps.
Thanks for your help
-
where do you see that it is an old legacy NIC/driver? (Not questioning, just curious.) :)
Just an FYI, I had no problem with the Oct snaps.
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
Note the file name, if_lem.c: lem is legacy em. A normal em card would have been in if_em.c.
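For anyone comparing traces from different boxes, here is a rough sketch of pulling the lock name and driver source file out of the "locks held" line. It assumes the line looks like the ones pasted in this thread; `parse_lock_line` and the regex are made up for illustration and are not part of any pfSense or FreeBSD tool.

```python
import re

# Hypothetical pattern, written against the lock lines pasted in this thread.
LOCK_RE = re.compile(
    r"exclusive sleep mutex (?P<name>\S+) \((?P<desc>[^)]*)\) "
    r"r = \d+ \((?P<addr>0x[0-9a-f]+)\) locked @ (?P<path>\S+):(?P<line>\d+)"
)

def parse_lock_line(text):
    """Return lock name, description, source file and line, or None if no match."""
    m = LOCK_RE.search(text)
    if m is None:
        return None
    return {
        "lock": m.group("name"),
        "desc": m.group("desc"),
        # Keep only the file name; it identifies the driver (if_lem.c vs if_em.c).
        "source_file": m.group("path").rsplit("/", 1)[-1],
        "line": int(m.group("line")),
    }

sample = ("exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) "
          "locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350")
info = parse_lock_line(sample)
print(info["lock"], info["source_file"], info["line"])
```

Running the same function over a trace from another box shows at a glance whether the lock was taken in the same driver file.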
-
Hmm - that line looks similar to the panic I get when captive portal is enabled on my box.
exclusive sleep mutex fxp0 (network driver) r = 0 (0xc36de018) locked @ /usr/pfSensesrc/src/sys/dev/fxp/if_fxp.c:1288
Details here:
http://forum.pfsense.org/index.php/topic,30791.msg159227.html#msg159227
CryoGenID gets a panic but with yet another set of drivers. Cino does as well.
http://forum.pfsense.org/index.php/topic,29839.60.html
Is there anything we can do to help besides posting back traces?
-
I just spent a bit of time on the phone with someone who hit this. It does seem to be related to OpenVPN somehow (or the kind of traffic that is seen more often with OpenVPN, I suppose). Once we had the developer kernel on, it stayed up for quite a while until we had someone connect with OpenVPN and generate some traffic.
-
A patch was just committed by ermal that might be a potential fix for this, or at least change the behavior somewhat. Give the next snapshot a try.
-
Just as a side note, I was getting a kernel panic with the PPPoA interface selected as WAN rather than rl1. Not sure if that is due to incorrect config, but if so it might be worth removing to save people the hassle.
-
Still getting the panic. I don't think the commit happened on the most recent snap.
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
KDB: stack backtrace:
X_db_sym_numargs(c0eb72fb,ccc3ca90,c0a41f25,546,0,...) at X_db_sym_numargs+0x146
kdb_backtrace(546,0,ffffffff,c145d42c,ccc3cac8,...) at kdb_backtrace+0x29
witness_display_spinlock(c0eb9813,ccc3cadc,4,1,0,...) at witness_display_spinlock+0x75
witness_warn(5,0,c0ef7bc2,14,c131b3c0,...) at witness_warn+0x20d
trap(ccc3cb68) at trap+0x19e
alltraps(c341ab00,dedeadc0,c341ab00,c341ab00,ccc3cbf0,...) at alltraps+0x1b
m_tag_delete_chain(c341ab00,0,c0e6e75d,0,c2ed9d50,...) at m_tag_delete_chain+0x3f
reallocf(c341ab00,100,0,c0a42978,df,...) at reallocf+0x8a5
uma_zfree_arg(c1d7e380,c341ab00,0,d5,ccc3cc84,...) at uma_zfree_arg+0x29
m_freem(c341ab00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed9a,344,c131b3c0,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3b1a0,c2f525ec,ccc3cd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdedeadc0
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a611c8
stack pointer = 0x28:0xccc3cba8
frame pointer = 0x28:0xccc3cbb8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
[thread]
Stopped at m_tag_delete+0x48: movl 0(%ecx),%eax
db>
[/thread]
-
It did happen. I manually restarted the builders after the patch went in. So apparently it still isn't quite right.
-
Could someone who can readily reproduce this panic give this custom firmware build a try?
http://cvs.pfsense.org/~jimp/pfSense-Full-Update-2.0-BETA5-i386-20110114-2041.tgz
It was built without a patch that does the extra mbuf operations that may be triggering the panic.
-
Bad news JimP, still crashes.
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
KDB: stack backtrace:
X_db_sym_numargs(c0eb72fb,ccc3ca90,c0a41f25,546,0,...) at X_db_sym_numargs+0x146
kdb_backtrace(546,0,ffffffff,c145d42c,ccc3cac8,...) at kdb_backtrace+0x29
witness_display_spinlock(c0eb9813,ccc3cadc,4,1,0,...) at witness_display_spinlock+0x75
witness_warn(5,0,c0ef7bc2,14,c131b3c0,...) at witness_warn+0x20d
trap(ccc3cb68) at trap+0x19e
alltraps(c2feeb00,dedeadc0,c2feeb00,c2feeb00,ccc3cbf0,...) at alltraps+0x1b
m_tag_delete_chain(c2feeb00,0,c0e6e75d,0,c2ed9b50,...) at m_tag_delete_chain+0x3f
reallocf(c2feeb00,100,0,c0a42978,df,...) at reallocf+0x8a5
uma_zfree_arg(c1d7e380,c2feeb00,0,b5,ccc3cc84,...) at uma_zfree_arg+0x29
m_freem(c2feeb00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed9a,344,c131b3c0,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3b1a0,c2f525ec,ccc3cd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdedeadc0
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a611c8
stack pointer = 0x28:0xccc3cba8
frame pointer = 0x28:0xccc3cbb8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
[thread]
Stopped at m_tag_delete+0x48: movl 0(%ecx),%eax
db>
[/thread]
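One observation on the fault address: all three em0 traces above fault at 0xdedeadc0 while freeing an mbuf (m_freem into m_tag_delete_chain). That value is a byte rotation of 0xdeadc0de, which FreeBSD debug kernels are generally known to use as a fill pattern for freed memory. Assuming the developer kernel here does the same (the traces themselves do not say so), this would point at a tag pointer being read out of already-freed memory. A quick sketch of the check, with `POISON` and both helper names being my own:

```python
POISON = 0xDEADC0DE  # assumed freed-memory fill pattern; not stated in the traces

def byte_rotations(value):
    """All four byte-wise rotations of a 32-bit value."""
    rots = set()
    for shift in (0, 8, 16, 24):
        rots.add(((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF)
    return rots

def looks_like_poison(addr):
    """True if addr is the fill pattern read at some byte offset."""
    return addr in byte_rotations(POISON)

print(hex(0xDEDEADC0), looks_like_poison(0xDEDEADC0))
```

If the assumption holds, the interesting question is what freed the mbuf or its tags before the em TX path got to m_freem, rather than the driver line that happened to hold the lock.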
-
Currently running 2.0-BETA5 (i386) built on Thu Jan 13 19:33:19 EST 201
Not sure how far back this happens.
In a test network:
2 machines, each w/ 4 Intel NICs (em0 - em3)
WAN, LAN, Opt1, Opt2 (CARP interface)
Running CARP on WAN, LAN, Opt1 interfaces
Syncing on Opt2 interface.
Recently started getting panics on box2 when changing settings on box1.
Panic & BackTrace from box2 included below.
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1a4
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc09ee51d
stack pointer = 0x28:0xd670aa54
frame pointer = 0x28:0xd670aa70
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 253 (devd)
[thread]
Stopped at _mtx_lock_sleep+0x6d: movl 0x1a4(%ecx),%eax
db> bt
Tracing pid 253 tid 64081 td 0xc4142000
_mtx_lock_sleep(c40f16d0,c4142000,0,c0ecfc57,fd,...) at _mtx_lock_sleep+0x6d
_mtx_lock_flags(c40f16d0,0,c0ecfc57,fd,0,...) at _mtx_lock_flags+0xf7
carp6_input(c3ae5800,c0286938,c40f3a00,c0ea9fce,3,...) at carp6_input+0x9bd
ifioctl(c46a3b44,c0286938,c40f3a00,c4142000,c40cf900,...) at ifioctl+0x141e
soo_ioctl(c412ddc8,c0286938,c40f3a00,c39aa400,c4142000,...) at soo_ioctl+0x415
kern_ioctl(c4142000,f,c0286938,c40f3a00,1a3b7d0,...) at kern_ioctl+0x1fd
ioctl(c4142000,d670acf8,c0ef7af5,c0ecdaff,c41a77f8,...) at ioctl+0x134
syscall(d670ad38) at syscall+0x220
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x8088357, esp = 0xbfbfe89c, ebp = 0xbfbfe908 ---
db> reboot
[/thread]