Kernel Panic
-
I have 2.0-BETA5 (i386) built on Fri Dec 31 14:08:23 EST 2010. I have been having problems with this firewall, which runs on an old Dell P4 computer. It runs great until I log in via OpenVPN and try to open a remote desktop session to a computer. The firewall then kernel panics with:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xcca52bbc
frame pointer = 0x28:0xcca52bc8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 16h28m26s
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
The firewall has 9 VLANs, all trunked over one physical LAN port. I run squid with lightsquid for logging on all VLAN interfaces. I am running captive portal too.
I get the same problem at home with my Soekris net5501-70. There I have only LAN, WRLS (for guest access), and DMZ, and I am running the same utilities (squid, lightsquid, nut, captive portal).
-
Same here… has there been a bug opened? Is there already a fix?
-
I would love for this problem to be on the fix list, but I don't even know exactly what the cause is. I would appreciate help from one of the administrators with the debugging process.
-
As a start, you can install the developer kernel.
After the panic type "bt" at the debugger console and capture that info as well as the panic.
Directions for installing the developer kernel are at the following link
-
Hi guys,
I have the same problem here when I activate Squid.
I have been testing snapshots from pfSense-2.0-BETA5-20101228-0454.iso.gz through pfSense-2.0-BETA5-20110101-1659.iso.gz.
If I don't use Squid, the problem doesn't happen. This was done in a lab, with about ten users testing.
We need to open a bug to report this.
Luiz Ferreira.
-
Since those snapshots are nearly a week old, before you report any problem, make sure it can be reproduced on the most current snapshot available.
And if you get a panic, we need the full text of the panic (even a picture of the screen is OK) and if possible, install a debug kernel (as PJ2 mentioned) and get a backtrace.
Be careful with posting or reporting bugs to an existing thread or ticket about panics, too, since they could be unrelated unless the circumstances and panic message are identical.
-
The panic still occurs after the most recent update [2.0-BETA5 (i386) built on Wed Jan 5 03:16:13 EST 2011]. Same panic as above.
If it helps any, I am running 2008 R2 and connecting with a Win7 client. I noticed the panic happens after the Kerberos handshake, when it is "initiating remote connection".
EDIT: I tried to load the developer kernel on my Soekris board using the above link and instructions, and it didn't work. It actually crashed the system, which then wouldn't boot. I will try it on the P4 later today; hopefully I won't have the same outcome.
EDIT EDIT: This is a copy of this afternoon's crash from my home system.
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xd5341bf4
frame pointer = 0x28:0xd5341c28
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 11 (irq5: vr1)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 1d19h33m12s
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
The firewall crashed while I was out and trying to remote into my home system. I am using squid in transparent mode. I haven't tried disabling squid.
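For anyone wanting to check whether two panics are really "the same", here is a minimal Python sketch that pulls a few fields out of the pasted panic text and lists the ones that differ. The field names are taken from the two panics posted above; the helper names (`parse_panic`, `differing_fields`) and the shortened sample strings are only illustrative, and the patterns assume the i386 "Fatal trap" layout seen in this thread.

```python
import re

# Patterns for a few fields of the "Fatal trap" text as pasted in this thread.
# Only these four fields are compared; extend the table as needed.
FIELD_PATTERNS = {
    "fault virtual address": r"fault virtual address\s*=\s*(\S+)",
    "instruction pointer": r"instruction pointer\s*=\s*(\S+)",
    "current process": r"current process\s*=\s*(\d+ \([^)]*\))",
    "trap number": r"trap number\s*=\s*(\d+)",
}


def parse_panic(text):
    """Return a dict of the fields above that could be found in text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1)
    return fields


def differing_fields(first, second):
    """Return {field: (value_in_first, value_in_second)} for fields that differ."""
    a, b = parse_panic(first), parse_panic(second)
    return {
        name: (a.get(name), b.get(name))
        for name in FIELD_PATTERNS
        if a.get(name) != b.get(name)
    }


# Shortened copies of the two panics posted above (P4 box and Soekris box).
P4_PANIC = (
    "Fatal trap 12: page fault while in kernel mode "
    "fault virtual address= 0x0 fault code= supervisor read, page not present "
    "instruction pointer= 0x20:0x0 stack pointer = 0x28:0xcca52bbc "
    "current process= 0 (em0 taskq) trap number= 12"
)
SOEKRIS_PANIC = (
    "Fatal trap 12: page fault while in kernel mode "
    "fault virtual address= 0x0 fault code= supervisor read, page not present "
    "instruction pointer= 0x20:0x0 stack pointer = 0x28:0xd5341bf4 "
    "current process= 11 (irq5: vr1) trap number= 12"
)

print(differing_fields(P4_PANIC, SOEKRIS_PANIC))
```

On these two samples, only the "current process" field differs (em0 taskq on the P4, irq5: vr1 on the Soekris); the fault address, instruction pointer, and trap number match. Matching fields alone do not prove a common cause, so a backtrace is still needed.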
-
Is anyone else able to get the panic logged? I can't get the developer kernel set up on either of my firewalls, and I can't afford the downtime.
-
I started a thread on this on Dec 13, but then noticed this one.
http://forum.pfsense.org/index.php/topic,31031.msg163019.html#msg163019
I just posted this pic on my thread.
EDIT: If this helps, I have multiple VLANs on em1 on this box. A few of the other firewalls with the problem have no VLANs on the interface.
-
Here is a screenshot of the debugger output in the dev kernel. I cannot get to the console at the moment for any more info.
-
I have exactly the same kernel crashes in FreeBSD 7 and 8. To me, it seems to be related to VLAN tags and NICs.
http://forums.freebsd.org/showthread.php?t=18676
(see last post #8)
-
Successfully grabbed the panic in developer mode
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
KDB: stack backtrace:
X_db_sym_numargs(c0eb7066,e302aa90,c0a41d45,546,0,...) at X_db_sym_numargs+0x146
kdb_backtrace(546,0,ffffffff,c145d1ac,e302aac8,...) at kdb_backtrace+0x29
witness_display_spinlock(c0eb957e,e302aadc,4,1,0,...) at witness_display_spinlock+0x75
witness_warn(5,0,c0ef792d,14,c131b140,...) at witness_warn+0x20d
trap(e302ab68) at trap+0x19e
alltraps(c336dc00,dedeadc0,c336dc00,c336dc00,e302abf0,...) at alltraps+0x1b
m_tag_delete_chain(c336dc00,0,c0e6e512,0,c2ed9bc0,...) at m_tag_delete_chain+0x3f
reallocf(c336dc00,100,0,c0a42798,df,...) at reallocf+0x8a5
uma_zfree_arg(c1d7e380,c336dc00,0,bc,e302ac84,...) at uma_zfree_arg+0x29
m_freem(c336dc00,4,c0e6e512,b87,c2f4e000,...) at m_freem+0x43
ed_probe_RTL80x9(c2f52580,0,c0e6e512,546,c2f525bc,...) at 0xc06ec448
ed_probe_RTL80x9(c2f4e000,1,c0eb8937,4f,c2edb918,...) at 0xc06efe10
taskqueue_run(c2edb900,c2edb918,c0ea5cf0,0,c0eb1f96,...) at taskqueue_run+0x103
taskqueue_thread_loop(c2f525ec,e302ad38,c0eaeb05,344,c131b140,...) at taskqueue_thread_loop+0x68
fork_exit(c0a3afc0,c2f525ec,e302ad38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xe302ad70, ebp = 0 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdedeadc0
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a60fe8
stack pointer = 0x28:0xe302aba8
frame pointer = 0x28:0xe302abb8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
[thread]
Stopped at m_tag_delete+0x48: movl 0(%ecx),%eax
db>
[/thread]
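To make a backtrace like this easier to skim, a small Python sketch can reduce it to the list of function names, outermost call last. This is just an illustration: the helper name `frame_names` is made up, the sample is a handful of frames copied from the trace above, and frames printed with only a raw address (the two "at 0x..." lines) are skipped by this pattern.

```python
import re

# Matches the "at symbol+0xoffset" tail of a KDB/DDB stack frame line.
FRAME = re.compile(r"\bat (\w+)\+0x[0-9a-f]+")


def frame_names(backtrace):
    """Return function names in the order they appear (innermost first)."""
    return FRAME.findall(backtrace)


# A few frames copied from the backtrace posted above.
SAMPLE = (
    "trap(e302ab68) at trap+0x19e "
    "alltraps(c336dc00,dedeadc0,c336dc00,c336dc00,e302abf0,...) at alltraps+0x1b "
    "m_tag_delete_chain(c336dc00,0,c0e6e512,0,c2ed9bc0,...) at m_tag_delete_chain+0x3f "
    "m_freem(c336dc00,4,c0e6e512,b87,c2f4e000,...) at m_freem+0x43 "
    "ed_probe_RTL80x9(c2f52580,0,c0e6e512,546,c2f525bc,...) at 0xc06ec448 "
    "taskqueue_run(c2edb900,c2edb918,c0ea5cf0,0,c0eb1f96,...) at taskqueue_run+0x103"
)

for name in frame_names(SAMPLE):
    print(name)
```

The regular expression works on either the one-line paste or a line-per-frame version, since it only looks for the "at name+0xNN" pieces.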
-
Looks like it may be in the Intel driver… fun.
-
So, is it something you guys can fix, or is it something with FreeBSD?
-
Hopefully it's fixable, though lem is legacy em, so it's some rather old/early em chipsets. Might explain why so few people have hit it.
-
If it is an Intel problem, then why would I be getting the same problem on my Soekris net5501-70 board? That uses a VIA VT6105M chip.
-
Without a backtrace from there it's impossible to say.
-
Hopefully it's fixable, though lem is legacy em, so it's some rather old/early em chipsets. Might explain why so few people have hit it.
Jim
Where do you see that it is an old legacy NIC/driver? (Not questioning, just curious.) :)
Just an FYI, I had no problem with the Oct snaps.
Thanks for your help.
-
where do you see it is an old legacy nic/driver? (not questioning, just Curious) :)
just an fyi, had no problem with oct snaps.
exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
The file is if_lem.c. lem is legacy em; a normal em card would have been in if_em.c.
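As a rough illustration of reading the driver out of that lock line, the following Python sketch extracts the lock name, its description, and the source file and line where the lock was taken. The function name `lock_source` is hypothetical, and the pattern only assumes the "exclusive sleep mutex ... locked @ path:line" wording quoted above.

```python
import posixpath
import re

# Pattern for the witness "locks held" line as quoted in this thread.
LOCK_LINE = re.compile(
    r"exclusive sleep mutex (?P<lock>\S+) \((?P<desc>[^)]*)\).*?"
    r"locked @ (?P<path>\S+):(?P<line>\d+)"
)


def lock_source(text):
    """Return lock name, description, source file and line, or None if absent."""
    match = LOCK_LINE.search(text)
    if match is None:
        return None
    return {
        "lock": match.group("lock"),
        "description": match.group("desc"),
        "file": posixpath.basename(match.group("path")),
        "line": int(match.group("line")),
    }


EM_LINE = (
    "exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) "
    "locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350"
)

print(lock_source(EM_LINE))
```

For the line above, the file comes out as if_lem.c, which is the detail that points at the legacy em driver rather than if_em.c.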
-
Hmm - that line looks similar to the panic I get when captive portal is enabled on my box.
exclusive sleep mutex fxp0 (network driver) r = 0 (0xc36de018) locked @ /usr/pfSensesrc/src/sys/dev/fxp/if_fxp.c:1288
Details here:
http://forum.pfsense.org/index.php/topic,30791.msg159227.html#msg159227
CryoGenID gets a panic but with yet another set of drivers. Cino does as well.
http://forum.pfsense.org/index.php/topic,29839.60.html
Is there anything we can do to help besides posting backtraces?