Kernel Panic
-
I'm working on getting text dumps working so that even people on a normal install will have the crash dump analysis left in /var/crash after an auto reboot. Hitting a couple roadblocks, though.
-
I think I am running into this issue. I do not have the dev kernel installed. I thought it was a bad hard drive: I had a lot of drive errors after rebooting a few times, until it no longer booted. Replaced the drive. The machine ran fine for a couple of hours off the network. As soon as traffic started hitting it, it quit within a few minutes. After a reboot it passed traffic for a few minutes and quit again.
I had been running the Dec 31 Full i386 (it ran fine). Updated to Jan 21 over the weekend. The issue started yesterday. Loaded Jan 26 amd64 today and still see the same behavior.
Does this sound similar to what everyone else is seeing? Where can I find the Dec 31 snap? I must have erased it.
I have another box at the office and some spare public IPs. I can set it up for testing if any of the devs want access (when it's up).
Let me know.
This is on a 5015A-EHF-D525 w/4G RAM
http://www.supermicro.com/products/system/1U/5015/SYS-5015A-EHF-D525.cfm
-
Text dumps will be a nice feature. Thanks for working on that Jimp.
-
On our setup where the secondary panics used to be reproducible, it's fixed now.
jnorell: can you email me a backup of your config? cmb at pfsense dot org
-
Uploaded a new kernel http://files.pfsense.org/kernel_new.gz
Be aware that you need to be updated to the latest snapshot before using this kernel, otherwise you will get hangs.
-
Is this just for the Intel em chip fix, or will it also work on my Soekris vr chip?
-
@ermal:
Uploaded a new kernel http://files.pfsense.org/kernel_new.gz
Is that for full installs? amd64?
-
That is for 32-bit full installs.
It should work even on ALIX boards. I am not sure about others, since I cannot build for those now. It should fix all the panics in vr, em, or other NICs.
-
Should we be using this snap and later?
pfSense-Full-Update-2.0-BETA5-i386-20110126-0422.tgz 26-Jan-2011 09:47 83M
-
Yes.
-
Done.
-
Will there be an amd64 version of the fix?
-
This is what I am supposed to do, correct?
/etc/rc.conf_mount_rw
fetch http://files.pfsense.org/kernel_new.gz
cp kernel_new.gz /boot/kernel/kernel.gz
EDIT: Modified code to reflect below change
-
With a little correction.
The last command is cp kernel_new.gz /boot/kernel/kernel.gz
@hugo, if this results in a fix it will be included in the snapshots, which will put it on amd64 snaps as well.
-
And then reboot.
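For anyone who wants a way back if the test kernel misbehaves, the copy step above could be wrapped roughly like this. It is only a sketch: the `swap_kernel` function name and the `.bak` suffix are made up for illustration and are not part of pfSense or of the instructions in this thread.

```shell
#!/bin/sh
# Sketch only: swap in a test kernel while keeping a copy of the current one.
# swap_kernel and the ".bak" suffix are hypothetical names for this example.

swap_kernel() {
    new="$1"     # downloaded kernel, e.g. kernel_new.gz
    target="$2"  # kernel to replace, e.g. /boot/kernel/kernel.gz

    # Refuse to continue if the download is missing or empty.
    [ -s "$new" ] || { echo "missing or empty: $new" >&2; return 1; }

    # Keep the current kernel so it can be copied back by hand later.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak" || return 1
    fi

    cp "$new" "$target"
}
```

On the box this would come after /etc/rc.conf_mount_rw and the fetch, called as swap_kernel kernel_new.gz /boot/kernel/kernel.gz, followed by the reboot. Copying the .bak file back over kernel.gz restores the previous kernel.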
-
And hopefully if it all goes as planned the next snapshot should have textdump support that works in it, and when it does panic it will automatically restart and leave the crash info in /var/crash in a .tar file that we can review. Works better than cell phone pics or swapping kernels… :-)
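Once those archives start showing up, something along these lines could pull the panic message out of each one without unpacking everything. This is a sketch under assumptions: it relies on the stock FreeBSD textdump(4) layout (files named textdump.tar.N, each with a panic.txt member), which the snapshots may or may not follow exactly, and `show_panics` is just a name made up for the example.

```shell
#!/bin/sh
# Sketch: print the panic message from each textdump archive in a directory.
# Assumes FreeBSD textdump(4) naming (textdump.tar.N) and a panic.txt member;
# show_panics is a hypothetical helper name.

show_panics() {
    dir="${1:-/var/crash}"
    for f in "$dir"/textdump.tar.*; do
        [ -f "$f" ] || continue   # the glob matched nothing
        echo "== $f"
        # -O writes the extracted member to stdout instead of to disk
        tar -xOf "$f" panic.txt 2>/dev/null || echo "(no panic.txt in archive)"
    done
}
```

Running show_panics with no argument looks in /var/crash; passing a directory lets it be pointed at archives copied off the firewall.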
-
:D ;D :D OMG!!! I am posting this right now via OpenVPN on my neighbor's wifi. It is working on my Soekris net5501-70 board so far and I haven't been booted. Doing more testing; will edit this post if it crashes.
EDIT: Able to remote into everything fine, and trying file transfers
EDIT EDIT: Haven't had any problems since 5 with the network at home (Soekris). Still am testing the "work" one (em).
-
Just installed the latest snap with the kernel.
Looking good so far! :)
Copied 1 GB of data… no problem. Going to test some more.
-
Ermal, please let us know when there will be a 64-bit kernel snap available for testing.
Jim, please let us know when there is a snap ready or in the pipe with textdump.
-
CRAP!!! It faulted out again (Soekris):
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xd553ebc0
frame pointer = 0x28:0xd553ebcc
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq5: vr1)
[thread]
Stopped at 0: *** error reading from address 0 ***
db> bt
Tracing pid 12 tid 64026 td 0xc3aa4780
m_tag_delete(c4dde600,c4de0b3d) at m_tag_delete+0x32
m_tag_delete_chain(c4dde600,0,d553ec2c,c0c94f99,c4dde600,...) at m_tag_delete_chain+0x82
mb_dtor_mbuf(c4dde600,100,0,c0a9041f,c3a92000,...) at mb_dtor_mbuf+0x25
uma_zfree_arg(c1d7b000,c4dde600,0,c3a28000,d553ec70,...) at uma_zfree_arg+0x29
m_freem(c4dde600,c3debe00,c3a92000,7f,7f,...) at m_freem+0x43
vr_txeof(0,c12c69c0,d5530008,0,0,...) at vr_txeof+0x340
vr_intr(c3a28000,0,109,dbd32e12,b15,...) at vr_intr+0x1c7
intr_event_execute_handlers(c39757f8,c3972180,c0e6ba48,52d,c39721f0,...) at intr_event_execute_handlers+0x14b
ithread_loop(c3a8e450,d553ed38,6000,f600001b,0,...) at ithread_loop+0x6b
fork_exit(c09ae910,c3a8e450,d553ed38) at fork_exit+0x91
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xd553ed70, ebp = 0 ---
db>
At the time I was logged on with 2 clients. One (my netbook) was streaming Pandora radio, and the other (my lappy) was working on configuration on the local network via remote desktop. It crashed during the remote desktop security handshake (Kerberos).