Kernel Panic

jeebsion

@jeebsion:

has anyone been running the latest snapshot? does it still panic and all?

I have been since late last night and I've tried everything I was doing before to make it panic but have had no success. Looks like the guys have done a good job of finding the problem.

Andy

Oh, that's good, because with the last update it hung while rebooting, at the moment it was initializing the WAN interfaces.

vito

I am still testing, but so far on snapshot
2.0-BETA5 (i386)
built on Sat Jan 29 23:42:13 EST 2011

Have not had a lock or panic!
:)
I pushed a good amount of data with OpenVPN.

eri--

Please monitor the output of netstat -m and see if those counters go crazy. Otherwise, thank you for helping with the testing.
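
For anyone who would rather log those counters than eyeball them, here is a minimal Python sketch. It assumes the usual FreeBSD `netstat -m` layout, where a counter line starts with slash-separated numbers, followed by a description and the matching labels in parentheses. The function names and the sample text are illustrative assumptions, not output from a real box.

```python
import re

# Matches counter lines such as:
#   1026/1539/2565 mbufs in use (current/cache/total)
# Lines whose numbers carry a unit suffix (for example "2304K/...") or that
# have no parenthesised labels are skipped on purpose.
LINE_RE = re.compile(r"^\s*(\d+(?:/\d+)*)\s+(.+?)\s+\(([^)]*)\)\s*$")


def parse_netstat_m(text):
    """Return {description: {label: value}} for every counter line that parses."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        values = [int(v) for v in match.group(1).split("/")]
        labels = match.group(3).split("/")
        if len(values) != len(labels):
            continue  # layout differs from what this sketch assumes
        counters[match.group(2)] = dict(zip(labels, values))
    return counters


def denied_requests(counters):
    """Sum every counter whose description mentions 'denied'."""
    total = 0
    for description, fields in counters.items():
        if "denied" in description:
            total += sum(fields.values())
    return total


# Made-up sample in the assumed layout.
SAMPLE = """\
1026/1539/2565 mbufs in use (current/cache/total)
1024/1066/2090/25600 mbuf clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
"""

if __name__ == "__main__":
    parsed = parse_netstat_m(SAMPLE)
    print(parsed["mbuf clusters in use"])
    print("denied requests:", denied_requests(parsed))
```

Saving the output every few minutes and comparing the "in use" and "denied" values between runs should make a leak or exhaustion easy to spot. On a machine that has Python installed, the text could come from `subprocess.run(["netstat", "-m"], capture_output=True, text=True).stdout` instead of the sample.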

geewhz01

Had another panic just now after 21 hrs uptime. Attached is info, running build AMD64-20110129-1502.

Thanks,

Andy

[Attachment: pfsenseerror2.png]

eri--

You need to upgrade to a snapshot later than that one.
This issue was reported earlier in this thread, and it should be fixed in snapshots later than the one you have installed.

eri--

Since I'm seeing no more complaints, I assume there are no more panics?

clarknova

I won't normally know for another week, but I'll surely keep this thread posted.

LostInIgnorance

My Soekris hasn't crashed (great job fixing the panics! :D), but I haven't tested the Dell em machine yet. :-\

geewhz01

Just crashed here, sorry. Running 2.0-BETA5 (amd64)
built on Sun Jan 30 23:04:29 EST 2011. It happened right after a rule change.

[Attachment: pfsense131.png]

eri--

What crashed, the primary or the secondary?
Did it crash on clicking apply rules, or on the secondary after the sync?

geewhz01

@ermal:

What crashed, the primary or the secondary?
Did it crash on clicking apply rules, or on the secondary after the sync?

The secondary crashed after the sync.

Thanks,

Andy

eri--

Can you please send me your config?

vito

ermal,
My system has been good so far.
I just updated another firewall and will let you know.

Thanks again for your help.

geewhz01

@ermal:

Can you please send me your config?

I have sent both the primary and secondary configs. Let me know if I can be of further assistance.

Thanks,

Andy

eri--

I committed a fix for this issue too.
So grab a snapshot from late tomorrow and test.

geewhz01

Still a problem for me. Secondary locked a few minutes after a rule change. Running AMD64 20110201-1959.

Thanks,

Andy

[Attachment: pfsense2211.png]

eri--

The fix is confirmed, since I found a way to reproduce it.
Tomorrow's snapshots should be OK.

jnorell

A quick note: I updated my CARP slave to the Feb 2 04:04:51 EST 2011 snapshot, which has some of the recent kernel panic fixes, but I'm still able to panic it by adding a new VIP address. (I deleted a VIP address, no problem; then re-added that address, no problem; then added a new VIP address, and it panicked in devd again.)

fasteddy

I looked through a similar topic (http://forum.pfsense.org/index.php/topic,31721.0.html) but wasn't entirely convinced I'm having the same issue, so a new topic seemed to be in order…

I have had an IPSec site-to-site VPN connection running trouble-free for months now using two pfSense PCs running 2.0 BETA i386 snapshots from May of 2010. After updating to a January 27th snapshot (and snapshots from every day since, including today) I'm having problems with one of the PCs locking up. Without fail, if I attempt to log on to the web configuration page of the remote pfSense box through the VPN connection, it will serve up the 'login' page, but crashes as soon as I hit the 'login' button. The error message on the pfSense box's monitor says:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xd79a2720
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xd79a2720
stack pointer           = 0x28:0xc5b26bbc
frame pointer           = 0x28:0xc5b26bc8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (em3 taskq)

I've attached a picture of the backtrace information as well.

If I do a remote desktop connection to a PC on the remote side of the VPN, I can access the web configuration through that PC's browser with no issues. The problem only occurs if I try to access the web configuration of the remote pfSense box (or ssh console, for that matter) directly from a machine on the local side of the VPN. I've been doing this with no problems until I updated on the 27th.

For what it's worth, this only happens on one of my two machines - if I'm on the 'sick' pfSense box's network I can access the 'healthy' pfSense box's web configuration directly through the VPN with no issues. The two PCs are substantially different, but are using the same network cards - Intel Pro/1000MT gigabit NICs. The 'sick' pfSense box also has an on-board Marvell Yukon Gigabit interface (skc0).

The output of 'pciconf -lvp' is:


skc0@pci0:0:10:0:       class=0x020000 card=0x811a1043 chip=0x432011ab rev=0x13 hdr=0x00
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfa000000, size 16384, enabled
    bar   [14] = type I/O Port, range 32, base 0xa000, size 256, enabled
em0@pci0:0:11:0:        class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfa300000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xfa200000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xa400, size 64, enabled
em1@pci0:0:12:0:        class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfa600000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xfa500000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xa800, size 64, enabled
em2@pci0:0:13:0:        class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfa900000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xfa800000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xb000, size 64, enabled
em3@pci0:0:14:0:        class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfac00000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xfab00000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xb400, size 64, enabled
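
As an aside for anyone comparing hardware between the two boxes: in `pciconf -l` output the `chip` value packs the device ID into the upper 16 bits and the vendor ID into the lower 16 bits, and `card` does the same for the subsystem IDs. Below is a minimal Python sketch of that split. The function names and the two-entry vendor table are illustrative assumptions; the table lists only the vendor IDs that appear above.

```python
import re

# Illustrative lookup only: 0x8086 is Intel, 0x11ab is Marvell.
VENDORS = {0x8086: "Intel", 0x11AB: "Marvell"}

# Matches the first line of each device entry, for example:
#   em0@pci0:0:11:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
HEADER_RE = re.compile(
    r"^(\w+)@pci\S+\s+.*?\bcard=0x([0-9a-f]{8})\s+chip=0x([0-9a-f]{8})",
    re.IGNORECASE,
)


def split_id(value):
    """Split a packed 32-bit ID into (vendor, device): low 16 bits, high 16 bits."""
    return value & 0xFFFF, value >> 16


def parse_pciconf_header(line):
    """Return the decoded IDs for one device header line, or None if it does not match."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    subvendor, subdevice = split_id(int(match.group(2), 16))
    vendor, device = split_id(int(match.group(3), 16))
    return {
        "name": match.group(1),
        "vendor": vendor,
        "device": device,
        "subvendor": subvendor,
        "subdevice": subdevice,
        "vendor_name": VENDORS.get(vendor, "unknown"),
    }


if __name__ == "__main__":
    lines = [
        "skc0@pci0:0:10:0:       class=0x020000 card=0x811a1043 chip=0x432011ab rev=0x13 hdr=0x00",
        "em3@pci0:0:14:0:        class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00",
    ]
    for line in lines:
        info = parse_pciconf_header(line)
        print(info["name"], info["vendor_name"], hex(info["vendor"]), hex(info["device"]))
```

Running the same decode on the other box's output would show whether both machines really carry the same Intel device ID.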

Any chance someone could shed some light on this issue? I can reliably reproduce the issue if need be, and would be happy to provide any additional information that may be required.

Thank you, and best regards!

[Attachment: fasteddy_backtrace.jpg]

jimp

The snap from 2/3 would be the first to have the carp panic fixes.

There is still an issue (when deleting a VIP), but the panic you're seeing has likely been fixed in the most recent snap that is up now (Thu Feb 3 00:55:19 EST 2011).