Kernel Panic
-
Which one crashed, the primary or the secondary?
Did it crash when you clicked apply rules, or did the secondary crash after the sync?
-
@ermal:
Which one crashed, the primary or the secondary?
Did it crash when you clicked apply rules, or did the secondary crash after the sync?
The secondary crashed after the sync.
Thanks,
Andy
-
Can you please send me your config?
-
ermal
My system has been good so far.
I just updated another firewall and will let you know.
Thanks again for your help.
-
@ermal:
Can you please send me your config?
I have sent both the primary and secondary configs. Let me know if I can be of further assistance.
Thanks,
Andy
-
I committed a fix for this issue too.
So grab a snapshot from late tomorrow and test.
-
Still a problem for me. Secondary locked a few minutes after a rule change. Running AMD64 20110201-1959.
Thanks,
Andy
-
Fix confirmed, since I found a way to reproduce it.
Tomorrow's snapshots should be OK.
-
A quick note, I updated my carp slave to Feb 2 04:04:51 EST 2011 with some of the recent kernel panic fixes, but I'm still able to panic it by adding a new VIP addr. (I deleted a VIP addr, no problem; then re-added that addr, no problem; then added a new VIP addr, and it panic'd in devd again.)
-
I looked through a similar topic (http://forum.pfsense.org/index.php/topic,31721.0.html) but wasn't entirely convinced I'm having the same issue, so a new topic seemed to be in order…
I have had an IPSec site-to-site VPN connection running trouble-free for months now using two pfSense PCs running 2.0 BETA i386 snapshots from May of 2010. After updating to a January 27th snapshot (and snapshots from every day since, including today) I'm having problems with one of the PCs locking up. Without fail, if I attempt to log on to the web configuration page of the remote pfSense box through the VPN connection, it will serve up the 'login' page, but crashes as soon as I hit the 'login' button. The error message on the pfSense box's monitor says:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xd79a2720
fault code = supervisor read, page not present
instruction pointer = 0x20:0xd79a2720
stack pointer = 0x28:0xc5b26bbc
frame pointer = 0x28:0xc5b26bc8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em3 taskq)
I've attached a picture of the backtrace information as well.
If I do a remote desktop connection to a PC on the remote side of the VPN, I can access the web configuration through that PC's browser with no issues. The problem only occurs if I try to access the web configuration of the remote pfSense box (or ssh console, for that matter) directly from a machine on the local side of the VPN. I've been doing this with no problems until I updated on the 27th.
For what it's worth, this only happens on one of my two machines - if I'm on the 'sick' pfSense box's network I can access the 'healthy' pfSense box's web configuration directly through the VPN with no issues. The two PCs are substantially different, but are using the same network cards - Intel Pro/1000MT gigabit NICs. The 'sick' pfSense box also has an on-board Marvell Yukon Gigabit interface (skc0).
The output of 'pciconf -lvp' is:
skc0@pci0:0:10:0: class=0x020000 card=0x811a1043 chip=0x432011ab rev=0x13 hdr=0x00
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xfa000000, size 16384, enabled
    bar [14] = type I/O Port, range 32, base 0xa000, size 256, enabled
em0@pci0:0:11:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xfa300000, size 131072, enabled
    bar [14] = type Memory, range 32, base 0xfa200000, size 131072, enabled
    bar [18] = type I/O Port, range 32, base 0xa400, size 64, enabled
em1@pci0:0:12:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xfa600000, size 131072, enabled
    bar [14] = type Memory, range 32, base 0xfa500000, size 131072, enabled
    bar [18] = type I/O Port, range 32, base 0xa800, size 64, enabled
em2@pci0:0:13:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xfa900000, size 131072, enabled
    bar [14] = type Memory, range 32, base 0xfa800000, size 131072, enabled
    bar [18] = type I/O Port, range 32, base 0xb000, size 64, enabled
em3@pci0:0:14:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xfac00000, size 131072, enabled
    bar [14] = type Memory, range 32, base 0xfab00000, size 131072, enabled
    bar [18] = type I/O Port, range 32, base 0xb400, size 64, enabled
Any chance someone could shed some light on this issue? I can reliably reproduce the issue if need be, and would be happy to provide any additional information that may be required.
Thank you, and best regards!
-
The snap from 2/3 would be the first to have the carp panic fixes.
There is still an issue (when deleting a VIP), but the panic you're seeing has likely been fixed in the most recent snap that is up now (Thu Feb 3 00:55:19 EST 2011).
-
Hi Jimp - My read of the build logs suggested that Ermal's patch didn't get included in last night's build, even though the build was started after the commit was made.
Thu Feb 3 04:21:58 EST 2011 -|- >>> Sleeping for 86400 in between snapshot builder runs. Last known commit 847e5e8257b58906a0d12ce48275cae7162aab47
The commit listed there shows up in redmine as being a couple of commits before Ermal's commit. Am I reading that wrong?
-
I started the builder by hand after his commit.
The patches are committed in the tools repo - the tools repo isn't tracked by the builder in the function you pasted.
-
Hmmm - Just panicked, running 2.0-BETA5 (i386) built on Thu Feb 3 00:55:19 EST 2011 on both master & slave.
Updated a rule on the master, slave panicked.
I've attached images of the panic & bt.
![Carp panic.png](/public/imported_attachments/1/Carp panic.png)
![Carp bt.png](/public/imported_attachments/1/Carp bt.png)
-
The issue is fixed. The patch I committed just had a typo from a copy/paste error.
-
You should grab the next snapshot when it comes out.
AFAIK it no longer has these issues.
-
I just started a new build after ermal's commit - the next new snapshot should include this fix.
-
Thanks Ermal! I'll give the next snapshot a try when it is available and report back.
-
Running 2.0-BETA5 (i386) built on Thu Feb 3 18:55:08 EST 2011 on both master & slave.
Slave panicked as soon as I updated a rule on the master.
Attached images of both panic & bt.
![Carp Panic.png](/public/imported_attachments/1/Carp Panic.png)
![carp bt.png](/public/imported_attachments/1/carp bt.png)
-
Wrong snapshot, sorry.
This is the commit that fixes the error: https://rcs.pfsense.org/projects/pfsense-tools/repos/mainline/commits/08f1322c7d5d9fae8ef52dc356c75a59d2483263