Kernel Panic
-
This is on the slave. I've not had any problems on the master with this, just the slave.
Thanks,
Andy
-
snapshot from 2/9/2011.
Dell Poweredge 860
Intel Pentium D CPU 2.80GHz
2 gigs ram.
Openvpn client export plugin is the only plugin installed.
pptp server for remote clients.
4 ipsec tunnels to other networks.
Crashes get stuck at db> prompt.
I have been unable to find a way to reliably trigger a crash but it does so every 24 hours or so.
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x6c
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc04ee165
stack pointer = 0x28:0xc5b89c20
frame pointer = 0x28:0xc5b89c34
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 8 (pfpurge)
[thread]
Stopped at pf_state_tree_ext_gwy_RB_REMOVE_COLOR+0x25: cmpl $0x1,0x6c(%edx)
db>
Happy to provide additional information.
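A side note on reading the trap output above, offered as a sketch rather than a diagnosis: the faulting instruction reads memory at %edx + 0x6c, and the reported fault virtual address is also 0x6c. Assuming flat addressing (segment base 0x0), that suggests %edx held 0, i.e. the red-black tree code followed a NULL pointer. The function name below is hypothetical and only illustrates the arithmetic.

```python
def base_register_value(fault_address: int, displacement: int) -> int:
    """Recover the base register for an operand of the form disp(%reg).

    Assumes flat addressing, so the effective address is simply reg + disp.
    """
    return fault_address - displacement


# Values copied from the trap output: "fault virtual address = 0x6c"
# and the operand "0x6c(%edx)" of the faulting cmpl instruction.
edx = base_register_value(0x6C, 0x6C)
print(hex(edx))
```

The backtrace from "bt" is still needed to see which caller handed that pointer to the tree code.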
-
@sammy@fusionsleep.com:
snapshot from 2/9/2011.
[…]Stopped at pf_state_tree_ext_gwy_RB_REMOVE_COLOR+0x25: cmpl $0x1,0x6c(%edx) db>
Happy to provide additional information.
We also need the output of "bt" at that prompt. Also, is this i386 or amd64?
-
Will do the bt at the prompt next crash.
snapshot from 2/9/2011.
Dell Poweredge 860
Intel Pentium D CPU 2.80GHz
2 gigs ram.
Openvpn client export plugin is the only plugin installed.
pptp server for remote clients.
4 ipsec tunnels to other networks.
-
None of that tells us if you used a 32-bit (i386) or 64-bit (amd64) snapshot.
-
i386, sorry. I didn't realize the amd64 snapshot would run on a Pentium D.
-
It may, it may not, I'm not sure what that CPU can do… but it's always best to provide the information requested rather than relying on people to guess what you may have done based on the hardware you have.
-
OK, I think I've got this one handled too.
Wait for the next snapshot and report back.
-
2.0-BETA5 (amd64) built on Fri Feb 11 06:51:36 EST 2011: I was able to add a CARP IP to the master (2.0-BETA5 (amd64) built on Fri Jan 21 23:51:34 EST 2011) and the slave received it, but then I added a certificate authority on the master and the slave crashed.
By the way, if I add the CA to my primary host, I should be able to use it from the slave, right?
Thanks
-
Hi Guys
I keep getting a kernel panic:
pf_state_tree_id_RB_REMOVE_COLOR+0x25 cmpl
This keeps happening every so often.
Anything I can do?
Kind regards
Aubrey Kloppers
-
Make sure both of your firewalls are on the latest snapshot and try again. Those panics should be cleared up.
-
Upgraded the slave to 2.0-RC1 (amd64) built on Sun Feb 13 23:53:14 EST 2011.
It crashed after I installed the autobackup package on the master (the sync started and the slave crashed).
I'm really scared to upgrade the master node to RC1, as I've already put the cluster in production. Is commercial support going to support me even if I'm running 2.0?
Thanks
-
Yes, commercial support will assist you even if it's 2.0. Your panic is different from the ones previously posted in this thread, though.
-
OK,
I'll try to upgrade the master node in two hours. Thanks
-
Can you please describe your setup?
EDIT: I put in even more safety belts, so your CARP panic should not happen on new snapshots.
-
Hi, I'll open a ticket tomorrow with all the details on my setup. By the way, I have had the 2 nodes running 2.0-RC1 (amd64) built on Sun Feb 13 23:53:14 EST 2011 for a few hours without problems. I didn't work on any new CARP IPs, only on OpenVPN: the master was synced to the slave many times without problems. I'll try to play more with CARP tomorrow and let you know :-)
Thanks
-
(Ticket #KZZ-134399 opened)
The cluster has now been in production for 1 day without any problems.
Here is one strange thing I saw in the logs (on the slave) the other day. What happened?
Feb 13 22:14:50 kernel: vip7: link state changed to DOWN
Feb 13 22:14:50 kernel: vip7: MASTER -> BACKUP (more frequent advertisement received)
Feb 13 22:14:50 kernel: vip8: link state changed to DOWN
Feb 13 22:14:50 kernel: vip9: link state changed to DOWN
Feb 13 22:14:50 kernel: nt received)
Feb 13 22:14:50 kernel:
Feb 13 22:14:50 kernel: e<5m>Ne
Feb 13 22:14:50 kernel: <d6o>Wis
Feb 13 22:14:50 kernel: e<5r>to
Feb 13 22:14:50 kernel: d< 6>tv
Feb 13 22:14:50 kernel: <5a>dge
Feb 13 22:14:50 kernel: h<6a>nt
Feb 13 22:14:50 kernel: q<u5>en c
Feb 13 22:14:50 kernel: t<a6t>ere
Feb 13 22:14:50 kernel: e< 5f>s
Feb 13 22:14:50 kernel: <k6>m or
Feb 13 22:14:50 kernel: P<5 >(n
Feb 13 22:14:50 kernel: :< 6l>iU
Feb 13 22:14:50 kernel: B<a5c>K1
Feb 13 22:14:50 kernel: i<6p>
Feb 13 22:14:50 kernel: -<5>>v
Feb 13 22:14:50 kernel: vip8: MASTER
Feb 13 22:14:50 kernel: vip10: link state changed to DOWN
Feb 13 22:14:50 kernel: vip11: link state changed to DOWN
Feb 13 22:14:50 kernel: vip9: MASTER -> BACKUP (more frequent advertisement received)
Feb 13 22:14:50 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
Feb 13 22:14:50 kernel: : MASTER -> BACKUP (more frequent advertisement received)
Feb 13 22:14:50 kernel:
Feb 13 22:14:50 kernel: p<150>N
Feb 13 22:14:50 kernel: o <d6o>Wvi
Feb 13 22:14:50 kernel: vip12: link state changed t
Feb 13 22:14:50 kernel: vip11: MASTER -> BACKUP (more frequent advertisement received)
Feb 13 22:14:50 kernel: vip12: MASTER -> BACKUP (more frequent advertisement received)
Feb 13 22:14:49 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Feb 13 22:14:49 dhcpd: All rights reserved.
Feb 13 22:14:49 dhcpd: Copyright 2004-2010 Internet Systems Consortium.
Feb 13 22:14:49 dhcpd: Internet Systems Consortium DHCP Server 4.1.1-P1
-
The next snap (building now) should have more carp panic fixes.
Not sure about the crazy kernel message there, though it looks like maybe it was two messages overlapping, but a bit more corrupt than that usually is.
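To make the overlap easier to see, here is a minimal Python sketch that separates intact CARP kernel messages from garbled fragments in a log like the one above. The two patterns are taken from the well-formed lines in that log. The function name and the sample list are hypothetical, and the timestamp format is assumed to match the excerpt.

```python
import re

# Syslog-style prefix as it appears in the excerpt, e.g. "Feb 13 22:14:50 kernel: ..."
KERNEL_LINE = re.compile(r"^\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} kernel:(.*)$")

# The two well-formed CARP messages seen in the excerpt.
WELL_FORMED = [
    re.compile(r"vip\d+: link state changed to (UP|DOWN)"),
    re.compile(r"vip\d+: MASTER -> BACKUP \(more frequent advertisement received\)"),
]


def split_kernel_messages(lines):
    """Return (intact, garbled) lists of kernel messages; other daemons are skipped."""
    intact, garbled = [], []
    for line in lines:
        match = KERNEL_LINE.match(line.strip())
        if match is None:
            continue  # not a kernel line (dhcpd and so on)
        message = match.group(1).strip()
        if any(pattern.fullmatch(message) for pattern in WELL_FORMED):
            intact.append(message)
        else:
            garbled.append(message)  # truncated, interleaved, or empty
    return intact, garbled


sample = [
    "Feb 13 22:14:50 kernel: vip7: link state changed to DOWN",
    "Feb 13 22:14:50 kernel: vip7: MASTER -> BACKUP (more frequent advertisement received)",
    "Feb 13 22:14:50 kernel: nt received)",
    "Feb 13 22:14:50 kernel: vip12: link state changed t",
    "Feb 13 22:14:49 dhcpd: All rights reserved.",
]
intact, garbled = split_kernel_messages(sample)
print(len(intact), "intact,", len(garbled), "garbled")
```

If the garbled fragments mostly look like pieces of the two known messages, that fits the overlapping-output explanation. It does not prove it.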
-
I would like to report that since updating on Sunday (when the snapshot server was available again), the Dell P4 with the integrated em gigE has not had any panics and is running well. Thank you all for all the hard work fixing the issue! :D
-
Hi,
Like cyber7, I've been getting kernel panics (pf_state_tree_id_RB_REMOVE_COLOR, pfpurge is always the current process) for the past two weeks (approx. 2 to 3 times a week) on a VMware setup. Here's a screenshot: