Kernel Panic
-
JimP,
I will be trying the old Dell with the GigE port today to find out whether any of the changes you mentioned above made a difference. I haven't had the downtime lately to put it back into the system. I took that computer out and used another one with two 100 Mbit NICs to get things running so I could remote back into the system.
So it wouldn't be worthwhile to put that old Dell back in?
-
I'll see about a backtrace. Shouldn't be a problem.
-
It may be worth trying again on a current snap. At least on today's snapshot FTP no longer freezes my router :-)
-
Hi,
I think that I'm experiencing the same problem here: I have 2 boxes running latest beta (2.0-BETA5 (amd64) built on Fri Jan 21 00:30:42 EST 2011 ) of pfsense (I have another cluster of pfsense 1.2.3 running) that I'd like to move to production soon (tomorrow or the day after tomorrow actually :-D): I have the sync enabled, and when I add another CARP IP on the primary box, the secondary crashes (I was able to reproduce it 4 times, the last one with a devel kernel).
This happens as soon as I create the new VIP on the primary (when the sync starts), not after pressing Apply. You can see a picture of the crash + backtrace.
An interesting thing: the third time I tried (the first one with the devel kernel) I was able to create the carp ip on the primary, and it was successfully synced on the secondary.
But in the secondary's logs I can see something that looks to me like a "soft" or "recoverable" panic. What do you think?
thanks
Jan 21 17:37:00 check_reload_status: syncing firewall
Jan 21 17:37:00 kernel: vip1: link state changed to DOWN
Jan 21 17:37:00 kernel: em0: promiscuous mode disabled
Jan 21 17:37:00 kernel: vip2: link state changed to DOWN
Jan 21 17:37:00 kernel: em1: promiscuous mode disabled
Jan 21 17:37:00 kernel: vip3: link state changed to DOWN
Jan 21 17:37:00 kernel: em2: promiscuous mode disabled
Jan 21 17:37:00 kernel: em2_vlan70: promiscuous mode disabled
Jan 21 17:37:00 kernel: carp0: changing name to 'vip1'
Jan 21 17:37:00 kernel: em0: promiscuous mode enabled
Jan 21 17:37:00 kernel: vip1: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip1: link state changed to UP
Jan 21 17:37:00 kernel: carp1: changing name to 'vip2'
Jan 21 17:37:00 kernel: em1: promiscuous mode enabled
Jan 21 17:37:00 kernel: vip2: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip2: link state changed to UP
Jan 21 17:37:00 kernel: carp2: changing name to 'vip3'
Jan 21 17:37:00 kernel: em2: promiscuous mode enabled
Jan 21 17:37:00 kernel: em2_vlan70: promiscuous mode enabled
Jan 21 17:37:00 kernel: vip3: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip3: link state changed to UP
Jan 21 17:37:00 php: : CARP sync not being done because of missing sync ip!
Jan 21 17:37:00 check_reload_status: syncing firewall
Jan 21 17:37:00 kernel: carp3: changing name to 'vip4'
Jan 21 17:37:00 kernel: vip4: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip4: link state changed to UP
Jan 21 17:37:00 kernel: vip1: link state changed to DOWN
Jan 21 17:37:00 kernel: vip1: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip1: link state changed to UP
Jan 21 17:37:00 php: : CARP sync not being done because of missing sync ip!
Jan 21 17:37:00 check_reload_status: reloading filter
Jan 21 17:37:00 kernel: vip2: link state changed to DOWN
Jan 21 17:37:00 kernel: vip2: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip2: link state changed to UP
Jan 21 17:37:00 kernel: vip3: link state changed to DOWN
Jan 21 17:37:00 kernel: vip3: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip3: link state changed to UP
Jan 21 17:37:00 kernel: vip4: link state changed to DOWN
Jan 21 17:37:00 kernel: vip4: INIT -> MASTER (preempting)
Jan 21 17:37:00 kernel: vip4: link state changed to UP
Jan 21 17:37:00 php: /xmlrpc.php: ROUTING: change default route to ***
Jan 21 17:37:00 php: /xmlrpc.php: Removing static route for monitor*** and adding a new route through ***
Jan 21 17:37:00 php: /xmlrpc.php: Removing static route for monitor *** and adding a new route through ***
Jan 21 17:37:00 apinger: Exiting on signal 15.
Jan 21 17:37:01 apinger: Starting Alarm Pinger, apinger(60120)
Jan 21 17:37:01 php: /xmlrpc.php: Resyncing OpenVPN instances.
Jan 21 17:37:02 kernel: vip2: MASTER -> BACKUP (more frequent advertisement received)
Jan 21 17:37:02 kernel: vip2: link state changed to DOWN
Jan 21 17:37:03 dhcpd: Internet Systems Consortium DHCP Server 4.1.1-P1
Jan 21 17:37:03 dhcpd: Copyright 2004-2010 Internet Systems Consortium.
Jan 21 17:37:03 dhcpd: All rights reserved.
Jan 21 17:37:03 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Jan 21 17:37:03 dnsmasq[51897]: exiting on receipt of SIGTERM
Jan 21 17:37:04 dnsmasq[5180]: started, version 2.55 cachesize 10000
Jan 21 17:37:04 dnsmasq[5180]: compile time options: IPv6 GNU-getopt no-DBus I18N DHCP TFTP
Jan 21 17:37:04 dnsmasq[5180]: reading /etc/resolv.conf
Jan 21 17:37:04 dnsmasq[5180]: using nameserver 8.8.8.8#53
Jan 21 17:37:04 dnsmasq[5180]: read /etc/hosts - 2 addresses
Jan 21 17:37:05 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
Jan 21 17:37:05 kernel: vip1: link state changed to DOWN
Jan 21 17:37:05 dhcpd: Internet Systems Consortium DHCP Server 4.1.1-P1
Jan 21 17:37:05 dhcpd: Copyright 2004-2010 Internet Systems Consortium.
Jan 21 17:37:05 dhcpd: All rights reserved.
Jan 21 17:37:05 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Jan 21 17:37:06 kernel: vip3: MASTER -> BACKUP (more frequent advertisement received)
Jan 21 17:37:06 kernel: vip3: link state changed to DOWN
Jan 21 17:38:34 kernel: lock order reversal:
Jan 21 17:38:34 kernel: 1st 0xffffffff8123e520 in_ifaddr_lock (in_ifaddr_lock) @ /usr/pfSensesrc/src/sys/netinet/if_ether.c:541
Jan 21 17:38:34 kernel: 2nd 0xffffff00026d55a0 carp_if (carp_if) @ /usr/pfSensesrc/src/sys/netinet/ip_carp.c:1160
Jan 21 17:38:34 kernel: KDB: stack backtrace:
Jan 21 17:38:34 kernel: X_db_sym_numargs() at X_db_sym_numargs+0x15a
Jan 21 17:38:34 kernel: witness_display_spinlock() at witness_display_spinlock+0x9e
Jan 21 17:38:34 kernel: witness_checkorder() at witness_checkorder+0x81e
Jan 21 17:38:34 kernel: _mtx_lock_flags() at _mtx_lock_flags+0x78
Jan 21 17:38:34 kernel: carp_iamatch() at carp_iamatch+0x38
Jan 21 17:38:34 kernel: arprequest() at arprequest+0x4b8
Jan 21 17:38:34 kernel: netisr_dispatch_src() at netisr_dispatch_src+0xb8
Jan 21 17:38:34 kernel: ether_demux() at ether_demux+0x18d
Jan 21 17:38:34 kernel: ether_vlanencap() at ether_vlanencap+0x295
Jan 21 17:38:34 kernel: ed_probe_RTL80x9() at ed_probe_RTL80x9+0x7cf8
Jan 21 17:38:34 kernel: ed_probe_RTL80x9() at ed_probe_RTL80x9+0x7ff4
Jan 21 17:38:34 kernel: intr_event_execute_handlers() at intr_event_execute_handlers+0x66
Jan 21 17:38:34 kernel: intr_event_add_handler() at intr_event_add_handler+0x432
Jan 21 17:38:34 kernel: fork_exit() at fork_exit+0x12a
Jan 21 17:38:34 kernel: fork_trampoline() at fork_trampoline+0xe
Jan 21 17:38:34 kernel: --- trap 0, rip = 0, rsp = 0xffffff80000d6d30, rbp = 0 ---
Jan 21 17:38:44 kernel: vip4: MASTER -> BACKUP (more frequent advertisement received)
Jan 21 17:38:44 kernel: vip4: link state changed to DOWN
Jan 21 17:40:26 check_reload_status: syncing firewall
Jan 21 17:40:26 syslogd: exiting on signal 15
Jan 21 17:40:26 syslogd: kernel boot file is /boot/kernel/kernel
Jan 21 17:40:26 php: : CARP sync not being done because of missing sync ip!
Except for this incident I must say that it's a pleasure to work with beta 2!
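A note for anyone reading a log like the one above: the "lock order reversal" report comes from WITNESS, the lock-order checker in FreeBSD debug kernels (witness_checkorder is visible in the backtrace), and it is generally a warning rather than a panic. Below is a minimal Python sketch for pulling those reports out of a pasted log, including one whose line breaks were lost. The function names split_entries and find_reversals are made up for illustration, and the timestamp pattern assumes the "Mon DD HH:MM:SS" syslog format shown in this thread.

```python
import re

# Start of a syslog entry such as "Jan 21 17:38:34 " (format assumed from this thread).
STAMP = re.compile(r"[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")


def split_entries(text):
    """Split syslog text into one entry per timestamp.

    Works even if the newlines were lost; anything before the first
    timestamp is dropped.
    """
    starts = [m.start() for m in STAMP.finditer(text)]
    if not starts:
        return []
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def find_reversals(entries):
    """Return (index, entry) pairs for WITNESS lock order reversal reports."""
    return [(i, e) for i, e in enumerate(entries) if "lock order reversal" in e]


if __name__ == "__main__":
    pasted = (
        "Jan 21 17:37:06 kernel: vip3: link state changed to DOWN "
        "Jan 21 17:38:34 kernel: lock order reversal: "
        "Jan 21 17:38:34 kernel: KDB: stack backtrace:"
    )
    entries = split_entries(pasted)
    for index, entry in find_reversals(entries):
        # Print the report header and the entries that follow it.
        print("\n".join(entries[index:index + 3]))
```

The entries that follow each hit (the "1st"/"2nd" lock lines and the KDB backtrace) are the part worth posting when reporting the problem.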
-
Try a snapshot from tomorrow; a fix for this has been put in place.
-
Thanks for your prompt support.
I've upgraded to the 21 Jan 23:51 snapshot but the secondary still crashes. Do you need a backtrace? Or is this snapshot not new enough?
thanks
-
I am still getting the panic on my home Soekris net5501-70 board when using OpenVPN. I am unable to load the developer kernel on it, since the board is an embedded system without a display, and it refuses to boot once the developer kernel is loaded. Is there another way to grab the panic? I am using a hard drive in place of a CF card since I am running squid and HAVP.
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address= 0x0
fault code= supervisor read, page not present
instruction pointer= 0x20:0x0
stack pointer = 0x28:0xd5341bf4
frame pointer = 0x28:0xd5341c28
code segment= base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process= 11 (irq5: vr1)
trap number= 12
panic: page fault
cpuid = 0
Uptime: 2d6h27m42s
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
-
Hi, I've upgraded to version: Sun Jan 23 01:37:41 EST 2011, changed to devel kernel and the system doesn't boot any more, see the attached pictures.
Thanks
-
Tested with snap
2.0-BETA5 (i386)
built on Fri Jan 21 06:52:27 EST 2011
Still not working.
Thanks!
-
Looks like the e1000 (em and igb) driver from 8.2-STABLE saw quite a few updates/fixes and we may have to pull that in. I'll post in here when something gets committed (or you can just follow the commit log on the tools repo).
-
Commit just happened with the changed drivers. Next snap should have them, assuming they build OK.
-
So that should fix the gig card issue, but what can I do to get the full backtrace on my embedded Soekris board?
-
We don't have an embedded debug kernel, not sure how hard it might be to make one. If it doesn't work properly with the full install's debug kernel, it may be a pain to do.
-
Jimp,
I will let you know after the next full update snap.
From another post: are the snaps building fine now (or the server, whatever the problem was)?
Thanks again!!
-
I will let you know after the next full update snap.
From another post: are the snaps building fine now (or the server, whatever the problem was)?
The build won't be restarted for a couple more hours (waiting on patches for other issues to go in, too) so it probably won't be uploaded until tomorrow AM.
The snapshots were building fine all weekend, but at one point the snapshot web server (where they are copied after being built) ran out of space. It's OK now.
-
Yeah I tried the full install from the cd on the hd, but the board wouldn't boot. Had to stick with the embedded build.
-
Soekris won't boot from USB. You have to put your boot media in another machine to do the install. Once you have the drive back in the soekris you will probably have to manually enter your rootfs device at the prompt, then update /etc/fstab.
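To illustrate the /etc/fstab step, here is a minimal Python sketch that rewrites the device column when a disk gets a different name after being moved between machines. Everything specific in it is an assumption for illustration: the helper name retarget_fstab is made up, and /dev/da0 (disk seen through a USB adapter) and /dev/ad0 (same disk inside the Soekris) are hypothetical device names, so check what your board actually reports. At the mountroot prompt the matching entry would take the form ufs:/dev/ad0s1a, again with your real device name.

```python
def retarget_fstab(fstab_text, old_dev, new_dev):
    """Rewrite fstab entries whose device starts with old_dev to use new_dev.

    Comment lines and blank lines are passed through unchanged.
    Rewritten entries are re-joined with tabs.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        if fields and not is_comment and fields[0].startswith(old_dev):
            # Keep the slice/partition suffix (e.g. "s1a") and swap the disk name.
            fields[0] = new_dev + fields[0][len(old_dev):]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical fstab as written by an install done through a USB adapter.
    sample = (
        "# Device\tMountpoint\tFStype\tOptions\tDump\tPass#\n"
        "/dev/da0s1a\t/\tufs\trw\t1\t1\n"
        "/dev/da0s1b\tnone\tswap\tsw\t0\t0\n"
    )
    print(retarget_fstab(sample, "/dev/da0", "/dev/ad0"), end="")
```

In practice it is just as easy to edit the two or three lines by hand; the sketch only shows which field changes.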
-
That is exactly what I did, using a VM and an external USB-SATA adapter. I just wish I could get the developer kernel to boot on the board. I should have specified that I used a USB DVD drive, since my netbook doesn't have one.
-
I have installed 1.2.3 and 2.0 on a net5501-70 a few times, never had a problem other than changing the root device. Post specifics if you want help troubleshooting it.
Have you tried installing the full developer kernel on the embedded install?
-
Last time I tried to do what PJ2 outlined at the beginning of this thread, it locked my system up and it wouldn't boot. Help is always greatly appreciated if I can get it installed and get the panic.
EDIT: Would actually prefer having the developer kernel on this embedded device. I am using the 5501-70; sata wd 80g; running full embedded; vr0-wan, vr1-lan, vr2-wrls (friendly wifi for visitors), vr3-dmz; packages squid, lightsquid, nut, havp, nmap, snort; anything else?