Netgate Discussion Forum

    Kernel Panic

    2.0-RC Snapshot Feedback and Problems - RETIRED
    325 Posts 35 Posters 252.2k Views
    • W
      wallabybob
      last edited by

      @jimp:

      OK, just checking… It looks odd to me that the backtrace references ed_probe_RTL80x9, which is a really old Realtek chip,

      Here's an extract from the stack trace:

      m_freem(c2feeb00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
      ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
      ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
      taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
      

      Note that the two ed_probe_RTL80x9 frames are printed with a raw address rather than a symbol name and offset. I suspect ed_probe_RTL80x9 is merely the closest lower-valued global symbol, but it's too far away to warrant printing the PC as symbol+offset. If that is the case, you shouldn't read too much into the ed_probe_RTL80x9 references.
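To illustrate the suspicion above, here is a minimal sketch of nearest-lower-symbol lookup. The symbol addresses and the `MAX_OFFSET` cutoff are invented for illustration; the real debugger's table and its rule for falling back to a raw address may well differ, and the argument lists from the real trace are left out.

```python
from bisect import bisect_right

# Hypothetical symbol table, sorted by start address.
# Only the names come from the trace; the addresses are made up.
SYMBOLS = [
    (0xC06E4000, "ed_probe_RTL80x9"),
    (0xC08C0000, "taskqueue_run"),
    (0xC0900000, "m_freem"),
]

# Assumed cutoff: beyond this distance the symbol is treated as unreliable.
MAX_OFFSET = 0x1000


def resolve(pc, symbols=SYMBOLS):
    """Return (name, offset) for the closest symbol at or below pc, or None."""
    starts = [addr for addr, _ in symbols]
    i = bisect_right(starts, pc) - 1
    if i < 0:
        return None  # pc lies below every known symbol
    addr, name = symbols[i]
    return name, pc - addr


def format_frame(pc, max_offset=MAX_OFFSET):
    """Format one frame the way the trace does: symbol+offset or a raw address."""
    hit = resolve(pc)
    if hit is None:
        return f"??() at {pc:#x}"
    name, offset = hit
    if offset > max_offset:
        # The name is only the nearest exported symbol, so print the raw PC.
        return f"{name}() at {pc:#x}"
    return f"{name}() at {name}+{offset:#x}"


if __name__ == "__main__":
    print(format_frame(0xC0900043))  # close to a symbol start
    print(format_frame(0xC06EC4D8))  # far from the nearest lower symbol
```

With this made-up table, the second call falls back to the raw address even though a name is still printed. That is the pattern seen in the two ed_probe_RTL80x9 frames: the code at that address most likely belongs to a static function with no exported symbol of its own.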

      • L
        LostInIgnorance
        last edited by

        @jimp:

        We have arranged serial console access with someone who has been able to reproduce the panic so hopefully we'll have a lead on a fix early next week.

        JimP, is there anything I can do to help out?

        • jimpJ
          jimp Rebel Alliance Developer Netgate
          last edited by

          Not that I'm aware of. If the mbuf tag patch isn't the cause, it almost has to be the recent e1000 driver update (em, igb, etc).

          Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

          Need help fast? Netgate Global Support!

          Do not Chat/PM for help!

          • jimpJ
            jimp Rebel Alliance Developer Netgate
            last edited by

            Someone else had seen that once but so far we've been unable to replicate it so the real cause can be tracked down.

            It seemed to be something in the configuration, though.

            • L
              LostInIgnorance
              last edited by

              I am afraid to update since I haven't heard anything back.  Is it still crashing or has it been fixed?

              • jimpJ
                jimp Rebel Alliance Developer Netgate
                last edited by

                Nothing has changed with the drivers, but plenty of other things have been fixed, so it may be worth trying.

                • S
                  Sabbasth
                  last edited by

                  How can I get the logs (system.log is flushed on every boot?) so I can help track down the problem?

                  I have 4 NICs (5 interfaces), all Intel em, a mix of PCI cards and motherboard-integrated NICs.
                  The computer just freezes; it does not reboot.

                  I have nmap and bandwidthd installed. I'm using outbound multi-WAN and the DHCP server, with no VLANs, no traffic shaper, and no VPN.

                  Everything ran well with an old snapshot. The freezes started after an upgrade a week ago. I currently have the latest snapshot installed.
                  The freezes are random: sometimes pfSense runs for a few minutes, sometimes for a few hours.

                  Every FTP transfer aborts with an error. (There were problems with passive FTP a week or so ago, but those were connection problems; here the transfers themselves are aborted.)
                  I think this could be linked to the freezes if a buffer in the driver is at fault.

                  • jimpJ
                    jimp Rebel Alliance Developer Netgate
                    last edited by

                    @Sabbasth:

                    How can I get the logs (system.log is flushed on every boot?) so I can help track down the problem?

                    I have 4 NICs (5 interfaces), all Intel em, a mix of PCI cards and motherboard-integrated NICs.
                    The computer just freezes; it does not reboot.

                    I have nmap and bandwidthd installed. I'm using outbound multi-WAN and the DHCP server, with no VLANs, no traffic shaper, and no VPN.

                    Everything ran well with an old snapshot. The freezes started after an upgrade a week ago. I currently have the latest snapshot installed.
                    The freezes are random: sometimes pfSense runs for a few minutes, sometimes for a few hours.

                    Every FTP transfer aborts with an error. (There were problems with passive FTP a week or so ago, but those were connection problems; here the transfers themselves are aborted.)
                    I think this could be linked to the freezes if a buffer in the driver is at fault.

                    If you are seeing a freeze and not a reset/panic, then this thread isn't related. Start a new thread for that. These drivers haven't changed for several weeks now.

                    • C
                      clarknova
                      last edited by

                      2.0-BETA5 (amd64)
                      built on Wed Jan 12 18:01:47 EST 2011

                      I just experienced my first kernel panic last night after more than 8 days uptime. I'm using a SM X7SPA-H board with only the onboard Intel GBE (Intel 82574L Gigabit Ethernet). I'm not using openvpn, but both NICs have multiple vlans on them and deal only in tagged traffic.

                      Is there a reasonable chance that updating to the latest snap will resolve this? I don't know that I can reproduce this panic intentionally, as it hasn't happened before and I wasn't doing anything interesting when it happened. I do have clients, but the panic happened at my lowest traffic period of the day.

                      db

                      • jimpJ
                        jimp Rebel Alliance Developer Netgate
                        last edited by

                        I don't think we have done anything that would have fixed these panics, if they are driver-related.

                        Are you able to switch to a developer kernel so you can obtain a backtrace?

                        • L
                          LostInIgnorance
                          last edited by

                          JimP,
                          I will be trying the old Dell with the GigE port today to find out whether any of the changes you mentioned above made a difference. I haven't had the downtime lately to put it back into the system. I took that computer out and replaced it with another one with two 100 Mbit NICs so I could remote back into the system.
                          So it wouldn't be worthwhile to put that old Dell back in?

                          • C
                            clarknova
                            last edited by

                            I'll see about a backtrace. Shouldn't be a problem.

                            db

                            • jimpJ
                              jimp Rebel Alliance Developer Netgate
                              last edited by

                              It may be worth trying again on a current snap. At least on today's snapshot FTP no longer freezes my router :-)

                              • D
                                disa
                                last edited by

                                Hi,
                                I think I'm experiencing the same problem here. I have 2 boxes running the latest pfSense beta (2.0-BETA5 (amd64), built on Fri Jan 21 00:30:42 EST 2011), alongside another cluster running pfSense 1.2.3, and I'd like to move them to production soon (tomorrow or the day after, actually :-D). I have sync enabled, and when I add another CARP IP on the primary box, the secondary crashes. I was able to reproduce it 4 times, the last one with a devel kernel.
                                This happens as soon as I create the new VIP on the primary (when the sync starts), not after pressing Apply.

                                Pictures of the crash and backtrace are attached.

                                An interesting thing: the third time I tried (the first one with the devel kernel) I was able to create the CARP IP on the primary, and it was successfully synced to the secondary.
                                But in the secondary's logs I can see something that looks to me like a "soft" or "recoverable" panic.

                                What do you think?

                                thanks

                                Jan 21 17:37:00 	check_reload_status: syncing firewall
                                Jan 21 17:37:00 	kernel: vip1: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: em0: promiscuous mode disabled
                                Jan 21 17:37:00 	kernel: vip2: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: em1: promiscuous mode disabled
                                Jan 21 17:37:00 	kernel: vip3: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: em2: promiscuous mode disabled
                                Jan 21 17:37:00 	kernel: em2_vlan70: promiscuous mode disabled
                                Jan 21 17:37:00 	kernel: carp0: changing name to 'vip1'
                                Jan 21 17:37:00 	kernel: em0: promiscuous mode enabled
                                Jan 21 17:37:00 	kernel: vip1: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip1: link state changed to UP
                                Jan 21 17:37:00 	kernel: carp1: changing name to 'vip2'
                                Jan 21 17:37:00 	kernel: em1: promiscuous mode enabled
                                Jan 21 17:37:00 	kernel: vip2: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip2: link state changed to UP
                                Jan 21 17:37:00 	kernel: carp2: changing name to 'vip3'
                                Jan 21 17:37:00 	kernel: em2: promiscuous mode enabled
                                Jan 21 17:37:00 	kernel: em2_vlan70: promiscuous mode enabled
                                Jan 21 17:37:00 	kernel: vip3: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip3: link state changed to UP
                                Jan 21 17:37:00 	php: : CARP sync not being done because of missing sync ip!
                                Jan 21 17:37:00 	check_reload_status: syncing firewall
                                Jan 21 17:37:00 	kernel: carp3: changing name to 'vip4'
                                Jan 21 17:37:00 	kernel: vip4: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip4: link state changed to UP
                                Jan 21 17:37:00 	kernel: vip1: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: vip1: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip1: link state changed to UP
                                Jan 21 17:37:00 	php: : CARP sync not being done because of missing sync ip!
                                Jan 21 17:37:00 	check_reload_status: reloading filter
                                Jan 21 17:37:00 	kernel: vip2: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: vip2: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip2: link state changed to UP
                                Jan 21 17:37:00 	kernel: vip3: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: vip3: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip3: link state changed to UP
                                Jan 21 17:37:00 	kernel: vip4: link state changed to DOWN
                                Jan 21 17:37:00 	kernel: vip4: INIT -> MASTER (preempting)
                                Jan 21 17:37:00 	kernel: vip4: link state changed to UP
                                Jan 21 17:37:00 	php: /xmlrpc.php: ROUTING: change default route to ***
                                Jan 21 17:37:00 	php: /xmlrpc.php: Removing static route for monitor*** and adding a new route through ***
                                Jan 21 17:37:00 	php: /xmlrpc.php: Removing static route for monitor *** and adding a new route through ***
                                Jan 21 17:37:00 	apinger: Exiting on signal 15.
                                Jan 21 17:37:01 	apinger: Starting Alarm Pinger, apinger(60120)
                                Jan 21 17:37:01 	php: /xmlrpc.php: Resyncing OpenVPN instances.
                                Jan 21 17:37:02 	kernel: vip2: MASTER -> BACKUP (more frequent advertisement received)
                                Jan 21 17:37:02 	kernel: vip2: link state changed to DOWN
                                Jan 21 17:37:03 	dhcpd: Internet Systems Consortium DHCP Server 4.1.1-P1
                                Jan 21 17:37:03 	dhcpd: Copyright 2004-2010 Internet Systems Consortium.
                                Jan 21 17:37:03 	dhcpd: All rights reserved.
                                Jan 21 17:37:03 	dhcpd: For info, please visit https://www.isc.org/software/dhcp/
                                Jan 21 17:37:03 	dnsmasq[51897]: exiting on receipt of SIGTERM
                                Jan 21 17:37:04 	dnsmasq[5180]: started, version 2.55 cachesize 10000
                                Jan 21 17:37:04 	dnsmasq[5180]: compile time options: IPv6 GNU-getopt no-DBus I18N DHCP TFTP
                                Jan 21 17:37:04 	dnsmasq[5180]: reading /etc/resolv.conf
                                Jan 21 17:37:04 	dnsmasq[5180]: using nameserver 8.8.8.8#53
                                Jan 21 17:37:04 	dnsmasq[5180]: read /etc/hosts - 2 addresses
                                Jan 21 17:37:05 	kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
                                Jan 21 17:37:05 	kernel: vip1: link state changed to DOWN
                                Jan 21 17:37:05 	dhcpd: Internet Systems Consortium DHCP Server 4.1.1-P1
                                Jan 21 17:37:05 	dhcpd: Copyright 2004-2010 Internet Systems Consortium.
                                Jan 21 17:37:05 	dhcpd: All rights reserved.
                                Jan 21 17:37:05 	dhcpd: For info, please visit https://www.isc.org/software/dhcp/
                                Jan 21 17:37:06 	kernel: vip3: MASTER -> BACKUP (more frequent advertisement received)
                                Jan 21 17:37:06 	kernel: vip3: link state changed to DOWN
                                Jan 21 17:38:34 	kernel: lock order reversal:
                                Jan 21 17:38:34 	kernel: 1st 0xffffffff8123e520 in_ifaddr_lock (in_ifaddr_lock) @ /usr/pfSensesrc/src/sys/netinet/if_ether.c:541
                                Jan 21 17:38:34 	kernel: 2nd 0xffffff00026d55a0 carp_if (carp_if) @ /usr/pfSensesrc/src/sys/netinet/ip_carp.c:1160
                                Jan 21 17:38:34 	kernel: KDB: stack backtrace:
                                Jan 21 17:38:34 	kernel: X_db_sym_numargs() at X_db_sym_numargs+0x15a
                                Jan 21 17:38:34 	kernel: witness_display_spinlock() at witness_display_spinlock+0x9e
                                Jan 21 17:38:34 	kernel: witness_checkorder() at witness_checkorder+0x81e
                                Jan 21 17:38:34 	kernel: _mtx_lock_flags() at _mtx_lock_flags+0x78
                                Jan 21 17:38:34 	kernel: carp_iamatch() at carp_iamatch+0x38
                                Jan 21 17:38:34 	kernel: arprequest() at arprequest+0x4b8
                                Jan 21 17:38:34 	kernel: netisr_dispatch_src() at netisr_dispatch_src+0xb8
                                Jan 21 17:38:34 	kernel: ether_demux() at ether_demux+0x18d
                                Jan 21 17:38:34 	kernel: ether_vlanencap() at ether_vlanencap+0x295
                                Jan 21 17:38:34 	kernel: ed_probe_RTL80x9() at ed_probe_RTL80x9+0x7cf8
                                Jan 21 17:38:34 	kernel: ed_probe_RTL80x9() at ed_probe_RTL80x9+0x7ff4
                                Jan 21 17:38:34 	kernel: intr_event_execute_handlers() at intr_event_execute_handlers+0x66
                                Jan 21 17:38:34 	kernel: intr_event_add_handler() at intr_event_add_handler+0x432
                                Jan 21 17:38:34 	kernel: fork_exit() at fork_exit+0x12a
                                Jan 21 17:38:34 	kernel: fork_trampoline() at fork_trampoline+0xe
                                Jan 21 17:38:34 	kernel: --- trap 0, rip = 0, rsp = 0xffffff80000d6d30, rbp = 0 ---
                                Jan 21 17:38:44 	kernel: vip4: MASTER -> BACKUP (more frequent advertisement received)
                                Jan 21 17:38:44 	kernel: vip4: link state changed to DOWN
                                Jan 21 17:40:26 	check_reload_status: syncing firewall
                                Jan 21 17:40:26 	syslogd: exiting on signal 15
                                Jan 21 17:40:26 	syslogd: kernel boot file is /boot/kernel/kernel
                                Jan 21 17:40:26 	php: : CARP sync not being done because of missing sync ip!
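
One way to check that the sync really completed on the secondary is to reduce the log above to the last CARP state reported for each VIP. The sketch below assumes the `kernel: vipN: OLD -> NEW` message format seen in this log; the helper name `final_states` and the shortened sample lines are illustrative only.

```python
import re

# Matches CARP state transitions such as "kernel: vip1: INIT -> MASTER (preempting)".
CARP_RE = re.compile(r"kernel: (vip\d+): (\w+) -> (\w+)")


def final_states(lines):
    """Return the last CARP state reported for each VIP, in order of first appearance."""
    states = {}
    for line in lines:
        m = CARP_RE.search(line)
        if m:
            vip, _old, new = m.groups()
            states[vip] = new  # later transitions overwrite earlier ones
    return states


# Shortened excerpt of the log above (timestamps kept, unrelated lines dropped).
SAMPLE = [
    "Jan 21 17:37:00 kernel: vip1: INIT -> MASTER (preempting)",
    "Jan 21 17:37:00 kernel: vip1: link state changed to UP",
    "Jan 21 17:37:05 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)",
    "Jan 21 17:37:00 kernel: vip4: INIT -> MASTER (preempting)",
    "Jan 21 17:38:44 kernel: vip4: MASTER -> BACKUP (more frequent advertisement received)",
]

if __name__ == "__main__":
    for vip, state in final_states(SAMPLE).items():
        print(vip, state)
```

If every VIP ends in BACKUP, as in this excerpt, the secondary took the new configuration and yielded to the primary as expected.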
                                
                                

                                Except for this incident I must say that it's a pleasure to work with beta 2!
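
On the "soft panic" question: the "lock order reversal" entry in the log above is a diagnostic from the WITNESS lock checker in the debug kernel, not a panic. It reports that two locks were taken in an order that conflicts with an order seen earlier, which could deadlock under the wrong timing. The toy checker below sketches the idea. The class and method names are hypothetical, and the earlier carp_if-then-in_ifaddr_lock ordering is an assumption, since the log only shows the second ordering being reported.

```python
class OrderChecker:
    """Toy lock-order checker: remembers every (held, acquired) pair it has seen."""

    def __init__(self):
        self.seen = set()

    def acquire(self, held, new):
        """Record taking `new` while holding the locks in `held`.

        Returns the list of (held, new) pairs that reverse an earlier ordering.
        """
        reversals = []
        for h in held:
            if (new, h) in self.seen:
                # Earlier code took `new` first and `h` second: the order is reversed.
                reversals.append((h, new))
            self.seen.add((h, h == new and h or new))
        return reversals


if __name__ == "__main__":
    checker = OrderChecker()
    # Assumed earlier code path: carp_if held, then in_ifaddr_lock taken.
    print(checker.acquire(["carp_if"], "in_ifaddr_lock"))
    # Path from the log: in_ifaddr_lock held (1st), then carp_if taken (2nd).
    print(checker.acquire(["in_ifaddr_lock"], "carp_if"))
```

The second call reports the pair in the same "1st / 2nd" order as the kernel message. A report like this is worth passing on to the developers, but on its own it does not mean the box crashed.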

                                21012011221.jpg
                                21012011222.jpg

                                • E
                                  eri--
                                  last edited by

                                  Try tomorrow's snapshot; a fix for this has been put in place.

                                  • D
                                    disa
                                    last edited by

                                    Thanks for your prompt support.
                                    I've upgraded to the 21 Jan 23:51 snapshot, but the secondary still crashes. Do you need a backtrace? Or is this snapshot not new enough?
                                    thanks

                                    • L
                                      LostInIgnorance
                                      last edited by

                                      I am still getting the panic on my home Soekris net5501-70 board when using OpenVPN. I am unable to use the developer kernel on it, since the board is an embedded system without a display. The board refuses to boot once the developer kernel is loaded. Is there another way to capture the panic? I am using a hard disk in place of a CF card since I am running squid and HAVP.

                                      Fatal trap 12: page fault while in kernel mode
                                      cpuid = 0; apic id = 00
                                      fault virtual address   = 0x0
                                      fault code              = supervisor read, page not present
                                      instruction pointer     = 0x20:0x0
                                      stack pointer           = 0x28:0xd5341bf4
                                      frame pointer           = 0x28:0xd5341c28
                                      code segment            = base 0x0, limit 0xfffff, type 0x1b
                                                              = DPL 0, pres 1, def32 1, gran 1
                                      processor eflags        = interrupt enabled, resume, IOPL = 0
                                      current process         = 11 (irq5: vr1)
                                      trap number             = 12
                                      panic: page fault
                                      cpuid = 0
                                      Uptime: 2d6h27m42s
                                      Cannot dump. Device not defined or unavailable.
                                      Automatic reboot in 15 seconds - press a key on the console to abort
                                      Rebooting...
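
Even without a backtrace, the trap message above carries some information. Both the fault address and the instruction pointer are 0, which is consistent with (though not proof of) the kernel calling through a NULL function pointer, and the faulting thread is the interrupt thread for vr1. The sketch below pulls those fields out of a pasted panic message. The function names and the shortened sample text are illustrative only, and the NULL-call heuristic is an assumption, not an established diagnostic rule.

```python
import re

# Shortened copy of the panic message above.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
fault virtual address= 0x0
fault code= supervisor read, page not present
instruction pointer= 0x20:0x0
current process= 11 (irq5: vr1)
trap number= 12
"""

# "name = value" lines; names are lowercase words separated by spaces.
FIELD_RE = re.compile(r"^\s*([a-z ]+?)\s*=\s*(.+?)\s*$")


def parse_trap(text):
    """Return a dict of the 'name = value' fields in a trap message."""
    fields = {}
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields


def looks_like_null_call(fields):
    """Heuristic: True when both the fault address and the program counter are 0."""
    ip = fields.get("instruction pointer", "")
    va = fields.get("fault virtual address", "")
    try:
        pc = int(ip.split(":")[-1], 16)  # drop the segment selector before the colon
        addr = int(va, 16)
    except ValueError:
        return False
    return pc == 0 and addr == 0


if __name__ == "__main__":
    fields = parse_trap(PANIC_TEXT)
    print(fields["current process"])
    print(looks_like_null_call(fields))
```

Posting the full trap message together with the interface and driver named in "current process" gives the developers a starting point while a developer kernel cannot be booted on the board.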
                                      
                                      • D
                                        disa
                                        last edited by

                                        Hi, I've upgraded to version: Sun Jan 23 01:37:41 EST 2011, changed to devel kernel and the system doesn't boot any more, see the attached pictures.

                                        Thanks

                                        23012011229.jpg
                                        23012011230.jpg

                                        • V
                                          vito
                                          last edited by

                                          Tested with snap
                                          2.0-BETA5 (i386)
                                          built on Fri Jan 21 06:52:27 EST 2011

                                          Still not working.

                                          Thanks!

                                          • jimpJ
                                            jimp Rebel Alliance Developer Netgate
                                            last edited by

                                            It looks like the e1000 (em and igb) driver from 8.2-STABLE saw quite a few updates and fixes, and we may have to pull that in. I'll post here when something gets committed (or you can just follow the commit log on the tools repo).

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.