Netgate Discussion Forum

    Kernel Panic

    2.0-RC Snapshot Feedback and Problems - RETIRED
    325 Posts 35 Posters 279.3k Views
    • F
      FisherKing

      Hmm - that line looks similar to the panic I get when captive portal is enabled on my box.

      
      exclusive sleep mutex fxp0 (network driver) r = 0 (0xc36de018) locked @ /usr/pfSensesrc/src/sys/dev/fxp/if_fxp.c:1288
      
      

      Details here:
      http://forum.pfsense.org/index.php/topic,30791.msg159227.html#msg159227

      CryoGenID gets a panic but with yet another set of drivers.  Cino does as well.
      http://forum.pfsense.org/index.php/topic,29839.60.html

      Is there anything we can do to help besides posting back traces?

      • jimpJ
        jimp Rebel Alliance Developer Netgate

        I just spent a bit of time on the phone with someone who hit this. It does seem to be related to OpenVPN somehow (or to the kind of traffic that is seen more often with OpenVPN, I suppose). Once we had the developer kernel on it, the box stayed up for quite a while, until we had someone connect with OpenVPN and generate some traffic.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • jimpJ
          jimp Rebel Alliance Developer Netgate

          A patch was just committed by ermal that might fix this, or at least change the behavior somewhat. Give the next snapshot a try.

          • J
            Jonb

            Just as a side note, I was getting a kernel panic when the PPPoA interface was selected as WAN rather than rl1. Not sure if that is due to an incorrect config, but if so it might be worth removing that option to save people the hassle.

            Hosted desktops and servers with support without complication.
            www.blueskysystems.co.uk

            • L
              LostInIgnorance

              Still getting the panic. I don't think the commit made it into the most recent snapshot.

              Kernel page fault with the following non-sleepable locks held:
              exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
              KDB: stack backtrace:
              X_db_sym_numargs(c0eb72fb,ccc3ca90,c0a41f25,546,0,...) at X_db_sym_numargs+0x146
              kdb_backtrace(546,0,ffffffff,c145d42c,ccc3cac8,...) at kdb_backtrace+0x29
              witness_display_spinlock(c0eb9813,ccc3cadc,4,1,0,...) at witness_display_spinlock+0x75
              witness_warn(5,0,c0ef7bc2,14,c131b3c0,...) at witness_warn+0x20d
              trap(ccc3cb68) at trap+0x19e
              alltraps(c341ab00,dedeadc0,c341ab00,c341ab00,ccc3cbf0,...) at alltraps+0x1b
              m_tag_delete_chain(c341ab00,0,c0e6e75d,0,c2ed9d50,...) at m_tag_delete_chain+0x3f
              reallocf(c341ab00,100,0,c0a42978,df,...) at reallocf+0x8a5
              uma_zfree_arg(c1d7e380,c341ab00,0,d5,ccc3cc84,...) at uma_zfree_arg+0x29
              m_freem(c341ab00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
              ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
              ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
              taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
              taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed9a,344,c131b3c0,...) at taskqueue_thread_loop+0x68
              fork_exit(c0a3b1a0,c2f525ec,ccc3cd38) at fork_exit+0xb8
              fork_trampoline() at fork_trampoline+0x8
              --- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
              
              Fatal trap 12: page fault while in kernel mode
              cpuid = 0; apic id = 00
              fault virtual address= 0xdedeadc0
              fault code= supervisor read, page not present
              instruction pointer= 0x20:0xc0a611c8
              stack pointer        = 0x28:0xccc3cba8
              frame pointer        = 0x28:0xccc3cbb8
              code segment= base 0x0, limit 0xfffff, type 0x1b
              = DPL 0, pres 1, def32 1, gran 1
              processor eflags= interrupt enabled, resume, IOPL = 0
              current process= 0 (em0 taskq)
              [thread]
              Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
              db> [/thread]
              
              • jimpJ
                jimp Rebel Alliance Developer Netgate

                It did happen. I manually restarted the builders after the patch went in. So apparently it still isn't quite right.

                • jimpJ
                  jimp Rebel Alliance Developer Netgate

                  Could someone who can readily reproduce this panic give this custom firmware build a try?

                  http://cvs.pfsense.org/~jimp/pfSense-Full-Update-2.0-BETA5-i386-20110114-2041.tgz

                  It was built without a patch that does the extra mbuf operations that may be triggering the panic.

                  • L
                    LostInIgnorance

                    Bad news, JimP: it still crashes.

                    Kernel page fault with the following non-sleepable locks held:
                    exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                    KDB: stack backtrace:
                    X_db_sym_numargs(c0eb72fb,ccc3ca90,c0a41f25,546,0,...) at X_db_sym_numargs+0x146
                    kdb_backtrace(546,0,ffffffff,c145d42c,ccc3cac8,...) at kdb_backtrace+0x29
                    witness_display_spinlock(c0eb9813,ccc3cadc,4,1,0,...) at witness_display_spinlock+0x75
                    witness_warn(5,0,c0ef7bc2,14,c131b3c0,...) at witness_warn+0x20d
                    trap(ccc3cb68) at trap+0x19e
                    alltraps(c2feeb00,dedeadc0,c2feeb00,c2feeb00,ccc3cbf0,...) at alltraps+0x1b
                    m_tag_delete_chain(c2feeb00,0,c0e6e75d,0,c2ed9b50,...) at m_tag_delete_chain+0x3f
                    reallocf(c2feeb00,100,0,c0a42978,df,...) at reallocf+0x8a5
                    uma_zfree_arg(c1d7e380,c2feeb00,0,b5,ccc3cc84,...) at uma_zfree_arg+0x29
                    m_freem(c2feeb00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
                    ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
                    ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
                    taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
                    taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed9a,344,c131b3c0,...) at taskqueue_thread_loop+0x68
                    fork_exit(c0a3b1a0,c2f525ec,ccc3cd38) at fork_exit+0xb8
                    fork_trampoline() at fork_trampoline+0x8
                    --- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
                    
                    Fatal trap 12: page fault while in kernel mode
                    cpuid = 0; apic id = 00
                    fault virtual address= 0xdedeadc0
                    fault code= supervisor read, page not present
                    instruction pointer= 0x20:0xc0a611c8
                    stack pointer        = 0x28:0xccc3cba8
                    frame pointer        = 0x28:0xccc3cbb8
                    code segment= base 0x0, limit 0xfffff, type 0x1b
                    = DPL 0, pres 1, def32 1, gran 1
                    processor eflags= interrupt enabled, resume, IOPL = 0
                    current process= 0 (em0 taskq)
                    [thread]
                    Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
                    db> [/thread]
                    
                    • F
                      FisherKing

                      Currently running 2.0-BETA5 (i386) built on Thu Jan 13 19:33:19 EST 201
                      Not sure how far back this happens.

                      In a test network:
                      2 machines, each with 4 Intel NICs (em0 - em3)
                      WAN, LAN, Opt1, Opt2 (CARP interface)

                      Running CARP on WAN, LAN, Opt1 interfaces
                      Syncing on Opt2 interface.

                      Recently started getting panics on box2 when changing settings on box1.

                      Panic & BackTrace from box2 included below.

                      
                      Fatal trap 12: page fault while in kernel mode
                      cpuid = 0; apic id = 00
                      fault virtual address   = 0x1a4
                      fault code              = supervisor read, page not present
                      instruction pointer     = 0x20:0xc09ee51d
                      stack pointer           = 0x28:0xd670aa54
                      frame pointer           = 0x28:0xd670aa70
                      code segment            = base 0x0, limit 0xfffff, type 0x1b
                                              = DPL 0, pres 1, def32 1, gran 1
                      processor eflags        = interrupt enabled, resume, IOPL = 0
                      current process         = 253 (devd)
                      
                      [thread]
                      Stopped at      _mtx_lock_sleep+0x6d:   movl    0x1a4(%ecx),%eax
                      
                      db> bt
                      Tracing pid 253 tid 64081 td 0xc4142000
                      _mtx_lock_sleep(c40f16d0,c4142000,0,c0ecfc57,fd,...) at _mtx_lock_sleep+0x6d
                      _mtx_lock_flags(c40f16d0,0,c0ecfc57,fd,0,...) at _mtx_lock_flags+0xf7
                      carp6_input(c3ae5800,c0286938,c40f3a00,c0ea9fce,3,...) at carp6_input+0x9bd
                      ifioctl(c46a3b44,c0286938,c40f3a00,c4142000,c40cf900,...) at ifioctl+0x141e
                      soo_ioctl(c412ddc8,c0286938,c40f3a00,c39aa400,c4142000,...) at soo_ioctl+0x415
                      kern_ioctl(c4142000,f,c0286938,c40f3a00,1a3b7d0,...) at kern_ioctl+0x1fd
                      ioctl(c4142000,d670acf8,c0ef7af5,c0ecdaff,c41a77f8,...) at ioctl+0x134
                      syscall(d670ad38) at syscall+0x220
                      Xint0x80_syscall() at Xint0x80_syscall+0x20
                      --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x8088357, esp = 0xbfbfe89c, ebp = 0xbfbfe908 ---
                      db> reboot
                      [/thread]
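
A note on reading the dump above: the fault virtual address (0x1a4) is the same as the displacement in the faulting instruction, `movl 0x1a4(%ecx),%eax`. Since the effective address is displacement plus register, that implies %ecx was 0, which most likely means a NULL pointer was dereferenced at a field offset. The following is only a minimal sketch of that arithmetic in Python; the function name `base_register_value` is made up for illustration, and it assumes a simple `disp(%reg)` operand on a 32-bit kernel, as in the traces in this thread.

```python
import re

def base_register_value(fault_addr_line, stopped_at_line):
    """Recover the base register value from a DDB fault report.

    Hypothetical helper, not part of any FreeBSD or pfSense tool.
    For an operand such as `0x1a4(%ecx)` the effective address is
    displacement + register, so register = fault address - displacement.
    """
    # "fault virtual address = 0x1a4" -> 0x1a4
    fault = int(re.search(r"=\s*(0x[0-9a-fA-F]+)", fault_addr_line).group(1), 16)

    # Find the first memory operand of the form disp(%reg); disp may be absent.
    m = re.search(r"(-?(?:0x[0-9a-fA-F]+|\d+))?\((%\w+)\)", stopped_at_line)
    disp = int(m.group(1), 0) if m.group(1) else 0

    # Mask to 32 bits because these traces come from an i386 kernel.
    return m.group(2), (fault - disp) & 0xFFFFFFFF


if __name__ == "__main__":
    reg, value = base_register_value(
        "fault virtual address = 0x1a4",
        "Stopped at      _mtx_lock_sleep+0x6d:   movl    0x1a4(%ecx),%eax",
    )
    print(f"{reg} = {value:#x}")
```

Applied to the earlier em0 traces (`movl 0(%ecx),%eax` with fault address 0xdedeadc0), the same arithmetic gives %ecx = 0xdedeadc0, so the two panics fault on quite different pointer values.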
                      
                      • jimpJ
                        jimp Rebel Alliance Developer Netgate

                        Out of curiosity, what type of network cards do you have in that box? Is it both rl and em? Just em? Just rl? Or something else?

                        • L
                          LostInIgnorance

                          One em NIC (gigabit, embedded on the board of an old Dell P4). All network traffic is VLAN'd on that one interface.

                          • jimpJ
                            jimp Rebel Alliance Developer Netgate

                            OK, just checking… It looks odd to me that the backtrace references ed_probe_RTL80x9, which is for a really old Realtek chip, but it may just be something at that level of the kernel/network stack that I don't understand.

                            We have arranged serial console access with someone who has been able to reproduce the panic so hopefully we'll have a lead on a fix early next week.

                            • W
                              wallabybob

                              @jimp:

                              OK, just checking… It looks odd to me that the backtrace references ed_probe_RTL80x9 which is a really old realtek chip,

                              Here's an extract from the stack trace:

                              m_freem(c2feeb00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
                              ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
                              ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
                              taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
                              

                              Note that the two ed_probe_RTL80x9 references are not accompanied by a symbol name and offset. I suspect ed_probe_RTL80x9 is merely the closest lower-valued global symbol, but it's too far away to warrant printing the PC as symbol+offset. If that is the case, you shouldn't take too much notice of the ed_probe_RTL80x9.
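
To illustrate the suspicion above, here is a minimal Python sketch of "closest lower symbol" resolution. It is not how the kernel debugger is actually implemented; the start address for `ed_probe_RTL80x9` and the `MAX_OFFSET` cutoff are assumptions made up for the example, and the `m_tag_delete` start is simply the instruction pointer from the panic minus 0x48. The point is only that an address with no nearby symbol can end up labelled with an unrelated function name and a raw address.

```python
import bisect

# Illustrative symbol table: (start address, name).
# The ed_probe_RTL80x9 address is invented for this example.
SYMBOLS = sorted([
    (0xc06e0000, "ed_probe_RTL80x9"),
    (0xc0a61180, "m_tag_delete"),   # 0xc0a611c8 - 0x48, from the panic output
])

# Assumed cutoff: past this distance, print the raw PC instead of symbol+offset.
MAX_OFFSET = 0x1000


def resolve(pc, symbols=SYMBOLS, max_offset=MAX_OFFSET):
    """Return (name, offset) for the closest symbol at or below pc.

    offset is None when the symbol is further away than max_offset;
    name is None when no symbol lies at or below pc.
    """
    starts = [addr for addr, _ in symbols]
    i = bisect.bisect_right(starts, pc) - 1
    if i < 0:
        return None, None
    addr, name = symbols[i]
    offset = pc - addr
    if offset > max_offset:
        return name, None
    return name, offset


def format_frame(pc):
    """Format a frame roughly the way the backtraces in this thread look."""
    name, offset = resolve(pc)
    if name is None:
        return f"{pc:#x}"
    if offset is None:
        # Symbol is only the nearest lower one, so show the raw address.
        return f"{name}(...) at {pc:#x}"
    return f"{name}(...) at {name}+{offset:#x}"


if __name__ == "__main__":
    for pc in (0xc0a611c8, 0xc06ec4d8, 0xc06efea0):
        print(format_frame(pc))
```

With these assumed values, the first address resolves as `m_tag_delete+0x48`, while the two addresses from the suspicious frames come out as `ed_probe_RTL80x9(...) at 0x...`, matching the shape seen in the trace.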

                              • L
                                LostInIgnorance

                                @jimp:

                                We have arranged serial console access with someone who has been able to reproduce the panic so hopefully we'll have a lead on a fix early next week.

                                JimP, is there anything I can do to help out?

                                • jimpJ
                                  jimp Rebel Alliance Developer Netgate

                                  Not that I'm aware of. If the mbuf tag patch isn't the cause, it almost has to be the recent e1000 driver update (em, igb, etc).

                                  • jimpJ
                                    jimp Rebel Alliance Developer Netgate

                                    Someone else had seen that once, but so far we've been unable to replicate it in order to track down the real cause.

                                    It seemed to be something in the configuration, though.

                                    • L
                                      LostInIgnorance

                                      I am afraid to update since I haven't heard anything back.  Is it still crashing or has it been fixed?

                                      • jimpJ
                                        jimp Rebel Alliance Developer Netgate

                                        Nothing has changed with the drivers, but plenty of other things have been fixed, so it may be worth trying.

                                        • S
                                          Sabbasth

                                          How can I get the logs so I can help track down the problem? (Is system.log flushed on every boot?)

                                          I have 4 NICs (5 interfaces), all Intel em, a mix of PCI and motherboard-integrated NICs.
                                          The computer just freezes; it does not reboot.

                                          I have nmap and bandwidthd installed. I'm using outbound multi-WAN and the DHCP server; no VLANs, no traffic shaper, no VPN.

                                          Everything ran fine with an old snapshot. The freezes started after an upgrade a week ago. I currently have the latest snapshot installed.
                                          The freezes are random: sometimes pfSense runs for a few minutes, sometimes for a few hours.

                                          Every FTP transfer aborts with an error. (There were problems with passive FTP a week or so ago, but those were connection problems; here the transfers themselves are aborted.)
                                          I think this could be linked to the problem if a buffer in the driver is the cause.

                                          • jimpJ
                                            jimp Rebel Alliance Developer Netgate

                                            @Sabbasth:

                                            If you are seeing a freeze and not a reset/panic, then this thread isn't related. Start a new thread for that. These drivers haven't changed for several weeks now.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.