    Kernel Panic

    2.0-RC Snapshot Feedback and Problems - RETIRED
    • L
      LostInIgnorance last edited by

      I have 2.0-BETA5 (i386) built on Fri Dec 31 14:08:23 EST 2010. I have been having problems with this firewall, which runs on an old Dell P4 computer. It runs great until I log in via OpenVPN and try to Remote Desktop into a computer. The firewall then kernel panics with:

      Fatal trap 12: page fault while in kernel mode
      cpuid = 0; apic id = 00
      fault virtual address   = 0x0
      fault code              = supervisor read, page not present
      instruction pointer     = 0x20:0x0
      stack pointer           = 0x28:0xcca52bbc
      frame pointer           = 0x28:0xcca52bc8
      code segment            = base 0x0, limit 0xfffff, type 0x1b
                              = DPL 0, pres 1, def32 1, gran 1
      processor eflags        = interrupt enabled, resume, IOPL = 0
      current process         = 0 (em0 taskq)
      trap number             = 12
      panic: page fault
      cpuid = 0
      Uptime: 16h28m26s
      Cannot dump. Device not defined or unavailable.
      Automatic reboot in 15 seconds - press a key on the console to abort
      Rebooting...
      

      The firewall has 9 VLANs, all trunked over one physical LAN port. I run Squid with Lightsquid for logging on all VLAN interfaces, and I am running captive portal too.
      I get the same problem at home with my Soekris net5501-70. There I have only LAN, WRLS (for guest access), and DMZ, and I am running the same utilities (Squid, Lightsquid, NUT, captive portal).

      • M
        mbis last edited by

        Same here… Has a bug been opened? Is there already a fix?

        • L
          LostInIgnorance last edited by

          I would love for this problem to be on the fix list, but I don't even know exactly what the cause is. I would appreciate it if one of the administrators could help with the debugging process.

          • F
            FisherKing last edited by

            As a start, you can install the developer kernel.

            After the panic, type "bt" at the debugger console and capture that output as well as the panic message.

            Directions for installing the developer kernel are at the following link:

            http://doc.pfsense.org/index.php/Switching_Kernels

            • T
              ti-guilherme last edited by

              Hi Guys,

              I have the same problem when I activate Squid…

              I have been testing snapshots from pfSense-2.0-BETA5-20101228-0454.iso.gz through pfSense-2.0-BETA5-20110101-1659.iso.gz.

              If I don't use Squid, the problem doesn't happen. This was tested in a lab with about ten users.

              We need to open a bug to report this.

              Luiz Ferreira.

              • jimp
                jimp Rebel Alliance Developer Netgate last edited by

                Since those snapshots are nearly a week old, before you report any problem, make sure it can be reproduced on the most current snapshot available.

                And if you get a panic, we need the full text of the panic (even a picture of the screen is OK) and if possible, install a debug kernel (as PJ2 mentioned) and get a backtrace.

                Be careful with posting or reporting bugs to an existing thread or ticket about panics, too, since they could be unrelated unless the circumstances and panic message are identical.

                Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                Need help fast? Netgate Global Support!

                Do not Chat/PM for help!

                • L
                  LostInIgnorance last edited by

                  The panic still occurs after updating to the most recent snapshot [2.0-BETA5 (i386) built on Wed Jan 5 03:16:13 EST 2011]. Same panic as above.
                  If it helps, I am running Windows Server 2008 R2 and connecting with a Windows 7 client. I noticed the panic happens after the Kerberos handshake, when the client shows "initiating remote connection".

                  EDIT: I tried to load the developer kernel on my Soekris board, following the link and instructions above, and it didn't work. It actually crashed the system, which then wouldn't boot. I will try it on the P4 later today; hopefully I won't have the same outcome.

                  EDIT EDIT: This is a copy of this afternoon's crash from my home system.

                  Fatal trap 12: page fault while in kernel mode
                  cpuid = 0; apic id = 00
                  fault virtual address   = 0x0
                  fault code              = supervisor read, page not present
                  instruction pointer     = 0x20:0x0
                  stack pointer           = 0x28:0xd5341bf4
                  frame pointer           = 0x28:0xd5341c28
                  code segment            = base 0x0, limit 0xfffff, type 0x1b
                                          = DPL 0, pres 1, def32 1, gran 1
                  processor eflags        = interrupt enabled, resume, IOPL = 0
                  current process         = 11 (irq5: vr1)
                  trap number             = 12
                  panic: page fault
                  cpuid = 0
                  Uptime: 1d19h33m12s
                  Cannot dump. Device not defined or unavailable.
                  Automatic reboot in 15 seconds - press a key on the console to abort
                  Rebooting...
                  

                  The firewall crashed while I was out and trying to remote into my home system. I am using Squid in transparent mode. I haven't tried disabling Squid.
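
jimp's earlier caution, that two panics only point to the same bug when the circumstances and the panic message match, can be checked mechanically. The following is a minimal Python sketch, not a pfSense tool. It pulls the `key = value` fields out of the two trap reports quoted so far in this thread and compares a small signature. The helper names (`parse_panic`, `signature`) and the choice of which fields make up the signature are assumptions made purely for illustration.

```python
import re

# Matches "key = value" lines from a FreeBSD trap report, tolerating the
# inconsistent spacing around "=" that shows up in pasted console output.
FIELD = re.compile(r"^\s*([a-z][a-z ]*?)\s*=\s*(.+?)\s*$")

# Assumption: stack/frame pointers and uptime vary from boot to boot, so only
# these fields are compared. This is an illustrative choice, not a pfSense rule.
SIGNATURE_KEYS = ("fault virtual address", "fault code", "current process")

# Excerpts of the two reports posted above (P4 at work, Soekris at home).
P4_PANIC = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address= 0x0
fault code= supervisor read, page not present
instruction pointer= 0x20:0x0
current process= 0 (em0 taskq)
trap number= 12
"""

SOEKRIS_PANIC = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address= 0x0
fault code= supervisor read, page not present
instruction pointer= 0x20:0x0
current process= 11 (irq5: vr1)
trap number= 12
"""


def parse_panic(text):
    """Return a dict of the key = value fields found in a trap report."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            # Keep the first occurrence of a key (cpuid is printed twice).
            fields.setdefault(match.group(1), match.group(2))
    return fields


def signature(text):
    """Return the tuple of fields used to decide whether two panics match."""
    fields = parse_panic(text)
    return tuple(fields.get(key) for key in SIGNATURE_KEYS)


if __name__ == "__main__":
    print(signature(P4_PANIC))
    print(signature(SOEKRIS_PANIC))
    print("identical:", signature(P4_PANIC) == signature(SOEKRIS_PANIC))
```

For these two reports the fault address and fault code agree, but the faulting context does not (`em0 taskq` on the P4, `irq5: vr1` on the Soekris), so a backtrace from each box is still needed before treating them as one bug.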

                  • L
                    LostInIgnorance last edited by

                    Is anyone else able to get the panic logged? I can't get the developer kernel set up on either of my firewalls, and I can't afford the downtime.

                    • V
                      vito last edited by

                      I started a thread on this here on Dec 13, but then noticed this one
                      http://forum.pfsense.org/index.php/topic,31031.msg163019.html#msg163019

                      I just posted this pic on my thread.

                      EDIT: If this helps, I have multiple VLANs on em1 on this box. A few of the other firewalls with the problem have no VLANs on the interface.


                      • V
                        vito last edited by

                        Here is a screenshot of the debugger output from the dev kernel. I can't get to the console at the moment for any more info.


                        • O
                          odin last edited by

                          I have exactly the same kernel crashes in FreeBSD 7 and 8. To me, it seems to be related to VLAN tags and NICs.

                          http://forums.freebsd.org/showthread.php?t=18676

                          (see last post #8)

                          • L
                            LostInIgnorance last edited by

                            Successfully grabbed the panic in developer mode

                            Kernel page fault with the following non-sleepable locks held:
                            exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                            KDB: stack backtrace:
                            X_db_sym_numargs(c0eb7066,e302aa90,c0a41d45,546,0,...) at X_db_sym_numargs+0x146
                            kdb_backtrace(546,0,ffffffff,c145d1ac,e302aac8,...) at kdb_backtrace+0x29
                            witness_display_spinlock(c0eb957e,e302aadc,4,1,0,...) at witness_display_spinlock+0x75
                            witness_warn(5,0,c0ef792d,14,c131b140,...) at witness_warn+0x20d
                            trap(e302ab68) at trap+0x19e
                            alltraps(c336dc00,dedeadc0,c336dc00,c336dc00,e302abf0,...) at alltraps+0x1b
                            m_tag_delete_chain(c336dc00,0,c0e6e512,0,c2ed9bc0,...) at m_tag_delete_chain+0x3f
                            reallocf(c336dc00,100,0,c0a42798,df,...) at reallocf+0x8a5
                            uma_zfree_arg(c1d7e380,c336dc00,0,bc,e302ac84,...) at uma_zfree_arg+0x29
                            m_freem(c336dc00,4,c0e6e512,b87,c2f4e000,...) at m_freem+0x43
                            ed_probe_RTL80x9(c2f52580,0,c0e6e512,546,c2f525bc,...) at 0xc06ec448
                            ed_probe_RTL80x9(c2f4e000,1,c0eb8937,4f,c2edb918,...) at 0xc06efe10
                            taskqueue_run(c2edb900,c2edb918,c0ea5cf0,0,c0eb1f96,...) at taskqueue_run+0x103
                            taskqueue_thread_loop(c2f525ec,e302ad38,c0eaeb05,344,c131b140,...) at taskqueue_thread_loop+0x68
                            fork_exit(c0a3afc0,c2f525ec,e302ad38) at fork_exit+0xb8
                            fork_trampoline() at fork_trampoline+0x8
                            --- trap 0, eip = 0, esp = 0xe302ad70, ebp = 0 ---
                            
                            Fatal trap 12: page fault while in kernel mode
                            cpuid = 0; apic id = 00
                            fault virtual address   = 0xdedeadc0
                            fault code              = supervisor read, page not present
                            instruction pointer     = 0x20:0xc0a60fe8
                            stack pointer           = 0x28:0xe302aba8
                            frame pointer           = 0x28:0xe302abb8
                            code segment            = base 0x0, limit 0xfffff, type 0x1b
                                                    = DPL 0, pres 1, def32 1, gran 1
                            processor eflags        = interrupt enabled, resume, IOPL = 0
                            current process         = 0 (em0 taskq)
                            Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
                            db>
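
One possible reading of the fault address, offered as an assumption and not confirmed anywhere in this thread: FreeBSD debug kernels built with INVARIANTS overwrite freed memory with the 32-bit pattern `0xdeadc0de`, and `0xdedeadc0` is what a little-endian 32-bit read returns when it starts one byte into a region filled with that pattern. If the developer kernel does that fill, the address would be consistent with a pointer being read out of already-freed memory while `m_freem` walks the mbuf tags. The short Python sketch below only checks the byte arithmetic; the constant and function names are illustrative.

```python
import struct

JUNK = 0xDEADC0DE        # assumed fill pattern for freed memory (debug kernels)
FAULT_ADDR = 0xDEDEADC0  # "fault virtual address" from the report above


def unaligned_reads(pattern, words=2):
    """Yield (offset, value) for 32-bit little-endian reads taken at byte
    offsets 0..3 of a buffer filled with the repeating 32-bit pattern."""
    filled = struct.pack("<I", pattern) * words
    for offset in range(4):
        (value,) = struct.unpack_from("<I", filled, offset)
        yield offset, value


# Byte offsets at which reading the filled buffer reproduces the fault address.
matches = [offset for offset, value in unaligned_reads(JUNK) if value == FAULT_ADDR]

if __name__ == "__main__":
    for offset, value in unaligned_reads(JUNK):
        print(f"offset {offset}: 0x{value:08x}")
    print("offsets matching the fault address:", matches)
```

This shows only that the numbers line up at byte offset 1. It does not identify which code freed the memory or why the read was offset.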
                            
                            • jimp
                              jimp Rebel Alliance Developer Netgate last edited by

                              Looks like it may be in the Intel driver… fun.

                              • L
                                LostInIgnorance last edited by

                                So, is it something you guys can fix, or is it a problem in FreeBSD?

                                • jimp
                                  jimp Rebel Alliance Developer Netgate last edited by

                                  Hopefully it's fixable, though lem is legacy em, so it's some rather old/early em chipsets. Might explain why so few people have hit it.

                                  • L
                                    LostInIgnorance last edited by

                                    If it is an Intel problem, then why would I be getting the same panic on my Soekris net5501-70 board, which uses a VIA VT6105M chip?

                                    • jimp
                                      jimp Rebel Alliance Developer Netgate last edited by

                                      Without a backtrace from there it's impossible to say.

                                      • V
                                        vito last edited by

                                        @jimp:

                                        Hopefully it's fixable, though lem is legacy em, so it's some rather old/early em chipsets. Might explain why so few people have hit it.

                                        Jim,
                                        where do you see that it is an old legacy NIC/driver? (Not questioning, just curious.) :)
                                        Just an FYI: I had no problem with the October snapshots.

                                        Thanks for your help

                                        • jimp
                                          jimp Rebel Alliance Developer Netgate last edited by

                                          @vito:

                                          where do you see that it is an old legacy NIC/driver? (Not questioning, just curious.) :)
                                          Just an FYI: I had no problem with the October snapshots.

                                          
                                          exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                                          

                                          The path ends in if_lem.c. lem is the legacy em driver; a normal em card would have been in if_em.c.
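
The way jimp reads the driver off the lock line can be sketched in a few lines of Python. This is an illustrative helper, not part of pfSense or FreeBSD. The name `lock_source` and the regular expression are assumptions based only on the format of the line quoted above.

```python
import re
from pathlib import PurePosixPath

# Captures the source path and line number that follow "locked @".
LOCKED_AT = re.compile(r"locked @ (?P<path>\S+):(?P<line>\d+)")

# The lock line from the developer-kernel panic quoted above.
EM_LOCK = (
    "exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) "
    "locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350"
)


def lock_source(witness_line):
    """Return (source file name, line number) from a 'locked @' line,
    or None when the line does not contain one."""
    match = LOCKED_AT.search(witness_line)
    if match is None:
        return None
    return PurePosixPath(match.group("path")).name, int(match.group("line"))


if __name__ == "__main__":
    print(lock_source(EM_LOCK))  # ('if_lem.c', 1350)
```

The file name identifies which driver source held the lock when the fault occurred. That narrows down where to look, but on its own it does not prove the bug is in that driver.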

                                          • F
                                            FisherKing last edited by

                                            Hmm - that line looks similar to the panic I get when captive portal is enabled on my box.

                                            
                                            exclusive sleep mutex fxp0 (network driver) r = 0 (0xc36de018) locked @ /usr/pfSensesrc/src/sys/dev/fxp/if_fxp.c:1288
                                            
                                            

                                            Details here:
                                            http://forum.pfsense.org/index.php/topic,30791.msg159227.html#msg159227

                                            CryoGenID gets a panic but with yet another set of drivers.  Cino does as well.
                                            http://forum.pfsense.org/index.php/topic,29839.60.html

                                            Is there anything we can do to help besides posting backtraces?

                                            • jimp
                                              jimp Rebel Alliance Developer Netgate last edited by

                                              I just spent a bit of time on the phone with someone who hit this. It does seem to be related to OpenVPN somehow (or to the kind of traffic that is seen more often with OpenVPN, I suppose). Once we had the developer kernel installed, it stayed up for quite a while, until we had someone connect with OpenVPN and generate some traffic.

                                              • jimp
                                                jimp Rebel Alliance Developer Netgate last edited by

                                                A patch was just committed by ermal that might fix this, or at least change the behavior somewhat. Give the next snapshot a try.

                                                • J
                                                  Jonb last edited by

                                                  Just as a side note, I was getting a kernel panic when the PPPoA interface was selected as WAN rather than rl1. Not sure if that is due to an incorrect config, but if so, it might be worth removing to save people the hassle.

                                                  Hosted desktops and servers with support without complication.
                                                  www.blueskysystems.co.uk

                                                  • L
                                                    LostInIgnorance last edited by

                                                    Still getting the panic. I don't think the commit made it into the most recent snapshot.

                                                    Kernel page fault with the following non-sleepable locks held:
                                                    exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                                                    KDB: stack backtrace:
                                                    X_db_sym_numargs(c0eb72fb,ccc3ca90,c0a41f25,546,0,...) at X_db_sym_numargs+0x146
                                                    kdb_backtrace(546,0,ffffffff,c145d42c,ccc3cac8,...) at kdb_backtrace+0x29
                                                    witness_display_spinlock(c0eb9813,ccc3cadc,4,1,0,...) at witness_display_spinlock+0x75
                                                    witness_warn(5,0,c0ef7bc2,14,c131b3c0,...) at witness_warn+0x20d
                                                    trap(ccc3cb68) at trap+0x19e
                                                    alltraps(c341ab00,dedeadc0,c341ab00,c341ab00,ccc3cbf0,...) at alltraps+0x1b
                                                    m_tag_delete_chain(c341ab00,0,c0e6e75d,0,c2ed9d50,...) at m_tag_delete_chain+0x3f
                                                    reallocf(c341ab00,100,0,c0a42978,df,...) at reallocf+0x8a5
                                                    uma_zfree_arg(c1d7e380,c341ab00,0,d5,ccc3cc84,...) at uma_zfree_arg+0x29
                                                    m_freem(c341ab00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
                                                    ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
                                                    ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
                                                    taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
                                                    taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed9a,344,c131b3c0,...) at taskqueue_thread_loop+0x68
                                                    fork_exit(c0a3b1a0,c2f525ec,ccc3cd38) at fork_exit+0xb8
                                                    fork_trampoline() at fork_trampoline+0x8
                                                    --- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
                                                    
                                                    Fatal trap 12: page fault while in kernel mode
                                                    cpuid = 0; apic id = 00
                                                    fault virtual address   = 0xdedeadc0
                                                    fault code              = supervisor read, page not present
                                                    instruction pointer     = 0x20:0xc0a611c8
                                                    stack pointer           = 0x28:0xccc3cba8
                                                    frame pointer           = 0x28:0xccc3cbb8
                                                    code segment            = base 0x0, limit 0xfffff, type 0x1b
                                                                            = DPL 0, pres 1, def32 1, gran 1
                                                    processor eflags        = interrupt enabled, resume, IOPL = 0
                                                    current process         = 0 (em0 taskq)
                                                    Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
                                                    db>
                                                    
                                                    • jimp
                                                      jimp Rebel Alliance Developer Netgate last edited by

                                                      It did happen. I manually restarted the builders after the patch went in. So apparently it still isn't quite right.

                                                      • jimp
                                                        jimp Rebel Alliance Developer Netgate last edited by

                                                        Could someone who can readily reproduce this panic give this custom firmware build a try?

                                                        http://cvs.pfsense.org/~jimp/pfSense-Full-Update-2.0-BETA5-i386-20110114-2041.tgz

                                                        It was built without a patch that does the extra mbuf operations that may be triggering the panic.

                                                        • L
                                                          LostInIgnorance last edited by

                                                          Bad news JimP, still crashes.

                                                          Kernel page fault with the following non-sleepable locks held:
                                                          exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                                                          KDB: stack backtrace:
                                                          X_db_sym_numargs(c0eb72fb,ccc3ca90,c0a41f25,546,0,...) at X_db_sym_numargs+0x146
                                                          kdb_backtrace(546,0,ffffffff,c145d42c,ccc3cac8,...) at kdb_backtrace+0x29
                                                          witness_display_spinlock(c0eb9813,ccc3cadc,4,1,0,...) at witness_display_spinlock+0x75
                                                          witness_warn(5,0,c0ef7bc2,14,c131b3c0,...) at witness_warn+0x20d
                                                          trap(ccc3cb68) at trap+0x19e
                                                          alltraps(c2feeb00,dedeadc0,c2feeb00,c2feeb00,ccc3cbf0,...) at alltraps+0x1b
                                                          m_tag_delete_chain(c2feeb00,0,c0e6e75d,0,c2ed9b50,...) at m_tag_delete_chain+0x3f
                                                          reallocf(c2feeb00,100,0,c0a42978,df,...) at reallocf+0x8a5
                                                          uma_zfree_arg(c1d7e380,c2feeb00,0,b5,ccc3cc84,...) at uma_zfree_arg+0x29
                                                          m_freem(c2feeb00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
                                                          ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
                                                          ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
                                                          taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
                                                          taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaed9a,344,c131b3c0,...) at taskqueue_thread_loop+0x68
                                                          fork_exit(c0a3b1a0,c2f525ec,ccc3cd38) at fork_exit+0xb8
                                                          fork_trampoline() at fork_trampoline+0x8
                                                          --- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
                                                          
                                                          Fatal trap 12: page fault while in kernel mode
                                                          cpuid = 0; apic id = 00
                                                          fault virtual address   = 0xdedeadc0
                                                          fault code              = supervisor read, page not present
                                                          instruction pointer     = 0x20:0xc0a611c8
                                                          stack pointer           = 0x28:0xccc3cba8
                                                          frame pointer           = 0x28:0xccc3cbb8
                                                          code segment            = base 0x0, limit 0xfffff, type 0x1b
                                                                                  = DPL 0, pres 1, def32 1, gran 1
                                                          processor eflags        = interrupt enabled, resume, IOPL = 0
                                                          current process         = 0 (em0 taskq)
                                                          Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
                                                          db>
                                                          
                                                          • F
                                                            FisherKing last edited by

                                                            Currently running 2.0-BETA5 (i386) built on Thu Jan 13 19:33:19 EST 201.
                                                            Not sure how far back this happens.

                                                            In a test network:
                                                            2 machines, each with 4 Intel NICs (em0 - em3)
                                                            WAN, LAN, Opt1, Opt2 (CARP interface)

                                                            Running CARP on the WAN, LAN, and Opt1 interfaces.
                                                            Syncing on the Opt2 interface.

                                                            Recently started getting panics on box2 when changing settings on box1.

                                                            Panic and backtrace from box2 included below.

                                                            
                                                            Fatal trap 12: page fault while in kernel mode
                                                            
                                                            cpuid = 0; apic id = 00
                                                            
                                                            fault virtual address	= 0x1a4
                                                            
                                                            fault code		= supervisor read, page not present
                                                            
                                                            instruction pointer	= 0x20:0xc09ee51d
                                                            
                                                            stack pointer	        = 0x28:0xd670aa54
                                                            
                                                            frame pointer	        = 0x28:0xd670aa70
                                                            
                                                            code segment		= base 0x0, limit 0xfffff, type 0x1b
                                                            
                                                            			= DPL 0, pres 1, def32 1, gran 1
                                                            
                                                            processor eflags	= interrupt enabled, resume, IOPL = 0
                                                            
                                                            current process		= 253 (devd)
                                                            
                                                            [thread]
                                                            Stopped at      _mtx_lock_sleep+0x6d:   movl    0x1a4(%ecx),%eax
                                                            
                                                            db> bt
                                                            Tracing pid 253 tid 64081 td 0xc4142000
                                                            _mtx_lock_sleep(c40f16d0,c4142000,0,c0ecfc57,fd,...) at _mtx_lock_sleep+0x6d
                                                            _mtx_lock_flags(c40f16d0,0,c0ecfc57,fd,0,...) at _mtx_lock_flags+0xf7
                                                            carp6_input(c3ae5800,c0286938,c40f3a00,c0ea9fce,3,...) at carp6_input+0x9bd
                                                            ifioctl(c46a3b44,c0286938,c40f3a00,c4142000,c40cf900,...) at ifioctl+0x141e
                                                            soo_ioctl(c412ddc8,c0286938,c40f3a00,c39aa400,c4142000,...) at soo_ioctl+0x415
                                                            kern_ioctl(c4142000,f,c0286938,c40f3a00,1a3b7d0,...) at kern_ioctl+0x1fd
                                                            ioctl(c4142000,d670acf8,c0ef7af5,c0ecdaff,c41a77f8,...) at ioctl+0x134
                                                            syscall(d670ad38) at syscall+0x220
                                                            Xint0x80_syscall() at Xint0x80_syscall+0x20
                                                            --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x8088357, esp = 0xbfbfe89c, ebp = 0xbfbfe908 ---
                                                            db> reboot
                                                            [/thread]
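
One detail in the trace above can be checked with simple arithmetic: the fault virtual address (0x1a4) equals the displacement in the faulting instruction, `movl 0x1a4(%ecx),%eax`. The sketch below works that out. It assumes the memory operand on the "Stopped at" line is the access that faulted, which is consistent with the "supervisor read" fault code; the function name `base_register` is made up for illustration.

```python
import re

# The two lines below are copied from the panic output above.
FAULT_LINE = "fault virtual address\t= 0x1a4"
STOPPED_LINE = "Stopped at      _mtx_lock_sleep+0x6d:   movl    0x1a4(%ecx),%eax"


def base_register(fault_line, stopped_line):
    """Return (register, value) for the base register of the faulting operand."""
    fault_addr = int(re.search(r"=\s*(0x[0-9a-fA-F]+)", fault_line).group(1), 16)
    # Match a displacement(base) memory operand such as 0x1a4(%ecx).
    operand = re.search(r"(0x[0-9a-fA-F]+)\((%\w+)\)", stopped_line)
    displacement = int(operand.group(1), 16)
    register = operand.group(2)
    # effective address = base + displacement, so base = fault address - displacement
    return register, fault_addr - displacement


print(base_register(FAULT_LINE, STOPPED_LINE))  # ('%ecx', 0)
```

Under that assumption, %ecx held 0 at the time of the fault, so the code was reading a field at offset 0x1a4 from a NULL pointer. This says nothing by itself about why the pointer was NULL.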
                                                            
                                                            • jimp
                                                              jimp Rebel Alliance Developer Netgate last edited by

                                                              Out of curiosity, what type of network cards do you have in that box? Is it both rl and em? Just em? Just rl? Or something else?

                                                              Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                                                              Need help fast? Netgate Global Support!

                                                              Do not Chat/PM for help!

                                                              • L
                                                                LostInIgnorance last edited by

                                                                One em NIC (gigabit, embedded on the board of an old Dell P4).  All network traffic is VLAN'd on that one interface.

                                                                • jimp
                                                                  jimp Rebel Alliance Developer Netgate last edited by

                                                                  OK, just checking… It looks odd to me that the backtrace references ed_probe_RTL80x9 which is a really old realtek chip, but it may just be something weird that I don't know at that level in the kernel/network stack.

                                                                  We have arranged serial console access with someone who has been able to reproduce the panic so hopefully we'll have a lead on a fix early next week.

                                                                  • W
                                                                    wallabybob last edited by

                                                                    @jimp:

                                                                    OK, just checking… It looks odd to me that the backtrace references ed_probe_RTL80x9 which is a really old realtek chip,

                                                                    Here's an extract from the stack trace:

                                                                    m_freem(c2feeb00,4,c0e6e75d,b87,c2f4e000,...) at m_freem+0x43
                                                                    ed_probe_RTL80x9(c2f52580,0,c0e6e75d,546,c2f525bc,...) at 0xc06ec4d8
                                                                    ed_probe_RTL80x9(c2f4e000,1,c0eb8bcc,4f,c2edb918,...) at 0xc06efea0
                                                                    taskqueue_run(c2edb900,c2edb918,c0ea5f85,0,c0eb222b,...) at taskqueue_run+0x103
                                                                    

                                                                    Note the two ed_probe_RTL80x9 references are not accompanied by a symbol name and offset. I suspect ed_probe_RTL80x9 is merely the closest lower-valued global symbol, but it's too far away to warrant printing the PC as symbol+offset. If that is the case, you shouldn't take too much notice of the ed_probe_RTL80x9.
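
A minimal sketch of the lookup suspected above: find the closest symbol at or below the PC, and print symbol+offset only when the offset is small. The symbol addresses and the `MAX_OFFSET` cutoff are invented for illustration and are not taken from a real kernel; only the PC value 0xc06ec4d8 comes from the trace.

```python
import bisect

# Hypothetical symbol table, sorted by start address. These addresses are
# invented for illustration; they are not taken from a real pfSense kernel.
SYMBOLS = [
    (0xC0500000, "m_freem"),
    (0xC0600000, "ed_probe_RTL80x9"),
    (0xC0800000, "taskqueue_run"),
]
STARTS = [addr for addr, _ in SYMBOLS]

# Assumed cutoff: past this distance the nearest symbol is treated as unrelated.
MAX_OFFSET = 0x10000


def resolve(pc):
    """Return (nearest_lower_symbol, location_text) for a program counter."""
    i = bisect.bisect_right(STARTS, pc) - 1  # last symbol starting at or below pc
    if i < 0:
        return None, hex(pc)
    addr, name = SYMBOLS[i]
    offset = pc - addr
    if offset > MAX_OFFSET:
        # Too far from the symbol to be meaningful: show the raw address.
        return name, hex(pc)
    return name, f"{name}+{offset:#x}"


print(resolve(0xC0500043))  # ('m_freem', 'm_freem+0x43')
print(resolve(0xC06EC4D8))  # ('ed_probe_RTL80x9', '0xc06ec4d8')
```

With this table, the second PC still gets labelled with ed_probe_RTL80x9 because it is the nearest lower symbol, but the location falls back to a raw address. That pattern would typically mean the real function's symbol is missing from the table, for example code in a module whose symbols were not loaded.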

                                                                    • L
                                                                      LostInIgnorance last edited by

                                                                      @jimp:

                                                                      We have arranged serial console access with someone who has been able to reproduce the panic so hopefully we'll have a lead on a fix early next week.

                                                                      JimP, is there anything I can do to help out?

                                                                      • jimp
                                                                        jimp Rebel Alliance Developer Netgate last edited by

                                                                        Not that I'm aware of. If the mbuf tag patch isn't the cause, it almost has to be the recent e1000 driver update (em, igb, etc).

                                                                        • jimp
                                                                          jimp Rebel Alliance Developer Netgate last edited by

                                                                          Someone else saw that once, but so far we've been unable to replicate it, which we need to do before the real cause can be tracked down.

                                                                          It seemed to be something in the configuration, though.

                                                                          • L
                                                                            LostInIgnorance last edited by

                                                                            I am afraid to update since I haven't heard anything back.  Is it still crashing or has it been fixed?

                                                                            • jimp
                                                                              jimp Rebel Alliance Developer Netgate last edited by

                                                                              Nothing has changed with the drivers, but plenty of other things have been fixed, so it may be worth trying.

                                                                              • S
                                                                                Sabbasth last edited by

                                                                                How can I get the logs so I can help track down the problem? (Is system.log flushed on every boot?)

                                                                                I have 4 NICs (5 interfaces), all Intel em, a mix of PCI and motherboard-integrated NICs.
                                                                                The computer just freezes; it does not reboot.

                                                                                I have nmap and bandwidthd installed. I'm using outbound multi-WAN and the DHCP server, with no VLANs, no traffic shaper, and no VPN.

                                                                                Everything ran fine with an older snapshot. The freezes started after an upgrade a week ago. I currently have the latest snapshot installed.
                                                                                The freezes are random: sometimes pfSense runs for a few minutes, sometimes for a few hours.

                                                                                Every FTP transfer aborts with an error. (There were problems with passive FTP a week or so ago, but those were connection problems; here the transfers are aborted.)
                                                                                I think this could be related if the problem is a buffer in the driver.

                                                                                • jimp
                                                                                  jimp Rebel Alliance Developer Netgate last edited by

                                                                                  @Sabbasth:

                                                                                  How can I get the logs so I can help track down the problem? (Is system.log flushed on every boot?)

                                                                                  I have 4 NICs (5 interfaces), all Intel em, a mix of PCI and motherboard-integrated NICs.
                                                                                  The computer just freezes; it does not reboot.

                                                                                  I have nmap and bandwidthd installed. I'm using outbound multi-WAN and the DHCP server, with no VLANs, no traffic shaper, and no VPN.

                                                                                  Everything ran fine with an older snapshot. The freezes started after an upgrade a week ago. I currently have the latest snapshot installed.
                                                                                  The freezes are random: sometimes pfSense runs for a few minutes, sometimes for a few hours.

                                                                                  Every FTP transfer aborts with an error. (There were problems with passive FTP a week or so ago, but those were connection problems; here the transfers are aborted.)
                                                                                  I think this could be related if the problem is a buffer in the driver.

                                                                                  If you are seeing a freeze and not a reset/panic, then this thread isn't related. Start a new thread for that. These drivers haven't changed for several weeks now.

                                                                                  • C
                                                                                    clarknova last edited by

                                                                                    2.0-BETA5 (amd64)
                                                                                    built on Wed Jan 12 18:01:47 EST 2011

                                                                                    I just experienced my first kernel panic last night after more than 8 days of uptime. I'm using a SM X7SPA-H board with only the onboard Intel GbE (Intel 82574L Gigabit Ethernet). I'm not using OpenVPN, but both NICs have multiple VLANs on them and deal only in tagged traffic.

                                                                                    Is there a reasonable chance that updating to the latest snap will resolve this? I don't know that I can reproduce this panic intentionally, as it hasn't happened before and I wasn't doing anything interesting when it happened. I do have clients, but the panic happened at my lowest traffic period of the day.

                                                                                    db
