Netgate Discussion Forum

    Kernel Panic

    2.0-RC Snapshot Feedback and Problems - RETIRED
    325 Posts 35 Posters 279.1k Views
    • jimpJ
      jimp Rebel Alliance Developer Netgate
      last edited by

      I just added support to the builder to make an embedded kernel with debug options. I've got a test build going on my box; if one cranks out, I'll upload it somewhere this evening or tomorrow. Failing that, the main snapshots should include it from here on: not the next snapshot, but the one after it.

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • L
        LostInIgnorance
        last edited by

        OMG, THANKS JIMP!!!  I am excited to know if it works!

        • jimpJ
          jimp Rebel Alliance Developer Netgate
          last edited by

          For those wanting to debug on ALIX/other embedded devices…

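          # Remount the embedded filesystem read-write
          /etc/rc.conf_mount_rw
          # Download the debug kernel archive
          fetch http://pingle.org/files/kernel_wrap_Dev.gz
          # Unpack it into /boot/, preserving file permissions
          tar xzpf kernel_wrap_Dev.gz -C /boot/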

          And then reboot. It works on my ALIX.

          The next snapshot after the one building now should have them in there as well, but not the one building now.

          • L
            LostInIgnorance
            last edited by

            JimP, the above-mentioned code worked great on my Soekris net-5501-70 board. Thanks for that. I hope to drive down the road and use my neighbor's wifi to see if I can crash it now.

            • L
              LostInIgnorance
              last edited by

              Grabbed the panic from the above mentioned Soekris board.

              Kernel page fault with the following non-sleepable locks held:
              exclusive sleep mutex vr1 (network driver) r = 0 (0xc3640aec) locked @ /usr/pfSensesrc/src/sys/dev/vr/if_vr.c:1675
              KDB: stack backtrace:
              X_db_sym_numargs(c0c4e35f,d5341a88,c092ce85,68b,0,...) at X_db_sym_numargs+0x146
              kdb_backtrace(68b,0,ffffffff,c11b73f4,d5341ac0,...) at kdb_backtrace+0x29
              witness_display_spinlock(c0c50877,d5341ad4,4,1,0,...) at witness_display_spinlock+0x75
              witness_warn(5,0,c0c823d7,c1981a94,c3556aa0,...) at witness_warn+0x20d
              trap(d5341b60) at trap+0x172
              alltraps(c3899300,dedeadc0,c3899300,c3899300,d5341be8,...) at alltraps+0x1b
              m_tag_delete_chain(c3899300,0,c092cc2b,0,0,...) at m_tag_delete_chain+0x3f
              m_pkthdr_init(c3899300,100,0,c092cc2b,c0c3acf7,...) at m_pkthdr_init+0x8b5
              uma_zfree_arg(c1981a80,c3899300,0,c3640000,d5341c70,...) at uma_zfree_arg+0x29
              m_freem(c3899300,4,c0c3acf7,5a3,0,...) at m_freem+0x43
              ucom_attach(c3640aec,0,c0c3acf7,68b,c3640aec,...) at ucom_attach+0x88f5
              ucom_attach(c3640000,d5341cc8,c08d8a54,c107b5c0,c3554238,...) at ucom_attach+0xaa17
              intr_event_execute_handlers(c3556aa0,c3554200,c0c4616c,533,c3554270,...) at intr_event_execute_handlers+0x125
              intr_event_add_handler(c3644b60,d5341d38,c0c45ecc,344,c3556aa0,...) at intr_event_add_handler+0x42f
              fork_exit(c08c1b70,c3644b60,d5341d38) at fork_exit+0xb8
              fork_trampoline() at fork_trampoline+0x8
              --- trap 0, eip = 0, esp = 0xd5341d70, ebp = 0 ---
              
              Fatal trap 12: page fault while in kernel mode
              cpuid = 0; apic id = 00
              fault virtual address= 0xdedeadc0
              fault code= supervisor read, page not present
              instruction pointer= 0x20:0xc094b038
              stack pointer        = 0x28:0xd5341ba0
              frame pointer        = 0x28:0xd5341bb0
              code segment= base 0x0, limit 0xfffff, type 0x1b
              = DPL 0, pres 1, def32 1, gran 1
              processor eflags= interrupt enabled, resume, IOPL = 0
              current process= 11 (irq5: vr1)
              [thread]
              Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
              db> bt
              Tracing pid 11 tid 64025 td 0xc358d780
              m_tag_delete(c3899300,dedeadc0,c3899300,c3899300,d5341be8,...) at m_tag_delete+0x48
              m_tag_delete_chain(c3899300,0,c092cc2b,0,0,...) at m_tag_delete_chain+0x3f
              m_pkthdr_init(c3899300,100,0,c092cc2b,c0c3acf7,...) at m_pkthdr_init+0x8b5
              uma_zfree_arg(c1981a80,c3899300,0,c3640000,d5341c70,...) at uma_zfree_arg+0x29
              m_freem(c3899300,4,c0c3acf7,5a3,0,...) at m_freem+0x43
              ucom_attach(c3640aec,0,c0c3acf7,68b,c3640aec,...) at ucom_attach+0x88f5
              ucom_attach(c3640000,d5341cc8,c08d8a54,c107b5c0,c3554238,...) at ucom_attach+0xaa17
              intr_event_execute_handlers(c3556aa0,c3554200,c0c4616c,533,c3554270,...) at intr_event_execute_handlers+0x125
              intr_event_add_handler(c3644b60,d5341d38,c0c45ecc,344,c3556aa0,...) at intr_event_add_handler+0x42f
              fork_exit(c08c1b70,c3644b60,d5341d38) at fork_exit+0xb8
              fork_trampoline() at fork_trampoline+0x8
              --- trap 0, eip = 0, esp = 0xd5341d70, ebp = 0 ---
              db> 
              
              EDIT: Removed the HAVP output so that only the panic is shown.
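
For anyone comparing several of these reports side by side, the "Fatal trap" block above can be turned into a dictionary with a few lines of Python. This is only a rough sketch: `parse_fatal_trap` and `PANIC_TEXT` are made-up names for illustration, and the sample text is abridged from the console output above.

```python
import re

# Abridged sample copied from the Soekris panic above.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address= 0xdedeadc0
fault code= supervisor read, page not present
instruction pointer= 0x20:0xc094b038
current process= 11 (irq5: vr1)
Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
"""


def parse_fatal_trap(text):
    """Collect the 'key = value' fields of a Fatal trap block into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()

        # Header line: trap number and its description.
        m = re.match(r"Fatal trap (\d+): (.+)", line)
        if m:
            fields["trap"] = int(m.group(1))
            fields["description"] = m.group(2)
            continue

        # Debugger line: symbol+offset where execution stopped.
        m = re.match(r"Stopped at\s+(\S+?):\s+(.*)", line)
        if m:
            fields["stopped_at"] = m.group(1)
            continue

        # Everything else: split on the first '='.
        # Continuation lines that start with '=' have no key and are skipped.
        key, sep, value = line.partition("=")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for key, value in parse_fatal_trap(PANIC_TEXT).items():
        print(f"{key}: {value}")
```

Feeding it the text of two different panics makes it easier to spot which fields (fault address, current process, stopped-at symbol) match and which do not.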
              
              • jimpJ
                jimp Rebel Alliance Developer Netgate
                last edited by

                Is vr1 bridged to anything?

                • L
                  LostInIgnorance
                  last edited by

                  Not that I am aware of. I do force all traffic through the OpenVPN interface (the "Force all client generated traffic through the tunnel" option).

                  • L
                    LostInIgnorance
                    last edited by

                    I did notice this on the console after having to uninstall and reinstall HAVP.

                    lock order reversal:
                     1st 0xc4279df4 ufs (ufs) @ /usr/pfSensesrc/src/sys/kern/vfs_mount.c:1204
                     2nd 0xc474e6a0 syncer (syncer) @ /usr/pfSensesrc/src/sys/kern/vfs_subr.c:2203
                    KDB: stack backtrace:
                    X_db_sym_numargs(c0c4e35f,d6704a3c,c092ce85,c091d9bb,c0c512cb,...) at X_db_sym_numargs+0x146
                    kdb_backtrace(c091d9bb,c0c512cb,c3516ee8,c3517020,d6704a98,...) at kdb_backtrace+0x29
                    witness_display_spinlock(c0c512cb,c474e6a0,c0c5872e,c3517020,c0c585a4,...) at witness_display_spinlock+0x75
                    witness_checkorder(c474e6a0,9,c0c585a4,89b,0,...) at witness_checkorder+0x839
                    __lockmgr_args(c474e6a0,80100,c474e6bc,0,0,...) at __lockmgr_args+0x7f5
                    vop_stdlock(d6704bb4,3,c0c585a4,80100,c474e648,...) at vop_stdlock+0x62
                    VOP_LOCK1_APV(c1032b00,d6704bb4,c08d9223,c10560a0,c474e648,...) at VOP_LOCK1_APV+0xb5
                    _vn_lock(c474e648,80100,c0c585a4,89b,0,...) at _vn_lock+0x5e
                    insmntque(d6704c58,c097342e,c474e648,0,c0c57db8,...) at insmntque+0x288
                    vrele(c474e648,0,c0c57db8,4f9,80,...) at vrele+0x10
                    dounmount(c37b2000,8080000,c3b3a780,47e,fdf65b4a,...) at dounmount+0x3ce
                    unmount(c3b3a780,d6704cf8,c3b3a780,d6704d2c,206,...) at unmount+0x2bf
                    syscall(d6704d38) at syscall+0x1da
                    Xint0x80_syscall() at Xint0x80_syscall+0x20
                    --- syscall (22, FreeBSD ELF32, unmount), eip = 0x280dfa9f, esp = 0xbfbfe61c, ebp = 0xbfbfe6e8 ---
                    
                    • jimpJ
                      jimp Rebel Alliance Developer Netgate
                      last edited by

                      Don't worry about LORs, they're mostly harmless.

                      • L
                        LostInIgnorance
                        last edited by

                        Did notice this from both panics though.

                        Panic from old P4 computer:

                        Kernel page fault with the following non-sleepable locks held:
                        exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                        

                        Panic from Soekris board:

                        Kernel page fault with the following non-sleepable locks held:
                        exclusive sleep mutex vr1 (network driver) r = 0 (0xc3640aec) locked @ /usr/pfSensesrc/src/sys/dev/vr/if_vr.c:1675
                        

                        Is it just coincidence?
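
One way to look at that question is to pull the two lock lines apart and see which fields actually match. The Python below is a sketch under assumptions: `LOCK_LINE`, `parse_lock`, and `shared_fields` are hypothetical names, and the pattern only covers the "exclusive sleep mutex" form seen in these two reports.

```python
import re

# Matches the lock line printed under "non-sleepable locks held".
# Only the "exclusive sleep mutex" form from the two reports is handled.
LOCK_LINE = re.compile(
    r"exclusive sleep mutex (?P<name>\S+) \((?P<desc>[^)]*)\) "
    r"r = (?P<r>\d+) \((?P<addr>0x[0-9a-f]+)\) "
    r"locked @ (?P<file>\S+):(?P<line>\d+)"
)

P4 = ("exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) "
      "locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350")
SOEKRIS = ("exclusive sleep mutex vr1 (network driver) r = 0 (0xc3640aec) "
           "locked @ /usr/pfSensesrc/src/sys/dev/vr/if_vr.c:1675")


def parse_lock(text):
    """Return the lock fields as a dict, or None if the line does not match."""
    m = LOCK_LINE.search(text)
    if m is None:
        return None
    info = m.groupdict()
    info["r"] = int(info["r"])
    info["line"] = int(info["line"])
    return info


def shared_fields(a, b):
    """Names of the fields whose values are identical in both reports."""
    return sorted(k for k in a if a[k] == b[k])


if __name__ == "__main__":
    p4, soekris = parse_lock(P4), parse_lock(SOEKRIS)
    print("P4:     ", p4)
    print("Soekris:", soekris)
    print("identical fields:", shared_fields(p4, soekris))
```

On these two lines the only identical field is `r`; the lock name, address, source file, and line number all differ. What the reports have in common is the lock type and the fact that each lock was taken inside a network driver (`if_lem.c` and `if_vr.c`), which by itself does not show whether the two panics share a cause.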

                        • D
                          disa
                          last edited by

                          Hi, I've reinstalled the secondary machine, it's now running 2.0-BETA5 (amd64) built on Mon Jan 17 22:14:04 EST 2011 (the primary has 2.0-BETA5 (amd64) built on Fri Jan 21 23:51:34 EST 2011).

                          I disabled the sync, created all the remaining carp vips on the primary (I've 12 of them right now), and re-enabled the sync. The secondary didn't crash.

                          What shall I do now? I'm a bit scared of upgrading it :-) thanks

                          • E
                            eri--
                            last edited by

                            Just wait for the late build from today and it should be safe to upgrade.

                            • D
                              disa
                              last edited by

                              OK, and which kernel should I be running, SMP or devel? Thanks.

                              • V
                                vito
                                last edited by

                                as of snap
                                2.0-BETA5 (i386)
                                built on Mon Jan 24 18:48:13 EST 2011

                                fix did not work.

                                Maybe fix was not in this build?

                                • jimpJ
                                  jimp Rebel Alliance Developer Netgate
                                  last edited by

                                  @vito:

                                  as of snap
                                  2.0-BETA5 (i386)
                                  built on Mon Jan 24 18:48:13 EST 2011

                                  fix did not work.

                                  Maybe fix was not in this build?

                                  It should have been.

                                  So it still crashed the exact same way, with the same panic in the same place?

                                  • jimpJ
                                    jimp Rebel Alliance Developer Netgate
                                    last edited by

                                    @LostInIgnorance:

                                    Did notice this from both panics though.

                                    Panic from old P4 computer:

                                    Kernel page fault with the following non-sleepable locks held:
                                    exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                                    

                                    Panic from Soekris board:

                                    Kernel page fault with the following non-sleepable locks held:
                                    exclusive sleep mutex vr1 (network driver) r = 0 (0xc3640aec) locked @ /usr/pfSensesrc/src/sys/dev/vr/if_vr.c:1675
                                    

                                    Is it just coincidence?

                                    Hard to say for sure.

                                    I just set up my ALIX, opened a ton of browser tabs, and pushed a bunch of traffic through it. I managed to crash Chrome but not the router… I wish I could reproduce this.

                                    • V
                                      vito
                                      last edited by

                                      @jimp:

                                      @vito:

                                      as of snap
                                      2.0-BETA5 (i386)
                                      built on Mon Jan 24 18:48:13 EST 2011

                                      fix did not work.

                                      Maybe fix was not in this build?

                                      It should have been.

                                      So it still crashed the exact same way, with the same panic in the same place?

                                      Here is the new screenshot of the panic.
                                      Same symptoms, and only when using OpenVPN.

                                      photo.JPG
                                      • L
                                        LostInIgnorance
                                        last edited by

                                        The key is that it only happens over OpenVPN. Connect to OpenVPN, and after a successful connection open a browser. The page should time out and you should get disconnected from the OpenVPN connection.

                                        • jimpJ
                                          jimp Rebel Alliance Developer Netgate
                                          last edited by

                                          I had ~10 browser tabs open to all kinds of sites, streaming a youtube video, and was copying a file over SMB, all over OpenVPN from a client on the WAN side of my ALIX.

                                          Perhaps it's a bug specific to the vr chip in the Soekris.

                                          • L
                                            LostInIgnorance
                                            last edited by

                                            If it is an issue specific to the vr chip, then why am I getting the same type of panic with the em (as I posted above)?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.