Netgate Discussion Forum

    Kernel Panic

    2.0-RC Snapshot Feedback and Problems - RETIRED
    325 Posts 35 Posters 247.5k Views

    • vito

      Yep, I saw the post and checked /var/crash.
      Nothing in the folder.

      Will the build on the snap server be different from the one you posted?

      • jimp Rebel Alliance Developer Netgate

        In theory it should be about the same, but in practice that isn't always the case.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • clahti

          quick question, I am in the freeze/hang camp.  I don't get a kernel panic on the console, so when I update to the next snapshot should I expect a dump in /var/crash when the system hangs?

          • jimp Rebel Alliance Developer Netgate

            No. The only way out of a hang is a power cycle, and there is no crash dump from a hang, only from a panic. (This thread is really for the panics… the hangs have a separate thread.)

            • clahti

              ya that's what I thought, thanks.

              • Slaygon

                Today one of the firewalls (running "2.0-BETA5 (amd64) built on Thu Jan 27 19:46:43 EST 2011") managed to write something to /var/crash/ before it died.

                See attached files.

                ddb.txt
                msgbuf.txt

                • eri--

                  That seems like another panic, related to carp and your link going up and down.
                  So it seems like progress, at least for those who do not have intensive carp clusters but just hangs.
                  I will get back when I have more info on this new issue.

                  • Slaygon

                    @ermal:

                    That seems like another panic, related to carp and your link going up and down.
                    So it seems like progress, at least for those who do not have intensive carp clusters but just hangs.
                    I will get back when I have more info on this new issue.

                    I guess that would explain why the backup spontaneously becomes the master every now and then?
                    I get this on the backup:
                    BACKUP -> MASTER (preempting a slower master)

                    and this on the master:
                    MASTER -> BACKUP (more frequent advertisement received)

                    Adjusting advskew probably won't help here, I guess?
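
One rough way to see how often the pair flaps is to count the state-change messages in a saved copy of the system log. The following is only a sketch: `count_transitions` is a hypothetical helper, not part of pfSense, and it assumes the log lines contain the messages in the same form as the two quoted above.

```python
import re
from collections import Counter

# Matches CARP state changes in the form quoted above, e.g.
# "BACKUP -> MASTER (preempting a slower master)".
# Assumption: the states appear as the literal words MASTER, BACKUP or INIT.
TRANSITION = re.compile(r"\b(MASTER|BACKUP|INIT)\s*->\s*(MASTER|BACKUP|INIT)\b")


def count_transitions(lines):
    """Return a Counter keyed by (old_state, new_state)."""
    counts = Counter()
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


# Sample input built from the two messages quoted in this post.
sample = [
    "BACKUP -> MASTER (preempting a slower master)",
    "MASTER -> BACKUP (more frequent advertisement received)",
    "BACKUP -> MASTER (preempting a slower master)",
]
counts = count_transitions(sample)
for (old, new), n in sorted(counts.items()):
    print(f"{old} -> {new}: {n}")
```

Running it against the log from each node before and after a snapshot update would give a crude before/after comparison of how often the roles swap.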

                    • eri--

                      I have pushed some patches that should solve the carp panic as well, Slaygon.

                      So the newest snapshots should have the fix for you.
                      I would skip the next one coming and get the other one.

                      • vito

                        Just tried snap
                        2.0-BETA5 (i386)
                        built on Fri Jan 28 05:30:15 EST 2011

                        and it crashed. :)

                        The only one I had any luck with is the first kernel that was posted.

                        Just so I am clear… do the snaps have the correct kernel, or do I still need to download it from the separate link?

                        photo.JPG

                        • Slaygon

                          @ermal:

                          I have pushed some patches that should solve the carp panic as well, Slaygon.

                          So the newest snapshots should have the fix for you.
                          I would skip the next one coming and get the other one.

                          So the "Fri Jan 28 05:30:15 EST 2011" build is not the one you recommend for the carp fixes, and I should wait for the next one instead?
                          I'm currently on "Fri Jan 28 00:53:50 EST 2011".

                          Oh, and my backup just became the master spontaneously again.

                          Excellent work btw. Much appreciated!

                          • jimp Rebel Alliance Developer Netgate

                            Yes, wait for the next one. The build was just restarted to pick up the patches he pushed.

                            • eri--

                              @vito please type bt at that prompt next time.

                              • LostInIgnorance

                                Loaded 2.0-BETA5 (i386) built on Fri Jan 28 05:30:15 EST 2011, running on a Dell with an Intel(R) Pentium(R) 4 CPU 2.40GHz and an onboard gigabit NIC (em).

                                
                                Enter an option: Kernel page fault with the following non-sleepable locks held:
                                exclusive sleep mutex em0 (EM TX Lock) r = 0 (0xc2f52580) locked @ /usr/pfSensesrc/src/sys/dev/e1000/if_lem.c:1350
                                KDB: stack backtrace:
                                db_trace_self_wrapper(c0eb7ccb,ccc3ca90,c0a421c5,546,0,...) at db_trace_self_wrapper+0x26
                                kdb_backtrace(546,0,ffffffff,c145df04,ccc3cac8,...) at kdb_backtrace+0x29
                                _witness_debugger(c0eba1e3,ccc3cadc,4,1,0,...) at _witness_debugger+0x25
                                witness_warn(5,0,c0ef8592,2d05a8c0,c131be40,...) at witness_warn+0x20d
                                trap(ccc3cb68) at trap+0x19e
                                calltrap() at calltrap+0x6
                                --- trap 0xc, eip = 0xc0a61478, esp = 0xccc3cba8, ebp = 0xccc3cbb8 ---
                                m_tag_delete(c2fef600,dedeadc0,c2fef600,c2fef600,ccc3cbf0,...) at m_tag_delete+0x48
                                m_tag_delete_chain(c2fef600,0,c0e6f12d,0,c2ed9720,...) at m_tag_delete_chain+0x3f
                                mb_dtor_mbuf(c2fef600,100,0,c0a42c18,df,...) at mb_dtor_mbuf+0x35
                                uma_zfree_arg(c1d7e380,c2fef600,0,72,ccc3cc84,...) at uma_zfree_arg+0x29
                                m_freem(c2fef600,4,c0e6f12d,b87,c2f4e000,...) at m_freem+0x43
                                lem_txeof(c2f52580,0,c0e6f12d,546,c2f525bc,...) at lem_txeof+0x158
                                lem_handle_rxtx(c2f4e000,1,c0eb959c,4f,c2edb8d8,...) at lem_handle_rxtx+0x60
                                taskqueue_run(c2edb8c0,c2edb8d8,c0ea6955,0,c0eb2bfb,...) at taskqueue_run+0x103
                                taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaf76a,344,c131be40,...) at taskqueue_thread_loop+0x68
                                fork_exit(c0a3b440,c2f525ec,ccc3cd38) at fork_exit+0xb8
                                fork_trampoline() at fork_trampoline+0x8
                                --- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
                                
                                Fatal trap 12: page fault while in kernel mode
                                cpuid = 0; apic id = 00
                                fault virtual address   = 0xdedeadc0
                                fault code              = supervisor read, page not present
                                instruction pointer     = 0x20:0xc0a61478
                                stack pointer           = 0x28:0xccc3cba8
                                frame pointer           = 0x28:0xccc3cbb8
                                code segment            = base 0x0, limit 0xfffff, type 0x1b
                                                        = DPL 0, pres 1, def32 1, gran 1
                                processor eflags        = interrupt enabled, resume, IOPL = 0
                                current process         = 0 (em0 taskq)
                                Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
                                db> bt
                                Tracing pid 0 tid 64050 td 0xc2f4bc80
                                m_tag_delete(c2fef600,dedeadc0,c2fef600,c2fef600,ccc3cbf0,...) at m_tag_delete+0x48
                                m_tag_delete_chain(c2fef600,0,c0e6f12d,0,c2ed9720,...) at m_tag_delete_chain+0x3f
                                mb_dtor_mbuf(c2fef600,100,0,c0a42c18,df,...) at mb_dtor_mbuf+0x35
                                uma_zfree_arg(c1d7e380,c2fef600,0,72,ccc3cc84,...) at uma_zfree_arg+0x29
                                m_freem(c2fef600,4,c0e6f12d,b87,c2f4e000,...) at m_freem+0x43
                                lem_txeof(c2f52580,0,c0e6f12d,546,c2f525bc,...) at lem_txeof+0x158
                                lem_handle_rxtx(c2f4e000,1,c0eb959c,4f,c2edb8d8,...) at lem_handle_rxtx+0x60
                                taskqueue_run(c2edb8c0,c2edb8d8,c0ea6955,0,c0eb2bfb,...) at taskqueue_run+0x103
                                taskqueue_thread_loop(c2f525ec,ccc3cd38,c0eaf76a,344,c131be40,...) at taskqueue_thread_loop+0x68
                                fork_exit(c0a3b440,c2f525ec,ccc3cd38) at fork_exit+0xb8
                                fork_trampoline() at fork_trampoline+0x8
                                --- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
                                db>

                                This happened instantly when I tried to connect in over OpenVPN.
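
For anyone comparing traces across reports, it can help to reduce a backtrace like the one above to its call chain. The sketch below is a hypothetical helper, not a pfSense or FreeBSD tool; it assumes frame lines end in the `at name+0xoffset` form shown in this post and ignores everything else.

```python
import re

# A frame line in the trace above ends like "... at m_freem+0x43".
# Lines such as "--- trap 0, ... ---" or "Stopped at ...: movl ..." do not match.
FRAME = re.compile(r"\bat\s+([A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-fA-F]+\s*$")


def frame_names(trace_text):
    """Return the function names of the frames, innermost first."""
    names = []
    for line in trace_text.splitlines():
        match = FRAME.search(line)
        if match:
            names.append(match.group(1))
    return names


# A few lines copied from the trace in this post.
sample = """\
Stopped at      m_tag_delete+0x48:      movl    0(%ecx),%eax
m_tag_delete(c2fef600,dedeadc0,c2fef600,c2fef600,ccc3cbf0,...) at m_tag_delete+0x48
m_freem(c2fef600,4,c0e6f12d,b87,c2f4e000,...) at m_freem+0x43
lem_txeof(c2f52580,0,c0e6f12d,546,c2f525bc,...) at lem_txeof+0x158
--- trap 0, eip = 0, esp = 0xccc3cd70, ebp = 0 ---
"""
print(" <- ".join(frame_names(sample)))
```

Two reports whose chains both pass through `lem_txeof` and `m_freem` into `m_tag_delete` are probably worth comparing, although matching frames alone do not prove they are the same bug.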
                                
                                • vito

                                  it is a hard lock…can't type anything in.
                                  On reboot, still do not see the crash files that jimp added.

                                  • jimp Rebel Alliance Developer Netgate

                                    @vito:

                                    it is a hard lock…can't type anything in.
                                    On reboot, still do not see the crash files that jimp added.

                                    As I said before, those won't be created for hard locks, only for panics/crashes.

                                    • Slaygon

                                      Just seen this come out: "Fri Jan 28 13:06:21 EST 2011"…
                                      Is this the one where the carp sync problems are addressed?

                                      If so, it will take me a number of hours to get this applied and verified. 1-4 hours of active monitoring before the bug usually appears, however, I will not be able to start testing all too soon. Expecting about 2 to 8 hours before I can start monitoring.

                                      If anyone could start testing sooner, that would be great!

                                      (for those that don't know, I run two DL-type HP servers with quad Intel gbit NICs in them)

                                      And again, great work pf team! Really appreciated!

                                      • geewhz01

                                        If the carp patches were also pushed to the amd64 version, 2.0-BETA5 (amd64)
                                        built on Fri Jan 28 13:06:21 EST 2011, then it's not fixed for me.  As soon as I make any change and it syncs, the backup firewall locks or panics.  It's not a hard lock though; it does bring me to a prompt, but I am not seeing any crash files in /var/crash.

                                        Andy

                                        • Slaygon

                                          @geewhz01:

                                          If the carp patches were also pushed to the amd64 version, 2.0-BETA5 (amd64)
                                          built on Fri Jan 28 13:06:21 EST 2011, then it's not fixed for me.  As soon as I make any change and it syncs, the backup firewall locks or panics.  It's not a hard lock though; it does bring me to a prompt, but I am not seeing any crash files in /var/crash.

                                          Andy

                                          It's the amd64 version I am running too.
                                          Not too good news there, Andy. Sorry to hear.
                                          And usually, with the carp errors, there seems to be a freeze rather than a crash, which produces no crash dumps.
                                          Occasionally, though very rarely, I get a debug prompt at which I can get a backtrace. Mostly the box just hangs.

                                          I guess the devs would appreciate any input they can get hold of, so if you do get to a prompt and can type something in, give "bt" (as in backtrace) a go and see if you can provide any more info.

                                          Cheerio.

                                          • jimp Rebel Alliance Developer Netgate

                                            I just tried on my amd64 VM, and when I force a manual panic (sysctl debug.kdb.panic=1) I get a textdump and no db> prompt. I'm not sure why someone on a current snapshot would still be left at a db> prompt when it crashes; the system should gather the info and reboot on its own.

                                            That doesn't help with the hangs, though; the hangs are a different problem (and a different thread :-)
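
To confirm after a reboot whether anything was actually saved, the crash directory can be listed. The sketch below is a hypothetical helper, not something shipped with pfSense, and it assumes Python is available wherever it runs. The only path it knows about is `/var/crash`, as used throughout this thread; the file names in the usage note are the two attachments posted earlier.

```python
from pathlib import Path


def crash_files(crash_dir="/var/crash"):
    """Return the sorted names of regular files in crash_dir.

    An empty list means either the directory is missing or nothing was
    written, which is what a hang (as opposed to a panic) leaves behind.
    """
    path = Path(crash_dir)
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.iterdir() if p.is_file())
```

Called with no argument it inspects `/var/crash`. After a panic that produced a dump, it would list files such as the `ddb.txt` and `msgbuf.txt` attached earlier in the thread; after a hard lock it should return an empty list.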

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.