Netgate Discussion Forum

    Crash report

    2.2.5 Snapshot Feedback and Issues
    • dennypage

      I had a crash this morning on my SG-4860 running with

      pfSense-Full-Update-2.2.5-DEVELOPMENT-amd64-20151018-0257.tgz

      Crash report submitted via web interface. Please let me know if you need more detail.

      [Chris, in case you are going to ask, this was stock SU+J]

      • jimp Rebel Alliance Developer Netgate

        We would need to know the IP address it was submitted from. Looking at the IP address you logged into the forum from, I see a crash report submitted yesterday from a nearby address by a system running 2.2.5, so I suppose that might be it. The IP address ended in .73.

        Looks like something crashed in unbound somehow:

        Backtrace:

        db:0:kdb.enter.default>  show pcpu
        cpuid        = 0
        dynamic pcpu = 0x63a600
        curthread    = 0xfffff80100852920: pid 80566 "unbound"
        curpcb       = 0xfffffe006441dcc0
        fpcurthread  = 0xfffff80100852920: pid 80566 "unbound"
        idlethread   = 0xfffff80003390000: tid 100003 "idle: cpu0"
        curpmap      = 0xfffff80126edd9f8
        tssp         = 0xffffffff8219d190
        commontssp   = 0xffffffff8219d190
        rsp0         = 0xfffffe006441dcc0
        gs32p        = 0xffffffff8219ebe8
        ldt          = 0xffffffff8219ec28
        tss          = 0xffffffff8219ec18
        db:0:kdb.enter.default>  bt
        Tracing pid 80566 tid 100156 td 0xfffff80100852920
        done_store_dr() at done_store_dr+0x21/frame 0xfffffe006441daf0
        mi_switch() at mi_switch+0xe1/frame 0xfffffe006441db30
        critical_exit() at critical_exit+0x7a/frame 0xfffffe006441db50
        intr_event_handle() at intr_event_handle+0x106/frame 0xfffffe006441dba0
        intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfffffe006441dbd0
        lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfffffe006441dbf0
        Xapic_isr1() at Xapic_isr1+0xa4/frame 0xfffffe006441dbf0
        --- interrupt, rip = 0x4354e4, rsp = 0x7fffffffebb0, rbp = 0x7fffffffebc0 ---
        
        

        End of the message buffer:

        kernel trap 12 with interrupts disabled
        
        Fatal trap 12: page fault while in kernel mode
        cpuid = 0; apic id = 00
        fault virtual address	= 0xfffffe006443bfff
        fault code		= supervisor write data, page not present
        instruction pointer	= 0x20:0xffffffff80f34434
        stack pointer	        = 0x28:0xfffffe006441da80
        frame pointer	        = 0x28:0xfffffe006441daf0
        code segment		= base 0x0, limit 0xfffff, type 0x1b
        			= DPL 0, pres 1, long 1, def32 0, gran 1
        processor eflags	= resume, IOPL = 0
        current process		= 80566 (unbound)
        
        

        That's a pretty deep area for it to have crashed. Unless it crashes repeatedly in the exact same spot, I might be inclined to distrust the hardware at the moment.
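
For anyone who wants to check whether successive reports really hit "the exact same spot", the comparison could be sketched roughly as follows. This is only an illustration, not a Netgate or pfSense tool: the names `crash_signature`, `same_spot`, `FIELDS`, and `SAMPLE` are hypothetical, and the script assumes the "Fatal trap" text has been saved in the `label = value` layout shown in the message buffer above.

```python
# Labels as printed in the "Fatal trap" message quoted above.
FIELDS = ("fault virtual address", "fault code",
          "instruction pointer", "current process")


def crash_signature(message_buffer):
    """Return the identifying fields of a 'Fatal trap' message as a dict."""
    signature = {}
    for line in message_buffer.splitlines():
        # Split on the first "=" only; lines without one are skipped.
        label, sep, value = line.partition("=")
        label = label.strip()
        if sep and label in FIELDS:
            signature[label] = value.strip()
    return signature


def same_spot(first, second):
    """True when both reports faulted at the same instruction pointer."""
    ip = first.get("instruction pointer")
    return ip is not None and ip == second.get("instruction pointer")


# Excerpt of the message buffer from this thread, used as sample input.
SAMPLE = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address\t= 0xfffffe006443bfff
fault code\t\t= supervisor write data, page not present
instruction pointer\t= 0x20:0xffffffff80f34434
current process\t\t= 80566 (unbound)
"""

if __name__ == "__main__":
    print(crash_signature(SAMPLE))
```

Comparing instruction pointers is only meaningful between reports from the same kernel build, since addresses can move between snapshots. The "current process" field is kept in the signature so that unbound and php-fpm crashes can be told apart at a glance.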

        • dennypage

          I just sent in another. Same place. I believe I've had the issue previously with 2.2.2 or 2.2.3. If you look back, you should find previous crash reports, either from .73 or from .78. All Unbound-related, I believe.

          I "fixed" the issue previously by turning off DHCP registration. With DHCP registration disabled, Unbound has been fairly stable for me. One crash (spontaneous exit) per month maybe, but no system crashes.

          I've been testing 2.2.5 for a few weeks, and it's been very stable for me aside from an install problem that I've been talking with Chris about. I just turned DHCP registration back on as part of 2.2.5 testing about 3 days ago. In those 3 days, I've had 2 system crashes.

          • dennypage

            I had another this morning. In php-fpm this time, but still at the point of a lease update.

            If you want to swap out the hardware, I'm okay with that. However, before doing that, I think you probably want to have a close look at some of the earlier crash reports I submitted. The first ones should show an SG-2440 rather than the current SG-4860.

            • dennypage

              I just sent in another, again with Unbound.

              Unfortunately, this one hit in the middle of an upgrade and left the system unbootable. It required a reinstall.

              • dennypage

                Another in the middle of an update. Unbound again.

                Given that no one else seems to see these problems, maybe it is a hardware issue.

                Do you guys want to swap it out?

                @jimp:

                That's a pretty deep area for it to have crashed. Unless it crashes repeatedly in the exact same spot, I might be inclined to distrust the hardware at the moment.

                • heper

                  It might be better to ask on the portal.

                  • cwagz

                    I have had two crashes on 2.2.5 in the last few days. I never had a problem before with my equipment. My IP should be the same as what is logged on this post and ends in .161.

                    I did recently upgrade my FiOS to 150 / 150.  So my WAN port is now connected via gigabit.  Let me know if you need any more information.

                    • dennypage

                      It looks like my crashes may have been the result of an issue with hardware crypto acceleration. At cmb's suggestion, I've disabled aesni and haven't had a crash since. Of course, your mileage may vary.

                      • cwagz

                        I just turned AES-NI off and will see what happens.  Thanks for the information.

                        • cmb

                          @cwagz:

                          I just turned AES-NI off and will see what happens.  Thanks for the information.

                          I found a couple of crash reports submitted from the same IP you're visiting the forum from, and it's not likely that AES-NI is the cause in your case. There have been known AES-NI panics related to FPU in all versions, which the vast majority never hit, but some routinely hit. It's something we're pursuing upstream and expect to have resolved in 2.3. Disabling it is something to try, but I don't expect it'll have any impact for you.

                          Your crash looks nothing at all like those (nor any others I can recall offhand), and the two different crashes aren't even similar to each other. Most often when you're getting crashes with that frequency, and they're not the same or at least similar, the root cause is a hardware problem. Both those were memory corruption related, which could still be a software problem.

                          If you're continuing to get crashes, keep submitting the crash reports, and start a new thread since this is not the same as the original issue here, and I'll check them and suggest how to proceed from there.
