Netgate Discussion Forum

    Help with a crash dump

    General pfSense Questions
    • nik.taylor

      Thanks for getting back so quickly.

      It was running fine for about 4 or 5 months. It started having issues a few months ago. I can't pinpoint the exact date unfortunately. No changes to hardware apart from the new drive I just installed. I have been keeping up with releases. No new packages installed recently. I only have Cron, nut, openvpn-client-export installed.

      • nik.taylor

        Is there any way I can dig into this further? It's happened twice since I posted this.

        Thanks.

        • stephenw10 Netgate Administrator

          What hardware are you using?

          Are you running anything unusual in the config?

          Attempting to replicate that in FreeBSD 11.2 is always a good step. It would show whether it's something pfSense is doing or something in base.

          Steve

          • nik.taylor

            Hardware:

            • ZOTAC C Series ZBOX CI327 NANO, Palm-Sized Passive Cooled Mini PC, Intel N3450 Quad-Core CPU, Intel HD Graphics 500, ZBOX-CI327NANO-U

            • G.SKILL Ripjaws Series 4GB 204-Pin DDR3 SO-DIMM DDR3 1866 (PC3 14900) Laptop Memory Model F3-1866C11S-4GRSL

            • Crucial BX500 120GB 3D NAND SATA 2.5-Inch Internal SSD - CT120BX500SSD1Z

            Nothing unusual in config. Can send it over if needed. I only have Cron, nut, openvpn-client-export installed as add ins.

            How do I replicate in FreeBSD other than installing on the hardware and letting it run for a few weeks?

            • stephenw10 Netgate Administrator

              Hmm. Well I'd disable Nut as a test just because it's the only thing doing anything active.

              Pretty sure there are others running that box without issue, so I'd guess it's either a config issue or some bad component, assuming you have not added anything like a wifi card etc.

              • nik.taylor

                I'll disable nut and report back.

                Nothing unusual in the config. Not sure how I could tell if there is?

                No other components added and no additional cards / hardware.

                • nik.taylor @nik.taylor

                  I disabled nut and it crashed again this week.

                  Anything else I can do?

                  Thanks.

                  • chrismacmahon

                    Are the crashes all the same with this error:

                    db:0:kdb.enter.default>  bt
                    Tracing pid 12 tid 100026 td 0xfffff8000396b620
                    pfslowtimo() at pfslowtimo+0x52/frame 0xfffffe010e6e6810
                    softclock_call_cc() at softclock_call_cc+0x13a/frame 0xfffffe010e6e68c0
                    

                    Or is it different?
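
One quick way to answer that across several dumps is to pull the first frame that follows each `Tracing pid` line out of the `ddb.txt` text and compare the names. This is only a minimal sketch, assuming the backtrace layout shown in this thread; `top_frame` is a hypothetical helper name, not a pfSense or FreeBSD tool.

```shell
#!/bin/sh
# top_frame: print the function name of the first frame that follows a
# "Tracing pid ..." line in ddb backtrace text read from stdin.
# Hypothetical helper for illustration; it only parses text.
top_frame() {
    awk '
        /^[ \t]*Tracing pid/ { want = 1; next }
        want && /\(\) at / {
            sub(/^[ \t]*/, "")   # drop any leading indentation
            sub(/\(\).*/, "")    # keep only the function name
            print
            want = 0
        }
    '
}

# Example (the file name is an assumption):
#   top_frame < ddb.txt
```

Run against the two traces quoted in this thread, it would report `pfslowtimo` for one and `ipport_tick` for the other, both of which are called from `softclock_call_cc()`.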

                    Need help fast? Our support is available 24/7 https://www.netgate.com/support/

                    Do Not PM For Help!

                    • nik.taylor

                      This is the latest error I received:

                      db:0:kdb.enter.default>  bt
                      Tracing pid 12 tid 100026 td 0xfffff8000397d620
                      ipport_tick() at ipport_tick+0x4e/frame 0xfffffe010e6e6810
                      softclock_call_cc() at softclock_call_cc+0x13a/frame 0xfffffe010e6e68c0
                      softclock() at softclock+0x79/frame 0xfffffe010e6e68e0
                      
                      • chrismacmahon

                        It looks like that's hardware. I would potentially look at changing the on-board battery to see if that helps, but I highly doubt it would.

                        • Warden

                          Hi,

                          I'm experiencing the same kind of issue: my pfSense box has been crashing daily for a couple of months. I also replaced the SSD, with the same result.

                          Looking at the logs I see the same kind of details as discussed above:

                          Fatal trap 12: page fault while in kernel mode
                          cpuid = 3; apic id = 06
                          fault virtual address	= 0xc46b3dd0
                          fault code		= supervisor read data, page not present
                          instruction pointer	= 0x20:0xffffffff80d89866
                          stack pointer	        = 0x28:0xfffffe01188d6688
                          frame pointer	        = 0x28:0xfffffe01188d6688
                          code segment		= base 0x0, limit 0xfffff, type 0x1b
                          			= DPL 0, pres 1, long 1, def32 0, gran 1
                          processor eflags	= interrupt enabled, resume, IOPL = 0
                          current process		= 55216 (darkstat)
                          version.txt:
                          FreeBSD 11.2-RELEASE-p6 #3 518496b29ae(RELENG_2_4_4): Wed Dec 12 07:41:44 EST 2018
                              root@buildbot2.nyi.netgate.com:/build/ce-crossbuild-244/obj/amd64/ZfGpH5cd/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/sys/pfSense
                          Filename: /var/crash/textdump.tar.11
                          ddb.txt:
                          db:0:kdb.enter.default>  run lockinfo
                          db:1:lockinfo> show locks
                          No such command; use "help" to list available commands
                          db:1:lockinfo>  show alllocks
                          No such command; use "help" to list available commands
                          db:1:lockinfo>  show lockedvnods
                          Locked vnodes
                          db:0:kdb.enter.default>  show pcpu
                          cpuid        = 2
                          dynamic pcpu = 0xfffffe018f873480
                          curthread    = 0xfffff80003dd0000: pid 12 "irq259: re0"
                          curpcb       = 0xfffffe0118646cc0
                          fpcurthread  = none
                          idlethread   = 0xfffff80003939000: tid 100005 "idle: cpu2"
                          curpmap      = 0xffffffff82b83898
                          tssp         = 0xffffffff82bb47e0
                          commontssp   = 0xffffffff82bb47e0
                          rsp0         = 0xfffffe0118646cc0
                          gs32p        = 0xffffffff82bbb038
                          ldt          = 0xffffffff82bbb078
                          tss          = 0xffffffff82bbb068
                          db:0:kdb.enter.default>  bt
                          Tracing pid 12 tid 100057 td 0xfffff80003dd0000
                          turnstile_broadcast() at turnstile_broadcast+0x47/frame 0xfffffe0118646050
                          __mtx_unlock_sleep() at __mtx_unlock_sleep+0xb9/frame 0xfffffe0118646080
                          pf_state_insert() at pf_state_insert+0xb33/frame 0xfffffe0118646110
                          pf_test_rule() at pf_test_rule+0x2c7c/frame 0xfffffe01186465a0
                          pf_test() at pf_test+0x20e9/frame 0xfffffe0118646800
                          pf_check_in() at pf_check_in+0x1d/frame 0xfffffe0118646820
                          pfil_run_hooks() at pfil_run_hooks+0x90/frame 0xfffffe01186468b0
                          ip_input() at ip_input+0x441/frame 0xfffffe0118646910
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0118646960
                          ether_demux() at ether_demux+0x173/frame 0xfffffe0118646990
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe01186469f0
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0118646a40
                          ether_input() at ether_input+0x26/frame 0xfffffe0118646a60
                          re_rxeof() at re_rxeof+0x601/frame 0xfffffe0118646ad0
                          re_intr_msi() at re_intr_msi+0xfc/frame 0xfffffe0118646b20
                          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0118646b60
                          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0118646bb0
                          fork_exit() at fork_exit+0x83/frame 0xfffffe0118646bf0
                          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0118646bf0
                          --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
                          

                          I also attached the latest crash report to this post: https://forum.netgate.com/post/819822
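
The `ustar` headers visible in the paste above suggest the crash report is a tar archive (the log names it `/var/crash/textdump.tar.11`), with `version.txt` and `ddb.txt` as members. Reading one member on its own avoids the binary header debris. A minimal sketch, assuming a tar that supports `-O` (both bsdtar and GNU tar do); `textdump_member` is a hypothetical helper name.

```shell
#!/bin/sh
# textdump_member: print one file from a textdump tar archive to stdout.
#   $1 = path to the archive, e.g. /var/crash/textdump.tar.11
#   $2 = member name, e.g. ddb.txt or version.txt
# Hypothetical helper; it only wraps tar's extract-to-stdout mode.
textdump_member() {
    tar -xOf "$1" "$2"
}

# Example on the firewall:
#   tar -tf /var/crash/textdump.tar.11                        # list the members
#   textdump_member /var/crash/textdump.tar.11 ddb.txt | less
```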

                          Thanks

                          • chrismacmahon @Warden

                            @warden said in Help with a crash dump:

                            Your crash is very different from the other one. Can you open a new thread?

                            • nik.taylor

                              It's definitely nut. I uninstalled and there were no crashes. I re-installed and it's started crashing. Is this an integration issue you can look at or should I contact the nut team?

                              Thanks.

                              • stephenw10 Netgate Administrator

                                Yet it still crashed with Nut installed but disabled previously?

                                Were you able to replicate that? It seems hard to imagine that could happen if it really was disabled.

                                If it's a problem with the nut binaries in FreeBSD, that would need to be reported upstream, but there must be a lot of people running that in FreeBSD.

                                Do you have any additional crash reports? Anything showing the NUT package specifically?

                                Steve

                                • nik.taylor

                                  I'm pretty sure it did. I'm going to disable it again and see if I get a crash dump with nut installed but disabled.

                                  Here is the latest crash:

                                  0_1551543086047_nut dump.txt

                                  • stephenw10 Netgate Administrator

                                    Mmm, well identical crash then. Implies probably software at least.

                                    • nik.taylor

                                      Bumping this thread back up. I've continued to have this problem. I disabled nut for a few months and it didn't go away. I'm seeing crashes about once or twice a week still. Any next debugging steps?

                                      Latest crash dump attached.

                                      Thanks in advance.

                                      crash_dump.txt

                                      • stephenw10 Netgate Administrator

                                        Hmm, well that is three almost identical crashes:

                                        hardclock_cnt() at hardclock_cnt+0x131/frame 0xfffffe010e4d44e0
                                        handleevents() at handleevents+0xc9/frame 0xfffffe010e4d4530
                                        timercb() at timercb+0xad/frame 0xfffffe010e4d4580
                                        lapic_handle_timer() at lapic_handle_timer+0xa2/frame 0xfffffe010e4d45c0
                                        Xtimerint() at Xtimerint+0xa8/frame 0xfffffe010e4d45c0
                                        

                                        I've got to think it's some issue with the system clock being used on that system.

                                        I see it's loading the speedstep driver (est). Is powerd enabled? You might try disabling it if so. It's been a while since I've seen one, but some systems have issues with varying the CPU clock that would throw errors.

                                        You could usually work past that by selecting a non-variable system timer instead.
                                        For example:

                                        [2.5.0-DEVELOPMENT][admin@apu.stevew.lan]/root: sysctl kern.timecounter.choice
                                        kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC(800) dummy(-1000000)
                                        [2.5.0-DEVELOPMENT][admin@apu.stevew.lan]/root: sysctl kern.timecounter.hardware
                                        kern.timecounter.hardware: HPET
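
As a rough sketch of that step: the `kern.timecounter.choice` line lists each available timecounter with its quality in parentheses, and `kern.timecounter.hardware` shows the one in use. The helper below only sorts the choices by quality so the alternatives are easy to see; it parses text and changes nothing. The function name `rank_timecounters` is hypothetical, and the commented `sysctl` assignment assumes a root shell on the firewall.

```shell
#!/bin/sh
# rank_timecounters: read a "kern.timecounter.choice: ..." line on stdin
# and print "quality name" pairs, highest quality first.
# Hypothetical helper for illustration; it only parses text.
rank_timecounters() {
    sed -e 's/^kern\.timecounter\.choice: *//' |
        tr ' ' '\n' |
        sed -n 's/^\(.*\)(\(-\{0,1\}[0-9][0-9]*\))$/\2 \1/p' |
        sort -rn
}

# On the firewall itself:
#   sysctl kern.timecounter.choice | rank_timecounters
#   sysctl kern.timecounter.hardware=HPET   # runtime change only
```

A value set this way does not survive a reboot, so it is a low-risk way to try a different counter before making the change permanent.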
                                        

                                        Steve

                                        • nik.taylor

                                          @stephenw10 said in Help with a crash dump:

                                          sysctl kern.timecounter.hardware

                                          Thanks very much. powerd is not running.

                                          I changed to HPET and will see what happens. I have to be honest, I know next to nothing about system timers so this is a stab in the dark for me. Will report back if anything happens.

                                          • nik.taylor

                                            HPET didn't work. My system froze with 're1 watchdog timeout' within about 5 minutes. Rebooted, reset HPET, and the same thing happened.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.