Netgate Discussion Forum

    pfSense crashes randomly - new setup

    General pfSense Questions

    • farizno

      I have a bit of an update and maybe some more questions:

      Running pciconf, I have determined that I am using the em driver:

      em0@pci0:3:0:0: class=0x020000 rev=0x06 hdr=0x00 vendor=0x8086 device=0x10bc subvendor=0x8086 subdevice=0x11bc
      vendor = 'Intel Corporation'
      device = '82571EB/82571GB Gigabit Ethernet Controller (Copper)'
      class = network
      subclass = ethernet

      I also read that the em and igb drivers were merged in the past. Could I be using the wrong driver? Do I need to use the igb driver? I don't know how to tell pfSense to use a different driver. Or does this all just mean that my NIC is not compatible?

      Thanks for the help.
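
For anyone who wants to check the attached driver in a script rather than by eye, here is a minimal sketch in Python. It assumes the header line has the same shape as the `em0@pci0:...` line quoted above; the function name `parse_pciconf_header` is made up for this example and is not part of pfSense or FreeBSD.

```python
import re

# First line of a pciconf entry, e.g.
# "em0@pci0:3:0:0: class=0x020000 ... vendor=0x8086 device=0x10bc ..."
# The driver name is everything before the trailing unit number.
HEADER = re.compile(
    r"^(?P<driver>[a-z][a-z0-9_]*?)(?P<unit>\d+)@pci\S*\s+(?P<fields>.*)$"
)


def parse_pciconf_header(line):
    """Return driver name, unit number and PCI IDs from one header line (hypothetical helper)."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    # The rest of the line is a series of key=value pairs.
    fields = dict(
        pair.split("=", 1) for pair in match.group("fields").split() if "=" in pair
    )
    return {
        "driver": match.group("driver"),
        "unit": int(match.group("unit")),
        "vendor": fields.get("vendor"),
        "device": fields.get("device"),
    }


line = (
    "em0@pci0:3:0:0: class=0x020000 rev=0x06 hdr=0x00 vendor=0x8086 "
    "device=0x10bc subvendor=0x8086 subdevice=0x11bc"
)
info = parse_pciconf_header(line)
print(info)
```

The part before the unit number is the driver that attached to the device, so `em0` means the em driver claimed the card.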

      • stephenw10 Netgate Administrator

        That's the correct driver.

        We need to see the backtrace from the ddb.txt file in the crash report to know more.

        So for example:

        db:0:kdb.enter.default>  show pcpu
        cpuid        = 0
        dynamic pcpu = 0x532100
        curthread    = 0xfffff800033a0000: pid 11 "idle: cpu0"
        curpcb       = 0xfffffe0059bc3cc0
        fpcurthread  = none
        idlethread   = 0xfffff800033a0000: tid 100003 "idle: cpu0"
        curpmap      = 0xffffffff820f89a0
        tssp         = 0xffffffff82113890
        commontssp   = 0xffffffff82113890
        rsp0         = 0xfffffe0059bc3cc0
        gs32p        = 0xffffffff821152e8
        ldt          = 0xffffffff82115328
        tss          = 0xffffffff82115318
        db:0:kdb.enter.default>  bt
        Tracing pid 11 tid 100003 td 0xfffff800033a0000
        callout_process() at callout_process+0x1a0/frame 0xfffffe0059bc38b0
        handleevents() at handleevents+0x18e/frame 0xfffffe0059bc3910
        timercb() at timercb+0x318/frame 0xfffffe0059bc3970
        lapic_handle_timer() at lapic_handle_timer+0x9c/frame 0xfffffe0059bc39a0
        Xtimerint() at Xtimerint+0x8c/frame 0xfffffe0059bc39a0
        --- interrupt, rip = 0xffffffff80f84316, rsp = 0xfffffe0059bc3a70, rbp = 0xfffffe0059bc3a70 ---
        acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfffffe0059bc3a70
        acpi_cpu_idle() at acpi_cpu_idle+0x15a/frame 0xfffffe0059bc3ac0
        cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe0059bc3ae0
        cpu_idle() at cpu_idle+0x90/frame 0xfffffe0059bc3b00
        sched_idletd() at sched_idletd+0x1d5/frame 0xfffffe0059bc3bb0
        fork_exit() at fork_exit+0x9a/frame 0xfffffe0059bc3bf0
        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0059bc3bf0
        --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
        db:0:kdb.enter.default>  ps
        

        Steve
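
If the ddb.txt file is long, the backtrace can be pulled out with a short script. This is a rough sketch that assumes the prompts look like the `db:0:kdb.enter.default>` lines in the example above; `extract_section` is a hypothetical helper name, and the sample text is abbreviated from that example.

```python
def extract_section(ddb_text, command):
    """Collect the output lines printed after `command` was run at a ddb prompt."""
    lines = []
    capturing = False
    for raw in ddb_text.splitlines():
        line = raw.strip()
        if line.startswith("db:") and ">" in line:
            # Prompt lines look like "db:0:kdb.enter.default>  bt".
            typed = line.split(">", 1)[1].strip()
            capturing = typed == command
            continue
        if capturing and line:
            lines.append(line)
    return lines


# Abbreviated sample in the same format as the example output above.
sample = """\
db:0:kdb.enter.default>  show pcpu
cpuid        = 0
db:0:kdb.enter.default>  bt
Tracing pid 11 tid 100003 td 0xfffff800033a0000
callout_process() at callout_process+0x1a0/frame 0xfffffe0059bc38b0
handleevents() at handleevents+0x18e/frame 0xfffffe0059bc3910
db:0:kdb.enter.default>  ps
"""

backtrace = extract_section(sample, "bt")
for frame in backtrace:
    print(frame)
```

The same helper works for other commands in the file, for example `extract_section(text, "show pcpu")`.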

        • farizno @stephenw10

          @stephenw10,

          Here is what is in my ddb.txt file. I had to use pastebin because it was too long: https://pastebin.com/vWni3cmP

          Thanks.

          • stephenw10 Netgate Administrator

            OK so:

            db:0:kdb.enter.default>  show pcpu
            cpuid        = 3
            dynamic pcpu = 0xfffffe008cd085c0
            curthread    = 0xfffffe000fdc93a0: pid 16 tid 100079 critnest 1 "usbus1"
            curpcb       = 0xfffffe000fdc98c0
            fpcurthread  = none
            idlethread   = 0xfffffe000fcbce40: tid 100006 "idle: cpu3"
            self         = 0xffffffff84013000
            curpmap      = 0xffffffff8303ef30
            tssp         = 0xffffffff84013384
            rsp0         = 0xfffffe007ca93000
            kcr3         = 0xffffffffffffffff
            ucr3         = 0xffffffffffffffff
            scr3         = 0x0
            gs32p        = 0xffffffff84013404
            ldt          = 0xffffffff84013444
            tss          = 0xffffffff84013434
            curvnet      = 0
            db:0:kdb.enter.default>  bt
            Tracing pid 16 tid 100079 td 0xfffffe000fdc93a0
            kdb_enter() at kdb_enter+0x32/frame 0xfffffe007ca92bb0
            vpanic() at vpanic+0x183/frame 0xfffffe007ca92c00
            panic() at panic+0x43/frame 0xfffffe007ca92c60
            _mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x67/frame 0xfffffe007ca92c70
            _mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe007ca92ce0
            cpu_new_callout() at cpu_new_callout+0x2a2/frame 0xfffffe007ca92d30
            callout_reset_sbt_on() at callout_reset_sbt_on+0x1a8/frame 0xfffffe007ca92d90
            sleepq_set_timeout_sbt() at sleepq_set_timeout_sbt+0xbd/frame 0xfffffe007ca92dd0
            _sleep() at _sleep+0x178/frame 0xfffffe007ca92e50
            pause_sbt() at pause_sbt+0xff/frame 0xfffffe007ca92e80
            usb_pause_mtx() at usb_pause_mtx+0x55/frame 0xfffffe007ca92eb0
            usb_process() at usb_process+0xd7/frame 0xfffffe007ca92ef0
            fork_exit() at fork_exit+0x7d/frame 0xfffffe007ca92f30
            fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe007ca92f30
            --- trap 0xd9738ee, rip = 0x85d0315689903142, rsp = 0x22e7aa0baea7aa0e, rbp = 0x9ed8862492988620 ---
            

            So it looks to be something USB related but I don't see any USB devices in your log other than controllers and hubs. Do you have any USB devices connected?
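
The USB hint comes from the `curthread` line in the `show pcpu` output, which names the thread that was running when the panic hit ("usbus1" here). A minimal sketch for reading that line out of a crash report, assuming the two formats seen in this thread (with and without a `tid` field); `current_thread` is a hypothetical helper name.

```python
import re

# Matches e.g.
#   curthread    = 0xfffffe000fdc93a0: pid 16 tid 100079 critnest 1 "usbus1"
#   curthread    = 0xfffff800033a0000: pid 11 "idle: cpu0"
CURTHREAD = re.compile(
    r'curthread\s*=\s*0x[0-9a-f]+:\s*pid\s+(?P<pid>\d+)'
    r'(?:\s+tid\s+(?P<tid>\d+))?.*?"(?P<name>[^"]*)"'
)


def current_thread(pcpu_text):
    """Return pid, tid (if present) and name of the thread running at panic time."""
    match = CURTHREAD.search(pcpu_text)
    if match is None:
        return None
    tid = match.group("tid")
    return {
        "pid": int(match.group("pid")),
        "tid": int(tid) if tid else None,
        "name": match.group("name"),
    }


pcpu = '''cpuid        = 3
curthread    = 0xfffffe000fdc93a0: pid 16 tid 100079 critnest 1 "usbus1"
idlethread   = 0xfffffe000fcbce40: tid 100006 "idle: cpu3"
'''
thread = current_thread(pcpu)
print(thread)
```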

            • farizno @stephenw10

              @stephenw10
              No, the only USB device plugged in is the keyboard. However, I tried running it completely headless, without the keyboard or monitor connected, with just the power and one cable on the LAN side connected to a laptop to access the GUI. There is no WAN connection, as I am trying to work out the bugs before I disconnect my current internet setup. Even with only the LAN and power cables, it still gets a panic error.

              Also, I just got 2 brand new RAM sticks today (2x8GB) and still have the problem, so I'm guessing it's not related to the RAM.

              • stephenw10 Netgate Administrator

                If it's a RAM issue the panics will be more random. Do all your crash reports show that same backtrace?

                It could be a driver issue with one of the USB controllers. You might try disabling the USB3 (xhci) controller in the BIOS if you can.
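
One way to check whether several crash reports share the same backtrace is to reduce each one to its list of function names and group the reports by that list. The sketch below is only an illustration: the report names (`crash-1` and so on) and the third report are made up, and the frames are shortened from the traces posted in this thread.

```python
import re

# A frame line looks like "vpanic() at vpanic+0x183/frame 0xfffffe007ca92c00".
FRAME = re.compile(r"^(\w+)\(\) at ")


def backtrace_signature(bt_lines):
    """Function names of a backtrace, innermost frame first, ignoring addresses."""
    names = []
    for line in bt_lines:
        match = FRAME.match(line.strip())
        if match:
            names.append(match.group(1))
    return tuple(names)


def group_by_signature(reports):
    """Map each distinct signature to the names of the reports that produced it."""
    groups = {}
    for name, bt_lines in reports.items():
        groups.setdefault(backtrace_signature(bt_lines), []).append(name)
    return groups


# Hypothetical report names; frames abbreviated for the example.
reports = {
    "crash-1": [
        "panic() at panic+0x43/frame 0xfffffe007ca92c60",
        "_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe007ca92ce0",
        "cpu_new_callout() at cpu_new_callout+0x2a2/frame 0xfffffe007ca92d30",
    ],
    "crash-2": [
        "panic() at panic+0x43/frame 0xfffffe001b7e4b20",
        "_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe001b7e4ba0",
        "handleevents() at handleevents+0x2cb/frame 0xfffffe001b7e4be0",
    ],
    "crash-3": [  # made-up repeat of crash-1 with different frame addresses
        "panic() at panic+0x43/frame 0xfffffe0011111c60",
        "_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe0011111ce0",
        "cpu_new_callout() at cpu_new_callout+0x2a2/frame 0xfffffe0011111d30",
    ],
}

groups = group_by_signature(reports)
for signature, names in groups.items():
    print(names, "->", " < ".join(signature))
```

Many reports landing in one group suggests a repeatable software or device path, while a different signature for nearly every report would fit the more random pattern expected from bad RAM.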

                • farizno @stephenw10

                  @stephenw10

                  It looks like this latest crash is different.

                  db:0:kdb.enter.default>  show registers
                  cs                        0x20
                  ds                        0x3b
                  es                        0x3b
                  fs                        0x13
                  gs                        0x1b
                  ss                        0x28
                  rax                       0x12
                  rcx                        0x1
                  rdx         0xfffffe001b7e4690
                  rbx                      0x100
                  rsp         0xfffffe001b7e4a70
                  rbp         0xfffffe001b7e4a70
                  rsi                       0x32
                  rdi         0xffffffff82d82918  vt_conswindow+0x10
                  r8                           0
                  r9                    0x1e6b00
                  r10         0xffffffff82d82908  vt_conswindow
                  r11                      0x15f
                  r12                          0
                  r13         0xfffffe001e255c80
                  r14         0xfffffe001b7e4b00
                  r15         0xfffffe001e2563a0
                  rip         0xffffffff80d43122  kdb_enter+0x32
                  rflags                    0x86
                  kdb_enter+0x32: movq    $0,0x2347ce3(%rip)
                  db:0:kdb.enter.default>  run lockinfo
                  db:1:lockinfo> show locks
                  No such command; use "help" to list available commands
                  db:1:lockinfo>  show alllocks
                  No such command; use "help" to list available commands
                  db:1:lockinfo>  show lockedvnods
                  Locked vnodes
                  db:0:kdb.enter.default>  show pcpu
                  cpuid        = 0
                  dynamic pcpu = 0x10865c0
                  curthread    = 0xfffffe001e2563a0: pid 11 tid 100003 critnest 3 "idle: cpu0"
                  curpcb       = 0xfffffe001e2568c0
                  fpcurthread  = none
                  idlethread   = 0xfffffe001e2563a0: tid 100003 "idle: cpu0"
                  self         = 0xffffffff84010000
                  curpmap      = 0xffffffff8303ef30
                  tssp         = 0xffffffff84010384
                  rsp0         = 0xfffffe001b7e5000
                  kcr3         = 0xffffffffffffffff
                  ucr3         = 0xffffffffffffffff
                  scr3         = 0x0
                  gs32p        = 0xffffffff84010404
                  ldt          = 0xffffffff84010444
                  tss          = 0xffffffff84010434
                  curvnet      = 0
                  db:0:kdb.enter.default>  bt
                  Tracing pid 11 tid 100003 td 0xfffffe001e2563a0
                  kdb_enter() at kdb_enter+0x32/frame 0xfffffe001b7e4a70
                  vpanic() at vpanic+0x183/frame 0xfffffe001b7e4ac0
                  panic() at panic+0x43/frame 0xfffffe001b7e4b20
                  _mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x67/frame 0xfffffe001b7e4b30
                  _mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe001b7e4ba0
                  handleevents() at handleevents+0x2cb/frame 0xfffffe001b7e4be0
                  timercb() at timercb+0x25b/frame 0xfffffe001b7e4c30
                  hpet_intr_single() at hpet_intr_single+0x1b0/frame 0xfffffe001b7e4c60
                  intr_event_handle() at intr_event_handle+0x123/frame 0xfffffe001b7e4cd0
                  intr_execute_handlers() at intr_execute_handlers+0x4a/frame 0xfffffe001b7e4d00
                  Xapic_isr1() at Xapic_isr1+0xdc/frame 0xfffffe001b7e4d00
                  --- interrupt, rip = 0xffffffff8125b026, rsp = 0xfffffe001b7e4dd0, rbp = 0xfffffe001b7e4dd0 ---
                  acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfffffe001b7e4dd0
                  acpi_cpu_idle() at acpi_cpu_idle+0x2fe/frame 0xfffffe001b7e4e10
                  cpu_idle_acpi() at cpu_idle_acpi+0x48/frame 0xfffffe001b7e4e30
                  cpu_idle() at cpu_idle+0x9e/frame 0xfffffe001b7e4e50
                  sched_idletd() at sched_idletd+0x4d1/frame 0xfffffe001b7e4ef0
                  fork_exit() at fork_exit+0x7d/frame 0xfffffe001b7e4f30
                  fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001b7e4f30
                  --- trap 0x552ee2ab, rip = 0xdd69eb03d129eb07, rsp = 0x7a5e704f761e704b, rbp = 0xc6615c61ca215c65 ---
                  db:0:kdb.enter.default>  ps
                  
                  • stephenw10 Netgate Administrator

                    Hmm, still the same spin lock issue, but triggered from something else.

                    Given that platform is known I would try removing the Intel NIC and running with only the Realtek NIC for a few days, see if it still crashes.
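
Both backtraces in this thread reach `panic()` through `_mtx_lock_indefinite_check()` and `_mtx_lock_spin_cookie()`, and the frame listed directly after `_mtx_lock_spin_cookie()` is its caller, the code that was waiting on the spin lock. A small sketch for picking that frame out, with `spin_lock_caller` as a hypothetical helper name and the frames abbreviated from the two traces above:

```python
import re

# A frame line looks like "panic() at panic+0x43/frame 0xfffffe007ca92c60".
FRAME = re.compile(r"^(\w+)\(\) at ")


def spin_lock_caller(bt_lines, lock_fn="_mtx_lock_spin_cookie"):
    """Return the function that called lock_fn, or None if lock_fn is not in the trace."""
    names = []
    for line in bt_lines:
        match = FRAME.match(line.strip())
        if match:
            names.append(match.group(1))
    if lock_fn not in names:
        return None
    index = names.index(lock_fn)
    # Frames are listed innermost first, so the caller is the next entry.
    return names[index + 1] if index + 1 < len(names) else None


first_crash = [
    "_mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x67/frame 0xfffffe007ca92c70",
    "_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe007ca92ce0",
    "cpu_new_callout() at cpu_new_callout+0x2a2/frame 0xfffffe007ca92d30",
]
second_crash = [
    "_mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x67/frame 0xfffffe001b7e4b30",
    "_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe001b7e4ba0",
    "handleevents() at handleevents+0x2cb/frame 0xfffffe001b7e4be0",
]

print(spin_lock_caller(first_crash))
print(spin_lock_caller(second_crash))
```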

                    • farizno @stephenw10

                      @stephenw10
                      OK, thanks for the advice. I will try that now and report back after a few days. If there are no crashes, then I guess I'll blame the NIC.

                      • farizno @farizno

                        I took the NIC out and it has been running for 6+ hours with no issues. I'll let it run longer to make sure, but before this it never made it past 2-1/2 hours without a kernel panic. I'm guessing that the NIC was the issue, so I need to find a new NIC. I will report back tomorrow after I run it all night.

                        • farizno @farizno

                          So I've just passed 24 hrs running without the NIC installed and no kernel panics. I'm assuming the NIC was causing the problems. I guess the IBM Pro/1000 PT Quad NIC is not compatible, even though it is listed as a compatible device? I guess I'll be looking for an i350-T4 as they seem to be the best.

                          Thanks for your help!

                          • nimrod @farizno

                            It could just be a faulty card and not a compatibility issue.

                            • stephenw10 Netgate Administrator

                              Or some low-level incompatibility with that particular device.

                              Or a power or heat issue there with the expansion card.

                              • farizno @stephenw10

                                @stephenw10 I do appreciate all the assistance. I will order an i350-T4 this week and report back after trying that card. Thanks again.

                                • farizno @farizno

                                  After installing an i350-T4 card, I can confirm that there have been no more kernel panics. I think this definitely points to an issue with the IBM/Intel PRO/1000 PT 82571EB/82571GB card that I have. I am not sure if the card is faulty and I don't really know how to test it. I guess I can try installing it in a Windows desktop PC that I have and see if it causes my desktop to crash, but the desktop that I have is connected on wireless (I don't have an ethernet drop near where it is located) so I am not sure if just having it installed will tell me if it functions properly.

                                  Anyways, thanks for all the assistance stephenw10.

                                  • stephenw10 Netgate Administrator

                                    Yup, testing the card in a different host is really the only way to know for sure.

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.