Netgate Discussion Forum

    Random crash

    General pfSense Questions
    • adelaide_guy

      Hi, Everyone.

      May I ask for your expertise and experience in troubleshooting a kernel panic? I experienced this kernel panic while looking at the Suricata alert logs, so I thought Suricata might be the cause. I uninstalled it and replaced it with Snort, but the issue persists, though not as frequently as with Suricata.

      I have also attached the textdump.tar.0 file. I hope you can help point out what may have caused this crash.

      textdump.tar.0

      Below is the fatal trap that was captured:

      Fatal trap 12: page fault while in kernel mode
      cpuid = 1; apic id = 02
      fault virtual address	= 0x28
      fault code		= supervisor read data, page not present
      instruction pointer	= 0x20:0xffffffff80ec01fe
      stack pointer	        = 0x28:0xfffffe003f946a90
      frame pointer	        = 0x28:0xfffffe003f946ac0
      code segment		= base 0x0, limit 0xfffff, type 0x1b
      			= DPL 0, pres 1, long 1, def32 0, gran 1
      processor eflags	= interrupt enabled, resume, IOPL = 0
      current process		= 12 (swi4: clock (0))
      trap number		= 12
      panic: page fault
      cpuid = 1
      time = 1628755983
      KDB: enter: panic
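
To line the panic up with other logs, the `time =` value can be converted to a readable timestamp. This is a minimal sketch, assuming the value is Unix epoch seconds (as FreeBSD normally prints it in the panic message); the function name `panic_time_utc` is made up for illustration.

```python
from datetime import datetime, timezone

# The "time =" value from the panic message above.
PANIC_EPOCH = 1628755983


def panic_time_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    # Prints 2021-08-12T08:13:03+00:00
    print(panic_time_utc(PANIC_EPOCH).isoformat())
```

The result is in UTC, so it may need shifting to the firewall's local time zone before comparing it with the system logs.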
      • stephenw10 (Netgate Administrator)

        The key parts of that are:

        db:0:kdb.enter.default>  show pcpu
        cpuid        = 1
        dynamic pcpu = 0xfffffe007f102380
        curthread    = 0xfffff800043c4000: pid 12 tid 100035 "swi4: clock (0)"
        curpcb       = 0xfffff800043c45a0
        fpcurthread  = none
        idlethread   = 0xfffff8000432a740: tid 100004 "idle: cpu1"
        curpmap      = 0xffffffff8368d5a8
        tssp         = 0xffffffff83717688
        commontssp   = 0xffffffff83717688
        rsp0         = 0xfffffe003f946e00
        kcr3         = 0x3d0b000
        ucr3         = 0xffffffffffffffff
        scr3         = 0x223630000
        gs32p        = 0xffffffff8371dea0
        ldt          = 0xffffffff8371dee0
        tss          = 0xffffffff8371ded0
        tlb gen      = 1389740
        curvnet      = 0
        db:0:kdb.enter.default>  bt
        Tracing pid 12 tid 100035 td 0xfffff800043c4000
        kdb_enter() at kdb_enter+0x37/frame 0xfffffe003f946750
        vpanic() at vpanic+0x197/frame 0xfffffe003f9467a0
        panic() at panic+0x43/frame 0xfffffe003f946800
        trap_fatal() at trap_fatal+0x391/frame 0xfffffe003f946860
        trap_pfault() at trap_pfault+0x4f/frame 0xfffffe003f9468b0
        trap() at trap+0x286/frame 0xfffffe003f9469c0
        calltrap() at calltrap+0x8/frame 0xfffffe003f9469c0
        --- trap 0xc, rip = 0xffffffff80ec01fe, rsp = 0xfffffe003f946a90, rbp = 0xfffffe003f946ac0 ---
        ether_8021q_frame() at ether_8021q_frame+0x2e/frame 0xfffffe003f946ac0
        vlan_transmit() at vlan_transmit+0xc8/frame 0xfffffe003f946b30
        vlan_altq_start() at vlan_altq_start+0xb4/frame 0xfffffe003f946b60
        cbqrestart() at cbqrestart+0x64/frame 0xfffffe003f946b90
        rmc_restart() at rmc_restart+0x6f/frame 0xfffffe003f946bc0
        softclock_call_cc() at softclock_call_cc+0x141/frame 0xfffffe003f946c70
        softclock() at softclock+0x79/frame 0xfffffe003f946c90
        ithread_loop() at ithread_loop+0x23c/frame 0xfffffe003f946cf0
        fork_exit() at fork_exit+0x7e/frame 0xfffffe003f946d30
        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003f946d30
        --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
        

        This looks like a possible traffic shaping issue, though, because your logs are filled with:

        config_aqm Unable to configure flowset, flowset busy!
        config_aqm Unable to configure flowset, flowset busy!
        config_aqm Unable to configure flowset, flowset busy!
        config_aqm Unable to configure flowset, flowset busy!
        config_aqm Unable to configure flowset, flowset busy!
        

        Though previously that error has been harmless: https://redmine.pfsense.org/issues/8991
        The backtrace shows it's VLAN related. If this has just started happening, did you perhaps add a VLAN interface?
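
For anyone reading a similar dump, the frames listed after the `--- trap` marker are the interesting ones: the first is the function that faulted, and the ones below it are its callers. The following is a rough sketch, not an official tool, that pulls those frames out of `bt` text. The regular expression is an assumption based only on the line format in the output above, and `frames_after_trap` is a made-up helper name.

```python
import re

# Excerpt of the `bt` output quoted above, starting at the trap marker.
BACKTRACE = """\
--- trap 0xc, rip = 0xffffffff80ec01fe, rsp = 0xfffffe003f946a90, rbp = 0xfffffe003f946ac0 ---
ether_8021q_frame() at ether_8021q_frame+0x2e/frame 0xfffffe003f946ac0
vlan_transmit() at vlan_transmit+0xc8/frame 0xfffffe003f946b30
vlan_altq_start() at vlan_altq_start+0xb4/frame 0xfffffe003f946b60
cbqrestart() at cbqrestart+0x64/frame 0xfffffe003f946b90
rmc_restart() at rmc_restart+0x6f/frame 0xfffffe003f946bc0
softclock_call_cc() at softclock_call_cc+0x141/frame 0xfffffe003f946c70
"""

# Matches lines such as "name() at name+0x2e/frame 0xfffffe...".
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+(0x[0-9a-f]+)/frame 0x[0-9a-f]+$")


def frames_after_trap(text):
    """Return (function, offset) pairs between the first and second trap markers."""
    frames = []
    seen_trap = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("--- trap"):
            if seen_trap:
                break  # a second marker ends the section of interest
            seen_trap = True
            continue
        if not seen_trap:
            continue  # skip the panic/trap handler frames above the marker
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group(1), int(match.group(2), 16)))
    return frames


if __name__ == "__main__":
    for name, offset in frames_after_trap(BACKTRACE):
        print(f"{name}+{offset:#x}")
```

Run on the excerpt, the first frame is `ether_8021q_frame`, reached through `vlan_transmit` and `vlan_altq_start` from the CBQ restart timer (`cbqrestart`, `rmc_restart`), which is why both the VLAN and the traffic shaper are suspects here.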

        Steve

        • adelaide_guy @stephenw10

          @stephenw10

          Thank you so much for your response; you may be correct. I added a VLAN a while ago and did not encounter any issues at the time. Come to think of it, this issue started happening when I recreated the QoS configuration, switching from PRIQ to CBQ and including this new VLAN in the list.

          Can you suggest the best approach to this? Should I recreate the configuration, or just edit the existing one and change it back to PRIQ?

          • stephenw10 (Netgate Administrator)

            I would try one thing at a time to isolate it.

            So maybe remove queues from the VLAN first. If that doesn't work then switch back to PRIQ as a test.

            Steve

            • adelaide_guy @stephenw10

              @stephenw10

              Thanks for the suggestion. I will follow this and see if it fixes the issue. I'll update this thread in a week with the results.

              • adelaide_guy @stephenw10

                @stephenw10

                As of this moment I have not seen any crashes. Thanks for the help. I will leave that VLAN out of QoS for now.

                • stephenw10 (Netgate Administrator)

                  Looks like you're hitting this: https://redmine.pfsense.org/issues/11470

                    • adelaide_guy @stephenw10

                      @stephenw10

                      Thanks for that info. I have created an account and attached the crash dump to the Redmine ticket. I hope I have provided the right information so they can fix the issue.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.