Netgate Discussion Forum

    Help with deciphering 2.7.0 crash dump

    General pfSense Questions
    • Frozen Fractals

      Hi,

      My pfSense instance has been crashing randomly for the past few months. It crashed every 2-3 weeks on 2.6.0 and has now crashed for the first time since updating to 2.7.0.

      Hardware: Topton Mini PC (N5105, 32GB DDR4, 1TB NVMe, I226-V NICs)
      Hypervisor: ESXi 8.0 U1a
      Internet: AT&T BGW210 1Gb/s Fiber

      Is anyone able to assist in deciphering the crash dump below? I've pasted the relevant sections and attached the full crash dump. The anti-spam filter wouldn't let me post the trace.

      Filename: /var/crash/info.0
      Dump header from device: /dev/da0p2
        Architecture: amd64
        Architecture Version: 4
        Dump Length: 76288
        Blocksize: 512
        Compression: none
        Dumptime: 2023-07-08 06:29:23 -0700
        Hostname: pfsense.winata.xyz
        Magic: FreeBSD Text Dump
        Version String: FreeBSD 14.0-CURRENT #1 RELENG_2_7_0-n255866-686c8d3c1f0: Wed Jun 28 04:21:19 UTC 2023
          root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/obj/amd64/LwYAddCr/var/jenkins/
        Panic String: page fault
        Dump Parity: 3444675921
        Bounds: 0
        Dump Status: good
      
      Fatal trap 12: page fault while in kernel mode
      cpuid = 0; apic id = 00
      fault virtual address	= 0x22c00000234
      fault code		= supervisor read data, page not present
      instruction pointer	= 0x20:0xffffffff80d65095
      stack pointer	        = 0x28:0xfffffe0096528d50
      frame pointer	        = 0x28:0xfffffe0096528d90
      code segment		= base 0x0, limit 0xfffff, type 0x1b
      			= DPL 0, pres 1, long 1, def32 0, gran 1
      processor eflags	= interrupt enabled, resume, IOPL = 0
      current process		= 59902 (grep)
      rdi: fffff8003ac198c0 rsi: fffffe0096528da0 rdx: fffff800090a3d00
      rcx:                0  r8: fffffe00971ebac0  r9:                0
      rax:      22c0000022c rbx:             1000 rbp: fffffe0096528d90
      r10:             1000 r11: fffffe00971ebfe0 r12: fffffe0096528da0
      r13: fffff8003ac198c0 r14:                0 r15: fffffe00971ebac0
      trap number		= 12
      panic: page fault
      cpuid = 0
      time = 1688822963
      KDB: enter: panic
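
A quick sanity check on the numbers in the trap frame above, as a minimal sketch rather than a diagnosis. It only uses values copied from this dump. The helper names (`describe_fault`, `panic_time_local`) and the `KERNEL_HALF_START` constant are illustrative assumptions, not part of any pfSense or FreeBSD tool. The fault address turns out to be exactly 8 bytes past `rax`, which is consistent with (but does not prove) a read through a bad pointer held in `rax`, and `rax` itself does not look like a kernel address.

```python
from datetime import datetime, timedelta, timezone

# Values copied verbatim from the crash dump above.
FAULT_ADDR = 0x22C00000234   # "fault virtual address"
RAX = 0x22C0000022C          # "rax" in the register dump
PANIC_EPOCH = 1688822963     # "time = ..."

# Assumption: on amd64, kernel addresses sit in the upper canonical half.
KERNEL_HALF_START = 0xFFFF800000000000


def describe_fault(fault_addr, base_reg):
    """Relate the faulting address to a candidate base register."""
    return {
        "offset_from_base": fault_addr - base_reg,
        "looks_like_kernel_pointer": base_reg >= KERNEL_HALF_START,
        "high_dword": base_reg >> 32,
        "low_dword": base_reg & 0xFFFFFFFF,
    }


def panic_time_local(epoch, utc_offset_hours):
    """Convert the panic's epoch time to a fixed-offset local timestamp."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch, tz).strftime("%Y-%m-%d %H:%M:%S %z")


info = describe_fault(FAULT_ADDR, RAX)
print("offset from rax:", hex(info["offset_from_base"]))
print("rax looks like a kernel pointer:", info["looks_like_kernel_pointer"])
print("rax halves:", hex(info["high_dword"]), hex(info["low_dword"]))
print("panic time:", panic_time_local(PANIC_EPOCH, -7))
```

The converted panic time matches the `Dumptime` field in the dump header, so the header and the panic message describe the same event.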
      

      info.0
      textdump.tar.0

      Thanks!

      • stephenw10 Netgate Administrator @Frozen Fractals

        Backtrace:

        db:0:kdb.enter.default>  bt
        Tracing pid 59902 tid 100339 td 0xfffffe00971ebac0
        kdb_enter() at kdb_enter+0x32/frame 0xfffffe0096528b10
        vpanic() at vpanic+0x183/frame 0xfffffe0096528b60
        panic() at panic+0x43/frame 0xfffffe0096528bc0
        trap_fatal() at trap_fatal+0x409/frame 0xfffffe0096528c20
        trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0096528c80
        calltrap() at calltrap+0x8/frame 0xfffffe0096528c80
        --- trap 0xc, rip = 0xffffffff80d65095, rsp = 0xfffffe0096528d50, rbp = 0xfffffe0096528d90 ---
        dofilewrite() at dofilewrite+0x85/frame 0xfffffe0096528d90
        sys_write() at sys_write+0xbc/frame 0xfffffe0096528e00
        amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe0096528f30
        fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0096528f30
        --- syscall (4, FreeBSD ELF64, write), rip = 0x340381ff9f9a, rsp = 0x340380204f78, rbp = 0x340380204fa0 ---
        

        But there are a bunch of failed processes in the message buffer:

        <118>Bootup complete
        <6>pid 58393 (netstat), jid 0, uid 0: exited on signal 10
        <6>pid 98867 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
        <6>pid 3439 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
        <6>pid 90571 (grep), jid 0, uid 0: exited on signal 11 (core dumped)
        <6>pid 56919 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
        <6>pid 9712 (awk), jid 0, uid 0: exited on signal 11 (core dumped)
        <6>pid 60062 (php-cgi), jid 0, uid 0: exited on signal 10 (core dumped)
        <6>pid 43746 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
        <6>ovpns1: link state changed to DOWN
        <6>ovpns1: link state changed to UP
        <6>pid 9474 (dhcpd), jid 0, uid 136: exited on signal 6
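
To make the pattern in the message buffer easier to see, here is a small sketch that tallies the exit signals from the lines above. The signal names are the FreeBSD numbering for the three values that appear here (6, 10, 11). The regular expression and the function names are illustrative assumptions written for this log format, not an existing tool.

```python
import re
from collections import Counter

# FreeBSD signal numbers, limited to the ones seen in this log.
SIGNAL_NAMES = {6: "SIGABRT", 10: "SIGBUS", 11: "SIGSEGV"}

# Matches kernel log lines of the form:
#   pid 58393 (netstat), jid 0, uid 0: exited on signal 10
EXIT_RE = re.compile(
    r"pid (\d+) \(([^)]+)\), jid \d+, uid \d+: exited on signal (\d+)"
)

# Lines copied from the message buffer above.
MSGBUF = """\
<6>pid 58393 (netstat), jid 0, uid 0: exited on signal 10
<6>pid 98867 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
<6>pid 3439 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
<6>pid 90571 (grep), jid 0, uid 0: exited on signal 11 (core dumped)
<6>pid 56919 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
<6>pid 9712 (awk), jid 0, uid 0: exited on signal 11 (core dumped)
<6>pid 60062 (php-cgi), jid 0, uid 0: exited on signal 10 (core dumped)
<6>pid 43746 (awk), jid 0, uid 0: exited on signal 10 (core dumped)
<6>pid 9474 (dhcpd), jid 0, uid 136: exited on signal 6
"""


def tally_signals(text):
    """Count process exits per signal name."""
    counts = Counter()
    for match in EXIT_RE.finditer(text):
        sig = int(match.group(3))
        counts[SIGNAL_NAMES.get(sig, f"signal {sig}")] += 1
    return counts


def distinct_programs(text):
    """Return the sorted set of program names that died on a signal."""
    return sorted({match.group(2) for match in EXIT_RE.finditer(text)})


print(dict(tally_signals(MSGBUF)))
print(distinct_programs(MSGBUF))
```

Five unrelated userland programs dying on memory-related signals, alongside the kernel page fault, points away from a bug in any single program.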
        

        You are running the hypervisor on a Jasper Lake CPU, though, so the first thing I would try is disabling any power-saving features in the BIOS (EIST, etc.).

        Steve

        • Frozen Fractals @stephenw10

          @stephenw10 Thanks for pointing me in the right direction. Did some research, and indeed you are correct! The Jasper Lake platform has issues with power saving functions causing VMs to behave incorrectly. Apparently, there's a microcode/BIOS update to resolve it. I'll flash it and report back, hopefully with no more crashes!

          • Frozen Fractals @Frozen Fractals

            So far, so good! No crashes in the roughly 10 days since updating the BIOS/microcode. Let's hope it stays that way!

            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.