Random reboots. Hardware issues


  • Hi,
    I'm experiencing random reboots with pfSense. I assume it is a hardware issue.
    Are there any tools inside pfSense to debug hardware issues,
    such as memory checks and SSD checks?
    Thanks,
    EVS

  • Netgate Administrator

    You can check the SMART status of the drive. You would want to boot from something else to run a memory test.

    Do you get a crash report or any log entries when this happens?

    Steve
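
    For anyone who wants to script the SMART check mentioned above, here is a minimal sketch. It assumes that smartctl (from smartmontools) is on the PATH and that the drive is /dev/ada0; both are assumptions, so adjust them for your system. The helper names parse_health and check_drive are made up for this example. The parser only understands the ATA-style health line, so other drive types may return None.

```python
import subprocess

# Line that smartctl prints for ATA drives when asked for the health verdict.
HEALTH_PREFIX = "SMART overall-health self-assessment test result:"


def parse_health(output):
    """Return the verdict text (for example 'PASSED'), or None if not found."""
    for line in output.splitlines():
        if line.startswith(HEALTH_PREFIX):
            return line[len(HEALTH_PREFIX):].strip()
    return None


def check_drive(device="/dev/ada0"):
    """Run 'smartctl -H' on the device (assumed name) and return the verdict."""
    # smartctl reports problems through a bitmask exit status, so the return
    # code is deliberately not treated as a hard failure here.
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)
```

    Calling check_drive() with the right device name returns the verdict string, or None if the health line was not found. A passing verdict does not rule out other hardware faults such as bad memory.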


  • @stephenw10
    I got a crash report but no kernel panic.
    The crash report is very long and I did not find any clue as to what the problem could be.

  • Netgate Administrator

    The backtrace and panic string near the top of the report should provide a clue.


  • @stephenw10
    I had to wait for a reboot:
    Panic String: page fault

    Any suggestions?

  • Netgate Administrator

    Could be any number of things. You want the output from "show pcpu" up to "ps" from ddb, like:

    db:0:kdb.enter.default>  show pcpu
    cpuid        = 1
    dynamic pcpu = 0xfffffe0097d79200
    curthread    = 0xfffff801c2eb4500: pid 40463 "sh"
    curpcb       = 0xfffffe0064dc1cc0
    fpcurthread  = 0xfffff801c2eb4500: pid 40463 "sh"
    idlethread   = 0xfffff80003398a00: tid 100004 "idle: cpu1"
    curpmap      = 0xfffff80161811138
    tssp         = 0xffffffff82a1eb78
    commontssp   = 0xffffffff82a1eb78
    rsp0         = 0xfffffe0064dc1cc0
    gs32p        = 0xffffffff82a253d0
    ldt          = 0xffffffff82a25410
    tss          = 0xffffffff82a25400
    db:0:kdb.enter.default>  bt
    Tracing pid 40463 tid 100426 td 0xfffff801c2eb4500
    turnstile_broadcast() at turnstile_broadcast+0x9c/frame 0xfffffe0064dc1480
    __rw_wunlock_hard() at __rw_wunlock_hard+0x8f/frame 0xfffffe0064dc14b0
    vm_map_delete() at vm_map_delete+0x3dc/frame 0xfffffe0064dc1530
    vm_map_remove() at vm_map_remove+0x47/frame 0xfffffe0064dc1560
    exec_new_vmspace() at exec_new_vmspace+0x22f/frame 0xfffffe0064dc15e0
    exec_elf64_imgact() at exec_elf64_imgact+0xa58/frame 0xfffffe0064dc16f0
    kern_execve() at kern_execve+0x74d/frame 0xfffffe0064dc1a50
    sys_execve() at sys_execve+0x4a/frame 0xfffffe0064dc1ad0
    amd64_syscall() at amd64_syscall+0x4ce/frame 0xfffffe0064dc1bf0
    Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0064dc1bf0
    --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800b40d8a, rsp = 0x7fffffffe0f8, rbp = 0x7fffffffe240 ---
    db:0:kdb.enter.default>  ps
    

    And the panic output from the message buffer, like:

    Fatal trap 12: page fault while in kernel mode
    cpuid = 1; apic id = 02
    fault virtual address	= 0x30
    fault code		= supervisor read data, page not present
    instruction pointer	= 0x20:0xffffffff80cb9bdc
    stack pointer	        = 0x28:0xfffffe0064dc1450
    frame pointer	        = 0x28:0xfffffe0064dc1480
    code segment		= base 0x0, limit 0xfffff, type 0x1b
    			= DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags	= resume, IOPL = 0
    current process		= 40463 (sh)
    

    Those are just examples.

    Steve
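
    Since the crash report is long, the two parts described above can be pulled out with a small script. This is only a sketch: it assumes the report has been saved as plain text, that the ddb prompt looks like the "db:0:kdb.enter.default>" lines in the examples, and that the panic block ends at the first blank line. The function names are made up for this example.

```python
import re

# Matches a ddb prompt line such as "db:0:kdb.enter.default>  show pcpu"
# and captures the command that follows the prompt.
PROMPT = re.compile(r"^\s*db:\d+:\S+>\s+(.*)$")


def extract_ddb_section(report, start_cmd="show pcpu", end_cmd="ps"):
    """Return the lines from the start_cmd prompt up to, but not
    including, the end_cmd prompt."""
    collected = []
    active = False
    for line in report.splitlines():
        match = PROMPT.match(line)
        if match:
            cmd = match.group(1).strip()
            if cmd == start_cmd:
                active = True
            elif cmd == end_cmd and active:
                break
        if active:
            collected.append(line.strip())
    return collected


def extract_panic_block(report):
    """Return the lines from 'Fatal trap' up to the next blank line
    (the blank-line terminator is a heuristic, not a format guarantee)."""
    collected = []
    active = False
    for line in report.splitlines():
        if line.lstrip().startswith("Fatal trap"):
            active = True
        if active:
            if not line.strip():
                break
            collected.append(line.strip())
    return collected
```

    Read the saved report into a string, pass it to both functions, and post the joined results. If the report uses a different prompt or layout, the pattern and the blank-line rule will need adjusting.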


  • Something like this:

    db:0:kdb.enter.default>  show pcpu
    cpuid        = 0
    dynamic pcpu = 0x860580
    curthread    = 0xfffff80004f68620: pid 16 "usbus1"
    curpcb       = 0xfffffe011f328b80
    fpcurthread  = none
    idlethread   = 0xfffff8000496e000: tid 100003 "idle: cpu0"
    curpmap      = 0xffffffff834f1c40
    tssp         = 0xffffffff835a32d0
    commontssp   = 0xffffffff835a32d0
    rsp0         = 0xfffffe011f328b80
    gs32p        = 0xffffffff835a9f28
    ldt          = 0xffffffff835a9f68
    tss          = 0xffffffff835a9f58
    tlb gen      = 97778
    db:0:kdb.enter.default>  bt
    Tracing pid 16 tid 100092 td 0xfffff80004f68620
    kdb_enter() at kdb_enter+0x3b/frame 0xfffffe011f328570
    vpanic() at vpanic+0x19b/frame 0xfffffe011f3285d0
    panic() at panic+0x43/frame 0xfffffe011f328630
    trap_pfault() at trap_pfault/frame 0xfffffe011f328680
    trap_pfault() at trap_pfault+0x49/frame 0xfffffe011f3286e0
    trap() at trap+0x29d/frame 0xfffffe011f3287f0
    calltrap() at calltrap+0x8/frame 0xfffffe011f3287f0
    --- trap 0xc, rip = 0xffffffff80cfc278, rsp = 0xfffffe011f3288c0, rbp = 0xfffffe011f328980 ---
    sched_switch() at sched_switch+0x548/frame 0xfffffe011f328980
    mi_switch() at mi_switch+0xeb/frame 0xfffffe011f3289b0
    sleepq_wait() at sleepq_wait+0x2c/frame 0xfffffe011f3289e0
    _cv_wait() at _cv_wait+0x16e/frame 0xfffffe011f328a30
    usb_process() at usb_process+0xf7/frame 0xfffffe011f328a70
    fork_exit() at fork_exit+0x83/frame 0xfffffe011f328ab0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe011f328ab0
    
    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address	= 0x7
    fault code		= supervisor read data, page not present
    instruction pointer	= 0x20:0xffffffff80cfc278
    stack pointer	        = 0x28:0xfffffe011f3288c0
    frame pointer	        = 0x28:0xfffffe011f328980
    code segment		= base 0x0, limit 0xfffff, type 0x1b
    			= DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags	= resume, IOPL = 0
    current process		= 16 (usbus1)
    trap number		= 12
    panic: page fault
    cpuid = 0
    KDB: enter: panic
    

    Edit: could this be a USB problem? There is no USB device connected.

  • LAYER 8

    What device is it? Which pfSense version?


  • @kiokoman

    pfSense version: 2.4.5-RELEASE-p1 (amd64)
    Hardware: APU4D4 (https://pcengines.ch/apu4d4.htm)