SG-1100: unexpexcted reboots and vm_fault in logs - how to diagnose?
-
Ok well that's definitely a kernel panic.
What you want from the start of the panic output. So the initial panic string and the backtrace (
bt>
)So as an example:
0:kdb.enter.default> run pfs db:1:pfs> bt Tracing pid 0 tid 100007 td 0xfffffe00119bd720 kdb_enter() at kdb_enter+0x32/frame 0xfffffe00101a86c0 vpanic() at vpanic+0x182/frame 0xfffffe00101a8710 panic() at panic+0x43/frame 0xfffffe00101a8770 trap_fatal() at trap_fatal+0x409/frame 0xfffffe00101a87d0 trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00101a8830 calltrap() at calltrap+0x8/frame 0xfffffe00101a8830 --- trap 0xc, rip = 0xffffffff80f66369, rsp = 0xfffffe00101a8900, rbp = 0xfffffe00101a8930 --- pppoe_findsession() at pppoe_findsession+0x79/frame 0xfffffe00101a8930 ng_pppoe_rcvdata_ether() at ng_pppoe_rcvdata_ether+0x461/frame 0xfffffe00101a89b0 ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe00101a8a40 ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe00101a8a80 ether_demux() at ether_demux+0x212/frame 0xfffffe00101a8ab0 ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe00101a8b10 netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00101a8b60 ether_input() at ether_input+0x69/frame 0xfffffe00101a8bc0 ether_demux() at ether_demux+0x9e/frame 0xfffffe00101a8bf0 ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe00101a8c50 netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00101a8ca0 ether_input() at ether_input+0x69/frame 0xfffffe00101a8d00 iflib_rxeof() at iflib_rxeof+0xbdb/frame 0xfffffe00101a8e00 _task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe00101a8e40 gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe00101a8ec0 gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe00101a8ef0 fork_exit() at fork_exit+0x7e/frame 0xfffffe00101a8f30 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00101a8f30 --- trap 0x107a772c, rip = 0x11d295625b1b31a1, rsp = 0xf02460003c98dfb3, rbp = 0x41abfa0065646f --- db:1:pfs> show registers cs 0x20 ds 0x3b es 0x3b fs 0x13 gs 0x1b ss 0x28 rax 0x12 rcx 0x1 rdx 0xfffffe00101a82e0 rbx 0x100 rsp 0xfffffe00101a86c0 rbp 0xfffffe00101a86c0 rsi 0 rdi 0xffffffff83183f98 vt_conswindow+0x10 r8 0 r9 0x1c200001 r10 0xffffffff83183f88 vt_conswindow r11 0x20 r12 0x2cdc1f807000 r13 0xfffffe00101a8840 r14 0xfffffe00101a8750 r15 0xfffffe00119bd720 rip 0xffffffff80dd82f2 kdb_enter+0x32 rflags 0x82 kdb_enter+0x32: movq $0,0x27bd313(%rip) db:1:pfs> show pcpu cpuid = 0 dynamic pcpu = 0xbf6800 curthread = 0xfffffe00119bd720: pid 0 tid 100007 critnest 1 "if_io_tqg_0" curpcb = 0xfffffe00119bdc40 fpcurthread = none idlethread = 0xfffffe00119bf3a0: tid 100003 "idle: cpu0" self = 0xffffffff84010000 curpmap = 0xffffffff83549750 tssp = 0xffffffff84010384 rsp0 = 0xfffffe00101a9000 kcr3 = 0x8000000081c3e002 ucr3 = 0xffffffffffffffff scr3 = 0x1958cac7f gs32p = 0xffffffff84010404 ldt = 0xffffffff84010444 tss = 0xffffffff84010434 curvnet = 0xfffff800011c7a00
-
@stephenw10 thanks! Got it - what should I do with it?
-
You can upload it here and I'll check it:
https://nc.netgate.com/nextcloud/s/sFoGNTrcoDypsx5 -
@stephenw10 done :-)
-
Ok. Backtrace:
db:1:pfs> bt Tracing pid 0 tid 100131 td 0xffff00008ae04640 db_trace_self() at db_trace_self db_stack_trace() at db_stack_trace+0x120 db_command() at db_command+0x368 db_script_exec() at db_script_exec+0x1ac db_command() at db_command+0x368 db_script_exec() at db_script_exec+0x1ac db_script_kdbenter() at db_script_kdbenter+0x5c db_trap() at db_trap+0xfc kdb_trap() at kdb_trap+0x314 handle_el1h_sync() at handle_el1h_sync+0x18 --- exception, esr 0xf2000000 kdb_enter() at kdb_enter+0x4c vpanic() at vpanic+0x1e0 panic() at panic+0x48 vm_fault() at vm_fault+0x1780 vm_fault_trap() at vm_fault_trap+0xa0 data_abort() at data_abort+0xc8 handle_el1h_sync() at handle_el1h_sync+0x18 --- exception, esr 0x8600000f $d.2() at $d.2+0xc29 range_tree_add_impl() at range_tree_add_impl+0x8c metaslab_alloc_dva() at metaslab_alloc_dva+0xf48 metaslab_alloc() at metaslab_alloc+0xcc zio_dva_allocate() at zio_dva_allocate+0xb8 zio_execute() at zio_execute+0x58 taskqueue_run_locked() at taskqueue_run_locked+0x194 taskqueue_thread_loop() at taskqueue_thread_loop+0x134 fork_exit() at fork_exit+0x8c fork_trampoline() at fork_trampoline+0x18
Mmm, that looks like a drive or filesystem issue. Did you reinstall this clean? If not I would probably try that to rule out any filesytstem issue.
-
@stephenw10 I did install it clean some 2 months ago, but I can retry - will come back after that.
I already suspected something with storage in the past, but couldn't find any hint that shows a failing storage (I did check usage level as in the docs for the SG 1100, but that seems good).