Crash Report or Programming Bug
-
My brand new pfSense Plus 23.09 install just crashed. The hardware is new, too. Any ideas? Thanks.
-
Backtrace:
db:1:pfs> bt
Tracing pid 51493 tid 103465 td 0xfffffe0103b023a0
kdb_enter() at kdb_enter+0x32/frame 0xfffffe0101fa5230
vpanic() at vpanic+0x163/frame 0xfffffe0101fa5360
panic() at panic+0x43/frame 0xfffffe0101fa53c0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe0101fa5420
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0101fa5480
calltrap() at calltrap+0x8/frame 0xfffffe0101fa5480
--- trap 0xc, rip = 0xffffffff8116bd1b, rsp = 0xfffffe0101fa5550, rbp = 0xfffffe0101fa5550 ---
vm_radix_lookup_ge() at vm_radix_lookup_ge+0x4b/frame 0xfffffe0101fa5550
kern_proc_vmmap_resident() at kern_proc_vmmap_resident+0x12b/frame 0xfffffe0101fa55c0
kern_proc_vmmap_out() at kern_proc_vmmap_out+0x19f/frame 0xfffffe0101fa5740
note_procstat_vmmap() at note_procstat_vmmap+0xfc/frame 0xfffffe0101fa5790
elf64_prepare_notes() at elf64_prepare_notes+0x577/frame 0xfffffe0101fa5820
elf64_coredump() at elf64_coredump+0x8b/frame 0xfffffe0101fa58f0
sigexit() at sigexit+0xbd5/frame 0xfffffe0101fa5d60
postsig() at postsig+0x237/frame 0xfffffe0101fa5e20
ast_sig() at ast_sig+0x1d7/frame 0xfffffe0101fa5ed0
ast_handler() at ast_handler+0x88/frame 0xfffffe0101fa5f10
ast() at ast+0x20/frame 0xfffffe0101fa5f30
doreti_ast() at doreti_ast+0x1c/frame 0x858f9eb30
Panic:
<118>Bootup complete
<6>igc0: promiscuous mode enabled
<6>igc5: promiscuous mode enabled
<6>pid 28585 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)
<6>igc5: promiscuous mode disabled
<6>pid 24905 (unbound-control), jid 0, uid 59: exited on signal 6 (no core dump - other error)
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xfffffe400a490c50
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8116bd1b
stack pointer = 0x0:0xfffffe0101fa5550
frame pointer = 0x0:0xfffffe0101fa5550
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 51493 (ntopng)
rdi: fffff80222e1b510 rsi: 0000000000005869 rdx: fffffe400a490c29
rcx: 0000000000000009 r8: 000000000000007f r9: 000000000000009f
rax: fffffe400a490c28 rbx: fffff80222ec8318 rbp: fffffe0101fa5550
r10: 000007fffffff000 r11: 0000000000000020 r12: 0000000000005869
r13: 00003627ec769000 r14: 00000000000005b9 r15: 0000000000000000
trap number = 12
panic: page fault
cpuid = 3
time = 1711980524
KDB: enter: panic
N100 is not too new.
You have ACPI errors in the BIOS. Make sure it's the most recent version:
Firmware Error (ACPI): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS01], AE_NOT_FOUND (20221020/dswload2-315)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20221020/psobject-372)
Nothing obvious but I'd guess something in a CPU power saving mode. Try disabling anything like that in the BIOS.
Is that the first time you've seen that crash?
Steve
-
Thanks, Steve, for the quick response. The N100 is not the newest platform but I was hoping that it would be stable enough.
Unfortunately, it is not the first crash; I have had this unit for over a week now. It runs for a day or two without a problem and then crashes at random. Since I couldn't find the cause myself, I posted it in the forum.
I have now disabled all power-saving and hibernation options in the BIOS. Let's see.
-
Have all the crashes been identical? If they're all random it could be a hardware issue.
-
I am not sure because there is not always a crash report. I did get another one earlier today:
Unfortunately, the third crash has not generated any reports yet.
-
Mmm, no that's completely different.
Backtrace:
db:1:pfs> bt
Tracing pid 2 tid 100041 td 0xfffffe00205b1560
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00c579a940
vpanic() at vpanic+0x163/frame 0xfffffe00c579aa70
panic() at panic+0x43/frame 0xfffffe00c579aad0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00c579ab30
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00c579ab90
calltrap() at calltrap+0x8/frame 0xfffffe00c579ab90
--- trap 0xc, rip = 0xffffffff81298340, rsp = 0xfffffe00c579ac60, rbp = 0xfffffe00c579ac60 ---
memset_erms() at memset_erms+0x30/frame 0xfffffe00c579ac60
uma_zalloc_arg() at uma_zalloc_arg+0x137/frame 0xfffffe00c579aca0
sigqueue_add() at sigqueue_add+0x99/frame 0xfffffe00c579acd0
tdsendsignal() at tdsendsignal+0x368/frame 0xfffffe00c579ad50
kern_psignal() at kern_psignal+0x8f/frame 0xfffffe00c579add0
realitexpire() at realitexpire+0x1a/frame 0xfffffe00c579ae10
softclock_call_cc() at softclock_call_cc+0x134/frame 0xfffffe00c579aec0
softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00c579aef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00c579af30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00c579af30
--- trap 0x5aa55aa5, rip = 0x5aa55aa55aa55aa5, rsp = 0x5aa55aa55aa55aa5, rbp = 0x5aa55aa55aa55aa5 ---
Panic:
<118>Bootup complete
<6>igc0: promiscuous mode enabled
<6>igc5: promiscuous mode enabled
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xfffff810090a78c0
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff81298340
stack pointer = 0x0:0xfffffe00c579ac60
frame pointer = 0x0:0xfffffe00c579ac60
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2 (clock (0))
rdi: fffff810090a78c0 rsi: 0000000000000000 rdx: 0000000000000070
rcx: 0000000000000070 r8: 0000000000000000 r9: fffffe00205b1560
rax: fffff810090a78c0 rbx: 0000000000010000 rbp: fffffe00c579ac60
r10: 0000000000000000 r11: 0000000080334b3c r12: fffff810090a78c0
r13: 0000000000000070 r14: 0000000000000101 r15: fffffe00db402800
trap number = 12
panic: page fault
cpuid = 0
time = 1711984542
KDB: enter: panic
OK, with crashes that different I'd first run a few memtest cycles to be sure it's not bad RAM.
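For anyone who wants to compare reports like these side by side, here is a rough Python sketch. It is not a pfSense or FreeBSD tool; the helper names (`faulting_function`, `panic_time`) are made up for illustration, and the strings are shortened excerpts of the two reports in this thread. It assumes the first frame listed after the `--- trap ... ---` marker is where the fault happened, and that the `time =` field is a Unix timestamp.

```python
import re
from datetime import datetime, timezone

# One ddb frame, e.g. "memset_erms() at memset_erms+0x30/frame 0xfffffe00c579ac60"
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+")
# The "time = <seconds>" field printed at the end of the panic message
TIME_RE = re.compile(r"\btime = (\d+)")


def faulting_function(backtrace):
    """Return the name of the first frame after the first trap marker, or None."""
    _, found, after_marker_start = backtrace.partition("--- trap")
    if not found:
        return None
    # Skip the register dump up to the closing "---" of the trap marker.
    _, found, frames = after_marker_start.partition("---")
    match = FRAME_RE.search(frames)
    return match.group(1) if match else None


def panic_time(panic_text):
    """Return the panic time as a UTC datetime, or None if no time field is present."""
    match = TIME_RE.search(panic_text)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


# Shortened excerpts from the two crash reports in this thread.
CRASH_1_BT = (
    "calltrap() at calltrap+0x8/frame 0xfffffe0101fa5480 "
    "--- trap 0xc, rip = 0xffffffff8116bd1b, rsp = 0xfffffe0101fa5550, "
    "rbp = 0xfffffe0101fa5550 --- "
    "vm_radix_lookup_ge() at vm_radix_lookup_ge+0x4b/frame 0xfffffe0101fa5550"
)
CRASH_2_BT = (
    "calltrap() at calltrap+0x8/frame 0xfffffe00c579ab90 "
    "--- trap 0xc, rip = 0xffffffff81298340, rsp = 0xfffffe00c579ac60, "
    "rbp = 0xfffffe00c579ac60 --- "
    "memset_erms() at memset_erms+0x30/frame 0xfffffe00c579ac60"
)
CRASH_1_PANIC = "panic: page fault cpuid = 3 time = 1711980524 KDB: enter: panic"
CRASH_2_PANIC = "panic: page fault cpuid = 0 time = 1711984542 KDB: enter: panic"

if __name__ == "__main__":
    for name, bt, panic in (
        ("crash 1", CRASH_1_BT, CRASH_1_PANIC),
        ("crash 2", CRASH_2_BT, CRASH_2_PANIC),
    ):
        print(name, faulting_function(bt), panic_time(panic).isoformat())
```

If the faulting function differs from one report to the next, as it does here, that points more toward hardware such as RAM than toward a single software bug.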
-
I already swapped the hard drive. I guess the RAM will be next. Let's see. However, I am afraid that maybe the entire unit is faulty.
-
It's possible if it's always been unstable.
-
Based on my experience so far, it is possible. One way or the other, it is a warranty case for the unit or the component. Let's see once the RAM arrives tomorrow.
-
@stephenw10 I don't want to praise the day before sunset, but the new RAM may have done the trick! So far, the router has been running stably for almost a day without crashing!
Thank you for your support and for deciphering the crash report.