6100 Crash
-
@stephenw10
Hi Steve,
I had a similar unexplained crash yesterday on my 6100. The crash dump produced an info.0 file and a textdump.tar.0 file [slightly redacted]:
Dump header from device: /dev/nvd0p3
  Architecture: amd64
  Architecture Version: 4
  Dump Length: 224256
  Blocksize: 512
  Compression: none
  Dumptime: 2023-04-17 17:52:10 +0100
  Hostname: Router-8.xxxxxxx.me
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 14.0-CURRENT #0 plus-RELENG_23_01-n256037-6e914874a5e: Fri Feb 10 20:30:29 UTC 2023
    root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/obj/amd64/VDZvZksF/var/j
  Panic String: page fault
  Dump Parity: 506679352
  Bounds: 0
  Dump Status: good
Presumably the back-trace section is of interest:
db:1:pfs> bt
Tracing pid 2 tid 100042 td 0xfffffe0085059e40
kdb_enter() at kdb_enter+0x32/frame 0xfffffe0085250970
vpanic() at vpanic+0x182/frame 0xfffffe00852509c0
panic() at panic+0x43/frame 0xfffffe0085250a20
trap_fatal() at trap_fatal+0x409/frame 0xfffffe0085250a80
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0085250ae0
calltrap() at calltrap+0x8/frame 0xfffffe0085250ae0
--- trap 0xc, rip = 0xffffffff80fe7fe7, rsp = 0xfffffe0085250bb0, rbp = 0xfffffe0085250be0 ---
in6_selecthlim() at in6_selecthlim+0x97/frame 0xfffffe0085250be0
tcp_default_output() at tcp_default_output+0x1c61/frame 0xfffffe0085250db0
tcp_timer_rexmt() at tcp_timer_rexmt+0x66b/frame 0xfffffe0085250e10
softclock_call_cc() at softclock_call_cc+0x133/frame 0xfffffe0085250ec0
softclock_thread() at softclock_thread+0xe9/frame 0xfffffe0085250ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe0085250f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0085250f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Memory page fault:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 0c
fault virtual address = 0x10
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe7fe7
stack pointer = 0x28:0xfffffe0085250bb0
frame pointer = 0x28:0xfffffe0085250be0
code segment = base 0x0, limit 0xfffff, type 0x1b
  = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2 (clock (1))
rdi: fffffe0085250b58 rsi: fffff801013b1078 rdx: fffffe0085250b58
rcx: fffff801012e9000 r8: 0 r9: fffff801012e9060
rax: 0 rbx: 0 rbp: fffffe0085250be0
r10: fffff801013b1078 r11: ffffffff835bb040 r12: fffff80131f89d98
r13: 0 r14: fffffe0085250bb8 r15: fffff80131f89d00
trap number = 12
panic: page fault
cpuid = 1
time = 1681750330
KDB: enter: panic
FreeBSD 14.0-CURRENT #0 plus-RELENG_23_01-n256037-6e914874a5e: Fri Feb 10 20:30:29 UTC 2023
  root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/obj/amd64/VDZvZksF/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/sources/FreeBSD-src-plus-RELENG_23_01/amd64.amd64/sys/pfSense
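As a quick cross-check when lining the panic up against other logs, the `time = 1681750330` value in the panic output can be read as a Unix epoch in seconds. A minimal Python sketch, assuming the +0100 offset shown in the Dumptime field; the helper name `panic_local_time` is made up for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: +0100 is the local offset, as printed in the Dumptime field.
LOCAL_TZ = timezone(timedelta(hours=1))

def panic_local_time(epoch_seconds, tz=LOCAL_TZ):
    """Convert the 'time = N' value from the panic output to local time."""
    return datetime.fromtimestamp(epoch_seconds, tz)

panic_time = panic_local_time(1681750330)
print(panic_time.isoformat())  # 2023-04-17T17:52:10+01:00
```

That lands on the same second as the Dumptime in the dump header, so the two fields can be used interchangeably when searching system.log.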
I can add more detail if you point me where to look.
-
I should add that this was immediately after upgrading the firmware to 03.00.00.03t-uc-18.
-
I split this into a different thread because it's not an MCA fault like the other thread.
This is not a crash I've seen before. Is this the first time you've seen it?
Is that the full backtrace?
Steve
-
There is a load more following that section, starting with db:1:pfs> show registers. I can add more detail from any section you have in mind.
It's the first time I have seen it...'ish. I had attributed a similar previous crash to the interface race condition I am suffering from (which you put an upstream fix in place for).
Looking in the system.log I can see an interface-related event 2 seconds before this crash (as said, going through the firmware change process), so that may or may not be linked:
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IFACE: Down event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IFACE: Rename interface pppoe0 to pppoe0
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IFACE: Set description "WAN"
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPCP: rec'd Terminate Ack #4 (Closing)
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPCP: state change Closing --> Closed
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPCP: LayerFinish
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPV6CP: rec'd Terminate Ack #2 (Closing)
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPV6CP: state change Closing --> Closed
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPV6CP: LayerFinish
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] Bundle: No NCPs left. Closing links...
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] Bundle: closing link "wan_link0"...
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] Link: CLOSE event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: Close event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: state change Opened --> Closing
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] Link: Leave bundle "wan"
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] Bundle: Status update: up 0 links, total bandwidth 9600 bps
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPCP: Close event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPV6CP: Close event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPCP: Down event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPCP: state change Closed --> Initial
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPV6CP: Down event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IPV6CP: state change Closed --> Initial
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: SendTerminateReq #4
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: LayerDown
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: rec'd Terminate Ack #4 (Closing)
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: state change Closing --> Closed
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: LayerFinish
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] Link: DOWN event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] Link: giving up after 0 reconnection attempts
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: Close event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: Down event
Apr 17 17:52:08 Router-8 ppp[19961]: [wan_link0] LCP: state change Closed --> Initial
Apr 17 17:52:09 Router-8 php-fpm[367]: /status_interfaces.php: Shutting down Router Advertisement daemon cleanly
Apr 17 17:52:10 Router-8 ppp[19961]: [wan] Bundle: Shutdown
Apr 17 17:52:10 Router-8 ppp[19961]: [wan_link0] Link: Shutdown
Apr 17 17:52:10 Router-8 ppp[19961]: process 19961 terminated
Apr 17 17:52:10 Router-8 vnstatd[39953]: Interface "pppoe0" disabled.
Apr 17 17:52:10 Router-8 kernel:
Apr 17 17:52:10 Router-8 kernel:
Apr 17 17:52:10 Router-8 kernel: Fatal trap 12: page fault while in kernel mode
Apr 17 17:52:10 Router-8 kernel: cpuid = 1; apic id = 0c
Apr 17 17:52:10 Router-8 kernel: fault virtual address = 0x10
Apr 17 17:52:10 Router-8 kernel: fault code = supervisor read data, page not present
Apr 17 17:52:10 Router-8 kernel: instruction pointer = 0x20:0xffffffff80fe7fe7
Apr 17 17:53:24 Router-8 syslogd: kernel boot file is /boot/kernel/kernel
Apr 17 17:53:24 Router-8 kernel: ---<<BOOT>>---
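The 2-second gap between the interface going down and the trap can be measured straight from the log timestamps. A rough Python sketch, assuming classic syslog lines that start with a 15-character "Mon DD HH:MM:SS" stamp and taking the year (2023) from the Dumptime field, since the log lines themselves carry none; `syslog_time` is a hypothetical helper name and the three lines are copied from the log above:

```python
from datetime import datetime

def syslog_time(line, year=2023):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    The year is an assumption supplied by the caller; syslog omits it.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

log_excerpt = [
    "Apr 17 17:52:08 Router-8 ppp[19961]: [wan] IFACE: Down event",
    'Apr 17 17:52:10 Router-8 vnstatd[39953]: Interface "pppoe0" disabled.',
    "Apr 17 17:52:10 Router-8 kernel: Fatal trap 12: page fault while in kernel mode",
]

# First line mentioning each event of interest.
iface_down = next(l for l in log_excerpt if "IFACE: Down event" in l)
trap = next(l for l in log_excerpt if "Fatal trap 12" in l)

gap_seconds = (syslog_time(trap) - syslog_time(iface_down)).total_seconds()
print(gap_seconds)  # 2.0
```

Syslog only has one-second resolution here, so this shows the two events are close together, not which code path ran in between.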
Thanks for pulling this out from the other 6100 crash thread.
-
Let me see what I can find. I think it's very unlikely to be related to the blinkboot update.
-
Are you able to replicate that at all? Perhaps by closing the PPP link whilst trying to use it with IPv6?
-
I don't think it's the blinkboot either, only that the blinkboot update requires a power cycle & power disconnect.
Without the next update in place (I'm running a fully-patched production version, not the nightly builds) a link disconnect will probably trigger a crash in its own right, without the 'clean' shutdown ahead of this particular crash.
Figuring this out with the interface issue in play is tricky.
-
Ah, OK yes. I was forgetting what that previous issue was. It's almost certainly related to this. As you say, it's hard to handle it separately until we have 23.05 released.
-
Hi Steve,
As fate would have it, another crash ~15 minutes ago, very similar to the previous one:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 04
fault virtual address = 0x10
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe7fe7
stack pointer = 0x28:0xfffffe00cd6dc970
frame pointer = 0x28:0xfffffe00cd6dc9a0
code segment = base 0x0, limit 0xfffff, type 0x1b
  = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 56065 (unbound)
rdi: fffffe00cd6dc918 rsi: fffff801d93b0ac8 rdx: 1
rcx: fffffe00cd6dc918 r8: 0 r9: fffffe00cd6dc920
rax: 0 rbx: 0 rbp: fffffe00cd6dc9a0
r10: fffff801d93b0af8 r11: 8 r12: fffff8019a3ef898
r13: 0 r14: fffffe00cd6dc978 r15: fffff8019a3ef800
trap number = 12
panic: page fault
cpuid = 0
time = 1682001736
KDB: enter: panic
Back Trace:
Crash report details:
No PHP errors found.
Filename: /var/crash/info.0
Dump header from device: /dev/nvd0p3
  Architecture: amd64
  Architecture Version: 4
  Dump Length: 245760
  Blocksize: 512
  Compression: none
  Dumptime: 2023-04-20 15:42:16 +0100
  Hostname: Router-8.xxxxxxx.me
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 14.0-CURRENT #0 plus-RELENG_23_01-n256037-6e914874a5e: Fri Feb 10 20:30:29 UTC 2023
    root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/obj/amd64/VDZvZksF/var/j
  Panic String: page fault
  Dump Parity: 1812278328
  Bounds: 0
  Dump Status: good
Filename: /var/crash/textdump.tar.0
db:0:kdb.enter.default> run pfs
db:1:pfs> bt
Tracing pid 56065 tid 100566 td 0xfffffe00cdb73ac0
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00cd6dc730
vpanic() at vpanic+0x182/frame 0xfffffe00cd6dc780
panic() at panic+0x43/frame 0xfffffe00cd6dc7e0
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00cd6dc840
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00cd6dc8a0
calltrap() at calltrap+0x8/frame 0xfffffe00cd6dc8a0
--- trap 0xc, rip = 0xffffffff80fe7fe7, rsp = 0xfffffe00cd6dc970, rbp = 0xfffffe00cd6dc9a0 ---
in6_selecthlim() at in6_selecthlim+0x97/frame 0xfffffe00cd6dc9a0
tcp_default_output() at tcp_default_output+0x1c61/frame 0xfffffe00cd6dcb70
tcp_usr_send() at tcp_usr_send+0x345/frame 0xfffffe00cd6dcc20
sosend_generic() at sosend_generic+0x600/frame 0xfffffe00cd6dccd0
sosend() at sosend+0x3b/frame 0xfffffe00cd6dcd00
soo_write() at soo_write+0x33/frame 0xfffffe00cd6dcd40
dofilewrite() at dofilewrite+0x88/frame 0xfffffe00cd6dcd90
sys_write() at sys_write+0xbc/frame 0xfffffe00cd6dce00
amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe00cd6dcf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00cd6dcf30
--- syscall (4, FreeBSD ELF64, sys_write), rip = 0x827a2ce6a, rsp = 0x82082cec8, rbp = 0x82082cf00 ---
db:1:pfs> show registers
cs 0x20
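To back up the "very similar" impression, the frames that ddb prints after the `--- trap 0xc ... ---` marker in each backtrace can be compared mechanically. A small Python sketch of that comparison; the function name `frames_below_trap` and both regular expressions are illustrative choices, and the two excerpts are trimmed copies of the backtraces from 17 April and 20 April in this thread:

```python
import re

# The marker ddb prints where the page fault was taken, capturing rip.
TRAP_MARKER = re.compile(r"--- trap 0xc, rip = (0x[0-9a-f]+),.*?---", re.DOTALL)
# One backtrace frame: "func() at func+0xOFFSET".
FRAME = re.compile(r"(\w+)\(\) at \w+\+(0x[0-9a-f]+)")

def frames_below_trap(bt_text):
    """Return (rip, [(function, offset), ...]) for frames after the trap marker."""
    marker = TRAP_MARKER.search(bt_text)
    if marker is None:
        return None, []
    return marker.group(1), FRAME.findall(bt_text[marker.end():])

crash_apr17 = """--- trap 0xc, rip = 0xffffffff80fe7fe7, rsp = 0xfffffe0085250bb0, rbp = 0xfffffe0085250be0 ---
in6_selecthlim() at in6_selecthlim+0x97/frame 0xfffffe0085250be0
tcp_default_output() at tcp_default_output+0x1c61/frame 0xfffffe0085250db0
tcp_timer_rexmt() at tcp_timer_rexmt+0x66b/frame 0xfffffe0085250e10"""

crash_apr20 = """--- trap 0xc, rip = 0xffffffff80fe7fe7, rsp = 0xfffffe00cd6dc970, rbp = 0xfffffe00cd6dc9a0 ---
in6_selecthlim() at in6_selecthlim+0x97/frame 0xfffffe00cd6dc9a0
tcp_default_output() at tcp_default_output+0x1c61/frame 0xfffffe00cd6dcb70
tcp_usr_send() at tcp_usr_send+0x345/frame 0xfffffe00cd6dcc20"""

rip17, frames17 = frames_below_trap(crash_apr17)
rip20, frames20 = frames_below_trap(crash_apr20)

print(rip17 == rip20)                # same faulting instruction
print(frames17[:2] == frames20[:2])  # same in6_selecthlim <- tcp_default_output pair
print(frames17[2][0], frames20[2][0])  # different callers further up
```

On these excerpts both crashes fault at the same instruction in in6_selecthlim, called from tcp_default_output at the same offset. What differs is the route in: the retransmit timer (tcp_timer_rexmt) the first time, and a write from unbound (tcp_usr_send) the second.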
I'm happy to put this on the back-burner until the next release. This crash was preceded by taking the PPPoE WAN down and up again.
-
Yeah, I think trying to diagnose this with the other issue still present is going to be very difficult. There's a good chance they are related and it might already be fixed in 23.05.