pfSense Crash "Fatal trap 12: page fault while in kernel mode"
-
After quite a while, I have now had my next crash. It's again with tailscale.
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xb8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f44300
stack pointer = 0x28:0xfffffe008b818c80
frame pointer = 0x28:0xfffffe008b818d00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 39102 (tailscaled)
rdi: ffffffff82d62a40 rsi: 000000000000c8cb rdx: 0000000000000000
rcx: 0000000000000000 r8: fffff8000745fb00 r9: 0000000000000000
rax: 0000000000000030 rbx: fffff80006b44540 rbp: fffffe008b818d00
r10: 0000000000000000 r11: fffffe0069391e20 r12: fffff800198e1360
r13: 000000000000c8cb r14: 0000000000000001 r15: fffff8000745fb00
trap number = 12
panic: page fault
cpuid = 0
time = 1710249378
KDB: enter: panic
-
Do you have the backtrace? Full crash report? You can upload it to the same NextCloud link above.
-
@stephenw10 Files are uploaded.
-
Backtrace:
db:0:kdb.enter.default> bt
Tracing pid 39102 tid 110696 td 0xfffffe0069391900
kdb_enter() at kdb_enter+0x32/frame 0xfffffe008b818960
vpanic() at vpanic+0x163/frame 0xfffffe008b818a90
panic() at panic+0x43/frame 0xfffffe008b818af0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe008b818b50
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe008b818bb0
calltrap() at calltrap+0x8/frame 0xfffffe008b818bb0
--- trap 0xc, rip = 0xffffffff80f44300, rsp = 0xfffffe008b818c80, rbp = 0xfffffe008b818d00 ---
in6_pcbbind() at in6_pcbbind+0x440/frame 0xfffffe008b818d00
udp6_bind() at udp6_bind+0x13c/frame 0xfffffe008b818d60
sobind() at sobind+0x32/frame 0xfffffe008b818d80
kern_bindat() at kern_bindat+0x96/frame 0xfffffe008b818dc0
sys_bind() at sys_bind+0x9b/frame 0xfffffe008b818e00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe008b818f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe008b818f30
--- syscall (104, FreeBSD ELF64, bind), rip = 0x482bff, rsp = 0x871d42a50, rbp = 0x871d42a50 ---
The message buffer is spammed with ARP movement logs. If that is expected from something you should consider disabling those logs. It may be hiding other useful entries:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/logs-arp-moved.html
That backtrace looks identical though, so that looks like a software bug at this point.
What interfaces do you have? That error is something trying to listen on IPv6 and hitting something unexpected.
-
I had another crash an hour ago, again with tailscale.
Those are my interfaces; none of them has IPv6 configured.
PS: I have uploaded the logs again if you need them, but I think they are identical.
-
Yup same crash but at least here it happened soon enough the message buffer still has useful data in it.
So I can see that you're running virtualised with vtnet NICs, a bunch of VLANs, and a PPPoE interface.
One of those things probably has something unusual about the v6 link-local address. Can you send me the output of:
ifconfig -vma
to the nc folder?
-
@stephenw10
Done.
-
Ok, two things:
Your tailscale interface has a valid IPv6 address, which is probably why the error is happening there.
But more likely you somehow have a lagg interface that doesn't have any member interfaces. I'm not sure how you might have that. Did you have a lagg configured previously? Is there any lagg config left over?
-
The LAGG is from my pfSense box; I am using the VM at the moment to check whether I have a hardware problem.
Disabling IPv6 on Tailscale is not possible. Should I enable IPv6 on pfSense again instead?
-
Nope there should be no problem having IPv6 only on tailscale.
More likely it's trying to listen on all interfaces including lagg0 but lagg0 is invalid. Remove the lagg entirely.
-
OK, the LAGG interface is removed. Now I will wait and see whether I get another crash.
-
Hi,
I changed back to my primary hardware last week. Now I have had my first crash on it.
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0xb8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f44300
stack pointer = 0x28:0xffffffff83796c80
frame pointer = 0x28:0xffffffff83796d00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 6121 (tailscaled)
rdi: ffffffff82d62a40 rsi: 0000000000005ce5 rdx: 0000000000000000
rcx: 0000000000000000 r8: fffff8001dd2f700 r9: 0000000000000000
rax: 0000000000000030 rbx: fffff8001da28380 rbp: ffffffff83796d00
r10: 0000000000000000 r11: fffffe006b33a8c0 r12: fffff80123e9bb80
r13: 0000000000005ce5 r14: 0000000000000001 r15: fffff8001dd2f700
trap number = 12
panic: page fault
cpuid = 0
time = 1711432782
KDB: enter: panic
PS: Have uploaded the dump.
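For reference, the `time =` field in the panic output is a Unix epoch timestamp. A minimal sketch (Python, standard library only; the helper name `panic_time_utc` is just illustrative) that converts the two values from the crashes in this thread to UTC, which may help when matching a panic against other log entries:

```python
from datetime import datetime, timezone

# "time =" values copied from the two panic messages in this thread.
PANIC_TIMES = [1710249378, 1711432782]

def panic_time_utc(epoch_seconds):
    """Return the panic timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()

for t in PANIC_TIMES:
    print(t, "->", panic_time_utc(t))
```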
-
Pretty much identical backtrace:
db:0:kdb.enter.default> bt
Tracing pid 6121 tid 101274 td 0xfffffe006b33a3a0
kdb_enter() at kdb_enter+0x32/frame 0xffffffff83796960
vpanic() at vpanic+0x163/frame 0xffffffff83796a90
panic() at panic+0x43/frame 0xffffffff83796af0
trap_fatal() at trap_fatal+0x40c/frame 0xffffffff83796b50
trap_pfault() at trap_pfault+0x4f/frame 0xffffffff83796bb0
calltrap() at calltrap+0x8/frame 0xffffffff83796bb0
--- trap 0xc, rip = 0xffffffff80f44300, rsp = 0xffffffff83796c80, rbp = 0xffffffff83796d00 ---
in6_pcbbind() at in6_pcbbind+0x440/frame 0xffffffff83796d00
udp6_bind() at udp6_bind+0x13c/frame 0xffffffff83796d60
sobind() at sobind+0x32/frame 0xffffffff83796d80
kern_bindat() at kern_bindat+0x96/frame 0xffffffff83796dc0
sys_bind() at sys_bind+0x9b/frame 0xffffffff83796e00
amd64_syscall() at amd64_syscall+0x109/frame 0xffffffff83796f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xffffffff83796f30
--- syscall (104, FreeBSD ELF64, bind), rip = 0x482bff, rsp = 0x87058fa50, rbp = 0x87058fa50 ---
Message buffer is still spammed by arp movement logs hiding anything that might be useful. You should really think about just disabling that logging if those MACs are known:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/logs-arp-moved.html
Can you upload the ifconfig output from that hardware?
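If you want to check the "identical backtrace" claim yourself, here is a rough sketch in Python. It takes the frames between the trap and the syscall entry from both crashes in this thread, drops the frame addresses (those differ between the VM and the hardware) and compares only function names and offsets. The regex and the helper name `call_path` are my own and not part of any pfSense tooling.

```python
import re

# Frames from the VM crash (copied from the first backtrace in this thread).
BT_VM = """
in6_pcbbind() at in6_pcbbind+0x440/frame 0xfffffe008b818d00
udp6_bind() at udp6_bind+0x13c/frame 0xfffffe008b818d60
sobind() at sobind+0x32/frame 0xfffffe008b818d80
kern_bindat() at kern_bindat+0x96/frame 0xfffffe008b818dc0
sys_bind() at sys_bind+0x9b/frame 0xfffffe008b818e00
"""

# Frames from the crash on the physical hardware (second backtrace).
BT_HW = """
in6_pcbbind() at in6_pcbbind+0x440/frame 0xffffffff83796d00
udp6_bind() at udp6_bind+0x13c/frame 0xffffffff83796d60
sobind() at sobind+0x32/frame 0xffffffff83796d80
kern_bindat() at kern_bindat+0x96/frame 0xffffffff83796dc0
sys_bind() at sys_bind+0x9b/frame 0xffffffff83796e00
"""

# Capture the function name and the offset; ignore the frame address.
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+(0x[0-9a-f]+)/frame 0x[0-9a-f]+")

def call_path(backtrace):
    """Return a list of (function, offset) tuples in backtrace order."""
    return FRAME_RE.findall(backtrace)

same = call_path(BT_VM) == call_path(BT_HW)
print("same call path:", same)
for func, offset in call_path(BT_HW):
    print(f"  {func}+{offset}")
```

Both crashes go through the same path, a bind() syscall on a UDP6 socket ending in in6_pcbbind+0x440, which is why this looks like the same bug on both systems.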
-
@stephenw10 The files are uploaded.
I don't understand exactly where I can disable that setting or why I get those messages.
EDIT: Got it, I hadn't read the last line of the page at that URL.
-
Hmm pretty much as before then. The only anomaly there is that one link in the lagg is not participating/active:
laggport: igc0 flags=8<COLLECTING> state=1f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,COLLECTING> [(8000,7C-2B-E1-13-62-5B,01E6,8000,0001), (FFFF,74-4D-28-07-F0-08,0007,00FF,0004)]
-
@stephenw10 I know that; I think the cable is broken. I have now switched from SpeedShift to PowerD, and since then there have been no more crashes. Before that change, after the first crash in the morning, I had a crash every hour.
-
Huh, well that's..... unexpected! There was some speculation that it could be a race condition between multiple processes accessing the same socket. Changing the CPU frequency could affect that.
-
Good morning,
I had another crash this morning, now with PowerD, so that is not the fix for the crashes. I uploaded the logs, but I think the crash report is the same.
-
Yes identical crash.
What's connected to igc0? It's flapping a lot:
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
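To get a quick feel for how often a port is flapping, something like the following Python sketch can count the link state changes per interface in a saved message buffer. The sample is the excerpt above; the regex and the helper name `count_link_changes` are assumptions of mine, so adjust them if your log lines look different.

```python
import re
from collections import Counter

# Excerpt of the message buffer quoted above.
LOG = """\
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
"""

# Capture the interface name and the new link state.
LINK_RE = re.compile(r"(\w+): link state changed to (UP|DOWN)")

def count_link_changes(log):
    """Count link state changes, keyed by (interface, new_state)."""
    return Counter(LINK_RE.findall(log))

counts = count_link_changes(LOG)
for (ifname, state), n in sorted(counts.items()):
    print(f"{ifname} went {state} {n} times")
```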
-
That's one of the LAGG ports. I have disabled the port for the moment.