pfSense Crash "Fatal trap 12: page fault while in kernel mode"

DrAg0n141

I had another crash an hour ago, again with tailscale.

These are my interfaces; none of them has IPv6 configured.

[Image: screenshot of the interface list]

PS: I have uploaded the logs again in case you need them, but I think they are identical.

stephenw10

Yup same crash but at least here it happened soon enough the message buffer still has useful data in it.

So I can see that you're running virtualised with vtnet NICs, a bunch of VLANs, and a PPPoE interface.

One of those things probably has something unusual about the v6 link-local address. Can you send me the output of `ifconfig -vma` to the nc folder?

DrAg0n141

@stephenw10
Done.

stephenw10

Ok two things:
Your tailscale interface has a valid IPv6 address which is probably why the error is happening there.

But more likely you somehow have a lagg interface that doesn't have any member interfaces. I'm not sure how you might have that. Did you have a lagg configured previously? Is there any lagg config left over?

DrAg0n141

@stephenw10

The LAGG is from my pfSense box; I'm using the VM at the moment to check whether I have a hardware problem.

Disabling IPv6 on Tailscale is not possible. Should I enable IPv6 on pfSense again, then?

stephenw10

Nope there should be no problem having IPv6 only on tailscale.

More likely it's trying to listen on all interfaces including lagg0 but lagg0 is invalid. Remove the lagg entirely.

DrAg0n141

@stephenw10

OK, the LAGG interface is removed. Now I'll wait and see whether I get another crash.

DrAg0n141

Hi,

I switched back to my primary hardware last week. Now I've had my first crash on it.


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address	= 0xb8
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80f44300
stack pointer	        = 0x28:0xffffffff83796c80
frame pointer	        = 0x28:0xffffffff83796d00
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 6121 (tailscaled)
rdi: ffffffff82d62a40 rsi: 0000000000005ce5 rdx: 0000000000000000
rcx: 0000000000000000  r8: fffff8001dd2f700  r9: 0000000000000000
rax: 0000000000000030 rbx: fffff8001da28380 rbp: ffffffff83796d00
r10: 0000000000000000 r11: fffffe006b33a8c0 r12: fffff80123e9bb80
r13: 0000000000005ce5 r14: 0000000000000001 r15: fffff8001dd2f700
trap number		= 12
panic: page fault
cpuid = 0
time = 1711432782
KDB: enter: panic

PS: I have uploaded the dump.
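
A few fields in the trap message above are easier to compare across crashes once extracted. The sketch below is only an illustration: `summarize_panic` is a hypothetical helper, not part of pfSense or FreeBSD, and the sample text is an abbreviated copy of the message above. A fault address as small as 0xb8 is usually consistent with a read through a NULL pointer at a structure offset, but that reading is an assumption here, not something the dump alone confirms.

```python
import re
from datetime import datetime, timezone

# Abbreviated copy of the trap message from this post.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0xb8
fault code              = supervisor read data, page not present
current process         = 6121 (tailscaled)
trap number             = 12
panic: page fault
time = 1711432782
"""

def summarize_panic(text):
    """Pull the fault address, faulting process and panic time out of a trap message."""
    summary = {}
    m = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if m:
        summary["fault_address"] = int(m.group(1), 16)
    m = re.search(r"current process\s*=\s*(\d+)\s*\(([^)]+)\)", text)
    if m:
        summary["pid"] = int(m.group(1))
        summary["process"] = m.group(2)
    # "time" is seconds since the Unix epoch; convert it to UTC.
    m = re.search(r"^time\s*=\s*(\d+)", text, re.MULTILINE)
    if m:
        summary["time_utc"] = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return summary

summary = summarize_panic(PANIC_TEXT)
print(hex(summary["fault_address"]), summary["process"], summary["time_utc"].isoformat())
```

Running the same helper over each crash report makes it quick to see whether the fault address and process are the same every time.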

stephenw10

Pretty much identical backtrace:

db:0:kdb.enter.default>  bt
Tracing pid 6121 tid 101274 td 0xfffffe006b33a3a0
kdb_enter() at kdb_enter+0x32/frame 0xffffffff83796960
vpanic() at vpanic+0x163/frame 0xffffffff83796a90
panic() at panic+0x43/frame 0xffffffff83796af0
trap_fatal() at trap_fatal+0x40c/frame 0xffffffff83796b50
trap_pfault() at trap_pfault+0x4f/frame 0xffffffff83796bb0
calltrap() at calltrap+0x8/frame 0xffffffff83796bb0
--- trap 0xc, rip = 0xffffffff80f44300, rsp = 0xffffffff83796c80, rbp = 0xffffffff83796d00 ---
in6_pcbbind() at in6_pcbbind+0x440/frame 0xffffffff83796d00
udp6_bind() at udp6_bind+0x13c/frame 0xffffffff83796d60
sobind() at sobind+0x32/frame 0xffffffff83796d80
kern_bindat() at kern_bindat+0x96/frame 0xffffffff83796dc0
sys_bind() at sys_bind+0x9b/frame 0xffffffff83796e00
amd64_syscall() at amd64_syscall+0x109/frame 0xffffffff83796f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xffffffff83796f30
--- syscall (104, FreeBSD ELF64, bind), rip = 0x482bff, rsp = 0x87058fa50, rbp = 0x87058fa50 ---

Message buffer is still spammed by arp movement logs hiding anything that might be useful. You should really think about just disabling that logging if those MACs are known:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/logs-arp-moved.html

Can you upload the ifconfig output from that hardware?
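
For context on the trace above: it ends in the `bind` syscall on a UDP IPv6 socket (`sys_bind` to `udp6_bind` to `in6_pcbbind`). The sketch below shows what that operation looks like from userland. It is an illustration only and not a reproducer: tailscaled is a Go program, and the address and port it actually binds are not shown in this thread, so the wildcard address and ephemeral port used here are assumptions.

```python
import socket

def bind_udp6_any():
    """Bind a UDP socket to the IPv6 wildcard address on an ephemeral port.

    Returns the port chosen by the kernel, or None if IPv6 is unavailable.
    """
    if not socket.has_ipv6:
        return None
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
            # Port 0 asks the kernel to pick a free port (assumed for illustration).
            sock.bind(("::", 0))
            return sock.getsockname()[1]
    except OSError:
        # No usable IPv6 stack on this host.
        return None

port = bind_udp6_any()
print("bound port:", port)
```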

DrAg0n141

@stephenw10 The files are uploaded.
I don't understand exactly where I can disable that setting, or why I get those messages.

EDIT: Got it, I hadn't read the last line of the page at that URL.

stephenw10

Hmm pretty much as before then. The only anomaly there is that one link in the lagg is not participating/active:

        laggport: igc0 flags=8<COLLECTING> state=1f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,COLLECTING>
                [(8000,7C-2B-E1-13-62-5B,01E6,8000,0001),
                 (FFFF,74-4D-28-07-F0-08,0007,00FF,0004)]
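
The `state=1f` value in that output is a bitmask of the LACP actor state bits. As a rough sketch, assuming the standard IEEE 802.1AX bit assignments (the names match what ifconfig prints; `decode_lacp_state` is a hypothetical helper), it can be decoded like this:

```python
# LACP actor state bits, lowest bit first (standard 802.1AX assignment).
LACP_STATE_BITS = [
    (0x01, "ACTIVITY"),
    (0x02, "TIMEOUT"),
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
]

def decode_lacp_state(value):
    """Return the names of the state bits set in value."""
    return [name for bit, name in LACP_STATE_BITS if value & bit]

state = decode_lacp_state(0x1F)
print(state)
print("DISTRIBUTING set:", "DISTRIBUTING" in state)
```

For 0x1f the DISTRIBUTING bit (0x20) is not set, which would be consistent with the port not sending traffic on the lagg.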

DrAg0n141

@stephenw10 I know that; I think the cable is broken. I have now switched from SpeedShift to PowerD, and there have been no more crashes since. Before changing the setting, and after the first crash in the morning, I had a crash every hour.

stephenw10

Huh, well that's..... unexpected! There was some speculation that it could be a race condition between multiple processes accessing the same socket. Changing the CPU frequency could affect that.

DrAg0n141

Good morning,

I had another crash this morning, now with PowerD, so that's not the fix for the crashes. I uploaded the logs, but I think the crash report is the same.

stephenw10

Yes identical crash.

What's connected to igc0? It's flapping a lot:

<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP

DrAg0n141

That's one of the LAGG ports. I have disabled the port for the moment.
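
To put a number on the flapping in the log excerpt above, the link-state messages can be counted per interface. This is a minimal sketch: `count_link_transitions` is a hypothetical helper, and the sample text is copied from the excerpt rather than read from the real message buffer.

```python
import re
from collections import Counter

# Link-state messages copied from the excerpt above.
LOG_TEXT = """\
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
<6>igc0: link state changed to DOWN
<6>igc0: link state changed to UP
"""

LINK_RE = re.compile(r"^<\d+>(\w+): link state changed to (UP|DOWN)$", re.MULTILINE)

def count_link_transitions(text):
    """Count UP and DOWN transitions per interface name."""
    counts = Counter()
    for match in LINK_RE.finditer(text):
        counts[(match.group(1), match.group(2))] += 1
    return counts

counts = count_link_transitions(LOG_TEXT)
for (iface, state), n in sorted(counts.items()):
    print(iface, state, n)
```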

stephenw10

Hmm, I can find no way of disabling IPv6 as a source address in tailscale.

One thing you could try is disabling IPv6 link-local addresses on the interface. Of course that breaks IPv6 if you need it. It also doesn't disable it on localhost so tailscale can still try to bind to that.

DrAg0n141

I got another crash today, again with tailscaled.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0xb8
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80f44300
stack pointer	        = 0x28:0xffffffff8377fc80
frame pointer	        = 0x28:0xffffffff8377fd00
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 90406 (tailscaled)
rdi: ffffffff82d62a40 rsi: 00000000000040f9 rdx: 0000000000000000
rcx: 0000000000000000  r8: fffff80020114900  r9: 0000000000000000
rax: 0000000000000030 rbx: fffff80109d95700 rbp: ffffffff8377fd00
r10: 0000000000000000 r11: fffffe007abb98c0 r12: fffff8000b5e3a40
r13: 00000000000040f9 r14: 0000000000000001 r15: fffff80020114900
trap number		= 12
panic: page fault
cpuid = 0
time = 1712133936
KDB: enter: panic

stephenw10

Same backtrace?

Are you able to test disabling link-local IPv6 addresses?

DrAg0n141

That's the backtrace. I can't find where I can disable the link-local IPv6 addresses.

db:0:kdb.enter.default>  bt
Tracing pid 90406 tid 101352 td 0xfffffe007abb93a0
kdb_enter() at kdb_enter+0x32/frame 0xffffffff8377f960
vpanic() at vpanic+0x163/frame 0xffffffff8377fa90
panic() at panic+0x43/frame 0xffffffff8377faf0
trap_fatal() at trap_fatal+0x40c/frame 0xffffffff8377fb50
trap_pfault() at trap_pfault+0x4f/frame 0xffffffff8377fbb0
calltrap() at calltrap+0x8/frame 0xffffffff8377fbb0
--- trap 0xc, rip = 0xffffffff80f44300, rsp = 0xffffffff8377fc80, rbp = 0xffffffff8377fd00 ---
in6_pcbbind() at in6_pcbbind+0x440/frame 0xffffffff8377fd00
udp6_bind() at udp6_bind+0x13c/frame 0xffffffff8377fd60
sobind() at sobind+0x32/frame 0xffffffff8377fd80
kern_bindat() at kern_bindat+0x96/frame 0xffffffff8377fdc0
sys_bind() at sys_bind+0x9b/frame 0xffffffff8377fe00
amd64_syscall() at amd64_syscall+0x109/frame 0xffffffff8377ff30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xffffffff8377ff30
--- syscall (104, FreeBSD ELF64, bind), rip = 0x482bff, rsp = 0x86cadaa50, rbp = 0x86cadaa50 ---
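
One way to check whether two traces like the ones in this thread are really the same is to compare only the function names and offsets, ignoring the frame addresses, which differ from crash to crash. The sketch below assumes the `ddb` line format shown above; `backtrace_signature` is a hypothetical helper and the sample traces are abbreviated to the frames below the trap.

```python
import re

# Matches lines such as "sobind() at sobind+0x32/frame 0xffffffff83796d80".
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+(0x[0-9a-f]+)/frame 0x[0-9a-f]+$")

def backtrace_signature(text):
    """Return the list of (function, offset) pairs, dropping frame addresses."""
    signature = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            signature.append((match.group(1), match.group(2)))
    return signature

# Frames from the first crash on the primary hardware (abbreviated).
TRACE_A = """\
in6_pcbbind() at in6_pcbbind+0x440/frame 0xffffffff83796d00
udp6_bind() at udp6_bind+0x13c/frame 0xffffffff83796d60
sobind() at sobind+0x32/frame 0xffffffff83796d80
kern_bindat() at kern_bindat+0x96/frame 0xffffffff83796dc0
sys_bind() at sys_bind+0x9b/frame 0xffffffff83796e00
"""

# Frames from the later crash (abbreviated).
TRACE_B = """\
in6_pcbbind() at in6_pcbbind+0x440/frame 0xffffffff8377fd00
udp6_bind() at udp6_bind+0x13c/frame 0xffffffff8377fd60
sobind() at sobind+0x32/frame 0xffffffff8377fd80
kern_bindat() at kern_bindat+0x96/frame 0xffffffff8377fdc0
sys_bind() at sys_bind+0x9b/frame 0xffffffff8377fe00
"""

sig_a = backtrace_signature(TRACE_A)
sig_b = backtrace_signature(TRACE_B)
same = sig_a == sig_b
print("same call path:", same)
```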