Got System Panic this am
-
System panicked this am. Would you like the textdump, and where should I send it?
-
You can put the backtrace up here if you want. It's usually enough to see what's happening, and it doesn't have any identifying info in it.
-
db:0:kdb.enter.default> bt
Tracing pid 11 tid 100004 td 0xfffff8000565f740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe0075d5efa0
vpanic() at vpanic+0x194/frame 0xfffffe0075d5eff0
panic() at panic+0x43/frame 0xfffffe0075d5f050
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe0075d5f0b0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0075d5f110
calltrap() at calltrap+0x8/frame 0xfffffe0075d5f110
--- trap 0xc, rip = 0xffffffff80d833cc, rsp = 0xfffffe0075d5f1e0, rbp = 0xfffffe0075d5f200 ---
rm_cleanIPI() at rm_cleanIPI+0x5c/frame 0xfffffe0075d5f200
smp_rendezvous_action() at smp_rendezvous_action+0xac/frame 0xfffffe0075d5f230
Xrendezvous() at Xrendezvous+0xae/frame 0xfffffe0075d5f230
--- interrupt, rip = 0xffffffff804c5324, rsp = 0xfffffe0075d5f300, rbp = 0xfffffe0075d5f330 ---
acpi_cpu_idle() at acpi_cpu_idle+0x304/frame 0xfffffe0075d5f330
cpu_idle_acpi() at cpu_idle_acpi+0x3e/frame 0xfffffe0075d5f350
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe0075d5f370
sched_idletd() at sched_idletd+0x326/frame 0xfffffe0075d5f430
fork_exit() at fork_exit+0x7e/frame 0xfffffe0075d5f470
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0075d5f470
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
-
The actual backtrace to the panic is the important part there.
I'm not familiar with that though.
Is there anything in the message buffer leading up to that? Any errors?
What snapshot was that? What hardware is it running on?
Steve
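
For anyone reading a backtrace like the one posted above: the frames before the first "--- trap" marker are the kernel's own fault and panic handling, and the first frame after that marker is the code that was running when the fault hit. Here is a minimal Python sketch that pulls that frame out of a pasted DDB backtrace. The function name `faulting_function` and the regular expressions are my own illustration, not part of any pfSense or FreeBSD tool, and they assume the frame format seen in this thread.

```python
import re

# One DDB frame, e.g. "vpanic() at vpanic+0x194/frame 0xfffffe0075d5eff0"
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+")
# One marker, e.g. "--- trap 0xc, rip = ..., rsp = ..., rbp = ... ---"
MARKER_RE = re.compile(r"--- (trap \w+|interrupt), rip = \w+, rsp = \w+, rbp = \w+ ---")
# Combined scanner: group 1 is a frame's function name, group 2 is a marker kind.
TOKEN_RE = re.compile(f"{FRAME_RE.pattern}|{MARKER_RE.pattern}")


def faulting_function(backtrace):
    """Return the function in the first frame after the first trap marker.

    Works whether the backtrace is one frame per line or pasted as a
    single line. Returns None if no trap marker or following frame is found.
    """
    after_trap = False
    for match in TOKEN_RE.finditer(backtrace):
        marker = match.group(2)
        if marker is not None:
            if marker.startswith("trap"):
                after_trap = True
        elif after_trap:
            return match.group(1)
    return None


# Excerpt taken from the backtrace in this thread.
sample = (
    "trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0075d5f110 "
    "calltrap() at calltrap+0x8/frame 0xfffffe0075d5f110 "
    "--- trap 0xc, rip = 0xffffffff80d833cc, rsp = 0xfffffe0075d5f1e0, "
    "rbp = 0xfffffe0075d5f200 --- "
    "rm_cleanIPI() at rm_cleanIPI+0x5c/frame 0xfffffe0075d5f200 "
    "smp_rendezvous_action() at smp_rendezvous_action+0xac/frame 0xfffffe0075d5f230"
)
print(faulting_function(sample))  # prints rm_cleanIPI
```

On the excerpt it reports rm_cleanIPI, which is the frame that follows the trap 0xc marker in the backtrace above.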
-
@stephenw10
Oh - found this in msg buffer -
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d7:76 to 3c:ec:ef:44:d0:c0 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d0:c1 to 3c:ec:ef:44:d7:76 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d7:76 to 3c:ec:ef:44:d0:c0 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d0:c1 to 3c:ec:ef:44:d7:76 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d7:76 to 3c:ec:ef:44:d0:c0 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d0:c1 to 3c:ec:ef:44:d7:76 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d7:76 to 3c:ec:ef:44:d0:c0 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d0:c1 to 3c:ec:ef:44:d7:76 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d7:76 to 3c:ec:ef:44:d0:c0 on ixl3.410
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d0:c1 to 3c:ec:ef:44:d7:76 on ixl3.410
<6>ovpn1: changing name to 'ovpns3'
<6>arp: 192.168.1.15 moved from 3c:ec:ef:44:d0:c0 to 3c:ec:ef:44:d0:c1 on ixl3.410
<6>ovpns3: link state changed to UP
kernel trap 12 with interrupts disabled
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x10
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d833cc
stack pointer = 0x28:0xfffffe0075d5f1e0
frame pointer = 0x28:0xfffffe0075d5f200
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu1)
trap number = 12
panic: page fault
cpuid = 1
time = 1654268809
KDB: enter: panic
-
Hmm, the ARP movements are just log spam. You can stop logging them if you know what those devices are and, for example, they are in a lagg. They wouldn't cause a panic.
https://docs.netgate.com/pfsense/en/latest/troubleshooting/logs-arp-moved.html
-
@stephenw10 Could it be related to the new OpenVPN DCO thing? I had made a change to the tunnel (trying to debug a latency problem) when it panicked. It looks like the last thing logged was the tunnel coming up.
-
It was immediately afterwards? And you had DCO enabled?
It could well be related then. What snapshot is that and what hardware?
Steve
-
@stephenw10 22.05-BETA (amd64)
built on Tue May 31 06:20:27 UTC 2022
FreeBSD 12.3-STABLE
-
@swixo
Oh yes - DCO on - the tunnel that came up just before the panic - is a DCO tunnel -
How many OpenVPN tunnels in total do you have there? How many are using DCO?
-
@stephenw10 Only this one is UP.
There are two others - but they were completely idle.
-
All 3 had DCO enabled though?
-
@stephenw10 Yes. They were enabled.
-
Mmm, OK. Is this the only time you have seen it?
-
@stephenw10 Yes - this is the first panic I have observed.
-
Ok, thanks. I've opened an internal bug for it. Our developers are looking into it.
Steve
-
@stephenw10 If helpful -
This system is the 'server' side of an s-s OpenVPN tunnel and it crashed.
The action that caused it was an update made to the remote client end. After that change - the tunnel went down, then up - then crashed.
-
@stephenw10 Will you let us know when a change is logged internally that should affect this? I don't want to try again until we have a resolution.
-
We can't know for certain, since as far as I know you are the only person who has hit that. We never replicated it here. But we have applied a fix for what appears to be the bug that caused it. It's in 22.05-RC now.
Steve