Crash Report
-
I have a new pfSense box and I need help with a crash report. Thank you.
-
You can upload the crash report file here: https://nc.netgate.com/nextcloud/s/m8t7sSzGHLRW5ji
-
Sorry I didn't realise you had uploaded this.
Backtrace:
db:0:kdb.enter.default> bt
Tracing pid 65762 tid 100256 td 0xfffffe00fbb20000
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00fa39c5f0
vpanic() at vpanic+0x163/frame 0xfffffe00fa39c720
panic() at panic+0x43/frame 0xfffffe00fa39c780
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00fa39c7e0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00fa39c840
calltrap() at calltrap+0x8/frame 0xfffffe00fa39c840
--- trap 0xc, rip = 0xffffffff80e5fc25, rsp = 0xfffffe00fa39c910, rbp = 0xfffffe00fa39caf0 ---
sysctl_iflist() at sysctl_iflist+0x2b5/frame 0xfffffe00fa39caf0
sysctl_rtsock() at sysctl_rtsock+0x2fb/frame 0xfffffe00fa39cbd0
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x90/frame 0xfffffe00fa39cc20
sysctl_root() at sysctl_root+0x216/frame 0xfffffe00fa39cca0
userland_sysctl() at userland_sysctl+0x176/frame 0xfffffe00fa39cd50
sys___sysctl() at sys___sysctl+0x5c/frame 0xfffffe00fa39ce00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe00fa39cf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00fa39cf30
--- syscall (202, FreeBSD ELF64, __sysctl), rip = 0x82247826a, rsp = 0x820656178, rbp = 0x8206561b0 ---
Panic:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0xffffea0055c55000
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80e5fc25
stack pointer = 0x28:0xfffffe00fa39c910
frame pointer = 0x28:0xfffffe00fa39caf0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 65762 (darkstat)
rdi: fffffe00fa39ccc0 rsi: 0000000000000000 rdx: 0000000000000ae0
rcx: 00000000000009f8 r8: 00000000667dedf3 r9: 00000000667dedf3
rax: 0000000000000000 rbx: fffffe00fa39cb68 rbp: fffffe00fa39caf0
r10: 000000000007cf53 r11: fffffe00fbb20520 r12: ffffea0055c55000
r13: 00000000000000e8 r14: 0000000000000000 r15: fffff8001584dd00
trap number = 12
panic: page fault
cpuid = 1
time = 1719551956
KDB: enter: panic
So it looks like it's in darkstat while trying to use sysctl to see available interfaces.
Do you see the same panic/backtrace on several crash reports?
Try disabling darkstat.
-
118>Bootup complete
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x100000000012
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fa3cfb
stack pointer = 0x28:0xfffffe00c5975e60
frame pointer = 0x28:0xfffffe00c5975e80
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000100000000000 rsi: ffffffff84010000 rdx: 0000000000000180
rcx: fffffe00db3d83a0 r8: 0000000000000004 r9: ffffffff84013000
rax: 0000000000000000 rbx: 0000000000064071 rbp: fffffe00c5975e80
r10: 000000000000000f r11: 0000000081b3bc6d r12: fffffe00deee01c8
r13: 0000100000000000 r14: fffffe00deee01a8 r15: 00000000000025d7
trap number = 12
panic: page fault
cpuid = 3
time = 1721232192
KDB: enter: panic
panic.txt: page fault
version.txt:
FreeBSD 14.0-CURRENT amd64 1400094 #1 RELENG_2_7_2-n255948-8d2b56da39c: Wed Dec 6 20:45:47 UTC 2023
root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_2-main/obj/amd64/StdASW5b/var/jenkins/workspace/pfSense-CE-snapshots-2_7_2-main/sources/FreeBSD-src-RELENG_2_7_2/amd64.amd64/sys/pfSense
-
@hassling under no circumstances provide more information :).
But more seriously: is it the same system you opened the last thread about?
If yes, did you do what @stephenw10 recommended and disable darkstat?
-
Yup need to see the backtrace. However that appears to be different.
If you have a number of crash reports and they are all different it's probably a hardware issue.
-
Body:
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x100000000012
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fa3cfb
stack pointer = 0x28:0xfffffe00c5975e60
frame pointer = 0x28:0xfffffe00c5975e80
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000100000000000 rsi: ffffffff84010000 rdx: 0000000000000180
rcx: fffffe00db3d83a0 r8: 0000000000000004 r9: ffffffff84013000
rax: 0000000000000000 rbx: 0000000000064071 rbp: fffffe00c5975e80
r10: 000000000000000f r11: 0000000081b3bc6d r12: fffffe00deee01c8
r13: 0000100000000000 r14: fffffe00deee01a8 r15: 00000000000025d7
trap number = 12
panic: page fault
-
Still need to see a backtrace. And posting new threads just confuses everyone.
How many crash reports do you have? We need to compare multiple crashes.
-
@stephenw10
Sorry for the confusion. Yes, it is the same system, and yes, I disabled darkstat. The crashes have happened multiple times and are still happening.
-
Ok, can you upload any further crash reports to the Nextcloud link?
Or just give us the backtraces here to compare?
-
@patient0 I did. Still happening. I bought this Topton pfSense router from AliExpress two months ago.
-
@stephenw10 I'll post backtraces later as I am not home. Thank you
-
@stephenw10 uploaded.
-
This line
current process = 7 (pf purge)
indicates that the error occurs at the pfSense boot stage when initializing the PF kernel module.
I would recommend reinstalling the system first. It is also possible that there is some kind of problem at the hardware level.
-
@Konstanti The last two different crashes happened after I put an external USB fan (plugged into the wall, not the device) under the router, as the seller recommended, to dissipate heat. The recorded temperature on the 4 cores reaches 62 Celsius, though pfSense occasionally reports 70 Celsius.
Correlation vs causation? I am just trying to exclude.
-
Did the fan bring down the CPU temps? On an N100 that's normally passively cooled any fan should have a dramatic effect. If it doesn't I would check the CPU is correctly attached to the heatsink/case, has thermal paste etc.
There are 4 distinct crashes there. One each in darkstat and netstat and all the others in pfctl and pfpurge which are very similar:
db:0:kdb.enter.default> show pcpu
cpuid = 3
dynamic pcpu = 0xfffffe009d4fff80
curthread = 0xfffffe0020552740: pid 7 tid 100113 critnest 1 "pf purge"
curpcb = 0xfffffe0020552c60
fpcurthread = none
idlethread = 0xfffffe002047fe40: tid 100006 "idle: cpu3"
self = 0xffffffff84013000
curpmap = 0xffffffff83020ab0
tssp = 0xffffffff84013384
rsp0 = 0xfffffe00c5809000
kcr3 = 0xffffffffffffffff
ucr3 = 0xffffffffffffffff
scr3 = 0x0
gs32p = 0xffffffff84013404
ldt = 0xffffffff84013444
tss = 0xffffffff84013434
curvnet = 0xfffff80001240440
db:0:kdb.enter.default> bt
Tracing pid 7 tid 100113 td 0xfffffe0020552740
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00c5808b40
vpanic() at vpanic+0x163/frame 0xfffffe00c5808c70
panic() at panic+0x43/frame 0xfffffe00c5808cd0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00c5808d30
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00c5808d90
calltrap() at calltrap+0x8/frame 0xfffffe00c5808d90
--- trap 0xc, rip = 0xffffffff80fa3cfb, rsp = 0xfffffe00c5808e60, rbp = 0xfffffe00c5808e80 ---
pf_state_expires() at pf_state_expires+0xb/frame 0xfffffe00c5808e80
pf_purge_expired_states() at pf_purge_expired_states+0xd5/frame 0xfffffe00c5808ec0
pf_purge_thread() at pf_purge_thread+0x13b/frame 0xfffffe00c5808ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00c5808f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00c5808f30
--- trap 0xe8e0e8e0, rip = 0x3cd03cdd431d431, rsp = 0x186b186b6af46af4, rbp = 0x5fc25fc2d476d476 ---

db:0:kdb.enter.default> show pcpu
cpuid = 1
dynamic pcpu = 0xfffffe009d4e1f80
curthread = 0xfffffe00fdaee3a0: pid 36100 tid 100648 critnest 1 "pfctl"
curpcb = 0xfffffe00fdaee8c0
fpcurthread = 0xfffffe00fdaee3a0: pid 36100 "pfctl"
idlethread = 0xfffffe0020480c80: tid 100004 "idle: cpu1"
self = 0xffffffff84011000
curpmap = 0xfffff800095f9ad0
tssp = 0xffffffff84011384
rsp0 = 0xfffffe00fcdb3000
kcr3 = 0xffffffffffffffff
ucr3 = 0xffffffffffffffff
scr3 = 0x0
gs32p = 0xffffffff84011404
ldt = 0xffffffff84011444
tss = 0xffffffff84011434
curvnet = 0xfffff80001240440
db:0:kdb.enter.default> bt
Tracing pid 36100 tid 100648 td 0xfffffe00fdaee3a0
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00fcdb23e0
vpanic() at vpanic+0x163/frame 0xfffffe00fcdb2510
panic() at panic+0x43/frame 0xfffffe00fcdb2570
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00fcdb25d0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00fcdb2630
calltrap() at calltrap+0x8/frame 0xfffffe00fcdb2630
--- trap 0xc, rip = 0xffffffff80fc1b62, rsp = 0xfffffe00fcdb2700, rbp = 0xfffffe00fcdb2be0 ---
pfioctl() at pfioctl+0x2ca2/frame 0xfffffe00fcdb2be0
devfs_ioctl() at devfs_ioctl+0xcc/frame 0xfffffe00fcdb2c30
vn_ioctl() at vn_ioctl+0xcf/frame 0xfffffe00fcdb2ca0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00fcdb2cc0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00fcdb2d30
sys_ioctl() at sys_ioctl+0x123/frame 0xfffffe00fcdb2e00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe00fcdb2f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00fcdb2f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x929772e79ca, rsp = 0x929740778a8, rbp = 0x92974077920 ---
All of those crashes are after bootup completes.
Nothing else significant is shown. How long does it take to crash after booting?
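For anyone comparing several reports by hand, here is a rough Python sketch of the comparison. It assumes the first frame listed after the "--- trap 0xc ... ---" marker is the function that took the page fault, and the helper name faulting_function is made up for illustration. The sample text is trimmed from the backtraces posted in this thread.

```python
import re
from collections import Counter

# Assumption: the first frame printed after the "--- trap 0xc ... ---" marker
# is the function that was running when the page fault was taken.
FAULT_FRAME = re.compile(r"--- trap 0xc,[^\n]*? ---\s*(\w+)\(\) at")


def faulting_function(backtrace):
    """Return the faulting function name, or None if no trap marker is found."""
    match = FAULT_FRAME.search(backtrace)
    return match.group(1) if match else None


# Excerpts of the backtraces from this thread, trimmed around the trap frame.
backtraces = {
    "darkstat": """calltrap() at calltrap+0x8/frame 0xfffffe00fa39c840
--- trap 0xc, rip = 0xffffffff80e5fc25, rsp = 0xfffffe00fa39c910, rbp = 0xfffffe00fa39caf0 ---
sysctl_iflist() at sysctl_iflist+0x2b5/frame 0xfffffe00fa39caf0""",
    "pf purge": """calltrap() at calltrap+0x8/frame 0xfffffe00c5808d90
--- trap 0xc, rip = 0xffffffff80fa3cfb, rsp = 0xfffffe00c5808e60, rbp = 0xfffffe00c5808e80 ---
pf_state_expires() at pf_state_expires+0xb/frame 0xfffffe00c5808e80""",
    "pfctl": """calltrap() at calltrap+0x8/frame 0xfffffe00fcdb2630
--- trap 0xc, rip = 0xffffffff80fc1b62, rsp = 0xfffffe00fcdb2700, rbp = 0xfffffe00fcdb2be0 ---
pfioctl() at pfioctl+0x2ca2/frame 0xfffffe00fcdb2be0""",
}

faults = {process: faulting_function(text) for process, text in backtraces.items()}
for process, function in faults.items():
    print(f"{process}: {function}")

# Count how many different functions the crashes landed in.
distinct = Counter(faults.values())
print(f"{len(distinct)} distinct faulting functions in {len(faults)} crashes")
```

This only groups crashes by where they faulted. It cannot tell a software bug from a hardware fault on its own, but it does make it quick to see whether repeated crashes keep landing in the same place.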
-
Similar to this: https://redmine.pfsense.org/issues/13417
-
@stephenw10 It's random, between 1-2 days. However, when I removed the fan the system was stable for 4 days. I put the fan back, powered from a USB wall outlet, and within two days the system crashed twice and froze once.
I have now removed the fan completely and I am watching. Next I'll check the thermal paste and update.
Is there any way to clear leftover files from darkstat or other packages (I already removed them)?
-
@stephenw10 Yes, the fan brought the temperature down to 40 Celsius.
Is there anything I can do to clear leftover darkstat/netstat cache files?
-
darkstat was the 1st crash, about 20 days ago (1719551956). The netstat crash was on the 5th of July (1720211173).
Hard to imagine the fan caused a crash. Especially if it wasn't powered from the device itself.
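The "time = ..." values in the panics are Unix epoch seconds, so they can be turned into dates to line the crashes up against when the fan was added or removed. A minimal sketch using only the Python standard library; the three values are copied from this thread and the to_utc helper is just an illustrative name:

```python
from datetime import datetime, timezone

# "time = ..." values quoted in this thread (Unix epoch seconds).
panic_times = {
    "darkstat": 1719551956,
    "netstat": 1720211173,
    "pf purge": 1721232192,
}


def to_utc(epoch_seconds):
    """Format epoch seconds as a UTC date and time string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


for process, epoch in panic_times.items():
    print(f"{process}: {to_utc(epoch)}")
```

The times are printed in UTC, so they need shifting to local time before comparing them with notes about when the hardware was changed.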