Crash Report
-
You can upload the crash report file here: https://nc.netgate.com/nextcloud/s/m8t7sSzGHLRW5ji
-
Sorry I didn't realise you had uploaded this.
Backtrace:
db:0:kdb.enter.default> bt
Tracing pid 65762 tid 100256 td 0xfffffe00fbb20000
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00fa39c5f0
vpanic() at vpanic+0x163/frame 0xfffffe00fa39c720
panic() at panic+0x43/frame 0xfffffe00fa39c780
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00fa39c7e0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00fa39c840
calltrap() at calltrap+0x8/frame 0xfffffe00fa39c840
--- trap 0xc, rip = 0xffffffff80e5fc25, rsp = 0xfffffe00fa39c910, rbp = 0xfffffe00fa39caf0 ---
sysctl_iflist() at sysctl_iflist+0x2b5/frame 0xfffffe00fa39caf0
sysctl_rtsock() at sysctl_rtsock+0x2fb/frame 0xfffffe00fa39cbd0
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x90/frame 0xfffffe00fa39cc20
sysctl_root() at sysctl_root+0x216/frame 0xfffffe00fa39cca0
userland_sysctl() at userland_sysctl+0x176/frame 0xfffffe00fa39cd50
sys___sysctl() at sys___sysctl+0x5c/frame 0xfffffe00fa39ce00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe00fa39cf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00fa39cf30
--- syscall (202, FreeBSD ELF64, __sysctl), rip = 0x82247826a, rsp = 0x820656178, rbp = 0x8206561b0 ---
Panic:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0xffffea0055c55000
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80e5fc25
stack pointer = 0x28:0xfffffe00fa39c910
frame pointer = 0x28:0xfffffe00fa39caf0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 65762 (darkstat)
rdi: fffffe00fa39ccc0 rsi: 0000000000000000 rdx: 0000000000000ae0
rcx: 00000000000009f8 r8: 00000000667dedf3 r9: 00000000667dedf3
rax: 0000000000000000 rbx: fffffe00fa39cb68 rbp: fffffe00fa39caf0
r10: 000000000007cf53 r11: fffffe00fbb20520 r12: ffffea0055c55000
r13: 00000000000000e8 r14: 0000000000000000 r15: fffff8001584dd00
trap number = 12
panic: page fault
cpuid = 1
time = 1719551956
KDB: enter: panic
So it looks like it's in darkstat while trying to use sysctl to see available interfaces.
Do you see the same panic/backtrace on several crash reports?
Try disabling darkstat.
-
118>Bootup complete
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x100000000012
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fa3cfb
stack pointer = 0x28:0xfffffe00c5975e60
frame pointer = 0x28:0xfffffe00c5975e80
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000100000000000 rsi: ffffffff84010000 rdx: 0000000000000180
rcx: fffffe00db3d83a0 r8: 0000000000000004 r9: ffffffff84013000
rax: 0000000000000000 rbx: 0000000000064071 rbp: fffffe00c5975e80
r10: 000000000000000f r11: 0000000081b3bc6d r12: fffffe00deee01c8
r13: 0000100000000000 r14: fffffe00deee01a8 r15: 00000000000025d7
trap number = 12
panic: page fault
cpuid = 3
time = 1721232192
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.0-CURRENT amd64 1400094 #1 RELENG_2_7_2-n255948-8d2b56da39c: Wed Dec 6 20:45:47 UTC 2023
root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_2-main/obj/amd64/StdASW5b/var/jenkins/workspace/pfSense-CE-snapshots-2_7_2-main/sources/FreeBSD-src-RELENG_2_7_2/amd64.amd64/sys/pfSense
-
@hassling under no circumstances provide more information :).
But more seriously: is it the same system you opened the last thread about?
If yes, did you do what @stephenw10 recommended and disable darkstat?
-
Yup need to see the backtrace. However that appears to be different.
If you have a number of crash reports and they are all different it's probably a hardware issue.
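
One way to compare several crash reports is to pull out the function that faulted in each backtrace and count how often each one appears. The following is a minimal sketch, assuming each backtrace is available as plain text in the ddb `bt` format shown in this thread. The helper names `faulting_function` and `summarize` are made up for illustration and are not part of pfSense or FreeBSD.

```python
import re
from collections import Counter

# In a ddb backtrace, the first frame printed after the
# "--- trap 0x..., rip = ... ---" marker is the function that was
# executing when the fault happened. re.S lets the pattern work whether
# the frames are on separate lines or pasted as one long line.
FAULT_FRAME = re.compile(
    r"---\s*trap\s+0x[0-9a-f]+,.*?---\s*(\w+)\(\)\s+at\s+\w+\+0x[0-9a-f]+",
    re.S,
)


def faulting_function(backtrace):
    """Return the name of the faulting function, or None if no trap marker is found."""
    match = FAULT_FRAME.search(backtrace)
    return match.group(1) if match else None


def summarize(backtraces):
    """Count how many backtraces faulted in each function."""
    return Counter(faulting_function(bt) or "unknown" for bt in backtraces)


if __name__ == "__main__":
    sample = (
        "calltrap() at calltrap+0x8/frame 0xfffffe00fa39c840 "
        "--- trap 0xc, rip = 0xffffffff80e5fc25, rsp = 0xfffffe00fa39c910, "
        "rbp = 0xfffffe00fa39caf0 --- "
        "sysctl_iflist() at sysctl_iflist+0x2b5/frame 0xfffffe00fa39caf0"
    )
    print(summarize([sample]))
```

If the counts are spread across unrelated functions, that fits the hardware explanation above; if they cluster in one function, a software bug becomes more plausible.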
-
Body:
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x100000000012
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fa3cfb
stack pointer = 0x28:0xfffffe00c5975e60
frame pointer = 0x28:0xfffffe00c5975e80
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000100000000000 rsi: ffffffff84010000 rdx: 0000000000000180
rcx: fffffe00db3d83a0 r8: 0000000000000004 r9: ffffffff84013000
rax: 0000000000000000 rbx: 0000000000064071 rbp: fffffe00c5975e80
r10: 000000000000000f r11: 0000000081b3bc6d r12: fffffe00deee01c8
r13: 0000100000000000 r14: fffffe00deee01a8 r15: 00000000000025d7
trap number = 12
panic: page fault
-
Still need to see a backtrace. And posting new threads just confuses everyone.
How many crash reports do you have? We need to compare multiple crashes.
-
@stephenw10
Sorry for the confusion. Yes, it is the same system, and yes, I disabled darkstat. The crashes still occur and continue.
-
Ok, can you upload any further crash reports to the Nextcloud link?
Or just give us the backtraces here to compare?
-
@patient0 I did. Still happening. I bought this Topton pfSense router from AliExpress two months ago.
-
@stephenw10 I'll post backtraces later as I am not home. Thank you
-
@stephenw10 uploaded.
-
This line
current process = 7 (pf purge)
indicates that the error occurs at the pfSense boot stage when initializing the PF kernel module.
I would recommend reinstalling the system first. It is possible that there is some kind of problem at the hardware level.
-
@Konstanti The last two different codes happened after I put an external USB fan (plugged into a wall outlet, not the device) under the router, as per the seller's recommendation, to dissipate heat. The recorded temperature on the 4 cores reaches 62 Celsius, though pfSense occasionally shows 70 Celsius.
Correlation vs causation? I am just trying to exclude.
-
Did the fan bring down the CPU temps? On an N100 that's normally passively cooled, any fan should have a dramatic effect. If it doesn't, I would check that the CPU is correctly attached to the heatsink/case, has thermal paste, etc.
There are 4 distinct crashes there. One each in darkstat and netstat and all the others in pfctl and pfpurge which are very similar:
db:0:kdb.enter.default> show pcpu
cpuid = 3
dynamic pcpu = 0xfffffe009d4fff80
curthread = 0xfffffe0020552740: pid 7 tid 100113 critnest 1 "pf purge"
curpcb = 0xfffffe0020552c60
fpcurthread = none
idlethread = 0xfffffe002047fe40: tid 100006 "idle: cpu3"
self = 0xffffffff84013000
curpmap = 0xffffffff83020ab0
tssp = 0xffffffff84013384
rsp0 = 0xfffffe00c5809000
kcr3 = 0xffffffffffffffff
ucr3 = 0xffffffffffffffff
scr3 = 0x0
gs32p = 0xffffffff84013404
ldt = 0xffffffff84013444
tss = 0xffffffff84013434
curvnet = 0xfffff80001240440
db:0:kdb.enter.default> bt
Tracing pid 7 tid 100113 td 0xfffffe0020552740
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00c5808b40
vpanic() at vpanic+0x163/frame 0xfffffe00c5808c70
panic() at panic+0x43/frame 0xfffffe00c5808cd0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00c5808d30
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00c5808d90
calltrap() at calltrap+0x8/frame 0xfffffe00c5808d90
--- trap 0xc, rip = 0xffffffff80fa3cfb, rsp = 0xfffffe00c5808e60, rbp = 0xfffffe00c5808e80 ---
pf_state_expires() at pf_state_expires+0xb/frame 0xfffffe00c5808e80
pf_purge_expired_states() at pf_purge_expired_states+0xd5/frame 0xfffffe00c5808ec0
pf_purge_thread() at pf_purge_thread+0x13b/frame 0xfffffe00c5808ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00c5808f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00c5808f30
--- trap 0xe8e0e8e0, rip = 0x3cd03cdd431d431, rsp = 0x186b186b6af46af4, rbp = 0x5fc25fc2d476d476 ---
db:0:kdb.enter.default> show pcpu
cpuid = 1
dynamic pcpu = 0xfffffe009d4e1f80
curthread = 0xfffffe00fdaee3a0: pid 36100 tid 100648 critnest 1 "pfctl"
curpcb = 0xfffffe00fdaee8c0
fpcurthread = 0xfffffe00fdaee3a0: pid 36100 "pfctl"
idlethread = 0xfffffe0020480c80: tid 100004 "idle: cpu1"
self = 0xffffffff84011000
curpmap = 0xfffff800095f9ad0
tssp = 0xffffffff84011384
rsp0 = 0xfffffe00fcdb3000
kcr3 = 0xffffffffffffffff
ucr3 = 0xffffffffffffffff
scr3 = 0x0
gs32p = 0xffffffff84011404
ldt = 0xffffffff84011444
tss = 0xffffffff84011434
curvnet = 0xfffff80001240440
db:0:kdb.enter.default> bt
Tracing pid 36100 tid 100648 td 0xfffffe00fdaee3a0
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00fcdb23e0
vpanic() at vpanic+0x163/frame 0xfffffe00fcdb2510
panic() at panic+0x43/frame 0xfffffe00fcdb2570
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00fcdb25d0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00fcdb2630
calltrap() at calltrap+0x8/frame 0xfffffe00fcdb2630
--- trap 0xc, rip = 0xffffffff80fc1b62, rsp = 0xfffffe00fcdb2700, rbp = 0xfffffe00fcdb2be0 ---
pfioctl() at pfioctl+0x2ca2/frame 0xfffffe00fcdb2be0
devfs_ioctl() at devfs_ioctl+0xcc/frame 0xfffffe00fcdb2c30
vn_ioctl() at vn_ioctl+0xcf/frame 0xfffffe00fcdb2ca0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00fcdb2cc0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00fcdb2d30
sys_ioctl() at sys_ioctl+0x123/frame 0xfffffe00fcdb2e00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe00fcdb2f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00fcdb2f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x929772e79ca, rsp = 0x929740778a8, rbp = 0x92974077920 ---
All of those crashes are after bootup completes.
Nothing else significant is shown. How long does it take to crash after booting?
-
Similar to this: https://redmine.pfsense.org/issues/13417
-
@stephenw10 Random, between 1 and 2 days. However, when I removed the fan the system was stable for 4 days. I put the fan back, powered through a USB adapter on a wall outlet, and within two days it crashed twice and froze once.
Now I have removed the fan completely and I am watching. Next I'll check the thermal paste and update.
Is there any way to clear leftover files from darkstat or other packages (I have already removed them)?
-
@stephenw10 Yes, the fan brought the temperature down to 40 Celsius.
Is there anything I can do to clear leftover darkstat/netstat cache files?
-
darkstat was the first crash, about 20 days ago (1719551956). The netstat crash was on the 5th of July (1720211173).
Hard to imagine the fan caused a crash. Especially if it wasn't powered from the device itself.
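
The numbers in parentheses above are the `time =` values from the panic output, which are seconds since the Unix epoch. As a small sketch of how to turn them into readable dates (the helper name `panic_time_utc` is made up for illustration):

```python
from datetime import datetime, timezone


def panic_time_utc(epoch_seconds):
    """Convert a panic 'time =' value (Unix epoch seconds) to a UTC timestamp string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


if __name__ == "__main__":
    # Values quoted in this thread.
    crashes = [
        ("darkstat", 1719551956),
        ("netstat", 1720211173),
        ("pf purge", 1721232192),
    ]
    for process, stamp in crashes:
        print(f"{process:10s} {panic_time_utc(stamp)}")
```

The times are printed in UTC, so the local date on the firewall may differ by a few hours.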
-
@stephenw10 A little update: since the last crash on July 17th, there have been no more crashes after removing the external (USB) fan.
I am now inclined to believe that trying to cool the processor is the culprit for the crashes. The processor seems to be very sensitive to temperature changes.
Hope this will help someone else. I'll post here if anything changes.
Thank you for your insight.