Fatal trap 12: page fault while in kernel mode
-
I've been having intermittent reboots on pfSense 2.7.2. I can see a few other posts from users having similar issues, but I can't tell from looking at my logs whether I have quite the same problem or a variation. I've had the issue on two separate pieces of hardware. The first had eMMC storage, so I couldn't retrieve any logs. I reloaded my config onto a mini PC so I could run some hardware tests on the first machine, and while the mini PC was up I managed to capture some logs from it.
The reboots are mostly random, although I generally seem to have one at about 6:30am, followed by one or two more throughout the day. I saw in some of the previous posts that a phantom LAGG interface was an issue. I did have LAGG configured previously on another device, but I removed it and I don't see it listed anywhere in my config now, nor could I find any mention of LAGG in the errors. I'm not running Tailscale (cited in some of these other posts), but I do run WireGuard. I also don't have any IPv6 configured (at least I don't think so).
I've got the rest of my crash dump saved and can upload it if I'm given a target.
Here's a snippet from the logs:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x458
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80cc0c9c
stack pointer = 0x28:0xfffffe00d93a9800
frame pointer = 0x28:0xfffffe00d93a9880
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (netlink_socket (PID)
rdi: fffff8000f6382e0 rsi: 0000000000000004 rdx: 0000000000000000
rcx: 000018575745e64c r8: fffffe00dd54b8c0 r9: fffffe00d93aa000
rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00d93a9880
r10: 0000000000001388 r11: 000000008055c23d r12: fffffe00d93a9820
r13: fffffe00dd54b3a0 r14: 0000000000000000 r15: fffff8000f6382e0
trap number = 12
panic: page fault
cpuid = 0
time = 1710642139
KDB: enter: panic
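For anyone trying to match a crash against a time-of-day pattern like the 6:30am one: the `time = 1710642139` field in the panic message is a Unix timestamp (seconds since the epoch), so it can be converted to a readable date. Below is a minimal Python sketch of that conversion. The helper name `panic_time_utc` is made up for illustration, and the result is in UTC, so it still needs shifting to the firewall's local timezone before comparing.

```python
import re
from datetime import datetime, timezone

# Tail of the panic message posted above.
PANIC_TAIL = (
    "trap number = 12 panic: page fault cpuid = 0 "
    "time = 1710642139 KDB: enter: panic"
)


def panic_time_utc(text):
    """Return the panic's 'time = N' field as an aware UTC datetime, or None.

    Hypothetical helper: it only looks for the first 'time = <digits>' field.
    """
    match = re.search(r"\btime = (\d+)", text)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


when = panic_time_utc(PANIC_TAIL)
print(when.isoformat())  # 2024-03-17T02:22:19+00:00
```

Running it over the panic header of each saved crash would show whether the crashes cluster at a particular time or really are random.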
In some other posts I saw requests to post the output of ifconfig, so I'll include that as well:
re0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        description: WAN
        options=8219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,LINKSTATE>
        capabilities=18399b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_UCAST,WOL_MCAST,WOL_MAGIC,LINKSTATE,NETMAP>
        ether 10:62:e5:06:ad:47
        inet 68.147.174.168 netmask 0xfffffc00 broadcast 255.255.255.255
        inet6 fe80::1262:e5ff:fe06:ad47%re0 prefixlen 64 scopeid 0x1
        media: Ethernet 1000baseT <full-duplex>
        status: active
        supported media:
                media autoselect mediaopt flowcontrol
                media autoselect
                media 1000baseT mediaopt full-duplex,flowcontrol,master
                media 1000baseT mediaopt full-duplex,flowcontrol
                media 1000baseT mediaopt full-duplex,master
                media 1000baseT mediaopt full-duplex
                media 100baseTX mediaopt full-duplex,flowcontrol
                media 100baseTX mediaopt full-duplex
                media 100baseTX
                media 10baseT/UTP mediaopt full-duplex,flowcontrol
                media 10baseT/UTP mediaopt full-duplex
                media 10baseT/UTP
                media none
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0 metric 0 mtu 1536
        options=0
        capabilities=0
        groups: enc
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.0.1 netmask 0x0
        inet 10.10.10.1 netmask 0xffffffff
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=100<PROMISC> metric 0 mtu 33152
        options=0
        capabilities=0
        groups: pflog
pfsync0: flags=0 metric 0 mtu 1500
        options=0
        capabilities=0
        maxupd: 128 defer: off version: 1400
        syncok: 1
        groups: pfsync
ue0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        description: Infrastructure
        options=68009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=68009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:4c:68:05:5b
        inet 192.168.10.1 netmask 0xffffff00 broadcast 192.168.10.255
        inet6 fe80::2e0:4cff:fe68:55b%ue0 prefixlen 64 scopeid 0x6
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        supported media:
                media autoselect
                media 1000baseT mediaopt full-duplex,master
                media 1000baseT mediaopt full-duplex
                media 100baseTX mediaopt full-duplex
                media 100baseTX
                media 10baseT/UTP mediaopt full-duplex
                media 10baseT/UTP
                media none
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ue0.30: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        description: GuestVLAN30
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:4c:68:05:5b
        inet 192.168.30.1 netmask 0xffffff00 broadcast 192.168.30.255
        inet6 fe80::2e0:4cff:fe68:55b%ue0.30 prefixlen 64 scopeid 0x9
        groups: vlan
        vlan: 30 vlanproto: 802.1q vlanpcp: 0 parent interface: ue0
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        supported media:
                media autoselect
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ue0.15: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        description: TrustVLAN15
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:4c:68:05:5b
        inet 192.168.15.1 netmask 0xffffff00 broadcast 192.168.15.255
        inet6 fe80::2e0:4cff:fe68:55b%ue0.15 prefixlen 64 scopeid 0xa
        groups: vlan
        vlan: 15 vlanproto: 802.1q vlanpcp: 0 parent interface: ue0
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        supported media:
                media autoselect
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ue0.40: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        description: LABVLAN40
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:4c:68:05:5b
        inet 192.168.40.1 netmask 0xffffff00 broadcast 192.168.40.255
        inet6 fe80::2e0:4cff:fe68:55b%ue0.40 prefixlen 64 scopeid 0xb
        groups: vlan
        vlan: 40 vlanproto: 802.1q vlanpcp: 0 parent interface: ue0
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        supported media:
                media autoselect
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ovpns1: flags=1008043<UP,BROADCAST,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=80000<LINKSTATE>
        capabilities=80000<LINKSTATE>
        inet 192.168.90.1 netmask 0xffffff00 broadcast 192.168.90.255
        inet6 fe80::1262:e5ff:fe06:ad47%ovpns1 prefixlen 64 scopeid 0xc
        groups: tun openvpn
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 68691
tun_wg0: flags=10080c1<UP,RUNNING,NOARP,MULTICAST,LOWER_UP> metric 0 mtu 1420
        options=80000<LINKSTATE>
        capabilities=80000<LINKSTATE>
        inet 192.168.25.1 netmask 0xffffff00
        groups: wg WireGuard
        nd6 options=101<PERFORMNUD,NO_DAD>
ue1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=68009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=68009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:4c:68:10:c6
        media: Ethernet autoselect (none)
        status: no carrier
        supported media:
                media autoselect
                media 1000baseT mediaopt full-duplex,master
                media 1000baseT mediaopt full-duplex
                media 100baseTX mediaopt full-duplex
                media 100baseTX
                media 10baseT/UTP mediaopt full-duplex
                media 10baseT/UTP
                media none
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Here's a little bit from ddb; again, I can post the rest if required:
Tracing command kernel pid 0 tid 100084 td 0xfffffe00172ca740
sched_switch() at sched_switch+0x88a/frame 0xfffffe00a3b30e20
mi_switch() at mi_switch+0xbb/frame 0xfffffe00a3b30e40
_sleep() at _sleep+0x1f0/frame 0xfffffe00a3b30ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xb1/frame 0xfffffe00a3b30ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00a3b30f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00a3b30f30
--- trap 0xe8bb07c0, rip = 0x3c79153347a7e28e, rsp = 0x9e78b732e5a6408f, rbp = 0x76f50d2bfa9615ed ---

Tracing command kernel pid 0 tid 100085 td 0xfffffe00172ca020
sched_switch() at sched_switch+0x88a/frame 0xfffffe00a3b2be20
mi_switch() at mi_switch+0xbb/frame 0xfffffe00a3b2be40
_sleep() at _sleep+0x1f0/frame 0xfffffe00a3b2bec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xb1/frame 0xfffffe00a3b2bef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00a3b2bf30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00a3b2bf30
--- trap 0x18fd18fd, rip = 0x47deb0635f1881, rsp = 0x6066be91037e78a0, rbp = 0x3cbe3cbe3cbe3cbe ---

Tracing command kernel pid 0 tid 100086 td 0xfffffe00b4352c80
sched_switch() at sched_switch+0x88a/frame 0xfffffe00a3bf2e20
mi_switch() at mi_switch+0xbb/frame 0xfffffe00a3bf2e40
_sleep() at _sleep+0x1f0/frame 0xfffffe00a3bf2ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xb1/frame 0xfffffe00a3bf2ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00a3bf2f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00a3bf2f30
--- trap 0x9aa1
-
We need to see the backtrace from the crash.
However, I'd bet it's in the USB Ethernet driver. USB Ethernet is generally seen as so bad it's not worth using. Second worst are Realtek NICs, which you also have in that box.
Steve
-
@stephenw10 Happy to include the backtrace. What's the best way to attach it?
I'm aware that USB and Realtek are bad ideas. This is coming off a backup machine I put in place while I was running memtest on the device that I first started experiencing the crashes on. The original device has eMMC storage, so I haven't been able to pull the crash logs off it. Weirdly, this device has been more stable and experienced fewer crashes. The original device I was using was an R86S U1 with built-in Intel NICs.
-
You can upload the crash report here: https://nc.netgate.com/nextcloud/s/9SkY7TYJRmkHsGQ
-
@stephenw10 Thank you, I've submitted them there. Is there a way to get comparable logs off my other device? I did set up remote syslog but it didn't capture the kernel panic part.
-
OK, yes, it looks like one of your USB Ethernet devices disconnected itself for some reason:
ugen0.4: <Realtek USB 10/100/1000 LAN> at usbus0 (disconnected)
ure1: at uhub0, port 23, addr 3 (disconnected)
rgephy2: detached
miibus2: detached
ure1: detached

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x458
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80cc0c9c
stack pointer = 0x28:0xfffffe00d93a9800
frame pointer = 0x28:0xfffffe00d93a9880
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (netlink_socket (PID)
rdi: fffff8000f6382e0 rsi: 0000000000000004 rdx: 0000000000000000
rcx: 000018575745e64c r8: fffffe00dd54b8c0 r9: fffffe00d93aa000
rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00d93a9880
r10: 0000000000001388 r11: 000000008055c23d r12: fffffe00d93a9820
r13: fffffe00dd54b3a0 r14: 0000000000000000 r15: fffff8000f6382e0
trap number = 12
panic: page fault
cpuid = 0
time = 1710642139
KDB: enter: panic
And the backtrace does indeed show the issue is in the ure driver:
db:0:kdb.enter.default> bt
Tracing pid 0 tid 102948 td 0xfffffe00dd54b3a0
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00d93a94e0
vpanic() at vpanic+0x163/frame 0xfffffe00d93a9610
panic() at panic+0x43/frame 0xfffffe00d93a9670
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00d93a96d0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00d93a9730
calltrap() at calltrap+0x8/frame 0xfffffe00d93a9730
--- trap 0xc, rip = 0xffffffff80cc0c9c, rsp = 0xfffffe00d93a9800, rbp = 0xfffffe00d93a9880 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0xbc/frame 0xfffffe00d93a9880
usbd_do_request_flags() at usbd_do_request_flags+0x75b/frame 0xfffffe00d93a9900
usbd_do_request_proc() at usbd_do_request_proc+0x5e/frame 0xfffffe00d93a9960
ure_miibus_readreg() at ure_miibus_readreg+0x185/frame 0xfffffe00d93a99d0
rgephy_status() at rgephy_status+0x7b/frame 0xfffffe00d93a9a10
rgephy_service() at rgephy_service+0x329/frame 0xfffffe00d93a9a60
mii_pollstat() at mii_pollstat+0x57/frame 0xfffffe00d93a9a90
ure_ifmedia_sts() at ure_ifmedia_sts+0x190/frame 0xfffffe00d93a9ae0
ifmedia_ioctl() at ifmedia_ioctl+0x163/frame 0xfffffe00d93a9b10
dump_iface() at dump_iface+0x145/frame 0xfffffe00d93a9bc0
rtnl_handle_getlink() at rtnl_handle_getlink+0x2a3/frame 0xfffffe00d93a9ca0
rtnl_handle_message() at rtnl_handle_message+0x195/frame 0xfffffe00d93a9d00
nl_taskqueue_handler() at nl_taskqueue_handler+0x79b/frame 0xfffffe00d93a9e40
taskqueue_run_locked() at taskqueue_run_locked+0x182/frame 0xfffffe00d93a9ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xc2/frame 0xfffffe00d93a9ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00d93a9f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00d93a9f30
--- trap 0xc, rip = 0x828ed73ea, rsp = 0x85dd21ca8, rbp = 0x85dd21cc0 ---
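That is expected if the device did disconnect unexpectedly: the trace shows a netlink link query (rtnl_handle_getlink) polling media status through ure_ifmedia_sts and ending in a USB request, right after ure1 had detached.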
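The way to read a backtrace like this is to skip the panic machinery, find the `--- trap` marker, and look at the frames that follow it: the first one is where the fault hit, and the first driver-specific one below it usually points at the responsible code. Here is a rough Python sketch of that reading, assuming the `name() at name+0xNN/frame 0xNN` line format shown above. The function names and the `ure_` / `rgephy_` prefix list are illustrative choices for this thread, not a general rule.

```python
import re

# A few frames around the trap marker, copied from the backtrace above.
BACKTRACE = """\
calltrap() at calltrap+0x8/frame 0xfffffe00d93a9730
--- trap 0xc, rip = 0xffffffff80cc0c9c, rsp = 0xfffffe00d93a9800, rbp = 0xfffffe00d93a9880 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0xbc/frame 0xfffffe00d93a9880
usbd_do_request_flags() at usbd_do_request_flags+0x75b/frame 0xfffffe00d93a9900
usbd_do_request_proc() at usbd_do_request_proc+0x5e/frame 0xfffffe00d93a9960
ure_miibus_readreg() at ure_miibus_readreg+0x185/frame 0xfffffe00d93a99d0
rgephy_status() at rgephy_status+0x7b/frame 0xfffffe00d93a9a10
"""

# Matches "func() at func+0x1f/frame 0xffff..." and captures the function name.
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def frames_after_trap(text):
    """Return function names between the first '--- trap' line and the next."""
    names = []
    seen_trap = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("--- trap"):
            if seen_trap:
                break  # second marker: end of the kernel-side frames
            seen_trap = True
            continue
        match = FRAME_RE.match(line)
        if seen_trap and match:
            names.append(match.group(1))
    return names


def first_frame_with_prefix(names, prefixes):
    """Return the first frame whose name starts with any of the given prefixes."""
    for name in names:
        if name.startswith(prefixes):
            return name
    return None


frames = frames_after_trap(BACKTRACE)
print(frames[0])                                            # __mtx_lock_sleep
print(first_frame_with_prefix(frames, ("ure_", "rgephy_")))  # ure_miibus_readreg
```

The faulting function itself (`__mtx_lock_sleep`) is generic locking code, which is why the driver frame further down the list is the more useful clue.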
Avoid USB Ethernet if at all possible.
If a crash report exists on the other device it would be in /var/crash.
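To check that directory, `ls -l /var/crash` from a shell is enough. If you would rather script it (for example against a copy of the directory pulled onto another machine), a minimal Python sketch could look like the following. The function name is hypothetical, and it assumes Python is available wherever you run it, which may not be the firewall itself.

```python
from pathlib import Path


def list_crash_files(crash_dir="/var/crash"):
    """Return (name, size_in_bytes) for each regular file in crash_dir.

    Returns an empty list if the directory does not exist, so a missing
    or never-created crash directory is not treated as an error.
    """
    root = Path(crash_dir)
    if not root.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


# Example use (commented out so importing this file has no side effects):
# for name, size in list_crash_files():
#     print(f"{size:>12}  {name}")
```

An empty result means either no crash was captured or the dump could not be written, which fits the trouble with pulling anything off the eMMC device.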