LAN crash with WAN still online
-
I have the latest pfSense 22.05 installed on a mini PC. It had been running great until recently, when it started dropping off the LAN, although I can still get to it remotely using Tailscale. A reboot fixes the issue for a short time, but then it crashes again. I have the crash report but am having a hard time figuring out what caused the crash. Any help would be greatly appreciated. Thank you.
info.0 textdump.tar.0
-
For ease of reading:
db:0:kdb.enter.default> bt
Tracing pid 9445 tid 100661 td 0xfffff80083325000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe0034804500
vpanic() at vpanic+0x194/frame 0xfffffe0034804550
panic() at panic+0x43/frame 0xfffffe00348045b0
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe0034804610
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0034804670
calltrap() at calltrap+0x8/frame 0xfffffe0034804670
--- trap 0xc, rip = 0xffffffff811f4004, rsp = 0xfffffe0034804740, rbp = 0xfffffe0034804770 ---
vm_object_shadow() at vm_object_shadow+0x214/frame 0xfffffe0034804770
vm_map_lookup() at vm_map_lookup+0xaaa/frame 0xfffffe0034804860
vm_fault() at vm_fault+0x85/frame 0xfffffe00348049c0
vm_fault_trap() at vm_fault_trap+0x60/frame 0xfffffe0034804a00
trap_pfault() at trap_pfault+0x1e0/frame 0xfffffe0034804a60
trap() at trap+0x425/frame 0xfffffe0034804b70
calltrap() at calltrap+0x8/frame 0xfffffe0034804b70
--- trap 0xc, rip = 0x8003c009a, rsp = 0x7fffffffe2a0, rbp = 0x7fffffffe300 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 13
fault virtual address = 0x4f0040
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff811f4004
stack pointer = 0x0:0xfffffe0034804740
frame pointer = 0x0:0xfffffe0034804770
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 9445 (sh)
trap number = 12
panic: page fault
cpuid = 3
time = 1658774499
KDB: enter: panic
There are some other things in the message buffer though:
ugen2.2: <vendor 0x8087 product 0x07dc> at usbus2 (disconnected)
ugen2.2: <vendor 0x8087 product 0x07dc> at usbus2
That's a Bluetooth device. pfSense can't do anything with it, but it's odd to see it disconnect repeatedly like that. I would disable it if you can.
<6>arp: 10.13.87.45 moved from 98:ed:5c:08:82:b4 to cc:88:26:dc:c5:2a on re1
<6>arp: 10.13.87.92 moved from cc:88:26:dc:c5:2a to 98:ed:5c:08:82:b4 on re1
<6>arp: 10.13.87.45 moved from 98:ed:5c:08:82:b4 to cc:88:26:dc:c5:2a on re1
<6>arp: 10.13.87.92 moved from cc:88:26:dc:c5:2a to 98:ed:5c:08:82:b4 on re1
ARP movement like that is not necessarily a problem if you know what those devices are, like a LAG on something. One of those OUIs appears to be Tesla Motors, which seems unlikely to be legit. You may have a conflict there somewhere.
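To see at a glance which addresses are involved, the "moved from" messages can be tallied per IP and per MAC. The following is only a rough Python sketch: the function names `summarize_arp_moves` and `ips_per_mac` are made up for illustration, and the regular expression assumes the message format shown in the excerpt above.

```python
import re
from collections import defaultdict

# Assumed message shape, taken from the excerpt above:
#   arp: 10.13.87.45 moved from 98:ed:5c:08:82:b4 to cc:88:26:dc:c5:2a on re1
MAC = r"[0-9a-f]{2}(?::[0-9a-f]{2}){5}"
ARP_MOVE = re.compile(
    rf"arp: (?P<ip>\d+\.\d+\.\d+\.\d+) moved from "
    rf"(?P<old>{MAC}) to (?P<new>{MAC}) on (?P<ifname>\S+)"
)

# The four messages quoted in this thread.
SAMPLE = """\
<6>arp: 10.13.87.45 moved from 98:ed:5c:08:82:b4 to cc:88:26:dc:c5:2a on re1
<6>arp: 10.13.87.92 moved from cc:88:26:dc:c5:2a to 98:ed:5c:08:82:b4 on re1
<6>arp: 10.13.87.45 moved from 98:ed:5c:08:82:b4 to cc:88:26:dc:c5:2a on re1
<6>arp: 10.13.87.92 moved from cc:88:26:dc:c5:2a to 98:ed:5c:08:82:b4 on re1
"""


def summarize_arp_moves(text):
    """Map each IP to the set of MACs that have claimed it."""
    macs_by_ip = defaultdict(set)
    for m in ARP_MOVE.finditer(text):
        macs_by_ip[m["ip"]].update((m["old"], m["new"]))
    return dict(macs_by_ip)


def ips_per_mac(macs_by_ip):
    """Invert the mapping: which IPs has each MAC claimed?"""
    result = defaultdict(set)
    for ip, macs in macs_by_ip.items():
        for mac in macs:
            result[mac].add(ip)
    return dict(result)


moves = summarize_arp_moves(SAMPLE)
for mac, ips in sorted(ips_per_mac(moves).items()):
    print(mac, "has claimed", ", ".join(sorted(ips)))
```

On the four lines quoted here, both MAC addresses end up associated with both IPs, which fits either a single device attached twice or two devices contending for the same addresses.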
That shouldn't cause a panic though. And if it really does panic, you would not be able to access it at all. I suspect that when the LAN stops responding it's the Realtek driver/NIC locking up, which is a known issue.
Do you ever see any watchdog timeout errors in the system log after rebooting?
Steve
-
Thank you for the detailed breakdown.
Not sure what the Bluetooth device is, as there is nothing plugged in, unless it is something onboard the PC.
The Tesla Motors device is my Tesla Gateway for my solar system. It is currently connected over ethernet, but I could put it back on wifi.
Should I disable any Watchdog functions?
I am wondering if the mini PC device I am using is the issue. The strange thing is that it worked great for months with no issues and then these crashes came out of nowhere.
-
It's a Bluetooth controller so it probably is onboard or part of a wifi card maybe. If it can be disabled in the BIOS I would do that.
The Solar controller appears to be connected in two ways. If it's connected via Ethernet make sure it isn't also trying to connect via wifi.
You cannot disable the watchdog in the driver and you wouldn't want to anyway. Seeing errors from it only indicates the hardware/driver has stopped responding and that's useful information.
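If it helps, here is a minimal Python sketch for picking watchdog messages out of a saved copy of the system log. It assumes the Realtek re(4) driver logs lines containing "reN: watchdog timeout"; the helper name `find_watchdog_timeouts` and the sample log lines are hypothetical and only there to show the idea.

```python
import re

# Assumed message shape from the re(4) driver, e.g. "re1: watchdog timeout".
WATCHDOG = re.compile(r"\b(re\d+): watchdog timeout")


def find_watchdog_timeouts(lines):
    """Return (interface, full line) for every line reporting a watchdog timeout."""
    hits = []
    for line in lines:
        m = WATCHDOG.search(line)
        if m:
            hits.append((m.group(1), line.rstrip("\n")))
    return hits


# Hypothetical log lines, NOT taken from the crash report in this thread.
sample_log = [
    "Jul 25 18:30:02 pfSense kernel: re1: link state changed to UP",
    "Jul 25 18:40:10 pfSense kernel: re1: watchdog timeout",
    "Jul 25 18:40:11 pfSense kernel: re1: link state changed to DOWN",
]

for ifname, line in find_watchdog_timeouts(sample_log):
    print(ifname, "->", line)
```

To run it against a real log, pass the lines of the file instead of `sample_log`, for example `find_watchdog_timeouts(open(path))`, where `path` is wherever the system log was saved.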
If you do see those logged you can try using the alternative driver.
It could be a hardware failure, yes. You could try swapping the interface assignments and see if the error follows the interface or stays on the NIC. The WAN would then fail next time with LAN still accessible.
Steve
-
@stephenw10 Thank you