Help troubleshooting ALIX pfSense box
I have a Netgate m1n1wall box with an ALIX.2D13 that's been running very smoothly for me for about 3 years. Last night the router became unresponsive and was rebooting. I connected a serial console to it, and this is what I spotted:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x6dfd4108
fault code = supervisor read, page not present
instruction pointer = 0x20:0xddf8b980
stack pointer = 0x28:0xddf8b968
frame pointer = 0x28:0xc4ba4000
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi5: fast taskq)
[ thread pid 12 tid 100023 ]
Stopped at 0xddf8b980: xorb %bh,-0x56342208(%edx)
If anyone has some thoughts on what this means or where I can start troubleshooting, that would be excellent. Unfortunately the reboots got worse, so much so that I've had to temporarily connect a NetGear router in its place (yuck).
One thing I've noticed is that when I connect over the serial console, I am not able to do anything on it. I can see the console output in PuTTY as the box reboots, but hitting Enter or any other key does nothing. Is this indicative of a hardware issue?
I have the router offline with no network/internet connections, just a serial console. Even isolated like that the box is rebooting.
Either it has some CPU/cache/memory hardware problem that is causing bad page table references, or some critical bit of the kernel got corrupted on the CF card - "page fault while in kernel mode" should never "just start happening".
The easiest thing to do is to take out the CF card and re-write it with a known-good pfSense nanoBSD image. If that works then you are in luck. If the CF card gives errors while you are writing it, then you know to get another CF card.
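A card can also accept the write without complaint and still hand back bad data, so it may be worth reading the image back and comparing it with what you wrote. Below is a rough Python sketch of that check, not an official pfSense tool. It assumes the image has already been decompressed, and the function names and the paths in the usage note are made up for illustration. Reading a raw device usually needs root/administrator rights.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # read 1 MiB at a time


def sha256_prefix(path, length):
    """Hash the first `length` bytes of a file or raw device.

    Returns (hex digest, number of bytes actually read).
    """
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            block = f.read(min(CHUNK, remaining))
            if not block:
                break  # ran out of data before reaching `length`
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest(), length - remaining


def verify_written_image(image_path, device_path):
    """Return True if the device starts with an exact copy of the image."""
    size = os.path.getsize(image_path)
    image_hash, _ = sha256_prefix(image_path, size)
    device_hash, bytes_read = sha256_prefix(device_path, size)
    # The card is normally larger than the image, so only the first
    # `size` bytes are compared; a short read counts as a failure.
    return bytes_read == size and image_hash == device_hash
```

For example, verify_written_image("pfSense-nanobsd.img", "/dev/sdX") with your own image file and card device substituted (both names here are placeholders). If it returns False right after a write that reported no errors, I would suspect the card or the card reader before the ALIX board.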
If you still get "page fault while in kernel mode" on a clean pfSense boot, then there has to be some hardware problem on the board. Since the CPU and memory are all fixed to the ALIX board, there is not much benefit in knowing if it is CPU or memory or…