pfSense v.2.6 crashes and reboot
-
Dear pfSense experts/developers,
During the last month, pfSense suddenly crashed and rebooted twice.
Nothing changed during this period.
I read that, in this case, I should share the crash report here.
Could you please help me understand the cause of this issue?
You can find the dump files attached.
Many thanks in advance,
Mauro
-
Backtrace:
db:0:kdb.enter.default> bt
Tracing pid 84479 tid 100662 td 0xfffff80309ee7000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe0094f3b860
vpanic() at vpanic+0x197/frame 0xfffffe0094f3b8b0
panic() at panic+0x43/frame 0xfffffe0094f3b910
pmap_remove_pages() at pmap_remove_pages+0xa1d/frame 0xfffffe0094f3ba10
vmspace_exit() at vmspace_exit+0x9e/frame 0xfffffe0094f3ba50
exit1() at exit1+0x55b/frame 0xfffffe0094f3bab0
sys_sys_exit() at sys_sys_exit+0xd/frame 0xfffffe0094f3bac0
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe0094f3bbf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0094f3bbf0
--- syscall (1, FreeBSD ELF64, sys_sys_exit), rip = 0x8004095fa, rsp = 0x7fffffffebe8, rbp = 0x7fffffffec00 ---
Panics:
panic: bad pte va 80c200000 pte 80000002be20ac28
cpuid = 2
time = 1690324290
KDB: enter: panic

panic: bad pte va 800f5c000 pte 0
cpuid = 4
time = 1692058926
KDB: enter: panic

Hmm, I would say that's likely a hardware error...except it's in VMware. Has the hypervisor been updated in that time? What version of ESXi is it?
The only error shown other than the panic is this:
(da0:mpt0:0:0:0): UNMAP failed, disabling BIO_DELETE
(da0:mpt0:0:0:0): UNMAP. CDB: 42 00 00 00 00 00 00 00 08 00
(da0:mpt0:0:0:0): CAM status: SCSI Status Error
(da0:mpt0:0:0:0): SCSI status: Check Condition
(da0:mpt0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
(da0:mpt0:0:0:0): Command byte 7 is invalid
(da0:mpt0:0:0:0): Error 22, Unretryable error
Which looks like a drive error. Except again it's in VMware...
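For anyone who wants to read the UNMAP error above in more detail, here is a minimal Python sketch that decodes the 10-byte CDB from the log. It is not part of the crash report. The field layout is my reading of the SCSI UNMAP command (opcode 0x42, parameter list length in bytes 7-8), and the name decode_unmap_cdb is purely illustrative.

```python
# CDB bytes copied from the log line "UNMAP. CDB: 42 00 00 00 00 00 00 00 08 00".
CDB_HEX = "42 00 00 00 00 00 00 00 08 00"


def decode_unmap_cdb(cdb: bytes) -> dict:
    """Split a 10-byte UNMAP CDB into its fields (layout assumed from SBC)."""
    if len(cdb) != 10 or cdb[0] != 0x42:
        raise ValueError("not a 10-byte UNMAP CDB")
    return {
        "opcode": cdb[0],                                    # 0x42 = UNMAP
        "anchor": bool(cdb[1] & 0x01),                       # ANCHOR bit
        "group_number": cdb[6] & 0x1F,                       # low 5 bits of byte 6
        "param_list_length": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8, big-endian
        "control": cdb[9],
    }


if __name__ == "__main__":
    fields = decode_unmap_cdb(bytes.fromhex(CDB_HEX))
    for name, value in fields.items():
        print(f"{name}: {value}")
```

For this CDB the parameter list length comes out as 8, which as far as I can tell is just the size of the UNMAP parameter header, and every other field is zero. The device rejected the command, and the log shows FreeBSD reacting by disabling BIO_DELETE on da0.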
Steve
-
@stephenw10 A bad bit in hardware, if it is in the right place, could also affect the vmdk file. I would suspect that bit would be unreadable in the VMFS and get passed on. It could still be a drive or controller error just getting passed up the stack.
-
Hi Stephen,
thank you for your support.
@stephenw10 said in pfSense v.2.6 crashes and reboot:
Hmm, I would say that's likely a hardware error...except it's in VMware. Has the hypervisor been updated in that time? What version of ESXi is it?
No, the hypervisor hasn't been updated during that period.
The version of ESXi is 6.7 U3.
I'll check the status of the drives and controller and let you know.
Thanks,
Mauro
-
@Stewart thank you for the additional info.
I just checked the status of the drives and controller from the server management GUI, and everything seems to be OK.
No lines have recently been added to the server's logs page.
It is very strange, and I don't know how to proceed.
Mauro
-
Unfortunately none of that crash data is very revealing. Are those the only crashes it's seen?
-
@stephenw10 it happened again a few minutes ago.
No CPU overload, no hard issues on controller and drives...
I'm still not able to understand what the cause is...
-
Hmm, different crash but still nothing specific.
Fatal trap 9: general protection fault while in kernel mode
cpuid = 6; apic id = 0c
instruction pointer = 0x20:0xffffffff80d6f3f7
stack pointer = 0x28:0xfffffe000455f680
frame pointer = 0x28:0xfffffe000455f700
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 28 (dom0)
trap number = 9
panic: general protection fault
cpuid = 6
time = 1692624634
KDB: enter: panic
db:0:kdb.enter.default> bt
Tracing pid 28 tid 100208 td 0xfffff800090ed000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe000455f390
vpanic() at vpanic+0x197/frame 0xfffffe000455f3e0
panic() at panic+0x43/frame 0xfffffe000455f440
trap_fatal() at trap_fatal+0x391/frame 0xfffffe000455f4a0
trap() at trap+0x67/frame 0xfffffe000455f5b0
calltrap() at calltrap+0x8/frame 0xfffffe000455f5b0
--- trap 0x9, rip = 0xffffffff80d6f3f7, rsp = 0xfffffe000455f680, rbp = 0xfffffe000455f700 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0xd7/frame 0xfffffe000455f700
pmap_ts_referenced() at pmap_ts_referenced+0xc63/frame 0xfffffe000455f7b0
vm_pageout_worker() at vm_pageout_worker+0xf88/frame 0xfffffe000455fb70
vm_pageout() at vm_pageout+0x193/frame 0xfffffe000455fbb0
fork_exit() at fork_exit+0x7e/frame 0xfffffe000455fbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000455fbf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
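The time = values in the three panic messages look like Unix epoch seconds, so they can be converted to dates and lined up against the ESXi host and storage logs. A small Python sketch, assuming that interpretation (to_utc is just an illustrative helper name):

```python
from datetime import datetime, timezone

# "time =" values copied from the three panic messages in this thread.
PANIC_TIMES = [1690324290, 1692058926, 1692624634]


def to_utc(epoch_seconds: int) -> str:
    """Format epoch seconds as a UTC timestamp string."""
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S UTC")


if __name__ == "__main__":
    for t in PANIC_TIMES:
        print(t, "->", to_utc(t))
```

Under that assumption the crashes fall on 25 July, 15 August and 21 August 2023 (UTC), which is worth comparing with anything scheduled on the host around those times.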
Is there some reason you're not on 2.7?
You should probably stop logging those ARP movements if those MACs are known.
-
@stephenw10 thank you for the analysis.
I'm still on 2.6 because pfSense is in production and we need to be sure that the update will not cause any issues.
Do you think I can update to 2.7 without impacting the existing services (syslog-ng, snort, pfblocker-ng, iperf, and so on)?
In addition, I noticed that some installed package names are shown in yellow.
Sorry, but I didn't understand your last sentence:
"You should probably stop logging those ARP movements if those MACs are known."
What do I need to do in this case?
Thank you in advance,
Mauro
-