pfsense 22.01 crashing and rebooting
-
Hello!
I'm running 22.01-RELEASE (amd64).
Recently the software started crashing and rebooting right away.
I have attached the crash log.
Sadly, I am unable to read it properly.
Any help is appreciated. Logs below.
Thank you.
-
Backtrace:
db:0:kdb.enter.default> bt
Tracing pid 3871 tid 100522 td 0xfffff80005610000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe002c3c83e0
vpanic() at vpanic+0x197/frame 0xfffffe002c3c8430
panic() at panic+0x43/frame 0xfffffe002c3c8490
pmap_remove_pages() at pmap_remove_pages+0xa1d/frame 0xfffffe002c3c8590
exec_new_vmspace() at exec_new_vmspace+0x1ce/frame 0xfffffe002c3c8600
exec_elf64_imgact() at exec_elf64_imgact+0xa8c/frame 0xfffffe002c3c86f0
kern_execve() at kern_execve+0x728/frame 0xfffffe002c3c8a40
sys_execve() at sys_execve+0x51/frame 0xfffffe002c3c8ac0
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe002c3c8bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe002c3c8bf0
--- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x48bab6, rsp = 0xc000d0f948, rbp = 0xc000d0f9e8 ---
Message Buffer:
mpt0: request 0xfffffe0000659928:41200 timed out for ccb 0xfffff80129838800 (req->ccb 0xfffff80129838800)
mpt0: attempting to abort req 0xfffffe0000659928:41200 function 0
mpt0: mpt_wait_req(1) timed out
mpt0: mpt_recover_commands: abort timed-out. Resetting controller
mpt0: mpt_cam_event: 0x3b
mpt0: mpt_cam_event: 0x3b
mpt0: completing timedout/aborted req 0xfffffe0000659928:41200
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x0
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff8137675b
stack pointer = 0x0:0xfffffe0037a45830
frame pointer = 0x0:0xfffffe0037a45900
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 20191 (telegraf)
trap number = 12
panic: page fault
cpuid = 1
time = 1654283105
KDB: enter: panic
and
<118>Bootup complete
panic: bad pte va fffff8007ff9b478 pte 0
cpuid = 0
time = 1654587675
KDB: enter: panic
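For anyone who finds these reports hard to read when they arrive flattened onto one line, here is a minimal Python sketch that splits a ddb backtrace into frames and turns the panic `time = ...` fields into UTC timestamps. The helper names `split_frames` and `panic_times` are made up for illustration, and the regular expressions only assume the frame and time formats visible in the logs above; this is not part of pfSense or FreeBSD.

```python
import re
from datetime import datetime, timezone

# One ddb frame looks like: "vpanic() at vpanic+0x197/frame 0xfffffe002c3c8430"
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)")
# The panic message carries "time = <seconds since the Unix epoch>"
TIME_RE = re.compile(r"\btime = (\d+)")


def split_frames(backtrace):
    """Return (function, offset) pairs in the order printed (innermost first)."""
    return [(m.group(1), m.group(2)) for m in FRAME_RE.finditer(backtrace)]


def panic_times(message_buffer):
    """Return every panic time in the text as a 'YYYY-MM-DD HH:MM:SS UTC' string."""
    return [
        datetime.fromtimestamp(int(secs), tz=timezone.utc).strftime(
            "%Y-%m-%d %H:%M:%S UTC"
        )
        for secs in TIME_RE.findall(message_buffer)
    ]


if __name__ == "__main__":
    # Shortened samples copied from the crash report above.
    bt = (
        "panic() at panic+0x43/frame 0xfffffe002c3c8490 "
        "pmap_remove_pages() at pmap_remove_pages+0xa1d/frame 0xfffffe002c3c8590 "
        "exec_new_vmspace() at exec_new_vmspace+0x1ce/frame 0xfffffe002c3c8600"
    )
    for func, offset in split_frames(bt):
        print(f"{func:24s} +{offset}")

    msgbuf = "panic: page fault cpuid = 1 time = 1654283105 KDB: enter: panic"
    print(panic_times(msgbuf))  # the first panic: 2022-06-03 19:05:05 UTC
```

Run against the two panics above, the time fields decode to 2022-06-03 and 2022-06-07 (UTC), so the crashes were about four days apart.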
That first panic looks like a drive controller issue, but that seems unlikely given it's running in VMware?
Bad PTE looks like bad hardware (usually RAM), but that also seems unlikely.
Is this the first time you've seen it?
Steve
-
Hey,
thanks for the quick answer.
Indeed the firewall is running inside an ESXi hypervisor ESXi-7.0b-16324942-standard.
I have pretty much mirrored firewalls running on the same kind of box at different locations without any issues whatsoever, with amazing uptimes. These crashes started appearing recently after upgrading to the pfSense+ software.
The two boxes are running a WireGuard site-to-site tunnel.
I noticed the instability 2 weeks ago, without any significant changes to the overall system installation.
Sometimes the uptime doesn't go beyond 20 minutes before a restart, and each restart results in an uploaded crash log.
Are there any further steps I can take to narrow down this issue?
Cheers
-
I'm not really too familiar with ESXi but it looks like it's not running as a thin disk? I know FreeBSD can choke on that though not with that error normally.
No updates in the hypervisor?
Had you been running 22.01 for some time before these crashes started?
Steve
-
I have one instance of v22.01 running in production on ESXi-7.0U3d-19482537-standard and one on ESXi-6.7.0-20191204001-standard. No issues, thick provisioned.
-
Both installs are running as Thick Provision Lazy Zeroed.
The first install is having no issues at all.
Site A
Site B
Nothing special in the hypervisor. The machine just goes into a reboot. No high disk writes/reads or anything.
I have been running the CE Edition ever since without any issues on Site A.
Started Site B with CE - no issues.
Moved to + - frequent issues.
It's hard to say whether the update from CE to + is the issue. I don't think that could be it, right?
-
Same host on each site? Same VM settings?
-
What VM hardware version are you running on those VMs? Usually weird/unexplained instability and panics like that are from running a VM hardware version (or ESX version) not fully compatible with the version of FreeBSD used on the guest.
I no longer use ESX here (moved everything to Proxmox VE) so I can't speak to how things work on recent versions of ESX or specific VM hardware versions, but generally speaking it's safest to upgrade them to the most recent available VM hardware version. Sometimes with a much newer base/ESX it might not be a bad idea to keep it on an older version but that situation is more rare.
-
@stephenw10 Indeed. All settings are the same.
@jimp said in pfsense 22.01 crashing and rebooting:
What VM hardware version are you running on those VMs? Usually weird/unexplained instability and panics like that are from running a VM hardware version (or ESX version) not fully compatible with the version of FreeBSD used on the guest.
I no longer use ESX here (moved everything to Proxmox VE) so I can't speak to how things work on recent versions of ESX or specific VM hardware versions, but generally speaking it's safest to upgrade them to the most recent available VM hardware version. Sometimes with a much newer base/ESX it might not be a bad idea to keep it on an older version but that situation is more rare.
ESXi 7.0
The issue has been resolved by now. I messed with hardware offloading and a few other settings. I'm not sure which change fixed it, but the firewalls are now stable again.
I installed the latest updates as well.