Intermittent kernel panic on zfs_btree_remove()
-
I am running pfSense Community with the following version:
2.7.2-RELEASE (amd64)
built on Wed Dec 6 15:10:00 EST 2023
FreeBSD 14.0-CURRENT
I am experiencing intermittent kernel panics that appear to be in the ZFS filesystem modules.
The panic happens at least once a day, and I have not been able to track down the source other than that it's something with ZFS.
This is the panic snippet and the dump file is attached.
Dump File: textdump.tar.0
db:0:kdb.enter.default> bt
Tracing pid 6 tid 100230 td 0xfffffe00edd4d720
kdb_enter() at kdb_enter+0x32/frame 0xfffffe0107538850
vpanic() at vpanic+0x163/frame 0xfffffe0107538980
spl_panic() at spl_panic+0x3a/frame 0xfffffe01075389e0
zfs_btree_remove() at zfs_btree_remove+0x64/frame 0xfffffe0107538a10
range_tree_add_impl() at range_tree_add_impl+0x323/frame 0xfffffe0107538ae0
range_tree_vacate() at range_tree_vacate+0x84/frame 0xfffffe0107538b20
metaslab_sync_done() at metaslab_sync_done+0x269/frame 0xfffffe0107538ba0
vdev_sync_done() at vdev_sync_done+0x4b/frame 0xfffffe0107538be0
spa_sync() at spa_sync+0x114b/frame 0xfffffe0107538e10
txg_sync_thread() at txg_sync_thread+0x26b/frame 0xfffffe0107538ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0107538f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0107538f30
--- trap 0x1a24658, rip = 0xfffff8000788f150, rsp = 0xfffff80001aa7690, rbp = 0 ---
??() at 0xfffff8000788f150
-
Hmm, unusual error. You do have a non-responsive USB storage device attached:
ugen0.3: <USB Storage USB Storage> at usbus0
umass0 on uhub0
umass0: <USB Storage USB Storage, class 0/0, rev 2.00/14.04, addr 2> on usbus0
(probe0:umass-sim0:0:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00
(probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI status: Check Condition
(probe0:umass-sim0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:0): Error 6, Unretryable error
It's unlikely to be causing an issue but I would remove it if possible.
It could just require a ZFS scrub:
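A scrub can be started from a shell on the firewall (SSH or console). The sketch below is only one way to do it: the function name scrub_all_pools is made up for illustration, and it looks up the pool names with zpool list rather than assuming what the pool is called on your install.

```shell
#!/bin/sh
# Hypothetical helper: start a scrub on every imported ZFS pool,
# then print the pool status. Run it as root on the firewall.
scrub_all_pools() {
    if ! command -v zpool >/dev/null 2>&1; then
        echo "zpool not found" >&2
        return 1
    fi
    # -H drops the header line, -o name prints only the pool name.
    for pool in $(zpool list -H -o name); do
        zpool scrub "$pool" || echo "could not start scrub on $pool" >&2
    done
    # The scrub runs in the background; -v lists files with
    # unrecoverable errors, if there are any.
    zpool status -v
}
```

Call scrub_all_pools once, then re-run zpool status -v later to see the finished result, since the scrub itself takes a while.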
https://docs.netgate.com/pfsense/en/latest/troubleshooting/filesystem-check.html?highlight=scrub
-
I ran the scrub and no errors were reported. There is also no USB device attached other than my APC, which is on a different bus. I still got a panic with the same error.
It appears that the N100 type of mini computer has internal USB devices that may not be attached.
I am stumped on this one.
-
Some sort of internal card reader maybe? That might be USB attached. You might be able to disable that in the BIOS.
Might be something in the UPS. Try reconnecting it after boot and see what's logged.
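To see what gets logged when you reconnect it, something like the following might help. It is a rough sketch: show_usb_state is a made-up name, and the grep pattern only matches the driver names that appear in your boot log (ugen, umass, uhub), so adjust it if the UPS attaches under a different driver.

```shell
#!/bin/sh
# Hypothetical helper: show attached USB devices and the most recent
# USB-related kernel messages.
show_usb_state() {
    # List every device on every USB bus.
    usbconfig list
    # Filter the kernel message buffer for USB lines and keep the
    # last 20 matches.
    dmesg | grep -iE 'ugen|umass|uhub' | tail -n 20
}
```

Run show_usb_state before and after plugging the UPS back in and compare the two outputs.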
Make sure you're running the latest BIOS. The one you have now has errors in the ACPI tables:
acpi0: <ALASKA A M I >
Firmware Error (ACPI): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS01], AE_NOT_FOUND (20221020/dswload2-315)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20221020/psobject-372)
Firmware Error (ACPI): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS02], AE_NOT_FOUND (20221020/dswload2-315)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20221020/psobject-372)