Crash / panic after upgrade from 22.05 to 23.01-RC

mfld

All aboard my Sunday fishing expedition!

23.01.r.20230202.1645 on an FW4B with Coreboot.

I think I saved the salient parts, but if more is needed I can get it within a few hours. It has been crashing every few hours since the upgrade.

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address	= 0x460
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80eb8026
stack pointer	        = 0x28:0xfffffe00107efec0
frame pointer	        = 0x28:0xfffffe00107efec0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 0 (if_io_tqg_2)
rdi:                0 rsi:                2 rdx:                1
rcx:                0  r8:                0  r9: e82722feff67e002
rax:                2 rbx:                0 rbp: fffffe00107efec0
r10: fffff8000d5a41f8 r11:                8 r12: fffffe00107eff28
r13: fffff80101205c78 r14:                0 r15: fffff80101205c00
trap number		= 12
panic: page fault
cpuid = 2
time = 1675548000
KDB: enter: panic

^^^ Does this mean my RAM or swap file is corrupted, i.e. is it hardware related?
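
A rough way to sanity-check that question from the trap output itself: the fault address is 0x460 and rdi is 0, and an address that small often means kernel code read a struct field through a NULL pointer (NULL plus a field offset), which would point at software rather than bad RAM. The sketch below is only a heuristic under that assumption, not a diagnosis; the 4 KiB page size and the helper names are illustrative choices, not anything taken from pfSense or FreeBSD tooling.

```python
import re

# Excerpt of the trap output from this post.
TRAP = """\
fault virtual address	= 0x460
fault code		= supervisor read data, page not present
rdi:                0 rsi:                2 rdx:                1
"""

PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as on amd64


def fault_address(text):
    """Return the faulting virtual address as an int, or None if not found."""
    m = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    return int(m.group(1), 16) if m else None


def looks_like_null_deref(addr):
    """Heuristic only: an address inside the first page is usually
    a NULL pointer plus a small field offset."""
    return addr is not None and addr < PAGE_SIZE


addr = fault_address(TRAP)
print(hex(addr), looks_like_null_deref(addr))  # prints: 0x460 True
```

A True result here does not rule out hardware trouble; it only says the address pattern is the one a NULL pointer dereference would produce.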

The issue started the moment the box came back up from the upgrade reboot, but it wouldn't be the first time some impossibly timed coincidence sent me down the wrong track and ruined my weekend.

db:1:pfs> bt
Tracing pid 0 tid 100009 td 0xfffffe0011ff2c80
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00107efc80
vpanic() at vpanic+0x182/frame 0xfffffe00107efcd0
panic() at panic+0x43/frame 0xfffffe00107efd30
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00107efd90
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00107efdf0
calltrap() at calltrap+0x8/frame 0xfffffe00107efdf0
--- trap 0xc, rip = 0xffffffff80eb8026, rsp = 0xfffffe00107efec0, rbp = 0xfffffe00107efec0 ---
if_inc_counter() at if_inc_counter+0x6/frame 0xfffffe00107efec0
looutput() at looutput+0x4f/frame 0xfffffe00107efef0
ip6_forward() at ip6_forward+0x888/frame 0xfffffe00107efff0
pf_refragment6() at pf_refragment6+0x164/frame 0xfffffe00107f0040
pf_test6() at pf_test6+0x1380/frame 0xfffffe00107f01b0
pf_check6_out() at pf_check6_out+0x40/frame 0xfffffe00107f01e0
pfil_mbuf_out() at pfil_mbuf_out+0x35/frame 0xfffffe00107f0210
ip6_output() at ip6_output+0x1204/frame 0xfffffe00107f0450
icmp6_reflect() at icmp6_reflect+0x2dd/frame 0xfffffe00107f0500
icmp6_error() at icmp6_error+0x37c/frame 0xfffffe00107f0570
pf_route6() at pf_route6+0x7ff/frame 0xfffffe00107f0650
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0
pf_route6() at pf_route6+0x6b3/frame 0xfffffe00107f08b0
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f0a30
pf_check6_in() at pf_check6_in+0x5b/frame 0xfffffe00107f0a60
pfil_mbuf_in() at pfil_mbuf_in+0x35/frame 0xfffffe00107f0a90
ip6_input() at ip6_input+0x4af/frame 0xfffffe00107f0b70
netisr_dispatch_src() at netisr_dispatch_src+0x2a6/frame 0xfffffe00107f0bc0
ether_demux() at ether_demux+0x144/frame 0xfffffe00107f0bf0
ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe00107f0c50
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00107f0ca0
ether_input() at ether_input+0x69/frame 0xfffffe00107f0d00
iflib_rxeof() at iflib_rxeof+0xbdb/frame 0xfffffe00107f0e00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe00107f0e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe00107f0ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe00107f0ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00107f0f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00107f0f30
--- trap 0x71a4e258, rip = 0xcd219e433487290e, rsp = 0x1385254b4a16942d, rbp = 0xd535aa6a74d489a9 ---
db:1:pfs>  show registers
cs                        0x20
ds                        0x3b
es                        0x3b
fs                        0x13
gs                        0x1b
ss                        0x28
rax                       0x12
rcx                        0x1
rdx                      0x3f8
rbx                      0x100
rsp         0xfffffe00107efc80
rbp         0xfffffe00107efc80
rsi                          0
rdi         0xffffffff83191c28  gdb_consdev
r8          0xfefefefefefefeff
r9          0x8080808080808080
r10         0xfffffe00107efb60
r11         0xcedfc2df9afff59c
r12                      0x400
r13         0xfffffe00107efe00
r14         0xfffffe00107efd10
r15         0xfffffe0011ff2c80
rip         0xffffffff80dd7d12  kdb_enter+0x32
rflags                    0x82
kdb_enter+0x32: movq    $0,0x27bc8f3(%rip)
db:1:pfs>  show pcpu
cpuid        = 2
dynamic pcpu = 0xfffffe008eb9e800
curthread    = 0xfffffe0011ff2c80: pid 0 tid 100009 critnest 1 "if_io_tqg_2"
curpcb       = 0xfffffe0011ff31a0
fpcurthread  = none
idlethread   = 0xfffffe0011fc3560: tid 100005 "idle: cpu2"
self         = 0xffffffff84612000
curpmap      = 0xffffffff83548750
tssp         = 0xffffffff84612384
rsp0         = 0xfffffe00107f1000
kcr3         = 0xffffffffffffffff
ucr3         = 0xffffffffffffffff
scr3         = 0x0
gs32p        = 0xffffffff84612404
ldt          = 0xffffffff84612444
tss          = 0xffffffff84612434
curvnet      = 0xfffff800051d4c00
db:1:pfs>  run lockinfo
db:2:lockinfo> show locks
No such command; use "help" to list available commands
db:2:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:2:lockinfo>  show lockedvnods
Locked vnodes
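
To make the backtrace above easier to scan, here is a minimal sketch that counts how often each function appears among the frames. It only uses frames copied from this post (not the full trace), and the regular expression and helper names are my own assumptions about the ddb line format, not part of any official tool. Functions that show up more than once, such as pf_test6 and pf_route6 here, are the re-entrant part of the call path.

```python
import re
from collections import Counter

# A few frames copied from the backtrace in this post (not the full trace).
BT = """\
if_inc_counter() at if_inc_counter+0x6/frame 0xfffffe00107efec0
looutput() at looutput+0x4f/frame 0xfffffe00107efef0
ip6_forward() at ip6_forward+0x888/frame 0xfffffe00107efff0
pf_refragment6() at pf_refragment6+0x164/frame 0xfffffe00107f0040
pf_test6() at pf_test6+0x1380/frame 0xfffffe00107f01b0
pf_route6() at pf_route6+0x7ff/frame 0xfffffe00107f0650
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0
pf_route6() at pf_route6+0x6b3/frame 0xfffffe00107f08b0
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f0a30
"""

# Assumed frame format: "name() at name+0xOFF/frame 0xADDR".
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def frame_functions(text):
    """Return function names in stack order, skipping non-frame lines."""
    names = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            names.append(m.group(1))
    return names


def repeated_functions(text):
    """Return {function: count} for functions seen in more than one frame."""
    counts = Counter(frame_functions(text))
    return {name: n for name, n in counts.items() if n > 1}


print(repeated_functions(BT))  # prints: {'pf_test6': 3, 'pf_route6': 2}
```

Lines such as the "--- trap 0xc ..." separators do not match the pattern and are simply ignored.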

Avahi errors were spamming the log, which made watchdog freak out, too. Initially I thought there might be a problem with that package. For testing, I uninstalled watchdog and Avahi; the log spam went away, but the crashes persist.

nut is installed using the USB driver.

I notice it claims to lose the connection to the UPS every now and then, then re-establishes it the same moment. I think this is unrelated and caused by

rc.start_packages

There are also some site-to-site WireGuard tunnels and haproxy-devel. All seems to work; there is one haproxy-devel notice in the log, which also seems unrelated:

haproxy: startup error output!: [NOTICE]   (11720) : haproxy version is 2.6.6-274d1a4[NOTICE]   (11720) : path to executable is /usr/local/sbin/haproxy[WARNING]  (11720) : config : ca-file: 0 CA were loaded from '@system-ca'

I tried reinstalling all packages from the UI, and I also tried

pkg-static clean -ay; pkg-static install -fy pkg pfSense-repo pfSense-upgrade
pkg-static upgrade -f