Crash / panic after upgrade from 22.05 to 23.01-RC
All aboard my Sunday fishing expedition!
23.01.r.20230202.1645 on an FW4B with Coreboot.
I think I saved the salient parts, but if more is needed I can get it within a few hours. It has crashed every few hours since the upgrade.
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address = 0x460
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80eb8026
stack pointer = 0x28:0xfffffe00107efec0
frame pointer = 0x28:0xfffffe00107efec0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (if_io_tqg_2)
rdi: 0 rsi: 2 rdx: 1
rcx: 0 r8: 0 r9: e82722feff67e002
rax: 2 rbx: 0 rbp: fffffe00107efec0
r10: fffff8000d5a41f8 r11: 8 r12: fffffe00107eff28
r13: fffff80101205c78 r14: 0 r15: fffff80101205c00
trap number = 12
panic: page fault
cpuid = 2
time = 1675548000
KDB: enter: panic
^^^ Does this mean my RAM or swap file is corrupted, i.e. is this hardware related?
The issue started the moment it came back up from the upgrade reboot, but it wouldn't be the first time some impossibly timed coincidence sent me down the wrong track and ruined my weekend.
db:1:pfs> bt
Tracing pid 0 tid 100009 td 0xfffffe0011ff2c80
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00107efc80
vpanic() at vpanic+0x182/frame 0xfffffe00107efcd0
panic() at panic+0x43/frame 0xfffffe00107efd30
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00107efd90
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00107efdf0
calltrap() at calltrap+0x8/frame 0xfffffe00107efdf0
--- trap 0xc, rip = 0xffffffff80eb8026, rsp = 0xfffffe00107efec0, rbp = 0xfffffe00107efec0 ---
if_inc_counter() at if_inc_counter+0x6/frame 0xfffffe00107efec0
looutput() at looutput+0x4f/frame 0xfffffe00107efef0
ip6_forward() at ip6_forward+0x888/frame 0xfffffe00107efff0
pf_refragment6() at pf_refragment6+0x164/frame 0xfffffe00107f0040
pf_test6() at pf_test6+0x1380/frame 0xfffffe00107f01b0
pf_check6_out() at pf_check6_out+0x40/frame 0xfffffe00107f01e0
pfil_mbuf_out() at pfil_mbuf_out+0x35/frame 0xfffffe00107f0210
ip6_output() at ip6_output+0x1204/frame 0xfffffe00107f0450
icmp6_reflect() at icmp6_reflect+0x2dd/frame 0xfffffe00107f0500
icmp6_error() at icmp6_error+0x37c/frame 0xfffffe00107f0570
pf_route6() at pf_route6+0x7ff/frame 0xfffffe00107f0650
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0
pf_route6() at pf_route6+0x6b3/frame 0xfffffe00107f08b0
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f0a30
pf_check6_in() at pf_check6_in+0x5b/frame 0xfffffe00107f0a60
pfil_mbuf_in() at pfil_mbuf_in+0x35/frame 0xfffffe00107f0a90
ip6_input() at ip6_input+0x4af/frame 0xfffffe00107f0b70
netisr_dispatch_src() at netisr_dispatch_src+0x2a6/frame 0xfffffe00107f0bc0
ether_demux() at ether_demux+0x144/frame 0xfffffe00107f0bf0
ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe00107f0c50
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00107f0ca0
ether_input() at ether_input+0x69/frame 0xfffffe00107f0d00
iflib_rxeof() at iflib_rxeof+0xbdb/frame 0xfffffe00107f0e00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe00107f0e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe00107f0ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe00107f0ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00107f0f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00107f0f30
--- trap 0x71a4e258, rip = 0xcd219e433487290e, rsp = 0x1385254b4a16942d, rbp = 0xd535aa6a74d489a9 ---
db:1:pfs> show registers
cs 0x20
ds 0x3b
es 0x3b
fs 0x13
gs 0x1b
ss 0x28
rax 0x12
rcx 0x1
rdx 0x3f8
rbx 0x100
rsp 0xfffffe00107efc80
rbp 0xfffffe00107efc80
rsi 0
rdi 0xffffffff83191c28 gdb_consdev
r8 0xfefefefefefefeff
r9 0x8080808080808080
r10 0xfffffe00107efb60
r11 0xcedfc2df9afff59c
r12 0x400
r13 0xfffffe00107efe00
r14 0xfffffe00107efd10
r15 0xfffffe0011ff2c80
rip 0xffffffff80dd7d12 kdb_enter+0x32
rflags 0x82
kdb_enter+0x32: movq $0,0x27bc8f3(%rip)
db:1:pfs> show pcpu
cpuid = 2
dynamic pcpu = 0xfffffe008eb9e800
curthread = 0xfffffe0011ff2c80: pid 0 tid 100009 critnest 1 "if_io_tqg_2"
curpcb = 0xfffffe0011ff31a0
fpcurthread = none
idlethread = 0xfffffe0011fc3560: tid 100005 "idle: cpu2"
self = 0xffffffff84612000
curpmap = 0xffffffff83548750
tssp = 0xffffffff84612384
rsp0 = 0xfffffe00107f1000
kcr3 = 0xffffffffffffffff
ucr3 = 0xffffffffffffffff
scr3 = 0x0
gs32p = 0xffffffff84612404
ldt = 0xffffffff84612444
tss = 0xffffffff84612434
curvnet = 0xfffff800051d4c00
db:1:pfs> run lockinfo
db:2:lockinfo> show locks
No such command; use "help" to list available commands
db:2:lockinfo> show alllocks
No such command; use "help" to list available commands
db:2:lockinfo> show lockedvnods
Locked vnodes
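For anyone who wants to skim the trace rather than read every frame, here is a minimal Python sketch that pulls the function names out of pasted `bt` output and counts the ones that repeat. It is not part of pfSense or ddb; the names `frame_functions` and `repeated_functions` are made up for illustration, and the regex assumes frames look like the ones in this dump. Run against the trace above, it should flag functions such as pf_test6 and pf_route6, which show up more than once.

```python
import re

# Assumed frame shape, taken from the dump in this post:
#   "pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0"
# \s+ before the frame address tolerates a line break in a pasted trace.
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+0x[0-9a-f]+/frame\s+0x[0-9a-f]+")


def frame_functions(trace: str) -> list[str]:
    """Return the function name of every frame, in the order printed."""
    return FRAME_RE.findall(trace)


def repeated_functions(trace: str) -> dict[str, int]:
    """Return only the functions that appear in more than one frame."""
    counts: dict[str, int] = {}
    for name in frame_functions(trace):
        counts[name] = counts.get(name, 0) + 1
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Four frames copied from the backtrace in this post.
    sample = (
        "pf_route6() at pf_route6+0x7ff/frame 0xfffffe00107f0650 "
        "pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0 "
        "pf_route6() at pf_route6+0x6b3/frame 0xfffffe00107f08b0 "
        "pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f0a30"
    )
    print(repeated_functions(sample))
```

This only summarizes the text of the trace; it says nothing about why the kernel faulted.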
Avahi errors were spamming the log, which caused watchdog to freak out too. Initially I thought there might be a problem with that package. For testing, I uninstalled watchdog and Avahi; the log spam went away, but the crashes persist.
nut is installed using the USB driver.
I notice it claiming it loses the connection to the UPS every now and then. It re-establishes the connection the same moment. I think this is unrelated and caused by rc.start_packages.
Also running are some site-to-site WireGuard tunnels and haproxy-devel. All seems to work. There is one haproxy-devel notice in the log, which also seems unrelated:
haproxy: startup error output!:
[NOTICE] (11720) : haproxy version is 2.6.6-274d1a4
[NOTICE] (11720) : path to executable is /usr/local/sbin/haproxy
[WARNING] (11720) : config : ca-file: 0 CA were loaded from '@system-ca'
I tried reinstalling all packages from the UI, and I also tried:
pkg-static clean -ay; pkg-static install -fy pkg pfSense-repo pfSense-upgrade
pkg-static upgrade -f