Netgate Discussion Forum

    Crash / panic after upgrade from 22.05 to 23.01-RC

    Plus 23.01 Development Snapshots (Retired)
      mfld LAYER 8

      All aboard my Sunday fishing expedition! 🤗

      23.01.r.20230202.1645 on an FW4B with Coreboot.

      I think I saved the salient parts, but if more is needed I can get it within a few hours. The box has crashed every few hours since the upgrade.

      Fatal trap 12: page fault while in kernel mode
      cpuid = 2; apic id = 04
      fault virtual address	= 0x460
      fault code		= supervisor read data, page not present
      instruction pointer	= 0x20:0xffffffff80eb8026
      stack pointer	        = 0x28:0xfffffe00107efec0
      frame pointer	        = 0x28:0xfffffe00107efec0
      code segment		= base 0x0, limit 0xfffff, type 0x1b
      			= DPL 0, pres 1, long 1, def32 0, gran 1
      processor eflags	= interrupt enabled, resume, IOPL = 0
      current process		= 0 (if_io_tqg_2)
      rdi:                0 rsi:                2 rdx:                1
      rcx:                0  r8:                0  r9: e82722feff67e002
      rax:                2 rbx:                0 rbp: fffffe00107efec0
      r10: fffff8000d5a41f8 r11:                8 r12: fffffe00107eff28
      r13: fffff80101205c78 r14:                0 r15: fffff80101205c00
      trap number		= 12
      panic: page fault
      cpuid = 2
      time = 1675548000
      KDB: enter: panic
      

      ^^^ Does this mean my RAM or swap file is corrupted, i.e. is this hardware related?

      The issue started the moment the box came back up from the upgrade reboot, but it wouldn't be the first time some impossibly timed coincidence sent me down the wrong track and ruined my weekend.
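
      For anyone who wants to sanity-check the hardware question against the panic text above, here is a minimal sketch in Python. It pulls out the fault address, the thread name and the timestamp, and flags a fault address that falls inside the first page. A small address like that is commonly a NULL pointer plus a field offset, which tends to point at a software bug rather than bad RAM, but this is only a heuristic and not a diagnosis. The helper names (`parse_panic`, `looks_like_null_deref`) and the 4 KiB page size are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Excerpt of the panic message from this post (unchanged values).
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address	= 0x460
fault code		= supervisor read data, page not present
current process		= 0 (if_io_tqg_2)
rdi:                0 rsi:                2 rdx:                1
trap number		= 12
panic: page fault
time = 1675548000
"""

PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def parse_panic(text):
    """Extract a few fields from a FreeBSD-style panic message (hypothetical helper)."""
    fields = {}
    m = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if m:
        fields["fault_addr"] = int(m.group(1), 16)
    m = re.search(r"current process\s*=\s*\d+\s*\((.+?)\)", text)
    if m:
        fields["thread"] = m.group(1)
    m = re.search(r"^time\s*=\s*(\d+)", text, re.MULTILINE)
    if m:
        # The panic timestamp is seconds since the Unix epoch.
        fields["time_utc"] = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return fields


def looks_like_null_deref(fault_addr):
    """Heuristic only: an address in the first page suggests NULL + small offset."""
    return fault_addr < PAGE_SIZE


if __name__ == "__main__":
    info = parse_panic(PANIC_TEXT)
    print(hex(info["fault_addr"]), info["thread"], info["time_utc"].isoformat())
    print("near-NULL fault address:", looks_like_null_deref(info["fault_addr"]))
```

      The check only looks at the fault address. It says nothing about memory health, so a RAM test is still worth running if the crashes do not follow a consistent pattern.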

      db:1:pfs> bt
      Tracing pid 0 tid 100009 td 0xfffffe0011ff2c80
      kdb_enter() at kdb_enter+0x32/frame 0xfffffe00107efc80
      vpanic() at vpanic+0x182/frame 0xfffffe00107efcd0
      panic() at panic+0x43/frame 0xfffffe00107efd30
      trap_fatal() at trap_fatal+0x409/frame 0xfffffe00107efd90
      trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00107efdf0
      calltrap() at calltrap+0x8/frame 0xfffffe00107efdf0
      --- trap 0xc, rip = 0xffffffff80eb8026, rsp = 0xfffffe00107efec0, rbp = 0xfffffe00107efec0 ---
      if_inc_counter() at if_inc_counter+0x6/frame 0xfffffe00107efec0
      looutput() at looutput+0x4f/frame 0xfffffe00107efef0
      ip6_forward() at ip6_forward+0x888/frame 0xfffffe00107efff0
      pf_refragment6() at pf_refragment6+0x164/frame 0xfffffe00107f0040
      pf_test6() at pf_test6+0x1380/frame 0xfffffe00107f01b0
      pf_check6_out() at pf_check6_out+0x40/frame 0xfffffe00107f01e0
      pfil_mbuf_out() at pfil_mbuf_out+0x35/frame 0xfffffe00107f0210
      ip6_output() at ip6_output+0x1204/frame 0xfffffe00107f0450
      icmp6_reflect() at icmp6_reflect+0x2dd/frame 0xfffffe00107f0500
      icmp6_error() at icmp6_error+0x37c/frame 0xfffffe00107f0570
      pf_route6() at pf_route6+0x7ff/frame 0xfffffe00107f0650
      pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0
      pf_route6() at pf_route6+0x6b3/frame 0xfffffe00107f08b0
      pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f0a30
      pf_check6_in() at pf_check6_in+0x5b/frame 0xfffffe00107f0a60
      pfil_mbuf_in() at pfil_mbuf_in+0x35/frame 0xfffffe00107f0a90
      ip6_input() at ip6_input+0x4af/frame 0xfffffe00107f0b70
      netisr_dispatch_src() at netisr_dispatch_src+0x2a6/frame 0xfffffe00107f0bc0
      ether_demux() at ether_demux+0x144/frame 0xfffffe00107f0bf0
      ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe00107f0c50
      netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00107f0ca0
      ether_input() at ether_input+0x69/frame 0xfffffe00107f0d00
      iflib_rxeof() at iflib_rxeof+0xbdb/frame 0xfffffe00107f0e00
      _task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe00107f0e40
      gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe00107f0ec0
      gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe00107f0ef0
      fork_exit() at fork_exit+0x7e/frame 0xfffffe00107f0f30
      fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00107f0f30
      --- trap 0x71a4e258, rip = 0xcd219e433487290e, rsp = 0x1385254b4a16942d, rbp = 0xd535aa6a74d489a9 ---
      db:1:pfs>  show registers
      cs                        0x20
      ds                        0x3b
      es                        0x3b
      fs                        0x13
      gs                        0x1b
      ss                        0x28
      rax                       0x12
      rcx                        0x1
      rdx                      0x3f8
      rbx                      0x100
      rsp         0xfffffe00107efc80
      rbp         0xfffffe00107efc80
      rsi                          0
      rdi         0xffffffff83191c28  gdb_consdev
      r8          0xfefefefefefefeff
      r9          0x8080808080808080
      r10         0xfffffe00107efb60
      r11         0xcedfc2df9afff59c
      r12                      0x400
      r13         0xfffffe00107efe00
      r14         0xfffffe00107efd10
      r15         0xfffffe0011ff2c80
      rip         0xffffffff80dd7d12  kdb_enter+0x32
      rflags                    0x82
      kdb_enter+0x32: movq    $0,0x27bc8f3(%rip)
      db:1:pfs>  show pcpu
      cpuid        = 2
      dynamic pcpu = 0xfffffe008eb9e800
      curthread    = 0xfffffe0011ff2c80: pid 0 tid 100009 critnest 1 "if_io_tqg_2"
      curpcb       = 0xfffffe0011ff31a0
      fpcurthread  = none
      idlethread   = 0xfffffe0011fc3560: tid 100005 "idle: cpu2"
      self         = 0xffffffff84612000
      curpmap      = 0xffffffff83548750
      tssp         = 0xffffffff84612384
      rsp0         = 0xfffffe00107f1000
      kcr3         = 0xffffffffffffffff
      ucr3         = 0xffffffffffffffff
      scr3         = 0x0
      gs32p        = 0xffffffff84612404
      ldt          = 0xffffffff84612444
      tss          = 0xffffffff84612434
      curvnet      = 0xfffff800051d4c00
      db:1:pfs>  run lockinfo
      db:2:lockinfo> show locks
      No such command; use "help" to list available commands
      db:2:lockinfo>  show alllocks
      No such command; use "help" to list available commands
      db:2:lockinfo>  show lockedvnods
      Locked vnodes
      
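
      To compare this backtrace with later crashes, a short script can reduce the `bt` output to function names and count the ones that appear more than once. The sketch below is in Python and uses the frames from this post. The regular expression and the helper names (`frames_after_trap`, `repeated_frames`) are assumptions made for illustration, based only on the line format shown above.

```python
import re
from collections import Counter

# Frames copied from the backtrace in this post (trap frame downwards).
BT_TEXT = """\
calltrap() at calltrap+0x8/frame 0xfffffe00107efdf0
--- trap 0xc, rip = 0xffffffff80eb8026, rsp = 0xfffffe00107efec0, rbp = 0xfffffe00107efec0 ---
if_inc_counter() at if_inc_counter+0x6/frame 0xfffffe00107efec0
looutput() at looutput+0x4f/frame 0xfffffe00107efef0
ip6_forward() at ip6_forward+0x888/frame 0xfffffe00107efff0
pf_refragment6() at pf_refragment6+0x164/frame 0xfffffe00107f0040
pf_test6() at pf_test6+0x1380/frame 0xfffffe00107f01b0
pf_check6_out() at pf_check6_out+0x40/frame 0xfffffe00107f01e0
pfil_mbuf_out() at pfil_mbuf_out+0x35/frame 0xfffffe00107f0210
ip6_output() at ip6_output+0x1204/frame 0xfffffe00107f0450
icmp6_reflect() at icmp6_reflect+0x2dd/frame 0xfffffe00107f0500
icmp6_error() at icmp6_error+0x37c/frame 0xfffffe00107f0570
pf_route6() at pf_route6+0x7ff/frame 0xfffffe00107f0650
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0
pf_route6() at pf_route6+0x6b3/frame 0xfffffe00107f08b0
pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f0a30
pf_check6_in() at pf_check6_in+0x5b/frame 0xfffffe00107f0a60
pfil_mbuf_in() at pfil_mbuf_in+0x35/frame 0xfffffe00107f0a90
ip6_input() at ip6_input+0x4af/frame 0xfffffe00107f0b70
netisr_dispatch_src() at netisr_dispatch_src+0x2a6/frame 0xfffffe00107f0bc0
"""

# Matches lines such as: "pf_test6() at pf_test6+0xce3/frame 0xfffffe00107f07d0"
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at \w+\+0x[0-9a-fA-F]+/frame 0x[0-9a-fA-F]+")


def frames_after_trap(bt_text):
    """Return function names that follow the first '--- trap' marker line."""
    frames = []
    seen_trap = False
    for line in bt_text.splitlines():
        if line.lstrip().startswith("--- trap"):
            seen_trap = True
            continue
        m = FRAME_RE.match(line)
        if m and seen_trap:
            frames.append(m.group(1))
    return frames


def repeated_frames(frames):
    """Map each function that occurs more than once to its occurrence count."""
    return {name: n for name, n in Counter(frames).items() if n > 1}


if __name__ == "__main__":
    frames = frames_after_trap(BT_TEXT)
    print("faulting function:", frames[0])
    print("repeated:", repeated_frames(frames))
```

      In this trace the faulting frame is `if_inc_counter`, and `pf_test6` and `pf_route6` each appear more than once on the way there. The script only summarises the call path. It does not identify a cause.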

      Avahi errors were spamming the log, which caused watchdog to freak out too. Initially I thought there might be a problem with that package. For testing, I uninstalled watchdog and Avahi; the log spam went away, but the crashes persist.

      nut is installed using the USB driver.

      I notice it reports losing the connection to the UPS every now and then, and it re-establishes the connection the same moment. I think this is unrelated and caused by

      rc.start_packages
      

      I also run some site-to-site WireGuard tunnels and haproxy-devel. All of it seems to work. There is one haproxy-devel notice in the log, which also seems unrelated:

      haproxy: startup error output!: [NOTICE]   (11720) : haproxy version is 2.6.6-274d1a4[NOTICE]   (11720) : path to executable is /usr/local/sbin/haproxy[WARNING]  (11720) : config : ca-file: 0 CA were loaded from '@system-ca'
      

      I tried reinstalling all packages from the UI, and I also tried:

      pkg-static clean -ay; pkg-static install -fy pkg pfSense-repo pfSense-upgrade
      pkg-static upgrade -f
      
      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.