Netgate Discussion Forum

    Panic String: page fault after migrating to a baremetal install

    Hardware
    11 Posts 2 Posters 289 Views
    • stephenw10 Netgate Administrator

      Backtrace:

      db:1:pfs> bt
      Tracing pid 7 tid 100234 td 0xfffff80102ee0000
      kdb_enter() at kdb_enter+0x33/frame 0xfffffe0103ee2c40
      panic() at panic+0x43/frame 0xfffffe0103ee2ca0
      trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0103ee2d00
      trap_pfault() at trap_pfault+0x46/frame 0xfffffe0103ee2d50
      calltrap() at calltrap+0x8/frame 0xfffffe0103ee2d50
      --- trap 0xc, rip = 0xffffffff80fcc88b, rsp = 0xfffffe0103ee2e20, rbp = 0xfffffe0103ee2e40 ---
      pf_state_expires() at pf_state_expires+0xb/frame 0xfffffe0103ee2e40
      pf_purge_expired_states() at pf_purge_expired_states+0xd8/frame 0xfffffe0103ee2e90
      pf_purge_thread() at pf_purge_thread+0x15b/frame 0xfffffe0103ee2ef0
      fork_exit() at fork_exit+0x7b/frame 0xfffffe0103ee2f30
      fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0103ee2f30
      --- trap 0x5668643d, rip = 0xde2f6d95d26f6d91, rsp = 0x86e709278aa70923, rbp = 0x3ad825083698250c ---
      

      Panic:

      Fatal trap 12: page fault while in kernel mode
      cpuid = 11; apic id = 0b
      fault virtual address	= 0x100000012
      fault code		= supervisor read data, page not present
      instruction pointer	= 0x20:0xffffffff80fcc88b
      stack pointer	        = 0x28:0xfffffe0103ee2e20
      frame pointer	        = 0x28:0xfffffe0103ee2e40
      code segment		= base 0x0, limit 0xfffff, type 0x1b
      			= DPL 0, pres 1, long 1, def32 0, gran 1
      processor eflags	= interrupt enabled, resume, IOPL = 0
      current process		= 7 (pf purge)
      rdi: 0000000100000000 rsi: 000000000000000c rdx: 0000000000000580
      rcx: 0000000000383b50  r8: 000000000000b000  r9: 0000000000000fff
      rax: fffffe01145a0a80 rbx: 00000000000b3f10 rbp: fffffe0103ee2e40
      r10: 0000000000001388 r11: 00000000815eda0a r12: 0000000000000000
      r13: fffff80102ee0000 r14: 0000000100000000 r15: fffffe01145a0aa0
      trap number		= 12
      panic: page fault
      cpuid = 11
      time = 1751555338
      KDB: enter: panic
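As a rough reading of the register dump above (a sketch, not a diagnosis): the fault address sits a small offset past the value in rdi, which on amd64 normally carries the first function argument. That value is not a kernel address and has a single bit set. The arithmetic below uses only numbers copied from the panic output; the kernel-address threshold is an assumption based on the usual amd64 upper canonical half.

```python
# Values copied from the panic output above.
fault_addr = 0x100000012
rdi = 0x0000000100000000

# Assumption: kernel virtual addresses live in the upper canonical half.
KERNEL_BASE = 0xFFFF800000000000

offset = fault_addr - rdi                      # distance of the faulting read from rdi
set_bits = [b for b in range(64) if (rdi >> b) & 1]
looks_like_kernel_pointer = rdi >= KERNEL_BASE

print(f"offset from rdi: {offset:#x}")
print(f"bits set in rdi: {set_bits}")
print(f"plausible kernel pointer: {looks_like_kernel_pointer}")
```

A pointer with one stray bit set is consistent with memory corruption, but it does not prove it; a software bug can produce the same value.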
      

      Is that the same backtrace every time?

      It looks like this issue, which was only seen once: https://redmine.pfsense.org/issues/13417

      The message buffer is full of these ARP movements:

      <6>arp: 10.27.27.19 moved from 48:b4:23:e1:a5:4b to f0:2f:74:7e:13:d0 on igc1
      <6>arp: 10.27.27.19 moved from f0:2f:74:7e:13:d0 to 48:b4:23:e1:a5:4b on igc1
      <6>arp: 10.27.27.19 moved from 48:b4:23:e1:a5:4b to f0:2f:74:7e:13:d0 on igc1
      <6>arp: 10.27.27.19 moved from f0:2f:74:7e:13:d0 to 48:b4:23:e1:a5:4b on igc1
      

      Is that expected?
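To judge whether the ARP movements above are a steady two-MAC flap or something wider, the message buffer lines can be tallied per IP. This is a minimal sketch that assumes the log lines keep the exact `arp: <ip> moved from <mac> to <mac> on <iface>` shape shown above; the function name `summarize_arp_moves` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the kernel message format shown in the message buffer above.
ARP_MOVE = re.compile(r"arp: (\S+) moved from (\S+) to (\S+) on (\S+)")

def summarize_arp_moves(lines):
    """Count moves per IP and collect the MACs and interfaces involved."""
    summary = defaultdict(lambda: {"moves": 0, "macs": set(), "ifaces": set()})
    for line in lines:
        m = ARP_MOVE.search(line)
        if not m:
            continue  # ignore unrelated log lines
        ip, old_mac, new_mac, iface = m.groups()
        entry = summary[ip]
        entry["moves"] += 1
        entry["macs"].update((old_mac, new_mac))
        entry["ifaces"].add(iface)
    return dict(summary)

sample = [
    "<6>arp: 10.27.27.19 moved from 48:b4:23:e1:a5:4b to f0:2f:74:7e:13:d0 on igc1",
    "<6>arp: 10.27.27.19 moved from f0:2f:74:7e:13:d0 to 48:b4:23:e1:a5:4b on igc1",
    "<6>arp: 10.27.27.19 moved from 48:b4:23:e1:a5:4b to f0:2f:74:7e:13:d0 on igc1",
    "<6>arp: 10.27.27.19 moved from f0:2f:74:7e:13:d0 to 48:b4:23:e1:a5:4b on igc1",
]
result = summarize_arp_moves(sample)
for ip, info in result.items():
    print(ip, info["moves"], sorted(info["macs"]), sorted(info["ifaces"]))
```

One IP alternating between exactly two MACs on one interface fits either a multi-homed device or an address conflict, so the count alone does not settle which it is.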

      There is a Realtek NIC that's failing to attach. It looks like the same device failing 4 times. You should remove or disable that if you're not using it.

      re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xf000-0xf0ff mem 0xfc704000-0xfc704fff,0xfc700000-0xfc703fff at device 0.0 on pci4
      re0: Using 1 MSI-X message
      re0: Chip rev. 0x54000000
      re0: MAC rev. 0x00100000
      re0: attaching PHYs failed
      device_attach: re0 attach returned 6
      
      • KingCobra021

        Thanks for the reply.

        Is that the same backtrace every time?
        Yes.

        Is that expected?
        That looks like the NIC's IP, so maybe.

        The re0 is not in use, and the Realtek drivers are disabled and deleted. However, re0 is the motherboard's NIC, so I can't remove it, though I can disable it. Also, since I removed the drivers, it doesn't even show up as an interface that could be enabled.

        • stephenw10 Netgate Administrator

          Is that IP actually a device that should be moving between MAC addresses, though? Like a bonded link or some load balancer?
          If it's not, you might have an IP conflict.

          Yes, you should disable that re NIC. It's only using resources.

          However, neither should be causing that panic.

          • KingCobra021 @stephenw10

            @stephenw10

            The device is expected to have multiple MACs, and I will disable the NIC through the terminal. Since you said neither should cause the panic, do you have any idea where I can look for a possible cause?

            • stephenw10 Netgate Administrator

              Not yet, still digging.

              Is there anything in the main system log just before this happens?

              Do you have anything 'exotic' configured using rules? Scheduled rules, maybe, or UPnP?

              • stephenw10 Netgate Administrator

                Do you have more reports we can review?

                How often does this happen?

                • KingCobra021 @stephenw10

                  @stephenw10
                  Hi, it used to happen every hour, but now that I have disabled the Realtek NIC on the motherboard, it happens less often. Here are the new dump files. This time it seems the panic didn't cause any outage, unlike before.
                  info.0
                  textdump.tar.0

                  • KingCobra021 @KingCobra021

                    @KingCobra021
                    info.0 textdump.tar.0

                    • stephenw10 Netgate Administrator

                      Mmm, OK, that last one is a completely different panic. You should run a memory test to rule out a RAM fault, because that would certainly explain it.

                      Completely unrelated panics like that are almost always hardware.
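One way to check whether successive crash reports really are unrelated is to pull the faulting function out of each backtrace and compare. The sketch below assumes the ddb `bt` layout seen in the first report in this thread (frames as `name() at name+0x...`, with the faulting frame directly after the first `--- trap` line); the helper name `faulting_function` is hypothetical.

```python
import re

FRAME = re.compile(r"^\s*(\w+)\(\) at ")   # e.g. "pf_state_expires() at ..."
TRAP = re.compile(r"^\s*--- trap ")        # marker separating trap handling from the faulting code

def faulting_function(backtrace):
    """Return the function named in the first frame after the first trap marker."""
    seen_trap = False
    for line in backtrace.splitlines():
        if not seen_trap:
            seen_trap = bool(TRAP.match(line))
            continue
        m = FRAME.match(line)
        if m:
            return m.group(1)
    return None  # no trap marker or no frame after it

# Excerpt of the backtrace posted earlier in this thread.
first_report = """\
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0103ee2d50
calltrap() at calltrap+0x8/frame 0xfffffe0103ee2d50
--- trap 0xc, rip = 0xffffffff80fcc88b, rsp = 0xfffffe0103ee2e20, rbp = 0xfffffe0103ee2e40 ---
pf_state_expires() at pf_state_expires+0xb/frame 0xfffffe0103ee2e40
pf_purge_expired_states() at pf_purge_expired_states+0xd8/frame 0xfffffe0103ee2e90
"""
print(faulting_function(first_report))
```

Running this over each saved backtrace and comparing the results gives a quick view of whether the panics cluster in one code path or are scattered, with scattered results pointing more toward hardware.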

                      • KingCobra021 @stephenw10

                        @stephenw10
                        OK, I sure will. Thanks.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.