Netgate Discussion Forum

    25.11: Fatal trap 12 on reboot

    Plus 25.11 Development Snapshots
    • w0w

      Both HA firewalls panic with a “fatal trap” during reboot—one on every boot, the other only occasionally.

      2025-09-19 19:23:05.667018+03:00 	kernel 	- 	---<<BOOT>>---
      2025-09-19 19:23:05.666999+03:00 	kernel 	- 	KDB: enter: panic
      2025-09-19 19:23:05.666978+03:00 	kernel 	- 	time = 1758298825
      2025-09-19 19:23:05.666958+03:00 	kernel 	- 	cpuid = 7
      2025-09-19 19:23:05.666932+03:00 	kernel 	- 	panic: page fault
      2025-09-19 19:23:05.666912+03:00 	kernel 	- 	trap number = 12
      2025-09-19 19:23:05.666891+03:00 	kernel 	- 	r13: 0000000000000003 r14: fffffe00fbc60e08 r15: 0000000000000000
      2025-09-19 19:23:05.666870+03:00 	kernel 	- 	r10: 0000000000000001 r11: 0000000000002af8 r12: 00001c2ab4006873
      2025-09-19 19:23:05.666848+03:00 	kernel 	- 	rax: 00001c2ab4006873 rbx: fffffe002efd8a00 rbp: fffffe00fbc60df0
      2025-09-19 19:23:05.666826+03:00 	kernel 	- 	rcx: fffff80056cd8d30 r8: fffff8010268f550 r9: fffffe00fbc61000
      2025-09-19 19:23:05.666804+03:00 	kernel 	- 	rdi: fffffe002efd8a00 rsi: ffffffff828c1e00 rdx: fffffe002efd8a68
      2025-09-19 19:23:05.666782+03:00 	kernel 	- 	current process = 0 (softirq_7)
      2025-09-19 19:23:05.666762+03:00 	kernel 	- 	processor eflags = interrupt enabled, resume, IOPL = 0
      2025-09-19 19:23:05.666741+03:00 	kernel 	- 	= DPL 0, pres 1, long 1, def32 0, gran 1
      2025-09-19 19:23:05.666719+03:00 	kernel 	- 	code segment = base 0x0, limit 0xfffff, type 0x1b
      2025-09-19 19:23:05.666699+03:00 	kernel 	- 	frame pointer = 0x28:0xfffffe00fbc60df0
      2025-09-19 19:23:05.666678+03:00 	kernel 	- 	stack pointer = 0x28:0xfffffe00fbc60db0
      2025-09-19 19:23:05.666657+03:00 	kernel 	- 	instruction pointer = 0x20:0xffffffff80409b92
      2025-09-19 19:23:05.666637+03:00 	kernel 	- 	fault code = supervisor read data, page not present
      2025-09-19 19:23:05.666616+03:00 	kernel 	- 	fault virtual address = 0x1c2ab4006873
      2025-09-19 19:23:05.666596+03:00 	kernel 	- 	cpuid = 7; apic id = 0e
      2025-09-19 19:23:05.666574+03:00 	kernel 	- 	Fatal trap 12: page fault while in kernel mode
      
      2025-09-19 22:16:19.014599+03:00 	kernel 	- 	---<<BOOT>>---
      2025-09-19 22:16:19.014502+03:00 	kernel 	- 	KDB: enter: panic
      2025-09-19 22:16:19.014403+03:00 	kernel 	- 	time = 1758309297
      2025-09-19 22:16:19.014330+03:00 	kernel 	- 	cpuid = 0
      2025-09-19 22:16:19.014254+03:00 	kernel 	- 	panic: page fault
      2025-09-19 22:16:19.014181+03:00 	kernel 	- 	trap number = 12
      2025-09-19 22:16:19.014111+03:00 	kernel 	- 	r13: fffff80339a92c80 r14: fffff8003c20b140 r15: fffff80339a92c80
      2025-09-19 22:16:19.014041+03:00 	kernel 	- 	r10: 0000000000000002 r11: fffff800018e6c03 r12: fffff8014f66d900
      2025-09-19 22:16:19.013974+03:00 	kernel 	- 	rax: fffffe016bec9cf0 rbx: fffff8021890b3d0 rbp: fffffe00af41ec40
      2025-09-19 22:16:19.013901+03:00 	kernel 	- 	rcx: fffff80339589598 r8: 000000000000007b r9: fffff8000191b498
      2025-09-19 22:16:19.013825+03:00 	kernel 	- 	rdi: fffff8021890b3d0 rsi: fffff80339a92c80 rdx: 0000000000000038
      2025-09-19 22:16:19.013757+03:00 	kernel 	- 	current process = 95403 (reboot)
      2025-09-19 22:16:19.013689+03:00 	kernel 	- 	processor eflags = interrupt enabled, resume, IOPL = 0
      2025-09-19 22:16:19.013618+03:00 	kernel 	- 	= DPL 0, pres 1, long 1, def32 0, gran 1
      2025-09-19 22:16:19.013522+03:00 	kernel 	- 	code segment = base 0x0, limit 0xfffff, type 0x1b
      2025-09-19 22:16:19.013454+03:00 	kernel 	- 	frame pointer = 0x28:0xfffffe00af41ec40
      2025-09-19 22:16:19.013380+03:00 	kernel 	- 	stack pointer = 0x28:0xfffffe00af41ec00
      2025-09-19 22:16:19.013311+03:00 	kernel 	- 	instruction pointer = 0x20:0xffffffff82e0d97d
      2025-09-19 22:16:19.013240+03:00 	kernel 	- 	fault code = supervisor read data, page not present
      2025-09-19 22:16:19.013171+03:00 	kernel 	- 	fault virtual address = 0xfffffe016bec9d08
      2025-09-19 22:16:19.013105+03:00 	kernel 	- 	cpuid = 0; apic id = 00
      2025-09-19 22:16:19.013040+03:00 	kernel 	- 	Fatal trap 12: page fault while in kernel mode
      2025-09-19 22:16:19.012833+03:00 	kernel 	- 	Uptime: 34m30s
      2025-09-19 22:16:19.012763+03:00 	kernel 	- 	All buffers synced.
      2025-09-19 22:16:19.012700+03:00 	kernel 	- 	Syncing disks, vnodes remaining... 0 0 done
      2025-09-19 22:16:19.012637+03:00 	kernel 	- 	Waiting (max 60 seconds) for system process `syncer' to stop...
      2025-09-19 22:16:19.012539+03:00 	kernel 	- 	Waiting (max 60 seconds) for system process `vnlru' to stop... done
      2025-09-19 22:16:19.012368+03:00 	kernel 	- 	pflog0: promiscuous mode disabled
      2025-09-19 22:16:18.993725+03:00 	syslogd 	- 	kernel boot file is /boot/kernel/kernel
      2025-09-19 22:14:47.946100+03:00 	syslogd 	- 	exiting on signal 15
      2025-09-19 22:14:47.912518+03:00 	reboot 	95403 	rebooted by root
      

      Has anyone else seen something similar?
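
One quick check that can be done on dumps like these, even without a backtrace, is to compare the fault virtual address with the register values. The sketch below is only an illustration: the helper names are hypothetical, and the regular expressions assume the exact line layout shown in this post (the log lines are listed newest first, but order does not matter for this check).

```python
import re

# Hypothetical helpers. The patterns assume the layout of the dumps in this
# post: register values are 16 hex digits with no 0x prefix, and the fault
# address line carries a 0x prefix.
FAULT_RE = re.compile(r"fault virtual address\s*=\s*0x([0-9a-f]+)")
REG_RE = re.compile(r"\b(r[a-z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def parse_panic(lines):
    """Return (fault_address, {register: value}) for one panic's log lines."""
    fault = None
    regs = {}
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            fault = int(m.group(1), 16)
        for name, value in REG_RE.findall(line):
            regs[name] = int(value, 16)
    return fault, regs


def registers_near_fault(fault, regs, window=0x100):
    """Map register name to offset for registers at or just below the fault address."""
    return {name: fault - value
            for name, value in regs.items()
            if 0 <= fault - value < window}


# Two lines copied from the second panic above (timestamps trimmed).
SECOND_PANIC = [
    "kernel - rax: fffffe016bec9cf0 rbx: fffff8021890b3d0 rbp: fffffe00af41ec40",
    "kernel - fault virtual address = 0xfffffe016bec9d08",
]

if __name__ == "__main__":
    fault, regs = parse_panic(SECOND_PANIC)
    for name, offset in registers_near_fault(fault, regs).items():
        print(f"{name} + {offset:#x} = {fault:#x}")
```

For the second panic this reports that the faulting address is `rax + 0x18`. Run against the first panic, the same check shows `rax` and `r12` both equal to the faulting address `0x1c2ab4006873`, which lies in the lower half of the amd64 address space rather than in the kernel's range. That points to a bad pointer being dereferenced, but it does not identify the cause; a backtrace is still needed for that.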

      • stephenw10 Netgate Administrator

        Hmm, what are those running on? Are they both the same hardware?

        Do you have the backtrace from the panics?

        • w0w @stephenw10

          @stephenw10
          Different hardware. One of them was running on Proxmox 9, and the backtraces look a bit different too. Yes, I have a couple of backtraces from both firewalls. Both are currently running 25.07.1 with no traps observed. Can I upload them somewhere?

          • stephenw10 Netgate Administrator

            Sure, you can upload them here: https://nc.netgate.com/nextcloud/s/Ym2qxrrr7cpstgw

            • w0w @stephenw10

              @stephenw10
              Uploaded primary_secondary.zip

              • stephenw10 Netgate Administrator

                Great, I see that.

                Were these VMs previously running 25.07.1 OK?

                • w0w @stephenw10

                  @stephenw10
                  Yep

                  • stephenw10 Netgate Administrator

                    Are you able to test using the debug kernel in 25.11?

                    https://docs.netgate.com/pfsense/en/latest/troubleshooting/debug-kernel.html

                    Those crashes are not very informative, unfortunately. The output from the debug kernel should give us more.

                    • w0w @stephenw10

                      @stephenw10
                      I’ve uploaded dumps from the primary firewall (primary.zip). I’ll check tomorrow whether the secondary firewall crashes as well.

                      • stephenw10 Netgate Administrator

                        Thanks.

                        That's interesting. It looks like a different bug to me. Those are almost identical backtraces in the IPv6 stack. We are looking at it.

                        • w0w @stephenw10

                          @stephenw10
                          Uploaded secondary.zip; these look similar to me.
                          A bit off-topic:
                          Please note that both during system startup and when enabling CARP maintenance mode, there is a flood of CARP events. The initial VIP reconfiguration at boot sometimes fails to complete (on the primary almost every time, likely due to higher latency). When PPPoE comes up, the configuration process starts over, even though PPPoE has no VIPs configured, and then it stops with nothing configured on LAN, so I need to disable and re-enable CARP to make it work again. Is this the intended behavior? I understand this may be a separate issue that is not related to 25.11, but it should be evident in the dumps.

                          • stephenw10 Netgate Administrator

                            Yup, this new panic you're seeing is specific to the debug kernel. We are fixing that, so we will then be able to see the reboot issue.

                            • w0w @stephenw10

                              @stephenw10
                              I am very grateful for your help and support!

                              • stephenw10 Netgate Administrator

                                No worries. I'm happy you're able to test an early dev snapshot. Finding these issues earlier makes it much easier for us.

                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.