Netgate Discussion Forum

    Netgate 4100 - Fatal trap 12: page fault while in kernel mode

    Official Netgate® Hardware
    • sjcjonker

      Hi all,

      In less than 24 hours I have now had 3 spontaneous reboots of my 4100 running 23.01, which had worked fine for months. There were some minor config changes to the firewall rule base, but nothing major and no other tweaks in the past days. The first two reboots left nothing I can find in the local system logs or on the remote syslog server. For the third one, the only thing I can find in the logs is:

      May 6 08:00:05	kernel		rdi: 0 rsi: 2 rdx: 1
      May 6 08:00:05	kernel		current process = 0 (if_io_tqg_1)
      May 6 08:00:05	kernel		processor eflags = interrupt enabled, resume, IOPL = 0
      May 6 08:00:05	kernel		= DPL 0, pres 1, long 1, def32 0, gran 1
      May 6 08:00:05	kernel		code segment = base 0x0, limit 0xfffff, type 0x1b
      May 6 08:00:05	kernel		frame pointer = 0x28:0xfffffe0009f8ed80
      May 6 08:00:05	kernel		stack pointer = 0x28:0xfffffe0009f8ed80
      May 6 08:00:05	kernel		instruction pointer = 0x20:0xffffffff80eb8606
      May 6 08:00:05	kernel		fault code = supervisor read data, page not present
      May 6 08:00:05	kernel		fault virtual address = 0x460
      May 6 08:00:05	kernel		cpuid = 1; apic id = 18
      May 6 08:00:05	kernel		Fatal trap 12: page fault while in kernel mode
      

      Any suggestions on where to find more info on the cause and how to proceed with troubleshooting this?

      Thanks,
      Stijn

          • sjcjonker

          Reply to my own post, some more details:

          Netgate 4100 - Fatal trap 12: page fault while in kernel mode

          Hi all,

          In less than 48 hours I have now had 6 spontaneous reboots of my 4100, which was initially running 23.01 and is now on 23.05-BETA. The 4100 had worked fine for months.

          Over the past days I made only some minor firewall config changes: mostly expanding an alias from a few entries to 20+, as I had to break an IPv4 /24 and an IPv6 /64 out of a bogon range (100.64.0.0/10 and fc00::/7), plus an internal NAT rule forcing all NTP to the NTP clock. The latter is an IPv4 NAT rule, but the alias contained the host IPv4 and IPv6 addresses. The first couple of reboots left nothing I can find in the local system logs or on the remote syslog server.

          This is the current reboot cadence; the reboots seem to come 8 hours apart:

          May  6 00:02:27 edge-mgmt.sjci.nl root[8707]: Bootup complete
          May  6 00:02:27 edge-mgmt.sjci.nl kernel: Bootup complete
          May  6 00:22:13 edge-mgmt.sjci.nl root[16602]: Bootup complete
          May  6 00:22:13 edge-mgmt.sjci.nl kernel: Bootup complete
          May  6 08:02:14 edge-mgmt.sjci.nl root[4881]: Bootup complete
          May  6 08:02:14 edge-mgmt.sjci.nl kernel: Bootup complete
          May  6 08:22:27 edge-mgmt.sjci.nl root[6060]: Bootup complete
          May  6 08:22:27 edge-mgmt.sjci.nl kernel: Bootup complete
          May  6 16:02:18 edge-mgmt.sjci.nl root[87706]: Bootup complete
          May  6 16:02:18 edge-mgmt.sjci.nl kernel: Bootup complete
          May  6 16:22:10 edge-mgmt.sjci.nl root[10286]: Bootup complete
          May  6 16:22:10 edge-mgmt.sjci.nl kernel: Bootup complete
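To check the "8 hours apart" impression, the gaps between boots can be computed from the timestamps. This is a minimal sketch, not part of the original post: the function names `boot_times` and `gaps` are made up for illustration, and the year 2023 is an assumption because syslog timestamps carry no year.

```python
from datetime import datetime

# Sample lines copied from the remote syslog excerpt above (one per boot).
SAMPLE = """\
May  6 00:02:27 edge-mgmt.sjci.nl kernel: Bootup complete
May  6 00:22:13 edge-mgmt.sjci.nl kernel: Bootup complete
May  6 08:02:14 edge-mgmt.sjci.nl kernel: Bootup complete
May  6 08:22:27 edge-mgmt.sjci.nl kernel: Bootup complete
May  6 16:02:18 edge-mgmt.sjci.nl kernel: Bootup complete
May  6 16:22:10 edge-mgmt.sjci.nl kernel: Bootup complete
"""


def boot_times(log_text, year=2023):
    """Return one datetime per 'kernel: Bootup complete' line.

    The year is an assumption: classic syslog timestamps omit it.
    """
    times = []
    for line in log_text.splitlines():
        if "kernel: Bootup complete" not in line:
            continue
        month, day, clock = line.split()[:3]
        stamp = f"{year} {month} {day} {clock}"
        times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps(times):
    """Return the timedelta between each boot and the next one."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps(boot_times(SAMPLE)):
        print(gap)
```

On these lines the gaps alternate between roughly 20 minutes and roughly 7 hours 40 minutes, so the reboots come in pairs that repeat on an 8-hour cycle.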
          

          Two of the reboots showed this logging via remote syslog:

          May  6 08:00:05 edge-mgmt.sjci.nl kernel: Fatal trap 12: page fault while in kernel mode
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: cpuid = 1; apic id = 18
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: fault virtual address#011= 0x460
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: fault code#011#011= supervisor read data, page not present
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: instruction pointer#011= 0x20:0xffffffff80eb8606
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: stack pointer#011        = 0x28:0xfffffe0009f8ed80
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: frame pointer#011        = 0x28:0xfffffe0009f8ed80
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: code segment#011#011= base 0x0, limit 0xfffff, type 0x1b
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: #011#011#011= DPL 0, pres 1, long 1, def32 0, gran 1
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: processor eflags#011= interrupt enabled, resume, IOPL = 0
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: current process#011#011= 0 (if_io_tqg_1)
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: rdi:                0 rsi:                2 rdx:                1
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: rcx:                0  r8:                0  r9: 2b94cbbeab72e4cd
          May  6 08:00:05 edge-mgmt.sjci.nl kernel: rax:                2 rbx:                0 rbp: fffffe0009f8ed80
          
          
          May  6 08:19:58 edge-mgmt.sjci.nl kernel:
          May  6 08:19:58 edge-mgmt.sjci.nl kernel:
          May  6 08:19:58 edge-mgmt.sjci.nl kernel: Fatal trap 12: page fault while in kernel mode
          May  6 08:19:58 edge-mgmt.sjci.nl kernel: cpuid = 1; apic id = 18
          May  6 08:19:58 edge-mgmt.sjci.nl kernel: fault virtual address#011= 0x460
          May  6 08:19:58 edge-mgmt.sjci.nl kernel: fault code#011#011= supervisor read data, page not present
          May  6 08:19:58 edge-mgmt.sjci.nl kernel: instruction pointer#011= 0x20:0xffffffff80eb8606
          
          May  6 16:00:06 edge-mgmt.sjci.nl kernel: Fatal trap 12: page fault while in kernel mode
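The `#011` sequences in the lines above look like escaped tab characters: some syslog daemons (rsyslog, for example) write control characters as `#` plus a three-digit octal code, and octal 011 is a tab. Assuming that is what happened here, a small sketch like the following restores the original spacing. The name `unescape_syslog` is hypothetical.

```python
import re

# '#' followed by a three-digit octal number (000-377).
_ESCAPE = re.compile(r"#([0-3][0-7]{2})")


def unescape_syslog(line):
    """Turn '#NNN' octal escapes for control characters back into the character."""

    def repl(match):
        code = int(match.group(1), 8)
        # Only control characters are assumed to be escaped this way;
        # anything else (e.g. an issue number like '#14077') is left alone.
        if code < 32 or code == 127:
            return chr(code)
        return match.group(0)

    return _ESCAPE.sub(repl, line)


if __name__ == "__main__":
    print(unescape_syslog("kernel: fault virtual address#011= 0x460"))
```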
          

          What I did is connect a Raspberry Pi to the USB console of the 4100. I hope this works as it does with serial: the output is recorded in a log file on the Raspberry Pi, which will hopefully capture more information.

          Sincerely hoping this is not a HW failure.
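A console logger of this kind can be sketched in a few lines. This is only an illustration of the idea, not the setup actually used in this post: the device path `/dev/ttyUSB0`, the file name `console.log`, and the names `log_console` and `main` are all assumptions, and the serial port is assumed to be configured already (speed, 8N1) by another tool such as `stty`.

```python
import io
from datetime import datetime


def log_console(src, dst, clock=datetime.now):
    """Copy lines from a binary stream to a text log, one timestamp per line.

    Returns the number of lines written.
    """
    count = 0
    for raw in iter(src.readline, b""):
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        dst.write(f"{clock().isoformat(timespec='seconds')} {text}\n")
        # Flush per line so a panic message reaches the file right away.
        dst.flush()
        count += 1
    return count


def main(device="/dev/ttyUSB0", logfile="console.log"):
    """Hypothetical entry point; both default paths are assumptions."""
    with open(device, "rb", buffering=0) as console, open(logfile, "a") as log:
        log_console(console, log)
```

Calling `main()` on the Raspberry Pi would then append every console line to the log with a local timestamp, so a panic backtrace survives even when nothing reaches syslog.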

          Thanks,
          Stijn

            • mfld @sjcjonker

            @sjcjonker

            I had this.

            And it was resolved in 23.05 beta via #14077

            I blamed coreboot, then the hardware, but everything passed all checks, and it started happening on some other devices that were completely different hardware and had AMI BIOS. What they all had in common was that they were doing lots of IPv6 stuff :)

            Take a look in redmine #14077 - might be your thing.

              • sjcjonker @mfld

              Hi @mfld,

              Thanks for your feedback, it might indeed be the same issue. That said (fingers crossed), it has been stable since I bumped the version to the 23.05 beta (18+ hours).

              I am indeed doing a reasonable amount of IPv6, so it is possibly related. I hope it doesn't, but if it crashes I should have a full dump, as the console is still connected to the Raspberry Pi to capture the output.

                • stephenw10 (Netgate Administrator)

                You should see a crash report shown on the dashboard after it has rebooted. That will have the backtrace that would confirm you're hitting that same issue.

                  • sjcjonker @stephenw10

                  Hi @stephenw10,

                  Thanks for your reply, but none of the crashes seem to have triggered this. According to "Installs without Swap Space" and the output below, I believe the 4100 doesn't have any swap:

                  [23.05-BETA][sjcjonker@edge.sjci.nl]/home/sjcjonker: sudo swapinfo
                  Password:
                  Device          512-blocks     Used    Avail Capacity
                  

                  Nor is anything recorded in /var/crash:

                  [23.05-BETA][sjcjonker@edge.sjci.nl]/home/sjcjonker: sudo ls -la /var/crash
                  total 19
                  drwxr-x---   2 root  wheel   3 May  6 16:33 .
                  drwxr-xr-x  29 root  wheel  29 May  3 09:02 ..
                  -rw-r--r--   1 root  wheel   5 May  3 09:02 minfree
                  

                  That said, since the upgrade to 23.05 it has been stable for 3 days now. So still 🤞 (fingers crossed) on my side. But if it does crash, I should have the console logs this time.

                  Stijn

                    • Gertjan @sjcjonker

                    @sjcjonker said in Netgate 4100 - Fatal trap 12: page fault while in kernel mode:

                    crash

                    I have a 4100, and when I got it, preloaded with 22.05 (?), the swap was 'not there'.
                    Have a look here: swap not listed? [solved]

                    No "help me" PM's please. Use the forum, the community will thank you.
                    Edit : and where are the logs ??

                      • sjcjonker @Gertjan

                      Hi @gertjan,

                      Thanks, so now I have swap :-) I just edited /etc/fstab to use the actual swap partition instead of the GPT ID, which I'm guessing came from the installer (image).

                      At least I can decommission the Raspberry Pi doing console logging.

                      # cat /etc/fstab
                      # Device                Mountpoint      FStype  Options         Dump    Pass#
                      /dev/msdosfs/EFISYS     /boot/efi       msdosfs rw,noatime,noauto       0       0
                      /dev/mmcsd0p3		none	swap	sw		0	0
                      # swapinfo
                      Device          512-blocks     Used    Avail Capacity
                      /dev/mmcsd0p3      1336520        0  1336520     0%
                      #
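The before/after `swapinfo` outputs in this thread differ only in whether a device row follows the header. As a hedged sketch (the helper name `swap_devices` is made up, and the sample strings are copied from the posts above), that check can be scripted like this:

```python
# swapinfo output before the /etc/fstab fix: header only, no devices.
BEFORE = "Device          512-blocks     Used    Avail Capacity\n"

# swapinfo output after the fix, as posted above.
AFTER = (
    "Device          512-blocks     Used    Avail Capacity\n"
    "/dev/mmcsd0p3      1336520        0  1336520     0%\n"
)


def swap_devices(swapinfo_output):
    """Return the device names listed in `swapinfo` output."""
    devices = []
    for line in swapinfo_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and the summary row that
        # swapinfo prints when several devices are configured.
        if not fields or fields[0] in ("Device", "Total"):
            continue
        devices.append(fields[0])
    return devices


if __name__ == "__main__":
    print(swap_devices(BEFORE))
    print(swap_devices(AFTER))
```

An empty list means no swap device is active, which matches the earlier situation in this thread where no crash dump showed up in /var/crash.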
                      

                      Stijn

                        • Gertjan @sjcjonker

                        @sjcjonker said in Netgate 4100 - Fatal trap 12: page fault while in kernel mode:

                        At least I can decommission the Raspberry Pi doing console logging.

                        Wait 👍

                        A syslogger is always a nice thing to have. I'm using one: my NAS.
                        When things go downhill, chances are high that logging accelerates.
                        And when you finally take a look at the "what, when, who, where", you'll notice that the interesting events were just rotated into /dev/null.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.