Netgate Discussion Forum

page fault kernel panics after 2.5.2 upgrade

General pfSense Questions
crash, kernel panic, 2.5.2
25 Posts 4 Posters 4.2k Views
  • S
    stephenw10 Netgate Administrator
    last edited by Oct 13, 2021, 9:08 PM

    Same backtrace?

    • D
      doubledgedboard @stephenw10
      last edited by Oct 14, 2021, 4:09 AM

      @stephenw10 Yeah probably, attached

      db:0:kdb.enter.default>  bt
      Tracing pid 63867 tid 100290 td 0xfffff8009646f740
      kdb_enter() at kdb_enter+0x37/frame 0xfffffe009e822980
      vpanic() at vpanic+0x197/frame 0xfffffe009e8229d0
      panic() at panic+0x43/frame 0xfffffe009e822a30
      trap_fatal() at trap_fatal+0x391/frame 0xfffffe009e822a90
      trap_pfault() at trap_pfault+0x4f/frame 0xfffffe009e822ae0
      trap() at trap+0x410/frame 0xfffffe009e822bf0
      calltrap() at calltrap+0x8/frame 0xfffffe009e822bf0
      --- trap 0xc, rip = 0x8002867c0, rsp = 0x7fffffffe9a0, rbp = 0x7fffffffe9a0 ---
      

      textdump.4.tar
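For readers decoding the last line of a backtrace like the one above, here is a minimal Python sketch that splits the `--- trap ... ---` line into its fields. The function name and the small trap-name table are illustrative assumptions (only two x86 trap numbers are listed, not the full set), and the lower-half check is just the general amd64 canonical-address split, not a diagnosis.

```python
import re

# Partial, illustrative table of x86 trap numbers; anything else is "unknown".
TRAP_NAMES = {0x9: "general protection fault", 0xC: "page fault"}

TRAP_LINE = re.compile(
    r"--- trap (0x[0-9a-fA-F]+), rip = (0x[0-9a-fA-F]+), "
    r"rsp = (0x[0-9a-fA-F]+), rbp = (0x[0-9a-fA-F]+) ---"
)


def decode_trap_line(line):
    """Return the trap number, its name, and the saved registers as integers."""
    match = TRAP_LINE.search(line)
    if match is None:
        return None
    trap, rip, rsp, rbp = (int(value, 16) for value in match.groups())
    return {
        "trap": trap,
        "name": TRAP_NAMES.get(trap, "unknown"),
        "rip": rip,
        "rsp": rsp,
        "rbp": rbp,
        # amd64 addresses below 0x0000800000000000 are in the lower canonical half.
        "rip_in_lower_half": rip < 0x0000800000000000,
    }


line = "--- trap 0xc, rip = 0x8002867c0, rsp = 0x7fffffffe9a0, rbp = 0x7fffffffe9a0 ---"
info = decode_trap_line(line)
print(info["name"], hex(info["rip"]), info["rip_in_lower_half"])
```

This only labels the fields; it says nothing about why the fault happened, which is the part the thread could not determine from the dump.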

      • S
        stephenw10 Netgate Administrator
        last edited by Oct 14, 2021, 11:48 AM

        Mmm, still nothing leading up to the trap and nothing shown on the console.
        Hard to say what that might be with nothing to go on really. 😕

        • D
          doubledgedboard
          last edited by Nov 30, 2021, 1:30 AM

          I may have solved the issue, although I'm probably tempting fate by claiming it so soon.

          The issue persisted for some time. At first it was very periodic, with approximately three days between panics, which is why I wasn't completely sold on a hardware problem yet.

          I tried restarting "ahead of schedule" to see if that would give me three extra days (counted from the last normal restart), but it still panicked only a day later.

          Eventually it restarted on its own in less than three days.

          Last night it started restarting every few minutes, and then suddenly it was restarting before it could even finish booting.

          Aha!

          Classic symptoms of a power supply issue...

          I replaced the PSU (circa 2004) and it's been online ever since. I'll check back in a week and if it still hasn't panicked then I'll call that the issue.

          • D
            doubledgedboard @doubledgedboard
            last edited by Dec 1, 2021, 4:15 PM

            I just can't win...

            It rebooted last night. It wasn't the power supply.

            • D
              doubledgedboard @doubledgedboard
              last edited by Dec 10, 2021, 3:48 PM

              I'm about to hit 7 days uptime so I think I finally found the issue.

              I started pulling memory sticks out one by one and waiting for it to restart.

              I suspect I have at least one bad stick of RAM.

              Posting this for posterity for anyone else who runs into this type of issue.

              • M
                MrPete @doubledgedboard
                last edited by Feb 25, 2022, 4:19 AM

                @doubledgedboard For future browsers: it's always a good idea to do an intense RAM test.

                FWIW, the folks at memtest86 dot com have recently done massive updates / upgrades to the (free) RAM tester.

                I recently had a situation where RAM passed a few-years-old version of memtest... but with the latest version, it was immediately detected as bad.

                I strongly encourage everybody to grab a current version :)

                • D
                  doubledgedboard @MrPete
                  last edited by doubledgedboard Mar 1, 2022, 10:25 PM Mar 1, 2022, 10:24 PM

                  @mrpete Oh for sure, I've been using memtest and variants for years.

                  The issue here is that the system required near 24/7 uptime and I didn't have the time to take it down to run 8+ hour long memory tests, so I had to do what I could while maintaining uptime.

                  (and for posterity, I'm back up to 88 days of uptime now 😄 )

                  • S
                    Schoolofhardknocks
                    last edited by Mar 1, 2022, 11:16 PM

                    I had something similar happen to me. It happened during the boot sequence, so I couldn't finish booting or restore a recent configuration and had to reinstall pfSense altogether. The dump was more...

                    Tracing pid 431 tid 100111 td 0xfffff800055f2000
                    kdb_enter() at kdb_enter+0x37/frame 0xfffffe00005a4620
                    vpanic() at vpanic+0x197/frame 0xfffffe00005a4670
                    panic() at panic+0x43/frame 0xfffffe00005a46d0
                    ffs_valloc() at ffs_valloc+0x8f3/frame 0xfffffe00005a4760
                    ufs_makeinode() at ufs_makeinode+0xa3/frame 0xfffffe00005a48f0
                    ufs_create() at ufs_create+0x34/frame 0xfffffe00005a4910
                    VOP_CREATE_APV() at VOP_CREATE_APV+0x75/frame 0xfffffe00005a4940
                    vn_open_cred() at vn_open_cred+0x2d9/frame 0xfffffe00005a4a90
                    kern_openat() at kern_openat+0x213/frame 0xfffffe00005a4c00
                    amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe00005a4d30
                    fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00005a4d30
                    --- syscall (5, FreeBSD ELF64, sys_open), rip = 0x800b34e0a, rsp = 0x7fffffffd168, rbp = 0x7fffffffd1a0 ---

                    Then it proceeded with...

                    Tracing command sleep pid 96166 tid 100128 td 0xfffff800056c7740
                    sched_switch() at sched_switch+0x630/frame 0xfffffe00005f9a00
                    mi_switch() at mi_switch+0xd4/frame 0xfffffe00005f9a30
                    sleepq_catch_signals() at sleepq_catch_signals+0x403/frame 0xfffffe00005f9a80
                    sleepq_timedwait_sig() at sleepq_timedwait_sig+0x14/frame 0xfffffe00005f9ac0
                    _sleep() at _sleep+0x1b3/frame 0xfffffe00005f9b40
                    kern_clock_nanosleep() at kern_clock_nanosleep+0x1d2/frame 0xfffffe00005f9bc0
                    sys_nanosleep() at sys_nanosleep+0x3b/frame 0xfffffe00005f9c00
                    amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe00005f9d30
                    fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00005f9d30
                    --- syscall (240, FreeBSD ELF64, sys_nanosleep), rip = 0x80038b6aa, rsp = 0x7fffffffec18, rbp = 0x7fffffffec60 ---

                    It repeated these same sleep system calls for a long time and then forced a reboot.
                    After reinstalling everything, the crash occurs within the pfSense webConfigurator. It's still happening and I'm not sure why, but at least it's not rebooting my box anymore.

                    • S
                      Schoolofhardknocks @Schoolofhardknocks
                      last edited by Mar 1, 2022, 11:44 PM

                      @schoolofhardknocks

                      I also did some testing: when I reroot the device from the webConfigurator, it triggers the same crash dump. I see it launching on my COM port.

                      • S
                        stephenw10 Netgate Administrator
                        last edited by Mar 2, 2022, 1:51 AM

                        @schoolofhardknocks said in page fault kernel panics after 2.5.2 upgrade:

                        panic() at panic+0x43/frame 0xfffffe00005a46d0
                        ffs_valloc() at ffs_valloc+0x8f3/frame 0xfffffe00005a4760
                        ufs_makeinode() at ufs_makeinode+0xa3/frame 0xfffffe00005a48f0
                        ufs_create() at ufs_create+0x34/frame 0xfffffe00005a4910

                        That's a filesystem error in UFS. You can probably recover by running a filesystem check:
                        https://docs.netgate.com/pfsense/en/latest/troubleshooting/filesystem-check.html#manual-filesystem-check

                        Steve
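The reasoning above (the frame that called panic() is ffs_valloc(), so the problem is in UFS) can be sketched in Python. This is a rough illustration, not a Netgate tool: the function names and the prefix table are hypothetical, and only the ffs_/ufs_ mapping comes from this thread.

```python
import re

# Hypothetical prefix table; only the ffs_/ufs_ -> UFS mapping is from the thread.
SUBSYSTEM_PREFIXES = {
    "ffs_": "UFS filesystem",
    "ufs_": "UFS filesystem",
}

# Matches DDB backtrace lines such as "panic() at panic+0x43/frame 0x...".
FRAME = re.compile(r"^\s*(\w+)\(\) at ")


def frames(backtrace):
    """Extract function names from the backtrace, in the order listed."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME.match(line)
        if match:
            names.append(match.group(1))
    return names


def subsystem_of_panic_caller(backtrace):
    """Name the subsystem of the frame listed right after panic(), if known."""
    names = frames(backtrace)
    if "panic" not in names:
        return None
    caller_index = names.index("panic") + 1
    if caller_index >= len(names):
        return None
    caller = names[caller_index]
    for prefix, subsystem in SUBSYSTEM_PREFIXES.items():
        if caller.startswith(prefix):
            return subsystem
    return None


ufs_trace = """
panic() at panic+0x43/frame 0xfffffe00005a46d0
ffs_valloc() at ffs_valloc+0x8f3/frame 0xfffffe00005a4760
ufs_makeinode() at ufs_makeinode+0xa3/frame 0xfffffe00005a48f0
ufs_create() at ufs_create+0x34/frame 0xfffffe00005a4910
"""
print(subsystem_of_panic_caller(ufs_trace))
```

For the earlier page-fault backtrace in this thread, the frame after panic() is trap_fatal(), which is not in the table, so the function returns None. That matches the "nothing to go on" assessment for that dump.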

                        • M
                          MrPete @stephenw10
                          last edited by MrPete Mar 2, 2022, 3:09 PM Mar 2, 2022, 3:08 PM

                          @stephenw10 @Schoolofhardknocks fsck is great for some filesystems... but if you have ZFS, a different set of tools is used. (Or, am I being dumb, and the fact that it is a UFS panic proves fsck is needed? ;) )

                          Expanding on the ZFS note on the linked page:

                          An advantage of ZFS: you can do the equivalent of fsck while up and running!

                          I do:

                          • zpool status (shows name of your pool(s), any past errors, and current scrub status)
                          • (assuming no HW error - use smartctl -x <device> to check the raw drives!***)
                          • zpool scrub <poolname> to do a live scrub
                          • zpool clear to clear old errors (again, assuming it is not a HW error!)

                          *** The documentation describes the SMART tools... it's not clear to me whether the most extensive info is made available in the GUI. smartctl -x shows quite a bit more... it was added to the SMART tools a few years ago.
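The order of the steps listed above can be written down as a small Python sketch that only builds and prints the commands, without running anything. The pool name `zroot` and device `/dev/ada0` are placeholders, not values from this thread, and the pool name on `zpool clear` is an addition here (the command takes the pool as an argument).

```python
import shlex


def zfs_check_commands(pool, device):
    """Return the check sequence from the list above as argv lists, in order."""
    return [
        ["zpool", "status"],          # pool names, past errors, scrub status
        ["smartctl", "-x", device],   # rule out a hardware error on the raw drive
        ["zpool", "scrub", pool],     # start a live scrub
        ["zpool", "clear", pool],     # clear old errors, only if hardware is fine
    ]


# Placeholder pool and device names; substitute the ones zpool status reports.
commands = zfs_check_commands("zroot", "/dev/ada0")
for argv in commands:
    print(shlex.join(argv))
```

Printing instead of executing is deliberate: the scrub runs in the background, and whether to clear errors depends on what the status and SMART output show, so each step is better run and read by hand.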

                          • S
                            stephenw10 Netgate Administrator @MrPete
                            last edited by Mar 2, 2022, 3:18 PM

                            @mrpete said in page fault kernel panics after 2.5.2 upgrade:

                            the fact that it is a UFS panic proves fsck is needed?

                            Yes, that. You would not see that panic if ZFS was used.

                            Steve

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.