Netgate Discussion Forum

Fatal Trap 12 every few days…

General pfSense Questions
20 Posts 6 Posters 8.3k Views
  • G
    GoldServe
    last edited by Sep 9, 2011, 1:59 PM

    Ever since I upgraded to the latest builds, I've experienced a fatal trap 12 every few days with the same config.

    Can anyone see what module is causing this?

    Fatal trap 12: page fault while in kernel mode
    cpuid = 1; apic id = 01
    fault virtual address   = 0xc050048
    fault code              = supervisor read, page not present
    instruction pointer     = 0x20:0xc0b0c4d1
    stack pointer           = 0x28:0xeb07f7d8
    frame pointer           = 0x28:0xeb07f7fc
    code segment            = base 0x0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, def32 1, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 12 (irq256: em0:rx 0)
    [thread]
    Stopped at      rn_match+0x11:  movl    0xc(%eax),%ebx
    db> bt
    Tracing pid 12 tid 64029 td 0xc4aff000
    rn_match(c130852c,c588e600,c5c31388,c52a001e,eb07f8b4,...) at rn_match+0x11
    pfr_match_addr(c5c18000,c52a001a,2,e072,eb07f89c,...) at pfr_match_addr+0xe0
    pf_test_udp(eb07f978,eb07f974,1,c4c59600,c52b4100,...) at pf_test_udp+0x8aa
    pf_test(1,c4b37400,eb07fb44,0,0,...) at pf_test+0x242f
    pf_check_in(0,eb07fb44,c4b37400,1,0,...) at pf_check_in+0x46
    pfil_run_hooks(c1353e60,eb07fb94,c4b37400,1,0,...) at pfil_run_hooks+0x93
    ip_input(c52b4100,10,982f000,0,0,...) at ip_input+0x359
    netisr_dispatch_src(1,0,c52b4100,eb07fc00,c0af9e4f,...) at netisr_dispatch_src+0x70
    netisr_dispatch(1,c52b4100,c4ab0700,c4b37400) at netisr_dispatch+0x20
    ether_demux(c4b37400,c52b4100,3,0,3,...) at ether_demux+0x19f
    ether_input(c4b37400,c52b4100,eb07fc58,c4aff000,eb07fc4c,...) at ether_input+0x15d
    em_rxeof(1,f0de766,c4b33e40,c4b25300,eb07fcc0,...) at em_rxeof+0x184
    em_msix_rx(c4ab0700,0,109,496b6ed8,16eaf,...) at em_msix_rx+0x23
    intr_event_execute_handlers(c498f7f8,c4b25300,c0ee391c,52d,c4b25370,...) at intr_event_execute_handlers+0xde
    ithread_loop(c4b39470,eb07fd38,ffffffff,ffffffff,ffffffff,...) at ithread_loop+0x66
    fork_exit(c0a11ad0,c4b39470,eb07fd38) at fork_exit+0x88
    fork_trampoline() at fork_trampoline+0x8
    --- trap 0, eip = 0, esp = 0xeb07fd70, ebp = 0 ---
    db>[/thread]
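One thing that can be read straight off the trap output: the faulting instruction `movl 0xc(%eax),%ebx` loads from `%eax + 0xc`, so if the page fault was raised by that load, the pointer held in `%eax` is the fault virtual address minus `0xc`. A minimal sketch of that arithmetic follows; `base_from_fault` is a hypothetical helper name, and this only recovers the pointer value, not why it was bad.

```shell
#!/bin/sh
# base_from_fault ADDR DISP
# Given the fault virtual address and the displacement used by the
# faulting instruction (DISP(%reg)), print the value the base register
# must have held. Assumes the fault came from that memory operand.
base_from_fault() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Values taken from the trap output in this post.
base_from_fault 0xc050048 0xc
```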
    
    • G
      GoldServe
      last edited by Sep 9, 2011, 2:07 PM

      Here is vmstat -i…

      # vmstat -i
      interrupt                          total       rate
      irq4: uart0                         1261          2
      irq16: ath0 uhci3                  25219         49
      irq19: em5 uhci1+                   7239         14
      cpu0: timer                      1010862       1989
      irq256: em0:rx 0                   22828         44
      irq257: em0:tx 0                    7124         14
      irq258: em0:link                       1          0
      irq259: em1:rx 0                   17745         34
      irq260: em1:tx 0                    4251          8
      irq261: em1:link                       3          0
      irq262: em2:rx 0                      43          0
      irq263: em2:tx 0                     490          0
      irq264: em2:link                       2          0
      irq265: em3:rx 0                    8718         17
      irq266: em3:tx 0                   12026         23
      irq267: em3:link                       1          0
      cpu1: timer                      1010875       1989
      Total                            2128688       4190
      
      

      I also disabled hardware checksum offload (all three options) to see if that helps with the crash. Funny that this is happening now…

      • E
        eri--
        last edited by Sep 10, 2011, 7:17 PM

        It seems like something in your local hardware that might be causing this.
        Broken RAM?

        • G
          GoldServe
          last edited by Sep 10, 2011, 8:37 PM

          I can see in a few days if this happens again, but it all started after I upgraded to an August build, and it seems to happen every few days. If I can get the backtrace and it looks the same, I don't think it is the RAM so much. Thanks for looking though…

          • G
            GoldServe
            last edited by Sep 11, 2011, 8:46 PM

            Nope, I don't think this is RAM related. The same error just happened.

            Fatal trap 12: page fault while in kernel mode
            cpuid = 0; apic id = 00
            fault virtual address   = 0xc050048
            fault code              = supervisor read, page not present
            instruction pointer     = 0x20:0xc0b0c4d1
            stack pointer           = 0x28:0xeb0ba7d8
            frame pointer           = 0x28:0xeb0ba7fc
            code segment            = base 0x0, limit 0xfffff, type 0x1b
                                    = DPL 0, pres 1, def32 1, gran 1
            processor eflags        = interrupt enabled, resume, IOPL = 0
            current process         = 12 (irq259: em1:rx 0)
            [thread]
            Stopped at      rn_match+0x11:  movl    0xc(%eax),%ebx
            db> bt
            Tracing pid 12 tid 64034 td 0xc4afe280
            rn_match(c130852c,c5ada300,c5c3f710,c55f381e,eb0ba8b4,...) at rn_match+0x11
            pfr_match_addr(c5be4000,c55f381a,2,e072,eb0ba89c,...) at pfr_match_addr+0xe0
            pf_test_udp(eb0ba978,eb0ba974,1,c4c59500,c5829500,...) at pf_test_udp+0x8aa
            pf_test(1,c4b36c00,eb0bab44,0,0,...) at pf_test+0x242f
            pf_check_in(0,eb0bab44,c4b36c00,1,0,...) at pf_check_in+0x46
            pfil_run_hooks(c1353e60,eb0bab94,c4b36c00,1,0,...) at pfil_run_hooks+0x93
            ip_input(c5829500,10,c203000,0,0,...) at ip_input+0x359
            netisr_dispatch_src(1,0,c5829500,eb0bac00,c0af9e4f,...) at netisr_dispatch_src+0x70
            netisr_dispatch(1,c5829500,c4b35000,c4b36c00) at netisr_dispatch+0x20
            ether_demux(c4b36c00,c5829500,3,0,3,...) at ether_demux+0x19f
            ether_input(c4b36c00,c5829500,eb0bac58,c4afe280,eb0bac4c,...) at ether_input+0x15d
            em_rxeof(0,b7ad47a,c4b57340,c4b50080,eb0bacc0,...) at em_rxeof+0x184
            em_msix_rx(c4b35000,0,109,8c57ea70,117a2,...) at em_msix_rx+0x23
            intr_event_execute_handlers(c498f7f8,c4b50080,c0ee391c,52d,c4b500f0,...) at intr_event_execute_handlers+0xde
            ithread_loop(c4b4ca40,eb0bad38,0,0,0,...) at ithread_loop+0x66
            fork_exit(c0a11ad0,c4b4ca40,eb0bad38) at fork_exit+0x88
            fork_trampoline() at fork_trampoline+0x8
            --- trap 0, eip = 0, esp = 0xeb0bad70, ebp = 0 ---
            [/thread]
            
            • E
              eri--
              last edited by Sep 12, 2011, 8:33 AM

              Can you check on the system log what events appear when this happens?

              • G
                GoldServe
                last edited by Sep 12, 2011, 2:11 PM

                Sorry, how am I able to view the system log file from the db> prompt?

                • E
                  eri--
                  last edited by Sep 12, 2011, 5:06 PM

                  After you reboot the machine!

                  • G
                    GoldServe
                    last edited by Sep 12, 2011, 6:28 PM

                    Hrm, don't the system log files get overwritten on every boot?

                    • E
                      eskild
                      last edited by Dec 5, 2011, 2:03 PM

                      My fw had a similar crash today! pfSense 2.0.

                      I have never experienced this before. Is there any way I can collect information from pfSense after a reboot that makes troubleshooting easier? Logs, core dumps?

                      I have used pfSense since Rel1 was in alpha stage, and this is the first time I have had to physically reset the fw due to instability.

                      Please see the attached image.

                      pfSenseCrash-05dec2011.jpg

                      • G
                        GoldServe
                        last edited by Dec 5, 2011, 5:31 PM

                        Can you type "bt" at the prompt?

                        • E
                          eskild
                          last edited by Dec 5, 2011, 5:40 PM

                          bt: Command not found.

                          I have restarted the fw though as no traffic was possible after the crash.

                          • G
                            GoldServe
                            last edited by Dec 5, 2011, 7:09 PM

                            You must have an embedded platform where the kernel does not allow a backtrace.

                            I believe there is a way to change the kernel on the embedded platform. Someone more knowledgeable can chime in.

                            • E
                              eskild
                              last edited by Dec 6, 2011, 9:06 AM

                              I do not have an embedded platform. I have a default x86 install: pfSense 2.0-RELEASE-pfSense (i386)

                              • stephenw10S
                                stephenw10 Netgate Administrator
                                last edited by Dec 6, 2011, 12:14 PM

                                You can only run a back trace at the db> prompt, after a crash, not if:

                                @eskild:

                                I have restarted the fw though as no traffic was possible after the crash.

                                Steve

                                • E
                                  eskild
                                  last edited by Dec 6, 2011, 2:35 PM

                                  Thanks, I suspected that.
                                  It might be good to evaluate features that capture system information the dev team can use for troubleshooting later.
                                  I was surprised that after the reboot there were no traces of the crash at all, and it was impossible to provide any hard evidence of what had happened.
                                  I doubt that there are many firewalls that can be offline for a long time while consulting support. Most of us need to reboot and have the system back in service right away.

                                  Just my two cents.

                                  Back to the problem at hand. Is it possible that the crash was caused by a memory (RAM) issue? I have seen instabilities on other systems caused by failing RAM.

                                  Thanks

                                  • jimpJ
                                    jimp Rebel Alliance Developer Netgate
                                    last edited by Dec 8, 2011, 5:12 PM Dec 8, 2011, 5:10 PM

                                    In 2.0.1/2.1 we fixed the problem of some crashes not restarting automatically, but on 2.0 it's an easy fix to apply yourself:

                                    Edit /etc/ddb.conf and change

                                    script kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
                                    

                                    to

                                    script kdb.enter.default=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
                                    

                                    (So just change kdb.enter.panic to kdb.enter.default)

                                    Then run:

                                    /sbin/ddb /etc/ddb.conf
                                    

                                    From that point on it should collect the debug data and reboot itself automatically, and also give you a crash report notice in the GUI that you can use to upload the data to our servers (or grab it from /var/crash yourself)
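The edit described above can also be scripted. The following is only a sketch under a few assumptions: `switch_ddb_hook` is a hypothetical helper name (not something shipped with pfSense), the `script` line starts at the beginning of its line as shown in the post, and a `.bak` copy next to the file is an acceptable backup.

```shell
#!/bin/sh
# switch_ddb_hook FILE
# Rename the ddb script hook kdb.enter.panic to kdb.enter.default in FILE.
# A copy of the original is kept as FILE.bak.
switch_ddb_hook() {
    conf="$1"
    tmp="${conf}.tmp.$$"
    cp "$conf" "${conf}.bak" || return 1
    # Only the hook name on the "script" line is rewritten; the command
    # list after "=" is left exactly as it was.
    sed 's/^script kdb\.enter\.panic=/script kdb.enter.default=/' "$conf" > "$tmp" || return 1
    # Write back through the existing file so its owner and mode are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Usage on the firewall (as root), followed by the reload command from the post:
#   switch_ddb_hook /etc/ddb.conf && /sbin/ddb /etc/ddb.conf
```

Nothing runs until the function is called, so the file can be checked by hand before reloading it with `/sbin/ddb`.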

                                    From that panic it could be faulty hardware, but it's hard to say for sure. Usually if it's bad RAM the crashes would be in a different place every time, not in the exact same path. Though it could be a faulty NIC.
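To check whether two panics really follow the same path, the `function+offset` column of each backtrace can be compared while ignoring the argument values, which differ from crash to crash. This is a rough sketch only: `bt_path` and `same_crash_path` are hypothetical helper names, and the pattern assumes `bt` output shaped like the traces earlier in this thread (`name(args) at name+0xNN`).

```shell
#!/bin/sh
# bt_path: read a ddb backtrace on stdin and print one "function+offset"
# entry per frame, dropping the per-crash argument values.
bt_path() {
    sed -n 's/^.*) at \([A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*\)[[:space:]]*$/\1/p'
}

# same_crash_path FILE1 FILE2: succeed (exit 0) when both saved traces
# contain the same sequence of function+offset frames.
same_crash_path() {
    [ "$(bt_path < "$1")" = "$(bt_path < "$2")" ]
}

# Usage, with each trace saved to a file (file names are placeholders):
#   same_crash_path trace1.txt trace2.txt && echo "same path"
```

A match is consistent with the crashes hitting the same code path each time; it does not by itself rule hardware in or out.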

                                    Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                                    Need help fast? Netgate Global Support!

                                    Do not Chat/PM for help!

                                    • A
                                      atul
                                      last edited by Jan 3, 2012, 7:10 AM

                                      Hello,

                                      I am also getting the same error every few days, or sometimes more than once a day. I have recently upgraded from 1.2.3 to 2.0.1. But, the pfSense was crashing and restarting before the upgrade, so it is not "only" associated with 2.0.1 release.

                                      I am attaching the entire crash log (long) that I was able to see on the GUI. I have sent it to pfSense team for further analysis.

                                      Atul.

                                      [Crash Report.txt](/public/imported_attachments/1/Crash Report.txt)

                                      • jimpJ
                                        jimp Rebel Alliance Developer Netgate
                                        last edited by Jan 3, 2012, 1:13 PM

                                        @atul:

                                        Hello,

                                        I am also getting the same error every few days, or sometimes more than once a day. I have recently upgraded from 1.2.3 to 2.0.1. But, the pfSense was crashing and restarting before the upgrade, so it is not "only" associated with 2.0.1 release.

                                        I am attaching the entire crash log (long) that I was able to see on the GUI. I have sent it to pfSense team for further analysis.

                                        Atul.

                                        That crash is in code writing to the filesystem. There is very little likelihood of a problem in that code; it's been solid for years on FreeBSD.

                                        More likely your HDD or storage media has issues, or it could be cabling/controller/DMA issues, but it's definitely storage.


                                        • A
                                          atul
                                          last edited by Jan 3, 2012, 1:23 PM

                                          Thanks jimp. I will change the hard disk and check again.

                                          Out of curiosity - how did you know that this is storage related?

                                          Atul.

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.