Netgate Discussion Forum

    Fatal trap 12: page fault while in kernel mode

tlum

      I've started getting a periodic crash, about once a week, though it varies. This box has been quite stable for years, but started this behavior after an update this past summer, though correlation does not equal causation. It's hard to peg the exact date and version since it happens so infrequently. From what I can see it looks like it's happening during packet inspection in pf.

This seems to be the same as an issue posted in "2.0-RC Snapshot Feedback and Problems" (https://forum.pfsense.org/index.php?topic=21743.40;wap2). That was four years ago and it's not clear what ever became of it.

So today I became aggravated enough to drop everything I'm doing and concentrate on ending this forever. Unfortunately, I don't know of a way to reproduce it on demand, but I suspect it could be traffic related based on what circumstantial evidence I do have. And yes, this is probably a FreeBSD issue; however, I would counter that pfSense chooses the OS it is packaged with and tested against, so I would think it's in our mutual best interest to understand and resolve it.

      Although I have not come across any recent complaints, can anyone verify this as a current problem? Are the pfSense developers aware of this or related issues? Are there any suggestions for capturing additional information on this? -TIA-

      FreeBSD 8.3-RELEASE-p16 #0: Mon Aug 25 08:25:41 EDT 2014
          root@pf2_1_1_i386.pfsense.org:/usr/obj.i386/usr/pfSensesrc/src/sys/pfSense_SMP.8
      
      db:0:kdb.enter.default>  bt
      Tracing pid 12 tid 100055 td 0xc702db80
      rn_match(c1520d4c,c9671300,ed797904,c7c4d200,ed79785c,...) at rn_match+0x11
      pfr_match_addr(c94689b0,c80a581a,2,16,ed797844,...) at pfr_match_addr+0xe0
      pf_test_tcp(ed797920,ed79791c,1,c7c4d200,c8107900,...) at pf_test_tcp+0xb05
      pf_test(1,c70d4400,ed797aec,0,0,...) at pf_test+0x2596
      pf_check_in(0,ed797aec,c70d4400,1,0,...) at pf_check_in+0x46
      pfil_run_hooks(c156e620,ed797b3c,c70d4400,1,0,...) at pfil_run_hooks+0x93
      ip_input(c8107900,c8107900,10,c0ac8dc9,c1569a10,...) at ip_input+0x35a
      netisr_dispatch_src(1,0,c8107900,ed797bac,c0b6838f,...) at netisr_dispatch_src+0x71
      netisr_dispatch(1,c8107900,5,c70d4400) at netisr_dispatch+0x20
      ether_demux(c70d4400,c8107900,3,0,3,...) at ether_demux+0x19f
      ether_input(c70d4400,c8107900,c7c4d804,c7acc800) at ether_input+0x174
      ether_demux(c7acc800,c8107900,3,0,3,...) at ether_demux+0x65
      ether_input(c7031400,c8107900,c155b180,ed797c3c,c6d94000,...) at ether_input+0x174
      em_rxeof(0,0,c70143c0,c702a880,ed797cc0,...) at em_rxeof+0x206
      em_msix_rx(c7026300,c702db80,0,109,98bc0483,...) at em_msix_rx+0x3f
      intr_event_execute_handlers(c6d92560,c702a880,c0f955af,529,c702a8f0,...) at intr_event_execute_handlers+0xd4
      ithread_loop(c7001b20,ed797d28,2a90d8a7,0,c7001b20,...) at ithread_loop+0x66
      fork_exit(c0a7a4e0,c7001b20,ed797d28) at fork_exit+0x87
      fork_trampoline() at fork_trampoline+0x8
      --- trap 0, eip = 0, esp = 0xed797d60, ebp = 0 ---
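
For reference, grabbing a little more context at the db> prompt before forcing the dump looks roughly like this (a sketch of the standard ddb commands; the dump still lands on whatever device dumpon points at, typically the swap slice):

    db> show registers    # register state at the time of the fault
    db> ps                # what else was running
    db> bt                # the backtrace again, for the record
    db> call doadump      # write a dump to the configured dump device
    db> reset             # reboot; savecore should pick the dump up at boot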
      
cmb

        are you using schedules on firewall rules?

tlum

          @cmb:

          are you using schedules on firewall rules?

          Nope, ZERO schedules. No traffic shaping, or anything else dynamic either.

The configuration is not simple, though. Two NICs participate in a LAG, which presents 8 VLANs, two of which are WANs with IP aliases (two /29 blocks) in addition to the primary address. OpenVPN counts as a ninth interface. This configuration has been stable since at least 2008.
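
Sketched in plain ifconfig terms (purely illustrative; pfSense builds all of this from its own config, and the interface names, VLAN tags, and addresses here are made up):

    # two physical NICs bonded into one LACP lagg
    ifconfig lagg0 create laggproto lacp laggport em0 laggport em1

    # eight VLANs ride on the lagg; two of them are WANs
    ifconfig vlan10 create vlan 10 vlandev lagg0    # WAN A
    ifconfig vlan20 create vlan 20 vlandev lagg0    # WAN B
    # ...six more internal VLANs...

    # each WAN also carries a /29 alias block beside its primary address
    ifconfig vlan10 inet 198.51.100.1/29 alias

    # OpenVPN adds its tunnel interface as the ninth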

tlum

Well, I disabled textdump in favor of a conventional minidump. I guess I'll wait till it happens again and see if I end up with more useful artifacts next time.
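
For reference, the switch amounts to roughly the following (a sketch; these are the stock FreeBSD 8.x knobs, and the swap slice name is just an example):

    # stop requesting a textdump at panic time
    sysctl debug.ddb.textdump.pending=0

    # minidumps are the default dump style once textdump is off
    sysctl debug.minidump=1

    # make sure a dump device is set so the minidump has somewhere to land
    dumpon /dev/ad0s1b        # example swap slice; adjust to the real one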

tlum

              Alright, so I finally got a dump on 2/5, and then another on 2/11. So, is there a debug build of the 8.3 kernel around that the pfSense developers use, or am I going to have to go build my own?
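
If it comes down to building one, my understanding is it's mostly a matter of a kernel config that keeps debug symbols, along these lines (a sketch; the _DEBUG config name is an assumption on my part, based on the stock pfSense_SMP.8 name from the version string above):

    # add to a copy of the kernel config, e.g. pfSense_SMP.8_DEBUG:
    makeoptions   DEBUG=-g      # keep debug symbols in the kernel build
    options       KDB           # kernel debugger framework
    options       DDB           # in-kernel debugger (the db> prompt)
    options       GDB           # remote gdb / kgdb backend

    # then build and install it, keeping kernel.debug from /usr/obj for kgdb:
    make buildkernel KERNCONF=pfSense_SMP.8_DEBUG
    make installkernel KERNCONF=pfSense_SMP.8_DEBUG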

cmb

                Is lagg, VLANs, two WANs + VIPs, and OpenVPN all you're running on it? All those things are fine on 2.2, and that panic is almost certainly fixed in 10.1. It's really not worth the effort to track down unless it happens on 10.1.

tlum

Yes! I have run pfSense for years, first on an IBM x335, and now on a SuperMicro SYS-5015A-EHF-D525 since 5/30/2012. It is dedicated to firewall and routing. Its peer is a Cisco Catalyst 2690 switch. I run the reverse proxy and IDS behind it; I'd rather keep the firewall box native, simple, and clean. It is the network time server (NTP). It logs to a central network syslog server. It is the network DHCP server and manages static as well as dynamic pools. It does not get involved in DNS. I'd prefer not to even run OpenVPN on it, but there is a higher risk of not being able to remotely recover from internal issues if the VPN runs behind it.

I've been having this problem for less than a year but more than six months; I'm not exactly sure. As of 12/23 I became fed up, but it took 44 days for it to panic again and give me a real dump, then just 6 days for another.

I am VERY uncomfortable doing an upgrade without knowing what was causing the issue. Maybe 10.1 will solve the problem, and maybe it will sweep a hardware problem under the carpet for 3 months. Right now time is the only known way to reproduce the issue, and the reason for the seemingly random interval is unknown. While I would have no expectation of anyone fixing a deprecated version, I would sleep better at night having identified the root cause and being able to identify a means of reproducing the issue and testing for its presence and resolution. I know it seems counterproductive, and if I had a reproducible issue I could test against a new version I'd be all over it in a heartbeat. But all I do have is the data contained in two separate core dumps, and no way of knowing if an upgrade will be of any value. So, I'm just crazy persistent enough to take what I do have to its logical conclusion.

So, kgdb works a lot better with debug symbols… not to mention that pfSense doesn't even ship with (k)gdb.
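
With a debug kernel and a gdb/kgdb binary in hand, the session should go roughly like this (a sketch; the kernel.debug path and the frame number are just placeholders):

    # point kgdb at the unstripped kernel and the saved core
    kgdb /usr/obj.i386/usr/pfSensesrc/src/sys/pfSense_SMP.8_DEBUG/kernel.debug /var/crash/vmcore.0

    (kgdb) bt             # full backtrace with file and line numbers
    (kgdb) frame 3        # jump to the pfr_match_addr() frame, whatever number it is
    (kgdb) list           # show the surrounding source
    (kgdb) info locals    # inspect the table / radix-node pointers being dereferenced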

cmb

The panic is in something related to the packet filter. It looks a lot like what happens with schedules, but there is another similar panic in some unusual edge case. If the backtraces all look similar to that one, it's a near-certainty it's not a hardware problem. A hardware problem would exhibit itself in a different backtrace, or varying ones.
