Netgate Discussion Forum

    Crash Report - Fatal trap 12: page fault while in kernel mode (lsof)

    16 Posts 2 Posters 920 Views
    • stephenw10 Netgate Administrator

      Hmm, not a crash I've seen before. Are you running lsof manually to trigger it?

      What packages do you have installed? lsof is not included in 2.7.2 by default.

      • Be-Bop-Bo @stephenw10

        @stephenw10
        I am not running it manually, so honestly I am not sure what is using it. I do have a couple of manual scripts running to populate some Grafana dashboards that collect a fair amount of stats, but I do not remember installing lsof for its use. Reviewing the bash scripts, I do not see it listed.

        Running ps -ax does not show an lsof process as of now.

        Packages:
        Acme
        apcupsd
        iperf
        nmap
        ntopng
        pfblockerng
        service_watchdog
        suricata
        telegraf
        wireguard

        • stephenw10 Netgate Administrator

          Hmm, it appears lsof is a dependency of Telegraf: https://github.com/pfsense/FreeBSD-ports/blob/devel/net-mgmt/pfSense-pkg-Telegraf/Makefile#L17

          How do you have it configured?

          • Be-Bop-Bo @stephenw10

            @stephenw10
            Sure, here is what I have. I will note that at the bottom there is a listed GitHub project, and files that should have been called; however, I did not replace them when I migrated to the Topton mini-PC from the old 1U Atom that I had been using for the last 8+ years.

            But from what you have shown it should be localized to the Telegraf package, so that really helps.

            [[inputs.net]]
              interfaces = ["igc0", "igc1", "igc2", "igc3","tun_wg0", "tun_wg1", "tun_wg2", "tun_wg3"]
            [[inputs.conntrack]]
            [[inputs.filestat]]
            [[inputs.internal]]
            [[inputs.interrupts]]
            [[inputs.linux_sysctl_fs]]
            [[inputs.net]]
            [[inputs.net_response]]
              protocol = "tcp"
              address = "localhost:443"
            [[inputs.netstat]]
            [[inputs.nstat]]
            [[inputs.procstat]]
              pattern = "."
              prefix = "pgrep_serviceprocess"
            
             [[inputs.dns_query]]
            #   ## servers to query
            #   servers = ["8.8.8.8"]
                 servers = ["208.67.222.222"]
            
            
            [[inputs.netstat]]
            #   # no configuration
            
            # Read metrics about swap memory usage
            [[inputs.swap]]
              # no configuration
            
            [[inputs.ping]]
            #   ## Hosts to send ping packets to.
                 urls = ["208.67.222.222"]
            #
            #   ## Method used for sending pings, can be either "exec" or "native".  When set
            #   ## to "exec" the systems ping command will be executed.  When set to "native"
            #   ## the plugin will send pings directly.
            #   ##
            #   ## While the default is "exec" for backwards compatibility, new deployments
            #   ## are encouraged to use the "native" method for improved compatibility and
            #   ## performance.
            #   # method = "exec"
            #
            #   ## Number of ping packets to send per interval.  Corresponds to the "-c"
            #   ## option of the ping command.
            #   # count = 1
            #
            #   ## Time to wait between sending ping packets in seconds.  Operates like the
            #   ## "-i" option of the ping command.
            #   # ping_interval = 1.0
            #
            #   ## If set, the time to wait for a ping response in seconds.  Operates like
            #   ## the "-W" option of the ping command.
            #   # timeout = 1.0
            #
            #   ## If set, the total ping deadline, in seconds.  Operates like the -w option
            #   ## of the ping command.
            #   # deadline = 10
            #
            #   ## Interface or source address to send ping from.  Operates like the -I or -S
            #   ## option of the ping command.
            #   # interface = ""
            #
            #   ## Specify the ping executable binary.
            #   # binary = "ping"
            #
            #   ## Arguments for ping command. When arguments is not empty, the command from
            #   ## the binary option will be used and other options (ping_interval, timeout,
            #   ## etc) will be ignored.
            #   # arguments = ["-c", "3"]
            #
            #   ## Use only IPv6 addresses when resolving a hostname.
            #   # ipv6 = false
            
            ####################
            ## GIT: https://github.com/VictorRobellini/pfSense-Dashboard
            [[inputs.exec]]
               commands = [
                 "/usr/local/bin/telegraf_pfinterface.php",
                 "/usr/local/bin/telegraf_gateways.py",
                  "/usr/local/bin/telegraf_pfifgw.php",
                  "sh /usr/local/bin/telegraf_temperature.sh",
                  "sh /usr/local/bin/telegraf_pinger_loss.sh"
               ]
               data_format = "influx"
            
            [[inputs.logparser]]
              files = ["/var/log/pfblockerng/dnsbl.log"]
              from_beginning=true
              [inputs.logparser.grok]
                measurement = "dnsbl_log"
                patterns = ["^%{WORD:BlockType}-%{WORD:BlockSubType},%{SYSLOGTIMESTAMP:timestamp:ts-syslog},%{IPORHOST:destination:tag},%{IPORHOST:source:tag},%{GREEDYDATA:call},%{WORD:BlockMethod},%{WORD:BlockList},%{IPORHOST:tld:tag},%{WORD:DefinedList:tag},%{GREEDYDATA:hitormiss}"]
                timezone = "Local"
                [inputs.logparser.tags]
                  value = "1"
            
            [[inputs.logparser]]
                files = ["/var/log/pfblockerng/ip_block.log"]
                from_beginning=true
                [inputs.logparser.grok]
                    measurement = "ip_block_log"
                    patterns = ["^%{SYSLOGTIMESTAMP:timestamp:ts-syslog},%{NUMBER:TrackerID},%{GREEDYDATA:Interface},%{WORD:InterfaceName},%{WORD:action},%{NUMBER:IPVersion},%{NUMBER:ProtocolID},%{GREEDYDATA:Protocol},%{IPORHOST:SrcIP:tag},%{IPORHOST:DstIP:tag},%{NUMBER:SrcPort},%{NUMBER:DstPort},%{WORD:Dir},%{WORD:GeoIP:tag},%{GREEDYDATA:AliasName},%{GREEDYDATA:IPEvaluated},%{GREEDYDATA:FeedName:tag},%{HOSTNAME:ResolvedHostname},%{HOSTNAME:ClientHostname},%{GREEDYDATA:ASN},%{GREEDYDATA:DuplicateEventStatus}"]
                    timezone = "Local"
            
            [[inputs.unbound]]
              server = "127.0.0.1:953"
              binary = "/usr/local/bin/telegraf_unbound.sh"
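A config of this size is easier to audit with a quick listing of which input plugins are actually enabled. The following is a minimal sketch, not part of the original post: it assumes the Telegraf config is available as plain text, and the helper names `enabled_inputs` and `duplicated_inputs` as well as the `SAMPLE` text are hypothetical. It only counts uncommented `[[inputs.*]]` headers and reports the ones declared more than once (as `inputs.net` and `inputs.netstat` are above). Telegraf accepts repeated declarations, so a duplicate is not an error by itself, just something worth noticing when narrowing down which plugin is involved.

```python
import re
from collections import Counter

# Matches an uncommented plugin header such as "[[inputs.net]]".
# Lines starting with "#" do not match, so commented-out plugins are ignored.
PLUGIN_HEADER = re.compile(r"^\s*\[\[inputs\.([A-Za-z0-9_]+)\]\]")


def enabled_inputs(config_text):
    """Return a Counter mapping each enabled input plugin name to its count."""
    counts = Counter()
    for line in config_text.splitlines():
        match = PLUGIN_HEADER.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def duplicated_inputs(config_text):
    """Return the sorted names of input plugins declared more than once."""
    counts = enabled_inputs(config_text)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical, shortened sample in the same shape as the config above.
SAMPLE = """\
[[inputs.net]]
  interfaces = ["igc0"]
[[inputs.filestat]]
[[inputs.net]]
[[inputs.netstat]]
# [[inputs.swap]]
[[inputs.netstat]]
"""

if __name__ == "__main__":
    print(sorted(enabled_inputs(SAMPLE)))
    print(duplicated_inputs(SAMPLE))
```

To run it against a real file, read the file into a string first and pass that string to `enabled_inputs`.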
            
            • stephenw10 Netgate Administrator

              Which of those are custom scripts you've imported?

              Can you see how lsof is being called? Or disable that as a test?

              • Be-Bop-Bo @stephenw10

                @stephenw10
                This is the part of the config that called the custom scripts. But as they are not currently present, I have removed this part of the config. Looking at the scripts, I do not see any lsof reference.

                The second device had the same config, and after I removed it the box rebooted for some reason. Now I cannot get to it, because it sometimes does not want to bring up the WAN correctly. Once it is online, I will have to see if there was a crash report or not.

                ####################
                ## GIT: https://github.com/VictorRobellini/pfSense-Dashboard
                [[inputs.exec]]
                   commands = [
                     "/usr/local/bin/telegraf_pfinterface.php",
                     "/usr/local/bin/telegraf_gateways.py",
                      "/usr/local/bin/telegraf_pfifgw.php",
                      "sh /usr/local/bin/telegraf_temperature.sh",
                      "sh /usr/local/bin/telegraf_pinger_loss.sh"
                
                • stephenw10 Netgate Administrator

                  Mmm, it seems like it must be the inputs.filestat call. What does that actually report? Can you comment it out to test?
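Commenting out one plugin at a time is easy to get wrong by hand when a section has settings or sub-tables under it. Below is a minimal sketch of that step, assuming the config is handled as plain text; the function name `comment_out_plugin` and the sample `files` path are hypothetical and not taken from the thread. It prefixes every line of the named `[[inputs.<plugin>]]` section with `# ` and stops at the next `[[...]]` header, so nested tables such as `[inputs.logparser.grok]` stay with their parent section.

```python
import re

# Any top-level plugin header, e.g. "[[inputs.net]]" or "[[outputs.influxdb]]".
HEADER = re.compile(r"^\s*\[\[.+\]\]\s*$")


def comment_out_plugin(config_text, plugin):
    """Comment out every [[inputs.<plugin>]] section, including its settings."""
    target = re.compile(r"^\s*\[\[inputs\." + re.escape(plugin) + r"\]\]\s*$")
    out = []
    in_target = False
    for line in config_text.splitlines():
        if HEADER.match(line):
            # A new top-level header ends the previous section.
            in_target = bool(target.match(line))
        is_blank = not line.strip()
        is_comment = line.lstrip().startswith("#")
        if in_target and not is_blank and not is_comment:
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample; the path below is illustrative only.
SAMPLE = """\
[[inputs.filestat]]
  files = ["/tmp/example"]
[[inputs.net]]
"""

if __name__ == "__main__":
    print(comment_out_plugin(SAMPLE, "filestat"), end="")
```

The function returns the modified text rather than writing a file, so the result can be reviewed before it replaces the live config.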

                  • Be-Bop-Bo @stephenw10

                    @stephenw10 - Roger that, I really do appreciate the help. I see no reason to have that in the config as I am not using it. It is now commented out. I will have to take a closer look at the others to confirm which ones I am actually using.

                    One more question, if I could: After these crashes I usually see push notifications of the reboot and Pushover web API notifications. So it has internet access for a while, then the device goes unreachable with this type of error.

                    arpresolve: can't allocate llinfo for x.x.x.x (WAN IP GW) on igc0
                    

                    I have seen other posts, but I do not think I found a good resolution for the issue. I suppose the fix is to have them stop crashing, but... Yeah, just thought I would ask.

                    • Be-Bop-Bo @stephenw10

                      @stephenw10 OK, I kept looking at these as I did have another crash, but this time with clock. Looking down my list, I am seeing another one using lsof:
                      [[inputs.netstat]]

                      • https://github.com/influxdata/telegraf/tree/master/plugins/inputs/netstat

                      I will keep looking and see whether I use these specific network collections. Network is my specific use case, so I will just have to try.

                      • stephenw10 Netgate Administrator

                        Those arpresolve errors are usually nothing to worry about. It's trying to create an arp entry for the gateway but no longer has an interface in that subnet because it lost the WAN. As soon as the WAN comes back up it clears. You should only ever see it temporarily when that happens.

                        • Be-Bop-Bo @stephenw10

                          @stephenw10
                          I had another instance of a crash and reboot. It always seems to happen when my modem reboots, or maybe just when the state/connectivity of the WAN interface changes? Should I ask on Telegraf's forum?

                          I would post more, but I am getting flagged as spam?

                          • Be-Bop-Bo @Be-Bop-Bo

                            @Be-Bop-Bo

                            Fatal trap 9: general protection fault while in kernel mode
                            cpuid = 0; apic id = 00
                            instruction pointer	= 0x20:0xffffffff80d4caa4
                            stack pointer	        = 0x28:0xfffffe0084131c00
                            frame pointer	        = 0x28:0xfffffe0084131c40
                            code segment		= base 0x0, limit 0xfffff, type 0x1b
                            			= DPL 0, pres 1, long 1, def32 0, gran 1
                            processor eflags	= resume, IOPL = 0
                            current process		= 2 (clock (0))
                            
                            db:0:kdb.enter.default>  show pcpu
                            cpuid        = 0
                            dynamic pcpu = 0x111bf80
                            curthread    = 0xfffffe0011faa560: pid 2 tid 100041 critnest 1 "clock (0)"
                            curpcb       = 0xfffffe0011faaa80
                            fpcurthread  = none
                            idlethread   = 0xfffffe0011ee63a0: tid 100003 "idle: cpu0"
                            self         = 0xffffffff84010000
                            curpmap      = 0xffffffff83020ab0
                            tssp         = 0xffffffff84010384
                            rsp0         = 0xfffffe0084132000
                            kcr3         = 0xffffffffffffffff
                            ucr3         = 0xffffffffffffffff
                            scr3         = 0x0
                            gs32p        = 0xffffffff84010404
                            ldt          = 0xffffffff84010444
                            tss          = 0xffffffff84010434
                            curvnet      = 0xfffff800012004c0
                            db:0:kdb.enter.default>  bt
                            Tracing pid 2 tid 100041 td 0xfffffe0011faa560
                            kdb_enter() at kdb_enter+0x32/frame 0xfffffe0084131940
                            vpanic() at vpanic+0x163/frame 0xfffffe0084131a70
                            panic() at panic+0x43/frame 0xfffffe0084131ad0
                            trap_fatal() at trap_fatal+0x40c/frame 0xfffffe0084131b30
                            calltrap() at calltrap+0x8/frame 0xfffffe0084131b30
                            --- trap 0x9, rip = 0xffffffff80d4caa4, rsp = 0xfffffe0084131c00, rbp = 0xfffffe0084131c40 ---
                            turnstile_wait() at turnstile_wait+0x134/frame 0xfffffe0084131c40
                            __mtx_lock_sleep() at __mtx_lock_sleep+0x171/frame 0xfffffe0084131cd0
                            crfree() at crfree+0xaf/frame 0xfffffe0084131cf0
                            in_pcbfree() at in_pcbfree+0x280/frame 0xfffffe0084131d20
                            sorele_locked() at sorele_locked+0x89/frame 0xfffffe0084131d40
                            tcp_close() at tcp_close+0x159/frame 0xfffffe0084131d80
                            tcp_timer_2msl() at tcp_timer_2msl+0xf9/frame 0xfffffe0084131dd0
                            tcp_timer_enter() at tcp_timer_enter+0x101/frame 0xfffffe0084131e10
                            softclock_call_cc() at softclock_call_cc+0x134/frame 0xfffffe0084131ec0
                            softclock_thread() at softclock_thread+0xe9/frame 0xfffffe0084131ef0
                            fork_exit() at fork_exit+0x7f/frame 0xfffffe0084131f30
                            fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0084131f30
                            --- trap 0xa42be40b, rip = 0x8ba58ba52e552c55, rsp = 0xb48fb48f3dac3dac, rbp = 0x2bce2bca8e5e8e7e ---
                            
                            • stephenw10 Netgate Administrator

                              Upvoted a bunch of your posts, you should be good to avoid the spam filters now.

                              That looks like a completely different crash though. What, if anything, has changed since the last one?

                              I've seen that one time before and it seemed to be OpenVPN-related.

                              • Be-Bop-Bo @stephenw10

                                @stephenw10
                                The change is the Telegraf config file. I thought I saw some more stability in the package. When I changed it on 3 other devices, it caused that crash on some of them. I have had OpenVPN in the past, so it might linger in my config, but it is not currently installed as I moved over to WG exclusively.

                                • stephenw10 Netgate Administrator

                                  Hmm, might need to wait for another crash and see if it's identical. The only previous time we've seen this it was a one-time incident and we never found a cause.
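Checking whether two crash reports are "identical" by eye is tedious, since frame addresses typically differ between crashes even when the call path is the same. A rough sketch of one way to compare them, assuming each report's `bt` output has been saved as text: the helper names `frames` and `same_call_path` are hypothetical, `TRACE_A` reuses three frames from the backtrace posted above, and the addresses in `TRACE_B` and all of `TRACE_C` are made up for illustration. It compares only the ordered function names, which is a heuristic and not a substitute for reading the full report.

```python
import re

# Matches DDB backtrace frames such as
# "tcp_close() at tcp_close+0x159/frame 0xfffffe0084131d80".
# Trap separator lines ("--- trap 0x9, ...") do not match and are skipped.
FRAME = re.compile(r"^\s*(\w+)\(\) at ")


def frames(backtrace_text):
    """Return the ordered list of function names found in a backtrace."""
    names = []
    for line in backtrace_text.splitlines():
        match = FRAME.match(line)
        if match:
            names.append(match.group(1))
    return names


def same_call_path(trace_a, trace_b):
    """True when both backtraces contain the same functions in the same order."""
    return frames(trace_a) == frames(trace_b)


# Three frames copied from the report in this thread.
TRACE_A = """\
turnstile_wait() at turnstile_wait+0x134/frame 0xfffffe0084131c40
__mtx_lock_sleep() at __mtx_lock_sleep+0x171/frame 0xfffffe0084131cd0
crfree() at crfree+0xaf/frame 0xfffffe0084131cf0
"""

# Hypothetical second crash: same functions, made-up frame addresses.
TRACE_B = """\
turnstile_wait() at turnstile_wait+0x134/frame 0xfffffe0090000c40
__mtx_lock_sleep() at __mtx_lock_sleep+0x171/frame 0xfffffe0090000cd0
crfree() at crfree+0xaf/frame 0xfffffe0090000cf0
"""

# Hypothetical unrelated crash with a different call path.
TRACE_C = """\
example_function() at example_function+0x10/frame 0xfffffe0090000c40
"""

if __name__ == "__main__":
    print(same_call_path(TRACE_A, TRACE_B))
    print(same_call_path(TRACE_A, TRACE_C))
```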

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.