Crash Report - Fatal trap 12: page fault while in kernel mode (lsof)
-
@stephenw10
This is the part of the config that called the custom scripts. As the scripts are not currently present, I have removed this part of the config. Looking at the scripts, I do not see any lsof reference. The second device had the same config, and after I removed this section the box rebooted for some reason. Now I cannot get to it, because it sometimes does not bring up the WAN correctly. Once it is back online, I will check whether there was a crash report.
####################
## GIT: https://github.com/VictorRobellini/pfSense-Dashboard
[[inputs.exec]]
  commands = [
    "/usr/local/bin/telegraf_pfinterface.php",
    "/usr/local/bin/telegraf_gateways.py",
    "/usr/local/bin/telegraf_pfifgw.php",
    "sh /usr/local/bin/telegraf_temperature.sh",
    "sh /usr/local/bin/telegraf_pinger_loss.sh"
-
Mmm, it seems like it must be the input.filestat call. What does that actually report? Can you comment it out to test?
-
@stephenw10 - Roger that, I really do appreciate the help. I see no reason to have that in the config as I am not using it. It is now commented out. I will have to take a closer look at the other inputs to confirm which ones I am actually using.
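For reference, commenting out a Telegraf input means prefixing the plugin's table header and each of its options with #. A minimal sketch, assuming a filestat block shaped like the one below (the file path is a placeholder and is not taken from the actual config):

```toml
# Disabled for testing: suspected trigger of the lsof-related page fault.
# [[inputs.filestat]]
#   files = ["/path/to/watched/file"]  # hypothetical path, not from the real config
```

Telegraf has to be restarted before the change takes effect.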
One more question, if I could: After these crashes I usually see push notifications of the reboot and Pushover web API notifications. So it has internet access for a while, then the device goes unreachable with this type of error.
arpresolve: can't allocate llinfo for x.x.x.x (WAN IP GW) on igc0
I have seen other posts about it, but I do not think I found a good resolution for the issue. I suppose the real fix is to have the boxes stop crashing, but... Yeah, just thought I would ask.
-
@stephenw10 OK, I kept looking at these as I did have another crash, but this time in the clock process. Looking down my list, I am seeing another input that uses lsof:
[[inputs.netstat]] - https://github.com/influxdata/telegraf/tree/master/plugins/inputs/netstat
I will keep looking to see whether I use these specific network collectors. Network is my specific use case, so I will just have to try.
-
Those arpresolve errors are usually nothing to worry about. It's trying to create an arp entry for the gateway but no longer has an interface in that subnet because it lost the WAN. As soon as the WAN comes back up it clears. You should only ever see it temporarily when that happens.
-
@stephenw10
I had another instance of a crash and reboot. It always seems to happen when my modem reboots, or maybe just whenever the state/connectivity of the WAN interface changes? Should I ask on Telegraf's forum? I would post more, but I am getting flagged as spam.
-
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80d4caa4
stack pointer = 0x28:0xfffffe0084131c00
frame pointer = 0x28:0xfffffe0084131c40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 2 (clock (0))
db:0:kdb.enter.default> show pcpu
cpuid = 0
dynamic pcpu = 0x111bf80
curthread = 0xfffffe0011faa560: pid 2 tid 100041 critnest 1 "clock (0)"
curpcb = 0xfffffe0011faaa80
fpcurthread = none
idlethread = 0xfffffe0011ee63a0: tid 100003 "idle: cpu0"
self = 0xffffffff84010000
curpmap = 0xffffffff83020ab0
tssp = 0xffffffff84010384
rsp0 = 0xfffffe0084132000
kcr3 = 0xffffffffffffffff
ucr3 = 0xffffffffffffffff
scr3 = 0x0
gs32p = 0xffffffff84010404
ldt = 0xffffffff84010444
tss = 0xffffffff84010434
curvnet = 0xfffff800012004c0
db:0:kdb.enter.default> bt
Tracing pid 2 tid 100041 td 0xfffffe0011faa560
kdb_enter() at kdb_enter+0x32/frame 0xfffffe0084131940
vpanic() at vpanic+0x163/frame 0xfffffe0084131a70
panic() at panic+0x43/frame 0xfffffe0084131ad0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe0084131b30
calltrap() at calltrap+0x8/frame 0xfffffe0084131b30
--- trap 0x9, rip = 0xffffffff80d4caa4, rsp = 0xfffffe0084131c00, rbp = 0xfffffe0084131c40 ---
turnstile_wait() at turnstile_wait+0x134/frame 0xfffffe0084131c40
__mtx_lock_sleep() at __mtx_lock_sleep+0x171/frame 0xfffffe0084131cd0
crfree() at crfree+0xaf/frame 0xfffffe0084131cf0
in_pcbfree() at in_pcbfree+0x280/frame 0xfffffe0084131d20
sorele_locked() at sorele_locked+0x89/frame 0xfffffe0084131d40
tcp_close() at tcp_close+0x159/frame 0xfffffe0084131d80
tcp_timer_2msl() at tcp_timer_2msl+0xf9/frame 0xfffffe0084131dd0
tcp_timer_enter() at tcp_timer_enter+0x101/frame 0xfffffe0084131e10
softclock_call_cc() at softclock_call_cc+0x134/frame 0xfffffe0084131ec0
softclock_thread() at softclock_thread+0xe9/frame 0xfffffe0084131ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0084131f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0084131f30
--- trap 0xa42be40b, rip = 0x8ba58ba52e552c55, rsp = 0xb48fb48f3dac3dac, rbp = 0x2bce2bca8e5e8e7e ---
-
Upvoted a bunch of your posts, you should be good to avoid the spam filters now.
That looks like a completely different crash though. What, if anything, has changed since the last one?
I've seen that one time before and it seemed to be openvpn related.
-
@stephenw10
The change is the telegraf config file. I thought I saw some more stability in the package. When I rolled the change out to 3 other devices, it caused that crash on some of them. I have had OpenVPN in the past, so it might linger in my config, but it is not currently installed as I moved over to WG exclusively.
-
Hmm, might need to wait for another crash and see if it's identical. The only previous time we've seen this it was a one-time incident and we never found a cause.