Seemingly Random Crashes. Help would be appreciated, please.

  • Hi All,

    I am wondering if anyone would mind taking a look and shedding some light on some crashes that keep occurring, and on what I may be doing wrong. I am still quite new to pfSense.

    The specs of the machine it is running on are:
    8 GB of RAM
    128 GB SSD
    Onboard Intel NIC
    PCI-E Intel NIC

    Interfaces configured are
    WAN - PPPoE interface on vlan100 (required by provider)
    LAN - /24 running DHCP
    OPT1 - GRE to a Cisco device at another site
    OPT2 - GRE to a Cisco device at another site
    OpenVPN - NordVPN, covering the whole LAN

    The reboots don't seem to be tied to anything in particular. I am sure I heard it reboot a couple of times while it was connected to nothing, after I had moved back to my other router to gather my thoughts and try to work out what was going on. I had read that others suspected their RAM, so I have removed and re-seated mine (I have not run a memtest, though).

    Below is the most recent log from this evening. I believe the highest number of textdump.tar.xx is the latest? And that file is likely what is required?

    Any assistance is greatly appreciated,

    Thanks in advance.

    textdump18.txt

  • Netgate Administrator

    db:0:kdb.enter.default>  show pcpu
    cpuid        = 1
    dynamic pcpu = 0xfffffe026400b480
    curthread    = 0xfffff8004de7d000: pid 17 "pf purge"
    curpcb       = 0xfffffe022fb82cc0
    fpcurthread  = none
    idlethread   = 0xfffff800061ff620: tid 100004 "idle: cpu1"
    curpmap      = 0xffffffff82b83898
    tssp         = 0xffffffff82bb4778
    commontssp   = 0xffffffff82bb4778
    rsp0         = 0xfffffe022fb82cc0
    gs32p        = 0xffffffff82bbafd0
    ldt          = 0xffffffff82bbb010
    tss          = 0xffffffff82bbb000
    db:0:kdb.enter.default>  bt
    Tracing pid 17 tid 100107 td 0xfffff8004de7d000
    pf_purge_expired_src_nodes() at pf_purge_expired_src_nodes+0xb3/frame 0xfffffe022fb82b80
    pf_purge_thread() at pf_purge_thread+0x81/frame 0xfffffe022fb82bb0
    fork_exit() at fork_exit+0x83/frame 0xfffffe022fb82bf0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe022fb82bf0
    --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
    db:0:kdb.enter.default>  ps

    Is it always that same crash or random?

    If it's a different backtrace every time, I would definitely check the RAM on something that old.


  • Thanks Steve,

    I will have a look this morning and see if they are all the same or not. Is textdump.tar.xx the correct file that I should be looking in?

    Thanks again.

    Another couple of crashes. It looks like different processes this time, if I am reading it right?
    Apologies, I am still trying to pick up which bits I should be looking at.

    textdump21.txt textdump20.txt

  • Netgate Administrator

    The backtrace is shown between > bt and > ps so:

    db:0:kdb.enter.default>  bt
    Tracing pid 33436 tid 100218 td 0xfffff8019628a000
    kdb_enter() at kdb_enter+0x3b/frame 0xfffffe022fe863f0
    vpanic() at vpanic+0x194/frame 0xfffffe022fe86450
    panic() at panic+0x43/frame 0xfffffe022fe864b0
    pmap_remove_pages() at pmap_remove_pages+0x7fc/frame 0xfffffe022fe86590
    exec_new_vmspace() at exec_new_vmspace+0x1b5/frame 0xfffffe022fe86600
    exec_elf64_imgact() at exec_elf64_imgact+0x931/frame 0xfffffe022fe866f0
    kern_execve() at kern_execve+0x77c/frame 0xfffffe022fe86a40
    sys_execve() at sys_execve+0x4a/frame 0xfffffe022fe86ac0
    amd64_syscall() at amd64_syscall+0xa38/frame 0xfffffe022fe86bf0
    fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe022fe86bf0
    --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800b4664a, rsp = 0x7fffffffe858, rbp = 0x7fffffffe9a0 ---
    db:0:kdb.enter.default>  ps


    db:0:kdb.enter.default>  bt
    Tracing pid 11 tid 100003 td 0xfffff80006200000
    sleepq_resume_thread() at sleepq_resume_thread+0x2b/frame 0xfffffe01e676f670
    sleepq_timeout() at sleepq_timeout+0xb5/frame 0xfffffe01e676f6b0
    softclock_call_cc() at softclock_call_cc+0x13a/frame 0xfffffe01e676f760
    callout_process() at callout_process+0x1ae/frame 0xfffffe01e676f7e0
    handleevents() at handleevents+0x1a8/frame 0xfffffe01e676f830
    timercb() at timercb+0x2a1/frame 0xfffffe01e676f880
    hpet_intr_single() at hpet_intr_single+0x1b9/frame 0xfffffe01e676f8b0
    hpet_intr() at hpet_intr+0x8e/frame 0xfffffe01e676f8f0
    intr_event_handle() at intr_event_handle+0x8b/frame 0xfffffe01e676f940
    intr_execute_handlers() at intr_execute_handlers+0x49/frame 0xfffffe01e676f970
    lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfffffe01e676f990
    Xapic_isr1() at Xapic_isr1+0xd0/frame 0xfffffe01e676f990
    --- interrupt, rip = 0xffffffff812f9426, rsp = 0xfffffe01e676fa60, rbp = 0xfffffe01e676fa60 ---
    acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfffffe01e676fa60
    acpi_cpu_idle() at acpi_cpu_idle+0x2e7/frame 0xfffffe01e676fab0
    cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe01e676fad0
    cpu_idle() at cpu_idle+0x95/frame 0xfffffe01e676faf0
    sched_idletd() at sched_idletd+0x544/frame 0xfffffe01e676fbb0
    fork_exit() at fork_exit+0x83/frame 0xfffffe01e676fbf0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01e676fbf0
    --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
    db:0:kdb.enter.default>  ps

    The curthread line (in the show pcpu output) shows whatever the current thread was, which is usually what caused the crash.
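
    If you would rather not pick those sections out by eye, the idea can be sketched in a few lines of Python. This is only a rough sketch: it assumes the debugger output from the textdump has already been saved as plain text, and the function names (split_ddb_sections, current_thread, backtrace_frames) are made up for illustration. The sample text is cut down from the first crash posted above.

```python
import re

# A ddb prompt line looks like: db:0:kdb.enter.default>  bt
PROMPT = re.compile(r"^db:\d+:[\w.]+>\s*(.*)$")
# A backtrace frame looks like: fork_exit() at fork_exit+0x83/frame 0x...
FRAME = re.compile(r"^(\w+)\(\) at ")


def split_ddb_sections(text):
    """Map each ddb command ('show pcpu', 'bt', 'ps', ...) to its output lines."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = PROMPT.match(line)
        if match:
            current = match.group(1).strip()
            sections[current] = []
        elif current is not None and line:
            sections[current].append(line)
    return sections


def current_thread(sections):
    """Return the value of the curthread line from 'show pcpu', if present."""
    for line in sections.get("show pcpu", []):
        if line.startswith("curthread"):
            return line.split("=", 1)[1].strip()
    return None


def backtrace_frames(sections):
    """Return the function names listed between the bt and ps prompts."""
    frames = []
    for line in sections.get("bt", []):
        match = FRAME.match(line)
        if match:
            frames.append(match.group(1))
    return frames


# Cut-down copy of the first crash in this thread.
SAMPLE = """\
db:0:kdb.enter.default>  show pcpu
cpuid        = 1
curthread    = 0xfffff8004de7d000: pid 17 "pf purge"
db:0:kdb.enter.default>  bt
Tracing pid 17 tid 100107 td 0xfffff8004de7d000
pf_purge_expired_src_nodes() at pf_purge_expired_src_nodes+0xb3/frame 0xfffffe022fb82b80
pf_purge_thread() at pf_purge_thread+0x81/frame 0xfffffe022fb82bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe022fb82bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe022fb82bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db:0:kdb.enter.default>  ps
"""

sections = split_ddb_sections(SAMPLE)
print(current_thread(sections))
print(backtrace_frames(sections))
```

    For the sample it prints the "pf purge" thread and the four frames from that backtrace, which is the same information you get by reading the section between > bt and > ps by hand.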

    So, yes, all three of those are completely different which usually indicates a hardware issue of some sort.
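
    To make the "same crash or random?" check a bit more mechanical, here is a rough Python sketch that groups crashes by the first frame that is not part of the panic path itself. It is a heuristic for illustration only, not a diagnostic tool: the labels "crash A/B/C", the function names and the choice of which frames to skip are all assumptions, and the backtraces are shortened copies of the three posted in this thread.

```python
import re
from collections import defaultdict

# Frames that belong to entering the panic/debugger path, not to the fault.
# Which frames to skip is an assumption made for this sketch.
PANIC_PLUMBING = {"kdb_enter", "vpanic", "panic"}

FRAME = re.compile(r"^\s*(\w+)\(\) at ")


def first_interesting_frame(bt_text):
    """Return the first backtrace function that is not panic plumbing."""
    for line in bt_text.splitlines():
        match = FRAME.match(line)
        if match and match.group(1) not in PANIC_PLUMBING:
            return match.group(1)
    return None


def group_by_signature(backtraces):
    """Group crash labels by their first interesting frame."""
    groups = defaultdict(list)
    for label, bt_text in backtraces.items():
        groups[first_interesting_frame(bt_text)].append(label)
    return dict(groups)


# Shortened copies of the three backtraces in this thread (labels are hypothetical).
BACKTRACES = {
    "crash A": """\
Tracing pid 17 tid 100107 td 0xfffff8004de7d000
pf_purge_expired_src_nodes() at pf_purge_expired_src_nodes+0xb3/frame 0xfffffe022fb82b80
pf_purge_thread() at pf_purge_thread+0x81/frame 0xfffffe022fb82bb0
""",
    "crash B": """\
Tracing pid 33436 tid 100218 td 0xfffff8019628a000
kdb_enter() at kdb_enter+0x3b/frame 0xfffffe022fe863f0
vpanic() at vpanic+0x194/frame 0xfffffe022fe86450
panic() at panic+0x43/frame 0xfffffe022fe864b0
pmap_remove_pages() at pmap_remove_pages+0x7fc/frame 0xfffffe022fe86590
""",
    "crash C": """\
Tracing pid 11 tid 100003 td 0xfffff80006200000
sleepq_resume_thread() at sleepq_resume_thread+0x2b/frame 0xfffffe01e676f670
sleepq_timeout() at sleepq_timeout+0xb5/frame 0xfffffe01e676f6b0
""",
}

groups = group_by_signature(BACKTRACES)
for signature, labels in groups.items():
    print(signature, labels)
print("distinct signatures:", len(groups))
```

    Here every crash lands in its own group (pf_purge_expired_src_nodes, pmap_remove_pages, sleepq_resume_thread), which matches the reading above. If repeated crashes kept landing in one group, that would point more towards a software problem than hardware.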

