Had several crashes/reboots



  • Hey guys,

    I had my second spontaneous reboot yesterday on 2.4RC. I also had one a few weeks back on the previous release.

    On both occasions I uploaded the crash dump, and I was wondering whether someone could have a look at it and possibly help me resolve the issue.

    Thanks!

    Edit: if the above is not possible, can I review the crash dumps, analyze them, or paste something here to get community support? The thing is, I can't find them anymore.

    Thanks again.



  • Once you submit a crash report it is deleted (only pfSense/Netgate personnel can view it).
    So if/when it happens again, take copies of the dump files before submitting them.

    It might be hardware. Any USB storage or USB NICs? A spare power supply to swap in? Memory tested? I'm not saying that's the only possible reason, but hardware could cause strange things.
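
A minimal sketch of the "take copies before submitting" step, in Python. The function name `backup_crash_files` and both directory paths in the usage note are illustrative assumptions, not something stated in this thread. FreeBSD conventionally keeps saved dumps under /var/crash, but check where your install puts them, and whether a Python interpreter is available on the firewall at all.

```python
import shutil
from pathlib import Path


def backup_crash_files(crash_dir, backup_dir):
    """Copy every regular file in crash_dir into backup_dir.

    Returns the sorted list of file names that were copied. If crash_dir
    does not exist, nothing is copied and an empty list is returned.
    """
    src = Path(crash_dir)
    if not src.is_dir():
        return []

    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for item in sorted(src.iterdir()):
        if item.is_file():
            # copy2 also preserves timestamps, which helps when you later
            # need to report the approximate time of the crash.
            shutil.copy2(item, dst / item.name)
            copied.append(item.name)
    return copied
```

Usage would look like `backup_crash_files("/var/crash", "/root/crash-backup")`, run before pressing the submit button in the GUI. Both paths are assumptions to adjust.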



  • I'm seeing these as well…..

    @netgate/@pfsense folks: added to the IPv6 6RD issue, this makes me (a gold sub) unhappy.



  • Hi

    Same here. This never happened on 2.3, where I routinely had uptimes measured in hundreds of days and only rebooted when upgrading. I just had one on 2.4.0.r.20170914.2215.

    Regards

    Phil


  • Rebel Alliance Developer Netgate

    I don't see any crash reports posted from the IP addresses listed as the source of your forum posts. We can look up the crash reports fine, but we need to know where they came from (IP address, IPv6 or IPv4) and approximate time.

    If you get a crash report, you can copy/paste what you see of it in the GUI into a .txt file and attach it to a forum post as well.

    Without having any detail whatsoever about what the crashes are, I can't offer any possible explanations. We certainly are not seeing that kind of instability here in our internal testing.



  • Crazy idea:

    Maybe the crash reporter needs a link to copy the report to the clipboard, to make forum posting easier?



  • Hi

    @jimp:

    If the IP address didn't start with 212 for me please try the one logged on this post.  (Quite often I'm on a VPN so may have posted when connected.)

    Regards

    Phil


  • Rebel Alliance Developer Netgate

    @Phil_D:

    If the IP address didn't start with 212 for me please try the one logged on this post.  (Quite often I'm on a VPN so may have posted when connected.)

    Same address both times, no crash reports from it or anywhere close. IPv6 maybe? If you have it, the firewall prefers to use it.



  • @jimp:

    Thanks for trying. My clients' internet traffic is routed through one of three VPN servers out of a pool of several dozen. Unless pfSense has logs I can check to see which IP address my VPN clients had at the time the report was uploaded … you see the problem :).

    Unless it was sent over my WAN. In that case, since the IP doesn't change, it would be easy. Can I send you a PM with the IP address?

    Next time I'll make a copy of the crash dump and logs before uploading.

    Maybe I should make a firewall rule so that pfSense's own traffic to your servers always uses the direct WAN connection rather than the VPN connections?


  • Banned

    Seeing the debate here: guys, since you've already flooded the dashboard with a bunch of UUID/Netgate Device ID nonsense, why don't you actually make use of it? It would be a whole lot easier to look up the crash report that way than by digging for IPs.

    Have I missed something here?


  • Rebel Alliance Developer Netgate

    We could use the unique ID, but the crash reporting is meant to be simple/minimal. The IP and time are used on the server side, and no other system info is collected  except what FreeBSD puts in the crash dump. It's probably due for an overhaul but that's not going to help anyone right now.

    If there is any question about the report, just copy/paste it out to a text file from the report screen. When submitted, it always follows the default route on the firewall using IPv6 if it's available, or IPv4 if it isn't. For the vast majority of users, it's not a problem.



  • Hi

    I wasn't prompted to send any crash report, so unless it happened automatically you probably haven't received one from me.  I was using an in-memory RAM disk, so perhaps that stops the crash dump from surviving; I've turned that off now.  I've updated to the latest build of 2.4 RC and will see how it goes.

    Many thanks for trying to help.

    Regards

    Phil


  • Rebel Alliance Developer Netgate

    Nothing is automatically submitted, and RAM disks wouldn't matter. The only thing that would make a difference is whether or not your installation has any swap space. Without swap space, it can't store a crash dump from the kernel to be picked up on the next boot.

    If you don't have any swap space then you'd have to hook up a serial console and let it run with a large buffer or logging to file and hope it catches the crash data as it happens.
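
As a rough illustration of the swap check described above, here is a small Python sketch that reads the text printed by FreeBSD's `swapinfo` command and reports whether any swap device is listed. The helper names `parse_swapinfo` and `has_swap` are hypothetical, and the column layout (device name first, size in 1K blocks second) is assumed from typical `swapinfo` output. Having swap is only the precondition mentioned above; this does not check that a dump device is actually configured.

```python
def parse_swapinfo(text):
    """Return a list of (device, size_in_1k_blocks) from swapinfo-style text.

    Assumed layout: a header row starting with "Device", one row per swap
    device, and optionally a "Total" summary row when several devices exist.
    """
    devices = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the summary row.
        if len(fields) < 2 or fields[0] in ("Device", "Total"):
            continue
        if fields[1].isdigit():
            devices.append((fields[0], int(fields[1])))
    return devices


def has_swap(text):
    """True if at least one swap device with a non-zero size is listed."""
    return any(size > 0 for _device, size in parse_swapinfo(text))
```

To use it, paste the output of `swapinfo` into a string (or capture it with `subprocess.run(["swapinfo"], capture_output=True, text=True).stdout` on the firewall itself) and pass it to `has_swap`. If the result is `False`, the serial console approach above is the fallback.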



  • Hi

    Just seen a prompt to send the crash report to you, now done from this IP address.  Hope it is helpful.

    Regards

    Phil


  • Rebel Alliance Developer Netgate

    @Phil_D:

    Just seen a prompt to send the crash report to you, now done from this IP address.  Hope it is helpful.

    I see that one.

    db:0:kdb.enter.default>  bt
    Tracing pid 82185 tid 100207 td 0xfffff8015d96fa00
    turnstile_broadcast() at turnstile_broadcast+0x9c/frame 0xfffffe0230bda480
    __rw_wunlock_hard() at __rw_wunlock_hard+0x8f/frame 0xfffffe0230bda4b0
    vm_map_delete() at vm_map_delete+0x3dc/frame 0xfffffe0230bda530
    vm_map_remove() at vm_map_remove+0x47/frame 0xfffffe0230bda560
    exec_new_vmspace() at exec_new_vmspace+0x22f/frame 0xfffffe0230bda5e0
    exec_elf64_imgact() at exec_elf64_imgact+0xa58/frame 0xfffffe0230bda6f0
    kern_execve() at kern_execve+0x74d/frame 0xfffffe0230bdaa50
    sys_execve() at sys_execve+0x4a/frame 0xfffffe0230bdaad0
    amd64_syscall() at amd64_syscall+0x4ce/frame 0xfffffe0230bdabf0
    Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0230bdabf0
    --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800b40d8a, rsp = 0x7fffffffe178, rbp = 0x7fffffffe2c0 ---
    
    
    kernel trap 12 with interrupts disabled
    
    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address	= 0x30
    fault code		= supervisor read data, page not present
    instruction pointer	= 0x20:0xffffffff80cb9cfc
    stack pointer	        = 0x28:0xfffffe0230bda450
    frame pointer	        = 0x28:0xfffffe0230bda480
    code segment		= base 0x0, limit 0xfffff, type 0x1b
    			= DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags	= resume, IOPL = 0
    current process		= 82185 (sh)
    
    

    Not much to go on there, since it doesn't have a very long backtrace or much of an indication of what it was doing beyond some memory operations. It could be https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213903, which may actually be fixed on 11.1, so maybe try a 2.4.1 snapshot.

    At the moment 2.4.1 is nearly identical to 2.4 except it has a FreeBSD 11.1 base.
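
For anyone who saves a copy of the report text before submitting, a short Python sketch like the one below can pull out the parts that matter when comparing crashes: the function names in the `bt` backtrace, the faulting address, and the process that was running. The name `summarize_crash` and the two regular expressions are assumptions based only on the line shapes in the dump pasted above; other panics may print different fields.

```python
import re

# Matches backtrace rows such as:
#   vm_map_delete() at vm_map_delete+0x3dc/frame 0xfffffe0230bda530
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+\s*$")

# Matches "name = value" rows of the trap summary that we care about.
FIELD_RE = re.compile(r"^\s*(fault virtual address|current process)\s*=\s*(.+?)\s*$")


def summarize_crash(text):
    """Return (frames, fields) extracted from pasted crash report text.

    frames is the list of function names in backtrace order (innermost
    first, as printed); fields maps the matched trap-summary names to
    their values.
    """
    frames = []
    fields = {}
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
            continue
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return frames, fields
```

Run against the dump above, the first frame would be `turnstile_broadcast` and the fault address `0x30`. A fault address that low is often a sign of a NULL pointer being dereferenced at a small offset, although that alone does not identify the cause. Two reports with the same leading frames are a reasonable hint that they are the same crash.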



  • Hi

    Many thanks for the info and for checking the crash report.  I will try jumping up to 2.4.1.  I believe 2.4.1 will be close behind the official 2.4 release anyway.  The crash happened after around 3 days of uptime. It is the only crash I've had so far, so it is infrequent in my case.

    Regards

    Phil



  • I downloaded 2.4 on the 20th (the 19th 23:00 release) and had it running for about 3 hours before it too did a dump and rebooted. However, I didn't think to grab the files. I was using the memstick install, and the device it was on is a Qbox nT-A3350 9, if it helps…