Netgate Discussion Forum

    Crashes on APU2

    • C
      Cypher100
      last edited by

      It seems to crash every two weeks to a month. I have no idea what I'm looking for in the error dump, and I can't even work out what to search for to find an answer. I've tried reinstalling, and I ran a self-test on the SSD, but neither fixed the issue. If anyone has any ideas, please let me know.

      0_1550212997635_textdump.tar.0

      • C
        Cypher100
        last edited by

        Current temperatures are 54 °C to 60 °C. It seems to crash when I'm not actively using the internet; it has never crashed on me during usage. The device is only a couple of months old. I don't think it was crashing on earlier versions of pfSense.

        • stephenw10S
          stephenw10 Netgate Administrator
          last edited by stephenw10

          The key parts of that are:

          db:0:kdb.enter.default>  show pcpu
          cpuid        = 2
          dynamic pcpu = 0xfffffe0197692480
          curthread    = 0xfffff801034d9620: pid 4609 "sh"
          curpcb       = 0xfffffe012089fb80
          fpcurthread  = 0xfffff801034d9620: pid 4609 "sh"
          idlethread   = 0xfffff80003975000: tid 100005 "idle: cpu2"
          curpmap      = 0xfffff8007b66f138
          tssp         = 0xffffffff82bb47e0
          commontssp   = 0xffffffff82bb47e0
          rsp0         = 0xfffffe012089fb80
          gs32p        = 0xffffffff82bbb038
          ldt          = 0xffffffff82bbb078
          tss          = 0xffffffff82bbb068
          db:0:kdb.enter.default>  bt
          Tracing pid 4609 tid 100201 td 0xfffff801034d9620
          pmap_remove_pages() at pmap_remove_pages+0x2d7/frame 0xfffffe012089f450
          exec_new_vmspace() at exec_new_vmspace+0x1b5/frame 0xfffffe012089f4c0
          exec_elf64_imgact() at exec_elf64_imgact+0x931/frame 0xfffffe012089f5b0
          kern_execve() at kern_execve+0x77c/frame 0xfffffe012089f900
          sys_execve() at sys_execve+0x4a/frame 0xfffffe012089f980
          amd64_syscall() at amd64_syscall+0xa38/frame 0xfffffe012089fab0
          fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe012089fab0
          --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800b4664a, rsp = 0x7fffffffe218, rbp = 0x7fffffffe360 ---
          db:0:kdb.enter.default>  ps
          

          and

          Fatal trap 12: page fault while in kernel mode
          cpuid = 2; apic id = 02
          fault virtual address	= 0xfffff83df000e028
          fault code		= supervisor read data, page not present
          instruction pointer	= 0x20:0xffffffff81181117
          stack pointer	        = 0x28:0xfffffe012089f380
          frame pointer	        = 0x28:0xfffffe012089f450
          code segment		= base 0x0, limit 0xfffff, type 0x1b
          			= DPL 0, pres 1, long 1, def32 0, gran 1
          processor eflags	= interrupt enabled, resume, IOPL = 0
          current process		= 4609 (sh)
          

          Unfortunately nothing super conclusive there but it does look similar to this:
          https://forum.netgate.com/topic/106192/regular-crash-reports-on-my-apu2-2-3-2

          I would boot memtest86+ and run that for a few loops to be sure if you can.

          Steve

          • D
            dugeem @Cypher100
            last edited by

            @cypher100 Apart from running memtest on your APU2, you could also consider upgrading the BIOS to v4.0.24, which enables ECC on APU2 variants with 4GB RAM (e.g. APU2C4). FreeBSD supports ECC and can report errors via MCA, although ECC on the APU2 is relatively recent and so is unproven. Of course I'm not suggesting you continue using marginal HW, but it may add another data point.

            It might also be worth checking the power supply. Specifically, "It seems to crash when I'm not actively using the internet" could well point to a PSU issue.

            • stephenw10S
              stephenw10 Netgate Administrator
              last edited by

              It would be interesting to compare it to other reports if it crashes regularly. If they are all the same that's usually a pretty big clue.

              Steve
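One way to act on the suggestion to compare reports is to reduce each backtrace to its list of function names and check whether the innermost frames match. The following is only a minimal sketch: it assumes the `bt` output lives in the `ddb.txt` file inside each textdump archive and looks like the excerpt quoted earlier in this thread. The helper names `backtrace_functions` and `same_crash_site`, and the default depth of 3 frames, are hypothetical choices for illustration.

```python
import re

# One ddb backtrace frame, e.g.
#   pmap_remove_pages() at pmap_remove_pages+0x2d7/frame 0xfffffe012089f450
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+\s*$")


def backtrace_functions(ddb_text):
    """Return the function names from the first 'bt' section, innermost first."""
    functions = []
    in_bt = False
    for line in ddb_text.splitlines():
        stripped = line.strip()
        if not in_bt:
            # Wait for the debugger prompt line that runs the "bt" command.
            in_bt = stripped.startswith("db:") and stripped.split()[-1] == "bt"
            continue
        match = FRAME_RE.match(line)
        if match:
            functions.append(match.group(1))
        elif stripped.startswith(("---", "db:")):
            # A trap/syscall marker or the next prompt ends the backtrace.
            break
    return functions


def same_crash_site(first, second, depth=3):
    """True if the innermost `depth` frames of two backtraces are identical."""
    return bool(first) and first[:depth] == second[:depth]


# Abbreviated excerpt of the first crash quoted in this thread.
FIRST_CRASH = """\
db:0:kdb.enter.default>  bt
Tracing pid 4609 tid 100201 td 0xfffff801034d9620
pmap_remove_pages() at pmap_remove_pages+0x2d7/frame 0xfffffe012089f450
exec_new_vmspace() at exec_new_vmspace+0x1b5/frame 0xfffffe012089f4c0
exec_elf64_imgact() at exec_elf64_imgact+0x931/frame 0xfffffe012089f5b0
kern_execve() at kern_execve+0x77c/frame 0xfffffe012089f900
db:0:kdb.enter.default>  ps
"""

if __name__ == "__main__":
    print(backtrace_functions(FIRST_CRASH))
```

Running the same extraction on the `ddb.txt` from each later crash and passing the results to `same_crash_site` gives a quick yes/no on whether the crashes share a code path.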

              • C
                Cypher100
                last edited by

                I changed some options around, and the crashes continue. Memtest didn't show anything wrong with the memory. I turned off PowerD to make sure it wasn't a downclocking issue, and it crashed sooner after I turned that off.

                I have a universal laptop charger, so I'll test the PSU theory and report back here.

                • C
                  Cypher100
                  last edited by

                  I will also give v4.0.24 a shot, and update here if any crashes occur.

                  • C
                    Cypher100
                    last edited by Cypher100

                    Today it crashed again. I installed the latest BIOS with ECC, and used a third-party adapter that matches the requirements for the APU2. I also reinstalled pfSense after doing all of the above. I have attached the error log. I'm out of ideas on what could be causing this.

                    0_1551922109339_textdump.tar.0

                    • C
                      Cypher100
                      last edited by

                      I'm updating to v4.9.0.2 to see if that solves the issue.

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        Hmm, very different crash:

                        db:0:kdb.enter.default>  bt
                        Tracing pid 0 tid 100250 td 0xfffff8001d5f0000
                        lz4_compress() at lz4_compress+0x761/frame 0xfffffe01205358d0
                        zio_compress_data() at zio_compress_data+0x8c/frame 0xfffffe0120535910
                        zio_write_compress() at zio_write_compress+0x21f/frame 0xfffffe0120535990
                        zio_execute() at zio_execute+0xac/frame 0xfffffe01205359e0
                        taskqueue_run_locked() at taskqueue_run_locked+0x154/frame 0xfffffe0120535a40
                        taskqueue_thread_loop() at taskqueue_thread_loop+0x98/frame 0xfffffe0120535a70
                        fork_exit() at fork_exit+0x83/frame 0xfffffe0120535ab0
                        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0120535ab0
                        --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
                        db:0:kdb.enter.default>  ps
                        
                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 2; apic id = 02
                        fault virtual address	= 0x1
                        fault code		= supervisor read data, page not present
                        instruction pointer	= 0x20:0xffffffff8300ab51
                        stack pointer	        = 0x28:0xfffffe0120535860
                        frame pointer	        = 0x28:0xfffffe01205358d0
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        processor eflags	= interrupt enabled, resume, IOPL = 0
                        current process		= 0 (zio_write_issue_2)
                        

                        Different crashes like that point more toward a hardware issue.

                        Steve
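To make the "very different crash" observation concrete, the fatal trap fields from the two reports can be parsed into dictionaries and diffed. This is a rough sketch only: it assumes the trap text is taken from the kernel message buffer in the textdump (likely `msgbuf.txt`) and that each field is a `name = value` line as quoted in this thread. The helper names `parse_trap_fields` and `differing_fields` are hypothetical, and only three fields from each report are reproduced.

```python
def parse_trap_fields(msgbuf_text):
    """Parse 'name = value' lines of a fatal trap report into a dict."""
    fields = {}
    for line in msgbuf_text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        # Skip lines without '=' and continuation lines that have no name.
        if sep and key:
            fields[key] = value.strip()
    return fields


def differing_fields(first, second):
    """Return the sorted names of fields present in both reports whose values differ."""
    return sorted(k for k in first.keys() & second.keys() if first[k] != second[k])


# Selected fields from the two trap reports quoted in this thread.
FIRST_TRAP = (
    "Fatal trap 12: page fault while in kernel mode\n"
    "fault virtual address\t= 0xfffff83df000e028\n"
    "fault code\t\t= supervisor read data, page not present\n"
    "current process\t\t= 4609 (sh)\n"
)
SECOND_TRAP = (
    "Fatal trap 12: page fault while in kernel mode\n"
    "fault virtual address\t= 0x1\n"
    "fault code\t\t= supervisor read data, page not present\n"
    "current process\t\t= 0 (zio_write_issue_2)\n"
)

if __name__ == "__main__":
    first = parse_trap_fields(FIRST_TRAP)
    second = parse_trap_fields(SECOND_TRAP)
    print(differing_fields(first, second))
```

Both reports are the same trap type with the same fault code, but the faulting address and the process differ, which matches the point that the crashes do not share a single obvious software cause.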

                        • E
                          edz
                          last edited by

                          I seem to be having some instability issues with my APU2C. It was running OK for over a week. This morning the orange lights on each NIC were not flashing and all connected clients were receiving a self-assigned IP address.

                          The only way to resolve this was to reboot. I had a look through the logs but couldn't find anything. My Grafana dashboard shows that something odd started to occur around midnight:

                          Screen Shot 2019-11-13 at 06.39.40.png

                          CPU temperature on average is about 53 degrees Celsius, and it is running the latest BIOS, v4.10.0.2.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.