APU2C4 powers itself off!



  • Hi,
    for several days now, I have had no Wi-Fi when I wake up in the morning. The access point and modem are connected, but the APU2C4 is powered off (it goes off at some point during the night)!

    I unplug it and plug it back in, and it boots normally.
    The logs only show entries from that boot onwards… :/

    Anything I can/should check?

    EDIT: this was posted at ~22:00 local time. It is now ~02:30 and the APU2C4 is off!
    So it happens at some point between those hours.



  • Hi
    Is the APU2 too hot?
    Did you mount the aluminium heat spreader correctly?

    Regards
    admins



  • @admins:

    Hi
    Is the APU2 too hot?
    Did you mount the aluminium heat spreader correctly?

    Regards
    admins

    The APU2 had the heat spreader already in place when I bought it.
    It's hard to say if it's too hot, since it happens in the middle of the night… :/



  • I have an APU2C4 too, and my system uptime can reach 100 days. The case gets quite hot, but I have never had a spontaneous reboot. I suggest you open your system and check that the heat spreader is correctly mounted. It may also be a good idea to replace the thermal compound between the CPU and the heat spreader.



  • Order new thermal pads from pcengines if you plan on opening it up. You need the specific thickness pads for things to mate properly.



  • @Jailer:

    Order new thermal pads from pcengines if you plan on opening it up. You need the specific thickness pads for things to mate properly.

    It sounds very weird to me that it gets hot:

    • In general, at all

    • In the middle of the night, roughly at the same time every night

    • In the coldest time of the day



  • Maybe it just wants to have a break, drink a beer, and sit for a while :P



  • @mtk:

    @Jailer:

    Order new thermal pads from pcengines if you plan on opening it up. You need the specific thickness pads for things to mate properly.

    It sounds very weird to me that it gets hot:

    • In general, at all

    • In the middle of the night, roughly at the same time every night

    • In the coldest time of the day

    Seems completely possible if there are scheduled maintenance jobs that run overnight. I'd run something to record the temperature every 5 or 10 seconds in a loop, see what happens. My money is on bad heat sink installation. Another possibility is that an indexing job or some such is causing enough I/O to make your storage overheat.
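
    The temperature loop suggested above could look roughly like the sketch below. It assumes Python 3.7+ is available on the box and that the `amdtemp` (or `coretemp`) module is loaded, so that FreeBSD exposes the reading as the `dev.cpu.0.temperature` sysctl. The log path is a hypothetical example, and the function names are made up for illustration.

```python
import datetime
import os
import re
import subprocess
import time

# Assumption: FreeBSD/pfSense with amdtemp (or coretemp) loaded, which
# reports the CPU temperature through this sysctl, e.g. "58.5C".
SYSCTL_NAME = "dev.cpu.0.temperature"


def parse_temperature(raw):
    """Turn sysctl output such as '58.5C' into a float, or None if unparsable."""
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)\s*C?\s*$", raw)
    return float(match.group(1)) if match else None


def read_temperature():
    """Ask sysctl for the current CPU temperature in degrees Celsius."""
    result = subprocess.run(
        ["sysctl", "-n", SYSCTL_NAME],
        capture_output=True, text=True, check=False,
    )
    return parse_temperature(result.stdout)


def log_forever(path, interval_seconds=10):
    """Append one timestamped reading per interval; never returns."""
    with open(path, "a") as log:
        while True:
            stamp = datetime.datetime.now().isoformat(timespec="seconds")
            log.write(f"{stamp} {read_temperature()}\n")
            # Flush and fsync so the last line survives a sudden power-off.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval_seconds)


# Uncomment to start logging (the path is a hypothetical example):
# log_forever("/root/cpu_temp.log", interval_seconds=10)
```

    After the next shutdown, the last line of the log gives both the approximate time of the power-off and the temperature just before it. A reading of `None` means the sysctl was missing or its output could not be parsed.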



  • I think more details would be nice to troubleshoot this.

    For example, which pfSense version are you running, and what did you install it on: mSATA (miniPCIe slot) or the SD card? Is it a NanoBSD install?

    I am using the APU2 as a FreeBSD AP and it seems very solid. I am using the SR71E module.


  • Netgate Administrator

    That does seem like a hardware issue. Potentially the power supply.

    If it has bad contact with the heatsink then the CPU will be running hot all the time even if not hot enough to fail. How hot is it?

    Steve



  • @VAMike:

    Seems completely possible if there are scheduled maintenance jobs that run overnight. I'd run something to record the temperature every 5 or 10 seconds in a loop, see what happens.

    How do I do that?

    @stephenw10:

    That does seem like a hardware issue. Potentially the power supply.

    If it has bad contact with the heatsink then the CPU will be running hot all the time even if not hot enough to fail. How hot is it?

    Steve

    How can I tell? Is there a log in the system somewhere?

    @FranciscoFranco:

    I think more details would be nice to troubleshoot this.

    For example, which pfSense version are you running, and what did you install it on: mSATA (miniPCIe slot) or the SD card? Is it a NanoBSD install?

    I am using the APU2 as a FreeBSD AP and it seems very solid. I am using the SR71E module.

    Latest pfSense 2.3.4 with mSATA.



  • If you don't see the temperature on the System Information page, you need to install the correct kernel module (it's not yet in current pfSense).
    Do so by following these instructions:
    http://www.pcengines.info/forums/?page=post&id=795B2ACC-F4B0-4181-9B4A-54EC757D4001&fid=DF5ACB70-99C4-4C61-AFA6-4C0E0DB05B2A&pageindex=1

    Enable it in pfSense if needed, here: System\Advanced\Miscellaneous, section "Cryptographic & Thermal Hardware".
    You can even add the widget, but it will show 4 identical temps, so I'm not sure it adds any value.

    I have an APU2 running at home, floating between 55°C and 60°C in an ambient of 20°C with zero issues (up to now, can't speak for the future)  8)
    For the CPU to crap out, it would have to go over 90°C, IIRC. What is your "normal" CPU core temperature?

    Also: Status\Monitoring, using category "system", graph "processor" -> this should show you, over a configurable time range, whether there is excessive load overnight.


  • Netgate Administrator

    It looks like that device ID is included in amdtemp in 2.4 so you could just upgrade to that:
    https://github.com/pfsense/FreeBSD-src/blob/devel/sys/dev/amdtemp/amdtemp.c#L83

    I don't have one to test against though so I can't be 100% sure.

    Steve



  • @bennyc:

    If you don't see the temperature on the System Information page, you need to install the correct kernel module (it's not yet in current pfSense).
    Do so by following these instructions:
    http://www.pcengines.info/forums/?page=post&id=795B2ACC-F4B0-4181-9B4A-54EC757D4001&fid=DF5ACB70-99C4-4C61-AFA6-4C0E0DB05B2A&pageindex=1

    Enable it in pfSense if needed, here: System\Advanced\Miscellaneous, section "Cryptographic & Thermal Hardware".
    You can even add the widget, but it will show 4 identical temps, so I'm not sure it adds any value.

    I have an APU2 running at home, floating between 55°C and 60°C in an ambient of 20°C with zero issues (up to now, can't speak for the future)  8)
    For the CPU to crap out, it would have to go over 90°C, IIRC. What is your "normal" CPU core temperature?

    Also: Status\Monitoring, using category "system", graph "processor" -> this should show you, over a configurable time range, whether there is excessive load overnight.

    Now we know that the power-off time is 00:40 (local time).
    No strange process load at any moment during the day…

    ![Screen Shot 2017-07-07 at 00.50.40.png](/public/imported_attachments/1/Screen Shot 2017-07-07 at 00.50.40.png)
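
    One way to pin down a power-off time like this from any periodic, timestamped record (monitoring samples, a temperature log, and so on) is to look for the gap in the timestamps. The helper below is a minimal sketch; `find_gaps` and its parameters are hypothetical names, not part of pfSense.

```python
import datetime


def find_gaps(timestamps, expected_interval, tolerance=3):
    """Return (last_seen, next_seen) pairs where the record paused.

    A pause is any step between consecutive timestamps that is longer
    than `tolerance` times the expected sampling interval.
    """
    limit = expected_interval * tolerance
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > limit
    ]
```

    The first element of each pair is the last sample seen before the box went down, so it bounds the power-off time to within one sampling interval.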



  • Can you put something else on the same circuit to make sure it's not the power feed failing? I.e. a clock that resets if you cut the power, or maybe a mechanical timer?



  • @mtk: Are there still LEDs active when you find it in the morning (power and/or Ethernet ports)?
    If it were a power dip, it would have rebooted on its own; the APU2 does not have a power button.
    But unplugging it and plugging it back in restarts it? Odd… Given the symptoms, I would interpret it as a hang.

    If the pfSense logs do not show anything, I would try to look at the console. Connect a PC/laptop to the serial port of the APU2 and leave the console open overnight. Hopefully that reveals something...
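
    A terminal program with logging enabled is enough for this. If you would rather have timestamps on every line, the overnight capture could be sketched as below, using only the Python standard library on a Linux/BSD/macOS laptop. It assumes the APU2 console runs at 115200 baud, 8N1 (the commonly documented default; check your own setup). The device and log paths in the usage line are hypothetical.

```python
import datetime
import os
import termios
import tty


def stamp_line(line, now):
    """Prefix one console line (bytes) with an ISO timestamp."""
    text = line.decode("ascii", errors="replace").rstrip("\r\n")
    return f"{now.isoformat(timespec='seconds')} {text}"


def open_console(device, baud=termios.B115200):
    """Open a serial device in raw mode at the given speed."""
    fd = os.open(device, os.O_RDWR | os.O_NOCTTY)
    tty.setraw(fd)                  # 8 data bits, no parity, no echo
    attrs = termios.tcgetattr(fd)
    attrs[4] = attrs[5] = baud      # input and output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def capture(device, log_path):
    """Copy console output to log_path, one timestamped line at a time."""
    fd = open_console(device)
    with os.fdopen(fd, "rb", buffering=0) as console, open(log_path, "a") as log:
        for line in console:
            log.write(stamp_line(line, datetime.datetime.now()) + "\n")
            log.flush()


# Uncomment to start capturing (both paths are hypothetical examples):
# capture("/dev/ttyUSB0", "apu2_console.log")
```

    The last lines written before the log goes quiet show whether the kernel printed a panic or a shutdown message, or whether the box simply stopped without a word.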