Empty coretemp entries in thermal sensors widget



  • I have started using the thermal sensors widget. It was working fine at first, but now the widget has gone haywire and is showing invalid entries. The invalid entries are increasing slowly: at first there were none, and they have gradually taken over the dashboard.

    Does anyone know what's going on?

    3e9e8dc7-bae6-4a03-9f74-d9924ecdd974-image.png


  • Netgate Administrator

    That looks like a GUI glitch, but it could also be broken ACPI tables on that device.

    What do those entries show as if you edit the widget and enable 'full sensor name'?

    If you reboot, do they appear exactly the same?

    Steve



  • I've pasted a screenshot taken with full sensor name enabled.

    I also restarted pfSense afterwards, but the widget still shows the invalid entries.

    thermals.png



  • Strange.

    The 'dev' list is parsed for the literal word 'temperature', not 'temp'

    https://github.com/pfsense/pfsense/blob/29b42d654071ce58a2e194319cfc6b7447fe2bca/src/usr/local/www/widgets/widgets/thermal_sensors.widget.php#L34

    Can you run

    /sbin/sysctl -q dev.cpu | grep temperature | sort
    

    on the command line (console or SSH)?



  • I get:

    [2.4.5-RELEASE][admin@pfSense]/root: /sbin/sysctl -q dev.cpu | grep temperature | sort
    dev.cpu.0.temperature: 49.0C
    dev.cpu.1.temperature: 49.0C
    dev.cpu.2.temperature: 50.0C
    dev.cpu.3.temperature: 50.0C


  • Netgate Administrator

    Try using just: sysctl -aq | grep temperature, which is what your device would be checking.

    https://github.com/pfsense/pfsense/blob/RELENG_2_4_5/src/usr/local/www/widgets/widgets/thermal_sensors.widget.php#L35



  • This is what I get when I run that command:

    coretemp3: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp3: critical temperature detected, suggest system shutdown
    coretemp1: critical temperature detected, suggest system shutdown
    coretemp3: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp3: critical temperature detected, suggest system shutdown
    coretemp1: critical temperature detected, suggest system shutdown
    coretemp1: critical temperature detected, suggest system shutdown
    coretemp3: critical temperature detected, suggest system shutdown
    coretemp3: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    coretemp0: critical temperature detected, suggest system shutdown
    hw.acpi.thermal.tz1.temperature: 29.9C
    hw.acpi.thermal.tz0.temperature: 27.9C
    dev.cpu.3.temperature: 57.0C
    dev.cpu.2.temperature: 57.0C
    dev.cpu.1.temperature: 64.0C
    dev.cpu.0.temperature: 64.0C


  • Netgate Administrator

    Aha! So it's matching the word 'temperature' in the warnings there.

    I would expect that to change after a reboot, though. You say it appears identical?

    Has it actually overheated? That looks like it must be passively cooled.

    Steve


  • Netgate Administrator

    You could try changing the grep to match more accurately. So, for example, set that line in thermal_sensors.widget.php to:

    		$_gb = exec("/sbin/sysctl -aq | grep temperature:", $dfout);
    

    Steve
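
To illustrate why the trailing colon helps, here is a minimal sketch that runs both patterns against a sample abridged from the output posted earlier in this thread. The function name sample_output and the two counter variables are made up for the illustration; on a real system the input would come from /sbin/sysctl -aq instead.

```shell
#!/bin/sh
# Sample abridged from the sysctl -aq output posted earlier in the thread.
# sample_output is a hypothetical helper used only for this illustration.
sample_output() {
cat <<'EOF'
coretemp3: critical temperature detected, suggest system shutdown
coretemp0: critical temperature detected, suggest system shutdown
hw.acpi.thermal.tz1.temperature: 29.9C
hw.acpi.thermal.tz0.temperature: 27.9C
dev.cpu.1.temperature: 64.0C
dev.cpu.0.temperature: 64.0C
EOF
}

# Loose pattern: also matches the coretemp warning lines.
loose_count=$(sample_output | grep -c 'temperature')

# Tighter pattern: only lines of the form "name.temperature: value" match,
# because the warnings contain "temperature " rather than "temperature:".
strict_count=$(sample_output | grep -c 'temperature:')

echo "loose: $loose_count, strict: $strict_count"
```

With this sample the loose pattern matches all six lines, while the tighter one matches only the four actual sensor readings.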


  • Netgate Administrator

    This looks like an easy fix, even I can do it! Opened a bug to track:

    https://redmine.pfsense.org/issues/10963



  • @stephenw10 said in Empty coretemp entries in thermal sensors widget:

    Try using just: sysctl -aq | grep temperature, which is what your device would be checking.

    You're right!
    I was linking to the master (future 2.5.0) code.
    The 2.4.5 code is what most of us are using right now.
    I was already using the new '2.5.0' code myself for identical reasons.
    Just one line to change and you'll be fine.



  • Thanks everyone for the help.

    Yes, it is a passively cooled machine. I'm not sure why the reboot didn't clear the logs. Maybe it reached critical temperature during bootup?


  • Netgate Administrator

    I would think at least the ordering would be somewhat random if it happened at boot. Probably not something that's cleared at boot then.
    It would be interesting to see where in the sysctl output that is shown. It's probably possible to clear it manually if we know what is generating it.

    Steve



  • I changed the PHP file at /usr/local/www/widgets/widgets/thermal_sensors.widget.php and it worked. Thanks!
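
For anyone wanting to script the same one-line change, the sketch below shows the substitution on a scratch copy rather than on the live file. The exact form of the original 2.4.5 line is an assumption inferred from the replacement suggested earlier in the thread, and the variable names and temporary paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: operates on a scratch copy, not on the live widget file.
workdir=$(mktemp -d)
original="$workdir/thermal_sensors.widget.php"
patched="$workdir/thermal_sensors.widget.php.new"

# Assumed form of the 2.4.5 line (inferred from the suggested replacement,
# not copied from the actual source file).
printf '%s\n' '		$_gb = exec("/sbin/sysctl -aq | grep temperature", $dfout);' > "$original"

# Add the trailing colon to the grep pattern. Writing to a new file keeps
# the unmodified copy around as a backup.
sed 's/grep temperature"/grep temperature:"/' "$original" > "$patched"

cat "$patched"
```

Writing the result to a separate file avoids sed's in-place flag, whose syntax differs between FreeBSD and GNU sed, and leaves the original available if the change needs to be reverted.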

