After upgrade from 2.6CE to 23.01RC pfSense Plus CPU Type info shows very high CPU clocking
-
A C2758 won't have Speed Shift. What is that device, is it passively cooled?
>90°C is worryingly hot in any circumstances.
-
Passive heatsink on die but I am dragging air across that with 3 back mounted 2" Noctua fans.
Apologies for the partial answer, re-read your post.
It's a SuperMicro MBD-A1SRI-2758F-O with 32GB DDR3 Ram housed in a SuperMicro CSE-505-203B 1u chassis with 3 Noctua NF-A4x20 PWM fans pulling air out the back.
-
@stephenw10
As you correctly stated, dev.hwpstate_intel.X.epp had no effect on the issue. I had all 8 cores set to almost max efficiency and it's clear my CPU does not support it.
dev.hwpstate_intel.0.epp CPU 0 Speed Shift Level 95
dev.hwpstate_intel.1.epp CPU 1 Speed Shift Level 95
dev.hwpstate_intel.2.epp CPU 2 Speed Shift Level 95
dev.hwpstate_intel.3.epp CPU 3 Speed Shift Level 95
dev.hwpstate_intel.4.epp CPU 4 Speed Shift Level 95
dev.hwpstate_intel.5.epp CPU 5 Speed Shift Level 95
dev.hwpstate_intel.6.epp CPU 6 Speed Shift Level 95
dev.hwpstate_intel.7.epp CPU 7 Speed Shift Level 95
I didn't let it get as far into thermal runaway this time, but it was continuing to climb and the CPU stayed close to the max rated speed of 2600 MHz.
Max Speed: 2600 MHz Current Speed: 2400 MHz
Did a quick snapshot of what was going on before I rebooted in to 22.05 again.
There was a busy grep running that I didn't initiate, but otherwise the box was idling along.
[23.01-RELEASE][root@ENDPOINT]/root: top
last pid: 77152; load averages: 1.88, 1.90, 1.69 up 0+00:55:47 11:53:34
68 processes: 3 running, 65 sleeping
CPU: 14.4% user, 0.4% nice, 3.8% system, 0.0% interrupt, 81.4% idle
Mem: 895M Active, 317M Inact, 688M Wired, 29G Free
ARC: 173M Total, 62M MFU, 104M MRU, 152K Anon, 1127K Header, 4973K Other
114M Compressed, 271M Uncompressed, 2.38:1 Ratio
Swap: 2048M Total, 2048M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
87135 root 1 134 0 338M 311M CPU3 3 2:13 98.63% grep
87013 unbound 8 20 0 183M 129M kqread 0 1:07 9.24% unbound
482 root 1 68 0 144M 81M accept 7 0:53 1.03% php-fpm
13618 root 1 20 0 71M 43M piperd 7 0:02 0.24% php
59673 root 1 68 20 13M 3148K piperd 5 0:01 0.24% sh
88623 root 1 20 0 14M 4104K CPU1 1 0:00 0.18% top
95918 root 1 20 0 31M 11M kqread 5 0:14 0.13% nginx
[23.01-RELEASE][root@ENDPOINT]/root: ps -ef 87135
PID TT STAT TIME COMMAND
87135 - R 4:24.67 LOGNAME=root LANG=C.UTF-8 PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin PWD=/root USER=root HOME=/root SHELL=/bin/sh MM_CHARSET=UTF-8 BLOCKSIZE=K grep -vF -f /tmp/dnsbl_tld_remove.tsp /var/unbound/pfb_dnsbl.t
Let me know if there is any other info that would help track this down.
Thanks,
KC515
-
Hmm, well I would say you definitely need to look at your cooling solution there because that should never get that hot, even running at 100% on all cores. Do the fans not spin up with CPU temp?
But that grep seems unexpected. Looks like pfBlocker converting its lists for TLD, which requires a lot of CPU cycles, but it should finish after a few minutes.
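For anyone wondering what that grep is doing: -v inverts the match, -F treats each pattern as a fixed string, and -f reads the patterns from a file, so it prints only the lines that contain none of the patterns. A toy sketch of the same invocation follows. The function name, file names and contents are made up for illustration and are not pfBlocker's real data.

```sh
# Hypothetical demo of `grep -vF -f PATTERNS FILE`; all names and data are invented.
filter_demo() {
    dir=$(mktemp -d)
    # Patterns to drop: one fixed string per line.
    printf 'ads.example\ntracker.example\n' > "$dir/remove.txt"
    # List to filter.
    printf 'good.example\nads.example\ncdn.tracker.example\n' > "$dir/list.txt"
    # Keep only lines that contain none of the patterns (substring match).
    grep -vF -f "$dir/remove.txt" "$dir/list.txt"
    rm -rf "$dir"
}

filter_demo
```

Every line of the list has to be checked against every pattern, so with large lists on both sides this is plausibly where the CPU time goes.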
Steve
-
@stephenw10 It never did until 23.01. Been running CE and 22.0X for a little over a year and average CPU temp hovers around 65C.
If the cooling solution worked under the older versions and still does, something has to be afoot with the 23.01 C2758 combo.
-
@stephenw10 said in After upgrade from 2.6CE to 23.01RC pfSense Plus CPU Type info shows very high CPU clocking:
But that grep seems unexpected. Looks like pfBlocker converting its lists for TLD, which requires a lot of CPU cycles, but it should finish after a few minutes.
I've noticed the same huge grep spikes, which I never saw before. You can also see it if you force an update in the GUI: the panel showing progress will actually time out and has to be brought back into view several times during an update.
I got mine under a bit more control by turning off de-duplication and CIDR processing.
-
@stephenw10 said in After upgrade from 2.6CE to 23.01RC pfSense Plus CPU Type info shows very high CPU clocking:
Looks like pfBlocker converting its lists for TLD, which requires a lot of CPU cycles, but it should finish after a few minutes.
Normally, but there's a bug in recent versions where it can take quite a long time. Workaround is to disable the Wildcard Blocking (TLD) option.
-
I did a quick and dirty dd load test. Youngest process was 20 minutes, test was 7 processes wide. This was on 22.05.
4 Minutes to return to normal operating temp.
Apologies for the long output...
last pid: 16509; load averages: 7.95, 7.34, 6.50 up 0+03:47:32 15:46:55
104 processes: 10 running, 94 sleeping
CPU: 26.3% user, 0.0% nice, 64.7% system, 0.1% interrupt, 9.0% idle
Mem: 380M Active, 577M Inact, 934M Wired, 29G Free
ARC: 373M Total, 146M MFU, 220M MRU, 300K Anon, 1406K Header, 5583K Other
116M Compressed, 277M Uncompressed, 2.39:1 Ratio
Swap: 2048M Total, 2048M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
13635 root 1 102 0 10M 2368K CPU3 3 32:15 95.54% dd
71834 root 1 102 0 10M 2368K CPU5 5 20:17 95.44% dd
90250 root 1 102 0 10M 2368K CPU2 2 36:22 95.13% dd
99834 root 1 102 0 10M 2368K CPU7 7 29:35 95.04% dd
65187 root 1 102 0 10M 2368K CPU1 1 35:14 94.73% dd
92067 root 1 102 0 10M 2368K RUN 6 33:31 93.61% dd
3743 root 1 20 0 17M 8012K kqread 3 2:08 1.25% lighttpd_pfb
2525 root 1 52 0 109M 62M piperd 4 0:28 1.18% php_pfb
61660 root 1 20 0 134M 80M accept 6 1:38 0.65% php-fpm
45545 unbound 8 20 0 317M 227M kqread 4 2:17 0.56% unbound
6466 root 1 20 0 13M 3824K CPU6 6 0:00 0.16% top
96879 avahi 1 20 0 13M 3948K select 4 0:08 0.13% avahi-daemon
42769 root 1 20 0 11M 2836K select 4 0:20 0.11% syslogd
96869 root 1 52 0 136M 80M piperd 5 0:34 0.10% php-fpm
51097 root 1 20 0 12M 2984K bpf 4 0:14 0.07% filterlog
70928 root 5 52 0 14M 2684K uwait 6 0:03 0.04% dpinger
53363 root 1 20 0 29M 9596K kqread 5 1:25 0.03% nginx
41291 root 1 20 0 20M 9704K select 0 0:00 0.02% sshd
[22.05-RELEASE][root@ENDPOINT]/root: sysctl -a | grep "dev.cpu.*.temperature"
dev.cpu.7.temperature: 76.0C
dev.cpu.6.temperature: 76.0C
dev.cpu.5.temperature: 76.0C
dev.cpu.4.temperature: 77.0C
dev.cpu.3.temperature: 79.0C
dev.cpu.2.temperature: 79.0C
dev.cpu.1.temperature: 78.0C
dev.cpu.0.temperature: 79.0C
-
Hmm, that's interesting. I wonder if the coretemp driver is somehow reporting the value differently. It's hard to imagine the CPU is actually able to dissipate more Watts in 23.01.
Also interesting that it's showing 2.6GHz available when that CPU doesn't actually support that as far as I know.
It still seems very hot for a 20W TDP CPU with any sort of active cooling.
I agree the increased CPU usage is unexpected though.
As a quick test I usually use:
yes > /dev/null
You can easily run the CPU to 100% with that.
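To load several cores at once and clean up afterwards, something along these lines should do. It is only a rough sketch: burn_cpus is a made-up helper name, and the worker count and duration are whatever you pass in.

```sh
# Hypothetical helper: start N `yes` busy loops, wait, then stop them.
# Usage: burn_cpus WORKERS SECONDS
burn_cpus() {
    workers=$1
    seconds=$2
    pids=""
    i=0
    while [ "$i" -lt "$workers" ]; do
        yes > /dev/null &          # one busy loop per worker
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$seconds"
    # Word splitting on $pids is intentional: one PID per argument.
    kill $pids 2>/dev/null || true
    wait $pids 2>/dev/null || true
    echo "stopped $workers workers"
}
```

On FreeBSD, burn_cpus "$(sysctl -n hw.ncpu)" 300 would run one worker per CPU for five minutes.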
Testing that with an 8200 that uses a C3758R, which is a 26W TDP CPU, it never gets above 45C even with all cores maxed:
last pid: 32191; load averages: 9.81, 6.64, 3.27 up 0+00:11:51 23:23:56
66 processes: 11 running, 55 sleeping
CPU: 8.7% user, 0.0% nice, 91.3% system, 0.0% interrupt, 0.0% idle
Mem: 155M Active, 74M Inact, 560M Wired, 15G Free
ARC: 97M Total, 23M MFU, 70M MRU, 148K Anon, 611K Header, 2562K Other
59M Compressed, 143M Uncompressed, 2.40:1 Ratio
Swap: 1024M Total, 1024M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
15972 root 1 128 0 12M 2120K CPU5 5 5:02 99.41% yes
15979 root 1 131 0 12M 2124K RUN 4 5:07 99.39% yes
15085 root 1 127 0 12M 2124K CPU1 1 4:57 99.33% yes
15365 root 1 132 0 12M 2120K CPU2 2 5:00 99.20% yes
14966 root 1 128 0 12M 2124K RUN 6 5:04 88.49% yes
14909 root 1 127 0 12M 2116K RUN 0 5:16 85.01% yes
15795 root 1 127 0 12M 2124K CPU7 7 4:57 78.93% yes
15414 root 1 126 0 12M 2124K CPU3 3 5:06 73.80% yes
[23.01-RELEASE][admin@8200-2.stevew.lan]/root: sysctl -a | grep "dev.cpu.*.temperature"
dev.cpu.7.temperature: 43.0C
dev.cpu.6.temperature: 44.0C
dev.cpu.5.temperature: 45.0C
dev.cpu.4.temperature: 44.0C
dev.cpu.3.temperature: 44.0C
dev.cpu.2.temperature: 43.0C
dev.cpu.1.temperature: 46.0C
dev.cpu.0.temperature: 43.0C
So, yes, there does seem to be some unexplained load there in 23.01.
However you should check the cooling system on that box, you should never be able to get the CPU that hot.
Heatsink not seated correctly any longer maybe? Or the fan shrouding is out of position perhaps? Fans not actually working?
Steve
-
It's not what I would consider active cooling since there is no fan on the CPU heatsink itself, just the exhaust fans on the back. And no shroud either.
I don't know that there is some unexplained load in 23.01, other than the grep thing I initially observed.
What I can't explain is the fact that I can drive the same load in 22.05 and only reach 80C (176F) but in 23.01 it reaches 95C (203F). Could be just a stat reporting issue, but I'll let someone else find that and stick with 22.05 for now.
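For comparing the two versions under the same load, it may help to boil the per-core readings down to one number. A minimal sketch, assuming the sysctl output format shown earlier in this thread (dev.cpu.N.temperature: 76.0C); hottest_core is a made-up helper name.

```sh
# Hypothetical helper: read "dev.cpu.N.temperature: 76.0C" lines on stdin
# and print the hottest core with its reading (first core wins on a tie).
hottest_core() {
    awk '
    /^dev\.cpu\.[0-9]+\.temperature:/ {
        split($1, name, ".")   # name[3] is the core number
        temp = $2 + 0          # "76.0C" converts to 76 (trailing C is ignored)
        if (!seen || temp > max) { max = temp; core = name[3]; seen = 1 }
    }
    END { if (seen) printf "cpu%s %.1f\n", core, max }'
}
```

Something like sysctl dev.cpu | hottest_core, run at intervals on each version while the load test is going, would give a like-for-like series to compare.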
-
I think there probably is some additional loading somewhere in 23.01. Other than the grep delay in pfBlocker (which appears to actually be in grep) it's unclear where that is but we are looking into it.
However it's much more noticeable on your system because of the cooling you have. When you buy that chassis with a board fitted from Supermicro it has a fan shroud to provide ducted cooling. The rear of the enclosure is basically open mesh so without it the fans draw air in from everywhere and do very little to cool the CPU. But even given that it still seems very hot for a 20W TDP CPU like that. It should not get that hot.
I can only recommend you check the cooling there. Replace the heatsink compound. Add a fan duct. You could probably lower those temps by at least 30°C.
Steve
-
@stephenw10 Mine didn't come with any ducting at all.
-
Oh, it was supplied like that? From your description I had assumed (incorrectly) that you had fitted that board to the chassis.
Well, I would still consider adding some basic ducting. You'll be amazed how much difference that makes.
-
@stephenw10
Yep, going to look into a Noctua CPU fan.
-