Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?
-
@stephenw10
I think we may miss things because of the reporting rate versus the actual rate of change. With different frequencies on different cores, the values we see may not be truly representative. With my previous 'adaptive' profile I didn't observe any turbo frequencies (which doesn't mean it wasn't happening, I guess), but with Speed Shift on 80 it sticks at 799 MHz under varying demand and then rapidly turbos when things get busy:
[23.09-RELEASE][admin@Router-7.redacted.me]/root: powerd -v
powerd: unable to determine AC line status
CPU frequency is below user-defined minimum; changing frequency to 2700 MHz
load 0%, current freq 799 MHz ( 0), wanted freq 4754 MHz
load 4%, current freq 799 MHz ( 0), wanted freq 4605 MHz
load 94%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 163%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 156%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 144%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 190%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 110%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 136%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 265%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 194%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 143%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 162%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 138%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 125%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 113%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 150%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 135%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 98%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 179%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 169%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 113%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 108%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 125%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 149%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 165%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 234%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 159%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 156%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 99%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 80%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 143%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 176%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 116%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 113%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 107%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 141%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 177%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 101%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 146%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 166%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 105%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 78%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 116%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 184%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 138%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 162%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 129%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 200%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 163%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 175%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 31%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 3%, current freq 3199 MHz ( 0), wanted freq 5231 MHz
load 0%, current freq 3199 MHz ( 0), wanted freq 5067 MHz
load 0%, current freq 3199 MHz ( 0), wanted freq 4908 MHz
load 54%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 41%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 39%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 37%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 89%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 37%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 31%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 47%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 31%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 30%, current freq 3199 MHz ( 0), wanted freq 5400 MHz
load 39%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 65%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 75%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 22%, current freq 799 MHz ( 0), wanted freq 5231 MHz
load 25%, current freq 799 MHz ( 0), wanted freq 5231 MHz
load 44%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 110%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 127%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 176%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 138%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 36%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 11%, current freq 799 MHz ( 0), wanted freq 5231 MHz
load 81%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 185%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 210%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 197%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 38%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 44%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 19%, current freq 799 MHz ( 0), wanted freq 5231 MHz
load 22%, current freq 799 MHz ( 0), wanted freq 5067 MHz
load 21%, current freq 799 MHz ( 0), wanted freq 4908 MHz
load 58%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 138%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 34%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 25%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 52%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 119%, current freq 799 MHz ( 0), wanted freq 5400 MHz
load 17%, current freq 799 MHz ( 0), wanted freq 5231 MHz
I presume the above is just the fastest core though.
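One way to see how representative those samples are is to tally the powerd -v lines by reported frequency. A minimal sketch, assuming the log has been saved as text and every sample has the "load N%, current freq N MHz ( N), wanted freq N MHz" shape shown above. The name `summarize_powerd` is hypothetical, and the tally only covers what powerd reported, so anything that happens between samples stays invisible.

```python
import re
from collections import Counter

# One powerd -v sample, e.g. "load 94%, current freq 799 MHz ( 0), wanted freq 5400 MHz".
# \s+ tolerates padding and line breaks that fall in the middle of a sample.
SAMPLE_RE = re.compile(
    r"load\s+(\d+)%,\s+current freq\s+(\d+) MHz\s+\(\s*\d+\),\s+wanted freq\s+(\d+) MHz"
)

def summarize_powerd(log_text):
    """Count samples per reported 'current freq' and track the peak load seen at each."""
    counts = Counter()
    peak_load = {}
    for load, current, _wanted in SAMPLE_RE.findall(log_text):
        freq = int(current)
        counts[freq] += 1
        peak_load[freq] = max(peak_load.get(freq, 0), int(load))
    return counts, peak_load
```

Feeding it the capture above would show how many samples were reported at 799 MHz versus 3199 MHz, and how high the load got while the reported frequency was still 799 MHz.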
-
I agree. When I was first testing Speed Shift I had a very hard time determining if my changes were actually doing anything. I came to the conclusion it's because simply running sysctl to read the cpu state causes it to bump the frequency. For example, running
sysctl dev.cpu.0.freq dev.cpu.1.freq dev.cpu.2.freq dev.cpu.3.freq
gives the same pattern of results as running
sysctl dev.cpu.3.freq dev.cpu.2.freq dev.cpu.1.freq dev.cpu.0.freq
I.e. whichever core you query first gives the lowest result and each subsequent core is running faster.
-
@stephenw10 said in Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?:
I came to the conclusion it's because simply running sysctl to read the cpu state causes it to bump the frequency.
Thankfully I don't see that, at least on this cpu:
[23.09-RELEASE]/root: sysctl dev.cpu.0.freq dev.cpu.1.freq dev.cpu.2.freq dev.cpu.3.freq dev.cpu.4.freq dev.cpu.5.freq dev.cpu.6.freq dev.cpu.7.freq
dev.cpu.0.freq: 799
dev.cpu.1.freq: 799
dev.cpu.2.freq: 799
dev.cpu.3.freq: 799
dev.cpu.4.freq: 799
dev.cpu.5.freq: 799
dev.cpu.6.freq: 799
dev.cpu.7.freq: 799
[23.09-RELEASE]/root:
Heisenberg defeated.
Less luck with PPPoE handling though (920 Mbps download test):
[23.09-RELEASE]/root: sysctl dev.cpu.0.freq dev.cpu.1.freq dev.cpu.2.freq dev.cpu.3.freq dev.cpu.4.freq dev.cpu.5.freq dev.cpu.6.freq dev.cpu.7.freq
dev.cpu.0.freq: 3199
dev.cpu.1.freq: 799
dev.cpu.2.freq: 799
dev.cpu.3.freq: 799
dev.cpu.4.freq: 799
dev.cpu.5.freq: 799
dev.cpu.6.freq: 799
dev.cpu.7.freq: 799
[23.09-RELEASE]/root:
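For comparing readings like the two above, the sysctl output can be reduced to just the cores that have ramped up. A rough sketch, assuming the output has been captured as text in the "dev.cpu.N.freq: MHz" form shown here. The name `boosted_cores` is hypothetical, and "boosted" only means "above the lowest frequency in the same reading", so a reading where every core has ramped equally comes back empty.

```python
import re

# Matches "dev.cpu.3.freq: 799". The colon is required, so the bare
# names on the command line itself are ignored.
FREQ_RE = re.compile(r"dev\.cpu\.(\d+)\.freq:\s*(\d+)")

def boosted_cores(sysctl_output):
    """Return {core: MHz} for cores reporting above the lowest frequency seen."""
    freqs = {int(core): int(mhz) for core, mhz in FREQ_RE.findall(sysctl_output)}
    if not freqs:
        return {}
    floor = min(freqs.values())
    return {core: mhz for core, mhz in freqs.items() if mhz > floor}
```

Applied to the PPPoE reading, this would pick out core 0 at 3199 MHz, which fits the single-core PPPoE limitation mentioned later in the thread.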
-
Hmm, I guess as long as the load sysctl imposes doesn't push it over whatever the threshold is, you wouldn't see that. Your CPU is likely a lot more powerful than the one I'm seeing that on.
-
Per core working just fine here.
-
BTW here is a great article showing why this is so important.
https://pcper.com/2015/11/intel-speed-shift-tested-significant-user-experience-improvements/
For those who weren't using powerd it's no big deal, while those of us who were welcome this update with open arms.
In these examples you can see that it now really only takes a couple of milliseconds to ramp clock speed.
Essentially, that means there is no reason not to run this on a modern CPU that supports it when you want the best possible performance. HUGE for those of us with power-hungry x86 CPUs that are running 24/7.
-
I appear to get higher (better) PPPoE throughput on my Xeon-D, which I didn't expect. I need to think a bit more as to why.
Anyway, just Intel Speed Select Technology to come...
-
@RobbieTT What is the model of your CPU?
Were you using PowerD before?
Also any virtualization involved?
-
@RobbieTT This is way better than PowerD; PowerD = SpeedStep.
Check out my link above.
-
After testing / monitoring for another week, I have concluded that a Speed Shift setting of "60" provides a pretty good performance / efficiency trade-off on the Intel Xeon D-1718T CPU in one of my systems. If I increase the value further to "80", I find that the low CPU frequencies become too sticky (i.e. it seems to take too long to ramp up), while not really resulting in incremental power savings. If I lower it to "50", the CPU ramps up too quickly to top frequencies, resulting in a temperature increase. What's interesting: on another system with an Intel Core i3-10100 CPU, a setting of "60" appears not to be conservative enough and the CPU still ramps up very quickly to higher frequencies. Could there be some differences between how different Intel CPU architectures handle / implement Speed Shift?
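For context on what those numbers mean: the Speed Shift value is an energy/performance preference on a 0 to 100 scale, where lower values favour performance and higher values favour power saving. The hardware field behind it (the EPP hint in Intel's HWP request register) is 8 bits wide, 0 to 255. Below is a sketch of where the settings discussed in this thread would land on that field, assuming the driver scales the percentage linearly with integer division. That scaling rule and the helper name `epp_percent_to_raw` are assumptions for illustration and have not been checked against the pfSense or FreeBSD source here.

```python
def epp_percent_to_raw(percent):
    """Scale a 0-100 preference onto the 8-bit EPP range (assumed linear, truncating)."""
    if not 0 <= percent <= 100:
        raise ValueError("preference must be between 0 and 100")
    return 0xFF * percent // 100

# Settings mentioned in this thread.
for setting in (25, 50, 60, 80):
    print(setting, "->", epp_percent_to_raw(setting))
```

Under that assumption the settings map to 63, 127, 153 and 204. The value is only a hint: how aggressively a given CPU ramps for a given hint is decided by the processor itself, which may be part of why "60" behaves differently on the Xeon D-1718T and the Core i3-10100.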
-
I just completed a number of WAN latency tests on my Xeon D-1718T system and had different results. I had to go down to a setting of 25 for Speed Shift to prevent the router from introducing latency on the WAN connection. 30 might have been ok, but 25 seemed to provide the best throughput and power results. A value of 60 increased the latency back to the same values I had when running on my Atom C3758 based router with PowerD set at Max values.
-
How much latency was that?
-
@stephenw10 Lowering the Speed Shift value from 60 to 30 dropped the loaded download latency by ~20ms on average based upon the Waveform Bufferbloat test: https://www.waveform.com/tools/bufferbloat
This is with a Comcast 1.2Gb download speed (real value is 1.4Gb).
-
Hmm, significant then.
-
I've not experienced anything like that, albeit I have more cores. I do have HyperThreading off though, so I wonder if that makes a difference?
-
@RobbieTT On the LAN side everything is 10Gb high powered devices, so maybe I'm just pushing the router a bit more.
-
@InstanceExtension Perhaps it is a difference in workload but my LANs are also 10GbE, plus I use bi-directional FQ_CoDel and I have the additional pfSense burden of limiting my PPPoE WAN to a single core.
-
It's definitely a trade-off, and the type of workload / environment matters as well. By favoring power savings over performance I also noticed an increase in latency, but it was very small: maybe an additional 200 microseconds (0.2 ms) or so to the first hop (i.e. the dpinger gateway ping), which is probably due to the CPU sitting at min frequency the majority of the time. On the WAN side I have a symmetric 2 Gbit fiber circuit and have not seen any increase in latency or decrease in performance there.
Where I have noticed a decrease in performance is when running an iperf3 test between two 10 Gbit hosts located on different internal network segments (i.e. basically just L3 routing). Prior to adjusting Speed Shift settings, I could max out the bandwidth (~9.4 Gbit/s) between the hosts with one iperf3 stream (P=1). By favoring power savings, I now see closer to 8.5-9.0 Gbit/s (on average) with a single stream. Increasing the streams to two or more results in the bandwidth being maxed out again. Perhaps the load of just one stream (or maybe iperf3 in general) isn't great enough to push the CPU into the highest frequencies when power saving is favored.
In any case, it's currently not limiting in any way because I don't have a need to route at 10 Gbit line speed. Once WAN speeds increase to that level I will probably have to adjust / tweak the Speed Shift settings again. In the meantime, with a system that's fairly lightly loaded most of the time, I'm fine with accepting a slight increase in latency and a slight decrease in performance in exchange for 2-3 degrees C lower temperatures (on average), along with decreased power consumption.
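To put the single-stream numbers in proportion, the drop can be worked out relative to the ~9.4 Gbit/s baseline. This is only arithmetic on the figures quoted in the post, and the helper name `relative_drop` is made up for the example.

```python
def relative_drop(baseline, observed):
    """Fractional throughput loss of `observed` relative to `baseline`."""
    return (baseline - observed) / baseline

baseline_gbit = 9.4  # single-stream iperf3 result before tuning, from the post
for observed_gbit in (8.5, 9.0):
    drop = relative_drop(baseline_gbit, observed_gbit)
    print(f"{observed_gbit} Gbit/s is {drop:.1%} below baseline")
```

That puts the single-stream cost of favoring power savings at roughly 4 to 10 percent, and only when one stream is running.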