Netgate Discussion Forum

    Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?

      bigjohns97 @RobbieTT

      @RobbieTT This is way better than PowerD, PowerD = SpeedStep

      Check out my link above.

        q54e3w

        pfsensecpu80vs60.jpg

          tman222

          After testing / monitoring for another week, I have concluded that a Speed Shift setting of "60" provides a pretty good performance / efficiency trade-off on the Intel Xeon D-1718T CPU in one of my systems. If I increase the value further to "80", I find that the low CPU frequencies become too sticky (i.e. it seems to take too long to ramp up), while not really resulting in incremental power savings. If I lower it to "50", the CPU ramps up too quickly to top frequencies, resulting in a temperature increase. What's interesting: on another system with an Intel Core i3-10100 CPU, a setting of "60" appears not to be conservative enough and the CPU still ramps up very quickly to higher frequencies. Could there be some differences between how different Intel CPU architectures handle / implement Speed Shift?
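
For context, the Speed Shift value discussed here is a 0 to 100 preference, where lower values favor performance and higher values favor energy savings. On the hardware side, Intel documents the Energy Performance Preference (EPP) field of the HWP request register as an 8-bit value (0 to 255). Below is a minimal sketch of how such a percentage could map onto that field. The function name and the plain linear integer scaling are assumptions made for illustration; they are not taken from pfSense or FreeBSD source.

```python
def epp_percent_to_raw(percent: int) -> int:
    """Scale a 0-100 Speed Shift preference onto the 8-bit EPP range (0-255).

    Assumption: plain linear scaling with integer truncation.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return (percent * 255) // 100


if __name__ == "__main__":
    # Settings mentioned in this thread.
    for setting in (25, 50, 60, 80):
        print(setting, "->", epp_percent_to_raw(setting))
```

Under that assumption, neighbouring settings such as 50 and 60 differ only modestly in the raw hint, so it would not be surprising if different CPU generations responded to the same value differently.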

            InstanceExtension @tman222

            @tman222

            I just completed a number of WAN latency tests on my Xeon D-1718T system and had different results. I had to go down to a setting of 25 for Speed Shift to prevent the router from introducing latency on the WAN connection. 30 might have been ok, but 25 seemed to provide the best throughput and power results. A value of 60 increased the latency back to the same values I had when running on my Atom C3758 based router with PowerD set at Max values.

              stephenw10 Netgate Administrator

              How much latency was that?

                InstanceExtension @stephenw10

                @stephenw10 Lowering the Speed Shift value from 60 to 30 dropped the loaded download latency by ~20ms on average based upon the Waveform Bufferbloat test: https://www.waveform.com/tools/bufferbloat

                This is with a Comcast 1.2Gb download speed (real value is 1.4Gb).

                  stephenw10 Netgate Administrator

                  Hmm, significant then. 🤔

                    RobbieTT @InstanceExtension

                    @InstanceExtension

                    I've not experienced anything like that, albeit I have more cores. I do have HyperThreading off though, so I wonder if that makes a difference?

                    ☕️

                      InstanceExtension @RobbieTT

                      @RobbieTT On the LAN side everything is 10Gb high powered devices, so maybe I'm just pushing the router a bit more.

                        RobbieTT @InstanceExtension

                        @InstanceExtension Perhaps it is a difference in workload but my LANs are also 10GbE, plus I use bi-directional FQ_CoDel and I have the additional pfSense burden of limiting my PPPoE WAN to a single core.

                        ☕️

                          tman222

                          It's definitely a trade-off, and the type of workload / environment matters as well. By favoring power savings over performance I also noticed an increase in latency, but it was very small: maybe an additional 200 microseconds (0.2ms) or so to the first hop (i.e. the dpinger gateway ping), which is probably due to the CPU sitting at min frequency the majority of the time. On the WAN side I have a symmetric 2Gbit Fiber circuit and have not seen any increase in latency or decrease in performance there.

                          Where I have noticed a decrease in performance is when running an iperf3 test between two 10Gbit hosts located on different internal network segments (i.e. basically just L3 routing). Prior to adjusting speed shift settings, I could max out the bandwidth (~9.4Gbit/s) between the hosts with one iperf3 stream (P=1). By favoring power savings, I see closer to 8.5-9.0Gbit/s (on average) now with a single stream. Increasing the streams to two or more results in the bandwidth being maxed out again. Perhaps the load of just one stream (or maybe iperf3 in general) isn't great enough to push the CPU into the highest frequencies when power saving is favored.

                          In any case, it's currently not limiting in any way because I don't have a need to route at 10Gbit line speed. Once WAN speeds increase to that level I will probably have to adjust / tweak the speed shift settings again. In the meantime, with a system that's fairly lightly loaded most of the time, I'm fine with accepting a slight increase in latency and a slight decrease in performance in exchange for 2-3 degrees C lower temperatures (on average), along with decreased power consumption.
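
To put the single-stream figures above in relative terms, here is a small worked calculation using only the numbers quoted in this post (about 9.4 Gbit/s before, 8.5 to 9.0 Gbit/s after). The helper name is made up for illustration.

```python
def relative_drop_percent(before: float, after: float) -> float:
    """Return the percentage decrease going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


baseline_gbit = 9.4  # single-stream throughput before the change
for observed_gbit in (8.5, 9.0):  # range seen when favoring power savings
    drop = relative_drop_percent(baseline_gbit, observed_gbit)
    print(f"{observed_gbit} Gbit/s is {drop:.1f}% below {baseline_gbit} Gbit/s")
```

That works out to a single-stream drop of roughly 4 to 10 percent, which only shows up when one flow has to carry the full 10Gbit load.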

                            RobbieTT @tman222

                            @tman222
                            With mine set at 80% I don't see any change in throughput when routing, with 1 or 4 streams. It is a pure flat line at 9.90 Gbps.

                            If I generate 10GbE using the router as the server or client I can detect a small ripple in the graph with 1 stream but the 9.90 Gbps average remains:

                             2023-12-16 at 12.10.38.png

                            4 streams is still a flat line though:

                             2023-12-16 at 12.02.03.png

                            If I simultaneously ping a gateway using the same link I do see a small increase, as you would expect when saturating it with an iPerf test:

                             2023-12-16 at 12.14.11.png

                            ☕️

                              tman222

                              Hi @RobbieTT - that's interesting! Are those tests done using 1500 byte packets or a larger MTU? Do you get similar results if testing with a client behind the router vs. from the router?

                                RobbieTT @tman222

                                @tman222

                                Going via the router or to or from the router makes no difference at 10 GbE, other than the slight ripple you can see in the graphs when you are getting the router to produce the traffic.

                                Same for larger MTU (which I make use of) as the PPS is high enough for small packets within the 10 GbE cap I run at.
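
As a rough illustration of the packet rates involved, the sketch below estimates the theoretical maximum packets per second on a 10 GbE link for full-size packets. It assumes standard untagged Ethernet framing overhead, and the 9000-byte jumbo MTU is an assumed example value, not a figure stated in this thread.

```python
# Per-packet bytes on the wire beyond the IP packet (untagged Ethernet):
# 14 header + 4 FCS + 8 preamble/SFD + 12 inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 14 + 4 + 8 + 12


def line_rate_pps(link_bps: float, mtu_bytes: int) -> float:
    """Theoretical maximum full-size packets per second on an Ethernet link."""
    if mtu_bytes <= 0:
        raise ValueError("mtu_bytes must be positive")
    wire_bits = (mtu_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    for mtu in (1500, 9000):  # 9000 is an assumed jumbo-frame size
        print(f"MTU {mtu}: {line_rate_pps(10e9, mtu):,.0f} packets/s")
```

A larger MTU cuts the packet rate needed to fill the link by roughly a factor of six, so the router has less per-packet work to do at the same bandwidth.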

                                I don't have suitable equipment to produce results at 25 GbE but, when running above 10 GbE, I would expect to see a clear delta between pure LAN to LAN routing and generating traffic from the router, like you do on low-power CPUs (albeit at much lower speeds in those cases).

                                Real-world traffic through the WAN would be a lot lower due to the firewall, PPPoE limits and the limitations of pfSense/BSD at high bandwidths. I don't have a trustworthy means of testing, just enough to produce indicative results.

                                The platform and pfSense could do more if Netgate invested time and effort into using QAT in userspace as, at the moment, pfSense is still using the CPU for things that really should be directed at the dedicated silicon. To still have the PPPoE weakness after all these years is another black mark against pfSense/BSD.

                                ☕️

                                  bigjohns97

                                  Figured I would join in on the fun.

                                  I do notice a little added latency on speedtest.net going from 4-5ms to 7-8ms ping on download but in web browsing and downloading I still get the same speeds.

                                  As you can see from my graphs, setting it to 80 allows the CPU to clock down to 800 MHz when idle and only scale up to 4.6 GHz when under regular speed test load.
                                  f67caac9-3ef9-40e4-8cf8-16a1ce7f6a8c-image.png

                                  This also has a nice effect on temps.
                                  eb9a7e0c-4e78-4a67-a092-00eed0355102-image.png

                                  Before, you can see the clock speed was a constant 4.6 GHz (Intel 9700K) and it pretty much never left that speed, at least whenever telegraf queried it.

                                  I am going to leave this at 80 and see how it feels during normal usage. Having this level of control over CPU usage really shifts the mindset towards having a much more powerful CPU, since the performance is there when you need it and it doesn't really cost much to have it there and waiting.

                                  BTW - this is with telegraf, pfblockerng, suricata, and ntopng all running in the background, which was probably what held the CPU at 4.6 GHz @ 50.

                                    tman222 @bigjohns97

                                    @bigjohns97 - thanks for sharing those results. I'd be curious: since you have an Intel "K" CPU, do you have MultiCore Enhancement (or a similar name; this is what Asus calls it) enabled in the motherboard's BIOS? As I understand it, this setting turbos the cores more aggressively (i.e. allows all cores to run at maximum turbo frequency) and is now often enabled by default. Overall I see this as potentially useful for gaming or other heavily CPU-bound workloads, but maybe not for a router/firewall. I did disable this setting recently on a desktop machine and temps dropped another 2-3 degrees Celsius with no noticeable impact on performance.

                                      bigjohns97 @tman222

                                      @tman222 I do have multicore enhancement enabled in the BIOS. I want the CPU to be able to scale as fast as Intel intended, but only when it is absolutely necessary.

                                      I am hoping this 80 setting does exactly that.

                                      All my pfSense machines are old gaming boxes :)

                                        bigjohns97

                                        After close to 24 hours, here are the high-level changes that moving from 50 to 80 made on my 9700K.

                                        Temps dropped around 5 degrees on low load / idle and almost 10 degrees under load.

                                        Before
                                        4704ea02-c1c7-4f2e-a2ec-e38f044fa90c-image.png

                                        After
                                        fb592148-cecf-44b3-b614-154575e7b63b-image.png

                                        Clocks were much more balanced across the available frequencies

                                        Before
                                        a2825e31-8122-4309-9376-e2979e2cff84-image.png

                                        After
                                        209b1966-7525-4d41-8cab-64180686c485-image.png

                                        Notice that before it was pretty much pegged at 4.6 GHz all the time; now it can still hit that mark, but the average is much closer to the middle of the range and it is very balanced.

                                        I couldn't be happier with this setting of 80, appreciate the thread everyone.

                                        Happy New Year!

                                          Sergei_Shablovsky @RobbieTT

                                          @RobbieTT
                                          Please, is the last screenshot from PingPlotter for Mac?

                                          If the answer is “yes”, please tell me how ACCURATE you find it (when RUNNING AS A SERVICE) compared with SmokePing?

                                          (I know that these are different classes of software, and that the best thing PingPlotter has, as opposed to SmokePing, is scripted and EASY TO USE reactions to triggers (and forward pages in a doc…), but anyway…)

                                          —
                                          CLOSE SKY FOR UKRAINE https://youtu.be/_tU1i8VAdCo !
                                          Help Ukraine to resist, save civilians people’s lives !
                                          (Take an active part in public protests, push on Your country’s politics, congressmans, mass media, leaders of opinion.)

                                            RobbieTT @Sergei_Shablovsky

                                            Yes, it is PingPlotter on my Mac server, so it runs 24/7 on my WAN and occasionally against other things under test. I don't have an opinion on the other choices out there or how they compare with it, as I have just run this for years.

                                            ☕️

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.