Netgate Discussion Forum

    Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?

    Hardware
    74 Posts 8 Posters 18.8k Views
    • RobbieTT @InstanceExtension

      @InstanceExtension

      I've not experienced anything like that, albeit I have more cores. I do have HyperThreading off though, so I wonder if that makes a difference?

      ☕️

      • InstanceExtension @RobbieTT

        @RobbieTT On the LAN side, everything is a high-powered 10Gb device, so maybe I'm just pushing the router a bit more.

        • RobbieTT @InstanceExtension

          @InstanceExtension Perhaps it is a difference in workload but my LANs are also 10GbE, plus I use bi-directional FQ_CoDel and I have the additional pfSense burden of limiting my PPPoE WAN to a single core.

          ☕️

          • tman222

            It's definitely a trade-off, and the type of workload / environment matters as well. By favoring power savings over performance I also noticed an increase in latency, but it was very small: maybe an additional 200 microseconds (0.2 ms) or so to the first hop (i.e. the dpinger gateway ping), which is probably due to the CPU sitting at minimum frequency the majority of the time. On the WAN side I have a symmetric 2 Gbit fiber circuit and have not seen any increase in latency or decrease in performance there.

            Where I have noticed a decrease in performance is when running an iperf3 test between two 10 Gbit hosts located on different internal network segments (i.e. basically just L3 routing). Prior to adjusting the Speed Shift settings, I could max out the bandwidth (~9.4 Gbit/s) between the hosts with one iperf3 stream (P=1). By favoring power savings, I now see closer to 8.5-9.0 Gbit/s (on average) with a single stream. Increasing to two or more streams maxes out the bandwidth again. Perhaps the load of just one stream (or maybe iperf3 in general) isn't great enough to push the CPU into the highest frequencies when power saving is favored.

            In any case, it's currently not limiting in any way because I don't have a need to route at 10 Gbit line speed. Once WAN speeds increase to that level I will probably have to adjust / tweak the Speed Shift settings again. In the meantime, with a system that's fairly lightly loaded most of the time, I'm fine with accepting a slight increase in latency and a slight decrease in performance in exchange for 2-3 degrees C lower temperatures (on average), along with decreased power consumption.
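
The single-stream versus multi-stream comparison described above could be repeated with something like the sketch below. It only uses the standard iperf3 client flags (`-c`, `-P`, `-t`); the server address is a placeholder from the documentation range, the 30-second duration is an arbitrary choice, and `iperf_cmd` is a hypothetical helper name. By default it just prints the commands.

```shell
#!/bin/sh
# Placeholder target: replace with the iperf3 server on the other network segment.
SERVER="${SERVER:-192.0.2.10}"
# Arbitrary test length in seconds (an assumption, not a value from the thread).
DURATION="${DURATION:-30}"

# Build the iperf3 client command for a given number of parallel streams.
iperf_cmd() {
    echo "iperf3 -c $SERVER -P $1 -t $DURATION"
}

# Dry run by default; set RUN=1 to actually execute each test.
for streams in 1 2 4; do
    cmd=$(iperf_cmd "$streams")
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    fi
done
```

Running the same loop before and after changing the Speed Shift value keeps the stream counts and duration identical, so any difference in single-stream throughput is easier to attribute to the setting.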

            • RobbieTT @tman222

              @tman222
              With mine set at 80% I don't see any change in throughput when routing, with 1 or 4 streams. It is a pure flat line at 9.90 Gbps.

              If I generate 10GbE using the router as the server or client I can detect a small ripple in the graph with 1 stream but the 9.90 Gbps average remains:

               2023-12-16 at 12.10.38.png

              4 streams is still a flat line though:

               2023-12-16 at 12.02.03.png

              If I simultaneously ping a gateway using the same link I do see a small increase, as you would expect when saturating it with an iPerf test:

               2023-12-16 at 12.14.11.png

              ☕️

              • tman222

                Hi @RobbieTT - that's interesting! Are those tests done using 1500 byte packets or a larger MTU? Do you get similar results if testing with a client behind the router vs. from the router?

                • RobbieTT @tman222

                  @tman222

                  Going via the router or to or from the router makes no difference at 10 GbE, other than the slight ripple you can see in the graphs when you are getting the router to produce the traffic.

                  Same for larger MTU (which I make use of) as the PPS is high enough for small packets within the 10 GbE cap I run at.

                  I don't have suitable equipment to produce results at 25 GbE but I would expect to see a clear delta between pure LAN to LAN routing and generating traffic from the router, like you do on low-power CPUs (albeit at much lower speeds in those cases), when running above 10 GbE.

                  Real-world traffic through the WAN would be a lot lower due to the firewall, PPPoE limits and the limitations of pfSense/BSD at high bandwidths. I don't have a trustworthy means of testing, just enough to produce indicative results.

                  The platform and pfSense could do more if Netgate invested time and effort into using QAT in userspace as, at the moment, pfSense is still using the CPU for things that really should be directed at the dedicated silicon. To still have the PPPoE weakness after all these years is another black mark against pfSense/BSD.

                  ☕️

                  • bigjohns97

                    Figured I would join in on the fun.

                    I do notice a little added latency on speedtest.net going from 4-5ms to 7-8ms ping on download but in web browsing and downloading I still get the same speeds.

                    As you can see from my graphs, setting it to 80 allows the CPU to clock down to 800 MHz when idle and only scale up to 4.6 GHz when under regular speed test load.
                    f67caac9-3ef9-40e4-8cf8-16a1ce7f6a8c-image.png

                    This also has a nice effect on temps.
                    eb9a7e0c-4e78-4a67-a092-00eed0355102-image.png

                    Before the change, you can see the clock speed was a constant 4.6 GHz (Intel 9700K); it pretty much never left that speed, at least whenever telegraf queried it.

                    I am going to leave this at 80 and see how it feels during normal usage. Having this level of control over CPU usage really shifts the mindset toward having a much more powerful CPU, since the performance is there when you need it and doesn't really cost much to have waiting.

                    BTW - this is with telegraf, pfblockerng, suricata, and ntopng all running in the background, which was probably what held the CPU at 4.6 GHz at a setting of 50.
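
The thread quotes Speed Shift values (50, 70, 80) without spelling out which direction the scale runs. Assuming the pfSense setting maps onto the `epp` (Energy/Performance Preference) sysctl of FreeBSD's hwpstate_intel(4) driver, 0 is a hint toward performance and 100 a hint toward energy efficiency. A minimal sketch that labels a value accordingly; the function name, the wording of the labels and the use of 50 as the midpoint are illustrative choices, not anything pfSense defines:

```shell
#!/bin/sh
# Describe an Energy/Performance Preference value on the 0-100 scale.
# 0 hints at performance, 100 hints at energy efficiency (per hwpstate_intel(4)).
describe_epp() {
    epp=$1
    # Reject anything that is empty or not a plain non-negative integer.
    case $epp in
        ''|*[!0-9]*) echo "invalid: $epp"; return 1 ;;
    esac
    if [ "$epp" -gt 100 ]; then
        echo "invalid: $epp"
        return 1
    elif [ "$epp" -lt 50 ]; then
        echo "$epp: leans toward performance"
    elif [ "$epp" -gt 50 ]; then
        echo "$epp: leans toward energy efficiency"
    else
        echo "$epp: balanced"
    fi
}
```

On a system using that driver, the current value for core 0 could be fed in with `describe_epp "$(sysctl -n dev.hwpstate_intel.0.epp)"`.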

                    • tman222 @bigjohns97

                      @bigjohns97 - thanks for sharing those results. I'd be curious: since you have an Intel "K" CPU, do you have MultiCore Enhancement (or a similarly named option; this is what Asus calls it) enabled in the motherboard's BIOS? As I understand it, this setting turbos the cores more aggressively (i.e. allows all cores to run at maximum turbo frequency) and is now often enabled by default. Overall, I can see this being useful for gaming or other heavily CPU-bound workloads, but maybe not for a router/firewall. I did disable this setting recently on a desktop machine and temps dropped another 2-3 degrees Celsius with no noticeable impact on performance.

                      • bigjohns97 @tman222

                        @tman222 I do have MultiCore Enhancement enabled in the BIOS. I want the CPU to be able to scale as fast as Intel intended, but only when it is absolutely necessary.

                        I am hoping this 80 setting does exactly that.

                        All my pfSense machines are old gaming boxes :)

                        • bigjohns97

                          After close to 24 hours, here are the high-level changes that moving from 50 to 80 made on my 9700K.

                          Temps dropped around 5 degrees on low load / idle and almost 10 degrees under load.

                          Before
                          4704ea02-c1c7-4f2e-a2ec-e38f044fa90c-image.png

                          After
                          fb592148-cecf-44b3-b614-154575e7b63b-image.png

                          Clocks were much more balanced across the available frequencies

                          Before
                          a2825e31-8122-4309-9376-e2979e2cff84-image.png

                          After
                          209b1966-7525-4d41-8cab-64180686c485-image.png

                          Notice that before it was pretty much pegged at 4.6 GHz all the time. Now it can still hit that mark, but the average is much closer to the middle of the range and it is very balanced.

                          I couldn't be happier with this setting of 80, appreciate the thread everyone.

                          Happy New Year!

                          • Sergei_Shablovsky @RobbieTT

                            @RobbieTT
                            Please, is the last screenshot from PingPlotter for Mac?

                            If the answer is “yes”, please tell me how accurate you find it (when running as a service) compared with SmokePing.

                            (I know these are different classes of software, and that the best thing about PingPlotter, as opposed to SmokePing, is that it has scripted, easy-to-use reactions to triggers (and forward pages in a doc…), but anyway…)

                            —
                            CLOSE SKY FOR UKRAINE https://youtu.be/_tU1i8VAdCo !
                            Help Ukraine to resist, save civilians people’s lives !
                            (Take an active part in public protests, push on Your country’s politics, congressmans, mass media, leaders of opinion.)

                            • RobbieTT @Sergei_Shablovsky

                              Yes, PingPlotter on my Mac server so it runs 24/7 on my WAN and occasionally to other things under test. I don't have an opinion on the choices out there or how they compare with it as I have just run this for years.

                              ☕️

                              • Sergei_Shablovsky @RobbieTT

                                Thank you for answering!

                                PingPlotter really is a must-have for a Mac server. ;)

                                So, do you do any scripting with it?

                                Do you run PingPlotter as a service?

                                • chrcoluk

                                  On an N100 it seems impossible to get it to idle at minimum speed; even when set to 100 (with per-core control), it idles at around 1.4 GHz.

                                  However, it needs to be no higher than about 70 to reliably exceed 2.5 GHz clock speeds under an openssl benchmark.

                                  I might need to lower it further though, as I have been seeing small CPU spikes, enough to cause noticeable jitter on traffic, which I suspect is because the CPU isn't clocking up enough.

                                  It is PPPoE so no doubt that is making things harder.

                                  pfSense CE 2.8.1

                                  • stephenw10 Netgate Administrator

                                    How are you checking? The GUI code will always make it ramp up. In my testing, even running sysctl increased the frequency on each core as it was read.

                                    • RobbieTT @stephenw10

                                      @stephenw10
                                      Surely that will be CPU & load specific, with weaker CPUs needing to work harder with minor tasks?

                                      ☕️

                                      • stephenw10 Netgate Administrator

                                        Yup. However SpeedShift appears so fast (sensitive) that even on something relatively powerful it will react where SpeedStep would not.

                                        • chrcoluk @stephenw10

                                          @stephenw10 With the sysctl freq variables.

                                          Ironically, I have now dropped it from 70 to 60, as 70 wasn't ramping up to the highest clock speed during high throughput, whereas 60 does.

                                          These are the lowest clocks I have managed to see; I managed to get one core below 1 GHz :) This was by luck, as it's usually above 1.4 GHz.

                                          I was also logged out of the UI.

                                           # sysctl dev.cpu.0.freq dev.cpu.1.freq dev.cpu.2.freq dev.cpu.3.freq
                                          dev.cpu.0.freq: 926
                                          dev.cpu.1.freq: 1125
                                          dev.cpu.2.freq: 1337
                                          dev.cpu.3.freq: 1410
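
A minimal sketch for condensing per-core readings like those above into a single line, assuming the `dev.cpu.N.freq: value` output format shown in this post. `summarize_freqs` is a hypothetical helper name, and the average is truncated to a whole number. As noted earlier in the thread, the act of reading the values can itself nudge the clocks up, so the numbers are indicative at best.

```shell
#!/bin/sh
# Read sysctl-style "dev.cpu.N.freq: MHz" lines on stdin and print
# the core count plus the minimum, maximum and (truncated) average.
summarize_freqs() {
    awk -F': ' '
        # Match only the per-core freq lines, not e.g. freq_levels.
        /^dev\.cpu\.[0-9]+\.freq: / {
            f = $2 + 0
            if (n == 0 || f < min) min = f
            if (n == 0 || f > max) max = f
            sum += f
            n++
        }
        END {
            if (n > 0)
                printf "cores=%d min=%d max=%d avg=%d\n", n, min, max, int(sum / n)
        }
    '
}
```

On the firewall itself this could be used as `sysctl dev.cpu | summarize_freqs`, ideally from an SSH or console session with the GUI logged out.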
                                          

                                          • RobbieTT @chrcoluk

                                            @chrcoluk The CPU capabilities are probably more dominant than the fine controls provided by pfSense.

                                            I also happen to have my SpeedShift set at 70 and with my router doing regular work (no VPNs or invasive tasks) the 8 physical cores sit at the figures below (NB I have disabled hyper-threading as it arguably gets more in the way than actually helping):

                                             2024-06-11 at 11.12.24.png

                                            [24.03-RELEASE][admin@Router-7]/root: sysctl dev.cpu.0.freq dev.cpu.1.freq dev.cpu.2.freq dev.cpu.3.freq dev.cpu.4.freq dev.cpu.5.freq dev.cpu.6.freq dev.cpu.7.freq
                                            dev.cpu.0.freq: 799
                                            dev.cpu.1.freq: 799
                                            dev.cpu.2.freq: 799
                                            dev.cpu.3.freq: 799
                                            dev.cpu.4.freq: 799
                                            dev.cpu.5.freq: 799
                                            dev.cpu.6.freq: 799
                                            dev.cpu.7.freq: 799
                                            [24.03-RELEASE][admin@Router-7]/root: 
                                            

                                            I suspect if I switched over to my Netgate 6100 I would see frequencies routinely above the minimum values.

                                            ☕️

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.