Netgate Discussion Forum

    Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?

    • stephenw10S Online
      stephenw10 Netgate Administrator
      last edited by

      Hmm, significant then. 🤔

      • RobbieTTR Offline
        RobbieTT @InstanceExtension
        last edited by

        @InstanceExtension

        I've not experienced anything like that, albeit I have more cores. I do have HyperThreading off though, so I wonder if that makes a difference?

        ☕️

        • I Offline
          InstanceExtension @RobbieTT
          last edited by

          @RobbieTT On the LAN side everything is 10Gb high powered devices, so maybe I'm just pushing the router a bit more.

          • RobbieTTR Offline
            RobbieTT @InstanceExtension
            last edited by

            @InstanceExtension Perhaps it is a difference in workload but my LANs are also 10GbE, plus I use bi-directional FQ_CoDel and I have the additional pfSense burden of limiting my PPPoE WAN to a single core.

            ☕️

            • T Offline
              tman222
              last edited by

              It's definitely a trade-off, and the type of workload / environment matters as well. By favoring power savings over performance I also noticed an increase in latency, but it was very small: maybe an additional 200 microseconds (0.2 ms) or so to the first hop (i.e. the dpinger gateway ping), which is probably due to the CPU sitting at min frequency the majority of the time. On the WAN side I have a symmetric 2Gbit fiber circuit and have not seen any increase in latency or decrease in performance there.

              Where I have noticed a decrease in performance is when running an iperf3 test between two 10Gbit hosts located on different internal network segments (i.e. basically just L3 routing). Prior to adjusting speed shift settings, I could max out the bandwidth (~9.4Gbit/s) between the hosts with one iperf3 stream (P=1). By favoring power savings, I see closer to 8.5-9.0Gbit/s (on average) now with a single stream. Increasing the streams to two or more maxes out the bandwidth again. Perhaps the load of just one stream (or maybe iperf3 in general) isn't great enough to push the CPU into the highest frequencies when power saving is favored.

              In any case, it's currently not limiting in any way because I don't have a need to route at 10Gbit line speed. Once WAN speeds increase to that level I will probably have to adjust / tweak the speed shift settings again. In the meantime, with a system that's fairly lightly loaded most of the time, I'm fine with accepting a slight increase in latency and slight decrease in performance in exchange for 2-3 degrees C lower temperatures (on average), along with decreased power consumption.
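
For anyone wanting to repeat the single-stream vs multi-stream comparison described above, here is a minimal sketch. It assumes an iperf3 server (`iperf3 -s`) is already running on the host at the far side of the router. The function name, the `IPERF3` override variable, the 10 second duration and the 1/2/4 stream counts are my own choices for illustration, not something from the post.

```shell
# Run an iperf3 client test with 1, 2 and 4 parallel streams in turn.
# $1 = address of the iperf3 server on the other network segment
#      (substitute your own host).
# IPERF3 is a hypothetical override so the binary can be swapped out.
stream_sweep() {
    server=$1
    for streams in 1 2 4; do
        echo "== $streams stream(s) =="
        # -c: client mode, -P: number of parallel streams, -t: seconds to run
        ${IPERF3:-iperf3} -c "$server" -P "$streams" -t 10
    done
}
```

Calling `stream_sweep 192.0.2.10` (an example address) and comparing the summary lines should show whether one stream alone falls short of what two or more achieve.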

              • RobbieTTR Offline
                RobbieTT @tman222
                last edited by

                @tman222
                With mine set at 80% I don't see any change in throughput when routing, with 1 or 4 streams. It is a pure flat line at 9.90 Gbps.

                If I generate 10GbE using the router as the server or client I can detect a small ripple in the graph with 1 stream but the 9.90 Gbps average remains:

                 2023-12-16 at 12.10.38.png

                4 streams is still a flat line though:

                 2023-12-16 at 12.02.03.png

                If I simultaneously ping a gateway using the same link I do see a small increase, as you would expect when saturating it with an iPerf test:

                 2023-12-16 at 12.14.11.png

                ☕️

                • T Offline
                  tman222
                  last edited by

                  Hi @RobbieTT - that's interesting! Are those tests done using 1500 byte packets or a larger MTU? Do you get similar results if testing with a client behind the router vs. from the router?

                  • RobbieTTR Offline
                    RobbieTT @tman222
                    last edited by

                    @tman222

                    Going via the router or to or from the router makes no difference at 10 GbE, other than the slight ripple you can see in the graphs when you are getting the router to produce the traffic.

                    Same for larger MTU (which I make use of) as the PPS is high enough for small packets within the 10 GbE cap I run at.

                    I don't have suitable equipment to produce results at 25 GbE but I would expect to see a clear delta between pure LAN to LAN routing and generating traffic from the router, like you do on low-power CPUs (albeit at much lower speeds in those cases), when running above 10 GbE.

                    Real-world traffic through the WAN would be a lot lower due to the firewall, PPPoE limits and the limitations of pfSense/BSD at high bandwidths. I don't have a trustworthy means of testing, just enough to produce indicative results.

                    The platform and pfSense could do more if Netgate invests time and effort into using QAT in userspace as, at the moment, pfSense is still using the CPU for things that really should be directed at the dedicated silicon. To still have the PPPoE weakness after all these years is another black mark against pfSense/BSD.

                    ☕️

                    • B Offline
                      bigjohns97
                      last edited by bigjohns97

                      Figured I would join in on the fun.

                      I do notice a little added latency on speedtest.net going from 4-5ms to 7-8ms ping on download but in web browsing and downloading I still get the same speeds.

                      As you can see from my graphs, setting it to 80 allows the CPU to clock down to 800 MHz when idle and only scale up to 4.6 GHz when under regular speed test load.
                      f67caac9-3ef9-40e4-8cf8-16a1ce7f6a8c-image.png

                      This also has a nice effect on temps.
                      eb9a7e0c-4e78-4a67-a092-00eed0355102-image.png

                      Before, you can see the clock speed was a constant 4.6 GHz (Intel 9700K) and it pretty much never left that speed, at least whenever telegraf queried it.

                      I am going to leave this at 80 and see how it feels during normal usage. Having this level of control over CPU usage really shifts the mindset towards having a much more powerful CPU, since it is there when you need it and doesn't really cost much to have it there and waiting.

                      BTW - this is with telegraf, pfblockerng, suricata, and ntopng all running in the background, which was probably what held the CPU at 4.6 GHz @ 50.
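
If you want to experiment with values like 50 and 80 from a shell rather than the GUI, here is a rough sketch. It assumes the setting discussed here corresponds to FreeBSD's `dev.hwpstate_intel.N.epp` sysctl (0 favours performance, 100 favours energy efficiency), which the thread itself does not confirm. The helper name is made up. It only prints the commands so you can check them first, and anything set this way will not survive a reboot.

```shell
# Print (do not run) one sysctl command per core to set the EPP hint.
# Assumption: the Speed Shift value maps to dev.hwpstate_intel.N.epp.
# $1 = EPP value, integer 0-100; $2 = number of cores
epp_commands() {
    epp=$1
    cores=$2
    # Reject empty or non-numeric input, then anything above 100.
    case $epp in
        ''|*[!0-9]*) echo "EPP must be an integer 0-100" >&2; return 1 ;;
    esac
    [ "$epp" -le 100 ] || { echo "EPP must be an integer 0-100" >&2; return 1; }
    i=0
    while [ "$i" -lt "$cores" ]; do
        echo "sysctl dev.hwpstate_intel.$i.epp=$epp"
        i=$((i + 1))
    done
}
```

For example, `epp_commands 80 8` lists the eight commands for an 8-core part; piping that into `sh` as root would apply them.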

                      • T Offline
                        tman222 @bigjohns97
                        last edited by

                        @bigjohns97 - thanks for sharing those results. I'd be curious - since you have an Intel "K" CPU, do you have MultiCore Enhancement (or similar name; this is what Asus calls it) enabled in the motherboard's BIOS? As I understand it, this setting turbos the cores more aggressively (i.e. allows all cores to run at maximum turbo frequency) and is now often enabled by default. Overall I can see this being useful for gaming or other heavily CPU-bound workloads, but maybe not for a router/firewall. I did disable this setting recently for a desktop machine and temps dropped another 2-3 degrees Celsius with no noticeable impact on performance.

                        • B Offline
                          bigjohns97 @tman222
                          last edited by bigjohns97

                          @tman222 I do have multicore enhancement enabled in the BIOS; I want the CPU to be able to scale as fast as Intel intended, but only when it is absolutely necessary.

                          I am hoping this 80 setting does exactly that.

                          All my pfSense machines are old gaming boxes :)

                          • B Offline
                            bigjohns97
                            last edited by

                            After close to 24 hours, here are the high-level changes that moving from 50 to 80 made on my 9700K.

                            Temps dropped around 5 degrees on low load / idle and almost 10 degrees under load.

                            Before
                            4704ea02-c1c7-4f2e-a2ec-e38f044fa90c-image.png

                            After
                            fb592148-cecf-44b3-b614-154575e7b63b-image.png

                            Clocks were much more balanced across the available frequencies

                            Before
                            a2825e31-8122-4309-9376-e2979e2cff84-image.png

                            After
                            209b1966-7525-4d41-8cab-64180686c485-image.png

                            Notice that before it was pretty much pegged at 4.6 GHz all the time; now it can still hit that mark, but the average is much closer to the middle of the range and it is very balanced.

                            I couldn't be happier with this setting of 80, appreciate the thread everyone.

                            Happy New Year!

                            • Sergei_ShablovskyS Offline
                              Sergei_Shablovsky @RobbieTT
                              last edited by Sergei_Shablovsky

                              @RobbieTT said in Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?:

                              @tman222
                              With mine set at 80% I don't see any change in throughput when routing, with 1 or 4 streams. It is a pure flat line at 9.90 Gbps.

                              If I generate 10GbE using the router as the server or client I can detect a small ripple in the graph with 1 stream but the 9.90 Gbps average remains:

                               2023-12-16 at 12.10.38.png

                              4 streams is still a flat line though:

                               2023-12-16 at 12.02.03.png

                              If I simultaneously ping a gateway using the same link I do see a small increase, as you would expect when saturating it with an iPerf test:

                               2023-12-16 at 12.14.11.png

                              @RobbieTT
                              Please, is the last screenshot from PingPlotter for Mac?

                              If the answer is “yes”, please tell me how accurate you find it (when running as a service) compared with SmokePing?

                              (I know these are different classes of software, and that the best thing PingPlotter has over SmokePing is scripted, easy-to-use reactions to triggers (and forward pages in a doc…), but anyway…)

                              —
                              CLOSE SKY FOR UKRAINE https://youtu.be/_tU1i8VAdCo !
                              Help Ukraine to resist, save civilians people’s lives !
                              (Take an active part in public protests, push on Your country’s politics, congressmans, mass media, leaders of opinion.)

                              • RobbieTTR Offline
                                RobbieTT @Sergei_Shablovsky
                                last edited by

                                @Sergei_Shablovsky said in Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?:

                                @RobbieTT
                                Please, is the last screenshot the PingPlotter for Mac?

                                If answer is “yes”, please told me how You find it ACCURATE (if RUNNING AS A SERVICE) rater than SmokePing ?

                                (I know that this is different class of software, and the best thing that PingPlotter opposite to SmokePing

                                Yes, PingPlotter on my Mac server so it runs 24/7 on my WAN and occasionally to other things under test. I don't have an opinion on the choices out there or how they compare with it as I have just run this for years.

                                ☕️

                                • Sergei_ShablovskyS Offline
                                  Sergei_Shablovsky @RobbieTT
                                  last edited by

                                  @RobbieTT said in Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?:

                                  @Sergei_Shablovsky said in Enhanced Intel SpeedStep / Speed Shift - Are they fully supported?:

                                  @RobbieTT
                                  Please, is the last screenshot the PingPlotter for Mac?

                                  If answer is “yes”, please told me how You find it ACCURATE (if RUNNING AS A SERVICE) rater than SmokePing ?

                                  (I know that this is different class of software, and the best thing that PingPlotter opposite to SmokePing

                                  Yes, PingPlotter on my Mac server so it runs 24/7 on my WAN and occasionally to other things under test. I don't have an opinion on the choices out there or how they compare with it as I have just run this for years.

                                  Thank you for answering!

                                  PingPlotter really is a must-have for a Mac server. ;)

                                  So, do you do any scripting with it?

                                  Do you run PingPlotter as a service?


                                  • C Offline
                                    chrcoluk
                                    last edited by

                                    On an N100 it seems impossible to get it to idle at min speeds; even when set to 100 (with per-core control), it will idle at around 1.4 GHz.

                                    However, it needs to be no higher than about 70 to reliably exceed 2.5 GHz clock speeds under an openssl benchmark.

                                    I might need to lower it further though, as I have been seeing small CPU spikes, enough to cause noticeable jitter on traffic, which I suspect is down to the CPU not clocking up enough.

                                    It is PPPoE so no doubt that is making things harder.

                                    pfSense CE 2.8.1

                                    • stephenw10S Online
                                      stephenw10 Netgate Administrator
                                      last edited by

                                      How are you checking? The GUI code will always make it ramp up. In my testing, even running sysctl increased the frequency on each core as it was read.

                                      • RobbieTTR Offline
                                        RobbieTT @stephenw10
                                        last edited by

                                        @stephenw10
                                        Surely that will be CPU & load specific, with weaker CPUs needing to work harder with minor tasks?

                                        ☕️

                                        • stephenw10S Online
                                          stephenw10 Netgate Administrator
                                          last edited by

                                          Yup. However SpeedShift appears so fast (sensitive) that even on something relatively powerful it will react where SpeedStep would not.

                                          • C Offline
                                            chrcoluk @stephenw10
                                            last edited by

                                            @stephenw10 With the sysctl freq variables.

                                            Ironically, I have now dropped it from 70 to 60, as 70 wasn't ramping up to the highest clock speed during high throughput whilst 60 does.

                                            These are the lowest clocks I have managed to see; I managed to get 1 core below 1 GHz :) This is by luck, usually it's above 1.4 GHz.

                                            Also, I was logged out of the UI.

                                             # sysctl dev.cpu.0.freq dev.cpu.1.freq dev.cpu.2.freq dev.cpu.3.freq
                                            dev.cpu.0.freq: 926
                                            dev.cpu.1.freq: 1125
                                            dev.cpu.2.freq: 1337
                                            dev.cpu.3.freq: 1410
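
To make readings like these easier to compare between settings, here is a small sketch that condenses `dev.cpu.N.freq` lines into a min / max / average. The function name is made up and the average is truncated to a whole MHz. As noted earlier in the thread, the act of reading the sysctl may itself nudge the clocks up, so treat the numbers as indicative only.

```shell
# Summarise per-core frequencies from sysctl-style output on stdin.
# Only lines of the form "dev.cpu.N.freq: VALUE" are counted; other
# lines (e.g. dev.cpu.N.freq_levels) are ignored.
freq_summary() {
    awk -F': ' '
        /^dev\.cpu\.[0-9]+\.freq: / {
            f = $2 + 0
            if (n == 0 || f < min) min = f
            if (n == 0 || f > max) max = f
            sum += f
            n++
        }
        END {
            if (n > 0)
                printf "min=%d max=%d avg=%d cores=%d\n", min, max, int(sum / n), n
        }'
}
```

On the box itself this would be used as `sysctl dev.cpu | freq_summary`. Fed the four readings above, it reports a minimum of 926, a maximum of 1410 and an average of 1199 MHz.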
                                            

                                            pfSense CE 2.8.1

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.