Netgate Discussion Forum

How pfSense utilize multicore processors and multi-CPU systems ?

General pfSense Questions
Tags: hardware, multi core, multi cpu, setup, tuning
23 Posts 6 Posters 19.8k Views
    Gertjan @Sergei_Shablovsky
    last edited by Jan 16, 2020, 10:43 AM

    @Sergei_Shablovsky said in How pfSense utilize multicore processors and multi-CPU systems ?:

    So, the question starts with: is pfSense strictly single-threaded?

    The underlying FreeBSD is multicore-aware and multithreaded, as are most of the FreeBSD applications and tools used by pfSense.
    pfSense itself is a web interface that lets you manage all the settings through a GUI instead of the command line. Basically, it's a web interface and a lot of PHP script files (I'm oversimplifying).

    The thing is: are you administering this device yourself, or are you allowing hundreds or thousands of admins to do so? ^^

    See https://www.netgate.com/products/appliances/ to find out what Netgate itself uses for its devices.

    Example :
    Intel(R) Pentium(R) 4 CPU 3.20GHz
    2 CPUs: 1 package(s) x 2 hardware threads
    AES-NI CPU Crypto: No
    This ancient "PC device" (15 years old) handles Gbit connections easily.

    Btw: Netgate (pfSense) doesn't modify the original FreeBSD source a lot. It would be far too much work to bring out newer versions. I guess there will be some patches.

    No "help me" PM's please. Use the forum, the community will thank you.
    Edit : and where are the logs ??

      stephenw10 Netgate Administrator
      last edited by Jan 16, 2020, 6:02 PM

      pfSense is not single threaded. pf is no longer single threaded, so there are certainly advantages to using multiple CPU cores.
      Some things are still single threaded. OpenVPN and PPPoE are two we most commonly see. Some NIC drivers cannot use more than one queue but most now do.
      There's no significant difference between multiple cpus and multiple cores in a single CPU as far as I know.

      Steve

        Sergei_Shablovsky @stephenw10
        last edited by Jan 16, 2020, 11:11 PM

        @stephenw10 said in How pfSense utilize multicore processors and multi-CPU systems ?:

        Some NIC drivers cannot use more than one queue but most now do.

        Where can I find a list of NICs that are able to use multiple threads on FreeBSD?

        —
        CLOSE SKY FOR UKRAINE https://youtu.be/_tU1i8VAdCo !
        Help Ukraine to resist, save civilians people’s lives !
        (Take an active part in public protests, push on Your country’s politics, congressmans, mass media, leaders of opinion.)

          Gertjan
          last edited by Jan 17, 2020, 12:42 AM

          Well ....
          Stay away from Realtek
          Prefer Intel
          and you're good.

            Sergei_Shablovsky @Gertjan
            last edited by Jan 17, 2020, 12:58 PM

            @Gertjan said in How pfSense utilize multicore processors and multi-CPU systems ?:

            Well ....
            Stay away from Realtek
            Prefer Intel
            and you're good.

            Thank you for the advice!

            Are you sure about Intel? Even in the official pfSense documentation, the NIC troubleshooting section lists at least 2 issues linked to Broadcom and 2 issues linked to Intel, and no other NICs.
            From a statistical point of view, this may not be a good result.
            A search on this forum also turns up many issues linked to Intel. Of course, it may simply be that a lot of users prefer Intel NICs, and some of them have issues...

              Sergei_Shablovsky @stephenw10
              last edited by Jan 17, 2020, 1:06 PM

              @stephenw10 said in How pfSense utilize multicore processors and multi-CPU systems ?:

              pfSense is not single threaded. pf is no longer single threaded, so there are certainly advantages to using multiple CPU cores.
              Some things are still single threaded. OpenVPN and PPPoE are two we most commonly see. Some NIC drivers cannot use more than one queue but most now do.
              There's no significant difference between multiple cpus and multiple cores in a single CPU as far as I know.

              Do you mean "no significant difference" from the point of view of the FreeBSD CPU-related kernel drivers that handle application threads?

                NogBadTheBad @Sergei_Shablovsky
                last edited by Jan 17, 2020, 1:26 PM

                @Sergei_Shablovsky said in How pfSense utilize multicore processors and multi-CPU systems ?:

                @stephenw10 said in How pfSense utilize multicore processors and multi-CPU systems ?:

                Some NIC drivers cannot use more than one queue but most now do.

                Where can I find a list of NICs that are able to use multiple threads on FreeBSD?

                The FreeBSD web pages would be a good place to start.

                A lot of the drivers are provided by the chip manufacturers; igb, for example, is written by Intel.

                https://www.freebsd.org/cgi/man.cgi?query=igb&sektion=4&manpath=freebsd-release-ports

                https://www.freebsd.org/releases/11.2R/hardware.html#ethernet

                Andy

                1 x Netgate SG-4860 - 3 x Linksys LGS308P - 1 x Aruba InstantOn AP22

                  stephenw10 Netgate Administrator @Sergei_Shablovsky
                  last edited by Jan 17, 2020, 2:12 PM

                  @Sergei_Shablovsky said in How pfSense utilize multicore processors and multi-CPU systems ?:

                  Are you sure about Intel?

                  Very sure. Use Intel based NICs if you want the least likelihood of seeing issues.

                  Steve

                    Sergei_Shablovsky @stephenw10
                    last edited by Jan 17, 2020, 7:07 PM

                    @stephenw10 said in How pfSense utilize multicore processors and multi-CPU systems ?:

                    Steve

                    Appreciate Your help, Steve! :)

                      Sergei_Shablovsky
                      last edited by Sergei_Shablovsky Dec 16, 2020, 3:31 AM Dec 16, 2020, 3:20 AM

                      Now that FreeBSD and pfSense have each had several major updates, it is time to return to this question.

                      In FreeBSD 9-11 a separate process (interrupt thread) was created for each card
                      (for example, for Intel cards with 2 Ethernet ports):
                      intr{irq273: igb1:que}
                      intr{irq292: igb3:que}
                      ...and so on...

                      And because FreeBSD has (for a long time, BTW!) been unable to spread PPPoE traffic across several threads, on FreeBSD 9-11 these processes were distributed across several cores using cpuset. This worked reasonably well until FreeBSD 12 came in.

                      Now, on FreeBSD 12, they are all grouped together as kernel threads:
                      kernel{if_io_tqg_0}
                      kernel{if_io_tqg_1}
                      kernel{if_io_tqg_2}
                      kernel{if_io_tqg_3}
                      ...and so on...

                      And it looks like there is no way to assign each card to a separate core.

                      As a result, the first core is 75-80% loaded on average and up to 100% loaded at peak traffic.

                      Some people suggest tuning the netisr and iflib settings (sometimes in conjunction with switching hyper-threading OFF):

                      In the loader:
                      net.isr.maxthreads="1024" # Use at most this many CPUs for netisr processing
                      net.isr.bindthreads="1" # Bind netisr threads to CPUs.
                      In sysctl:
                      net.isr.dispatch=deferred # direct / hybrid / deferred // Interrupt handling via multiple CPUs, but with context switch

                      Or, per interface:
                      dev.igb.0.iflib.tx_abdicate=1
                      dev.igb.0.iflib.separate_txrx=1
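
Before changing any of these, it may help to record what the system is currently using. The sketch below only reads values with `sysctl -n`; the function name `show_tunables` is made up for illustration, and the `dev.igb.0.*` names assume the first NIC is an iflib-based igb port, which will not be true on every box.

```shell
#!/bin/sh
# Print the current value of each tunable mentioned in this post.
# Falls back to "(not available)" when a name does not exist on this system
# (for example, dev.igb.0.* on a machine without an igb NIC).
show_tunables() {
    for name in \
        net.isr.maxthreads \
        net.isr.bindthreads \
        net.isr.dispatch \
        dev.igb.0.iflib.tx_abdicate \
        dev.igb.0.iflib.separate_txrx
    do
        value=$(sysctl -n "$name" 2>/dev/null) || value="(not available)"
        printf '%s = %s\n' "$name" "$value"
    done
}

show_tunables
```

Saving this output before and after a change makes it easier to roll back a setting that did not help.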

                      So the question remains: how can load be managed effectively on multi-core, multi-CPU systems?

                      Especially as problems with powerd and the est driver still exist for ALL Intel CPUs (see this thread about SpeedStep and Turbo Boost working together in FreeBSD: https://forum.netgate.com/topic/112201/issue-with-intel-speedstep-settings)

                        Sergei_Shablovsky
                        last edited by Dec 16, 2020, 4:01 AM

                        Also, for your attention, this article about FreeBSD network optimization and tuning: https://calomel.org/freebsd_network_tuning.html

                          Sergei_Shablovsky
                          last edited by Sergei_Shablovsky Dec 17, 2020, 4:14 AM Dec 17, 2020, 4:09 AM

                          And one more example of how to manually bind interrupts to specific CPU cores:

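                          # basedir must point at the directory that holds the map file
                          while IFS=: read -r iface queue cpu || [ -n "$iface" ]; do
                              # skip empty lines
                              [ -n "$iface" ] || continue
                              # look up the IRQ number of this queue in the interrupt table
                              irq=$(vmstat -i | grep -w "${iface}:q${queue}" | cut -f 1 -d ":" | sed "s/irq//g")
                              # nothing to bind if the queue was not found
                              [ -n "$irq" ] || continue
                              echo "Binding ${iface} queue #${queue} (irq ${irq}) -> CPU${cpu}"
                              # pin the IRQ to the chosen CPU
                              cpuset -l "$cpu" -x "$irq"
                          done < "${basedir}/ix_cpu_16core_2nic"

                          Contents of ix_cpu_16core_2nic (one interface:queue:cpu entry per line):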
                           
                          ix0:0:1
                          ix0:1:2
                          ix0:2:3
                          ix0:3:4
                          ix0:4:5
                          ix0:5:6
                          ix0:6:7
                          ix0:7:8
                          ix1:0:9
                          ix1:1:10
                          ix1:2:11
                          ix1:3:12
                          ix1:4:13
                          ix1:5:14
                          ix1:6:15
                          ix1:7:16
                          

                          In total, 8 interrupts per NIC: 1 interrupt per CPU core, with CPU 0 left for dummynet.
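
The map file above follows a regular pattern (consecutive queues go to consecutive CPUs, starting at CPU 1 and continuing across the second NIC), so it could be generated instead of typed by hand. This is only a sketch: `gen_irq_map` and its argument order are hypothetical and not part of the original script.

```shell
#!/bin/sh
# gen_irq_map QUEUES FIRST_CPU IFACE...
# Emit "iface:queue:cpu" lines, assigning consecutive CPUs starting at FIRST_CPU.
gen_irq_map() {
    queues=$1
    cpu=$2
    shift 2
    for iface in "$@"; do
        q=0
        while [ "$q" -lt "$queues" ]; do
            printf '%s:%s:%s\n' "$iface" "$q" "$cpu"
            q=$((q + 1))
            cpu=$((cpu + 1))
        done
    done
}

# Reproduces the 16-line listing above: 8 queues per NIC, CPU 0 left free.
gen_irq_map 8 1 ix0 ix1
```

Redirecting the output to a file gives the binding loop its input. The queue count and first CPU need to match the actual hardware, so check both before binding anything.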

                          (Image: manual interrupts on CPU cores)

                          Source (you will need Google Translate): https://local.com.ua/forum/topic/117570-freebsd-gateway-10g/

                            Sergei_Shablovsky
                            last edited by Sergei_Shablovsky Dec 18, 2020, 3:19 PM Dec 17, 2020, 4:32 AM

                            And lastly, one more interesting article, about binding igb(4) IRQs and dummynet to CPUs: https://dadv.livejournal.com/139366.html (use translate.google.com to read it)

                            In short: because of the way the igb(4) driver binds its queues (the first queue created is bound to the first core, CPU0, and the same is done for every card), the PPPoE/GRE traffic that lands in the first queue of each Intel card is tied to CPU0. Because this traffic is heavy, CPU0 quickly becomes overloaded -> packets are held longer in the NIC buffers -> latency increases dramatically.

                            Another interesting point is the behavior of the FreeBSD system thread that services dummynet: manually binding dummynet to CPU0 decreased the core load from 80% to 0.1%.
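
A rough sketch of how such a binding could be prepared: find the thread ID of the dummynet kernel thread, then hand it to `cpuset -l 0 -t <tid>`. The helper name `tid_by_name` is made up, the sample text is invented, and the column layout (PID, TID, COMM, TDNAME, ...) is an assumption about `procstat -t 0` output that should be checked on the target system. The loop only prints the command instead of running it.

```shell
#!/bin/sh
# tid_by_name NAME: read `procstat -t 0`-style text on stdin and print the
# thread ID (column 2) of every thread whose name (column 4) equals NAME.
tid_by_name() {
    awk -v want="$1" '$4 == want { print $2 }'
}

# Invented sample in the assumed layout; on a live system use:
#   procstat -t 0 | tid_by_name dummynet
sample='  PID    TID COMM                TDNAME              CPU  PRI STATE   WCHAN
    0 100000 kernel              swapper              -1   84 sleep   swapin
    0 100061 kernel              dummynet             -1   -8 sleep   -
    0 100070 kernel              if_io_tqg_0           0  -76 sleep   -'

for tid in $(printf '%s\n' "$sample" | tid_by_name dummynet); do
    # Dry run: print the binding command rather than executing it.
    echo "cpuset -l 0 -t $tid"
done
```

Whether pinning dummynet helps at all depends on the workload; the 80% to 0.1% figure comes from the linked article, not from this sketch.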

                            The article is worth reading.

                              Sergei_Shablovsky
                              last edited by Sergei_Shablovsky Dec 17, 2020, 9:03 AM Dec 17, 2020, 8:54 AM

                              I should note that almost all of these links concern high-load PPPoE/PPTP/GRE traffic in which 90% of the packets are ~600 bytes in size.

                              It would be interesting to read detailed comments from the pfSense developers, even if we are talking about Netgate-branded hardware (SuperMicro motherboard and case, yes?), because the Intel CPUs, FreeBSD, and all the drivers are the same on your own bare metal and on Netgate hardware.

                              And in the near future we will only see frequencies increase, core counts increase, and energy consumption decrease. So the proper use of multi-core CPUs in a specialized solution such as the pfSense "network packet grinder" is still relevant.

                                stephenw10 Netgate Administrator
                                last edited by Dec 17, 2020, 2:04 PM

                                What sort of increase in throughput do you see by applying that?

                                Were you seeing very uneven CPU core loading before applying it?

                                PPPoE is a special case in FreeBSD/pfSense. Only one Rx queue on a NIC will ever be used so only one core.
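
One way to check for uneven loading is to compare the interrupt counters of each queue on a NIC: if almost everything lands on one Rx queue, only one core is doing the work. The sketch below summarizes `vmstat -i`-style text. The helper name `queue_share` is made up, the sample numbers are invented, and the `igb0:rxq0` naming is an assumption about how queues are labeled, which differs between driver versions.

```shell
#!/bin/sh
# queue_share IFACE: read `vmstat -i`-style text on stdin and print, for each
# interrupt source of IFACE, its total count and its share of IFACE's interrupts.
queue_share() {
    awk -v ifc="$1" '
        # field 2 is the source name (e.g. "igb0:rxq0"); the total is the
        # next-to-last field and the rate is the last one
        $1 ~ /^irq[0-9]+:$/ && index($2, ifc ":") == 1 {
            name[++n] = $2
            total[n] = $(NF - 1)
            sum += total[n]
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d %.0f%%\n", name[i], total[i], (sum ? 100 * total[i] / sum : 0)
        }
    '
}

# Invented sample in the assumed format; on a live system use:
#   vmstat -i | queue_share igb0
sample='interrupt                          total       rate
irq264: igb0:rxq0                     900000        450
irq265: igb0:rxq1                      50000         25
irq266: igb0:rxq2                      30000         15
irq267: igb0:rxq3                      20000         10
irq268: igb1:rxq0                     123456         60
Total                                1123456        560'

printf '%s\n' "$sample" | queue_share igb0
```

With the invented sample, the first queue carries 90% of the interrupts, which is the kind of skew a PPPoE WAN would be expected to produce.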

                                Steve

                                  Sergei_Shablovsky
                                  last edited by Sergei_Shablovsky Dec 19, 2020, 7:13 AM Dec 19, 2020, 7:02 AM

                                  I should note that I understand dummynet was written by Luigi Rizzo as a system shaper for emulating low-quality links (with high latency, packet drops, etc.), which were more common in 2008-2010.
                                  Nowadays ALTQ and Netgraph work better on fast 1-10-100G links.

                                  I am not talking specifically about dummynet or PPPoE/GRE, but more about how to load multi-CPU systems effectively, because for firewalls a system with 2-4 Intel CPUs (E or X server series) and independent RAM banks on each CPU IS MORE EFFECTIVE THAN a system based on a single high-frequency CPU.

                                  More effective because it gives the ability to "fine-tune" pfSense (FreeBSD) for professional use cases, for example:

                                  • at a small ISP with a high load of PPPoE/GRE traffic;
                                  • in mid-sized company networks with a lot of small-packet traffic (~500-800 bytes);
                                  • in broadcasting services/platforms aimed at mobile clients (with a lot of reconnections and small packet sizes);
                                  • ...

                                  The initial question in this thread means:
                                  1. How do the processes and FreeBSD services bundled in pfSense utilize the cores and memory in multi-CPU systems? What is their behavior?
                                  2. Once I understand the behavior of each process / system service, I can easily tune pfSense for each use case to achieve MORE BANDWIDTH and LESS LATENCY without spending another $2-3k on a new server + NICs.

                                  From my point of view this is reasonable nowadays, when every company is trying to cut costs on a tight budget due to the economic situation on the one hand, while demand for online services is rapidly increasing (due to COVID-19) on the other.

                                    Sergei_Shablovsky @Sergei_Shablovsky
                                    last edited by Feb 11, 2021, 4:12 AM

                                    @sergei_shablovsky

                                    Hm. Looks like it is hard to find the right answer...

                                    I need to explain the opening question of this topic a little:

                                    Which system is better for network-related operation (i.e. firewall, load balancing, gateway, proxy, media streaming,...):
                                    a) 1 CPU with 4-10 cores, high frequency
                                    b) 2-4 CPUs with 4-6 cores each, mid frequency

                                    And how do the CPU's L2 (2-56 MB) and L3 (2-57 MB) caches affect network-related operation (in cooperation with the NIC)?

                                      Sergei_Shablovsky @Sergei_Shablovsky
                                      last edited by Jun 29, 2022, 3:36 AM

                                      Has anything changed here since FreeBSD 13-based pfSense rolled out? Better CPU utilization? Are more cores better than a higher CPU frequency? Etc...

                                        AndyRH @Sergei_Shablovsky
                                        last edited by Jun 29, 2022, 1:48 PM

                                        @sergei_shablovsky Your question from 2/2021 is slightly flawed. The CPU package count is not relevant. Cores (threads) and frequency are relevant.
                                        For tasks that are single threaded frequency is what you want, for tasks that are multi threaded you want enough threads to allow the concurrency you need. The result is a balance based on your goals. If single threaded tasks are your number one concern, you will lean to frequency at the expense of cores. However if you have several packages and many NICs, you will lean to core count at the expense of frequency because you will have many threads needing to execute at the same time and it is more efficient for the computer to have many threads vs having to share.

                                        I hope that helps.

                                        o||||o
                                        7100-1u

                                          Sergei_Shablovsky
                                          last edited by Sergei_Shablovsky Jun 30, 2022, 12:00 AM Jun 29, 2022, 11:57 PM

                                          First of all, let me say BIG THANKS for the brief but "all you need" answer!
                                          And for your patience, because I think this is one of the most frequently asked questions on this forum (and on FreeBSD.org as well). :) You have my respect!

                                          @andyrh said in How pfSense utilize multicore processors and multi-CPU systems ?:

                                          @sergei_shablovsky Your question from 2/2021 is slightly flawed. The CPU package count is not relevant. Cores (threads) and frequency are relevant.

                                          Because most rack servers have 2 CPU packages, this is something like the default for users who prefer their own rack-mounted hardware.
                                          But for Netgate's own branded firewalls this may not be the default, because most of their motherboards (I remember Netgate using Lanner boards in early models and later switching to SuperMicro; are they still using SuperMicro?) have 1 CPU package. In that case there is only one way to increase performance as your bandwidth grows: a faster processor with a large L3 cache.

                                          For tasks that are single threaded frequency is what you want, for tasks that are multi threaded you want enough threads to allow the concurrency you need.

                                          So, might it be a good idea for Netgate to create a reference table for users who prefer their own hardware, in which some characteristics (single/multithreaded, CPU frequency requirements, memory) are linked to the main services and to the additional packages (like Snort, Suricata, etc.) that require a lot of system resources?
                                          It would help in choosing the right NICs and server platform.
                                          Reasonable?

                                          The result is a balance based on your goals. If single threaded tasks are your number one concern, you will lean to frequency at the expense of cores. However if you have several packages and many NICs, you will lean to core count at the expense of frequency because you will have many threads needing to execute at the same time and it is more efficient for the computer to have many threads vs having to share.

                                          The initial question in this topic combines two questions:

                                          • how the FreeBSD kernel manages threads across CPU packages in systems with multiple packages (2, 4), given that the bus between the CPU packages is involved, as well as the NUMA/non-NUMA/chipkill memory configuration in the BIOS, etc.;
                                          • how the different packages/services in pfSense depend on NIC configuration and tuning, and on the number and frequency of the CPU packages.

                                          I hope that helps.

                                          Really, thank You.

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.