    Netgate Discussion Forum

    Multi-Core Advantages in pfSense

    Hardware
    • tman222

      Hi all,

      I realize this has been discussed a few times on these boards, but I'm curious what the current view is (especially with pfSense now running on FreeBSD 14) of how much advantage there is (if any) to running many slower processor cores vs. a few faster processor cores. For instance, can pfSense and FreeBSD effectively take advantage of 16 cores (or more) for routing/firewalling, or would one be better off getting just 4 or maybe 8 faster processor cores? More specifically, what's the better alternative: having e.g. 16 cores operating at 2 GHz, or 8 cores operating at 2.5 GHz? I understand that there are some packages that are single threaded and thus perform better with faster individual cores, but this is more of a general question on the advantages of parallelism in the current version of pfSense/FreeBSD. Thanks in advance.

      • Dobby_ @tman222

        @tman222

        This may not be as easy or as short to answer as you would like. From my point of view, many things come into play together in this case.

        Xeon E3, Xeon E5 and E7 cores are stronger than the smaller Intel Atom cores, but the Intel Atom cores also differ between generations: the C2000 is outperformed by the C3000 series, and the new C5000 and P5000 series offer more capabilities than both of those.

        FreeBSD is capable of using more than one CPU core and also supports HT and Turbo Boost. The exception is PPPoE, which is single threaded. Without PPPoE, packet processing is multi threaded and a network queue can run on each CPU core, so with many CPU cores and HT you may profit from the extra cores. On top of that, if the packages you are using are multi threaded and you are not using PPPoE, you will see a performance gain for sure.

        If other things come on top of this, such as DPDK-capable CPUs and/or Ethernet ports, you may benefit once more. If QuickAssist is working on both sides of the VPN, you could again see things running more smoothly. All in all, it comes down to which parts play well together and which do not.

        It also depends on other factors. How big is your entire network? How many clients do you have to serve? How many services and servers are inside, and how much traffic? Is there mixed traffic such as WiFi, VoIP, file and mail traffic, and are there one or more DMZs in the network? Is pfSense routing the entire network alone, or are there other routing devices inside? What is the topology of your network (central, decentralized or distributed)? What layers are in the network (core, distribution, access)? What protocols are in use (VRRP, OSPF, eBGP/iBGP, ...)?

        A small Xeon E3-12xxv3 4C/8T at >3.x GHz may be a good choice; it is made for 24/7 operation and supports up to 32 GB of RAM.

        The Intel Atom C3000, with 2 to 16 cores, would sit in the mid range, but for what purpose? What exactly are we talking about here? Only you know!

        The Supermicro SYS-E300-9D-8CN8TP server has 8C/16T running from 2.3 GHz to 3.0 GHz. It can be fitted with U.2 SSDs for cache and log files, a SIM & modem and a WiFi card, all in a small footprint, and a NIC with 2.5 GBit/s ports can be added on top.

        So you can see it is not easy to answer your question briefly unless you provide us with more information.

        #~. @Dobby

        PC Engines APU4D4 - 4 Ports - 4 GB RAM
        Kingston mSATA 256GB - SSD
        Sierra Wireless MC7710 - LTE
        Compex WLE200nx - WiFi
        Sintrones VGB-800 - GPS
        pfSense+ 23.01 (ZFS)

        • tman222 @Dobby_

          @dobby_ - thanks for the detailed reply. What prompted this question was me comparing and contrasting these two 1U systems from Supermicro:

          Intel Atom C3958 based (16C): https://www.supermicro.com/en/products/system/1U/5019/SYS-5019A-FN5T.cfm

          Intel Xeon D-1541 based (8C/16T): https://www.supermicro.com/en/products/system/1U/5018/SYS-5018D-FN4T.cfm

          The Atom is slightly newer, but the Xeon has a faster single core speed. Processor TDP is slightly better on the Atom (31W vs 45W) and the Atom also comes with Quick Assist (QAT). Both systems come with 10Gbit RJ45 interfaces, which is a requirement. Given the choice between these two options, which would you choose and why? I see these priced quite similarly currently.

          The only other requirements (besides having 10Gbit copper ports) are a reasonable thermal budget (50W or less processor TDP would be ideal) and enough power to route WAN speeds between 1Gbit/s and 10Gbit/s. There will also be some remote access VPN via OpenVPN, but only for a handful of clients. Would either of these systems work well, or would you recommend a third alternative?

          Thanks again for your help and insight.

          • SteveITS @tman222

            @tman222 the latter looks like the CPU in the 1541: https://shop.netgate.com/collections/rack-appliances/products/1541-base-pfsense
            So you could compare specs.

            Re: packages, OpenVPN and Snort are single threaded.

            Steve

            Only install packages for your version, or risk breaking it. If yours is older, select it in System/Update/Update Settings.
            When upgrading, let it finish; do not reboot early. Allow 10-15 minutes, or more depending on packages and device speed.

            • Dobby_ @tman222

              @tman222 said in Multi-Core Advantages in pfSense:

              @dobby_ - thanks for the detailed reply. What prompted this question was me comparing and contrasting these two 1U systems from Supermicro:

              Meeting all of your criteria may not be easy.
              This would be my choice from the C3958 series:
              Supermicro SYS-E300-9A-16CN8TP - Intel® Atom® processor C3958 16 Cores, 2.0GHz - 2x 10GbE SFP+, 2x 10GbE LAN, 4x 1GbE LAN, dedicated LAN for IPMI 2.0

              Pros:

              • 4 more LAN ports
                (2x SFP+ / 2x 10GbE / 4x 1 GBit/s)
              • WiFi capable for using the captive portal!

              Cons:

              • 85 watt
              • ~1500 €/$
              • No DPDK
              • No TurboBoost
              • One PCIe slot fewer

              This would be my choice from the Intel Xeon D-xxxx series:
              SuperServer 5019D-FN8TP

              Pros:

              • 2x SFP+ / 2x 10GbE / 4x 1 GBit/s
                (Intel x557 and Intel i350-AM4 are DPDK capable)
              • Intel QAT / AES-NI / TurboBoost / HT onboard
              • SIM & Modem slot + M.2 Slot + miniPCIe slot
              • WiFi capable for using the captive portal!
              • from 2.3GHz to 3.0GHz
              • fast DDR4 RAM

              Cons:

              • 80 Watt
              • High price ~1900 €/$
              • Only one PCIe slot

              If the money is not there, it is often better to wait a few months and save up for something that is not that cheap but is 100% able to deal with 10 GBit/s and is, as far as that can be achieved, future-proof.

              I don't know at the moment how FreeBSD and/or pfSense work together with the performance cores of some CPUs, so it is not that easy to answer, but this can be found out with some deeper research.

              Don't get me wrong here, but if you are using third-party hardware, you may need to fine-tune it to match your company's traffic exactly. On both mainboards the SFP+ ports are connected directly to the Intel SoC, so they can be used as DMZ or LAN ports, and both also offer one PCIe slot, so adding a card if needed is no problem. I know that power consumption today is about more than green thinking, owing to the higher prices in many countries, but if you want to deal with 10 GBit/s and really need it as sustained throughput, power draw is not the point you should be worrying about.

              • tman222

                Thank you @Dobby_ and @SteveITS for your replies and recommendations.

                I did actually look at the Netgate offerings and saw that the SYS-5018D-FN4T has essentially the same specs as the 1541. I also saw that the Netgate 8200 has somewhat similar performance to the 1541, but uses an Atom chip instead (C3758R). This made me wonder how the C3958 would perform in comparison to the C3758R, with twice the number of cores, but operating at a slightly lower frequency (i.e. 2GHz vs. 2.4GHz):

                https://www.intel.com/content/www/us/en/products/compare.html?productIds=97927,204840

                It is a bit tough to find performance numbers on the C3958, but I did see it listed in the benchmark charts of this recent review at ServeTheHome:

                https://www.servethehome.com/supermicro-x12sdv-10c-spt4f-review-intel-xeon-d-1749nt-motherboard/2/

                I realize these benchmarks may not be the best representation of how the CPU would perform in a firewall setting, but the chip does tend to hold up well against some of the Xeon D peers.

                I guess this brings me back to the original question - is there an advantage to having an extra 8 cores, e.g. for packet processing for NIC TX and RX queues so that the overall firewall throughput (pps) increases? I realize for anything that is single threaded there would be a performance hit given the lower single core frequency.

                Thanks again for all your help.

                • Cool_Corona

                  If you run IDS/IPS, then a faster CPU clock is what matters.

                  A 4C Xeon at 3.6 GHz will outperform an 8C CPU at 2.2 GHz any time.

                  • stephenw10 (Netgate Administrator) @tman222

                    @tman222 said in Multi-Core Advantages in pfSense:

                    is there an advantage to having an extra 8 cores, e.g. for packet processing for NIC TX and RX queues

                    It depends on the NICs. For example the ix NICs on the 8200 can use all cores for both Rx and Tx:

                    ix0: <Intel(R) X553 N (SFP+)> mem 0x80400000-0x805fffff,0x80604000-0x80607fff at device 0.0 on pci9
                    ix0: Using 2048 TX descriptors and 2048 RX descriptors
                    ix0: Using 8 RX queues 8 TX queues
                    ix0: Using MSI-X interrupts with 9 vectors
                    ix0: allocated for 8 queues
                    ix0: allocated for 8 rx queues
                    ix0: Ethernet address: 90:ec:77:47:5c:e6
                    ix0: eTrack 0x8000084b PHY FW V65535
                    ix0: netmap queues/slots: TX 8/2048, RX 8/2048
                    

                    But the igc NICs only use 4:

                    igc0: <Intel(R) Ethernet Controller I226-V> mem 0x81300000-0x813fffff,0x81400000-0x81403fff at device 0.0 on pci4
                    igc0: Using 1024 TX descriptors and 1024 RX descriptors
                    igc0: Using 4 RX queues 4 TX queues
                    igc0: Using MSI-X interrupts with 5 vectors
                    igc0: Ethernet address: 90:ec:77:47:5c:e8
                    igc0: netmap queues/slots: TX 4/1024, RX 4/1024
                    

                    However, that's still 8 queues total, so in a router, even if using igc for both WAN and LAN, that could still use 8 cores effectively given multiple connections.
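As a rough illustration of the queue counting described above, here is a minimal Python sketch that reads the "Using N RX queues N TX queues" lines from boot output like the samples quoted in this post. The function names `parse_queues` and `max_busy_cores` are made up for this example, and the idea that each queue can keep at most one core busy is a simplifying assumption, not a statement about how the drivers actually schedule work.

```python
import re

# Sample lines copied from the dmesg output quoted in this thread.
DMESG = """\
ix0: Using 8 RX queues 8 TX queues
igc0: Using 4 RX queues 4 TX queues
"""

# Matches e.g. "ix0: Using 8 RX queues 8 TX queues".
QUEUE_LINE = re.compile(r"^(\w+): Using (\d+) RX queues (\d+) TX queues$")


def parse_queues(text):
    """Return {interface: (rx_queues, tx_queues)} from dmesg-style text."""
    queues = {}
    for line in text.splitlines():
        match = QUEUE_LINE.match(line.strip())
        if match:
            name, rx, tx = match.groups()
            queues[name] = (int(rx), int(tx))
    return queues


def max_busy_cores(rx_queue_counts, cores):
    """Upper bound on busy cores, ASSUMING one core per receive queue."""
    return min(sum(rx_queue_counts), cores)


queues = parse_queues(DMESG)
print(queues)

# Two igc ports (WAN and LAN) with 4 RX queues each on an 8-core system.
igc_rx = queues["igc0"][0]
print(max_busy_cores([igc_rx, igc_rx], cores=8))
```

On a real system the input would come from the boot log rather than a hard-coded string, and the bound is only reached when enough separate connections are active.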

                    Steve

                    • tman222

                      Thanks @Cool_Corona and @stephenw10 for the replies.

                      From what I can tell the NICs I would be using do support a large enough number of RX and TX queues so that all cores could be utilized (from what I read, the X550 based NICs can support up to 64 even). I currently don't have a need for IDS/IPS, but I'm concerned about routing throughput (i.e. pps processing capability). Could a 16 core chip at 2GHz effectively route up to 10Gbit/s given multiple processing cores and NIC RX/TX queues, or would higher single core clock speed ultimately end up being a more important factor? Thanks again for all your help.
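To put the pps question in numbers, the following sketch computes the theoretical packet rate of a 10Gbit/s Ethernet link for the smallest and the largest standard frame size, then divides it by a core count. The even split across cores is an idealizing assumption made only for this example; real traffic will not distribute that cleanly, and the function names are hypothetical.

```python
# Bytes each frame occupies on the wire beyond the frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Theoretical maximum packets per second for a given Ethernet frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def per_core_pps(link_bps, frame_bytes, cores):
    """Packets per second each core must handle, ASSUMING a perfectly even spread."""
    return line_rate_pps(link_bps, frame_bytes) / cores


LINK_BPS = 10e9  # 10Gbit/s

for frame in (64, 1518):  # minimum and maximum standard Ethernet frame sizes
    total = line_rate_pps(LINK_BPS, frame)
    print(
        f"{frame:>5} B frames: {total / 1e6:6.2f} Mpps total, "
        f"{per_core_pps(LINK_BPS, frame, 16) / 1e6:5.2f} Mpps/core on 16 cores, "
        f"{per_core_pps(LINK_BPS, frame, 8) / 1e6:5.2f} Mpps/core on 8 cores"
    )
```

The gap between the two frame sizes is large, so the traffic mix matters at least as much as the core count when judging whether a given chip can route at line rate.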

                      • stephenw10 (Netgate Administrator)

                        It depends what the traffic is. Given a large number of connections and routing/firewalling only then it should be possible to take advantage of a large number of cores.

                        If you want to see the highest throughput when testing from a single client against speedtest.net then fewer cores at a higher frequency is going to give better results.
                        That also applies to traffic passing through any other process that is single threaded, as mentioned. So typically Snort or OpenVPN.
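The difference between a single-client speed test and many connections can be sketched with a toy model of flow-to-queue mapping. This is an illustration under stated assumptions: the CRC32 hash is a stand-in chosen for the example (NICs use their own receive-side hashing), the addresses and ports are made-up documentation values, and `queue_for_flow` and `queue_load` are hypothetical names.

```python
import zlib
from collections import Counter


def queue_for_flow(flow, num_queues):
    """Map a flow tuple to a queue index.

    Stand-in hash (CRC32 of the tuple's text form), NOT the hash a real NIC
    uses. Only the property "same flow -> same queue" matters here.
    """
    return zlib.crc32(repr(flow).encode()) % num_queues


def queue_load(flows, packets_per_flow, num_queues):
    """Count how many packets land on each queue for the given flows."""
    load = Counter()
    for flow in flows:
        load[queue_for_flow(flow, num_queues)] += packets_per_flow
    return load


# One speed-test style connection: every packet shares the same 5-tuple.
single = [("203.0.113.10", 443, "192.0.2.20", 50000, "tcp")]

# Many connections: 256 flows that differ only in the client port.
many = [("203.0.113.10", 443, "192.0.2.20", 50000 + i, "tcp") for i in range(256)]

print("queues used by one flow:  ", len(queue_load(single, 1000, 8)))
print("queues used by many flows:", len(queue_load(many, 1000, 8)))
```

A single flow can only ever occupy one queue in this model, which is why its throughput follows single-core speed, while many flows spread over several queues and can keep more cores busy.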
