Netgate Discussion Forum

    10 GBit questions

    General pfSense Questions
    25 Posts 6 Posters 1.3k Views
    • R
      rvdbijl @tman222
      last edited by

      @tman222 It's an old embedded system (Advantech Uno) with a PCIe x8 slot that the network board is mounted in. I got it for free a few years ago, and it was a good upgrade for my old Atom-based pfSense box when my network was 500MBit/50MBit and it was straining to keep up, especially with OpenVPN.
      Then I switched ISPs to 1G/1G, and it still kept up, seemingly with no issue (note that I do not run snort, or any other packages that could slow things down -- just firewall and some OpenVPN, DHCP and DNS). Now I'm at 2G/2G and I'm only seeing 1.1G of that with the Uno box (using Speedtest).

      I don't think it's constrained on its PCIe bus - it claims to be connected to the board at 8x link, etc. Also -- I can pump data TO it at 5-6 GBit/s based on iperf3. Just FROM it seems to be limited to 1.2 GBit/s (with iperf3).

      Either way -- I don't want to drop a ton of money on a new system ($500-$800 would be my limit), so the Xeon D-1718T is a bit out of my price range. I did just buy an older Dell R210-II with Xeon E3-1275 v2 on it off eBay for a little under $300, shipped. When that arrives, I'll give it a try. I know it's going to consume more power than my current system, but I can't justify >$1000 right now. If the Dell fails in getting me the full 2 GBit bandwidth (with some margin), I'll look for a faster system.

      • T
        tman222 @rvdbijl
        last edited by tman222

        @rvdbijl - that Dell R210-II system should be more than capable based on the CPU specs I can see. Also, I agree with you that it sounds like there is enough PCI Express bandwidth with the x8 slot on your current system.
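
As a rough sanity check on the PCIe point, here is a back-of-the-envelope sketch. It assumes the slot runs at PCIe 2.0 speed (5.0 GT/s per lane with 8b/10b encoding) across 8 lanes, which is what the ix driver reports in the boot log posted later in this thread. The function name and the dual-port 10GbE figure are illustrative assumptions, and packet (TLP) overhead is ignored, so the real usable number is somewhat lower.

```python
# Rough PCIe bandwidth estimate per direction.
# Assumption: 5.0 GT/s per lane is PCIe 2.0, which uses 8b/10b line encoding.

def pcie_gbit_per_s(gt_per_s: float, lanes: int, encoding_efficiency: float) -> float:
    """Line rate in Gbit/s per direction, before packet (TLP) overhead."""
    return gt_per_s * encoding_efficiency * lanes

PCIE2_EFFICIENCY = 8 / 10  # 8b/10b: 8 data bits for every 10 bits on the wire

slot = pcie_gbit_per_s(5.0, 8, PCIE2_EFFICIENCY)
nic_ports = 2 * 10.0  # assumption: dual-port 10GbE card, both ports at line rate

print(f"PCIe 2.0 x8 slot: {slot:.0f} Gbit/s per direction")
print(f"Two 10GbE ports:  {nic_ports:.0f} Gbit/s")
print(f"Headroom:         {slot - nic_ports:.0f} Gbit/s")
```

Even with protocol overhead subtracted, a limit of around 1.2 GBit/s in one direction is far below what the slot can carry, so the bus is an unlikely culprit.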

        There may be another performance limitation / some network tuning required to try to overcome the current limit you're seeing. A couple good links to check out:

        https://calomel.org/freebsd_network_tuning.html
        https://wiki.freebsd.org/NetworkPerformanceTuning
        https://docs.netgate.com/pfsense/en/latest/hardware/tune.html

        Hope this helps.

        • Dobby_D
          Dobby_ @rvdbijl
          last edited by

          @rvdbijl said in 10 GBit questions:

          Looks like those *NT options add quite a bit to the cost ...

          If you can get your hands on used hardware and there
          are two choices, one with an IT CPU and one with an N or NT
          CPU, in nearly the same price range (say something
          around $500), my tip would be to go with the
          N or NT boards (CPUs) over the IT ones.

          At least, that I've found. Also to TDP (which I guess
          makes sense).

          If this is even more important to you, you
          could also consider getting a small
          Intel Denverton (C3000) board. 4 cores are enough
          for your 1 GBit/s internet line if you are not using PPPoE.
          Board ~389 plus shipping fee
          case

          On top of this you may then also need:

          • RAM 2x DDR4 2400MHz
          • M.2 SSD
          • 2.5 GBit/s NIC

          It looks like the 2123IT has everything except QAT, which isn't implemented in CE 2.6.0 (which I run). So it doesn't sound like I should expect a big change between an *IT and *NT CPU, right?

          With pfSense CE you may also be able to use QAT, but
          it is not as easy to set up as with pfSense Plus.

          The N or NT parts give you the full set of the most commonly
          wanted or "needed" features, though this differs from user to user:

          • QAT
          • AES-NI
          • TurboBoost
          • HT
          • Support DPDK

          These are usually the main points everyone is
          talking about.

          #~. @Dobby

          Turris Omnia - 4 Ports - 2 GB RAM / TurrisOS 7 Release (Btrfs)
          PC Engines APU4D4 - 4 Ports - 4 GB RAM / pfSense CE 2.7.2 Release (ZFS)
          PC Engines APU6B4 - 4 Ports - 4 GB RAM / pfSense+ (Plus) 24.03_1 Release (ZFS)

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            You should make sure the 10G NICs show the expected number of queues when they attach at boot. Especially since you're seeing traffic limited in one direction.

            • R
              rvdbijl @stephenw10
              last edited by stephenw10

              @stephenw10 said in 10 GBit questions:

              You should make sure the 10G NICs show the expected number of queues when they attach at boot. Especially since you're seeing traffic limited in one direction.

              It looks like 2 queues are being allocated for both ix0/ix1 interfaces. Not sure if that's what it's supposed to be:

              ix1: netmap queues/slots: TX 2/2048, RX 2/2048
              ix1: eTrack 0x80000528 PHY FW V286
              ix1: PCI Express Bus: Speed 5.0GT/s Width x8
              ix1: Ethernet address: 80:61:5f:0e:8c:25
              ix1: allocated for 2 rx queues
              ix1: allocated for 2 queues
              ix1: Using MSI-X interrupts with 3 vectors
              ix1: Using 2 RX queues 2 TX queues
              ix1: Using 2048 TX descriptors and 2048 RX descriptors
              ix1: <Intel(R) X540-AT2> mem 0xf0000000-0xf01fffff,0xf0400000-0xf0403fff irq 18 at device 0.1 on pci2
              ix0: netmap queues/slots: TX 2/2048, RX 2/2048
              ix0: eTrack 0x80000528 PHY FW V286
              ix0: PCI Express Bus: Speed 5.0GT/s Width x8
              ix0: Ethernet address: 80:61:5f:0e:8c:24
              ix0: allocated for 2 rx queues
              ix0: allocated for 2 queues
              ix0: Using MSI-X interrupts with 3 vectors
              ix0: Using 2 RX queues 2 TX queues
              ix0: Using 2048 TX descriptors and 2048 RX descriptors
              ix0: <Intel(R) X540-AT2> mem 0xf0200000-0xf03fffff,0xf0404000-0xf0407fff irq 17 at device 0.0 on pci2
              
              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                Yeah, that's probably correct. Though that CPU looks like it's 2 cores with 2 threads per core, so 4 virtual cores if hyper-threading is enabled. If it shows 4 CPUs I'd expect 4 queues.

                It's the same number of queues for Tx and Rx though so that doesn't look like a problem.
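
For anyone wanting to repeat this check, here is a minimal sketch that pulls the queue counts out of the boot messages and prints them next to the logical CPU count. The sample lines are copied from the log quoted above; the regular expression, the function name and the hard-coded CPU count are illustrative assumptions, and one queue per logical CPU is only the expectation stated in this post, not something the driver guarantees.

```python
import re

# Matches lines such as "ix1: Using 2 RX queues 2 TX queues".
QUEUE_LINE = re.compile(r"^\s*(\w+): Using (\d+) RX queues (\d+) TX queues")

def queues_per_nic(boot_log: str) -> dict:
    """Return {interface: (rx_queues, tx_queues)} parsed from boot messages."""
    found = {}
    for line in boot_log.splitlines():
        match = QUEUE_LINE.match(line)
        if match:
            found[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return found

# Sample lines taken from the boot log quoted earlier in the thread.
SAMPLE_LOG = """\
ix1: allocated for 2 rx queues
ix1: Using MSI-X interrupts with 3 vectors
ix1: Using 2 RX queues 2 TX queues
ix0: allocated for 2 rx queues
ix0: Using MSI-X interrupts with 3 vectors
ix0: Using 2 RX queues 2 TX queues
"""

CPU_COUNT = 4  # assumption: 2 cores x 2 threads with hyper-threading enabled

for nic, (rx, tx) in sorted(queues_per_nic(SAMPLE_LOG).items()):
    print(f"{nic}: {rx} RX / {tx} TX queues, {CPU_COUNT} logical CPUs")
```

On a live system the same function could be fed the saved boot messages instead of the sample string.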

                • R
                  rvdbijl @stephenw10
                  last edited by

                  @stephenw10 said in 10 GBit questions:

                  Yeah, that's probably correct. Though that CPU looks like it's 2 cores with 2 threads per core, so 4 virtual cores if hyper-threading is enabled. If it shows 4 CPUs I'd expect 4 queues.

                  I was wondering about that myself -- I did turn Hyperthreading off to test to see if it got faster, but there was no appreciable difference. So I turned it back on. This is the log from the last boot and shows 2 queues, despite there being 4 cores (HT).

                  It's the same number of queues for Tx and Rx though so that doesn't look like a problem.

                  So it sounds like there aren't many more avenues to optimize this box. For whatever reason it's just not able to handle > 1GBit. My new box should be here in a week or so. Hopefully that one will run a lot faster...

                  (I did go through the optimization articles that were mentioned earlier, but none of the tricks made it any faster. Some actually made it slower -- like turning off HW offload options).

                  Thanks all for the tips and suggestions!

                  • Dobby_D
                    Dobby_ @rvdbijl
                    last edited by

                    @rvdbijl

                    The Xeon D2123IT

                    4 Cores
                    8 Threads

                    max 3.0GHz

                    TurboBoost
                    HyperThreading
                    AES-NI
                    DPDK?

                    4 out of 5!

                    If you are not using PPPoE it will saturate a 1 GBit/s
                    line with ease. In theory it should then be able to
                    support 8 queues, and you can also "tune" the:

                    • queue size
                    • queue length
                    • number of queues, depending on the CPU's cores/threads ("C/T")

                    Perhaps you can report back here once that box
                    has arrived.

                    • R
                      rvdbijl @Dobby_
                      last edited by

                      @dobby_
                      I ended up going with the Dell R210-II system with Xeon 1275v2 CPU. I'll be more than happy to report to this thread once I have some measurements with iperf3 / Speedtest!

                      • S
                        SpaceBass @tman222
                        last edited by

                        @tman222 said in 10 GBit questions:

                        Dell R210-II system should be more than capable

                        In my testing, the R2x series cannot move more than about 1.8-2Gbps ... the CPUs simply max out on single thread routing

                        • Dobby_D
                          Dobby_ @rvdbijl
                          last edited by

                          @rvdbijl said in 10 GBit questions:

                          Dell R210-II system with Xeon 1275v2 CPU

                          3.5 - 3.9 GHz
                          CPU 4C/8T
                          AES-NI
                          TurboBoost
                          Hyperthreading

                          It may also be an interesting choice! If you are not forced
                          to use PPPoE, it can be significantly faster than you might imagine.

                          • R
                            rvdbijl @SpaceBass
                            last edited by

                            @spacebass said in 10 GBit questions:

                            @tman222 said in 10 GBit questions:

                            Dell R210-II system should be more than capable

                            In my testing, the R2x series cannot move more than about 1.8-2Gbps ... the CPUs simply max out on single thread routing

                            What CPU did you test with on the R210-ii?

                            • S
                              SpaceBass @rvdbijl
                              last edited by

                              @rvdbijl 1270 v5

                              • R
                                rvdbijl @SpaceBass
                                last edited by

                                @spacebass
                                The v5's work on the R210-ii? From what I read, it only supports up to the E3-12xx v2 series ...

                                • R
                                  rvdbijl @rvdbijl
                                  last edited by

                                  To wrap up this post -- I have my R210-II in with the E3-1275v2 CPU and 16GB RAM. I loaded pfSense, restored my backup and replaced the old i7 box with this one. I haven't tested with a 10G to 10G connection yet, but the 10G to 2.5G connection on two of my PCs seems to be able to push 2Gbit up and down to my ISP with no issues. I'll do some more benchmarking and post the results in the next few days.

                                  Very happy that this box also seems to use ~50W while running, and is quiet as a mouse (once the fans have done their test when the system boots).

                                  • R
                                    rvdbijl @rvdbijl
                                    last edited by

                                    And here are some results --
                                    e2b2e7f7-289a-47f5-840e-753245fbd218-image.png

                                    on a 2.5Gbit NIC and on a 10GBit NIC the performance is identical. I have no (easy) way to get more bandwidth on the WAN side of my connection, so I can't test beyond that. What I can see is how busy the box is while doing this Speedtest:

                                    68899383-8cf7-44b1-b559-33e8be1ab2a4-image.png

                                    Not too shabby .. Looks like there is some more headroom.

                                    Doing iperf3 testing is a bit more .. interesting. On a 10Gbit NIC I see this with pfSense as the client and my 10Gbit NIC PC as the server (the server is a Core i7-10700T running Win10):

                                    d66ee6ea-2b7e-4cc9-a79e-894056d34c08-image.png

                                    And the process running iperf3 is using ~18% CPU:
                                    9262b51d-1eef-415b-93f1-7f9590003463-image.png

                                    The reverse path is worse (pfSense as server, 10Gbit NIC as client):
                                    a550cddd-ab19-41a8-bf37-b463e296d1ef-image.png

                                    With the utilization here:
                                    77281b12-f802-483e-91ee-42f33770c4ed-image.png

                                    That is weird -- why does this path suck up 42-43% of CPU?

                                    Multiple parallel threads don't seem to help here either. I'm guessing that this is some strange artifact of iperf3 running on the pfSense box?

                                    In any case, I don't see this while running a speedtest in either direction (up or down) to my ISP. Solid 2Gbit as they promised. We'll see how this box does if/when my ISP raises speeds again. ;) I may try a VLAN-VLAN routing through this box and see how much data I can push, but that'll require another 10 Gbit NIC which I don't have .. yet .. ;)

                                    Hope this benchmark data is at least helpful to some folks.

                                    • stephenw10S
                                      stephenw10 Netgate Administrator
                                      last edited by

                                      iperf is deliberately single threaded no matter how many parallel streams you set. To use multiple cores you need to run multiple instances of iperf.
                                      iperf sends traffic from the client to the server by default, so you're seeing worse performance when pfSense is receiving. Some of the hardware off-loading options may improve that.
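
A minimal sketch of the "multiple instances" approach: each iperf3 server instance listens on its own port and each client instance connects to one of them, so the processes can land on different cores. It only builds and prints the command lines; the helper names, the instance count and the address 192.168.1.1 are hypothetical. The flags used (`-s`, `-c`, `-p`, `-t`, `-R`) are standard iperf3 options, and 5201 is the iperf3 default port.

```python
import shlex

def iperf3_server_commands(instances: int, base_port: int = 5201) -> list:
    """One iperf3 server process per port, starting at base_port."""
    return [["iperf3", "-s", "-p", str(base_port + i)] for i in range(instances)]

def iperf3_client_commands(server: str, instances: int, base_port: int = 5201,
                           seconds: int = 10, reverse: bool = False) -> list:
    """One iperf3 client process per server port.

    reverse=True adds -R so that the server sends and the client receives,
    flipping the default client-to-server direction.
    """
    commands = []
    for i in range(instances):
        cmd = ["iperf3", "-c", server, "-p", str(base_port + i), "-t", str(seconds)]
        if reverse:
            cmd.append("-R")
        commands.append(cmd)
    return commands

# Hypothetical example: 4 instances against a server at 192.168.1.1.
for cmd in iperf3_server_commands(4):
    print("server:", shlex.join(cmd))
for cmd in iperf3_client_commands("192.168.1.1", 4, reverse=True):
    print("client:", shlex.join(cmd))
```

The client commands need to be started at the same time (for example with `subprocess.Popen` or from separate shells) and their reported throughputs summed to get the aggregate figure.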

                                      • johnpozJ
                                        johnpoz LAYER 8 Global Moderator @stephenw10
                                        last edited by johnpoz

                                        @stephenw10 said in 10 GBit questions:

                                        need to run multiple instances of iperf.

                                        Or you could use the new beta that is out..

                                        https://github.com/esnet/iperf/releases/tag/3.13-mt-beta2

                                        iperf3 was originally designed as a single-threaded
                                        program. Unfortunately, as network speeds increased faster than CPU
                                        clock rates, this design choice meant that iperf3 became incapable of
                                        using the bandwidth of the links in its intended operating environment
                                        (high-performance R&E networks with Nx10Gbps or Nx100Gbps network
                                        links and paths).
                                        
                                        We have created a variant of iperf3 that uses a separate thread
                                        (pthread) for each test stream. As the streams run more-or-less
                                        independently, this should remove some of the performance bottlenecks,
                                        and allow iperf3 to perform higher-speed tests, particularly on
                                        100+Gbps paths. This version has recorded transfers as high as 148Gbps
                                        in internal testing at ESnet.
                                        

                                        An intelligent man is sometimes forced to be drunk to spend time with his fools
                                        If you get confused: Listen to the Music Play
                                        Please don't Chat/PM me for help, unless mod related
                                        SG-4860 24.11 | Lab VMs 2.8, 24.11

                                        • stephenw10S
                                          stephenw10 Netgate Administrator
                                          last edited by

                                          Ooo, that's fun.

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.