Netgate Discussion Forum

    Supermicro A1SRM-2758F board seems to not multi-thread

    • D
      dopey

      Is this over the WAN connection using PPPoE?

      If so, it's potentially a limitation of the igb driver and PPPoE. See:
      https://redmine.pfsense.org/issues/4821

      • B
        bigjerms

        It's just a static IP on the WAN side. We are testing directly between two laptops with the firewall in between: a static IP on the outside, with no PPPoE enabled.

        • D
          dopey

          I missed that you were calculating throughput based on 64-byte packets.

          See:
          https://blog.pfsense.org/?p=1866

          Using a bit more pedestrian hardware, such as the C2758 that is for sale on the pfSense store, we find that we can forward at a rate of around 270 Kpps, and with fast forwarding or tryforward, we can obtain 426 Kpps.  A simple SG-2220 will support 123 Kpps until we enable fastforward or tryforward, when we can obtain 217 Kpps.

          Based on some calculations, your 300 Mbps is not too far off the 426 Kpps value.
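A rough sketch of that calculation follows. It assumes the 64-byte figure is the Ethernet frame size and that each frame also occupies 20 bytes of preamble and inter-frame gap on the wire; neither assumption is stated in the blog post, and the function name is made up for illustration.

```python
# Convert a packet rate to throughput for a given frame size.
# Assumption: 64 bytes is the Ethernet frame size, and each frame costs an
# extra 20 bytes on the wire (preamble, start delimiter, inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def kpps_to_mbps(kpps, frame_bytes, include_wire_overhead=False):
    """Return throughput in Mbps for a rate given in thousands of packets/s."""
    size = frame_bytes + (WIRE_OVERHEAD_BYTES if include_wire_overhead else 0)
    return kpps * 1_000 * size * 8 / 1_000_000


frame_only = kpps_to_mbps(426, 64)
on_wire = kpps_to_mbps(426, 64, include_wire_overhead=True)
print(f"{frame_only:.0f} Mbps of frame data, {on_wire:.0f} Mbps on the wire")
```

Under those assumptions, 426 Kpps works out to roughly 218 Mbps of frame data, or about 286 Mbps counted on the wire, which is in the same range as the 300 Mbps observed.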

          • B
            bigjerms

            I read the post that you are quoting. For some reason I thought those were single-threaded tests and that the figures would be per CPU. I'm probably mistaken.

            One of the reasons I thought this was per processor is that when I disable cores on my motherboard, I get the same results. Using 8, 4 or 2 cores still gets the exact same results.

            • D
              dopey

              I guess that's a good point. I've actually wondered that myself. I have the mini-ITX version of the same board and always wondered just what the additional CPUs really buy in terms of pfSense functionality.

              Unfortunately, I'm bitten by the PPPoE queue issue, so I'm limited to about 700 Mbit/s downloads over my gigabit fiber line, and I never really bothered to dig much further than that.

              • P
                Pippin

                What happens when you iperf directly between the two laptops? So without pfS in between.
                Try both directions with same values/parameters for iperf.

                laptop1-iperfserver --> laptop2-iperfclient
                laptop2-iperfserver --> laptop1-iperfclient

                I gloomily came to the ironic conclusion that if you take a highly intelligent person and give them the best possible, elite education, then you will most likely wind up with an academic who is completely impervious to reality.
                Halton Arp

                • B
                  bigjerms

                  I'll have to check that.  The other laptop is another user who isn't in the office today or tomorrow.  I'll test that and get back to you.  Kind of silly we didn't test that before.

                  I'll respond with results as soon as I can.

                  • D
                    DeLorean

                    @bigjerms:

                    Yes I have hardware checksum offload enabled.

                    Do you mean that this checkbox in pfSense is marked, or not?
                    Marked -> hardware checksum offload is disabled
                    Unmarked (default) -> hardware checksum offload is enabled

                    Grtz
                    DeLorean

                    • B
                      bigjerms

                      I have it marked, so it's disabled.

                      I was able to test the two test systems back to back and found that the results were the same, so it's not the firewall that is the limiting factor. Now I have to figure out why two Macs back to back have low throughput with the 64-byte packets.

                      It looks like the problem may be that iperf 2 and 3 are not multi-threaded, so only one CPU is being used on the test boxes. This would explain why a high rate of small packets reduces the throughput.

                      • P
                        Pippin

                        Maybe try the -Z argument?

                        --zerocopy : use a 'zero copy' sendfile() method of sending data. This uses much less CPU.

                        • C
                          cmb

                          Lots of small packets are just difficult to process in general. If you want to fill a 1 Gb pipe with 64-byte frames, you likely need something like netmap to do so; no tool like iperf is going to achieve that rate.
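To put a number on "that rate", here is a back-of-the-envelope sketch. It assumes standard Ethernet framing, where every frame carries an extra 20 bytes on the wire (7-byte preamble, 1-byte start delimiter, 12-byte inter-frame gap); the function and constant names are only illustrative.

```python
# Packet rate needed to saturate a link with minimum-size Ethernet frames.
# Assumption: each frame costs 20 extra bytes on the wire
# (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap).
LINK_BPS = 1_000_000_000
FRAME_BYTES = 64
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Return the packets per second that fill the link at this frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


pps = line_rate_pps(LINK_BPS, FRAME_BYTES)
print(f"{pps:,.0f} packets per second")
```

That is close to 1.49 million packets per second in each direction, which is a very different workload from filling the same link with full-size frames.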

                          With a single stream between a given source and destination, you're not likely to utilize all the queues on the NIC. Assuming you didn't force it to a single queue (which would be bad), that's likely why you're not getting more than one core utilized.
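The single-queue effect can be illustrated with a toy model. This is only a sketch: real multi-queue NICs typically pick a receive queue with a hardware hash (often a Toeplitz hash) over the packet's addresses and ports, and CRC32 is swapped in here purely as a stand-in. The queue count, addresses, ports and the `pick_queue` helper are all hypothetical.

```python
import zlib

NUM_QUEUES = 8  # hypothetical queue count, one per core


def pick_queue(src_ip, src_port, dst_ip, dst_port, num_queues=NUM_QUEUES):
    """Map a flow's 4-tuple to a queue index (CRC32 stands in for the NIC hash)."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


# One stream: every packet carries the same 4-tuple, so it always hashes
# to the same queue and is serviced by the same core.
single_flow = {pick_queue("10.0.0.2", 50000, "10.0.1.2", 5201) for _ in range(1000)}

# Many streams: varying the source port spreads flows across queues.
many_flows = {pick_queue("10.0.0.2", 50000 + i, "10.0.1.2", 5201) for i in range(1000)}

print(len(single_flow), "queue(s) used by one flow;", len(many_flows), "used by 1000 flows")
```

In this model a single flow never leaves its one queue, which is why a test with several parallel streams (for example, iperf's -P option) is more likely to exercise additional cores.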
