Netgate Discussion Forum
    Supermicro A1SRM-2758F board seems to not multi-thread

      bigjerms

      Yes I have hardware checksum offload enabled.

      Could the problem be in the way I'm testing? I'm using iperf with multiple parallel streams. Would these all hash to the same CPU? I'm not familiar with how the load is split across the CPUs.

        dopey

        Is this over the WAN connection using PPPoE?

        If so it's potentially a limitation of the igb driver and PPPoE.  See:
        https://redmine.pfsense.org/issues/4821

          bigjerms

          It's just a static IP on the WAN side, with no PPPoE enabled. We are testing directly between two laptops with the firewall in between.

            dopey

            I missed that you were calculating throughput based on 64-byte packets.

            See:
            https://blog.pfsense.org/?p=1866

            Using a bit more pedestrian hardware, such as the C2758 that is for sale on the pfSense store, we find that we can forward at a rate of around 270 Kpps, and with fast forwarding or tryforward, we can obtain 426 Kpps.  A simple SG-2220 will support 123 Kpps until we enable fastforward or tryforward, when we can obtain 217 Kpps.

            Based on some calculations, your 300 Mbps is not too far off the 426 Kpps value.
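
The comparison above can be checked with a little arithmetic. The following is a minimal sketch, not the calculation the poster actually did (the thread does not show it). It assumes the quoted 426 Kpps figure refers to minimum-size 64-byte Ethernet frames, and it assumes the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, and inter-frame gap). The function names are made up for illustration.

```python
# Per-frame overhead on the wire: 7 bytes preamble + 1 byte SFD + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def pps_to_mbps(pps, frame_bytes=64, include_wire_overhead=True):
    """Convert a packet rate to Mbit/s for a given Ethernet frame size."""
    per_frame = frame_bytes
    if include_wire_overhead:
        per_frame += WIRE_OVERHEAD_BYTES
    return pps * per_frame * 8 / 1e6


def line_rate_pps(link_bps=1_000_000_000, frame_bytes=64):
    """Maximum frames per second a link can carry at the given frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


if __name__ == "__main__":
    # 426 Kpps is the tryforward figure quoted from the blog post (frame size assumed).
    print(f"426 Kpps, wire rate:   {pps_to_mbps(426_000):.1f} Mbit/s")
    print(f"426 Kpps, frames only: {pps_to_mbps(426_000, include_wire_overhead=False):.1f} Mbit/s")
    print(f"1 GbE line rate at 64 bytes: {line_rate_pps():,.0f} pps")
```

Under those assumptions, 426 Kpps works out to roughly 218 to 286 Mbit/s depending on whether wire overhead is counted, which is in the same range as the 300 Mbps reported. Note that the packet length set in iperf is usually the payload size rather than the frame size, so the match is approximate at best.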

              bigjerms

              I read the post you are quoting. For some reason I thought those were single-threaded tests and that the figures would be per CPU. I'm probably mistaken.

              One of the reasons I thought this was per processor is when I disable cores on my motherboard I get the same results.  So using 8, 4 or 2 cores still gets the exact same results.

                dopey

                I guess that's a good point. I've actually wondered that myself. I have the mini-ITX version of the same board and have always wondered just what the additional CPUs really buy in terms of pfSense functionality.

                Unfortunately, I'm bitten by the PPPoE queue issue, so I'm limited to only ~700 Mbit/s downloads over my gigabit fiber line, and I never really bothered to dig much further than that.

                  Pippin

                  What happens when you run iperf directly between the two laptops, without pfSense in between?
                  Try both directions with the same values/parameters for iperf.

                  laptop1-iperfserver --> laptop2-iperfclient
                  laptop2-iperfserver --> laptop1-iperfclient

                  I gloomily came to the ironic conclusion that if you take a highly intelligent person and give them the best possible, elite education, then you will most likely wind up with an academic who is completely impervious to reality.
                  Halton Arp

                    bigjerms

                    I'll have to check that.  The other laptop is another user who isn't in the office today or tomorrow.  I'll test that and get back to you.  Kind of silly we didn't test that before.

                    I'll respond with results as soon as I can.

                      DeLorean

                      @bigjerms:

                      Yes I have hardware checksum offload enabled.

                      Do you mean that this checkbox in pfSense is marked or not?
                      Marked -> hardware checksum offload is disabled
                      Unmarked (default) -> hardware checksum offload is enabled

                      Grtz
                      DeLorean

                        bigjerms

                        I have it marked, so it's disabled.

                        I was able to test the two test systems back to back and found that the results were the same, so it's not the firewall that is the limiting factor. Now I have to figure out why two Macs back to back have low throughput with the 64-byte packets.

                        It looks like the problem may be that iperf 2 and 3 are not multi-threaded, so only one CPU is being used on the test boxes. This would explain why a high rate of small packets reduces the throughput.

                          Pippin

                          Maybe try the -Z argument?

                          --zerocopy : use a 'zero copy' sendfile() method of sending data. This uses much less CPU.

                            cmb

                            Lots of small packets are just difficult to process in general. If you want to fill a 1 Gb pipe with 64-byte frames, you likely need something like netmap to do so; no tool like iperf is going to achieve that rate.

                            With a single stream between a given source and destination, you're not likely going to utilize all the queues on the NIC. Assuming you didn't force it to a single queue (which would be bad), that's likely why you're not getting >1 core utilized.
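
The queue behaviour described above can be illustrated with a small sketch. It is a simplified model and not the igb driver's actual code. NICs with receive-side scaling typically hash each packet's address and port fields with a keyed Toeplitz hash; CRC32 is swapped in here purely as a stand-in. The function names, the queue count of 8, and the IP addresses (taken from documentation ranges) are all assumptions for illustration.

```python
import socket
import struct
import zlib


def queue_for_flow(src_ip, dst_ip, src_port, dst_port, num_queues=8):
    """Pick a receive queue from a flow's 4-tuple (CRC32 stands in for the NIC's hash)."""
    key = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    return zlib.crc32(key) % num_queues


def queues_used(flows, num_queues=8):
    """Return the set of queues that a list of flows lands on."""
    return {queue_for_flow(*flow, num_queues=num_queues) for flow in flows}


if __name__ == "__main__":
    # One stream: every packet carries the same 4-tuple, so it always hits one queue.
    single = [("192.0.2.10", "198.51.100.20", 50000, 5201)]
    print("single stream ->", sorted(queues_used(single)))

    # Parallel streams differ in source port, so they can land on different queues.
    parallel = [("192.0.2.10", "198.51.100.20", 50000 + i, 5201) for i in range(16)]
    print("16 parallel streams ->", sorted(queues_used(parallel)))
```

In this model a single stream can never occupy more than one queue, while parallel streams with distinct source ports have a chance to spread out. How evenly they spread on real hardware depends on the NIC's hash and its configuration.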

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.