Supermicro SuperServer 5018A, igb NIC - very slow inter VLAN routing


  • Hi,

    we recently replaced one of our two pfSense firewalls with a Supermicro box using a 5018A mainboard with its on-board C2758 Atom / quad Intel igb NIC; pfSense is at version 2.2.6-RELEASE.
    Unfortunately the inter-VLAN throughput is very bad, only around 7-10 MB/s, even though everything is Gbit-cabled and set to autoneg.
    When bypassing pfSense, I get more or less Gbit throughput, so I can rule out switch ports, bad cables, etc.

    I had a look at the network card troubleshooting and raised mbufs, limited num_queues to 4 + disabled HT, but with no success.
    I enabled Hardware TCP Segmentation Offload (TSO) and Hardware Large Receive Offload (LRO) - no success.

    I believe the hardware should be identical to what is sold via the pfSense shop, so it's pretty hard for me to believe this can't be fixed/adjusted.

    Is anyone using similar hardware and has overcome these limitations, or is my hardware possibly broken?
    See attached my dmesg.boot and loader.conf.

    Thanks a lot in advance.
    dmesg.boot.txt
    loader.conf.txt


  • we recently replaced one of our 2 pfsense firewalls with a Supermicro box using a 5018A Mainboard and its C2758 Atom / Quad Intel igb card on-board, pfsense is at version 2.2.6-RELEASE.

    Hello, it's a really powerful board. Only the (PPPoE) WAN part is single CPU core threaded at this time;
    nearly everything else is multi-threaded across CPU cores. So what else are you running besides the pf
    part of pfSense? Squid, Snort, HAVP, ClamAV, pfBlocker-NG, …?

    Unfortunately the inter VLAN throughput is very bad, only around 7-10MB/s

    How many VLANs are there: 10, 20, 50 or 500?
    And what does MB/s mean here? If you get a throughput of 10 MB/s (megabytes), that really means
    10 MB/s * 8 = 80 Mbit/s, because 1 byte is 8 bits!
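    The byte-to-bit arithmetic above can be sanity-checked with plain shell arithmetic:

```shell
# Convert a throughput measured in megabytes per second (MB/s)
# to megabits per second (Mbit/s): 1 byte = 8 bits.
mb_per_s=10
mbit_per_s=$((mb_per_s * 8))
echo "${mbit_per_s} Mbit/s"   # prints "80 Mbit/s"
```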

    even though everything is Gbit cabled and set to autoneg.

    Can it be that the autonegotiation is the real problem?

    When bypassing pfsense, I get more or less Gbit througput so I can rule out switchport/bad cables etc….

    A Layer 3 switch that is able to route between the VLANs should perhaps do the trick, e.g.:

    • Cisco SG300-26 (Layer3 and rich feature set)
    • D-Link DGS1510-24 (Layer3 & 2 x SFP+)

    I had a look at the network card troubleshooting and raised mbufs, limited num_queues to 4 + disabled HT, but with no success.

    • Please enable HT again; as explained above, only the PPPoE WAN part is single-core bound, not the rest
    • Raising the mbuf limit to half a million or one million would be good
    • Only 4 queues is a little low: 8 cores * 5 LAN ports = 40 queues, and if the traffic must now be
      squeezed into 1/10 of the normal number of queues, it will hurt pfSense.
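    As a sketch of where such tunables live: on pfSense/FreeBSD these are loader tunables, e.g. in /boot/loader.conf.local (values here are illustrative, not recommendations, and a later reply in this thread advises removing hw.igb.num_queues entirely on recent versions):

```
# /boot/loader.conf.local -- illustrative values only
kern.ipc.nmbclusters="1000000"   # raise the mbuf cluster limit to one million
# hw.igb.num_queues="4"          # the tunable under discussion; shown commented out
```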

    I enabled Hardware TCP Segmentation Offload (TSO) and Hardware Large Receive Offload (LRO) - no success.

    You would be better off enabling jumbo frame support on all stations.
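    If you do try jumbo frames, note that the MTU must match on every device in the path; on FreeBSD it is set per interface (igb1 below is only a placeholder name):

```
# Illustrative only: every host and switch port in the path must use the
# same MTU, otherwise large frames get dropped or fragmented.
ifconfig igb1 mtu 9000
```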

    I belive the hardware should be identical to what is sold via pfsense shop, so its pretty hard for me to belive this can't be fixed/adjusted.

    You are right, but I don't think you will be able to tune your pfSense for your
    hardware the way the pfSense development team did for their boards!  ;)

    If you do a fresh, full install on an mSATA or SSD drive and then measure what throughput
    you get, you will see more than you do now, even with all features and services activated.
    In many cases it helps to tune less rather than more! The three most common things would be:

    • Enable TRIM support for mSATA/SSD
    • Enable PowerD (hiadaptive) for TurboBoost or a more balanced system
    • Raise the mbuf limit to a quarter, half, or one million if enough RAM is available
    • Normally the igb(4) driver will be used for those NICs and doesn't need any tuning or tweaks
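    The list above maps roughly to these FreeBSD-level settings (a sketch; on pfSense, PowerD and system tunables are normally configured in the GUI under System > Advanced rather than by editing files):

```
# /etc/rc.conf -- powerd with the "hiadaptive" policy
powerd_enable="YES"
powerd_flags="-a hiadaptive"

# /boot/loader.conf.local -- mbuf cluster limit (illustrative value)
kern.ipc.nmbclusters="250000"
```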

    Is anyone using a similar hardware and has overcome the limitations or is my hardware probably broken?

    For sure many customers and users will be running a board such as yours! But given the mostly
    different configurations and the services in use, it is best to see what helps in each individual case.


  • Remove hw.igb.num_queues, that's not something that should be set in any recent version (and isn't in any docs that I'm seeing, no reference to it on doc.pfsense.org, if you found that somewhere, please link where so we can fix that).

    Disable TSO and LRO, they're off by default for good reason and should be left that way in nearly every circumstance.
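    One way to check the offload state from a shell (igb0 is a placeholder interface name; TSO4 and LRO appear in the options line of ifconfig output when enabled):

```
# show current interface options, including TSO4/LRO if active
ifconfig igb0 | grep -i options

# disable both at runtime for a quick comparison test
ifconfig igb0 -tso -lro
```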


  • Still debugging, but

    @cmb:

    Remove hw.igb.num_queues, that's not something that should be set in any recent version (and isn't in any docs that I'm seeing, no reference to it on doc.pfsense.org, if you found that somewhere, please link where so we can fix that).

    I found it here: https://calomel.org/freebsd_network_tuning.html


  • Why does that link add tweaks and then comment them out?

  • Netgate

    @bensons:

    Still debugging, but

    @cmb:

    Remove hw.igb.num_queues, that's not something that should be set in any recent version (and isn't in any docs that I'm seeing, no reference to it on doc.pfsense.org, if you found that somewhere, please link where so we can fix that).

    I found it here: https://calomel.org/freebsd_network_tuning.html

    it's not a host.


  • Any resolution to this? I have the same hardware, running VLANs via LAGG, but it's slow, only around 35-40 MB/s.

    Still not acceptable for this level of hardware.

  • Netgate

    @bensons:

    I had a look at the network card troubleshooting and raised mbufs, limited num_queues to 4 + disabled HT, but with no success.

    How did you disable HT on a C2000 Atom?


  • @BlueKobold:

    And what does MB/s mean here? If you get a throughput of 10 MB/s (megabytes), that really means
    10 MB/s * 8 = 80 Mbit/s, because 1 byte is 8 bits!

    Well yeah, but either way, whether it's 10 Mbit/s or 10 MB/s (80 Mbit/s), both are a far cry from 1000 Mbit/s (1% and 8% respectively), so I can see why he might be concerned :p

    What are the CPU loads like on the pfSense box when you do these tests?  It seems unlikely that the CPU is holding you back at these speeds, but checking it to make sure and eliminating it as a contributing cause might not be a bad idea.


  • Was this issue solved? I am experiencing a similar issue on my board and would appreciate information regarding how it was solved.


  • Unsure, but it seems a single NIC cannot handle full bidirectional throughput. So if somehow the algorithm is using just one NIC for both sending and receiving traffic from one VLAN to another, the maximum would be 50%, or 500 Mbit/s. A bit less than that in practice.

    To fix this, you have to separate VLANs onto different NIC groups. In my network topology I have a server VLAN and many "user" VLANs that access the servers. So putting the server VLAN on, say, 2 NICs and the rest of the VLANs (which hopefully don't need to reach each other) on 2 other NICs solves the problem somewhat.

    I don't think it's a CPU issue, but somehow the NIC can't handle 2000 Mbit/s, or isn't offloading something.

    My ultimate solution was to move to 10GbE with a new D-1537 server, so now I have 2x 10GbE uplinks, basically 20 Gbit/s of throughput available. My transfers are working fine now; if they weren't, I would be upset at the whole thing.