10GbE NIC performance



  • Hi,

    I've installed pfSense 2.4.4-RELEASE on a Supermicro A2SDi-H-TF, featuring two onboard Intel 10GbE NICs. We installed an additional 1GbE card for the WAN interface, so we have a standard LAN, DMZ & WAN setup, with:
    LAN: 10GbE - ix
    DMZ: 10GbE - ix
    WAN: 1GbE - igb

    Everything is working fine, but we noticed slow performance between LAN and DMZ.

    The DMZ is an ESXi 6.5 host running various virtual machines. I figured I had misconfigured it somehow, but after installing the iperf add-on to pfSense and doing some tests, it seems the speed of the embedded 10GbE NICs in pfSense is not what we were expecting.

    Doing iperf tests between a server on our LAN and pfSense, we notice that we're getting around 1.0-1.2 Gbps with a single stream and the standard window size. But using 8 streams (there's an 8-core Intel C3758 CPU) and a bigger window size, we get results near 7 Gbps, maxing out our CPU at 99%.

    So the question is: is that difference between single stream and 8 streams normal? Any way to get single streams to go faster?
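As a general aside on why window size matters for a single stream: one TCP stream can have at most one window of data in flight per round trip, so its throughput is bounded by window / RTT. The sketch below works that bound out in Python. The window size and round-trip time are hypothetical values chosen for illustration, not measurements from this setup, and this bound is only one possible limit (per-core CPU load is another).

```python
def max_single_stream_gbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP stream's throughput in Gbit/s.

    At most one window of data can be in flight per round trip,
    so throughput <= window / RTT.
    """
    return window_bytes * 8 / rtt_seconds / 1e9


# Hypothetical values for illustration only (not measured in this thread).
WINDOW_BYTES = 64 * 1024   # assumed 64 KiB window
RTT_SECONDS = 0.0005       # assumed 0.5 ms round-trip time

bound = max_single_stream_gbps(WINDOW_BYTES, RTT_SECONDS)
print(f"single-stream bound: {bound:.2f} Gbit/s")
```

Under these assumed numbers the bound is about 1.05 Gbit/s; doubling the window doubles it. Running several streams in parallel raises the aggregate because each stream has its own window.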

    We did some reading on the subject of NIC tuning and tried various things, but none of them had any effect to speak of.

    We did notice, however, that one of our NICs has rxpause, txpause autonegotiated (the side connected to the ESXi DMZ), and the other (connected to the LAN switch) does not. Unsure if this is relevant to the problem.

    ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    	options=e400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    	ether ac:1f:6b:41:25:b0
    	hwaddr ac:1f:6b:41:25:b0
    	inet6 fe80::ae1f:6bff:fe41:25b0%ix0 prefixlen 64 scopeid 0x3 
    	inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255 
    	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
    	status: active
    ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    	options=e400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    	ether ac:1f:6b:41:25:b1
    	hwaddr ac:1f:6b:41:25:b1
    	inet6 (deleted ip)%ix1 prefixlen 64 scopeid 0x4 
    	inet (deleted ip) netmask 0xfffffff8 broadcast (deleted ip) 
    	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    	media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
    	status: active
    

    Would be great if anyone could point me in the right direction!



  • Hi @robbevl - a better test would be running iperf/iperf3 across (through) the firewall rather than terminating at the firewall. Is it possible for you to run an iperf test between two hosts on different subnets? If yes, what performance do you see in that case with a single stream and with multiple streams?

    As a point of reference I see about 3 - 4 Gbit/s across the firewall when running a single stream iperf3 test between two 10Gbit Linux desktops located on different subnets. The hardware I'm using is a 5018D-FN8T with 4 port Chelsio SFP+ add-on card and I have Snort enabled as well on each subnet's interface.


  • Netgate Administrator

    Yes, testing to or from the firewall itself is not a good test there. pfSense is not optimised to terminate TCP connections.

    Using multiple streams allows the ix driver to use the available queues and hence CPU cores far better.
    Try running vmstat -i to see how the interrupt rate is being shared across the queues.
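To make the per-queue check concrete, here is a minimal Python sketch that takes text in the general shape of `vmstat -i` output and computes each queue's share of a NIC's interrupts. The sample text, IRQ numbers, queue names and counts are hypothetical, and the exact line format may differ between driver and FreeBSD versions, so the regular expression is an assumption to adjust against real output.

```python
import re
from collections import defaultdict

# Hypothetical sample in the general shape of `vmstat -i` output.
# IRQ numbers, queue names and counts are made up for illustration.
SAMPLE = """\
interrupt                          total       rate
irq264: ix0:q0                    400000        400
irq265: ix0:q1                     10000         10
irq266: ix0:q2                     10000         10
irq267: ix0:q3                     10000         10
irq272: ix1:q0                    100000        100
Total                             530000        530
"""

# Assumed line shape: "irqN: <nic>:<queue>  <total>  <rate>"
QUEUE_LINE = re.compile(r"^irq\d+:\s+([a-z]+\d+):(\S+)\s+(\d+)\s+(\d+)\s*$")


def queue_shares(text):
    """Return {nic: {queue: fraction of that NIC's interrupt total}}."""
    totals = defaultdict(dict)
    for line in text.splitlines():
        match = QUEUE_LINE.match(line)
        if match:
            nic, queue, total, _rate = match.groups()
            totals[nic][queue] = int(total)

    shares = {}
    for nic, queues in totals.items():
        nic_total = sum(queues.values())
        if nic_total == 0:
            continue  # no interrupts counted yet, nothing to report
        shares[nic] = {q: n / nic_total for q, n in queues.items()}
    return shares


shares = queue_shares(SAMPLE)
for nic, queues in shares.items():
    for queue, share in queues.items():
        print(f"{nic} {queue}: {share:.0%}")
```

If one queue carries nearly all of a NIC's interrupts during a single-stream test, as in the made-up ix0 sample, that is consistent with one core doing most of the work; a multi-stream test should show the load spread more evenly.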

    7Gbps is pretty good for that CPU IMO. Though I've never tested that exact model myself.

    Steve



  • Hi all, thanks for the replies.

    I can do iperf tests between two hosts, with pfSense in the middle. In fact that's what I was doing initially, and faced with some subpar results, I tried the iperf add-on in pfSense to pinpoint the issue.

    Since one of those hosts is a virtual one, and I'm pretty new at ESXi, I figured I must have misconfigured it somehow. So I wanted to see if host1->pfSense and host2->pfSense were any better than host1->host2.

    Both hosts to pfSense were pretty bad, causing me to believe there must be a driver issue - but if I understand correctly, those test results cannot be relied upon? I read somewhere that the iperf add-on cannot be used to test firewall performance as the connection is terminated before the firewall, but I figured it could be used to test NICs (if not, what other use case is there for iperf directly in pfSense?)

    In any case, will do some more testing, but results are looking good, as the actual copy speed over NFS and AFP is better than expected.

    Will get back if I have more info.


  • Netgate Administrator

    Yup, testing to the firewall using iperf3 at those speeds will almost always be bad. pfSense is not tuned at all to be a TCP endpoint and the iperf3 version in pfSense/FreeBSD seems to mostly give worse results anyway.
    It is, however, still a very useful test at 1Gb or below. If you're seeing 20Mbps downloads at clients behind the firewall you can test from the firewall to the client and from the firewall to some public iperf server and quickly prove where the problem is.

    At 10G it's useful for proving the connection is good only. You will never see 10Gbps to/from the firewall directly. At least not currently.

    Steve

