Network card speed limited to 286 MBit



  • I've switched my internet provider and am now connected over fiber, so I should be able to download and upload at 1 Gbit/s. (www.fiber7.ch)

    I did a few speed tests and noticed that I never get more than 250 Mbit/s when routing the traffic through my pfSense firewall. If I connect a laptop directly to the internet, I get much higher speeds. So I tested the internal network speed using iperf, and there I also got a maximum of 286 Mbit/s (from my local computer to the internal IP of my firewall).

    The CPU load during the tests is about 50%. (4 cores)
    MBUF usage is at 2% and memory usage is at 11% (2 GB Ram) so this shouldn't be the problem

    My pfSense firewall runs as a virtual machine on a Dell PowerEdge 2950 server using ESXi 5.5. The server has two Broadcom NetXtreme II BCM5708 network cards. (In the virtual machine they are configured as E1000 cards.) I've also run the same iperf speed test against another virtual machine (Ubuntu) on the same server, and there I get much higher speeds (about 910 Mbit/s). So my guess is that this is some sort of software problem with my pfSense installation.

    I've already read the https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards page. I've tried checking "Disable hardware checksum offload" and added the following lines to /boot/loader.conf.local:

    kern.ipc.nmbclusters="131072"
    hw.bce.tso_enable=0
    hw.pci.enable_msix=0
    hw.em.fc_setting=0

    Sadly, none of those settings changed the result of the internal speed test. What else could I try to get higher speeds?
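
One possible reason some of these tunables make no difference: `hw.bce.*` settings address the Broadcom bce driver, but with E1000 virtual NICs the guest most likely attaches the em driver instead (the physical Broadcom card is handled by ESXi). The following is a minimal Python sketch, not a pfSense tool, that flags `hw.<driver>.*` lines whose driver is not in use. The set `GUEST_DRIVERS` and the helper names are assumptions made for illustration.

```python
# Sketch: flag loader.conf tunables that target a driver the guest does not use.
# Assumption: the guest's NICs attach to the em driver, as the E1000 setup implies.

LOADER_CONF = """\
kern.ipc.nmbclusters="131072"
hw.bce.tso_enable=0
hw.pci.enable_msix=0
hw.em.fc_setting=0
"""

# Hypothetical: NIC drivers actually attached inside the VM (e.g. em0, em1).
GUEST_DRIVERS = {"em"}

# hw.* prefixes that are not NIC drivers, so they are never flagged.
NON_DRIVER_PREFIXES = {"pci"}


def parse_tunables(text):
    """Return {name: value} for every 'name=value' line, stripping quotes."""
    tunables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = value.strip().strip('"')
    return tunables


def unmatched_driver_tunables(tunables, guest_drivers):
    """List hw.<driver>.* tunables whose driver is not present in the guest."""
    unmatched = []
    for name in tunables:
        parts = name.split(".")
        if len(parts) >= 3 and parts[0] == "hw":
            driver = parts[1]
            if driver not in guest_drivers and driver not in NON_DRIVER_PREFIXES:
                unmatched.append(name)
    return unmatched


tunables = parse_tunables(LOADER_CONF)
for name in unmatched_driver_tunables(tunables, GUEST_DRIVERS):
    print(f"{name}: driver not attached in this guest, setting likely has no effect")
```

Which drivers are really attached can be read from the interface names inside the VM (for example `em0`, `em1`); adjust `GUEST_DRIVERS` accordingly.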



  • i've never had issues with the e1000 nics to get towards gigabit speeds, so there is probably some sort of issue (autonegotiate ?)

    you could try the vmxnet3 nics … they should be able to push you towards 2-3gbit on fairly recent server hardware.

    third option would be to test the 2.3-RC snapshots



  • Comparing pfSense against a Linux distro that is doing neither NAT nor firewall filtering is not really
    a fair comparison. What kind of CPU is inside this Dell machine? How many rules, VLANs and other things
    are running in pfSense, and how many packages are installed?



  • @heper:

    i've never had issues with the e1000 nics to get towards gigabit speeds, so there is probably some sort of issue (autonegotiate ?)

    you could try the vmxnet3 nics … they should be able to push you towards 2-3gbit on fairly recent server hardware.

    third option would be to test the 2.3-RC snapshots

    The server has two Broadcom NetXtreme II BCM5708 network cards. (In the virtual machine they are configured as E1000 cards.)

    They're Broadcom NICs exposed as e1000s via VMware. Sounds like software NIC emulation.



  • @Harvy66:

    yes i know, and i've never had issues reaching gbit speeds with the legacy_e1000_software_emulation on esxi, on server-class hardware that's less than 5 yrs old.



  • @heper:

    i've never had issues with the e1000 nics to get towards gigabit speeds, so there is probably some sort of issue (autonegotiate ?)

    you could try the vmxnet3 nics … they should be able to push you towards 2-3gbit on fairly recent server hardware.

    third option would be to test the 2.3-RC snapshots

    The link is up as "1000baseT <full-duplex>", so the autonegotiation seems to be right.

    I've tried switching the network adapters to vmxnet3. During bootup I assigned the new interfaces as LAN and WAN, but after that I was unable to reach the web interface or the SSH port. I was able to ping the firewall, but everything else was blocked. A ping from the firewall to the local network also worked. Is this a known problem after changing to this network card type, and how can I solve it?

    I've also tested the 2.3-RC snapshots. With those I got to 325 Mbit/s, not much more than with 2.2. Any other ideas?



  • @BlueKobold:

    Comparing pfSense against a Linux distro that is doing neither NAT nor firewall filtering is not really
    a fair comparison. What kind of CPU is inside this Dell machine? How many rules, VLANs and other things
    are running in pfSense, and how many packages are installed?

    Here are the details about the CPU. (from CPU-Z)
    Name: Intel Xeon DP
    Codename: Dempsey
    Package: Socket 771 LGA
    Specification: Intel(R) Xeon(TM) CPU 3.00GHz (ES)
    Hyper Threading is deactivated

    Here are the details about my pfSense installation:

    Rules:
    8 wan rules
    5 lan rules
    1 OpenVPN rule

    Vlans:
    no vlans

    Running services:
    apinger
    dhcpd
    dnsmasq
    HAProxy
    miniupnpd
    ntpd
    openvpn
    squid
    sshd
    vmware-guestd

    Installed Packages:
    Cron
    haproxy-1_5
    iperf
    Lightsquid
    mailreport
    Open-VM-Tools
    OpenVPN Client Export Utility
    Sarg
    squid3
    System Patches

    Any idea what could slow down the transfer?



  • any errors in logs? any interface errors? tried creating a new VM ? might there be an issue with the vswitches ?



  • @heper:

    any errors in logs? any interface errors? tried creating a new VM ? might there be an issue with the vswitches ?

    There are no "In/out errors" and no "Collisions" on the network interfaces.

    I've checked the system log since the last restart and the only error message I found was for the external interface:
    kernel: arpresolve: can't allocate llinfo for XXX.XXX.XXX.XXX on em1

    I don't think that there is an issue with the vm or the vswitches because I also tested the speed with iperf on an Ubuntu vm on the same server with the same network card type that is connected to the same vswitch and there I get speeds beyond 900 MBit.



  • have you tried without (open)vmware-tools ?



  • Any idea what could slow down the transfer?

    Squid could perhaps be doing it.

    have you tried without (open)vmware-tools ?

    For sure they should be installed.

    I don't think that there is an issue with the vm or the vswitches because I also tested the speed with iperf on an Ubuntu vm on the same server with the same network card type that is connected to the same vswitch and there I get speeds beyond 900 MBit.

    Ubuntu is not doing NAT, applying pfSense rules or running a Squid proxy, so that test could turn out
    very different from the pfSense test, as I see it.



  • For sure they should be installed.

    no they shouldn't, they are optional



  • @BlueKobold:

    I've disabled Squid and removed the Open-VM-Tools package, but the speed is still under 300 Mbit/s. I've also shut down every other machine on the ESXi server so pfSense can take all the resources, but even that did not help. I've noticed that the CPU usage doesn't go much over 50% (55% was the maximum as far as I've seen). Maybe the problem is that the firewall cannot use all four CPU cores?



  • What does system activity look like? If you're seeing something like 100% usage, then you're effectively CPU bound.



  • @Harvy66:

    What does system activity look like? If you're seeing something like 100% usage, then you're effectively CPU bound.

    Here is the output of top when running the iperf test:

    
    last pid: 32033;  load averages:  1.60,  0.74,  0.38    up 1+08:50:47  10:27:11
    49 processes:  1 running, 48 sleeping
    CPU:  3.2% user,  0.0% nice, 49.5% system,  0.1% interrupt, 47.2% idle
    Mem: 17M Active, 130M Inact, 171M Wired, 175M Buf, 1642M Free
    Swap: 4096M Total, 4096M Free
    
      PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAN
    77954 root             7  30    0 51952K  4528K sbwait  0   1:12  60.94% iperf
    26538 www              1  20    0 33080K 10860K kqread  3   4:05   0.98% haprox
    69605 root             1  52   20 17136K  2692K wait    2   0:44   0.98% sh
    30253 root             1  52   20  8304K  1956K nanslp  2   0:00   0.98% sleep
    21880 root             1  20    0    99M  9288K select  0   4:09   0.00% vmtool
    27828 root             1  20    0 12456K  2128K select  1   0:52   0.00% apinge
    62032 root             1  20    0 21160K  4740K select  0   0:40   0.00% miniup
    44483 nobody           1  20    0 30264K  4324K select  1   0:23   0.00% dnsmas
    54025 proxy            1  20    0   120M 29184K kqread  2   0:21   0.00% squid
    40422 root             1  20    0 48692K  7892K kqread  1   0:17   0.00% lightt
    19650 root             1  20    0 16804K  2292K bpf     1   0:17   0.00% filter
      245 root             1  20    0   229M 21844K kqread  3   0:14   0.00% php-fp
    69305 root             1  20    0 21732K  6136K select  1   0:13   0.00% openvp
    65449 dhcpd            1  20    0 24848K 13832K select  1   0:09   0.00% dhcpd
     3672 root             1  20    0 28344K 18104K select  3   0:09   0.00% ntpd
    46979 root             1  20    0 14648K  2408K select  0   0:07   0.00% syslog
    27862 root             1  20    0 28328K  2952K piperd  2   0:03   0.00% rrdtoo
    
    

    I never saw more than 64% of the CPU being used while pfSense was running the iperf server.

    I've now tried switching the iperf server and client so that pfSense is just the client, and with that I get more speed (about 433 Mbit/s). I also get more CPU load (about 90% at most). I think this means the limiting factor is the CPU.

    I am thinking about ordering a separate pfSense hardware firewall (e.g. the "SG-4860 1U pfSense® Security Gateway Appliance"). Should this firewall get me near 1 Gbit/s throughput, or is there better hardware for pfSense?
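
To put the roughly 50% CPU figure in perspective: top averages its `CPU:` line over all cores, so on a 4-core VM a single fully loaded core shows up as only 25% overall. A minimal Python sketch of that arithmetic, using the `CPU:` line from the top output above (the helper names are made up for illustration):

```python
import re

# CPU summary line copied from the top output above.
TOP_CPU_LINE = "CPU:  3.2% user,  0.0% nice, 49.5% system,  0.1% interrupt, 47.2% idle"
NUM_CORES = 4  # from the post: the VM has 4 cores


def parse_cpu_line(line):
    """Return {state: percent} from a top 'CPU:' summary line."""
    return {state: float(pct) for pct, state in re.findall(r"([\d.]+)% (\w+)", line)}


def busy_cores(cpu, num_cores):
    """Convert the averaged busy percentage into cores' worth of work."""
    busy_pct = 100.0 - cpu["idle"]
    return busy_pct / 100.0 * num_cores


cpu = parse_cpu_line(TOP_CPU_LINE)
print(f"busy: {100.0 - cpu['idle']:.1f}% of {NUM_CORES} cores "
      f"= {busy_cores(cpu, NUM_CORES):.2f} cores' worth of work")
print(f"one fully loaded core alone would show as {100.0 / NUM_CORES:.0f}% overall")
```

For this sample the busy share corresponds to roughly two cores' worth of work, almost all of it system time. That is compatible with one or two saturated threads while the other cores sit idle, but the averaged number alone cannot prove it; a per-CPU or per-thread view would be needed to confirm.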



  • It typically comes down to hardware or configuration. My home pfSense box is getting about 3.9 Gb/s at 15% load. That's tested with one client on the WAN connecting to a client on the LAN and running iperf through the firewall, which also means NAT is going on. That was nearly 2 Gb/s on the WAN port and another 2 Gb/s on the LAN port.

    I am using a Haswell i5 with an Intel i350-T2 NIC and running on the metal, no guest VM here.



  • I've noticed that the CPU usage doesn't go much over 50%. (55% was the maximum as far as I've seen) Maybe the problem is that the firewall cannot use all four CPU cores?

    On the WAN port, if PPPoE is in use, only a single CPU core is used at the moment, not more.

    I've now tried switching iperf server and client so that pfsense is just the client and with that I am getting more speed. (about 433 MBit) I also get more CPU load. (about 90% at max) I think this means the limiting factor is the cpu.

    If I had to guess, you're being limited by your RAM speed more than anything else.

    That's not quite how it works. The packet filter, the IP forwarding parts, and even NAT
    (part of pf, but run at a different phase) all hit the memory system.

    It's likely not that your CPU can't keep up, it's that your memory system is saturated.

    I am thinking about ordering a separate pfSense hardware firewall (e.g. the "SG-4860 1U pfSense® Security Gateway Appliance"). Should this firewall get me near 1 Gbit/s throughput, or is there better
    hardware for pfSense?

    That is how it is for now. Also, only one CPU core will be used for the entire WAN part if PPPoE is in use.

    I am using a Haswell i5 with an Intel i350-T2 NIC and running on the metal, no guest VM here.

    That will be a stronger and more powerful machine than the older Intel Xeon CPUs, and one with faster RAM too,
    I would imagine. On top of that, an Intel Core i5 core is not the same as the lower-end Intel Atom CPU or SoC
    core it would have to be compared against. The Intel Core i5 core is much more powerful than those.
    If someone wants to save electric power, a modern Intel Xeon E3-12xxv3 CPU with 4 cores running at 3.x GHz
    or more could be an option to get the same results, and with a newer v5 one, RAM with a higher clock
    frequency could be used as well.



  • @BlueKobold:

    I'm not using PPPoE. I have a direct internet connection over Ethernet (using a fiber-to-Ethernet media converter).
    I've done some RAM speed tests with my Ubuntu VM:

    
    root@ubuntux64:~# mbw 32 | grep AVG
    AVG	Method: MEMCPY	Elapsed: 0.04285	MiB: 32.00000	Copy: 746.781 MiB/s
    AVG	Method: DUMB	Elapsed: 0.04170	MiB: 32.00000	Copy: 767.351 MiB/s
    AVG	Method: MCBLOCK	Elapsed: 0.02452	MiB: 32.00000	Copy: 1305.249 MiB/s
    root@ubuntux64:~# mbw -b 4096 32 | grep AVG
    AVG	Method: MEMCPY	Elapsed: 0.04103	MiB: 32.00000	Copy: 779.965 MiB/s
    AVG	Method: DUMB	Elapsed: 0.04168	MiB: 32.00000	Copy: 767.845 MiB/s
    AVG	Method: MCBLOCK	Elapsed: 0.02514	MiB: 32.00000	Copy: 1273.080 MiB/s
    
    

    Is my RAM too slow to reach gigabit throughput?
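
For comparison, the mbw copy rates can be converted to the same unit as the link speed. This is a minimal Python sketch of the unit conversion only, using the AVG lines from the first run above. It says nothing about how many times a forwarded packet is copied in memory (which likely happens more than once per packet in a VM with an emulated NIC), and the numbers come from the Ubuntu VM, not from pfSense itself.

```python
import re

# AVG lines copied from the first mbw run above (whitespace normalized).
MBW_OUTPUT = """\
AVG Method: MEMCPY Elapsed: 0.04285 MiB: 32.00000 Copy: 746.781 MiB/s
AVG Method: DUMB Elapsed: 0.04170 MiB: 32.00000 Copy: 767.351 MiB/s
AVG Method: MCBLOCK Elapsed: 0.02452 MiB: 32.00000 Copy: 1305.249 MiB/s
"""

LINE_RATE_GBIT = 1.0  # target throughput of the connection in Gbit/s


def mib_per_s_to_gbit_per_s(mib_per_s):
    """Convert MiB/s (2**20 bytes) to Gbit/s (10**9 bits)."""
    return mib_per_s * 2**20 * 8 / 1e9


def parse_mbw(text):
    """Return {method: MiB/s} from mbw 'AVG' lines."""
    pattern = re.compile(r"AVG\s+Method:\s+(\w+).*?Copy:\s+([\d.]+)\s+MiB/s")
    return {m.group(1): float(m.group(2)) for m in pattern.finditer(text)}


for method, rate in parse_mbw(MBW_OUTPUT).items():
    gbit = mib_per_s_to_gbit_per_s(rate)
    print(f"{method}: {rate:.0f} MiB/s = {gbit:.2f} Gbit/s "
          f"({gbit / LINE_RATE_GBIT:.1f}x line rate)")
```

The plain memcpy figure works out to a bit over 6 Gbit/s, so only a few memory passes per packet fit into the budget at gigabit speed. Whether that is actually the limit here remains a guess.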



  • Update:

    I'm now running a pfSense firewall on a Dell PowerEdge R220 using this fiber card: https://www.startech.com/ch/Netzwerk-IO/Adapter-Karten/PCIe-Gigabit-Ethernet-LWL-Karte-Offen-SFP~PEX1000SFP2

    I now get almost gigabit throughput (about 940 Mbit/s). The hardware works very well with pfSense.
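
A figure of about 940 Mbit/s is close to the practical ceiling for TCP over gigabit Ethernet, because every frame carries header and framing overhead. A minimal Python sketch of that calculation, assuming a standard 1500-byte MTU, IPv4, and full-size frames (the function name and defaults are chosen for illustration):

```python
LINE_RATE_MBIT = 1000.0  # gigabit Ethernet
MTU = 1500               # standard Ethernet MTU (assumed, no jumbo frames)

# Per-frame bytes on the wire that are not IP payload:
# preamble + start delimiter (8), Ethernet header (14), FCS (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12

IPV4_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12  # timestamps option including padding, commonly enabled


def tcp_goodput_mbit(timestamps=True):
    """Best-case TCP payload rate in Mbit/s for full-size frames."""
    tcp_overhead = TCP_HEADER + (TCP_TIMESTAMP_OPTION if timestamps else 0)
    payload = MTU - IPV4_HEADER - tcp_overhead
    wire_bytes = MTU + ETHERNET_OVERHEAD
    return LINE_RATE_MBIT * payload / wire_bytes


print(f"with TCP timestamps:    {tcp_goodput_mbit(True):.1f} Mbit/s")
print(f"without TCP timestamps: {tcp_goodput_mbit(False):.1f} Mbit/s")
```

Both results land in the low-to-mid 940s, so a measured 940 Mbit/s means the link itself is effectively saturated.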

