Poor performance on igb driver



  • Hello,

    I'm running pfSense 2.4.3-p1, on a Qotom board with the following specs:

    Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
    4 CPUs: 1 package(s) x 4 core(s)
    8 GB RAM
    120 GB SATA SSD
    4 x Intel I211AT network ports.

    The network ports are served by the igb driver.
    I have a 1 Gbps (best effort) fibre connection from my provider, but throughput through the router is quite poor.
    When connected directly to the media converter, I get 750 Mbps, which is fair considering the terms provided by the ISP (850 Mbps average, 940 Mbps maximum).

    When connected to the LAN port of pfSense, I get only 500 Mbps, give or take 50 Mbps, with CPU going to 30% during tests.
    Both the WAN and LAN use S/FTP Cat 6a cables.
    Are there any settings I can use to improve the speed?

    FreeBSD has always boasted of high speeds with Intel cards, so... what gives?

    Thanks.


  • Netgate Administrator

    How are you testing the throughput?

    Check the CPU load across the cores by running top -aSH at the CLI during the test.

    Check the Status > Interfaces page for any errors on either interface.

    Steve



  • The quad-core CPU might have one core at 100%, and that core is the one handling your test. If you run parallel connections, your total throughput might be higher.



  • I'm testing using a Speedtest client running on Windows 10 Pro, that connects to a server in the ISP network.

    There are no In/Out Errors, nor Collisions, recorded in the Interfaces page.

    I tried running the client from 2 different computers, at the same time.
    Both clients got about half of the total speed, 340 and 317 Mbps.

    Re: CPU loading across the cores, I saw that not all cores were used. Maybe only 2 were used, with 20% and 80% load, while the others were on zero.



  • @bdaniel7 said in Poor performance on igb driver:

    Intel(R) Celeron(R) CPU J1900 @ 1.99GHz

    If the test was done via HTTPS, it seems that this is the most your non-AES-NI-capable CPU can do. Please try using HTTP to confirm this; otherwise I have no other ideas.

    Cheers.



  • @marcop
    I ran the test using an exe, not from a browser, so I don't know if it was over HTTPS.



  • Are you running any traffic shaping or anything else?




  • Netgate Administrator

    I would expect a J1900 to pass that fairly easily in normal test conditions.

    Can we see the output from top when the test is running?

    Steve



  • @Birke

    I have set most of the settings from that tuning page.
    The only thing I didn't set was hw.igb.num_queues=1; I have it set to 0.

    I tried with Hardware Checksum Offloading, Hardware TCP Segmentation Offloading and Hardware Large Receive Offloading both disabled and enabled, and didn't notice any difference.

    @Animosity022
    No traffic shaping.

    I have the following services enabled:

    dhcpd
    dpinger
    ntpd
    openvpn OpenVPN server: Home LAN
    openvpn_2 OpenVPN client:
    sshd
    syslogd
    unbound

    @stephenw10

    How do I record/export the live output from top?
    Should I record a video?


  • Netgate Administrator

    You shouldn't need to set the igb queues to 1 any longer. That was a bug in much older versions.

    Just hit q in top when it's showing something useful and it will quit out and leave whatever was there available to copy and paste out.

    Are you routing traffic over OpenVPN?

    Steve
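
    A non-interactive alternative, as a minimal sketch: FreeBSD's top(1) also has a batch mode that writes plain text, so the output can be redirected to a file and pasted afterwards. The flags used here (-b for batch mode, -d for the number of displays, -s for the delay in seconds) are taken from the FreeBSD top man page; the output path and the choice of two snapshots are illustrative assumptions, not something from this thread. Start it while the speed test is running.

```sh
# Where to save the capture; the path is an arbitrary example.
OUT="${OUT:-/tmp/top-during-test.txt}"

if [ "$(uname -s)" = "FreeBSD" ]; then
    # -aSH: same view as suggested above (full args, system processes, threads)
    # -b: batch mode, -d 2: two displays, -s 1: one second apart
    top -b -aSH -d 2 -s 1 > "$OUT"
    echo "Saved top output to $OUT"
else
    # top on other systems uses different flags, so do nothing there.
    echo "Not FreeBSD; nothing captured." >&2
fi
```

    The resulting text file can then be pasted into a post instead of a screenshot.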



  • Hi @bdaniel7

    I also agree that the CPU should be able to handle 1 Gbit speeds fairly easily, especially if you are not trying to run any IDS/IPS on top of regular kernel packet processing.

    FreeBSD's network defaults aren't tuned too well for very high speed connections (although this is getting better in newer versions). Here is a link to a thread with some more parameters you can tune on your Intel NICs:

    https://forum.netgate.com/topic/117072/dsl-reports-speed-test-causing-crash-on-upload

    Of those parameters, I'd probably adjust the RX/TX descriptors and processing limits first and see if that yields any improvements.

    Hope this helps.



  • I'm only using OpenVPN to access the internal network from outside, which happens when I'm at the office.

    [Attached screenshot of the top output: 0_1535194760006_top.jpg]


  • Netgate Administrator

    How are you testing when that is shown? What is connected to igb0 and igb1?

    Is the CPU actually running at 1.9GHz? Do you have powerd enabled?

    Try running sysctl dev.cpu.0.freq when the test is running.

    Steve



  • igb0 is WAN, igb1 is LAN.

    I'm starting top -aSH as you suggested, then during the peak transfer, I exit from top with q.

    I had powerd enabled, with all three (AC power, Battery power, Unknown power) set to Maximum.
    I disabled powerd, but there is no difference.

    And I get this: sysctl: unknown oid 'dev.cpu.0.freq'



  • @tman222 said in Poor performance on igb driver:

    Hi @bdaniel7

    I also agree that the CPU should be able to handle 1 Gbit speeds fairly easily, especially if you are not trying to run any IDS/IPS on top of regular kernel packet processing.

    FreeBSD's network defaults aren't tuned too well for very high speed connections (although this is getting better in newer versions). Here is a link to a thread with some more parameters you can tune on your Intel NICs:

    https://forum.netgate.com/topic/117072/dsl-reports-speed-test-causing-crash-on-upload

    Of those parameters, I'd probably adjust the RX/TX descriptors and processing limits first and see if that yields any improvements.

    Hope this helps.

    Hi @bdaniel7 - have you also tried tuning some of the additional parameters that I suggested? If yes, what were the results?


  • Netgate Administrator

    Sorry I meant where are you testing between? Speedtest client on igb1 connecting to a server via igb0?

    Steve



  • @stephenw10
    Yes, the media converter is connected to igb0, and my Windows 10 client is connected to the igb1 port.


  • Netgate Administrator

    I don't see that it has been asked, so: are you connecting using PPPoE?

    Steve



  • @stephenw10
    Yes, I'm using PPPoE.


  • Netgate Administrator

    Ah, then that is the cause of the problem. You can see that all the loading is on one queue and hence one CPU core while the others are mostly idle. It's unfortunately a known issue with PPPoE in FreeBSD/pfSense right now.
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203856

    However, there is something you can do to mitigate it to some extent. Set:
    sysctl net.isr.dispatch=deferred

    You can add that as a system tunable in System > Advanced if it makes a significant difference.

    Be aware that doing so may negatively impact some other things, ALTQ traffic shaping in particular.

    Steve



  • Thank you for the clarification.
    I should've stated from the beginning that I'm on PPPoE.
    I added the net.isr.dispatch setting, but I don't see any improvement in speed.

    I am now evaluating which option is cheaper and faster: buying a different board with other (Intel) cards and keeping pfSense, or moving to Linux.



  • These are my settings, by the way:

    hw.igb.fc_setting=0
    hw.igb.rxd="4096"
    hw.igb.txd="4096"
    net.link.ifqmaxlen="8192"
    hw.igb.max_interrupt_rate="64000"
    hw.igb.rx_process_limit="-1"
    hw.igb.tx_process_limit="-1"
    hw.igb.0.fc=0
    hw.igb.1.fc=0
    net.isr.defaultqlimit=4096
    net.isr.dispatch=deferred
    net.pf.states_hashsize="2097152"
    net.pf.source_nodes_hashsize="65536"
    hw.igb.enable_msix: 1
    hw.igb.enable_aim: 1


  • Netgate Administrator

    Hmm, you should see some improvement in speed with that setting. You may need to restart the ppp session or at least clear the firewall state. Or reboot if it's being applied by system tunables.

    Steve



  • @bdaniel7 said in Poor performance on igb driver:

    These are my settings, by the way:

    hw.igb.fc_setting=0
    hw.igb.rxd="4096"
    hw.igb.txd="4096"
    net.link.ifqmaxlen="8192"
    hw.igb.max_interrupt_rate="64000"
    hw.igb.rx_process_limit="-1"
    hw.igb.tx_process_limit="-1"
    hw.igb.0.fc=0
    hw.igb.1.fc=0
    net.isr.defaultqlimit=4096
    net.isr.dispatch=deferred
    net.pf.states_hashsize="2097152"
    net.pf.source_nodes_hashsize="65536"
    hw.igb.enable_msix: 1
    hw.igb.enable_aim: 1

    I recently went through the process of identifying the performance culprit on the Intel NICs using a Lanner FW-7525A. It turns out that, for the igb driver, you want hw.igb.enable_msix=0 or hw.pci.enable_msix=0 to nudge the driver towards using MSI interrupts over the less-performant MSI-X interrupts (suggested here). This made a 4x difference on my system. It is also recommended to disable TSO and LRO on the igb drivers, so include net.inet.tcp.tso=0 as well. Hope this helps.


  • Netgate Administrator

    Hmm, interesting. I wouldn't have expected MSI to be any better than MSI-X.
    What sort of figures did you see?

    Steve



  • @stephenw10 said in Poor performance on igb driver:

    Hmm, interesting. I wouldn't have expected MSI to be any better than MSI-X.
    What sort of figures did you see?

    Steve

    Hmmm, I'm back to msix interrupts so that was a red herring. I'm able to fully saturate my 400/20 link (achieve 470/24) with both inbound and outbound firewall rules enabled. Here is my current config that seems to achieve this:

    [2.4.4-RELEASE][root@firewall.home]/root: cat /boot/loader.conf 
    kern.cam.boot_delay=10000
    # Tune the igb driver
    hw.igb.rx_process_limit=800  #default 100
    hw.igb.rxd=4096  #default 1024
    hw.igb.txd=4096  #default 1024
    # Disable msix interrupts on igb driver either via hw.pci or the narrower hw.igb
    #hw.pci.enable_msix=0   #default 1 (enabled, disable to nudge to msi interrupts)
    #hw.igb.enable_msix=0
    #net.inet.tcp.tso=0  #confirmed redundant with disable in GUI
    #hw.igb.fc_setting=0
    legal.intel_ipw.license_ack=1
    legal.intel_iwi.license_ack=1
    boot_multicons="YES"
    boot_serial="YES"
    console="comconsole,vidconsole"
    comconsole_speed="115200"
    autoboot_delay="3"
    hw.usb.no_pf="1"
    

    Basically, I'm using the defaults other than increasing the igb driver rx_process_limit, rxd and txd. I have disabled tso, lro and checksum offloading via the gui under System->Advanced->Networking (checked means disabled) and set kern.ipc.nmbclusters to 262144 under System->Advanced->Tunables.

    Hardware:

    CPU: Intel(R) Atom(TM) CPU  C2358  @ 1.74GHz (1750.04-MHz K8-class CPU)
      Origin="GenuineIntel"  Id=0x406d8  Family=0x6  Model=0x4d  Stepping=8
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
      AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
      AMD Features2=0x101<LAHF,Prefetch>
      Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
      VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
      TSC: P-state invariant, performance statistics
    

    You might want to go back to the pfSense defaults and make sure all the network offloading options are disabled (checked in the GUI), then tweak the igb driver settings as I did above and test. After that, adjust key tunables such as kern.ipc.nmbclusters, but more isn't necessarily better.



  • I've just noticed you are on PPPoE. Would adjusting MSS clamping on the interface, or setting the MTU to 1492, help?
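
    For reference, the arithmetic behind the 1492 figure, as a small sketch: PPPoE takes 8 bytes out of the standard 1500-byte Ethernet payload, and a TCP segment over IPv4 then loses another 40 bytes to the IP and TCP headers (assuming no IP or TCP options). The variable names are only for illustration.

```sh
ETH_MTU=1500        # standard Ethernet payload size
PPPOE_OVERHEAD=8    # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HDR=20         # IPv4 header without options
TCP_HDR=20          # TCP header without options

PPPOE_MTU=$((ETH_MTU - PPPOE_OVERHEAD))
MSS=$((PPPOE_MTU - IPV4_HDR - TCP_HDR))

echo "PPPoE MTU: $PPPOE_MTU, TCP MSS: $MSS"
# prints: PPPoE MTU: 1492, TCP MSS: 1452
```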


  • Netgate Administrator

    You should put custom settings in /boot/loader.conf.local to avoid them being overwritten at upgrade. Create that file if it's not there.

    Steve
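
    As a sketch of what that file could contain, using only the three igb values from the loader.conf posted earlier in the thread. The script writes to a scratch file by default (the TARGET variable is just for illustration) so the result can be reviewed before it is copied to /boot/loader.conf.local as root. The values are that poster's choices, not recommendations.

```sh
# Scratch location by default; note that the file at TARGET is overwritten.
TARGET="${TARGET:-/tmp/loader.conf.local}"

# Loader tunables are read at boot, so a reboot is needed for them to apply.
printf '%s\n' \
    '# igb tuning carried over from the thread' \
    'hw.igb.rx_process_limit="800"' \
    'hw.igb.rxd="4096"' \
    'hw.igb.txd="4096"' > "$TARGET"

# Show the result for review.
cat "$TARGET"
```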



  • Hi @bdaniel7, any luck on achieving gigabit speeds after your tweaks? I’ve been running into the same issues as you with the same Qotom box.

    Posted about it here: https://forum.netgate.com/topic/137196/slow-gigabit-download-on-a-quadcore-intel-celeron-j1900-2-41ghz, and then used the tweaks in this thread.

    Still getting only about 730 Mbps on wired. 😐



  • @nonconformist
    Hi, nope, I couldn't get any speed higher than 550 Mbps when I tried the tweaks.
    Then I abandoned the subject, due to lack of time.

    I will try the tweaks from the article you posted.



  • any dropped packets?

    netstat -ihw 1



  • Since you have cores waiting, you could try to avoid locks when switching between them with:
    net.isr.bindthreads="1"



  • @marcop Couldn't check through the week so doing this over the weekend. Long story short, no dropped packets.

    net.isr.bindthreads="1"
    

    actually brought the download/upload speeds down to 680/800 from 740/934.

    Reading more about this, it's beginning to look like achieving 1G download isn't possible with the igb driver on a PPPoE WAN connection.


  • Netgate Administrator

    Are you guys actually running at 2.4GHz? I asked about CPU frequency values earlier and there were none, and looking back it appears to be running at 2GHz.

    If that's true then you need to enable speedstep in the BIOS and make sure it's loading.
    You should see some values reported by:
    sysctl dev.cpu.0.freq_levels

    Steve


  • Netgate Administrator

    The dashboard should show:

    Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
    Current: 1992 MHz, Max: 1993 MHz
    4 CPUs: 1 package(s) x 4 core(s)
    AES-NI CPU Crypto: No 
    

    And sysctl should show something like:

    dev.cpu.0.freq_levels: 1993/2000 1992/2000 1909/1825 1826/1650 1743/1475 1660/1300 1577/1125 1494/950 1411/775 1328/600
    

    Otherwise you're seeing a 20% performance reduction.

    Steve
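
    To compare a box against that, here is a rough sketch that pulls the frequency steps out of a freq_levels line. It runs on the sample line quoted above; on a live system the line would come from sysctl dev.cpu.0.freq_levels. It assumes each entry is a frequency/power pair and uses only the part before the slash.

```sh
# Sample line copied from the post above.
LINE='dev.cpu.0.freq_levels: 1993/2000 1992/2000 1909/1825 1826/1650 1743/1475 1660/1300 1577/1125 1494/950 1411/775 1328/600'

# Drop the "name:" prefix, put one entry per line, keep the MHz part of each pair.
LEVELS=$(printf '%s\n' "$LINE" | cut -d: -f2- | tr ' ' '\n' | grep / | cut -d/ -f1)

MAX_MHZ=$(printf '%s\n' "$LEVELS" | sort -n | tail -n 1)
MIN_MHZ=$(printf '%s\n' "$LEVELS" | sort -n | head -n 1)

echo "lowest: ${MIN_MHZ} MHz, highest: ${MAX_MHZ} MHz"
# prints: lowest: 1328 MHz, highest: 1993 MHz
```

    If the sysctl returns "unknown oid" instead of a list like this, as reported earlier in the thread, there is nothing to parse and the frequency driver is probably not attached.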



  • @stephenw10 Didn't find a setting in the BIOS to enable SpeedStep.

    Nevertheless, from a fair bit of research online, it seems the Celeron J1900 Qotom box cannot achieve gigabit down/up. I read numerous threads about PCIe lane limits on this hardware, which make achieving gigabit a long shot, and definitely not practical with pfBlockerNG, Snort/Suricata and a VPN running. I'm giving up on trying to get this box to do it. This leaves me with a tricky decision as a home user with about 35 devices behind pfSense:

    • Should I upgrade to better hardware, considering the lack of gigabit speeds and the fact that this processor doesn't support AES-NI anyway?

    • Does it make sense to upgrade at all, given that the practical applications of gigabit speeds are few, if any, at this point in time?

    Just out of curiosity - if I were to upgrade and get one of the official Netgate boxes, should I be looking at the SG-1000, SG-3100 or SG-5100?


  • Netgate Administrator

    You would need either the SG-3100 or SG-5100. The 5100 has significantly more processing power though, so you could run VPN, packages, etc. and still push gigabit.

    I'm curious though, do you not see any CPU frequency levels reported even after enabling powerd?

    Steve