Intel IX only using single queue



  • Hi All,

    Before I begin, I'm in no way an expert with FreeBSD or its hardware configuration. Most of the below I've pieced together via Google, so please bear with me if there are mistakes or incorrect assumptions.

    I'm having an issue where the Intel IX driver only seems to be using a single queue. I believe this is causing high CPU load on a single core (please correct me if I'm wrong).

    Running "top -aSH" shows the following line, which I believe corresponds to interrupt handling for this NIC. It can reach ~100% of a core under higher loads.

    12 root       -92    -     0K  1024K WAIT    2  23:50  34.02% [intr{irq258: ix0}]
    

    Running "sysctl dev.ix | grep queue" seems to suggest that there's only one queue in use for this NIC:

    dev.ix.0.queue0.lro_flushed: 0
    dev.ix.0.queue0.lro_queued: 0
    dev.ix.0.queue0.rx_discarded: 0
    dev.ix.0.queue0.rx_copies: 598437
    dev.ix.0.queue0.rx_bytes: 1618839559
    dev.ix.0.queue0.rx_packets: 1672247
    dev.ix.0.queue0.rxd_tail: 1078
    dev.ix.0.queue0.rxd_head: 1080
    dev.ix.0.queue0.br_drops: 0
    dev.ix.0.queue0.tx_packets: 276414
    dev.ix.0.queue0.no_desc_avail: 0
    dev.ix.0.queue0.no_tx_dma_setup: 0
    dev.ix.0.queue0.tso_tx: 0
    dev.ix.0.queue0.txd_tail: 284
    dev.ix.0.queue0.txd_head: 284
    dev.ix.0.queue0.irqs: 855954
    dev.ix.0.queue0.interrupt_rate: 978
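
    As an aside, a quick way to count how many queues the driver actually created is to count the distinct "queueN." name components in that sysctl output. A minimal sketch (the sample line is copied from the output above; on a live system you would pipe `sysctl dev.ix.0` into the helper instead):

    ```shell
    #!/bin/sh
    # Count distinct queues by extracting the unique "queueN." name
    # components from sysctl(8) output.
    count_queues() {
      grep -o 'queue[0-9]*\.' | sort -u | wc -l | tr -d ' '
    }

    # Sample line from the output above; a live run would be:
    #   sysctl dev.ix.0 | count_queues
    printf 'dev.ix.0.queue0.irqs: 855954\n' | count_queues   # prints 1
    ```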
    

    Running "vmstat -i | grep ix0" also seems to confirm this, with only the below line returned:

    irq258: ix0                      4766336        827
    

    Running "sysctl hw.ix" shows the driver does appear to be configured to use 8 queues:

    hw.ix.enable_rss: 1
    hw.ix.enable_legacy_tx: 0
    hw.ix.enable_fdir: 0
    hw.ix.unsupported_sfp: 0
    hw.ix.rxd: 2048
    hw.ix.txd: 2048
    hw.ix.num_queues: 8
    hw.ix.enable_msix: 1
    hw.ix.advertise_speed: 0
    hw.ix.flow_control: 3
    hw.ix.tx_process_limit: 256
    hw.ix.rx_process_limit: 256
    hw.ix.max_interrupt_rate: 31250
    hw.ix.enable_aim: 1
    

    I've also tried forcing this via both the System Tunables page and /boot/loader.conf.local, although since sysctl already shows the driver configured for 8 queues, I doubt this will make any difference.
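
    For completeness, the /boot/loader.conf.local entries I tried look like the below. This is a sketch only; hw.ix.num_queues and hw.ix.enable_msix are the ix(4) loader tunables corresponding to the sysctl values shown above:

    ```shell
    # /boot/loader.conf.local (sketch; values as attempted above)
    hw.ix.num_queues="8"
    hw.ix.enable_msix="1"
    ```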

    Not sure if this is useful, but specific device info for this NIC is as follows:

    dev.ix.0.%pnpinfo: vendor=0x8086 device=0x10fb subvendor=0x103c subdevice=0x17d3 class=0x020000
    dev.ix.0.%location: slot=0 function=0 dbsf=pci0:3:0:0 handle=\_SB_.PCI0.PE40.S1F0
    dev.ix.0.%driver: ix
    dev.ix.0.%desc: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.2.12-k
    

    Any feedback on the above would be hugely appreciated.

    Thanks.


  • Netgate Administrator

    Hmm, odd unless you only have a single core CPU there, which seems unlikely.

    Appears to be this: http://pci-ids.ucw.cz/read/PC/8086/10fb/103c17d3
    Not especially uncommon.

    Even if you were using PPPoE over that, I would expect all the traffic on one queue but the other queues to still exist.

    Do you see errors at boot trying to create the queues?

    Steve



  • Thanks Steve,

    I think you might've linked to the wrong page. I'm assuming you meant to link to a discussion related to the following: https://redmine.pfsense.org/issues/4821

    We don't use PPPoE at all on our firewalls. We do, however, make extensive use of QinQs, which I know are a bit "hacky" in FreeBSD, being implemented via ngctl. Would this have any impact, do you think? Having said that, despite my limited knowledge of the subject, I would've thought the queues would still be created and just not used, as you've mentioned above.

    The firewall in question has 8 CPU cores, so this is definitely not the issue.

    I've taken a look at dmesg and can only see the following related to the IX driver:

    ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.2.12-k> port 0x4000-0x401f mem 0xfd300000-0xfd3fffff,0xfd4fc000-0xfd4fffff irq 19 at device 0.0 on pci3
    ix0: Using an MSI interrupt
    ix0: Ethernet address: 48:df:37:01:c6:04
    ix0: PCI Express Bus: Speed 5.0GT/s Unknown
    ix0: netmap queues/slots: TX 1/2048, RX 1/2048
    

    This again seems to confirm that only one queue is being created, and it suggests the issue occurs before the pfSense config comes into play.

    Thanks.



  • Hi @ChrisPSD - Looking at the dmesg output it appears the card is using PCIe 2.0 and limiting itself to MSI instead of MSI-X.
    Is the card you are using indeed just a 2.0 card? Or is it a 3.0 card that is running in a 2.0 PCIe slot? Also, what does the output look like of "netstat -Q"? Hope this helps.
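
    The line to look for at boot is "ixN: Using ...". A minimal sketch of the distinction (the sample line is copied from your dmesg above; on the firewall you'd check `dmesg | grep 'ix0: Using'` instead):

    ```shell
    #!/bin/sh
    # Sketch: classify the interrupt mode from the driver's boot message.
    # Plain MSI gives the driver a single vector (so a single queue);
    # MSI-X provides one vector per queue plus one for link events.
    line='ix0: Using an MSI interrupt'   # sample from the dmesg above
    case "$line" in
      *MSI-X*) echo "MSI-X: one vector per queue" ;;
      *MSI*)   echo "plain MSI: single vector, single queue" ;;
    esac
    ```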


  • Netgate Administrator

    I linked to that just to identify the card. The PPPoE issue is known to limit queues to 1 in FreeBSD but as mentioned it doesn't prevent the other queues being created. Nor would using netgraph as far as I know.

    Steve



  • Thanks @tman222. This is a PCI-E 2.0 card, according to the datasheet, although it's also listed as supporting MSI-X, so I'm not really sure of the implications here.

    Output of "netstat -Q":

    Configuration:
    Setting                        Current        Limit
    Thread count                         8            8
    Default queue limit                256        10240
    Dispatch policy                 direct          n/a
    Threads bound to CPUs         disabled          n/a
    
    Protocols:
    Name   Proto QLimit Policy Dispatch Flags
    ip         1   1000   flow  default   ---
    igmp       2    256 source  default   ---
    rtsock     3   1024 source  default   ---
    arp        4    256 source  default   ---
    ether      5    256 source   direct   ---
    ip6        6    256   flow  default   ---
    
    Workstreams:
    WSID CPU   Name     Len WMark   Disp'd  HDisp'd   QDrops   Queued  Handled
       0   0   ip         0     0   963989        0        0        0   963989
       0   0   igmp       0     0      124        0        0        0      124
       0   0   rtsock     0     3        0        0        0    32302    32302
       0   0   arp        0     0     3349        0        0        0     3349
       0   0   ether      0     0  2284722        0        0        0  2284722
       0   0   ip6        0     0        0        0        0        0        0
       1   1   ip         0     0  1042502        0        0        0  1042502
       1   1   igmp       0     0       14        0        0        0       14
       1   1   rtsock     0     0        0        0        0        0        0
       1   1   arp        0     0     3546        0        0        0     3546
       1   1   ether      0     0  2471311        0        0        0  2471311
       1   1   ip6        0     0        0        0        0        0        0
       2   2   ip         0     0 255534577        0        0        0 255534577
       2   2   igmp       0     0        2        0        0        0        2
       2   2   rtsock     0     0        0        0        0        0        0
       2   2   arp        0     0  1231844        0        0        0  1231844
       2   2   ether      0     0 621215905        0        0        0 621215905
       2   2   ip6        0     0        0        0        0        0        0
       3   3   ip         0     0  5898970        0        0        0  5898970
       3   3   igmp       0     0       19        0        0        0       19
       3   3   rtsock     0     0        0        0        0        0        0
       3   3   arp        0     0     4157        0        0        0     4157
       3   3   ether      0     0  7271238        0        0        0  7271238
       3   3   ip6        0     0        0        0        0        0        0
       4   4   ip         0     0  1047368        0        0        0  1047368
       4   4   igmp       0     0       11        0        0        0       11
       4   4   rtsock     0     0        0        0        0        0        0
       4   4   arp        0     0     3221        0        0        0     3221
       4   4   ether      0     0  2492517        0        0        0  2492517
       4   4   ip6        0     0        0        0        0        0        0
       5   5   ip         0     0  1110272        0        0        0  1110272
       5   5   igmp       0     0       15        0        0        0       15
       5   5   rtsock     0     0        0        0        0        0        0
       5   5   arp        0     0     3423        0        0        0     3423
       5   5   ether      0     0  2640091        0        0        0  2640091
       5   5   ip6        0     0        0        0        0        0        0
       6   6   ip         0     0  1085384        0        0        0  1085384
       6   6   igmp       0     0        7        0        0        0        7
       6   6   rtsock     0     0        0        0        0        0        0
       6   6   arp        0     0     3443        0        0        0     3443
       6   6   ether      0     0  2575020        0        0        0  2575020
       6   6   ip6        0     0        0        0        0        0        0
       7   7   ip         0     0  1083571        0        0        0  1083571
       7   7   igmp       0     0        2        0        0        0        2
       7   7   rtsock     0     0        0        0        0        0        0
       7   7   arp        0     0     3324        0        0        0     3324
       7   7   ether      0     0  2573719        0        0        0  2573719
       7   7   ip6        0     0        0        0        0        0        0
    

    The only other peculiarity with this setup is that the firewall is a VMware VM with PCIe passthrough for the NIC. My understanding, however, is that the VM has full hardware access to the underlying device that's passed to it, so I can't see this being a problem.



  • @stephenw10 said in Intel IX only using single queue:

    I linked to that just to identify the card.

    Sorry Steve, pretty obvious now I've reread your post.



  • Another quick followup. Running "pciconf -lvbc" shows the following:

    ix0@pci0:3:0:0: class=0x020000 card=0x17d3103c chip=0x10fb8086 rev=0x01 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '82599ES 10-Gigabit SFI/SFP+ Network Connection'
        class      = network
        subclass   = ethernet
        bar   [10] = type Memory, range 32, base 0xfd300000, size 1048576, enabled
        bar   [18] = type I/O Port, range 32, base 0x4000, size 32, enabled
        bar   [1c] = type Memory, range 32, base 0xfd4fc000, size 16384, enabled
        cap 01[40] = powerspec 3  supports D0 D3  current D0
        cap 05[50] = MSI supports 1 message, 64 bit, vector masks enabled with 1 message
        cap 11[70] = MSI-X supports 64 messages
                     Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
        cap 10[a0] = PCI-Express 2 endpoint max data 128(128)
                     link x32(x32) speed 5.0(5.0) ASPM disabled(L0s)
        cap 03[e0] = VPD
        ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
        ecap 0003[140] = Serial 1 48df37ffff01b8e4
        ecap 000e[150] = ARI 1
        ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI enabled
                         0 VFs configured out of 0 supported
                         First VF RID Offset 0x0080, VF RID Stride 0x0002
                         VF Device ID 0x10ed
                         Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    
    

    This seems to suggest that the OS sees the device as supporting MSI-X. It also seems to confirm it's running as a PCIe 2.0 device.

    Could you clarify what MSI vs MSI-X actually means from an operational standpoint? Will queues simply not work without MSI-X?

    Thanks.



  • Fixed it!

    Looks like VMware was entirely the issue here. As per this FreeBSD bug, any device that sits behind a VMware PCI root port/bridge is blacklisted from enabling MSI-X. Even though the device was passed through, it was still being presented behind a VMware PCIe bridge.

    This can be worked around by setting the following tunable:

    hw.pci.honor_msi_blacklist=0

    Adding this to /boot/loader.conf.local and rebooting has resolved the problem.
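
    For anyone else who lands here, the whole workaround is this one line (a config fragment, as applied above; a reboot is required, and the post-reboot sanity checks are sketched in the comments):

    ```shell
    # /boot/loader.conf.local
    hw.pci.honor_msi_blacklist=0

    # After rebooting, verify it took effect:
    #   sysctl hw.pci.honor_msi_blacklist    (should report 0)
    #   dmesg | grep 'ix0: Using'            (should now show MSI-X)
    ```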

    Hopefully, this'll help with the underlying CPU load. I'll be monitoring.

    Many thanks for the help, both. @tman222 I'm not sure I would've found this if you hadn't put me on the MSI-X path.



  • Hi @ChrisPSD - That's great news! I was literally a couple minutes away from posting the exact same thing and asking you to try to modify the honor_msi_blacklist tunable:

    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203874

    I ran across the same bug report as well in my searching -- glad the suggested workaround did the trick and this is now resolved! Out of curiosity, do you now see MSI-X instead of MSI in dmesg for the adapter?



  • Yes, it's now showing correctly as MSI-X. dmesg output below for completeness:

    ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.2.12-k> port 0x4000-0x401f mem 0xfd300000-0xfd3fffff,0xfd4fc000-0xfd4fffff irq 19 at device 0.0 on pci3
    ix0: Using MSI-X interrupts with 9 vectors
    ix0: Ethernet address: 48:df:37:01:b8:e4
    ix0: PCI Express Bus: Speed 5.0GT/s Unknown
    ix0: netmap queues/slots: TX 8/2048, RX 8/2048
    

    Thanks again for your help.

