Ix driver doesn't allow num_queues = 1, generates high interrupts, crashes



  • Despite reporting the same driver version number (2.5.15) as the ix driver in pfSense 2.1, the driver included with 2.1.1 doesn't behave the same way.

    First, it seems impossible to disable multiple queues.  No matter what you set in /boot/loader.conf.local, it forces a value of 4.

    Second, when using a 2.1 box as an iperf client and a 2.1.1 box as a server, the ix driver generates about 400 interrupts per second on the 2.1 system but around 8000 per second on the 2.1.1 system.  I'd have expected those numbers to switch when I swap the directions, but that isn't the case: with the 2.1 box as the server and the 2.1.1 box as the client, the 2.1.1 box now generates only the 400 interrupts per second, while the 2.1 box, rather than jumping to 8000, only hits 600.  And in this configuration, the 2.1.1 box crashes after about a minute of constant traffic on the interface.

    Changing to a different 10GbE port on the boxes does not change anything, nor does swapping out the SFP+ direct-attach cable I was using for a pair of Intel MM optics and an MM OM3 fiber patch.

    Here's a crash dump:

    http://pastebin.com/AMY3MpNY

    EDIT:  I'll also throw in that I seem to be pegged at 2 Gbit/s.  I'm not sure whether this is due to 2.1.1 or not; I didn't install these cards until after I had upgraded the backup box.
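    For reference, a sketch of the sort of lines I've been trying in /boot/loader.conf.local.  The exact tunable name has varied between ixgbe driver releases (hw.ixgbe.num_queues in some, hw.ix.num_queues in others), so treat both names as assumptions to verify against whatever your driver actually reads:

    ```
    # /boot/loader.conf.local -- sketch; verify the tunable name for your driver version
    hw.ixgbe.num_queues=1    # older ixgbe drivers
    hw.ix.num_queues=1       # newer ixgbe drivers
    ```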



  • I think you're going to have to wait for 2.2 for that one.  In fact, does it occur on the 2.2 alpha?



  • @gonzopancho:

    I think you're going to have to wait for 2.2 for that one.  In fact, does it occur on the 2.2 alpha?

    I haven't tried it. Didn't know there was a working ISO.  What's the stability/functionality level as compared to 2.1.1?  Can a 2.2 box be the backup for a 2.1 box or has the config format changed?

    I'm not opposed to running prerelease code on a backup firewall but it's got to stay running at least most of the time.


  • Rebel Alliance Developer Netgate

    @Jason:

    First, it seems impossible to disable multiple queues.  No matter what you set in /boot/loader.conf.local, it forces a value of 4.

    Second, when using a 2.1 box as an iperf client and a 2.1.1 box as a server, the ix driver generates about 400 interrupts per second on the 2.1 system but around 8000 per second on the 2.1.1 system.  I'd have expected those numbers to switch when I swap the directions, but that isn't the case: with the 2.1 box as the server and the 2.1.1 box as the client, the 2.1.1 box now generates only the 400 interrupts per second, while the 2.1 box, rather than jumping to 8000, only hits 600.  And in this configuration, the 2.1.1 box crashes after about a minute of constant traffic on the interface.

    Changing to a different 10GbE port on the boxes does not change anything, nor does swapping out the SFP+ direct-attach cable I was using for a pair of Intel MM optics and an MM OM3 fiber patch.

    You might find this interesting reading (especially regarding the interrupts):

    https://github.com/freebsd/freebsd/blob/master/sys/dev/ixgbe/README



  • @jimp:

    You might find this interesting reading (especially regarding the interrupts):

    https://github.com/freebsd/freebsd/blob/master/sys/dev/ixgbe/README

    The only thing I really saw in there about interrupts was the storm threshold, which I've already got set to 10000.

    Did I miss something?


  • Rebel Alliance Developer Netgate

    Mostly, it confirms the interrupt behavior is expected/normal and not a problem.

    It doesn't explain the crash, but it means that bit is unrelated. There are lots of other tuning suggestions in that file as well that are worth trying out.
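    For anyone following along, the kind of loader.conf.local tuning that README discusses looks roughly like this.  The values below are illustrative, not recommendations:

    ```
    # /boot/loader.conf.local -- illustrative values only
    kern.ipc.nmbclusters=262144      # more mbuf clusters for 10G buffering
    hw.intr_storm_threshold=10000    # raise the interrupt-storm cutoff (0 disables it)
    ```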



  • @jimp:

    Mostly, it confirms the interrupt behavior is expected/normal and not a problem.

    It doesn't explain the crash, but it means that bit is unrelated. There are lots of other tuning suggestions in that file as well that are worth trying out.

    Any idea why the interrupts are fairly low on the box running 2.1?

    I've already done pretty much all the tuning suggested out there, adding one change at a time to see what it affects.  None of them has made any real difference.



  • EDIT: Added CPU stats
    EDIT2:  To be clear up front: I am using the igb driver here, not the ix driver.

    I noticed that 2.1.1-PRERELEASE seems to ignore num_queues too (I defined 2 and it set up 4 on each interface).  Performance with TCP tests in iperf looks about the same (660 Mbit/s) to me as the other system with only 2 queues, but the limiting factor might be the Linux desktops I am testing with, which use cheap desktop Realtek 1 Gb interfaces.  I am probably not at the point of stressing the driver on FreeBSD.  I have 2 desktops on two different interfaces (WAN, LAN) of the firewall, going through a load-balancer config (default load balancer) on the firewall, and I also have a full production rule base running on it (70 rules total across all the interfaces).  The systems are only running the iperf test, so nothing else is happening on them during the test.

    I don't know if it matters or not… maybe the new driver handles num_queues better now, even though we are using the workaround that is supposed to allow ALTQ again.  I haven't tested ALTQ with the new driver.  There is definitely a queue-assignment change with the new driver, though.

    I see about double the number of interrupts, but it doesn't seem to impact performance, at least not for the simple test I am doing.

    I have 2 Dell R320 servers that I am testing, each using two PRO/1000 ET2 quad-port interfaces.  The primary is still at 2.0.3 and the backup was just upgraded to the 2.1.1 snapshot from the 2nd of this month.  I disabled config sync on the primary but left state sync enabled, to make sure there are no config sync issues going from 2.1.1 to 2.0.3.

    The only other odd thing is that vmstat -i only shows 2 queues for igb3 and igb4, yet dmesg says there were 4 queues on all interfaces.  I have 8 interfaces in total, but only the first few are enabled, one of which (igb4) is used for sync.  This might just be some automatic balancing going on.

    irq256: igb0:que 0              14991721       3131
    irq257: igb0:que 1                 18873          3
    irq258: igb0:que 2                 21796          4
    irq259: igb0:que 3               3118071        651
    irq260: igb0:link                      2          0
    irq261: igb1:que 0               4798753       1002
    irq262: igb1:que 1               6032416       1260
    irq263: igb1:que 2               6661729       1391
    irq264: igb1:que 3               6865118       1434
    irq265: igb1:link                      2          0
    irq266: igb2:que 0                 11997          2
    irq267: igb2:que 1                  2938          0
    irq268: igb2:que 2                     1          0
    irq269: igb2:que 3                     2          0
    irq270: igb2:link                     10          0
    irq271: igb3:que 0                 11436          2
    irq273: igb3:que 2                     2          0
    irq275: igb3:link                     10          0
    irq276: igb4:que 0                496736        103
    irq279: igb4:que 3                380983         79
    irq280: igb4:link                     10          0
    
    
    igb0: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xfcc0-0xfcdf mem 0xda3a0000-0xda3bffff,0xd9800000-0xd9bfffff,0xda3f8000-0xda3fbfff irq 53 at device 0.0 on pci10
    igb0: Using MSIX interrupts with 5 vectors
    igb0: [ITHREAD]
    igb0: Bound queue 0 to cpu 0
    igb0: [ITHREAD]
    igb0: Bound queue 1 to cpu 1
    igb0: [ITHREAD]
    igb0: Bound queue 2 to cpu 2
    igb0: [ITHREAD]
    igb0: Bound queue 3 to cpu 3
    igb0: [ITHREAD]
    igb1: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xfce0-0xfcff mem 0xda3c0000-0xda3dffff,0xd9c00000-0xd9ffffff,0xda3fc000-0xda3fffff irq 54 at device 0.1 on pci10
    igb1: Using MSIX interrupts with 5 vectors
    igb1: [ITHREAD]
    igb1: Bound queue 0 to cpu 0
    igb1: [ITHREAD]
    igb1: Bound queue 1 to cpu 1
    igb1: [ITHREAD]
    igb1: Bound queue 2 to cpu 2
    igb1: [ITHREAD]
    igb1: Bound queue 3 to cpu 3
    igb1: [ITHREAD]
    pcib5: <PCI-PCI bridge> at device 4.0 on pci9
    pci11: <PCI bus> on pcib5
    igb2: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xecc0-0xecdf mem 0xdafa0000-0xdafbffff,0xda400000-0xda7fffff,0xdaff8000-0xdaffbfff irq 48 at device 0.0 on pci11
    igb2: Using MSIX interrupts with 5 vectors
    igb2: [ITHREAD]
    igb2: Bound queue 0 to cpu 0
    igb2: [ITHREAD]
    igb2: Bound queue 1 to cpu 1
    igb2: [ITHREAD]
    igb2: Bound queue 2 to cpu 2
    igb2: [ITHREAD]
    igb2: Bound queue 3 to cpu 3
    igb2: [ITHREAD]
    igb3: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xece0-0xecff mem 0xdafc0000-0xdafdffff,0xda800000-0xdabfffff,0xdaffc000-0xdaffffff irq 52 at device 0.1 on pci11
    igb3: Using MSIX interrupts with 5 vectors
    igb3: [ITHREAD]
    igb3: Bound queue 0 to cpu 0
    igb3: [ITHREAD]
    igb3: Bound queue 1 to cpu 1
    igb3: [ITHREAD]
    igb3: Bound queue 2 to cpu 2
    igb3: [ITHREAD]
    igb3: Bound queue 3 to cpu 3
    igb3: [ITHREAD]
    pci0: <base peripheral> at device 5.0 (no driver attached)
    pci0: <base peripheral> at device 5.2 (no driver attached)
    pcib6: <PCI-PCI bridge> irq 16 at device 17.0 on pci0
    pci12: <PCI bus> on pcib6
    pci0: <simple comms> at device 22.0 (no driver attached)
    pci0: <simple comms> at device 22.1 (no driver attached)
    ehci0: <EHCI (generic) USB 2.0 controller> mem 0xdc8fd000-0xdc8fd3ff irq 23 at device 26.0 on pci0
    ehci0: [ITHREAD]
    usbus0: EHCI version 1.0
    usbus0: <EHCI (generic) USB 2.0 controller> on ehci0
    pcib7: <ACPI PCI-PCI bridge> at device 28.0 on pci0
    pci13: <ACPI PCI bus> on pcib7
    pcib8: <PCI-PCI bridge> at device 0.0 on pci13
    pci14: <PCI bus> on pcib8
    pcib9: <PCI-PCI bridge> at device 2.0 on pci14
    pci15: <PCI bus> on pcib9
    igb4: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xdcc0-0xdcdf mem 0xdbba0000-0xdbbbffff,0xdb000000-0xdb3fffff,0xdbbf8000-0xdbbfbfff irq 18 at device 0.0 on pci15
    igb4: Using MSIX interrupts with 5 vectors
    igb4: [ITHREAD]
    igb4: Bound queue 0 to cpu 0
    igb4: [ITHREAD]
    igb4: Bound queue 1 to cpu 1
    igb4: [ITHREAD]
    igb4: Bound queue 2 to cpu 2
    igb4: [ITHREAD]
    igb4: Bound queue 3 to cpu 3
    igb4: [ITHREAD]
    igb5: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xdce0-0xdcff mem 0xdbbc0000-0xdbbdffff,0xdb400000-0xdb7fffff,0xdbbfc000-0xdbbfffff irq 19 at device 0.1 on pci15
    igb5: Using MSIX interrupts with 5 vectors
    igb5: [ITHREAD]
    igb5: Bound queue 0 to cpu 0
    igb5: [ITHREAD]
    igb5: Bound queue 1 to cpu 1
    igb5: [ITHREAD]
    igb5: Bound queue 2 to cpu 2
    igb5: [ITHREAD]
    igb5: Bound queue 3 to cpu 3
    igb5: [ITHREAD]
    pcib10: <PCI-PCI bridge> at device 4.0 on pci14
    pci16: <PCI bus> on pcib10
    igb6: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xccc0-0xccdf mem 0xdc7a0000-0xdc7bffff,0xdbc00000-0xdbffffff,0xdc7f8000-0xdc7fbfff irq 16 at device 0.0 on pci16
    igb6: Using MSIX interrupts with 5 vectors
    igb6: [ITHREAD]
    igb6: Bound queue 0 to cpu 0
    igb6: [ITHREAD]
    igb6: Bound queue 1 to cpu 1
    igb6: [ITHREAD]
    igb6: Bound queue 2 to cpu 2
    igb6: [ITHREAD]
    igb6: Bound queue 3 to cpu 3
    igb6: [ITHREAD]
    igb7: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xcce0-0xccff mem 0xdc7c0000-0xdc7dffff,0xdc000000-0xdc3fffff,0xdc7fc000-0xdc7fffff irq 17 at device 0.1 on pci16
    igb7: Using MSIX interrupts with 5 vectors
    igb7: [ITHREAD]
    igb7: Bound queue 0 to cpu 0
    igb7: [ITHREAD]
    igb7: Bound queue 1 to cpu 1
    igb7: [ITHREAD]
    igb7: Bound queue 2 to cpu 2
    igb7: [ITHREAD]
    igb7: Bound queue 3 to cpu 3
    igb7: [ITHREAD]
    
    
    2.0.3 RELEASE
    
    last pid: 13783;  load averages:  0.23,  0.06,  0.02                                                                                                             up 0+01:28:46  18:53:33
    43 processes:  1 running, 42 sleeping
    CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
    CPU 1:  0.0% user,  0.0% nice,  0.0% system, 48.1% interrupt, 51.9% idle
    CPU 2:  0.0% user,  0.0% nice,  1.1% system,  0.0% interrupt, 98.9% idle
    CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
    Mem: 42M Active, 18M Inact, 222M Wired, 68K Cache, 28M Buf, 3606M Free
    Swap: 8192M Total, 8192M Free
    
    Constant 12.9-13.5% total CPU (without the -P parameter)
    
    
    
    2.1.1-PRERELEASE (Feb 2nd snapshot)
    
    last pid: 61308;  load averages:  0.26,  0.13,  0.05                                                                                                             up 0+01:39:32  19:11:46
    43 processes:  1 running, 42 sleeping
    CPU 0:  0.0% user,  0.0% nice,  0.0% system,  3.8% interrupt, 96.2% idle
    CPU 1:  0.0% user,  0.0% nice,  0.0% system, 32.3% interrupt, 67.7% idle
    CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
    CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
    Mem: 81M Active, 29M Inact, 236M Wired, 2004K Cache, 61M Buf, 3537M Free
    Swap: 8192M Total, 8192M Free
    
    A more variable 8.8% to 15% total CPU (without the -P parameter), averaging close to 2.0.3 performance.
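    As a side note, a quick way to reduce the vmstat -i listing above to one number per NIC is to sum the last column (the average rate), which gives the aggregate interrupts per second.  This is just a sketch against a captured snippet; on the box itself you would pipe `vmstat -i | grep 'igb0:que'` into the same awk filter.

    ```shell
    # Sum the last column (average interrupt rate) of a captured vmstat -i snippet.
    sample='irq256: igb0:que 0              14991721       3131
    irq257: igb0:que 1                 18873          3
    irq258: igb0:que 2                 21796          4
    irq259: igb0:que 3               3118071        651'
    printf '%s\n' "$sample" | awk '{ total += $NF } END { print total }'
    # prints 3789 (aggregate interrupts/sec across igb0's queues)
    ```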
    
    

  • Rebel Alliance Developer Netgate

    @Jason:

    Any idea why the interrupts are fairly low on the box running 2.1?

    I'm not sure, but I'd have to say it's a difference in the way the driver is driving the hardware.  Either it's using a different tactic (e.g. legacy interrupts rather than MSI/MSI-X), it's being reported differently by the driver, or it may just be driving the card harder and using it more fully.

    Speaking of MSI/MSIX, have you tried running with that forced off?

    loader.conf.local settings for that are:
    hw.pci.enable_msi=0
    hw.pci.enable_msix=0



  • @jimp:

    @Jason:

    Any idea why the interrupts are fairly low on the box running 2.1?

    I'm not sure, but I'd have to say it's a difference in the way the driver is driving the hardware.  Either it's using a different tactic (e.g. legacy interrupts rather than MSI/MSI-X), it's being reported differently by the driver, or it may just be driving the card harder and using it more fully.

    Speaking of MSI/MSIX, have you tried running with that forced off?

    loader.conf.local settings for that are:
    hw.pci.enable_msi=0
    hw.pci.enable_msix=0

    Not since upgrading to 2.1.1, no.  Getting rid of that setting was one of the reasons I upgraded a box before 2.1.1 was fully baked.

    I'll give it a try.

    EDIT:  With MSI-X disabled the box gets stuck in a reboot loop.  It happens at the line "Configuring VLAN interfaces"; no output past that, it just reboots.
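    In case it helps anyone who hits the same loop: you should be able to clear the tunables for a single boot from the loader prompt (press a key during the boot countdown) and then fix loader.conf.local once the box is up.  This is a sketch of the standard FreeBSD loader commands from memory, not something I've tested on this box:

    ```
    OK unset hw.pci.enable_msix
    OK unset hw.pci.enable_msi
    OK boot
    ```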