Netgate Discussion Forum

Ix driver doesn't allow num_queues = 1, generates high interrupts, crashes

2.1.1 Snapshot Feedback and Problems - RETIRED
jasonlitka
Jan 30, 2014, 11:37 PM (last edited Jan 30, 2014, 11:56 PM)

    Despite reporting the same driver version number (2.5.15) as the ix driver in pfSense 2.1, the driver included with 2.1.1 doesn't behave the same way.

    First, it seems impossible to disable multiple queues: no matter what you set in /boot/loader.conf.local, the driver forces a value of 4.
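
    For reference, here's the shape of what I've been putting in /boot/loader.conf.local. Treat it as a sketch: the exact tunable prefix is an assumption on my part, since this driver family has used both hw.ixgbe.* and, on later versions, hw.ix.*:

        # /boot/loader.conf.local -- sketch; the tunable prefix is an assumption
        # (hw.ixgbe.* on this driver generation, hw.ix.* on later ones)
        hw.ixgbe.num_queues=1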

    Second, with a 2.1 box as the iperf client and a 2.1.1 box as the server, the ix driver generates about 400 interrupts per second on the 2.1 system but around 8,000 per second on the 2.1.1 system. I'd have expected those numbers to swap when I reverse the direction, but they don't: with the 2.1 box as the server and the 2.1.1 box as the client, the 2.1.1 box only generates the 400 interrupts per second, while the 2.1 box, rather than jumping to 8,000, only hits 600. Oh, and in this configuration, the 2.1.1 box crashes after about a minute or so of constant traffic on the interface.
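
    For the record, the test itself is plain iperf in TCP mode, roughly like this (a sketch; 192.0.2.1 stands in for the server box's address):

        # on the box acting as the server:
        iperf -s
        # on the box acting as the client (192.0.2.1 is a placeholder):
        iperf -c 192.0.2.1 -t 60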

    Changing to a different 10GbE port on the boxes does not change anything, nor does swapping out the SFP+ direct-attach cable I was using for a pair of Intel MM optics and an MM OM3 fiber patch.

    Here's a crash dump:

    http://pastebin.com/AMY3MpNY

    EDIT: I'll also throw in that I seem to be pegged at 2 Gbit/s. Not sure if this is due to 2.1.1 or not; I didn't install these cards until after I had upgraded the backup box.

Guest
Jan 31, 2014, 3:49 AM

      I think you're going to have to wait for 2.2 for that one.  In fact, does it occur on the 2.2 alpha?

jasonlitka
Jan 31, 2014, 1:14 PM

        @gonzopancho:

        I think you're going to have to wait for 2.2 for that one.  In fact, does it occur on the 2.2 alpha?

        I haven't tried it; I didn't know there was a working ISO. What's the stability/functionality level compared to 2.1.1? Can a 2.2 box be the backup for a 2.1 box, or has the config format changed?

        I'm not opposed to running prerelease code on a backup firewall but it's got to stay running at least most of the time.

jimp (Rebel Alliance Developer, Netgate)
Jan 31, 2014, 4:51 PM

          @Jason:

           First, it seems impossible to disable multiple queues: no matter what you set in /boot/loader.conf.local, the driver forces a value of 4.

           Second, with a 2.1 box as the iperf client and a 2.1.1 box as the server, the ix driver generates about 400 interrupts per second on the 2.1 system but around 8,000 per second on the 2.1.1 system. I'd have expected those numbers to swap when I reverse the direction, but they don't: with the 2.1 box as the server and the 2.1.1 box as the client, the 2.1.1 box only generates the 400 interrupts per second, while the 2.1 box, rather than jumping to 8,000, only hits 600. Oh, and in this configuration, the 2.1.1 box crashes after about a minute or so of constant traffic on the interface.

           Changing to a different 10GbE port on the boxes does not change anything, nor does swapping out the SFP+ direct-attach cable I was using for a pair of Intel MM optics and an MM OM3 fiber patch.

           You might find this interesting reading (especially regarding the interrupts):

          https://github.com/freebsd/freebsd/blob/master/sys/dev/ixgbe/README

jasonlitka
Jan 31, 2014, 6:18 PM

            @jimp:

             You might find this interesting reading (especially regarding the interrupts):

            https://github.com/freebsd/freebsd/blob/master/sys/dev/ixgbe/README

             The only thing I really saw in there about interrupts was the storm threshold, which I've already got set to 10000.
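
             To be specific, I'm assuming hw.intr_storm_threshold (the stock FreeBSD sysctl) is the knob the README means; this is what I already have set:

                 # /etc/sysctl.conf or System > Advanced > System Tunables
                 # raise the rate at which FreeBSD declares an interrupt "storm"
                 hw.intr_storm_threshold=10000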

            Did I miss something?

jimp (Rebel Alliance Developer, Netgate)
Jan 31, 2014, 6:30 PM

              Mostly, it confirms the interrupt behavior is expected/normal and not a problem.

              Doesn't explain the crash, but it means that bit is unrelated. There are lots of other tuning suggestions in the files as well that are worth trying out.

jasonlitka
Jan 31, 2014, 7:35 PM

                @jimp:

                Mostly, it confirms the interrupt behavior is expected/normal and not a problem.

                Doesn't explain the crash, but it means that bit is unrelated. There are lots of other tuning suggestions in the files as well that are worth trying out.

                Any idea why the interrupts are fairly low on the box running 2.1?

                 I've already done pretty much all the tuning suggestions out there, adding them one at a time to see what changed. None of them has made any real difference.

adam65535
Feb 4, 2014, 12:01 AM (last edited Feb 4, 2014, 2:01 PM)

                  EDIT: Added CPU stats
                   EDIT2: To make it clear up front: I am using the igb driver, not the ix driver.

                   I noticed that 2.1.1-PRERELEASE seems to ignore num_queues too (I defined 2 and it set up 4 on each interface). Performance in TCP iperf tests (about 660 Mbit/s) looks about the same to me as on the other system with only 2 queues, but the limiting factor might be the Linux desktops I am testing with, which use cheap desktop Realtek 1 Gb interfaces, so I am probably not at the point of stressing the driver on FreeBSD. I have two desktops on two different interfaces (WAN, LAN) of the firewall, going through a load balancer config (default load balancer) on the firewall, which also runs a full production rule base (70 rules total across all the interfaces). Nothing else is happening on the systems during the iperf test.

                   I don't know if it matters or not… maybe the new driver handles num_queues better now, even though we are using the workaround that is supposed to allow ALTQ again. I haven't tested ALTQ with the new driver. There is definitely a queue-assignment change with the new driver, though.

                   I see about double the number of interrupts, but it doesn't seem to impact performance, at least for the simple test I am doing.

                   I have two Dell R320 servers that I am testing, each using two Pro/1000 ET2 quad-port interfaces. The primary is still at 2.0.3 and the backup was just upgraded to the 2.1.1 snapshot from the 2nd of this month. I disabled config sync on the primary, to make sure there are no issues syncing config from 2.1.1 to 2.0.3, but left state sync on.

                   The only other odd thing is that vmstat -i only shows 2 queues for igb3 and igb4, yet dmesg says there were 4 queues for all interfaces. I have 8 interfaces in total but only the first few are enabled, one of which is used for sync (igb4). This might just be some automatic balancing going on.

                  irq256: igb0:que 0              14991721       3131
                  irq257: igb0:que 1                 18873          3
                  irq258: igb0:que 2                 21796          4
                  irq259: igb0:que 3               3118071        651
                  irq260: igb0:link                      2          0
                  irq261: igb1:que 0               4798753       1002
                  irq262: igb1:que 1               6032416       1260
                  irq263: igb1:que 2               6661729       1391
                  irq264: igb1:que 3               6865118       1434
                  irq265: igb1:link                      2          0
                  irq266: igb2:que 0                 11997          2
                  irq267: igb2:que 1                  2938          0
                  irq268: igb2:que 2                     1          0
                  irq269: igb2:que 3                     2          0
                  irq270: igb2:link                     10          0
                  irq271: igb3:que 0                 11436          2
                  irq273: igb3:que 2                     2          0
                  irq275: igb3:link                     10          0
                  irq276: igb4:que 0                496736        103
                  irq279: igb4:que 3                380983         79
                  irq280: igb4:link                     10          0
                  
                  
                   igb0: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xfcc0-0xfcdf mem 0xda3a0000-0xda3bffff,0xd9800000-0xd9bfffff,0xda3f8000-0xda3fbfff irq 53 at device 0.0 on pci10
                   igb0: Using MSIX interrupts with 5 vectors
                   igb0: [ITHREAD]
                   igb0: Bound queue 0 to cpu 0
                   igb0: [ITHREAD]
                   igb0: Bound queue 1 to cpu 1
                   igb0: [ITHREAD]
                   igb0: Bound queue 2 to cpu 2
                   igb0: [ITHREAD]
                   igb0: Bound queue 3 to cpu 3
                   igb0: [ITHREAD]
                   igb1: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xfce0-0xfcff mem 0xda3c0000-0xda3dffff,0xd9c00000-0xd9ffffff,0xda3fc000-0xda3fffff irq 54 at device 0.1 on pci10
                   igb1: Using MSIX interrupts with 5 vectors
                   igb1: [ITHREAD]
                   igb1: Bound queue 0 to cpu 0
                   igb1: [ITHREAD]
                   igb1: Bound queue 1 to cpu 1
                   igb1: [ITHREAD]
                   igb1: Bound queue 2 to cpu 2
                   igb1: [ITHREAD]
                   igb1: Bound queue 3 to cpu 3
                   igb1: [ITHREAD]
                   pcib5: <PCI-PCI bridge> at device 4.0 on pci9
                   pci11: <PCI bus> on pcib5
                   igb2: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xecc0-0xecdf mem 0xdafa0000-0xdafbffff,0xda400000-0xda7fffff,0xdaff8000-0xdaffbfff irq 48 at device 0.0 on pci11
                   igb2: Using MSIX interrupts with 5 vectors
                   igb2: [ITHREAD]
                   igb2: Bound queue 0 to cpu 0
                   igb2: [ITHREAD]
                   igb2: Bound queue 1 to cpu 1
                   igb2: [ITHREAD]
                   igb2: Bound queue 2 to cpu 2
                   igb2: [ITHREAD]
                   igb2: Bound queue 3 to cpu 3
                   igb2: [ITHREAD]
                   igb3: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xece0-0xecff mem 0xdafc0000-0xdafdffff,0xda800000-0xdabfffff,0xdaffc000-0xdaffffff irq 52 at device 0.1 on pci11
                   igb3: Using MSIX interrupts with 5 vectors
                   igb3: [ITHREAD]
                   igb3: Bound queue 0 to cpu 0
                   igb3: [ITHREAD]
                   igb3: Bound queue 1 to cpu 1
                   igb3: [ITHREAD]
                   igb3: Bound queue 2 to cpu 2
                   igb3: [ITHREAD]
                   igb3: Bound queue 3 to cpu 3
                   igb3: [ITHREAD]
                   pci0: <base peripheral> at device 5.0 (no driver attached)
                   pci0: <base peripheral> at device 5.2 (no driver attached)
                   pcib6: <PCI-PCI bridge> irq 16 at device 17.0 on pci0
                   pci12: <PCI bus> on pcib6
                   pci0: <simple comms> at device 22.0 (no driver attached)
                   pci0: <simple comms> at device 22.1 (no driver attached)
                   ehci0: <EHCI (generic) USB 2.0 controller> mem 0xdc8fd000-0xdc8fd3ff irq 23 at device 26.0 on pci0
                   ehci0: [ITHREAD]
                   usbus0: EHCI version 1.0
                   usbus0: <EHCI (generic) USB 2.0 controller> on ehci0
                   pcib7: <ACPI PCI-PCI bridge> at device 28.0 on pci0
                   pci13: <ACPI PCI bus> on pcib7
                   pcib8: <PCI-PCI bridge> at device 0.0 on pci13
                   pci14: <PCI bus> on pcib8
                   pcib9: <PCI-PCI bridge> at device 2.0 on pci14
                   pci15: <PCI bus> on pcib9
                   igb4: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xdcc0-0xdcdf mem 0xdbba0000-0xdbbbffff,0xdb000000-0xdb3fffff,0xdbbf8000-0xdbbfbfff irq 18 at device 0.0 on pci15
                   igb4: Using MSIX interrupts with 5 vectors
                   igb4: [ITHREAD]
                   igb4: Bound queue 0 to cpu 0
                   igb4: [ITHREAD]
                   igb4: Bound queue 1 to cpu 1
                   igb4: [ITHREAD]
                   igb4: Bound queue 2 to cpu 2
                   igb4: [ITHREAD]
                   igb4: Bound queue 3 to cpu 3
                   igb4: [ITHREAD]
                   igb5: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xdce0-0xdcff mem 0xdbbc0000-0xdbbdffff,0xdb400000-0xdb7fffff,0xdbbfc000-0xdbbfffff irq 19 at device 0.1 on pci15
                   igb5: Using MSIX interrupts with 5 vectors
                   igb5: [ITHREAD]
                   igb5: Bound queue 0 to cpu 0
                   igb5: [ITHREAD]
                   igb5: Bound queue 1 to cpu 1
                   igb5: [ITHREAD]
                   igb5: Bound queue 2 to cpu 2
                   igb5: [ITHREAD]
                   igb5: Bound queue 3 to cpu 3
                   igb5: [ITHREAD]
                   pcib10: <PCI-PCI bridge> at device 4.0 on pci14
                   pci16: <PCI bus> on pcib10
                   igb6: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xccc0-0xccdf mem 0xdc7a0000-0xdc7bffff,0xdbc00000-0xdbffffff,0xdc7f8000-0xdc7fbfff irq 16 at device 0.0 on pci16
                   igb6: Using MSIX interrupts with 5 vectors
                   igb6: [ITHREAD]
                   igb6: Bound queue 0 to cpu 0
                   igb6: [ITHREAD]
                   igb6: Bound queue 1 to cpu 1
                   igb6: [ITHREAD]
                   igb6: Bound queue 2 to cpu 2
                   igb6: [ITHREAD]
                   igb6: Bound queue 3 to cpu 3
                   igb6: [ITHREAD]
                   igb7: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xcce0-0xccff mem 0xdc7c0000-0xdc7dffff,0xdc000000-0xdc3fffff,0xdc7fc000-0xdc7fffff irq 17 at device 0.1 on pci16
                   igb7: Using MSIX interrupts with 5 vectors
                   igb7: [ITHREAD]
                   igb7: Bound queue 0 to cpu 0
                   igb7: [ITHREAD]
                   igb7: Bound queue 1 to cpu 1
                   igb7: [ITHREAD]
                   igb7: Bound queue 2 to cpu 2
                   igb7: [ITHREAD]
                   igb7: Bound queue 3 to cpu 3
                   igb7: [ITHREAD]
                  
                  
                  2.0.3 RELEASE
                  
                  last pid: 13783;  load averages:  0.23,  0.06,  0.02                                                                                                             up 0+01:28:46  18:53:33
                  43 processes:  1 running, 42 sleeping
                  CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                  CPU 1:  0.0% user,  0.0% nice,  0.0% system, 48.1% interrupt, 51.9% idle
                  CPU 2:  0.0% user,  0.0% nice,  1.1% system,  0.0% interrupt, 98.9% idle
                  CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                  Mem: 42M Active, 18M Inact, 222M Wired, 68K Cache, 28M Buf, 3606M Free
                  Swap: 8192M Total, 8192M Free
                  
                   A constant 12.9-13.5% total CPU with top run without the -P (per-CPU) flag.
                  
                  
                  
                   2.1.1-PRERELEASE (Feb 2 snapshot)
                  
                  last pid: 61308;  load averages:  0.26,  0.13,  0.05                                                                                                             up 0+01:39:32  19:11:46
                  43 processes:  1 running, 42 sleeping
                  CPU 0:  0.0% user,  0.0% nice,  0.0% system,  3.8% interrupt, 96.2% idle
                  CPU 1:  0.0% user,  0.0% nice,  0.0% system, 32.3% interrupt, 67.7% idle
                  CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                  CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                  Mem: 81M Active, 29M Inact, 236M Wired, 2004K Cache, 61M Buf, 3537M Free
                  Swap: 8192M Total, 8192M Free
                  
                   A more variable 8.8% to 15% total CPU (top without the -P flag), averaging close to 2.0.3 performance.
                  
                  
jimp (Rebel Alliance Developer, Netgate)
Feb 6, 2014, 3:28 PM

                    @Jason:

                    Any idea why the interrupts are fairly low on the box running 2.1?

                     I'm not sure, but I'd have to say it's a difference in the way the driver is hitting the hardware. Either it is using a different tactic (e.g. legacy interrupts rather than MSI/MSI-X), or it's being reported differently by the driver, or it may just be driving the card harder and using it more fully.

                    Speaking of MSI/MSIX, have you tried running with that forced off?

                     The loader.conf.local settings for that are:
                    hw.pci.enable_msi=0
                    hw.pci.enable_msix=0
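
                     After a reboot you can confirm the driver actually fell back from MSI-X. A quick check, assuming the usual vmstat -i output format:

                         # per-queue MSI-X vectors show up as "irqNNN: ix0:que 0" lines;
                         # with MSI/MSI-X disabled you should see one legacy line per NIC instead
                         vmstat -i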

jasonlitka
Feb 6, 2014, 8:58 PM (last edited Feb 6, 2014, 9:07 PM)

                      @jimp:

                      @Jason:

                      Any idea why the interrupts are fairly low on the box running 2.1?

                       I'm not sure, but I'd have to say it's a difference in the way the driver is hitting the hardware. Either it is using a different tactic (e.g. legacy interrupts rather than MSI/MSI-X), or it's being reported differently by the driver, or it may just be driving the card harder and using it more fully.

                      Speaking of MSI/MSIX, have you tried running with that forced off?

                       The loader.conf.local settings for that are:
                      hw.pci.enable_msi=0
                      hw.pci.enable_msix=0

                      Not since upgrading to 2.1.1, no.  Getting rid of that setting was one of the reasons I upgraded a box before 2.1.1 was fully baked.

                      I'll give it a try.

                       EDIT: With MSI-X disabled the box gets stuck in a reboot loop. It happens at the line "Configuring VLAN interfaces"; there is no output past that, it just reboots.
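
                       In case anyone else wedges a box the same way: the tunables can be overridden for a single boot from the loader prompt (a sketch, assuming the stock FreeBSD boot menu):

                           # escape to the loader prompt from the boot menu, then:
                           set hw.pci.enable_msix=1
                           set hw.pci.enable_msi=1
                           boot
                           # once the box is back up, remove the lines from /boot/loader.conf.local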
