Netgate Discussion Forum

    Ix driver doesn't allow num_queues = 1, generates high interrupts, crashes

    Category: 2.1.1 Snapshot Feedback and Problems - RETIRED
    10 Posts, 4 Posters, 3.4k Views
    • jasonlitka

      Despite reporting the same driver version number (2.5.15) as the ix driver in pfSense 2.1, the driver included with 2.1.1 doesn't behave the same way.

      First, it seems impossible to disable multiple queues.  No matter what you set in /boot/loader.conf.local, the driver forces a value of 4.
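      (For reference, the knob being set here is presumably the stock FreeBSD ixgbe loader tunable, hw.ixgbe.num_queues; a sketch of the sort of line that gets ignored, assuming pfSense passes it straight through to the driver:)

        # /boot/loader.conf.local -- request a single queue (2.1.1 brings up 4 regardless)
        hw.ixgbe.num_queues=1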

      Second, when using a 2.1 box as an iperf client and a 2.1.1 box as the server, the ix driver generates about 400 interrupts per second on the 2.1 system but around 8,000 per second on the 2.1.1 system.  If I switch the directions I'd have expected those numbers to swap, but that isn't the case: with the 2.1 box as the server and the 2.1.1 box as the client, the 2.1.1 box only generates the 400 interrupts per second, while the 2.1 box, rather than jumping to 8,000, only hits 600.  Oh, and in this configuration the 2.1.1 box crashes after about a minute or so of constant traffic on the interface.
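      (For anyone reproducing this, a sketch of the test setup, assuming stock iperf and a placeholder address for the server box:)

        # on the server box
        iperf -s
        # on the client box: a sustained TCP test toward the server (192.0.2.1 is a placeholder)
        iperf -c 192.0.2.1 -t 60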

      Changing to a different 10GbE port on the boxes does not change anything, nor does swapping out the SFP+ direct-attach cable I was using for a pair of Intel MM optics and an MM OM3 fiber patch.

      Here's a crash dump:

      http://pastebin.com/AMY3MpNY

      EDIT:  I'm also going to throw in that I seem to be pegged at 2 Gbit/s.  Not sure if this is due to 2.1.1 or not; I didn't install these cards until after I had upgraded the backup box.


    • Guest (gonzopancho)

        I think you're going to have to wait for 2.2 for that one.  In fact, does it occur on the 2.2 alpha?

      • jasonlitka

          @gonzopancho:

          I think you're going to have to wait for 2.2 for that one.  In fact, does it occur on the 2.2 alpha?

          I haven't tried it. Didn't know there was a working ISO.  What's the stability/functionality level as compared to 2.1.1?  Can a 2.2 box be the backup for a 2.1 box or has the config format changed?

          I'm not opposed to running prerelease code on a backup firewall but it's got to stay running at least most of the time.


        • jimp (Rebel Alliance Developer, Netgate)

            @Jason:

            First, it seems impossible to disable multiple queues.  No matter what you set in /boot/loader.conf.local, the driver forces a value of 4.

            Second, when using a 2.1 box as an iperf client and a 2.1.1 box as the server, the ix driver generates about 400 interrupts per second on the 2.1 system but around 8,000 per second on the 2.1.1 system.  If I switch the directions I'd have expected those numbers to swap, but that isn't the case: with the 2.1 box as the server and the 2.1.1 box as the client, the 2.1.1 box only generates the 400 interrupts per second, while the 2.1 box, rather than jumping to 8,000, only hits 600.  Oh, and in this configuration the 2.1.1 box crashes after about a minute or so of constant traffic on the interface.

            Changing to a different 10GbE port on the boxes does not change anything, nor does swapping out the SFP+ direct-attach cable I was using for a pair of Intel MM optics and an MM OM3 fiber patch.

            You might find this interesting reading (especially for the interrupts):

            https://github.com/freebsd/freebsd/blob/master/sys/dev/ixgbe/README


          • jasonlitka

              @jimp:

              You might find this interesting reading (especially for the interrupts):

              https://github.com/freebsd/freebsd/blob/master/sys/dev/ixgbe/README

              The only thing I really saw in there about interrupts was the storm threshold, which I've already got set to 10000.
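              (That's the hw.intr_storm_threshold tunable, for reference; a sketch of the setting as I have it, assuming it goes in /boot/loader.conf.local:)

                # /boot/loader.conf.local -- raise the interrupt storm detection threshold
                hw.intr_storm_threshold=10000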

              Did I miss something?


              • jimp (Rebel Alliance Developer, Netgate)

                Mostly, it confirms the interrupt behavior is expected/normal and not a problem.

                Doesn't explain the crash, but it means that bit is unrelated. There are lots of other tuning suggestions in that file as well that are worth trying out.


                • jasonlitka

                  @jimp:

                  Mostly, it confirms the interrupt behavior is expected/normal and not a problem.

                  Doesn't explain the crash, but it means that bit is unrelated. There are lots of other tuning suggestions in that file as well that are worth trying out.

                  Any idea why the interrupts are fairly low on the box running 2.1?

                  I've already done pretty much all the tuning stuff out there, adding one at a time to see what changes.  They've all done basically nothing.


                  • adam65535

                    EDIT: Added CPU stats.
                    EDIT2: To be clear up front: I am using the igb driver, not the ix driver this thread is about.

                    I noticed that 2.1.1-PRERELEASE seems to ignore num_queues too (I defined 2 and it set up 4 on each interface).  Performance in TCP iperf tests (about 660 Mbit/s) looks roughly the same to me as on the other system with only 2 queues, but the limiting factor might be the Linux desktops I am testing with, which use cheap desktop Realtek 1GbE interfaces.  I am probably not at the point of stressing the driver on FreeBSD.  I have 2 desktops on two different interfaces (WAN, LAN) of the firewall, going through a load balancer config (default load balancer) on the firewall.  I also have a full production rule base running on it (70 rules total across all the interfaces).  The systems are only running the iperf test, so nothing else is happening during the test.
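                    (By "defined 2" I mean the stock igb loader tunable; a sketch, assuming pfSense passes it straight through to the driver:)

                      # /boot/loader.conf.local -- request 2 queues per igb interface
                      hw.igb.num_queues=2   # 2.1.1 still brings up 4 per interface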

                    I don't know if it matters or not… maybe the new driver handles num_queues differently now, even though we are using the workaround that is supposed to allow ALTQ again.  I haven't tested ALTQ with the new driver.  There is definitely a queue assignment change with the new driver, though.

                    I see about double the number of interrupts, but it doesn't seem to impact performance, for the simple test I am doing anyway.

                    I have 2 Dell R320 servers that I am testing; each uses two PRO/1000 ET2 quad-port cards.  The primary is still on 2.0.3 and the backup was just upgraded to the 2.1.1 snapshot from the 2nd of this month.  I disabled config sync on the primary (but left state sync on, btw) to make sure there are no config sync issues going from 2.1.1 to 2.0.3.

                    The only other odd thing is that vmstat -i only shows 2 queues for igb3 and igb4, yet dmesg says there were 4 queues for every interface.  I have 8 interfaces in total but only the first 5 (igb0-igb4) are enabled, one of which (igb4) is used for sync.  This might just be some automatic balancing going on.

                    irq256: igb0:que 0              14991721       3131
                    irq257: igb0:que 1                 18873          3
                    irq258: igb0:que 2                 21796          4
                    irq259: igb0:que 3               3118071        651
                    irq260: igb0:link                      2          0
                    irq261: igb1:que 0               4798753       1002
                    irq262: igb1:que 1               6032416       1260
                    irq263: igb1:que 2               6661729       1391
                    irq264: igb1:que 3               6865118       1434
                    irq265: igb1:link                      2          0
                    irq266: igb2:que 0                 11997          2
                    irq267: igb2:que 1                  2938          0
                    irq268: igb2:que 2                     1          0
                    irq269: igb2:que 3                     2          0
                    irq270: igb2:link                     10          0
                    irq271: igb3:que 0                 11436          2
                    irq273: igb3:que 2                     2          0
                    irq275: igb3:link                     10          0
                    irq276: igb4:que 0                496736        103
                    irq279: igb4:que 3                380983         79
                    irq280: igb4:link                     10          0
                    
                    
                    igb0: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xfcc0-0xfcdf mem 0xda3a0000-0xda3bffff,0xd9800000-0xd9bfffff,0xda3f8000-0xda3fbfff irq 53 at device 0.0 on pci10
                    igb0: Using MSIX interrupts with 5 vectors
                    igb0: [ITHREAD]
                    igb0: Bound queue 0 to cpu 0
                    igb0: [ITHREAD]
                    igb0: Bound queue 1 to cpu 1
                    igb0: [ITHREAD]
                    igb0: Bound queue 2 to cpu 2
                    igb0: [ITHREAD]
                    igb0: Bound queue 3 to cpu 3
                    igb0: [ITHREAD]
                    igb1: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xfce0-0xfcff mem 0xda3c0000-0xda3dffff,0xd9c00000-0xd9ffffff,0xda3fc000-0xda3fffff irq 54 at device 0.1 on pci10
                    igb1: Using MSIX interrupts with 5 vectors
                    igb1: [ITHREAD]
                    igb1: Bound queue 0 to cpu 0
                    igb1: [ITHREAD]
                    igb1: Bound queue 1 to cpu 1
                    igb1: [ITHREAD]
                    igb1: Bound queue 2 to cpu 2
                    igb1: [ITHREAD]
                    igb1: Bound queue 3 to cpu 3
                    igb1: [ITHREAD]
                    pcib5: <PCI-PCI bridge> at device 4.0 on pci9
                    pci11: <PCI bus> on pcib5
                    igb2: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xecc0-0xecdf mem 0xdafa0000-0xdafbffff,0xda400000-0xda7fffff,0xdaff8000-0xdaffbfff irq 48 at device 0.0 on pci11
                    igb2: Using MSIX interrupts with 5 vectors
                    igb2: [ITHREAD]
                    igb2: Bound queue 0 to cpu 0
                    igb2: [ITHREAD]
                    igb2: Bound queue 1 to cpu 1
                    igb2: [ITHREAD]
                    igb2: Bound queue 2 to cpu 2
                    igb2: [ITHREAD]
                    igb2: Bound queue 3 to cpu 3
                    igb2: [ITHREAD]
                    igb3: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xece0-0xecff mem 0xdafc0000-0xdafdffff,0xda800000-0xdabfffff,0xdaffc000-0xdaffffff irq 52 at device 0.1 on pci11
                    igb3: Using MSIX interrupts with 5 vectors
                    igb3: [ITHREAD]
                    igb3: Bound queue 0 to cpu 0
                    igb3: [ITHREAD]
                    igb3: Bound queue 1 to cpu 1
                    igb3: [ITHREAD]
                    igb3: Bound queue 2 to cpu 2
                    igb3: [ITHREAD]
                    igb3: Bound queue 3 to cpu 3
                    igb3: [ITHREAD]
                    pci0: <base peripheral> at device 5.0 (no driver attached)
                    pci0: <base peripheral> at device 5.2 (no driver attached)
                    pcib6: <PCI-PCI bridge> irq 16 at device 17.0 on pci0
                    pci12: <PCI bus> on pcib6
                    pci0: <simple comms> at device 22.0 (no driver attached)
                    pci0: <simple comms> at device 22.1 (no driver attached)
                    ehci0: <EHCI (generic) USB 2.0 controller> mem 0xdc8fd000-0xdc8fd3ff irq 23 at device 26.0 on pci0
                    ehci0: [ITHREAD]
                    usbus0: EHCI version 1.0
                    usbus0: <EHCI (generic) USB 2.0 controller> on ehci0
                    pcib7: <ACPI PCI-PCI bridge> at device 28.0 on pci0
                    pci13: <ACPI PCI bus> on pcib7
                    pcib8: <PCI-PCI bridge> at device 0.0 on pci13
                    pci14: <PCI bus> on pcib8
                    pcib9: <PCI-PCI bridge> at device 2.0 on pci14
                    pci15: <PCI bus> on pcib9
                    igb4: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xdcc0-0xdcdf mem 0xdbba0000-0xdbbbffff,0xdb000000-0xdb3fffff,0xdbbf8000-0xdbbfbfff irq 18 at device 0.0 on pci15
                    igb4: Using MSIX interrupts with 5 vectors
                    igb4: [ITHREAD]
                    igb4: Bound queue 0 to cpu 0
                    igb4: [ITHREAD]
                    igb4: Bound queue 1 to cpu 1
                    igb4: [ITHREAD]
                    igb4: Bound queue 2 to cpu 2
                    igb4: [ITHREAD]
                    igb4: Bound queue 3 to cpu 3
                    igb4: [ITHREAD]
                    igb5: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xdce0-0xdcff mem 0xdbbc0000-0xdbbdffff,0xdb400000-0xdb7fffff,0xdbbfc000-0xdbbfffff irq 19 at device 0.1 on pci15
                    igb5: Using MSIX interrupts with 5 vectors
                    igb5: [ITHREAD]
                    igb5: Bound queue 0 to cpu 0
                    igb5: [ITHREAD]
                    igb5: Bound queue 1 to cpu 1
                    igb5: [ITHREAD]
                    igb5: Bound queue 2 to cpu 2
                    igb5: [ITHREAD]
                    igb5: Bound queue 3 to cpu 3
                    igb5: [ITHREAD]
                    pcib10: <PCI-PCI bridge> at device 4.0 on pci14
                    pci16: <PCI bus> on pcib10
                    igb6: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xccc0-0xccdf mem 0xdc7a0000-0xdc7bffff,0xdbc00000-0xdbffffff,0xdc7f8000-0xdc7fbfff irq 16 at device 0.0 on pci16
                    igb6: Using MSIX interrupts with 5 vectors
                    igb6: [ITHREAD]
                    igb6: Bound queue 0 to cpu 0
                    igb6: [ITHREAD]
                    igb6: Bound queue 1 to cpu 1
                    igb6: [ITHREAD]
                    igb6: Bound queue 2 to cpu 2
                    igb6: [ITHREAD]
                    igb6: Bound queue 3 to cpu 3
                    igb6: [ITHREAD]
                    igb7: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xcce0-0xccff mem 0xdc7c0000-0xdc7dffff,0xdc000000-0xdc3fffff,0xdc7fc000-0xdc7fffff irq 17 at device 0.1 on pci16
                    igb7: Using MSIX interrupts with 5 vectors
                    igb7: [ITHREAD]
                    igb7: Bound queue 0 to cpu 0
                    igb7: [ITHREAD]
                    igb7: Bound queue 1 to cpu 1
                    igb7: [ITHREAD]
                    igb7: Bound queue 2 to cpu 2
                    igb7: [ITHREAD]
                    igb7: Bound queue 3 to cpu 3
                    igb7: [ITHREAD]
                    
                    
                    2.0.3 RELEASE
                    
                    last pid: 13783;  load averages:  0.23,  0.06,  0.02                                                                                                             up 0+01:28:46  18:53:33
                    43 processes:  1 running, 42 sleeping
                    CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                    CPU 1:  0.0% user,  0.0% nice,  0.0% system, 48.1% interrupt, 51.9% idle
                    CPU 2:  0.0% user,  0.0% nice,  1.1% system,  0.0% interrupt, 98.9% idle
                    CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                    Mem: 42M Active, 18M Inact, 222M Wired, 68K Cache, 28M Buf, 3606M Free
                    Swap: 8192M Total, 8192M Free
                    
                    Constant 12.9-13.5% total CPU (without the -P parameter).
                    
                    
                    
                    2.1.1 PRE-RELEASE 2nd of Feb
                    
                    last pid: 61308;  load averages:  0.26,  0.13,  0.05                                                                                                             up 0+01:39:32  19:11:46
                    43 processes:  1 running, 42 sleeping
                    CPU 0:  0.0% user,  0.0% nice,  0.0% system,  3.8% interrupt, 96.2% idle
                    CPU 1:  0.0% user,  0.0% nice,  0.0% system, 32.3% interrupt, 67.7% idle
                    CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                    CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                    Mem: 81M Active, 29M Inact, 236M Wired, 2004K Cache, 61M Buf, 3537M Free
                    Swap: 8192M Total, 8192M Free
                    
                    A more variable 8.8% to 15% total CPU (without the -P parameter), averaging close to 2.0.3's performance.
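                    (For anyone comparing, a sketch of how numbers like these are gathered, assuming the stock FreeBSD tools:)

                      # per-CPU usage breakdown (the CPU 0/1/2/3 lines above)
                      top -P
                      # per-queue interrupt totals and rates
                      vmstat -i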
                    
                    
                    • jimp (Rebel Alliance Developer, Netgate)

                      @Jason:

                      Any idea why the interrupts are fairly low on the box running 2.1?

                      I'm not sure, but I'd have to say it's a difference in the way the driver is hitting the hardware. Either it is using a different tactic (e.g. legacy interrupts rather than MSI/MSIX), or it's being reported differently by the driver, or it may just be driving the card harder, using it more fully.

                      Speaking of MSI/MSIX, have you tried running with that forced off?

                      loader.conf.local settings for that are:
                      hw.pci.enable_msi=0
                      hw.pci.enable_msix=0
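
                      (A quick sanity check after the reboot, for what it's worth: with MSI-X forced off you should see a single legacy/MSI interrupt line per NIC in vmstat -i in place of the per-queue vectors:)

                        vmstat -i | egrep 'ix|igb'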


                      • jasonlitka

                        @jimp:

                        @Jason:

                        Any idea why the interrupts are fairly low on the box running 2.1?

                        I'm not sure, but I'd have to say it's a difference in the way the driver is hitting the hardware. Either it is using a different tactic (e.g. legacy interrupts rather than MSI/MSIX), or it's being reported differently by the driver, or it may just be driving the card harder, using it more fully.

                        Speaking of MSI/MSIX, have you tried running with that forced off?

                        loader.conf.local settings for that are:
                        hw.pci.enable_msi=0
                        hw.pci.enable_msix=0

                        Not since upgrading to 2.1.1, no.  Getting rid of that setting was one of the reasons I upgraded a box before 2.1.1 was fully baked.

                        I'll give it a try.

                        EDIT:  With MSIX disabled the box gets stuck in a reboot loop.  It happens at the line "Configuring VLAN interfaces".  No output past that, it just reboots.

