NIC IRQs affect performance



  • I've read the topics about performance, polling, and IRQs, but didn't find a clear answer.

    Here is the situation:
    The onboard NICs have the same IRQs as the installed cards.

    Could this affect performance? The system's average throughput is only about 10 Mb/s, BUT the average processor load is 50%.

    vmstat -i

    interrupt                          total       rate
    irq0: clk                     1029813916       1000
    irq1: atkbd0                           6          0
    irq7: em2 ohci0                 24220127         23
    irq8: rtc                      131794209        128
    irq10: ciss0                     2236992          2
    irq11: bge0 em1                583590581        566
    irq14: ata0                           80          0
    irq15: bge1 em0+               965954416        938

    Total                         2737610327       2658

    ===========================

    NICS:

    bge0: <Broadcom BCM5703 A2, ASIC rev. 0x1002> mem 0xf7cf0000-0xf7cfffff irq 11 at device 1.0 on pci2
    miibus0: <MII bus> on bge0

    bge1: <Broadcom BCM5703 A2, ASIC rev. 0x1002> mem 0xf7ce0000-0xf7ceffff irq 15 at device 2.0 on pci2
    miibus1: <MII bus> on bge1

    em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x4000-0x403f mem 0xf7de0000-0xf7dfffff irq 15 at device 1.0 on pci3

    em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x5000-0x503f mem 0xf7fe0000-0xf7ffffff,0xf7f80000-0xf7fbffff irq 11 at device 1.0 on pci6

    em2: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x5040-0x507f mem 0xf7f60000-0xf7f7ffff irq 7 at device 1.1 on pci6

    ===========================

    vmstat 5

    procs      memory      page                    disks     faults      cpu
    r b w     avm     fre   flt  re  pi  po  fr  sr da0 pa0    in   sy    cs us sy id
    0 4 0   84196 1940056   469   0   0   1 419   0   0   0  4058 1312  4100  1 18 81
    0 4 0   84196 1940056    78   0   0  15  69   0  13   0  4839  169  5601  0 15 85
    0 4 0   84196 1940056    78   0   0   0  70   0   0   0  5877  166  6978  0 16 84

    Any ideas?



  • When devices share an interrupt line, whenever there is an interrupt on that line the system has to call each of the interrupt handlers to see if the corresponding device needs service. So, in your case, whenever there is an interrupt on irq11 the system has to call the handler for bge0 and the handler for em1. This is additional overhead compared with having a single device on the interrupt line. Even if em1 is not enabled and therefore shouldn't be requesting interrupts, its interrupt handler still gets called every time bge0 requests an interrupt.

    Whether the interrupt sharing makes a big difference depends on interrupt rates and processor speed and how quickly each of the handlers can determine if the corresponding device has requested an interrupt.

    Sometimes the interrupt lines can be tweaked to some extent in the BIOS - if you play around with the settings you MIGHT be able to reduce the interrupt sharing.

    Whether 50% CPU load is reasonable for 10Mb/s throughput depends on CPU type and speed and whatever else you have running in the system. The top command ('top -S' to see system processes) can be useful for seeing what processes are big CPU consumers.

    Lots of complex firewall rules could also be costly in CPU time especially if typical packets end up having to be checked against many rules.
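
    If you want to see which rules are doing the work, pf keeps per-rule counters. Something like the following (the exact output depends on your ruleset) shows how many times each rule has been evaluated, so you can spot rules that get checked for nearly every packet:

    ```
    # pfctl -v -s rules
    ```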

    My pfSense runs on a 800MHz VIA C3 CPU and has over 95% idle time (reported by top) when downloading a 13MB file at about 60kB/s.

    I'm curious why you say average processor load is 50% when the vmstat output you provided shows an idle time of over 80%.



  • @wallabybob:

    When devices share an interrupt line, whenever there is an interrupt on that line the system has to call each of the interrupt handlers to see if the corresponding device needs service. So, in your case, whenever there is an interrupt on irq11 the system has to call the handler for bge0 and the handler for em1. This is additional overhead compared with having a single device on the interrupt line. Even if em1 is not enabled and therefore shouldn't be requesting interrupts, its interrupt handler still gets called every time bge0 requests an interrupt.

    Whether the interrupt sharing makes a big difference depends on interrupt rates and processor speed and how quickly each of the handlers can determine if the corresponding device has requested an interrupt.

    Sometimes the interrupt lines can be tweaked to some extent in the BIOS - if you play around with the settings you MIGHT be able to reduce the interrupt sharing.

    Whether 50% CPU load is reasonable for 10Mb/s throughput depends on CPU type and speed and whatever else you have running in the system. The top command ('top -S' to see system processes) can be useful for seeing what processes are big CPU consumers.

    Lots of complex firewall rules could also be costly in CPU time especially if typical packets end up having to be checked against many rules.

    My pfSense runs on a 800MHz VIA C3 CPU and has over 95% idle time (reported by top) when downloading a 13MB file at about 60kB/s.

    I'm curious why you say average processor load is 50% when the vmstat output you provided shows an idle time of over 80%.

    Thanks for the explanation.

    Just to make sure how unusual the load on our server is, and whether it is related to the NICs' IRQs, let me list some statistics.

    dmesg

    Copyright © 1992-2007 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
    FreeBSD is a registered trademark of The FreeBSD Foundation.
    FreeBSD 6.2-RELEASE-p11 #0: Sun Feb 24 16:49:14 EST 2008
        sullrich@builder6.pfsense.com:/usr/obj.pfSense/usr/src/sys/pfSense_SMP.6
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2789.37-MHz 686-class CPU)
      Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
      Features=0xbfebf9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x4400<CNXT-ID,<b14>>
      Logical CPUs per core: 2
    real memory  = 2147459072 (2047 MB)
    avail memory = 2092097536 (1995 MB)

    netstat -A -w 10 -h

    input        (Total)          output
      packets  errs      bytes    packets  errs      bytes colls
          42K    0      9.9M        43K    0        18M    0
          47K    0      7.6M        45K    0        14M    0
          50K    0      8.0M        47K    0        14M    0
          55K    0      9.6M        52K    0        17M    0

    Current throughput is 10.4 Mb/s according to the pfSense status page.

    #top -S

    last pid: 63725;  load averages:  0.81,  0.48, 0.38
    up 12+20:50:21  15:17:39
    87 processes:  3 running, 68 sleeping, 1 zombie, 15 waiting
    CPU states:  0.0% user,  0.0% nice,  3.7% system, 67.9% interrupt, 28.4% idle
    Mem: 32M Active, 13M Inact, 50M Wired, 26M Buf, 1905M Free
    Swap: 4096M Total, 4096M Free

    PID USERNAME  THR PRI NICE  SIZE    RES STATE    TIME  WCPU COMMAND
      23 root        1 -68 -187    0K    8K RUN    37.4H    43.07% irq15: bge1 em0+
      10 root        1 171  52    0K    8K RUN      249.2H  38.53% idle:  cpu0
      11 root        1 -44 -163    0K    8K WAIT    576:03  6.69%  swi1: net
      37 root        1 171  52    0K    8K pgzero  9:06    0.34%  pagezero

    43.07% on irq15: bge1 em0+. Is that the problem?



  • What else uses irq 15? The secondary disk controller? usb? (the "+" in the irq 15 line in the vmstat output indicates there are more devices than those listed).

    Does the system do much disk i/o, e.g. web cache?
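
    If you want a quick look at disk activity, either of these (both are in the base system) will show per-disk transfer rates over an interval:

    ```
    # iostat 5
    # systat -vmstat 5
    ```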



  • @wallabybob:

    What else uses irq 15? The secondary disk controller? usb? (the "+" in the irq 15 line in the vmstat output indicates there are more devices than those listed).

    Does the system do much disk i/o, e.g. web cache?

    I have no idea what else might be on irq 15
    Is there a way to find it out?

    cat /etc/defaults/pccard.conf | grep 15

    on i386 IRQs can be any of 3 4 5 7 9 10 11 12 14 15

    irq    3 5 10 11 15



  • I have no idea what else might be on irq 15
    Is there a way to find it out?

    From the shell:

    ```
    # dmesg | grep "irq 15"
    ```

    is one way. Or you could just post your full dmesg output. (The startup output doesn't always show the interrupts assigned to all the devices.)
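
    Another option, assuming devinfo is available on your build (it is part of the FreeBSD base system), is to dump resource usage grouped by resource type; the "Interrupt request lines" section lists every device on each IRQ:

    ```
    # devinfo -u
    ```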


  • @wallabybob:

    I have no idea what else might be on irq 15
    Is there a way to find it out?

    From the shell:

    ```
    # dmesg | grep "irq 15"
    ```

    is one way. Or you could just post your full dmesg output. (The startup output doesn't always show the interrupts assigned to all the devices.)
    

    I checked the BIOS; the same IRQ was assigned to different network interfaces. After I assigned unique IRQs to the NICs (some IRQs are still shared between a NIC and other hardware like USB), I noticed that the box started to work more efficiently. Reassigning the NICs' IRQs has definitely helped.

    Now we have:

    #vmstat -i
    interrupt                          total       rate
    irq0: clk                      171322322       1000
    irq1: atkbd0                           6          0
    irq5: em1                       51624879        301
    irq7: bge1                      99709608        582
    irq8: rtc                       21925606        128
    irq9: em2 acpi0                  1551210          9
    irq10: bge0                     44770010        261
    irq11: em0 ohci0                81789134        477
    irq14: ata0                           80          0
    irq15: ata1 ciss0                 361285          2
    Total                          473054140       2762

    The rates are the same if you sum up the IRQ rates of the previously combined interfaces (for example, bge0 and em1 used to share irq11 at 566/s, and now bge0 runs at 261/s and em1 at 301/s, which adds up to about the same), but since each IRQ is now unique, the per-IRQ rate is lower. The total load on the processor is about 3 times lower according to the pfSense graphs… That's very good, but...

    #top -S

    last pid:  4611;  load averages:  0.35,  0.18,  0.12              up 1+23:40:47  17:26:22
    88 processes:  5 running, 68 sleeping, 1 zombie, 14 waiting
    CPU states:  0.0% user,  0.0% nice,  0.0% system, 20.5% interrupt, 79.5% idle
    Mem: 41M Active, 10M Inact, 39M Wired, 19M Buf, 1908M Free
    Swap: 4096M Total, 4096M Free

    PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
       10 root        1 171   52     0K     8K RUN     44.1H  78.71% idle: cpu0
      11 root        1 -44 -163     0K     8K WAIT    81:38 10.69% swi1: net
       24 root        1 -68 -187     0K     8K RUN     16:08  2.10%   irq11: em0 ohci0
       29 root        1 -68 -187     0K     8K RUN     15:17  1.37%   irq7: bge1
       14 root        1 -16    0      0K      8K -          2:55   0.05%   yarrow
       12 root        1 -32 -151     0K     8K WAIT    18:08  0.00%   swi4: clock sio
       28 root        1 -68 -187     0K     8K WAIT     6:54   0.00%   irq10: bge0
       30 root        1 -68 -187     0K     8K RUN       6:16   0.00%   irq5: em1

    Hehe, who is this swi1: net??? I quickly checked the FreeBSD section on Google and didn't find an answer. Any ideas?



  • The irq <n> threads are effectively the interrupt handlers doing time-critical processing, such as allocating another receive buffer to replace the now-filled buffer holding a received frame. The less time-critical processing (which probably includes things such as matching received packets against firewall rules) is handed off to swi1: net.

    If you don't use USB you might be able to disable all the USB controllers in the BIOS and reduce the interrupt sharing even more by giving em0 sole use of irq 11.
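
    One more thing you could experiment with (I haven't measured it on a load like yours, so take it as a suggestion only): FreeBSD 6.x has a net.isr.direct sysctl that makes inbound packets get processed in the interrupt thread instead of being queued to swi1: net. For example:

    ```
    # sysctl net.isr.direct       # show the current setting
    # sysctl net.isr.direct=1     # dispatch inbound packets in the interrupt thread
    ```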



  • @wallabybob:

    The irq <n> threads are effectively the interrupt handlers doing time-critical processing, such as allocating another receive buffer to replace the now-filled buffer holding a received frame. The less time-critical processing (which probably includes things such as matching received packets against firewall rules) is handed off to swi1: net.

    If you don't use USB you might be able to disable all the USB controllers in the BIOS and reduce the interrupt sharing even more by giving em0 sole use of irq 11.

    Thank you for the explanation.

    I think we have squeezed all we can out of IRQ assignment; I can no longer see any other device sharing an IRQ with the bge1 interface, which takes the most interrupts. Overall performance is still not satisfactory.

    The next step is to use an Intel 2-port Gigabit card in place of the built-in Broadcom one.
    And I suppose we have to look at our network activity and at how pf handles packets, especially small ones.
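
    To get a feel for the packet sizes, I plan to watch per-interface packet and byte rates (average packet size is just bytes divided by packets), for example:

    ```
    # netstat -I bge1 -w 5
    ```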

