NIC IRQs affect performance



  • I've read the topics about performance, polling, and IRQs, but didn't find a clear answer.

    Here is the situation:
    The onboard NICs have the same IRQs as the installed cards.

    Could this affect performance? The system's average throughput is only about 10 Mb/s, BUT the average processor load is 50%.

    vmstat -i

    interrupt                          total       rate
    irq0: clk                     1029813916       1000
    irq1: atkbd0                           6          0
    irq7: em2 ohci0                 24220127         23
    irq8: rtc                      131794209        128
    irq10: ciss0                     2236992          2
    irq11: bge0 em1                583590581        566
    irq14: ata0                           80          0
    irq15: bge1 em0+               965954416        938

    Total                         2737610327       2658

    ===========================

    NICS:

    bge0: <Broadcom BCM5703 A2, ASIC rev. 0x1002> mem 0xf7cf0000-0xf7cfffff irq 11 at device 1.0 on pci2
    miibus0: <MII bus> on bge0

    bge1: <Broadcom BCM5703 A2, ASIC rev. 0x1002> mem 0xf7ce0000-0xf7ceffff irq 15 at device 2.0 on pci2
    miibus1: <MII bus> on bge1

    em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x4000-0x403f mem 0xf7de0000-0xf7dfffff irq 15 at device 1.0 on pci3

    em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x5000-0x503f mem 0xf7fe0000-0xf7ffffff,0xf7f80000-0xf7fbffff irq 11 at device 1.0 on pci6

    em2: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x5040-0x507f mem 0xf7f60000-0xf7f7ffff irq 7 at device 1.1 on pci6

    ===========================

    vmstat 5

    procs      memory      page                    disks     faults      cpu
    r b w     avm     fre   flt  re  pi  po  fr  sr da0 pa0    in   sy    cs us sy id
    0 4 0   84196 1940056   469   0   0   1 419   0   0   0  4058 1312  4100  1 18 81
    0 4 0   84196 1940056    78   0   0  15  69   0  13   0  4839  169  5601  0 15 85
    0 4 0   84196 1940056    78   0   0   0  70   0   0   0  5877  166  6978  0 16 84

    Any ideas?



  • When devices share an interrupt line, whenever there is an interrupt on that line the system has to call each of the interrupt handlers to see if the corresponding device needs service. So, in your case, whenever there is an interrupt on irq11 the system has to call the handler for bge0 and the handler for em1. This is additional overhead compared with having a single device on the interrupt line. Even if em1 is not enabled and therefore shouldn't be requesting interrupts, its interrupt handler still gets called every time bge0 requests an interrupt.

    Whether the interrupt sharing makes a big difference depends on interrupt rates and processor speed and how quickly each of the handlers can determine if the corresponding device has requested an interrupt.

    Sometimes the interrupt lines can be tweaked to some extent in the BIOS - if you play around with the settings you MIGHT be able to reduce the interrupt sharing.

    Whether 50% CPU load is reasonable for 10Mb/s throughput depends on CPU type and speed and whatever else you have running in the system. The top command ('top -S' to see system processes) can be useful for seeing what processes are big CPU consumers.

    Lots of complex firewall rules could also be costly in CPU time especially if typical packets end up having to be checked against many rules.
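
    If you want to see which rules are doing the work, pf keeps per-rule counters. Something like the following (the exact output depends on your ruleset) shows how many times each rule has been evaluated, so you can spot rules that get checked for nearly every packet:

    ```
    # pfctl -v -s rules
    ```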

    My pfSense runs on a 800MHz VIA C3 CPU and has over 95% idle time (reported by top) when downloading a 13MB file at about 60kB/s.

    I'm curious why you say average processor load is 50% when the vmstat output you provided shows an idle time of over 80%.



  • @wallabybob:

    When devices share an interrupt line, whenever there is an interrupt on that line the system has to call each of the interrupt handlers to see if the corresponding device needs service. So, in your case, whenever there is an interrupt on irq11 the system has to call the handler for bge0 and the handler for em1. This is additional overhead compared with having a single device on the interrupt line. Even if em1 is not enabled and therefore shouldn't be requesting interrupts, its interrupt handler still gets called every time bge0 requests an interrupt.

    Whether the interrupt sharing makes a big difference depends on interrupt rates and processor speed and how quickly each of the handlers can determine if the corresponding device has requested an interrupt.

    Sometimes the interrupt lines can be tweaked to some extent in the BIOS - if you play around with the settings you MIGHT be able to reduce the interrupt sharing.

    Whether 50% CPU load is reasonable for 10Mb/s throughput depends on CPU type and speed and whatever else you have running in the system. The top command ('top -S' to see system processes) can be useful for seeing what processes are big CPU consumers.

    Lots of complex firewall rules could also be costly in CPU time especially if typical packets end up having to be checked against many rules.

    My pfSense runs on a 800MHz VIA C3 CPU and has over 95% idle time (reported by top) when downloading a 13MB file at about 60kB/s.

    I'm curious why you say average processor load is 50% when the vmstat output you provided shows an idle time of over 80%.

    Thanks for the explanation.

    Just to make sure how unusual the load on our server is, and whether it is related to the NICs' IRQs, let me list some statistics.

    dmesg

    Copyright © 1992-2007 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
    FreeBSD is a registered trademark of The FreeBSD Foundation.
    FreeBSD 6.2-RELEASE-p11 #0: Sun Feb 24 16:49:14 EST 2008
        sullrich@builder6.pfsense.com:/usr/obj.pfSense/usr/src/sys/pfSense_SMP.6
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2789.37-MHz 686-class CPU)
      Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
      Features=0xbfebf9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x4400<CNXT-ID,<b14>>
      Logical CPUs per core: 2
    real memory  = 2147459072 (2047 MB)
    avail memory = 2092097536 (1995 MB)

    netstat -A -w 10 -h

    input        (Total)          output
      packets  errs      bytes    packets  errs      bytes colls
          42K    0      9.9M        43K    0        18M    0
          47K    0      7.6M        45K    0        14M    0
          50K    0      8.0M        47K    0        14M    0
          55K    0      9.6M        52K    0        17M    0

    Current throughput is 10.4 Mb/s according to the pfSense status page.

    #top -S

    last pid: 63725;  load averages:  0.81,  0.48, 0.38
    up 12+20:50:21  15:17:39
    87 processes:  3 running, 68 sleeping, 1 zombie, 15 waiting
    CPU states:  0.0% user,  0.0% nice,  3.7% system, 67.9% interrupt, 28.4% idle
    Mem: 32M Active, 13M Inact, 50M Wired, 26M Buf, 1905M Free
    Swap: 4096M Total, 4096M Free

    PID USERNAME  THR PRI NICE  SIZE    RES STATE    TIME  WCPU COMMAND
      23 root        1 -68 -187    0K    8K RUN    37.4H    43.07% irq15: bge1 em0+
      10 root        1 171  52    0K    8K RUN      249.2H  38.53% idle:  cpu0
      11 root        1 -44 -163    0K    8K WAIT    576:03  6.69%  swi1: net
      37 root        1 171  52    0K    8K pgzero  9:06    0.34%  pagezero

    43.07% on irq15: bge1 em0+. Is that the problem?



  • What else uses irq 15? The secondary disk controller? usb? (the "+" in the irq 15 line in the vmstat output indicates there are more devices than those listed).

    Does the system do much disk i/o, e.g. web cache?
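
    If you want a quick look at disk activity, either of these (both are in the base system) will show per-disk transfer rates over an interval:

    ```
    # iostat 5
    # systat -vmstat 5
    ```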



  • @wallabybob:

    What else uses irq 15? The secondary disk controller? usb? (the "+" in the irq 15 line in the vmstat output indicates there are more devices than those listed).

    Does the system do much disk i/o, e.g. web cache?

    I have no idea what else might be on irq 15
    Is there a way to find it out?

    cat /etc/defaults/pccard.conf | grep 15

    on i386 IRQs can be any of 3 4 5 7 9 10 11 12 14 15

    irq    3 5 10 11 15



  • I have no idea what else might be on irq 15
    Is there a way to find it out?

    From the shell:

    ```
    # dmesg | grep "irq 15"
    ```

    is one way. Or you could just post your full dmesg output. (The startup output doesn't always show the interrupts assigned to all the devices.)
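
    Another option, assuming devinfo is available on your build (it is part of the FreeBSD base system), is to dump resource usage grouped by resource type; the "Interrupt request lines" section lists every device on each IRQ:

    ```
    # devinfo -u
    ```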


  • @wallabybob:

    I have no idea what else might be on irq 15
    Is there a way to find it out?

    From the shell:

    ```
    # dmesg | grep "irq 15"
    ```

    is one way. Or you could just post your full dmesg output. (The startup output doesn't always show the interrupts assigned to all the devices.)
    

    I checked the BIOS; the same IRQ was assigned to different network interfaces. After I assigned unique IRQs to the NICs (some IRQs are still shared between a NIC and other hardware like USB), I noticed that the box started to work more efficiently. Reassigning the NICs' IRQs has definitely helped.

    Now we have:

    #vmstat -i
    interrupt                          total       rate
    irq0: clk                      171322322       1000
    irq1: atkbd0                           6          0
    irq5: em1                       51624879        301
    irq7: bge1                      99709608        582
    irq8: rtc                       21925606        128
    irq9: em2 acpi0                  1551210          9
    irq10: bge0                     44770010        261
    irq11: em0 ohci0                81789134        477
    irq14: ata0                           80          0
    irq15: ata1 ciss0                 361285          2
    Total                          473054140       2762

    The rates are the same if you sum up the IRQ rates of the previously combined interfaces (for example, bge0 and em1 used to share irq11 at 566/s, and now bge0 runs at 261/s and em1 at 301/s, which adds up to about the same), but since each IRQ is now unique, the per-IRQ rate is lower. The total load on the processor is about 3 times lower according to the pfSense graphs… That's very good, but...

    #top -S

    last pid:  4611;  load averages:  0.35,  0.18,  0.12              up 1+23:40:47  17:26:22
    88 processes:  5 running, 68 sleeping, 1 zombie, 14 waiting
    CPU states:  0.0% user,  0.0% nice,  0.0% system, 20.5% interrupt, 79.5% idle
    Mem: 41M Active, 10M Inact, 39M Wired, 19M Buf, 1908M Free
    Swap: 4096M Total, 4096M Free

    PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
       10 root        1 171   52     0K     8K RUN     44.1H  78.71% idle: cpu0
      11 root        1 -44 -163     0K     8K WAIT    81:38 10.69% swi1: net
       24 root        1 -68 -187     0K     8K RUN     16:08  2.10%   irq11: em0 ohci0
       29 root        1 -68 -187     0K     8K RUN     15:17  1.37%   irq7: bge1
       14 root        1 -16    0      0K      8K -          2:55   0.05%   yarrow
       12 root        1 -32 -151     0K     8K WAIT    18:08  0.00%   swi4: clock sio
       28 root        1 -68 -187     0K     8K WAIT     6:54   0.00%   irq10: bge0
       30 root        1 -68 -187     0K     8K RUN       6:16   0.00%   irq5: em1

    Hehe, who is this swi1: net??? I quickly checked the FreeBSD section on Google and didn't find an answer. Any ideas?



  • The irq <n> threads are effectively the interrupt handlers doing time-critical processing, such as allocating another receive buffer to replace the now-filled buffer holding a received frame. The less time-critical processing (which probably includes things such as matching received packets against firewall rules) is handed off to swi1: net.

    If you don't use USB you might be able to disable all the USB controllers in the BIOS and reduce the interrupt sharing even more by giving em0 sole use of irq 11.
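
    One more thing you could experiment with (I haven't measured it on a load like yours, so take it as a suggestion only): FreeBSD 6.x has a net.isr.direct sysctl that makes inbound packets get processed in the interrupt thread instead of being queued to swi1: net. For example:

    ```
    # sysctl net.isr.direct       # show the current setting
    # sysctl net.isr.direct=1     # dispatch inbound packets in the interrupt thread
    ```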



  • @wallabybob:

    The irq <n> threads are effectively the interrupt handlers doing time-critical processing, such as allocating another receive buffer to replace the now-filled buffer holding a received frame. The less time-critical processing (which probably includes things such as matching received packets against firewall rules) is handed off to swi1: net.

    If you don't use USB you might be able to disable all the USB controllers in the BIOS and reduce the interrupt sharing even more by giving em0 sole use of irq 11.

    Thank you for the explanation.

    I think we have squeezed all we can out of IRQ assignment; I can no longer see any other device sharing an IRQ with the bge1 interface, which takes the most interrupts. Overall performance is still not satisfactory.

    The next step is to use an Intel 2-port Gigabit card in place of the built-in Broadcom one.
    And I suppose we have to look at our network activity and at how pf handles packets, especially small ones.
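
    To get a feel for the packet sizes, I plan to watch per-interface packet and byte rates (average packet size is just bytes divided by packets), for example:

    ```
    # netstat -I bge1 -w 5
    ```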

