High CPU-usage with a Intel Core i3?
-
Hi!
I have a fresh 1.2.3-RELELASE installation and some troubles. The CPU-usage is around 85-95% with a brand new hardware setup. I've had alot older systems that had alot lower CPU-usage. Totally in traffic i only have around 600 GB for the last month with an average traffic of 1.73 Mb/s. I installed PhpSysInfo and the Load Averages was only 0.77 0.35 0.16. I guess the traffic isnt the problem, but why is the CPU working so hard? I only have PhpSysInfo when it comes to packages and i just have a couple of firewall rules and no Captive Portal activated. In total there are 4 wired interfaces (Adaptec AIC-6915 and NatSemi DP8381) and 1 wireless (Ralink RT2561S-based).
The hardware is following: Intel Core i3 540 3.06 GHz, Intel DH55HC motherboard (H55 chipset), 2 GB DDR2-memory, Seagate 80 GB 2.5-inch harddrive (ST980811AS).
Tip for checking what is eating all the CPU-power?
A chart:
-
UNCHECK |–-> System ---> Advanced ---> Networking ---> Enable device polling
-
Looks like you have polling on, as ericab was hinting at. Polling uses all spare CPU cycles to check for input from the network instead of waiting for interrupts.
On an i3 I doubt you need polling, it probably wouldn't help you much at all.
If it's not polling, we'd need to see a screen or so of output from "top -SH" from the shell to see what processes are using the CPU.
-
Device polling wasnt enabled. Strange. I checked it and unchecked it just to be safe, but the CPU-usage is still very high.
This is the output from the top -sh:
_24434 root 1 8 20 6564K 4844K wait 9:01 0.00% sh
442 root 1 -58 0 21076K 18936K bpf 2:31 0.00% tcpdump
420 root 1 44 0 3268K 1152K select 2:01 0.00% syslogd
443 root 1 -8 0 3156K 784K piperd 1:05 0.00% logger
988 root 1 8 20 3156K 796K nanslp 1:04 0.00% check_reload_status
613 root 1 4 0 6168K 3128K kqread 0:30 0.00% lighttpd
24256 root 1 44 0 3312K 1540K select 0:10 0.00% hostapd
55879 nobody 1 64 20 3156K 1336K select 0:05 0.00% dnsmasq
965 root 1 8 0 3240K 1272K nanslp 0:04 0.00% cron
875 _ntp 1 44 0 3156K 1228K select 0:04 0.00% ntpd
55906 dhcpd 1 64 20 3156K 2036K select 0:02 0.00% dhcpd
890 root 1 44 0 3156K 1244K select 0:01 0.00% ntpd
997 root 1 8 0 3156K 900K nanslp 0:01 0.00% minicron
45213 root 1 -8 0 40712K 17136K piperd 0:01 0.00% php
635 root 1 4 0 44808K 20296K accept 0:01 0.00% php
22507 dhcp 1 44 0 3156K 1320K select 0:01 0.00% dhclient
625 root 1 4 0 45832K 20104K accept 0:01 0.00% php
618 root 1 4 0 45832K 18992K accept 0:00 0.00% php -
top -SH
(capital s, capital h)
Also be sure to include the header lines that list the summary of cpu/ram/etc usage.
-
Ops, sorry. Here's the full output:
_$ top -SH
last pid: 59280; load averages: 0.22, 0.11, 0.04 up 42+20:33:39 19:09:33
84 processes: 2 running, 63 sleeping, 19 waitingMem: 70M Active, 26M Inact, 38M Wired, 92K Cache, 31M Buf, 1729M Free
Swap: 4096M Total, 4096M FreePID USERNAME PRI NICE SIZE RES STATE TIME WCPU COMMAND
34 root -68 - 0K 8K WAIT 823.4H 96.00% irq19: sf1 atapci1
11 root 171 ki31 0K 8K RUN 189.3H 10.79% idle: cpu0
25 root -68 - 0K 8K WAIT 380:59 0.00% irq18: sf0 atapci0
26 root -68 - 0K 8K WAIT 202:23 0.00% irq16: sf3 ehci0
15 root 44 - 0K 8K - 125:28 0.00% yarrow
12 root -32 - 0K 8K WAIT 52:33 0.00% swi4: clock sio
33 root -68 - 0K 8K WAIT 17:49 0.00% irq22: ral0
24434 root 8 20 6564K 4844K wait 9:03 0.00% sh
442 root -58 0 21076K 18960K bpf 2:31 0.00% tcpdump
420 root 44 0 3268K 1148K select 2:01 0.00% syslogd
4 root -8 - 0K 8K - 1:57 0.00% g_up
20 root 8 - 0K 8K - 1:39 0.00% thread taskq
5 root -8 - 0K 8K - 1:16 0.00% g_down
443 root -8 0 3156K 784K piperd 1:05 0.00% logger
988 root 8 20 3156K 796K nanslp 1:04 0.00% check_reload_status
48 root 20 - 0K 8K syncer 0:46 0.00% syncer
613 root 4 0 6168K 3128K kqread 0:30 0.00% lighttpd
3 root -8 - 0K 8K - 0:25 0.00% g_event_Seems to be problems with an IRQ, maybe from the Ralink or SiS-nics?
-
grep atapci1 /var/log/dmesg.boot
It's probably an HDD controller, so something is making the HDD chew up lots of CPU.
You might try adjusting any IDE/SATA options you have in the BIOS.
-
Very strange. SF1 is one of the server NIC:s with four ports.
Sorry to take your time, but i'm terrible with FreeBSD :-/ The grep command gave me this:
atapci1: <intel ata="" controller="">port 0xf090-0xf097,0xf080-0xf083,0xf070-0xf077,0xf060-0xf063,0xf050-0xf05f,0xf040-0xf04f irq 19 at device 31.2 on pci0
atapci1: [ITHREAD]
ata4: <ata 0="" channel="">on atapci1
ata5: <ata 1="" channel="">on atapci1</ata></ata></intel>Could it be the HDD that has bad sectors etc?
-
Is the drive actually on that controller? Is anything?
atacontrol list
-
ATA channel 0:
Master: no device present
Slave: no device present
ATA channel 1:
Master: no device present
Slave: no device present
ATA channel 2:
Master: no device present
Slave: no device present
ATA channel 3:
Master: no device present
Slave: no device present
ATA channel 4:
Master: ad8 <st980811as 3.alb="">SATA revision 1.x
Slave: no device present
ATA channel 5:
Master: acd0 <optiarc dvd="" rw="" ad-5240s="" 1.03="">SATA revision 1.x
Slave: no device present</optiarc></st980811as> -
So it could be either the HDD or the DVD
You could try moving the DVD drive to another controller, see if the problem stays the same or follows the DVD drive.
-
Thanks! I'll try to change SATA-port and unplug the DVD. Very strange problem since the controller is the standard ICH11 from Intel.
-
This line
34 root -68 - 0K 8K WAIT 823.4H 96.00% irq19: sf1 atapci1
shows its the irq19 handler thats chewing up most of the CPU. It shows that AT LEAST devices sf1 and atapci1 use irq19. There may be others.
One possible explanation is that the BIOS has either incorrectly calculated the IRQs or has provided incorrect information to FreeBSD such that FreeBSD thinks that some device which really uses IRQ19 uses (say IRQ17). If this is happening then the handler for such device would be on IRQ17 and when the device interrupts on IRQ19 there is no handler there to clear the interrupt condition and so on exit from the interrupt there is an immediate interrupt again because the interrupt condition hasn't been cleared.
Another possible explanation is that there is a bug in the handler for some device on IRQ19 and in some circumstances the interrupt condition is not cleared.
Perhaps this problem will go away if you try a pfSense 2.0 snapshot build; it has newer device drivers than pfSense 1.2.3 and your motherboard has a pretty new chipset.
If you can't go to pfSense 2.0 or it doesn't help you could try removing devices to see what makes a difference. Can you put one or more of your disks on atapci1? Can you replace (at least for test) the 4 port NIC?
-
vmstat -i
Will show what devices are on an IRQ (to a point..)
I didn't catch the sf1 in there earlier, but that is an Adaptec ethernet controller… I don't think I've ever seen one of those before, not sure what kind of quality it is (or the driver for that matter) so as wallabybob mentioned it could be that as well.
Moving the devices around a bit may still prove helpful, at least in narrowing down exactly which is to blame.
It might be worth relocating both of those drives to the primary sata controller instead of ports 4/5, they are obviously connected through a different internal path. sf1 may behave itself if it doesn't have to fight for the bus.
-
I seems to have the exact same problem on my new 1.2.3 install on i3 540 and thru the Web GUI it showed a constant 20+% CPU.
When view thru "top -SH" command I get the following which showed "irq19: atapci1+" taking up 89% on CPU0top -SH
last pid: 29754; load averages: 0.00, 0.01, 0.00 up 0+05:13:43 06:09:48
87 processes: 6 running, 66 sleeping, 15 waiting
CPU: 0.0% user, 0.0% nice, 0.1% system, 23.0% interrupt, 76.8% idle
Mem: 23M Active, 9280K Inact, 37M Wired, 280K Cache, 16M Buf, 2864M Free
Swap: 8192M Total, 8192M FreePID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 171 ki31 0K 8K CPU3 3 311:54 100.00% idle: cpu3
13 root 171 ki31 0K 8K CPU1 1 311:45 100.00% idle: cpu1
12 root 171 ki31 0K 8K RUN 2 308:51 100.00% idle: cpu2
38 root -64 - 0K 8K CPU0 0 264:54 89.79% irq19: atapci1+
14 root 171 ki31 0K 8K RUN 0 47:24 11.87% idle: cpu0The "atacontrol list" command output I got:
atacontrol list
ATA channel 0:
Master: no device present
Slave: no device present
ATA channel 1:
Master: no device present
Slave: no device present
ATA channel 2:
Master: no device present
Slave: no device present
ATA channel 3:
Master: ad6 <st3500514ns sn11="">SATA revision 2.x
Slave: no device present
ATA channel 4:
Master: acd0 <hl-dt-st dvdram="" gh24ns50="" xp02="">SATA revision 1.x
Slave: no device present
ATA channel 5:
Master: no device present
Slave: no device present
ATA channel 6:
Master: no device present
Slave: no device presentand "vmstat -i" output I got:
vmstat -i
interrupt total rate
irq16: ehci0 159 0
irq19: atapci1+ 1281400417 67541
irq23: ehci1 55 0
cpu0: timer 37942915 1999
irq256: re0 364280 19
irq257: re1 353607 18
cpu2: timer 37933878 1999
cpu3: timer 37933877 1999
cpu1: timer 37933878 1999
Total 1433863066 75577So I'm wondering should I also try to relocate my SATA Hard disk to ATA channel 1 and DVD rom to ATA channel 2?
Any suggestion ?
Thanks.</hl-dt-st></st3500514ns>
-
Please post the dmesg output so we can see more about your configuration, particularly what devices are using irq19.
If you have the option to do so, i would go into the BIOS, record the mode of all the disk controllers and change them all to legacy rather than raid or AHCI and see if that makes a difference. Or try a snapshot build of pfSense 2.0.
-
This is very interesting after I poke around i can actually see that the CPU hog for IRQ19 is used by "Intel ATA Controller" as i can see it during the system bootup and it scroll by the screen.
then I check my system BIOS and my on-board SATA controller is already using IDE mode (non AHCI mode). So i went and disable just about everything I can find under the BIOS related to AHCI or disk mode and access, but nothing will make the CPU hog of IRQ19 drop. So finally I decided to change my SATA controller from IDE mode to AHCI mode, then after a reboot, my CPU usage is down to about 0% now. So this is interesting that it should be set to AHCI mode on this hardware combination.This has been a good learning experience.