High CPU/Interrupt usage with little traffic
-
This seems a bit odd. I'm idling at about 30% CPU usage, and most of it is reported as interrupts (see attached).
last pid: 43211;  load averages:  0.79,  0.75,  0.75    up 3+18:00:18  19:08:38
69 processes:  2 running, 66 sleeping, 1 waiting
CPU:  0.0% user,  0.0% nice,  0.0% system, 37.4% interrupt, 62.6% idle
Mem: 30M Active, 103M Inact, 154M Wired, 776M Buf, 3545M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root        2 155 ki31     0K    32K RUN     1 141.0H 131.59% idle
   12 root       21 -72    -     0K   336K WAIT    1  38.2H  76.17% intr
    0 root       11 -92    0     0K   176K -       1  16:26   0.00% kernel
   15 root        1 -16    -     0K    16K -       1   4:36   0.00% rand_harvestq
29675 root        1  20    0 12456K  2176K select  1   1:30   0.00% apinger
    5 root        1 -16    -     0K    16K pftm    1   0:59   0.00% pf purge
22898 root        1  20    0 21732K  6032K select  1   0:21   0.00% openvpn
55010 root        1  52   20 17136K  2656K wait    1   0:18   0.00% sh
   20 root        1  16    -     0K    16K syncer  1   0:11   0.00% syncer
    4 root        2 -16    -     0K    32K -       0   0:09   0.00% cam
80950 root        1  20    0 21160K  4656K select  1   0:09   0.00% miniupnpd

I have three ethernet NICs - one onboard bge (LAN), one PCI-e generic realtek (re0 - Fios), and one PCI generic realtek (re1 - low speed DSL).
![Screen Shot 2016-02-21 at 7.03.55 PM.png](/public/imported_attachments/1/Screen Shot 2016-02-21 at 7.03.55 PM.png)
![Screen Shot 2016-02-21 at 7.11.41 PM.png](/public/imported_attachments/1/Screen Shot 2016-02-21 at 7.11.41 PM.png)
-
Might be worth checking what is enabled in Advanced, Networking, Network Interfaces.
If enabled, try disabling some of the hardware offloading to see if it changes anything.
-
No luck, the only thing that wasn't disabled was checksum offloading, so I disabled that as well and then manually plumbed the interfaces. No change.
I peeked at the dmesg buffer and saw this though:
arpresolve: can't allocate llinfo for 173.70.x.x on re0
Tons of it. The IP is the FiOS gateway IP. It might be a red herring, though: I looked at the historical system graphs, and the CPU has been running like this ever since I powered the box up on Thursday. The FiOS install was Friday.
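To put a number on "tons", something like the following could tally those messages from a saved copy of the dmesg output. This is only a rough sketch: the function name `count_arpresolve_failures` is made up for illustration, and the pattern assumes every message has exactly the wording shown above.

```python
import re
from collections import Counter

# Assumption: each message looks exactly like the dmesg line quoted above,
# i.e. "arpresolve: can't allocate llinfo for <ip> on <interface>".
PATTERN = re.compile(r"arpresolve: can't allocate llinfo for (\S+) on (\w+)")


def count_arpresolve_failures(dmesg_text):
    """Return a Counter keyed by (ip, interface) for each matching line."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Feeding it the text of the dmesg buffer and printing `most_common()` on the result would show whether the messages are confined to re0 and the gateway address or spread across other interfaces.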
-
More info on the NICs:
re1:
rgephy1: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus2
rgephy1: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re1: <RealTek 8169/8169S/8169SB(L)/8110S/8110SB(L) Gigabit Ethernet> port 0xcc00-0xccff mem 0xfe2ff000-0xfe2ff0ff irq 16 at device 0.0 on pci3
re1: Chip rev. 0x10000000
re1: MAC rev. 0x00000000
miibus2: <MII bus> on re1

re0:
rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xdc00-0xdcff mem 0xfe5ff000-0xfe5fffff,0xd0000000-0xd000ffff irq 16 at device 0.0 on pci1
re0: Using 1 MSI-X message
re0: Chip rev. 0x3c000000
re0: MAC rev. 0x00400000
miibus0: <MII bus> on re0

bge0:
brgphy0: <BCM57780 1000BASE-T media interface> PHY 1 on miibus1
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: <Broadcom BCM57780 A1, ASIC rev. 0x57780001> mem 0xfe4f0000-0xfe4fffff irq 16 at device 0.0 on pci2
bge0: CHIP ID 0x57780001; ASIC REV 0x57780; CHIP REV 0x577800; PCI-E
miibus1: <MII bus> on bge0
-
Strange. I'm running version 2.2.6, and every Sunday night at around midnight the CPU usage goes high on low traffic and stays that way.
If I reboot the server, the problem goes away and CPU usage drops from 45% back to 3%. Do I have a Sunday-only bug?
-
I think I might troll around the freebsd-net list to see if anyone can spot something obvious. Not terribly strange hardware here, old enough to be well supported. There's probably some tweaking to some weird boot loader variable for the bge driver that will do something. I suspect the pfsense devs are probably prioritizing paid subs and don't much frequent the forums these days.
-
> Not terribly strange hardware here, old enough to be well supported.

Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.
-
> > Not terribly strange hardware here, old enough to be well supported.
>
> Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.

Regardless, it's pretty much THE brand you're going to end up with when you buy an ethernet card. I'm sure I'm not the only one running pfsense with Realtek cards. :)
-
> > > Not terribly strange hardware here, old enough to be well supported.
> >
> > Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.
>
> Regardless, it's pretty much THE brand you're going to end up with when you buy an ethernet card. I'm sure I'm not the only one running pfsense with Realtek cards. :)

What services are running, and which packages are installed and in use?
Realtek NICs perform worse than Intel ones because of weaker driver support. That doesn't upset me, but if I choose Realtek NICs, I have to accept that they won't offload the CPU the way Intel NICs do, and I shouldn't go hunting for other causes when the Realtek NICs themselves are the likely culprit.
Go with Intel, or live with the quirks that come with Realtek's poorer driver support and cheaper hardware. That won't apply to every Realtek NIC, but it does to many of them.
-
There's an old thread that's similar to this. Can't tell if the interrupt source is the same. Are you plugging/unplugging the VGA cable?
https://forum.pfsense.org/index.php?topic=71589.0
-
> There's an old thread that's similar to this. Can't tell if the interrupt source is the same. Are you plugging/unplugging the VGA cable?
> https://forum.pfsense.org/index.php?topic=71589.0

That's bizarre.
Everything seems to be on IRQ 16:
[2.2.6-RELEASE][admin@gw.com]/root: grep "irq 16" /var/log/dmesg.boot
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xdc00-0xdcff mem 0xfe5ff000-0xfe5fffff,0xd0000000-0xd000ffff irq 16 at device 0.0 on pci1
vgapci0: <VGA-compatible display> port 0xecd8-0xecdf mem 0xfe800000-0xfebfffff,0xc0000000-0xcfffffff irq 16 at device 2.0 on pci0
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
bge0: <Broadcom BCM57780 A1, ASIC rev. 0x57780001> mem 0xfe4f0000-0xfe4fffff irq 16 at device 0.0 on pci2
re1: <RealTek 8169/8169S/8169SB(L)/8110S/8110SB(L) Gigabit Ethernet> port 0xcc00-0xccff mem 0xfe2ff000-0xfe2ff0ff irq 16 at device 0.0 on pci3
atapci0: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf irq 16 at device 31.1 on pci0
[2.2.6-RELEASE][admin@gw.sporklab.com]/root:
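
For anyone wanting to check the same thing on their own box without grepping one IRQ at a time, here is a small sketch that groups devices by the IRQ listed in a copy of dmesg.boot. The function name `devices_by_irq` is hypothetical, and the pattern assumes probe lines of the form `name: <description> ... irq N at device ...` as in the output above.

```python
import re
from collections import defaultdict

# Assumption: probe lines look like "re0: <description> ... irq 16 at device 0.0 on pci1".
IRQ_LINE = re.compile(r"^(\w+): <[^>]*>.*?\birq (\d+) at device", re.MULTILINE)


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the device names that dmesg lists on it."""
    groups = defaultdict(list)
    for name, irq in IRQ_LINE.findall(dmesg_text):
        groups[int(irq)].append(name)
    return dict(groups)
```

Note that this only reflects the legacy IRQ line printed at probe time. re0 reports using an MSI-X message, so it may not actually be firing on IRQ 16; `vmstat -i` shows live per-source interrupt counts and is probably the better place to see which device is generating the load.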
-
> > > Not terribly strange hardware here, old enough to be well supported.
> >
> > Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.
>
> Regardless, it's pretty much THE brand you're going to end up with when you buy an ethernet card. I'm sure I'm not the only one running pfsense with Realtek cards. :)

Correct, you're also not the only one complaining about their Realtek NIC having issues. I only use Intel on all of my computers, even if that means I have to purchase a $70 NIC because my motherboard doesn't have one.
-
> bge0: <Broadcom BCM57780 A1, ASIC

Could the ASIC on that NIC be causing the high interrupt usage? It might be worth pulling the Broadcom and Realtek NICs and testing again with only an Intel quad-port NIC. You may see better results than you do now.
-
Well, updated BIOS from A05 to A07 and after the reboot for that, CPU usage is back to normal and has remained so for a few days. So either that BIOS update corrected something or the reboot temporarily masked the problem. I suspect the BIOS was the fix since my RRD graphs show that there was no dip in CPU usage after previous reboots.
As for Realtek, I still think it's best to work with them if possible. Plenty of home pfSense users are not going to spend $60 each on NICs. The Realteks may suck if you really need full line rate 24/7, but as long as I can get 100Mb/s in each direction, I'm happy (as I would think would be the case with most home users).
One of my FreeBSD buddies does state that they do officially support Realtek, and rather than telling people to go run Linux (where the Realteks are not as flaky) or switch to something else, users should open bug reports if there seems to be a real driver issue.