High CPU/Interrupt usage with little traffic



  • This seems a bit odd.  I'm idling at about 30% CPU usage, most of which is reportedly interrupt time (see attached).

    
    last pid: 43211;  load averages:  0.79,  0.75,  0.75                                                        up 3+18:00:18  19:08:38
    69 processes:  2 running, 66 sleeping, 1 waiting
    CPU:  0.0% user,  0.0% nice,  0.0% system, 37.4% interrupt, 62.6% idle
    Mem: 30M Active, 103M Inact, 154M Wired, 776M Buf, 3545M Free
    Swap: 8192M Total, 8192M Free
    
      PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
       11 root        2 155 ki31     0K    32K RUN     1 141.0H 131.59% idle
       12 root       21 -72    -     0K   336K WAIT    1  38.2H  76.17% intr
        0 root       11 -92    0     0K   176K -       1  16:26   0.00% kernel
       15 root        1 -16    -     0K    16K -       1   4:36   0.00% rand_harvestq
    29675 root        1  20    0 12456K  2176K select  1   1:30   0.00% apinger
        5 root        1 -16    -     0K    16K pftm    1   0:59   0.00% pf purge
    22898 root        1  20    0 21732K  6032K select  1   0:21   0.00% openvpn
    55010 root        1  52   20 17136K  2656K wait    1   0:18   0.00% sh
       20 root        1  16    -     0K    16K syncer  1   0:11   0.00% syncer
        4 root        2 -16    -     0K    32K -       0   0:09   0.00% cam
    80950 root        1  20    0 21160K  4656K select  1   0:09   0.00% miniupnpd
    

    I have three Ethernet NICs: one onboard bge (LAN), one generic PCI-e Realtek (re0, FiOS), and one generic PCI Realtek (re1, low-speed DSL).
    ![Screen Shot 2016-02-21 at 7.03.55 PM.png](/public/imported_attachments/1/Screen Shot 2016-02-21 at 7.03.55 PM.png)
    ![Screen Shot 2016-02-21 at 7.11.41 PM.png](/public/imported_attachments/1/Screen Shot 2016-02-21 at 7.11.41 PM.png)
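
    A possible next step here is to see which interrupt source is actually busy. On FreeBSD, `vmstat -i` lists per-source interrupt totals and rates. The sketch below is only an illustration: `top_irq_sources` is a hypothetical helper name, and it assumes each data line of the output ends with a total column and a rate column.

```shell
#!/bin/sh
# Rank interrupt sources by rate from `vmstat -i`-style text on stdin.
# Assumption: every data line ends with two numeric columns (total, rate).
# top_irq_sources is a hypothetical helper, not part of FreeBSD or pfSense.
top_irq_sources() {
    awk '
        # Skip the header line and the summary line.
        $1 == "interrupt" || $1 == "Total" { next }
        # Keep only lines whose last two fields are plain integers.
        NF >= 3 && $(NF-1) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
            # Emit "rate<TAB>source" so sort can order by rate.
            print $NF "\t" name
        }
    ' | sort -rn | head -n "${1:-5}"
}
```

    Run on the firewall as `vmstat -i | top_irq_sources 3`; the source with the highest rate is the first place to look.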



  • It might be worth checking what is enabled under System > Advanced > Networking, in the Network Interfaces section.

    If enabled, try disabling some of the hardware offloading to see if it changes anything.
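
    For testing from a shell, the same offloads can usually be toggled per interface with `ifconfig`. A minimal sketch, assuming the interface names from the first post (re0, re1, bge0) and that the drivers accept these capability flags; `offload_off_cmds` is a hypothetical helper that only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) ifconfig commands that turn off common offloads.
# Assumption: the NIC drivers accept these capability flags; a driver
# may not support every one of them.
# offload_off_cmds is a hypothetical helper name.
offload_off_cmds() {
    for nic in "$@"; do
        # rxcsum/txcsum: checksum offload, tso: TCP segmentation offload,
        # lro: large receive offload. A leading "-" disables the feature.
        printf 'ifconfig %s -rxcsum -txcsum -tso -lro\n' "$nic"
    done
}

# Review the output, then pipe it to sh on the firewall to apply:
#   offload_off_cmds re0 re1 bge0 | sh
```

    Changes made this way do not survive a reboot, so the GUI checkboxes remain the persistent route; this is only handy for a quick before/after comparison.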



  • No luck. The only offload that wasn't already disabled was checksum offloading, so I disabled that as well and then manually re-plumbed the interfaces.  No change.

    I peeked at the dmesg buffer and saw this though:

    arpresolve: can't allocate llinfo for 173.70.x.x on re0

    Tons of it.  The IP is the FiOS gateway IP.  It might be a red herring, though: I looked at the historical system graphs, and the CPU has been running like this ever since I powered the box up on Thursday.  The FiOS install was Friday.



  • More info on the NICs:

    
    re1:
    
    rgephy1: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus2
    rgephy1:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
    re1: <RealTek 8169/8169S/8169SB(L)/8110S/8110SB(L) Gigabit Ethernet> port 0xcc00-0xccff mem 0xfe2ff000-0xfe2ff0ff irq 16 at device 0.0 on pci3
    re1: Chip rev. 0x10000000
    re1: MAC rev. 0x00000000
    miibus2: <MII bus> on re1
    
    re0:
    
    rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
    rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
    re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xdc00-0xdcff mem 0xfe5ff000-0xfe5fffff,0xd0000000-0xd000ffff irq 16 at device 0.0 on pci1
    re0: Using 1 MSI-X message
    re0: Chip rev. 0x3c000000
    re0: MAC rev. 0x00400000
    miibus0: <MII bus> on re0
    
    bge0:
    
    brgphy0: <BCM57780 1000BASE-T media interface> PHY 1 on miibus1
    brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    bge0: <Broadcom BCM57780 A1, ASIC rev. 0x57780001> mem 0xfe4f0000-0xfe4fffff irq 16 at device 0.0 on pci2
    bge0: CHIP ID 0x57780001; ASIC REV 0x57780; CHIP REV 0x577800; PCI-E
    miibus1: <MII bus> on bge0
    


  • Strange. I'm running version 2.2.6, and every Sunday night at around midnight the CPU usage goes high with little traffic and stays that way.
    If I reboot the server, the problem goes away and CPU usage drops from 45% back to 3%.

    Do I have a Sunday-only bug?



  • I think I might troll around the freebsd-net list to see if anyone can spot something obvious.  Not terribly strange hardware here, old enough to be well supported.  There's probably some obscure boot loader tunable for the bge driver that would help.  I suspect the pfSense devs are prioritizing paid subs and don't frequent the forums much these days.



  • @sporkme:

    Not terribly strange hardware here, old enough to be well supported.

    Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.



  • @Harvy66:

    @sporkme:

    Not terribly strange hardware here, old enough to be well supported.

    Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.

    Regardless, it's pretty much THE brand you're going to end up with when you buy an ethernet card.  I'm sure I'm not the only one running pfsense with Realtek cards. :)



  • @sporkme:

    @Harvy66:

    @sporkme:

    Not terribly strange hardware here, old enough to be well supported.

    Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.

    Regardless, it's pretty much THE brand you're going to end up with when you buy an ethernet card.  I'm sure I'm not the only one running pfsense with Realtek cards. :)

    What services are running, and which packages are installed and in use?
    It doesn't sadden or anger me that Realtek NICs perform worse because of their driver support,
    but if I choose to use Realtek NICs, and the CPU is therefore not offloaded as well as it would be with Intel NICs,
    I also shouldn't go hunting for other causes besides the Realtek NICs, which simply
    don't perform as well as the Intel ones.

    Go with Intel, or live with the quirks that come with Realtek's weaker driver support and cheaper
    hardware. Not every Realtek NIC will necessarily behave this way, but many of them do.



  • There's an old thread that's similar to this. Can't tell if the interrupt source is the same. Are you plugging/unplugging the VGA cable?

    https://forum.pfsense.org/index.php?topic=71589.0



  • @darkcrucible:

    There's an old thread that's similar to this. Can't tell if the interrupt source is the same. Are you plugging/unplugging the VGA cable?

    https://forum.pfsense.org/index.php?topic=71589.0

    That's bizarre.

    Everything seems to be on IRQ 16:

    [2.2.6-RELEASE][admin@gw.com]/root: grep "irq 16" /var/log/dmesg.boot
    pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
    re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xdc00-0xdcff mem 0xfe5ff000-0xfe5fffff,0xd0000000-0xd000ffff irq 16 at device 0.0 on pci1
    vgapci0: <VGA-compatible display> port 0xecd8-0xecdf mem 0xfe800000-0xfebfffff,0xc0000000-0xcfffffff irq 16 at device 2.0 on pci0
    pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
    bge0: <Broadcom BCM57780 A1, ASIC rev. 0x57780001> mem 0xfe4f0000-0xfe4fffff irq 16 at device 0.0 on pci2
    re1: <RealTek 8169/8169S/8169SB(L)/8110S/8110SB(L) Gigabit Ethernet> port 0xcc00-0xccff mem 0xfe2ff000-0xfe2ff0ff irq 16 at device 0.0 on pci3
    atapci0: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf irq 16 at device 31.1 on pci0
    [2.2.6-RELEASE][admin@gw.sporklab.com]/root:
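
    To see at a glance which devices dmesg reports on each IRQ, the boot log can be grouped by IRQ number. A rough sketch, assuming lines of the form shown above (`name: <description> ... irq N ...`); `devices_by_irq` is a hypothetical helper name.

```shell
#!/bin/sh
# Group device names by the IRQ number reported in dmesg-style text on stdin.
# Assumption: each relevant line starts with "name:" and contains "irq N".
# devices_by_irq is a hypothetical helper name.
devices_by_irq() {
    awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "irq" && $(i + 1) ~ /^[0-9]+$/) {
                    dev = $1
                    sub(/:$/, "", dev)      # strip the trailing colon
                    irq = $(i + 1)
                    if (irq in names) {
                        names[irq] = names[irq] " " dev
                    } else {
                        order[++n] = irq    # remember first-seen order
                        names[irq] = dev
                    }
                    break
                }
            }
        }
        END {
            for (k = 1; k <= n; k++)
                print "irq " order[k] ": " names[order[k]]
        }
    '
}
```

    Usage: `devices_by_irq < /var/log/dmesg.boot`. Keep in mind that this reflects the IRQ reported at probe time. re0 also logs "Using 1 MSI-X message", so it may not actually share IRQ 16 at runtime; `vmstat -i` shows what is really firing.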
    


  • @sporkme:

    @Harvy66:

    @sporkme:

    Not terribly strange hardware here, old enough to be well supported.

    Realtek is rarely ever well supported. It's a crappy brand with crappy drivers.

    Regardless, it's pretty much THE brand you're going to end up with when you buy an ethernet card.  I'm sure I'm not the only one running pfsense with Realtek cards. :)

    Correct, you're also not the only one complaining about their Realtek NIC having issues. I only use Intel on all of my computers, even if that means I have to purchase a $70 NIC because my motherboard doesn't have one.



  • bge0: <Broadcom BCM57780 A1, ASIC

    Could it be that the ASIC on the NIC is causing the higher interrupt usage?
    It might be worth removing the Broadcom and Realtek NICs and
    testing again with only an Intel quad-port NIC.
    
    Perhaps you would then see better results than you do now.
    


  • Well, I updated the BIOS from A05 to A07, and after the reboot for that, CPU usage is back to normal and has remained so for a few days.  So either the BIOS update corrected something or the reboot temporarily masked the problem.  I suspect the BIOS was the fix, since my RRD graphs show no dip in CPU usage after previous reboots.

    As for Realtek, I still think it's best to work with them if possible.  Plenty of home pfSense users are not going to spend $60 each on NICs.  The Realteks may suck if you really need full line rate 24/7, but as long as I can get 100Mb/s in each direction, I'm happy (as I would think most home users would be).

    One of my FreeBSD buddies says that FreeBSD does officially support Realtek, and that rather than telling people to go run Linux (where the Realteks are not as flaky) or switch to something else, users should open bug reports if there seems to be a real driver issue.