Netgate Discussion Forum
    Intel ET2 Quad Very High CPU usage - IGB driver

• Guest (johnkeates)

If basic LAN-side networking causes a high CPU load, it's either bad settings or bad hardware (settings include firmware). We've seen this often on the forums; mostly it was settings for the specific network card (check the wiki) or an unsupported/unknown card (e.g. a weird off-brand NIC or USB adapter).

Checking the PCIe side first: it seems to be an x1 PCIe slot, and depending on how the quad card works you might simply be limited by the x1 speed. Normally you should be able to pull 2.5 Gbit/s over x1, but if for some reason the Intel chip is optimised to balance over multiple links, that could be an issue.

• ralms

        @johnkeates:

        @VAMike:

        @johnkeates:

Not everyone needs all of the extras, but if someone does, we should have something like: need PPPoE? Add 1 GHz to the clock speed you need for non-PPPoE. Need VPN? Add 2 GHz to the base clock speed. Need IDS? Add RAM and more cores.

Talking about GHz is a silly way to do it. Again, the architecture matters, and a 2 GHz Airmont isn't the same as a 2 GHz Skylake.

The GHz thing wasn't the point, just an example to show that one-size-fits-all doesn't exist. If we want some sort of ordering of 'what you need to do X', we'll need to start with a base setup that does normal internet security gateway stuff like WAN static/DHCP, NAT, firewall, LAN DHCP and DNS. Anything on top of that will require something 'more'. That way we can prevent the "setup xyz was supposed to do gigabit but does not work for me" type of situation.

        My network configuration is as follows:

pfSense box:
Asus Mini-ITX N3150I-C
Installed on a 120 GB SSD
8 GB RAM (barely used)
NIC: Intel PRO/1000 ET2 Quad Port

        Switch: Netgear GS724Tv4 Smart Switch

Windows test client: MSI GP62MVR 6RF (can easily max out a gigabit connection when talking to my NAS over Samba, copying large files)

        The diagram is as follows:

ISP ONT --(VLAN 12, DHCP)--> pfSense --(2 ports in LACP)--> Netgear switch --(normal Gbit port)--> test machine

• ralms

In reply to @johnkeates:

Ok, I will look into that. I thought it could be the PCIe slot, but theoretically a PCIe 2.0 x1 link can deliver 500 MB/s in each direction, which should be more than plenty, since a 1000 Mbit connection will do 120 MB/s at most.
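
Rough back-of-the-envelope numbers, assuming the slot really is PCIe 2.0 x1:

  PCIe 2.0 x1:      5 GT/s with 8b/10b encoding -> ~4 Gbit/s, ~500 MB/s per direction
  Gigabit Ethernet: 1000 Mbit/s -> ~125 MB/s raw, roughly ~118 MB/s of TCP payload

So a single gigabit stream should only need about a quarter of the x1 link.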

• Guest (johnkeates)

In reply to @ralms:

Yeah, it should fit on PCIe 2.0, even if there is a tiny amount of overhead. If the CPU utilisation gets really high, it can be because of interrupts. Another known issue with Intel cards has to do with the queues, which apparently sometimes get misconfigured by FreeBSD, but that's on the wiki as well.
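
If it does turn out to be interrupts, the per-queue counters make it obvious; from a pfSense/FreeBSD shell something like this should show one line per igb interrupt vector (the counts will obviously differ per system):

  # interrupt counters per source; each igb queue has its own vector
  vmstat -i | grep igb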

• ralms

In reply to @johnkeates:

Do you know what a normal interrupt load is when pulling a gigabit connection like that? I see a maximum of 38% in Monitoring.

• Guest (johnkeates)

In reply to @ralms:

                I have an i-series quad port in a machine somewhere, let me check.

Alright, on an I340-T4 with a 4 Gbit trunk and about 1 Gbps of load I get 0.4% interrupt and 3% CPU load. It's on a SuperMicro ITX board with an x16 PCIe 3.0 slot and a Xeon E3.
It varies a bit; running a copy from one subnet to another over the Intel card, top shows:

                CPU:  0.5% user,  0.1% nice,  0.2% system,  3.7% interrupt, 95.5% idle

• ralms

In reply to @johnkeates:

Ok, so I did the test again while looking at top instead of the GUI and I got the following:
CPU:  1.9% user,  0.0% nice, 90.0% system,  7.5% interrupt,  0.6% idle

Something I remembered, and I don't know if it can be the reason: I'm bridging two interfaces together, the LAN with the WiFi. Can that cause this?

• Guest (johnkeates)

In reply to @ralms:

                    A software bridge will eat up resources pretty fast. I suspect it shouldn't matter if you are not using it, but then again maybe it even copies packets to the bridge regardless. Can you test without the bridge?
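
For a quick test the wireless member can also be pulled out of the bridge from a shell; the names below (bridge0, ath0) are just placeholders for whatever your system actually uses, and the same thing can be done by editing the bridge in the pfSense GUI:

  # show current members of the bridge (assuming it is bridge0)
  ifconfig bridge0

  # temporarily drop the wireless interface (e.g. ath0) from the bridge
  ifconfig bridge0 deletem ath0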

• ralms

In reply to @johnkeates:

Yeah, I will test later today or maybe tomorrow morning (it's 10pm here atm xD)

• ralms

In reply to @johnkeates:

Just tested without the bridge and it's the same result:
CPU:  1.7% user,  0.0% nice, 82.1% system,  8.9% interrupt,  7.4% idle

[SUM]  0.0-60.2 sec  1.85 GBytes  264 Mbits/sec

Are you sure this CPU can do 1000 Mbit/s over TCP?

• Guest (johnkeates)

In reply to @ralms:

                          What process is using all that CPU? It shouldn't be doing anything special for basic port-to-port traffic.
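
A quick way to answer that on the box itself, using nothing pfSense-specific (just standard FreeBSD top flags):

  # -a full command lines, -S include kernel processes, -H show threads, sorted by CPU
  top -aSH -o cpu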

• ralms

In reply to @johnkeates:

                            iPerf

I just tested again and this time it managed 320 Mbit/s, and the interrupt load got really high (40%), but from what I can tell the Intel ET2 should be compatible.

• ralms

In reply to @johnkeates:

So after searching a bit, it seems that a LAGG in LACP mode can cause this.
My LACP is using ports igb2 and igb3.

Screenshot in the attachment.

From this view it seems that the NIC is causing most of the CPU load, but I wonder why.

EDIT:
I will try to install the driver from Intel tomorrow.

Attachment: pfsenseLoad.png

• Guest (johnkeates)

Looks like you hit the Intel NIC queue problem; this happens to some quad-port users. Check the forum for the fix :)

• ralms

In reply to @johnkeates:

                                  I've been searching everywhere and I can't find what you've mentioned.

From some pages I've found, I did the following:
                                  Set kern.ipc.nmbclusters to 1000000

                                  Disabled TSO, LRO and Hardware Checksum Offload

So far no effect (so I will re-enable the hardware offload options).

                                  Do you have a link for a post with a possible fix please?

I will try to update the driver now, since most people seem to complain about pfSense using very old drivers.
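
Before swapping drivers it's worth confirming which igb driver version is actually loaded; on FreeBSD/pfSense something like this shows it (the unit number 0 is just the first igb port):

  # driver description and version string for the first igb port
  sysctl dev.igb.0.%desc

  # igb driver tunables currently in effect
  sysctl hw.igb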

• Guest (johnkeates)

Most of the problems have to do with the NIC dumping all traffic onto a single queue and then causing an interrupt storm.
This page seems to have both the loader tunables and sysctls to fix it: https://wiki.freebsd.org/NetworkPerformanceTuning

                                    Also read: https://forum.pfsense.org/index.php?topic=86732.0 & https://forum.pfsense.org/index.php?topic=123462.15
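
To give an idea, the igb-related entries from those pages usually end up in /boot/loader.conf.local on pfSense and look roughly like this (values are illustrative only, and a reboot is needed for loader tunables to apply):

  # one RX/TX queue pair per core (0 = let the driver decide, up to the NIC's limit)
  hw.igb.num_queues="0"

  # let each queue drain as many packets as needed per interrupt pass
  hw.igb.rx_process_limit="-1"

  # plenty of network memory buffers (as already tried above)
  kern.ipc.nmbclusters="1000000"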

• dreamslacker

In reply to @ralms:

Because the LAGG is a software driver, you are not offloading anything to the NIC the way you would when using it on its own. I've seen the same thing happen before on a Rangeley 8-core; I just had to break the LAGG and use the physical interfaces directly to resolve it.
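
For reference, this is roughly what that LAGG looks like at the FreeBSD level (pfSense builds it for you from the GUI, so this is only to show what is happening underneath):

  # LACP lagg built from the two igb ports
  ifconfig lagg0 create
  ifconfig lagg0 laggproto lacp laggport igb2 laggport igb3 up

  # shows the negotiated LACP state and which ports are active
  ifconfig lagg0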

• Harvy66

In reply to @ralms:

Yeah, don't run iperf from pfSense. It's vastly different from running iperf through pfSense.
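
In practice that means putting both iperf endpoints on hosts on opposite sides of the firewall; the address below is just a placeholder:

  # on a machine behind the LAN interface
  iperf3 -s

  # on a machine on the other side, so the traffic is routed through pfSense
  iperf3 -c 192.168.1.10 -t 60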

• ralms

In reply to @dreamslacker:

I've tried outside the LAGG; it has a bit more performance, but nothing significant. The main issue is still there.

In reply to @Harvy66:

Since a speed test gives the same results, I don't think that's the main problem to troubleshoot right now.

• stephenw10 (Netgate Administrator)

Yeah, I would still expect you to see gigabit easily, but it's a much better test to use other devices as the iperf client and server.

Just as an example, I can see line-rate gigabit (~940 Mbps) with pfSense as one end of the iperf test, as you're doing, on an old E4500. That's using em NICs.

                                            Steve
