Intel ET2 Quad Very High CPU usage - IGB driver



  • Hi Guys,

    After seeing a hundred threads where people talk about gigabit connections without any concrete data, it would be interesting to see the specs of people with gigabit home connections and whether they are getting the full gigabit.
    Please post below in a format similar to mine:

    I've recently upgraded my home internet connection to a 1000 Mbps down / 200 Mbps up FTTH line from MEO,
    only to discover that the one-year-old router I built won't cut it:

    MB: Asus Mini-ITX N3150I-C
    CPU: Embedded Intel Celeron SoC N3150 Quad-Core 1.6 GHz (supposed to burst to 2.08 GHz, but I'm not sure if pfSense takes advantage of that).
    Memory: 8GB DDR3
    Disk: Kingston V300 120GB SSD
    Network: Intel Pro/1000 ET2 Quad Port

    I'm still on the old ONT, which may be a bottleneck (waiting on a hardware change from the ISP in a week), but the service is already set to the new speed at the central office.
    This will be a "before" look: I'm maxing out at around 270 Mbps down, 200 Mbps up on speedtest.net to a server in the same city hosted by a different ISP.

    Will post a new result if it improves with the new ONT in a week.

    Thanks.

    EDIT:
    Since this topic has turned into a troubleshooting topic, I've updated the subject to reflect that.



  • Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).



  • @johnkeates:

    Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).

    I have a Celeron and it doesn't even come close, so do you have proof of what you are saying?



  • @johnkeates:

    Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).

    "Celeron" doesn't really mean anything. The N3150 is an airmont (same basic architecture as the C2xx atoms). There are also celerons based on Intel's high performance architectures, like the kaby lake G3950, which have dramatically higher performance (and power consumption). You need to look at the specific chip and ignore the marketing labels. The N3150 was a decent budget processor a couple of years ago (I'd get a J3355 today instead) while the G3950 is a decent candidate for "value priced high speed VPN" solution.

    All that said, I'd expect an N3150 to do gigabit just fine. At the data rates you're talking about, I'd guess you're dealing with PPPoE. Best option is to see if there's a way to reconfigure your provider to deliver straight IP with DHCP instead. If you're stuck with PPPoE you need a high clock low core chip. The G series celerons are a good starting place. :-)


  • Banned

    @ralms:

    MB: Asus Mini-ITX N3150I-C
    CPU: Embedded Intel Celeron SoC N3150 Quad-Core 1.6 GHz (supposed to burst to 2.08 GHz, but I'm not sure if pfSense takes advantage of that).
    Network: Intel Pro/1000 ET2 Quad Port

    I have the exact same MB with a similar setup. ASUS disabled the burst mode in the BIOS because of the passive cooling, but it can still easily route 1 Gbit/s between the ports of the Intel card (tested with 2 PCs and iperf3).
    The onboard Realtek port is total crap though, disable it in the BIOS and use the Intel ports only.



  • @ralms:

    @johnkeates:

    Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).

    I have a Celeron and it doesn't even come close, so do you have proof of what you are saying?

    Didn't know this was a trial, but there is plenty of proof out there. I used a N2840 which did gigabit just fine, but if you want to check for yourself, here is the path:

    1. Find recent celerons or better https://ark.intel.com/Search/FeatureFilter?productType=processors&FamilyText=Intel® Celeron® Processor&BornOnDate=Q3'14

    2. Plug celeron (or better, i.e. pentium, core) that is recent (i.e. after Q2 2014) and pfsense into google

    3. ????

    4. Profit!

    Regarding the point I was making: unless you buy a really bad CPU, any recent 'normal' CPU will work just fine, and since the selection of compatible CPUs is limited to AES-NI, you can't really get a super slow CPU that doesn't do gigabit anymore.

    Celeron: https://ark.intel.com/Search/FeatureFilter?productType=processors&FamilyText=Intel® Celeron® Processor&AESTech=true
    Pentium: https://ark.intel.com/Search/FeatureFilter?productType=processors&FamilyText=Intel® Pentium® Processor&AESTech=true
    Core: https://ark.intel.com/Search/FeatureFilter?productType=processors&FamilyText=Intel® Core™ Processors&AESTech=true

    Unless there is a 2014+ CPU in there running below 1Ghz you probably won't have an issue running gigabit speeds, well, unless you plug in USB ethernet, or PCI realtek stuff.



  • @johnkeates:

    @ralms:

    @johnkeates:

    Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).

    I have a Celeron and it doesn't even come close, so do you have proof of what you are saying?

    Didn't know this was a trial, but there is plenty of proof out there. I used a N2840 which did gigabit just fine,

    PPPoE?



  • @VAMike:

    @johnkeates:

    @ralms:

    @johnkeates:

    Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).

    I have a Celeron and it doesn't even come close, so do you have proof of what you are saying?

    Didn't know this was a trial, but there is plenty of proof out there. I used a N2840 which did gigabit just fine,

    PPPoE?

    Hell no, who wants PPPoE. Wasn't running OpenVPN either. Keep in mind we are talking about the base functions, routing, NAT, DNS and DHCP. As soon as you want more (see my first post in this thread) you're gonna need something else. But that applies to most requirements.

    If we're going to put up a thread on what hardware you need to run gigabit, we should have a few setups, like:

    • Just internet
    • Multiple LANs
    • VPN
    • PPPoE
    • IDS/IPS

    Not everyone needs all of the extras, but if someone does, we should have a thing like: need PPPoE? Add 1Ghz on the clock speed you need for non-PPPoE. Need VPN? Add 2Ghz to the base clock speed. Need IDS? Add RAM and more cores.



  • @johnkeates:

    Not everyone needs all of the extras, but if someone does, we should have a thing like: need PPPoE? Add 1Ghz on the clock speed you need for non-PPPoE. Need VPN? Add 2Ghz to the base clock speed. Need IDS? Add RAM and more cores.

    Talking about GHz is a silly way to do it. Again, the architecture matters, and a 2GHz airmont isn't the same as a 2GHz Skylake.



  • @VAMike:

    @johnkeates:

    Not everyone needs all of the extras, but if someone does, we should have a thing like: need PPPoE? Add 1Ghz on the clock speed you need for non-PPPoE. Need VPN? Add 2Ghz to the base clock speed. Need IDS? Add RAM and more cores.

    Talking about GHz is a silly way to do it. Again, the architecture matters, and a 2GHz airmont isn't the same as a 2GHz Skylake.

    The Ghz thing wasn't relevant, just an example to point out that one-size-fits-all doesn't exist, and if we want to have some sort of ordering of 'what you need to do X' we'll need to start with a base setup that does normal internet security gateway stuff like WAN static/DHCP, NAT, Firewall, LAN DHCP and DNS. Anything on top of that will require something 'more'. This way, we can prevent the "setup xyz was supposed to do gigabit but does not work for me" type of situations.



  • @VAMike:

    @johnkeates:

    Gigabit can be done on any recent celeron and above. Most issues arise from non-scaling things (like PPPoE and OpenVPN) and from network inspection (snort, suricata).

    "Celeron" doesn't really mean anything. The N3150 is an airmont (same basic architecture as the C2xx atoms). There are also celerons based on Intel's high performance architectures, like the kaby lake G3950, which have dramatically higher performance (and power consumption). You need to look at the specific chip and ignore the marketing labels. The N3150 was a decent budget processor a couple of years ago (I'd get a J3355 today instead) while the G3950 is a decent candidate for "value priced high speed VPN" solution.

    All that said, I'd expect an N3150 to do gigabit just fine. At the data rates you're talking about, I'd guess you're dealing with PPPoE. Best option is to see if there's a way to reconfigure your provider to deliver straight IP with DHCP instead. If you're stuck with PPPoE you need a high clock low core chip. The G series celerons are a good starting place. :-)

    It's not PPPoE; it's DHCP from the ONT with VLANs, and the connection to my switch is 2 ports in LACP.
    Unless the ONT is the bottleneck (still waiting for the replacement), the speeds are really low.

    I'm not doing IDS/IPS at the moment. Only have pfBlockerNG, openvpn (0 clients atm), ntopng (off atm).
    Considering the NIC I have is pretty good, in System -> Advanced -> Networking, I have all network Interfaces options Off.

    So it's still hard to explain the speed test maxing out at 270 Mbps.
    Since most of you are saying the hardware should be enough, I went ahead and did some iperf tests.
    The pfSense box ran as the server with default settings (over SSH: iperf -s), and my Windows machine was the client.

    Tests:

    | Command                 | Result (SUM)   |
    | -t 20 -i 2              | 110 Mbits/sec  |
    | -t 20 -i 2 -P 2         | 112 Mbits/sec  |
    | -t 20 -i 2 -P 3         | 202 Mbits/sec  |
    | -t 20 -i 2 -P 4         | 259 Mbits/sec  |
    | -t 20 -i 2 -P 10        | 286 Mbits/sec  |
    | -t 20 -i 2 -u           | 1.05 Mbits/sec |
    | -t 20 -i 2 -u -b 1000M  | 938 Mbits/sec  |

    The TCP connection is maxing at the same speeds as my internet, so it might actually be something in my local network.

    At the end, I did a test (with -t 60 -i 10 -P 10), and my pfSense box reached a load average of 11.87 with the CPU at 100%.

    This confirms my suspicion that it's CPU related.
    What I don't know is whether my configuration is what causes the higher CPU load. Any ideas on how to troubleshoot?
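
To put the TCP numbers from the table in perspective, here is a minimal Python sketch that expresses each aggregate result as a share of a nominal 1000 Mbit/s link. The figures are copied from the table above; the function and variable names are illustrative only.

```python
# Aggregate TCP results from the iperf table: parallel streams (-P) -> Mbits/sec.
tcp_results = {1: 110, 2: 112, 3: 202, 4: 259, 10: 286}

LINE_RATE_MBITS = 1000  # nominal gigabit link, before any protocol overhead


def utilisation(mbits, line_rate=LINE_RATE_MBITS):
    """Return the measured rate as a percentage of the nominal line rate."""
    return 100.0 * mbits / line_rate


for streams, mbits in sorted(tcp_results.items()):
    print(f"-P {streams:>2}: {mbits:>3} Mbits/sec = {utilisation(mbits):.1f}% of line rate")
```

Even with ten parallel streams the aggregate stays below 30% of the link, while the UDP run at -b 1000M reached 938 Mbits/sec over the same path.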



  • If basic lan-side networking causes a high CPU load, it's either bad settings or bad hardware. Settings include firmware. We've seen this often on the forums, mostly it was settings for the specific network card (check the wiki) or an unsupported/unknown card (i.e. weird off-brand NIC or USB stuff).

    Checking the PCIe first; it seems it's a x1 PCIe slot, depending on how the quad card works, you might simply be limited by the x1 speed. Normally you should be able to pull 2.5Gbit/s over x1, but if for some reason the Intel chip is optimised to balance over multiple links, that would be an issue.



  • @johnkeates:

    @VAMike:

    @johnkeates:

    Not everyone needs all of the extras, but if someone does, we should have a thing like: need PPPoE? Add 1Ghz on the clock speed you need for non-PPPoE. Need VPN? Add 2Ghz to the base clock speed. Need IDS? Add RAM and more cores.

    Talking about GHz is a silly way to do it. Again, the architecture matters, and a 2GHz airmont isn't the same as a 2GHz Skylake.

    The Ghz thing wasn't relevant, just an example to point out that one-size-fits-all doesn't exist, and if we want to have some sort of ordering of 'what you need to do X' we'll need to start with a base setup that does normal internet security gateway stuff like WAN static/DHCP, NAT, Firewall, LAN DHCP and DNS. Anything on top of that will require something 'more'. This way, we can prevent the "setup xyz was supposed to do gigabit but does not work for me" type of situations.

    My network configuration is as follows:

    PfSense Box:
    Asus Mini-ITX N3150I-C
    Installed into a 120G SSD
    8GB Ram (barely used)
    NIC: Intel Pro/1000 ET2 Quad Port

    Switch: Netgear GS724Tv4 Smart Switch

    Windows Test Client: MSI GP62MVR 6RF (can max out a gigabit connection easy when talking to my NAS over Samba, copying large files)

    The diagram is as follows:

    ISP ONT –(VLan 12, DHCP)--> PfSense --(2 ports in LACP)--> Netgear Switch --(normal Gbit port)--> Test Machine



  • @johnkeates:

    If basic lan-side networking causes a high CPU load, it's either bad settings or bad hardware. Settings include firmware. We've seen this often on the forums, mostly it was settings for the specific network card (check the wiki) or an unsupported/unknown card (i.e. weird off-brand NIC or USB stuff).

    Checking the PCIe first; it seems it's a x1 PCIe slot, depending on how the quad card works, you might simply be limited by the x1 speed. Normally you should be able to pull 2.5Gbit/s over x1, but if for some reason the Intel chip is optimised to balance over multiple links, that would be an issue.

    OK, I will look into that. I wondered whether it could be the PCIe slot, but in theory a PCIe 2.0 x1 link can deliver 500 MB/s in each direction, which should be more than plenty, since a 1000 Mbit connection tops out at around 120 MB/s.
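
The arithmetic behind that estimate can be sketched in a few lines of Python. It assumes the standard PCIe 2.0 figures (5 GT/s per lane with 8b/10b encoding) and ignores packet-level overhead on both the PCIe and Ethernet side, so treat the results as upper bounds rather than measurements.

```python
PCIE2_GT_PER_S = 5.0          # PCIe 2.0 raw signalling rate per lane, in GT/s
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding: 8 data bits per 10 bits sent


def pcie2_mbytes_per_s(lanes=1):
    """Usable PCIe 2.0 bandwidth per direction, in MB/s, before packet overhead."""
    gbits = PCIE2_GT_PER_S * ENCODING_EFFICIENCY * lanes
    return gbits * 1000 / 8


def ethernet_mbytes_per_s(mbits=1000):
    """Raw Ethernet line rate converted from Mbit/s to MB/s."""
    return mbits / 8


lane = pcie2_mbytes_per_s(1)
port = ethernet_mbytes_per_s(1000)
print(f"PCIe 2.0 x1: {lane:.0f} MB/s per direction")
print(f"1 GbE port:  {port:.0f} MB/s")
print(f"One saturated port uses {100 * port / lane:.0f}% of the lane")
```

The raw gigabit figure is 125 MB/s; the roughly 120 MB/s quoted above is what is left after Ethernet and TCP/IP framing.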



  • @ralms:

    @johnkeates:

    If basic lan-side networking causes a high CPU load, it's either bad settings or bad hardware. Settings include firmware. We've seen this often on the forums, mostly it was settings for the specific network card (check the wiki) or an unsupported/unknown card (i.e. weird off-brand NIC or USB stuff).

    Checking the PCIe first; it seems it's a x1 PCIe slot, depending on how the quad card works, you might simply be limited by the x1 speed. Normally you should be able to pull 2.5Gbit/s over x1, but if for some reason the Intel chip is optimised to balance over multiple links, that would be an issue.

    OK, I will look into that. I wondered whether it could be the PCIe slot, but in theory a PCIe 2.0 x1 link can deliver 500 MB/s in each direction, which should be more than plenty, since a 1000 Mbit connection tops out at around 120 MB/s.

    Yeah, it should fit on PCIe 2.0, even if there is a tiny amount of overhead. If the CPU utilisation gets super high, it can be because of interrupts. Another thing with Intel cards has to do with the queues that apparently sometimes get misconfigured by FreeBSD, but that's on the Wiki as well.



  • @johnkeates:

    @ralms:

    @johnkeates:

    If basic lan-side networking causes a high CPU load, it's either bad settings or bad hardware. Settings include firmware. We've seen this often on the forums, mostly it was settings for the specific network card (check the wiki) or an unsupported/unknown card (i.e. weird off-brand NIC or USB stuff).

    Checking the PCIe first; it seems it's a x1 PCIe slot, depending on how the quad card works, you might simply be limited by the x1 speed. Normally you should be able to pull 2.5Gbit/s over x1, but if for some reason the Intel chip is optimised to balance over multiple links, that would be an issue.

    OK, I will look into that. I wondered whether it could be the PCIe slot, but in theory a PCIe 2.0 x1 link can deliver 500 MB/s in each direction, which should be more than plenty, since a 1000 Mbit connection tops out at around 120 MB/s.

    Yeah, it should fit on PCIe 2.0, even if there is a tiny amount of overhead. If the CPU utilisation gets super high, it can be because of interrupts. Another thing with Intel cards has to do with the queues that apparently sometimes get misconfigured by FreeBSD, but that's on the Wiki as well.

    Do you know what a normal interrupt load is when pulling a gigabit connection like that? I see a maximum of 38% in Monitoring.



  • @ralms:

    @johnkeates:

    @ralms:

    @johnkeates:

    If basic lan-side networking causes a high CPU load, it's either bad settings or bad hardware. Settings include firmware. We've seen this often on the forums, mostly it was settings for the specific network card (check the wiki) or an unsupported/unknown card (i.e. weird off-brand NIC or USB stuff).

    Checking the PCIe first; it seems it's a x1 PCIe slot, depending on how the quad card works, you might simply be limited by the x1 speed. Normally you should be able to pull 2.5Gbit/s over x1, but if for some reason the Intel chip is optimised to balance over multiple links, that would be an issue.

    OK, I will look into that. I wondered whether it could be the PCIe slot, but in theory a PCIe 2.0 x1 link can deliver 500 MB/s in each direction, which should be more than plenty, since a 1000 Mbit connection tops out at around 120 MB/s.

    Yeah, it should fit on PCIe 2.0, even if there is a tiny amount of overhead. If the CPU utilisation gets super high, it can be because of interrupts. Another thing with Intel cards has to do with the queues that apparently sometimes get misconfigured by FreeBSD, but that's on the Wiki as well.

    Do you know what is a normal value of interrupts when pulling a Gigabit connection like that? I have Max 38% in Monitoring

    I have an i-series quad port in a machine somewhere, let me check.

    Alright, on an i340-T4 with a 4 Gbit trunk and about 1 Gbps of load I get 0.4% interrupt and 3% CPU load. It's on a Supermicro ITX board with an x16 PCIe 3.0 slot and a Xeon E3.
    It varies a bit, ran a copy from one subnet to another over the Intel card, top shows:

    CPU:  0.5% user,  0.1% nice,  0.2% system,  3.7% interrupt, 95.5% idle



  • @johnkeates:

    I have an i-series quad port in a machine somewhere, let me check.

    Alright, on an i340-T4 with a 4 Gbit trunk and about 1 Gbps of load I get 0.4% interrupt and 3% CPU load. It's on a Supermicro ITX board with an x16 PCIe 3.0 slot and a Xeon E3.
    It varies a bit, ran a copy from one subnet to another over the Intel card, top shows:

    CPU:  0.5% user,  0.1% nice,  0.2% system,  3.7% interrupt, 95.5% idle

    Ok, so I did the test again while looking at Top instead of the GUI and I got the following:
    CPU:  1.9% user,  0.0% nice, 90.0% system,  7.5% interrupt,  0.6% idle

    Something I just remembered, and I don't know if it can be the reason: I'm bridging two interfaces together, the LAN with the Wi-Fi. Can that cause this?
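
For comparing the two `top` summaries side by side, a small Python sketch like the following can pull the percentages out of a `CPU:` line. The two input strings are the lines quoted in this thread; the function name is made up for illustration.

```python
import re


def parse_top_cpu(line):
    """Turn a top 'CPU:' summary line into a {state: percent} dictionary."""
    return {name: float(pct) for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", line)}


xeon_box = parse_top_cpu("CPU:  0.5% user,  0.1% nice,  0.2% system,  3.7% interrupt, 95.5% idle")
n3150_box = parse_top_cpu("CPU:  1.9% user,  0.0% nice, 90.0% system,  7.5% interrupt,  0.6% idle")

for state in ("user", "system", "interrupt", "idle"):
    print(f"{state:>9}: Xeon E3 {xeon_box[state]:5.1f}%   N3150 {n3150_box[state]:5.1f}%")
```

Laid out this way, the gap is mostly in the system column (0.2% against 90.0%) rather than the interrupt column.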



  • @ralms:

    @johnkeates:

    I have an i-series quad port in a machine somewhere, let me check.

    Alright, on an i340-T4 with a 4 Gbit trunk and about 1 Gbps of load I get 0.4% interrupt and 3% CPU load. It's on a Supermicro ITX board with an x16 PCIe 3.0 slot and a Xeon E3.
    It varies a bit, ran a copy from one subnet to another over the Intel card, top shows:

    CPU:  0.5% user,  0.1% nice,  0.2% system,  3.7% interrupt, 95.5% idle

    Ok, so I did the test again while looking at Top instead of the GUI and I got the following:
    CPU:  1.9% user,  0.0% nice, 90.0% system,  7.5% interrupt,  0.6% idle

    Something I just remembered, and I don't know if it can be the reason: I'm bridging two interfaces together, the LAN with the Wi-Fi. Can that cause this?

    A software bridge will eat up resources pretty fast. I suspect it shouldn't matter if you are not using it, but then again maybe it even copies packets to the bridge regardless. Can you test without the bridge?



  • @johnkeates:

    A software bridge will eat up resources pretty fast. I suspect it shouldn't matter if you are not using it, but then again maybe it even copies packets to the bridge regardless. Can you test without the bridge?

    Yeah, I will test later today or maybe tomorrow morning (it's 10pm here atm xD)



  • @johnkeates:

    A software bridge will eat up resources pretty fast. I suspect it shouldn't matter if you are not using it, but then again maybe it even copies packets to the bridge regardless. Can you test without the bridge?

    Just tested without the bridge and it's the same result:
    CPU:  1.7% user,  0.0% nice, 82.1% system,  8.9% interrupt,  7.4% idle

    [SUM]  0.0-60.2 sec  1.85 GBytes  264 Mbits/sec

    Are you sure this CPU can do 1000Mbits over TCP ?
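
As a sanity check on the `[SUM]` line, the reported rate can be reproduced from the transfer size and duration. The sketch below assumes iperf counts GBytes in powers of two (2^30 bytes) and Mbits in powers of ten, which is the combination that matches the printed 264 Mbits/sec.

```python
def iperf_mbits_per_s(gbytes, seconds):
    """Convert an iperf transfer total (GBytes, 2**30 bytes each) to Mbits/sec."""
    bits = gbytes * 2**30 * 8
    return bits / seconds / 1e6


rate = iperf_mbits_per_s(1.85, 60.2)
print(f"{rate:.0f} Mbits/sec, {rate / 10:.1f}% of a 1000 Mbit link")
```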



  • @ralms:

    @johnkeates:

    A software bridge will eat up resources pretty fast. I suspect it shouldn't matter if you are not using it, but then again maybe it even copies packets to the bridge regardless. Can you test without the bridge?

    Just tested without the bridge and it's the same result:
    CPU:  1.7% user,  0.0% nice, 82.1% system,  8.9% interrupt,  7.4% idle

    [SUM]  0.0-60.2 sec  1.85 GBytes  264 Mbits/sec

    Are you sure this CPU can do 1000Mbits over TCP ?

    What process is using all that CPU? It shouldn't be doing anything special for basic port-to-port traffic.



  • @johnkeates:

    What process is using all that CPU? It shouldn't be doing anything special for basic port-to-port traffic.

    iPerf

    I just tested again and this time it managed 320 Mbits/sec and the interrupt load got really high (40%), but from what I've checked the Intel ET2 should be compatible.



  • @johnkeates:

    What process is using all that CPU? It shouldn't be doing anything special for basic port-to-port traffic.

    So after searching a bit, it seems that LAGG in LACP can cause this.
    My LACP is using ports igb2 and igb3.

    Print in attachment.

    From this view it seems that the NIC is causing most of the CPU load, but I wonder why.

    EDIT:
    I will try to install the driver from Intel tomorrow.




  • Looks like you hit the Intel NIC queue problem, this happens to some quad port users. Check the forum to fix this :)



  • @johnkeates:

    Looks like you hit the Intel NIC queue problem, this happens to some quad port users. Check the forum to fix this :)

    I've been searching everywhere and I can't find what you've mentioned.

    Based on some pages I found, I did the following:
    Set kern.ipc.nmbclusters to 1000000

    Disabled TSO, LRO and Hardware Checksum Offload

    So far no effect (so I will re-enable the Hardware options).

    Do you have a link for a post with a possible fix please?

    I will try to update the driver now, since most people seem to complain about pfSense using very old drivers.



  • Most of the problems have to do with the NIC dumping all traffic on a single queue and then causing an interrupt storm.
    This page seems to have both the tunables and sysctls to fix this https://wiki.freebsd.org/NetworkPerformanceTuning

    Also read: https://forum.pfsense.org/index.php?topic=86732.0 & https://forum.pfsense.org/index.php?topic=123462.15
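
One way to check for the single-queue symptom described above is to compare per-queue interrupt counters (on FreeBSD these can be read with `vmstat -i`). The Python sketch below only shows the comparison step: the queue labels and counts are hypothetical sample values, not measurements from the box in this thread, and the 90% threshold is an arbitrary assumption.

```python
def queue_shares(counts):
    """Return each queue's fraction of the total interrupt count."""
    total = sum(counts.values())
    if total == 0:
        return {queue: 0.0 for queue in counts}
    return {queue: n / total for queue, n in counts.items()}


def looks_single_queue(counts, threshold=0.9):
    """True when one queue handles at least `threshold` of all interrupts."""
    shares = queue_shares(counts)
    return bool(shares) and max(shares.values()) >= threshold


# Hypothetical counters for illustration only (labels and numbers are made up).
sample = {
    "igb2:que 0": 980_000,
    "igb2:que 1": 7_000,
    "igb2:que 2": 6_000,
    "igb2:que 3": 7_000,
}

for queue, share in queue_shares(sample).items():
    print(f"{queue}: {share:.1%}")
print("single-queue pattern:", looks_single_queue(sample))
```

If the counters come out roughly even across queues, the traffic is being spread and the bottleneck is more likely elsewhere.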



  • @ralms:

    So after searching a bit, it seems that LAGG in LACP can cause this.
    My LACP is using ports igb2 and igb3.

    Print in attachment.

    From this view it seems that the NIC is causing most of the CPU load, but I wonder why.

    EDIT:
    I will try to install the driver from Intel tomorrow.

    Because the LAGG is a software driver. You are not offloading anything to the NIC as when you use it on its own. I've seen the same thing happen before on a Rangeley 8-core. Just had to break the LAGG and use the physical interfaces directly to resolve.



  • @ralms:

    @johnkeates:

    What process is using all that CPU? It shouldn't be doing anything special for basic port-to-port traffic.

    So after searching a bit, it seems that LAGG in LACP can cause this.
    My LACP is using ports igb2 and igb3.

    Print in attachment.

    From this view it seems that the NIC is causing most of the CPU load, but I wonder why.

    EDIT:
    I will try to install the driver from Intel tomorrow.

    Yeah, don't run iperf from pfSense. It's vastly different than running iperf through pfSense.



  • @dreamslacker:

    Because the LAGG is a software driver. You are not offloading anything to the NIC as when you use it on its own. I've seen the same thing happen before on a Rangeley 8-core. Just had to break the LAGG and use the physical interfaces directly to resolve.

    I've tried outside the LAGG; it performs a bit better, but not significantly, and the main issue is still there.

    @Harvy66:

    Yeah, don't run iperf from pfSense. It's vastly different than running iperf through pfSense.

    Since a speed test gives the same results, I don't think that's the main problem to troubleshoot right now.


  • Netgate Administrator

    Yeah I would still expect you to see Gigabit easily but it's a much better test to use other devices for the iperf client and server.

    Just as an example I can see line rate Gigabit (~940Mbps) with pfSense as one end of the iperf test, as you're doing, on an old E4500. That's using em NICs.

    Steve

