discrete ethernet adapter efficiency vs onboard nic
-
I know this is splitting hairs, but I figured I'd ask you all because I'd get the correct answer.
If one were to decide to build a firewall from the ground up: take a semi-modern/modern motherboard with an onboard Intel NIC and onboard graphics (so no need to add a graphics card), and say you had a 1Gb fiber line that you wanted to make sure you squeezed all your bandwidth from.
You have a dual-port Intel server NIC that you're going to install into the first PCIe 3.0 slot on the motherboard.
Would it be more efficient to use the onboard NIC (which is built into the PCH) and the server card in the first slot together, so that, say, your WAN went into the server card and your LAN into the onboard NIC (it could be the other way around, but you get my point)? Or to use both ports on the server card and disable the onboard NIC, so that one port on the server card is WAN while the other is LAN?
My thought is: since the server card is in the first slot and has a direct connection to the CPU, and assuming data can flow across the PCIe bus full duplex simultaneously, it would be the more efficient way of running your firewall, as the PCH would be taken out of the loop and you'd avoid that hardware overhead.
thoughts?
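(A rough way to check what that slot is actually giving the card, on a FreeBSD-based firewall, would be something like the sketch below; em0/igb0 are just example driver names, use whatever your NICs attach as.)
pciconf -lv     # maps each driver name (em0, igb0, ...) to the actual chip, so you can tell the onboard NIC from the add-in card
pciconf -lc     # each device's PCI-Express capability line should show the negotiated link speed and lane width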
-
@jc1976 said in discrete ethernet adapter efficiency vs onboard nic:
I know this is splitting hairs
Yes, me again
I personally use these: https://www.supermicro.com/en/products/motherboard/M11SDV-8C+-LN4F
I350-AM4 - LOM (4X)
add-on:
https://www.supermicro.com/wdl/Networking_Drivers/CDR-NIC_1.61_for_Add-on_NIC_Cards/MANUALS/AOC-STG-i4S.pdf
There is no relevant difference in performance between the LOM and the add-on NIC; of course, I mean at 1 Gig operation.
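(If you want to verify that on your own gear, a rough sketch: push traffic through the firewall with iperf3 once per NIC arrangement and compare; the address below is just a placeholder for the host running the server side.)
iperf3 -s                          # on a host behind the LAN interface
iperf3 -c 192.0.2.10 -t 30 -P 4    # on a host on the WAN side: 30 seconds, 4 parallel streams
Then move the cables to the other NIC arrangement and run it again.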
-
Long version:
Unless you use a really fine ruler, I do not think you will measure a difference between an on-board and an add-on NIC of the same type.
With the same ruler, you are unlikely to measure a difference between the slots.
However, I think you may find a difference between NIC types. An onboard NIC is likely to be a cheaper NIC than one you can buy for the purpose you need.
Short version:
A good add-on NIC is best.
-
Onboard NICs are often in the SoC (if you are using a low-powered device) or in the PCH, as you said. That can bring efficiency advantages.
The only significant difference I would look for is that some NIC chipsets support fewer queues and therefore can't use CPU cores as efficiently.
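(A quick way to see how that plays out on a FreeBSD-based box is to look at the interrupt vectors; a sketch, assuming igb/em driver names:)
vmstat -i | egrep 'igb|em'
# with MSI-X each RX/TX queue gets its own interrupt line; a port stuck on a
# single vector can only keep roughly one core busy with interrupt work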
If you are approaching the limits of the hardware anywhere, it would probably be better to use whichever NICs have the most queues, if there's a difference.
Steve
-
What I meant was: would it be more efficient and more performant if I cut the onboard NIC out of the loop and just designated the two ports on the discrete NIC, one for WAN and one for LAN duties?
That way all data flows between the NIC and the CPU for processing, and the PCH is basically kept out of the loop.
-
Yes, I understand, and my answer is: only if the external NIC supports more queues and you have enough CPU cores to service those queues.
Steve
-
Well, it's an i7-2600K with Hyper-Threading disabled, so it's running with 4 cores.
And the discrete NIC is an Intel PRO/1000 PT server adapter with 2 ports.
Sorry for all the questions; this is all very new to me and I'm trying to learn as I go.
Thanks for your patience!
-
@jc1976 said in discrete ethernet adapter efficiency vs onboard nic:
an i7-2600k with hyperthreading disabled, so it's running with 4 cores.
If you accept, I will give you one or two guidelines to fine-tune your network (HW level).
These will help you find out what's in the box:
dmesg | grep -i msi
sysctl -a | grep msi
And these could be a FreeBSD NIC bible:
https://calomel.org/freebsd_network_tuning.html
https://calomel.org/network_performance.html
-
For example this C2K device:
[2.5.2-RELEASE][admin@test7.stevew.lan]/root: dmesg | grep queues
igb0: Using 2 RX queues 2 TX queues
igb0: netmap queues/slots: TX 2/1024, RX 2/1024
igb1: Using 2 RX queues 2 TX queues
igb1: netmap queues/slots: TX 2/1024, RX 2/1024
igb2: Using 4 RX queues 4 TX queues
igb2: netmap queues/slots: TX 4/1024, RX 4/1024
igb3: Using 4 RX queues 4 TX queues
igb3: netmap queues/slots: TX 4/1024, RX 4/1024
igb4: Using 4 RX queues 4 TX queues
igb4: netmap queues/slots: TX 4/1024, RX 4/1024
igb5: Using 4 RX queues 4 TX queues
igb5: netmap queues/slots: TX 4/1024, RX 4/1024
The 4 NICs in the SoC support 4 queues but the two discrete NICs only 2. Since that's a 4-core CPU, the on-board NICs could theoretically be faster individually.
In reality it doesn't make much difference, since even on the discrete NICs that's 4 queues total, and 8 queues across both, so all CPU cores are loaded for normal traffic.
Steve
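(If you want to watch whether the queues actually spread the work while you push traffic through, something like this is a rough sketch; thread names vary by driver and FreeBSD version.)
top -HSP    # -H shows threads, -S includes kernel/system threads, -P gives per-CPU usage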
-
LOL this is incredible stuff!
Way beyond my skills and knowledge, but it gives me something to geek out on.
-
@jc1976 said in discrete ethernet adapter efficiency vs onboard nic:
LOL this is incredible stuff!
Aha,
Just read it carefully and you'll see that it's logical and a really good description.
Experiment with the settings (not in production); the loader.conf.local file will be your good friend.
+++ edit:
This is just a sample schema; a lot of things in it are no longer relevant (because of FreeBSD 12.2-STABLE):
hw.pci.realloc_bars=1
net.inet6.ip6.auto_linklocal=0
net.isr.maxthreads=-1
net.isr.bindthreads=1
kern.ipc.nmbclusters=1000000
net.inet.tcp.tso=0
net.inet.tcp.lro=0
dev.igb.0.fc=0
dev.igb.1.fc=0
dev.igb.2.fc=0
dev.igb.3.fc=0
dev.igb.4.fc=0
dev.igb.5.fc=0
dev.igb.6.fc=0
dev.igb.7.fc=0
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
legal.intel_igb.license_ack=1
hw.igb.rx_process_limit=-1
hw.igb.tx_process_limit=-1
hw.igb.rxd=2048
hw.igb.txd=2048
hw.igb.max_interrupt_rate=128000
net.pf.states_hashsize=1048576
net.pf.source_nodes_hashsize=524288
net.inet.tcp.syncache.hashsize=2048
net.inet.tcp.syncache.bucketlimit=100
net.inet.tcp.syncache.cachelimit=65536
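(A minimal sketch of how a couple of these would be applied and then checked, assuming a pfSense/FreeBSD shell; the dev.igb.* entries only exist if you actually have igb NICs.)
echo 'net.isr.maxthreads=-1' >> /boot/loader.conf.local   # one netisr thread per CPU core
echo 'net.isr.bindthreads=1' >> /boot/loader.conf.local   # pin each netisr thread to a core
# reboot, then confirm the tunables took effect:
sysctl net.isr.maxthreads net.isr.bindthreads kern.ipc.nmbclusters
sysctl dev.igb.0.fc dev.igb.0.eee_disabled                # only present on igb hardware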