i211 vs i350

moscato359

I bought an SG-8860-1U pfSense firewall

It has a wan port, lan port, and 4 opt ports

wan and lan seem to use the onboard intel i211, while the 4 opt ports have intel i350

i350 is a substantially better chipset

Should I use the opt ports instead of wan and lan for my primary wan and lan connections?

Does that add complexity to the setup?

Does it really matter at all?

whosmatt

@moscato359:

i350 is a substantially better chipset

I don't think that's necessarily true as far as pfSense is concerned. A quick search turns up that the i350 has some extra features that would come in handy if you were running a hypervisor, but otherwise you shouldn't see any difference.

stephenw10

The only substantial difference is that the i211 only supports 2 queues and hence only uses 2 CPU cores whereas the i350 can potentially use all 4. However I've yet to see that make a significant difference in real life.

It's easy enough to reassign the ports if you wish to test it but I doubt you would see a difference.

Steve

moscato359

My box actually has 8 cores.

It appears there are a few more differences.
I looked at the data sheets.

Admittedly, I don't know much about what these mean.

i211 has outstanding requests for 6 tx buffers per port, while i350 appears to have "24 per port and for all ports"
i211 has outstanding requests for 1 tx descriptor per port, while i350 appears to have "4 per port and for all ports"
i211 has outstanding requests for 1 rx descriptor per port, while i350 appears to have "4 per port and for all ports"

VPD size is N/A for 211 while 256B for i350 (not sure what that means)

i211 has 2 RSS queues per port, while i350 has 8

i211 has 1 arp offload and 2 ns offloads, while i350 has the same number, but per "pf"

i350 also has dma coalescing, while i211 does not

stephenw10

Ah yes that's my mistake. I've only looked at it on the 4860 which has 4 cores. In either case I'd be surprised if you saw much difference in throughput.
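
As a rough illustration of the queue/core point, here is a minimal sketch. It assumes the simplification that each queue is serviced by at most one core, so the number of cores a port can spread receive work across is capped by whichever is smaller. The function name `cores_used` is made up for this example; the queue counts are the data sheet numbers quoted earlier in the thread.

```python
def cores_used(rss_queues: int, cpu_cores: int) -> int:
    """Upper bound on cores one port can spread receive work across.

    Simplifying assumption: one queue is handled by at most one core.
    """
    return min(rss_queues, cpu_cores)


# RSS queues per port, as quoted from the data sheets in this thread.
RSS_QUEUES = {"i211": 2, "i350": 8}

for cores in (4, 8):
    for chip, queues in RSS_QUEUES.items():
        print(f"{chip} on a {cores}-core box: up to {cores_used(queues, cores)} cores")
```

This only gives an upper bound. Whether the extra queues change real throughput is a separate question.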

Steve

moscato359

Very well.

Thanks for the explanation about the cores and queues.

I decided to use opt1 for my general lan, and wan, and "lan" for a much smaller low traffic segment

dennypage

TLDR: For pfSense (firewall) the choice between i211 and i354 doesn't matter.

The i354 is not generally considered a "better" chipset than the i210/i211. There are advantages to each. The i354 supports more channels (msi interrupts), which can be advantageous in large multiprocessor or virtualization environments. The i211 has lower latency, almost half that of the i354 at 1Gb, and less jitter.

The only situation where I think it would be a real consideration is if you had two LAN segments that you were routing between, and you were concerned about latency. If that were the case, you would probably prefer the LAN segments be on the i211 chips. Otherwise, the default setup of WAN & LAN on the i211 chips generally makes the most sense.

Your mileage may vary.

q54e3w

@dennypage:

The i211 has lower latency, almost half that of the i354 at 1Gb, and less jitter.

We should look into selling audiophile grade NICs for a huge markup that improve the quality of digital replay, if it works for cables….... :o

seriously though, good thread, thanks for the info.

moscato359

How do you know what the latency and jitter differences are?

Any data sheets on that?

dennypage

Just had to ask didn't you? :)

By measuring hardware timestamp offsets for hundreds of millions of packets for both chipsets.

And since you asked…

If you turn off all but one channel, which gives the lowest variance, my observed mac to mac latency for the i211 at 1Gb is ~465ns. The mac to mac latency for the i354 is ~875ns. If you use the default channels (2 for i211, 4 for i354), the numbers will jump for both, around 10-15ns for the i211, and 50ns for the i354.

At 100Mb with one channel, my observed mac to mac for the i211 is ~3110ns, and ~3720ns for the i354. If I use the default number of channels on the i211, my observed latency is ~3175ns. Intel's published number is 3177ns. This is actually the only number they still publish. They have withdrawn numbers for 10Mb/1Gb on the i211, and to my knowledge have never published anything on the i354.

Note that all of this is done with empty transmit/receive queues, so scheduling latency is minimal. Packet scheduling adds substantial variance, and increases per channel.

moscato359

Thank you.

So you'd suggest my primary lan and primary wan over i211 instead of i350 then?

It'd definitely simplify labeling on the physical box!

pfBasic

So you'd suggest my primary lan and primary wan over i211 instead of i350 then?

The measurements he posted are in nanoseconds. The biggest difference I saw was ~600ns, which is 0.0006ms, or 0.0000006 seconds (six ten-millionths of a second), so entirely indiscernible.
I didn't even know anyone measured networking things in nanoseconds!
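
For anyone who wants to check the arithmetic, a small sketch follows. The figures are the single-channel measurements dennypage posted above; the dictionary layout and the `gap_ns` helper are just names invented for this example.

```python
# Single-channel mac to mac latencies in nanoseconds, as posted in this thread.
MEASURED_NS = {
    "1Gb": {"i211": 465, "i354": 875},
    "100Mb": {"i211": 3110, "i354": 3720},
}


def gap_ns(speed: str) -> int:
    """Latency difference between the two chipsets at a given link speed."""
    return MEASURED_NS[speed]["i354"] - MEASURED_NS[speed]["i211"]


for speed in MEASURED_NS:
    ns = gap_ns(speed)
    # 1 us = 1e3 ns, 1 ms = 1e6 ns
    print(f"{speed}: {ns} ns = {ns / 1e3:.3f} us = {ns / 1e6:.6f} ms")
```

The 100Mb gap is the ~600ns figure mentioned above. Either way the difference is well under a microsecond.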

dennypage

I would suggest that you use the ports as they are labeled.

By the time you factor in all the variables involved, tx/rx channels and buffers, interrupt coalescing, energy efficient ethernet, etc., it's impossible for you to detect which chipset you are on in general use.

@moscato359:

So you'd suggest my primary lan and primary wan over i211 instead of i350 then?

dennypage

Measuring low microseconds / high nanoseconds is common in local time synchronization.

@pfBasic:

I didn't even know anyone measured networking things in nanoseconds!

moscato359

I have 2 LAN networks (and soon to be 4), and 2 WAN networks, which is why I got the 6 port unit in the first place.

I got the 8 core over the 4 core primarily due to rack mounting, and general overcompensation factor.

moscato359

@dennypage:

Measuring low microseconds / high nanoseconds is common in local time synchronization.

@pfBasic:

I didn't even know anyone measured networking things in nanoseconds!

Even inside the pfsense box, port to port, with no traffic going through it, I can't get totally consistent latencies over many 10-ping tests.

The first ping is always about 50-60 microseconds slower than the rest, and then the rest fluctuate between 47 and 67 microseconds

I'm finding that if I remove the first ping from each sample, the i211 pinging the other i211 is about 10 microseconds faster than i350 to i350, which is about 20% on average. The first ping is also about 20% lower on the i211.
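
A minimal sketch of the "drop the first ping" comparison described here, in case anyone wants to repeat it. The sample values are hypothetical, invented only to show the calculation, and `steady_state_stats` is a made-up helper name.

```python
from statistics import mean, stdev


def steady_state_stats(rtts_us):
    """Return (mean, stdev) of a ping run, ignoring the first, slower reply."""
    warm = rtts_us[1:]  # discard the first sample, which is consistently slower
    return mean(warm), stdev(warm)


# Hypothetical 10-ping run in microseconds, for illustration only.
run_us = [112, 52, 61, 47, 55, 67, 49, 58, 63, 50]

avg, sd = steady_state_stats(run_us)
print(f"mean {avg:.1f} us, stdev {sd:.1f} us over {len(run_us) - 1} pings")
```

Reporting the standard deviation next to the mean makes it easier to judge whether a 10 microsecond gap between two runs is larger than the run-to-run noise.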

whosmatt

@q54e3w:

We should look into selling audiophile grade NICs for a huge markup that improve the quality of digital replay, if it works for cables….... :o

Probably should use audiophile patch cables while we're at it: http://www.audioquest.com/ethernet/diamond

dennypage

When you ping an address that is in the same host, there are no packets on the wire. It's all inside the kernel. In other words, "the i211 pinging the other i211" isn't actually doing anything with the network interfaces.

FWIW, you should not expect ping to be overly consistent at low levels of latency, particularly with a stateful firewall. Even pinging a highly deterministic (micro controller based) host on a 100Mb cross-connect, you can expect to see 90-95us ping times with 10-15us standard deviation. The physical transmission time is 11.52us. The other ~80us comes from the source and target. Add a switch in-between, and you'll add another ~20us. And this is with interrupt coalescing disabled, which isn't something that you want to do on your firewall.
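
One way to arrive at the 11.52us figure, as a hedged sketch: it assumes a minimum-size 64-byte Ethernet frame plus 8 bytes of preamble in each direction (request and reply) at 100Mb/s. That frame size is an assumption made for this example, not something stated in the post, and `wire_time_us` is a made-up helper name.

```python
def wire_time_us(frame_bytes: int, link_bps: float, preamble_bytes: int = 8) -> float:
    """Time in microseconds to put one frame (plus preamble) on the wire."""
    bits = (frame_bytes + preamble_bytes) * 8
    return bits / link_bps * 1e6


# Assumed: 64-byte minimum frame each way (echo request + echo reply) at 100Mb/s.
round_trip_us = 2 * wire_time_us(64, 100e6)
print(f"{round_trip_us:.2f} us on the wire")
```

With a larger payload the wire time grows accordingly, so the share of a 90-95us ping that comes from the hosts themselves shrinks a little but still dominates.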

Best advice I can give you is that if you do not have a use case that requires ultra low latency, then don't worry about it. Be happy you have Intel chips.

dennypage

@whosmatt:

Probably should use audiophile patch cables while we're at it: http://www.audioquest.com/ethernet/diamond

:)

moscato359

I'm fine with my cat6 :P

I have
2 wans
2 lans

lan 1 is over 2TB/month
wan 1 is over 2TB/month
lan 2 is under 1MB/month: Isolated for security reasons.
wan 2 is under 2GB/month: This forwards to a vendor owned sonicwall which is only used to VPN to said vendor. Silly, but necessary.

I put wan1 on wan, lan1 on lan, lan2 on opt1, and wan2 on opt4

Should be all good.

Thanks for the help!