Why am I only getting 1 gigabit speed instead of 10 gigabit on my new Netgate XG-7100?
-
I forgot to benchmark things before I took it off my desk & put it in the rack downstairs. I was only getting 1 gigabit speeds, so I started checking every link, and each one showed up as 10 gigabit. I finally unracked it & brought it back to my office. Even connected directly via a DAC SFP+ cable I'm still only getting gigabit speeds. I fully expect the usual overhead & whatnot to keep things below 10 gigabit (I have a VERY limited WAN link at the moment as well, but I bought a firewall in the 10 gigabit league because that will be changing soon). But I'd expect iperf to at least get in the ballpark.
Basic test with all other variables removed:
- I have an ASUS XG-C100F 10G SFP+ Network Adapter in my workstation, all other NICs disabled
- I'm connecting using a 10' 10Gtek SFP+ DAC Twinax cable
- Only the power cable & DAC cable are plugged into the Netgate XG-7100 & the DAC cable is on ix1
- Both Windows & pfSense claim they connected at 10 gigabit
I temporarily set the workstation to force VLAN 11 for the test, as it was no longer plugged into an access port on the switch on the correct VLAN. If I run the iperf client & server on my workstation I get roughly 10 Gbits/sec, as I'd expect. But if I run the iperf server on the pfSense box & the client on my workstation I only get 1 Gbits/sec. As expected, everything is broken except for the pfSense IP & workstation IP as there is no WAN link. But I can get to pfSense & the iperf server on pfSense.
Run iperf on my workstation pointing to pfSense & only get 1 gigabit speed:
iperf3.exe -c 10.10.11.1 -t 100
Connecting to host 10.10.11.1, port 5201
[ 4] local 10.10.11.102 port 59250 connected to 10.10.11.1 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 128 MBytes 1.07 Gbits/sec
[ 4] 1.00-2.00 sec 133 MBytes 1.11 Gbits/sec
[ 4] 2.00-3.00 sec 132 MBytes 1.11 Gbits/sec
[ 4] 3.00-4.00 sec 122 MBytes 1.03 Gbits/sec
pfSense interface status:
GUEST Interface (opt7, ix1.11)
Status: up
MAC Address: 00:08:a2:12:af:8b - ADI Engineering
IPv4 Address: 10.10.11.1
Subnet mask IPv4: 255.255.255.0
IPv6 Link Local: fe80::208:a2ff:fe12:af8b%ix1.11
MTU: 1500
Media: 10Gbase-Twinax <full-duplex,rxpause,txpause>
In/out packets: 24477688/24477648 (34.13 GiB/934.21 MiB)
In/out packets (pass): 24477688/24477648 (34.13 GiB/934.21 MiB)
In/out packets (block): 0/0 (0 B/0 B)
In/out errors: 0/12
Collisions: 0
Running client & server on the workstation (10.10.11.102) & pointing it at itself generates more or less the expected 10 gigabit results:
iperf3.exe -c 10.10.11.102 -t 100
Connecting to host 10.10.11.102, port 5201
[ 4] local 10.10.11.102 port 62150 connected to 10.10.11.102 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 1.08 GBytes 9.30 Gbits/sec
[ 4] 1.00-2.00 sec 1.24 GBytes 10.6 Gbits/sec
[ 4] 2.00-3.00 sec 1.21 GBytes 10.4 Gbits/sec
-
So you are connected directly with both ends using VLAN 11 tagged traffic? But you would otherwise usually be using a 10G switch?
You are actually seeing more than 1G so it is linked at 10G.
How are you running the test? More than 1 parallel stream?
Testing to/from pfSense directly will always give a worse result than expected. pfSense is optimised as a router and not as a server, it performs badly as a TCP endpoint. The iperf3 process itself uses significant CPU cycles that would otherwise be forwarding traffic.
You should test between two clients through pfSense if you can.
Steve
-
Running iperf server and client on the same host, exactly what does that do? I can't believe that it actually pushes traffic out of the physical interface; on a Linux/*nix system I would expect the data to never leave the box, probably on the stack with zero copy stuff going on. Basically I'm not sure what that test would exactly "prove".
-
@mer
Same goes for bsd.
This is on a kvm virtualized pf, on i5. All physical interfaces are at 1Gbit.
Connecting to host 192.168.31.1, port 5201
[ 5] local 192.168.31.1 port 33173 connected to 192.168.31.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 4.54 GBytes 39.0 Gbits/sec 0 864 KBytes
[ 5] 1.00-2.00 sec 4.63 GBytes 39.8 Gbits/sec 0 1.62 MBytes
[ 5] 2.00-3.00 sec 4.45 GBytes 38.3 Gbits/sec 0 2.01 MBytes
[ 5] 3.00-4.00 sec 4.35 GBytes 37.4 Gbits/sec 0 2.01 MBytes
[ 5] 4.00-5.00 sec 4.36 GBytes 37.4 Gbits/sec 0 2.01 MBytes
[ 5] 5.00-6.00 sec 4.43 GBytes 38.1 Gbits/sec 0 2.01 MBytes
[ 5] 6.00-7.00 sec 3.13 GBytes 26.9 Gbits/sec 0 2.01 MBytes
[ 5] 7.00-8.00 sec 4.35 GBytes 37.4 Gbits/sec 0 2.01 MBytes
[ 5] 8.00-9.00 sec 4.46 GBytes 38.3 Gbits/sec 0 2.01 MBytes
[ 5] 9.00-10.00 sec 4.40 GBytes 37.8 Gbits/sec 0 2.01 MBytes
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 43.1 GBytes 37.0 Gbits/sec 0 sender
[ 5] 0.00-10.05 sec 43.1 GBytes 36.8 Gbits/sec receiver
iperf Done.
-
Indeed, running client and server on the same box is only testing its ability to run iperf3.
-
@stephenw10, @netblues Thanks for the confirmation. I read the OP a few times earlier today and having no input for the stated problem, the last test just kept bothering me.
-
@stephenw10 said in Why am I only getting 1 gigabit speed instead of 10 gigabit on my new Netgate XG-7100?:
So you are connected directly with both ends using VLAN 11 tagged traffic? But you would otherwise usually be using a 10G switch?
Correct. I was connected via my 10Gb switches & VLAN 11 or 12 when I went to validate performance & found 1G rather than the expected 10G. I then disassembled things down to the minimum possible config to eliminate the switches as the potential bottleneck. I left things on VLAN 11 to avoid reconfiguring the interfaces any further & just tagged VLAN 11 on the workstation itself.
You are actually seeing more than 1G so it is linked at 10G.
How are you running the test? More than 1 parallel stream?
Just 1 stream. iperf3 as a server on the pfSense box & client on the workstation. I also ran client AND server on the workstation as a test & got expected 10G speeds.
Testing to/from pfSense directly will always give a worse result than expected. pfSense is optimised as a router and not as a server, it performs badly as a TCP endpoint. The iperf3 process itself uses significant CPU cycles that would otherwise be forwarding traffic.
You should test between two clients through pfSense if you can.
Seems a little odd to me. https://github.com/mendel5/iperf3-results does indicate you shouldn't be seeing above 940 MBit/s on a gigabit connection & I'm a hair above that. I was assuming it was rounding or buffering inconsistencies in the output or something.
I only have one 10G endpoint at the moment. Unless I pull the currently unused 2 port NIC out of my XG-7100 & put it in the wife's PC for testing. That's living dangerously though, so I'll probably have to sort out another endpoint or a 10G Thunderbolt adapter for my work laptop, if that will play nicely. Or find a spare PC T 10G NIC.
-
Yeah, you are seeing more than the 941Mbps maximum that Gigabit will give you, which indicates it _is_ linked at 10G.
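For anyone wondering where a figure like 941Mbps comes from, here is a rough sketch of the arithmetic. It assumes a 1500 byte MTU, standard Ethernet framing, IPv4, and TCP with the timestamps option enabled (12 bytes); none of those details are stated in the thread, and the function name is made up for illustration.

```python
# Bytes that occupy the wire for every Ethernet frame, beyond the MTU payload.
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
ETH_HEADER = 14       # dst MAC, src MAC, EtherType
FCS = 4               # frame check sequence
INTERFRAME_GAP = 12   # mandatory idle time between frames
VLAN_TAG = 4          # 802.1Q tag, only present on tagged traffic

IP_HEADER = 20        # IPv4 without options
TCP_HEADER = 20       # TCP without options
TCP_TIMESTAMPS = 12   # assumption: TCP timestamps option in use


def tcp_goodput_mbps(link_mbps, mtu=1500, timestamps=True, vlan_tagged=False):
    """Theoretical best-case TCP payload rate for a given link speed."""
    payload = mtu - IP_HEADER - TCP_HEADER
    if timestamps:
        payload -= TCP_TIMESTAMPS
    wire = mtu + PREAMBLE_SFD + ETH_HEADER + FCS + INTERFRAME_GAP
    if vlan_tagged:
        wire += VLAN_TAG
    return link_mbps * payload / wire


if __name__ == "__main__":
    print(f"1G link:  {tcp_goodput_mbps(1000):.1f} Mbit/s")
    print(f"10G link: {tcp_goodput_mbps(10000):.1f} Mbit/s")
```

Under those assumptions a 1G link tops out a little above 941 Mbit/s of TCP payload, so a single stream reporting 1.07 Gbits/sec cannot be running over a link negotiated at 1G.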
With 1 iperf process and 1 stream you are using 1 NIC queue and potentially 1 CPU core.
Try running it with 4 or 8 streams. You should at least see some numbers that are significantly above 1G. But it will still be quite a lot lower than testing through the firewall.
Steve
-
@stephenw10 said in Why am I only getting 1 gigabit speed instead of 10 gigabit on my new Netgate XG-7100?:
Yeah, you are seeing more than the 941Mbps maximum that Gigabit will give you, which indicates it _is_ linked at 10G.
With 1 iperf process and 1 stream you are using 1 NIC queue and potentially 1 CPU core.
Try running it with 4 or 8 streams. You should at least see some numbers that are significantly above 1G. But it will still be quite a lot lower than testing through the firewall.
Steve
Firing up 5 instances of iperf on the pfsense box
iperf3 -s -p 5101&; iperf3 -s -p 5102&; iperf3 -s -p 5103&; iperf3 -s -p 5104&; iperf3 -s -p 5105 &
Then launching 5 concurrent clients on my windows box
start iperf3.exe -c 10.10.11.1 -t 100 -p 5101
start iperf3.exe -c 10.10.11.1 -t 100 -p 5102
start iperf3.exe -c 10.10.11.1 -t 100 -p 5103
start iperf3.exe -c 10.10.11.1 -t 100 -p 5104
start iperf3.exe -c 10.10.11.1 -t 100 -p 5105
Gets me 5 concurrent streams that each wrap up in the 500 Mbits/sec to 1.1 Gbits/sec range, so a couple of gigabit in total. Which I guess answers my fundamental question of "Did I get this thing hooked up through several hops at 10 gigabit?", enough to stop messing with it for now. Well, at least once I unwind my changes & put it back in the rack.
-
Oh, looks like you can do parallel threads natively too.
8 threads
iperf3.exe -c 10.10.11.1 -t 10 -P 8
gets just over 4.5 gigabit
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 548 MBytes 460 Mbits/sec sender
[ 4] 0.00-10.00 sec 548 MBytes 460 Mbits/sec receiver
[ 6] 0.00-10.00 sec 763 MBytes 640 Mbits/sec sender
[ 6] 0.00-10.00 sec 763 MBytes 640 Mbits/sec receiver
[ 8] 0.00-10.00 sec 783 MBytes 657 Mbits/sec sender
[ 8] 0.00-10.00 sec 783 MBytes 657 Mbits/sec receiver
[ 10] 0.00-10.00 sec 769 MBytes 645 Mbits/sec sender
[ 10] 0.00-10.00 sec 769 MBytes 645 Mbits/sec receiver
[ 12] 0.00-10.00 sec 538 MBytes 451 Mbits/sec sender
[ 12] 0.00-10.00 sec 537 MBytes 451 Mbits/sec receiver
[ 14] 0.00-10.00 sec 765 MBytes 642 Mbits/sec sender
[ 14] 0.00-10.00 sec 765 MBytes 642 Mbits/sec receiver
[ 16] 0.00-10.00 sec 762 MBytes 639 Mbits/sec sender
[ 16] 0.00-10.00 sec 762 MBytes 639 Mbits/sec receiver
[ 18] 0.00-10.00 sec 540 MBytes 453 Mbits/sec sender
[ 18] 0.00-10.00 sec 540 MBytes 453 Mbits/sec receiver
[SUM] 0.00-10.00 sec 5.34 GBytes 4.59 Gbits/sec sender
[SUM] 0.00-10.00 sec 5.34 GBytes 4.58 Gbits/sec receiver
iperf Done.
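As a sanity check on the [SUM] line, the per-stream sender rates from the summary above can be added up with a few lines of Python. This is only a sketch: the regular expression is written against the text layout shown in this thread and may need adjusting for other iperf3 versions, and the function name is made up.

```python
import re

# Per-stream sender lines and the [SUM] line, copied from the run above.
SUMMARY = """\
[  4] 0.00-10.00 sec 548 MBytes 460 Mbits/sec sender
[  6] 0.00-10.00 sec 763 MBytes 640 Mbits/sec sender
[  8] 0.00-10.00 sec 783 MBytes 657 Mbits/sec sender
[ 10] 0.00-10.00 sec 769 MBytes 645 Mbits/sec sender
[ 12] 0.00-10.00 sec 538 MBytes 451 Mbits/sec sender
[ 14] 0.00-10.00 sec 765 MBytes 642 Mbits/sec sender
[ 16] 0.00-10.00 sec 762 MBytes 639 Mbits/sec sender
[ 18] 0.00-10.00 sec 540 MBytes 453 Mbits/sec sender
[SUM] 0.00-10.00 sec 5.34 GBytes 4.59 Gbits/sec sender
"""

# Matches numbered stream lines only; "[SUM]" has no digits in the brackets.
STREAM_LINE = re.compile(
    r"\[\s*(\d+)\]\s+\S+\s+sec\s+[\d.]+\s+\w+\s+"
    r"([\d.]+)\s+([KMG])bits/sec\s+(sender|receiver)"
)

# Conversion factors to Mbit/s for the unit prefixes iperf3 prints.
TO_MBIT = {"K": 1e-3, "M": 1.0, "G": 1e3}


def total_sender_mbps(text):
    """Sum the bitrate of every per-stream 'sender' line, in Mbit/s."""
    total = 0.0
    for line in text.splitlines():
        match = STREAM_LINE.search(line)
        if match and match.group(4) == "sender":
            total += float(match.group(2)) * TO_MBIT[match.group(3)]
    return total


if __name__ == "__main__":
    print(f"{total_sender_mbps(SUMMARY) / 1000:.2f} Gbit/s across all streams")
```

The eight streams add up to 4587 Mbit/s, which agrees with the 4.59 Gbits/sec that iperf3 reports on its own [SUM] line.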
-
Ah, there we go that's about what I'd expect to see.
One of the interesting things about iperf is that it's deliberately designed to be single threaded. Running multiple parallel streams using the '-P' switch does not change that; you are still running one iperf process. But, as you already tried, that means you can run it multiple times to test combinations of CPU cores and streams. You still see a better result using -P because the firewall and the NICs can use multiple queues, and therefore multiple CPU cores, to move that traffic.
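If you want to repeat the "several processes on several ports" test without typing each command, something like the sketch below would do it. The helper names are made up for illustration; the only iperf3 options used are the ones already shown in this thread (-s, -c, -t, -p), and it assumes an iperf3 binary is on the PATH of each machine.

```python
import subprocess


def server_commands(ports, binary="iperf3"):
    """One iperf3 server command per port, to be run on the firewall."""
    return [[binary, "-s", "-p", str(port)] for port in ports]


def client_commands(host, ports, seconds=100, binary="iperf3"):
    """One iperf3 client command per port, to be run on the workstation."""
    return [
        [binary, "-c", host, "-t", str(seconds), "-p", str(port)]
        for port in ports
    ]


def run_concurrently(commands):
    """Start every command as its own process, then wait for all of them."""
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


if __name__ == "__main__":
    ports = range(5101, 5106)  # the five ports used earlier in the thread
    for command in server_commands(ports):
        print(" ".join(command))
    for command in client_commands("10.10.11.1", ports):
        print(" ".join(command))
    # On the workstation, with the servers already listening:
    # run_concurrently(client_commands("10.10.11.1", ports))
```

Each client is a separate process, so unlike a single run with -P, the sending side is not limited to one iperf process either.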
Steve