10gbps performance issue
-
You might expect that, but when running as a router/firewall, connections are not normally terminated on the firewall. An exception might be if you're running Squid, for example.
Since your WAN is 1Gbps, the actual firewall throughput for a connection to/from the internet cannot exceed that. So if you're seeing 2Gbps to the firewall, it's not throttling that traffic.
With LRO disabled you are seeing 1Gbps from a Linux client to your ISP. That's the maximum you can get. So where are you actually seeing less bandwidth than you expect, other than when testing to the firewall itself, which never normally happens?
Steve
-
We're in the process of setting things up for a new environment and would like to make sure everything works properly. I agree there are only a limited number of tasks that require such high throughput to pfSense itself, but they exist, and we wouldn't like to end up troubleshooting this on the live production system.
Please help me find the reason for the 2Gbps rate from a server to pfSense.
-
What is the CPU usage when you are running that test?
Try running
top -aSH
in another console window. Are any CPU threads running at or near 100%?
Steve
-
Here is the top output during the iperf test (linux is a client, pfsense is a server). I do not see an overload here.
last pid: 54739; load averages: 0.27, 0.15, 0.10 up 3+02:29:33 05:12:44
264 processes: 12 running, 198 sleeping, 54 waiting
CPU: 1.4% user, 0.0% nice, 9.3% system, 12.6% interrupt, 76.8% idle
Mem: 67M Active, 680M Inact, 574M Wired, 84M Buf, 10G Free
Swap: 3881M Total, 3881M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12 root -92 - 0K 880K CPU1 1 2:44 99.93% [intr{irq273: bxe2:fp01}]
11 root 155 ki31 0K 128K RUN 7 74.4H 98.14% [idle{idle: cpu7}]
11 root 155 ki31 0K 128K CPU0 0 74.4H 97.33% [idle{idle: cpu0}]
11 root 155 ki31 0K 128K CPU6 6 74.4H 95.98% [idle{idle: cpu6}]
11 root 155 ki31 0K 128K RUN 5 74.4H 86.53% [idle{idle: cpu5}]
54395 root 84 0 28552K 4508K CPU7 7 0:06 84.71% iperf -s{iperf}
11 root 155 ki31 0K 128K CPU3 3 74.4H 80.39% [idle{idle: cpu3}]
11 root 155 ki31 0K 128K RUN 2 74.4H 79.07% [idle{idle: cpu2}]
11 root 155 ki31 0K 128K CPU4 4 74.4H 72.06% [idle{idle: cpu4}]
54395 root 20 0 28552K 4508K nanslp 4 0:00 0.74% iperf -s{iperf}
12120 root 40 20 683M 524M CPU2 2 0:25 0.18% /usr/local/bin/snort -R 41368 -D -q --suppress-config-log -l /var/
12 root -60 - 0K 880K WAIT 0 3:30 0.08% [intr{swi4: clock (0)}]
11 root 155 ki31 0K 128K RUN 1 74.4H 0.07% [idle{idle: cpu1}]
54739 root 20 0 22116K 4816K CPU5 5 0:00 0.07% top -aSH
12587 root 40 20 51952K 17220K nanslp 5 0:08 0.03% /usr/local/bin/barnyard2 -r 41368 -f snort_41368_lagg0.u2 --pid-pa
12 root -92 - 0K 880K WAIT 0 0:09 0.02% [intr{irq267: bxe1:fp00}]
iperf result:
[2.4.3-RELEASE][admin@pfSense]/root: iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 4] local 10.10.10.254 port 5001 connected with 10.10.10.20 port 53986
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 2.52 GBytes 2.16 Gbits/sec
The top output for the opposite direction (linux is a server, pfsense is a client):
last pid: 21988; load averages: 0.13, 0.16, 0.10 up 3+02:32:16 05:15:27
263 processes: 9 running, 199 sleeping, 55 waiting
CPU: 0.1% user, 0.0% nice, 8.4% system, 8.4% interrupt, 83.0% idle
Mem: 66M Active, 681M Inact, 575M Wired, 84M Buf, 10G Free
Swap: 3881M Total, 3881M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 128K CPU7 7 74.4H 100.00% [idle{idle: cpu7}]
11 root 155 ki31 0K 128K RUN 1 74.4H 99.88% [idle{idle: cpu1}]
11 root 155 ki31 0K 128K CPU2 2 74.4H 98.56% [idle{idle: cpu2}]
11 root 155 ki31 0K 128K CPU5 5 74.4H 88.79% [idle{idle: cpu5}]
11 root 155 ki31 0K 128K CPU3 3 74.4H 85.63% [idle{idle: cpu3}]
11 root 155 ki31 0K 128K CPU4 4 74.4H 81.41% [idle{idle: cpu4}]
11 root 155 ki31 0K 128K CPU6 6 74.4H 72.80% [idle{idle: cpu6}]
21988 root 52 0 26376K 3852K sbwait 2 0:03 68.84% iperf -c 10.10.10.20{iperf}
12 root -92 - 0K 880K WAIT 0 2:57 62.64% [intr{irq272: bxe2:fp00}]
11 root 155 ki31 0K 128K RUN 0 74.4H 37.34% [idle{idle: cpu0}]
0 root -92 - 0K 832K - 5 0:00 1.00% [kernel{bxe2_fp0_tq}]
12120 root 40 20 683M 524M bpf 6 0:25 0.13% /usr/local/bin/snort -R 41368 -D -q --suppress-config-log -l /var/
54739 root 20 0 22116K 4816K CPU1 1 0:00 0.10% top -aSH
iperf result:
[2.4.3-RELEASE][admin@pfSense]/root: iperf -c 10.10.10.20
------------------------------------------------------------
Client connecting to 10.10.10.20, TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 3] local 10.10.10.254 port 15711 connected with 10.10.10.20 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.0 GBytes 9.41 Gbits/sec
Just in case, here is the local iperf test:
[2.4.3-RELEASE][admin@pfSense]/root: iperf -c localhost
------------------------------------------------------------
Client connecting to localhost, TCP port 5001
TCP window size: 144 KByte (default)
------------------------------------------------------------
[ 3] local 127.0.0.1 port 13072 connected with 127.0.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 18.0 GBytes 15.5 Gbits/sec
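As a quick sanity check on how the Transfer and Bandwidth columns relate, here is a minimal sketch. It assumes iperf reports "GBytes" as binary gigabytes (2**30 bytes) and "Gbits/sec" as 10**9 bits per second, which is consistent with the 2.52 GBytes / 2.16 Gbits/sec pair above. The function name is only illustrative.

```python
GIB = 2**30  # assumption: iperf "GBytes" are binary gigabytes (2**30 bytes)


def gbits_per_sec(gbytes: float, seconds: float) -> float:
    """Convert an iperf transfer total (GBytes) into Gbits/sec (10**9 bits)."""
    return gbytes * GIB * 8 / seconds / 1e9


# Transfer totals taken from the three 10-second runs above.
runs = [
    ("Linux -> pfSense", 2.52),
    ("pfSense -> Linux", 11.0),
    ("pfSense localhost", 18.0),
]
for label, gbytes in runs:
    print(f"{label}: {gbits_per_sec(gbytes, 10.0):.2f} Gbits/sec")
```

The computed rates land close to the reported ones. Small differences are expected because iperf prints a rounded transfer total and interval.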
-
You have one CPU core running at 100% (~0% idle):
11 root 155 ki31 0K 128K RUN 1 74.4H 0.07% [idle{idle: cpu1}]
You probably have (at least) 4 queues per NIC so it would be worth running that test with
-P 4
at the client to spread the load better.
Steve
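For anyone who wants to spot this kind of thing without reading the table by eye, here is a rough sketch that scans `top -aSH` text for idle threads that are close to 0%. The sample rows are copied from the output earlier in the thread. The regular expression and the 5% threshold are my own assumptions, not anything pfSense or top defines.

```python
import re

# A few rows copied from the top -aSH output posted earlier in this thread.
TOP_ROWS = """\
12 root -92 - 0K 880K CPU1 1 2:44 99.93% [intr{irq273: bxe2:fp01}]
11 root 155 ki31 0K 128K RUN 7 74.4H 98.14% [idle{idle: cpu7}]
54395 root 84 0 28552K 4508K CPU7 7 0:06 84.71% iperf -s{iperf}
11 root 155 ki31 0K 128K RUN 1 74.4H 0.07% [idle{idle: cpu1}]
"""

# Matches the WCPU value and CPU number of a per-CPU idle thread row.
IDLE_RE = re.compile(r"(\d+\.\d+)% \[idle\{idle: cpu(\d+)\}\]")


def saturated_cpus(top_text: str, idle_threshold: float = 5.0) -> list[int]:
    """Return CPU numbers whose idle thread is below idle_threshold percent."""
    busy = []
    for line in top_text.splitlines():
        match = IDLE_RE.search(line)
        if match and float(match.group(1)) < idle_threshold:
            busy.append(int(match.group(2)))
    return sorted(busy)


print(saturated_cpus(TOP_ROWS))
```

On the sample rows this flags only cpu1, the same core that the bxe2:fp01 interrupt thread is pinned to.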
-
[2.4.3-RELEASE][admin@pfSense]/root: iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 4] local 10.10.10.254 port 5001 connected with 10.10.10.20 port 53996
[ 5] local 10.10.10.254 port 5001 connected with 10.10.10.20 port 53998
[ 6] local 10.10.10.254 port 5001 connected with 10.10.10.20 port 54000
[ 7] local 10.10.10.254 port 5001 connected with 10.10.10.20 port 54002
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.36 GBytes 1.16 Gbits/sec
[ 5] 0.0-10.0 sec 1.34 GBytes 1.15 Gbits/sec
[ 6] 0.0-10.0 sec 1.32 GBytes 1.13 Gbits/sec
[ 7] 0.0-10.0 sec 1.32 GBytes 1.13 Gbits/sec
[SUM] 0.0-10.0 sec 5.34 GBytes 4.58 Gbits/sec
last pid: 34460; load averages: 1.15, 0.32, 0.16 up 3+03:52:37 06:35:48
267 processes: 17 running, 199 sleeping, 51 waiting
CPU: 5.9% user, 0.0% nice, 30.7% system, 50.0% interrupt, 13.4% idle
Mem: 67M Active, 683M Inact, 576M Wired, 84M Buf, 10G Free
Swap: 3881M Total, 3881M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12 root -92 - 0K 880K CPU0 0 3:08 99.93% [intr{irq272: bxe2:fp00}]
12 root -92 - 0K 880K CPU3 3 3:02 99.91% [intr{irq275: bxe2:fp03}]
12 root -92 - 0K 880K CPU2 2 3:02 99.88% [intr{irq274: bxe2:fp02}]
12 root -92 - 0K 880K CPU1 1 2:55 99.86% [intr{irq273: bxe2:fp01}]
34352 root 83 0 35080K 6772K CPU6 6 0:06 77.43% iperf -s{iperf}
34352 root 52 0 35080K 6772K CPU4 4 0:06 76.79% iperf -s{iperf}
34352 root 52 0 35080K 6772K CPU7 7 0:06 76.59% iperf -s{iperf}
34352 root 52 0 35080K 6772K CPU6 6 0:06 76.52% iperf -s{iperf}
11 root 155 ki31 0K 128K RUN 7 75.8H 22.62% [idle{idle: cpu7}]
11 root 155 ki31 0K 128K RUN 6 75.8H 22.55% [idle{idle: cpu6}]
11 root 155 ki31 0K 128K RUN 5 75.8H 22.49% [idle{idle: cpu5}]
11 root 155 ki31 0K 128K RUN 4 75.8H 22.46% [idle{idle: cpu4}]
34352 root 20 0 35080K 6772K nanslp 6 0:00 2.14% iperf -s{iperf}
11 root 155 ki31 0K 128K RUN 2 75.8H 0.12% [idle{idle: cpu2}]
11 root 155 ki31 0K 128K RUN 3 75.8H 0.12% [idle{idle: cpu3}]
11 root 155 ki31 0K 128K RUN 0 75.7H 0.12% [idle{idle: cpu0}]
11 root 155 ki31 0K 128K RUN 1 75.8H 0.11% [idle{idle: cpu1}]
34460 root 20 0 22116K 4820K CPU5 5 0:00 0.10% top -aSH
12 root -60 - 0K 880K WAIT 5 3:34 0.09% [intr{swi4: clock (0)}]
Wow. This does look like a CPU limit. I wonder why the Linux box's CPU (Intel(R) Xeon(R) CPU X5670 @ 2.93GHz) "eats" 10Gbps without issues:
top - 06:45:18 up 4 days, 21:26, 3 users, load average: 0.09, 0.03, 0.01
Threads: 426 total, 2 running, 424 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 1.3 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 1.3 si, 0.0 st
%Cpu6 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9 : 0.3 us, 55.9 sy, 0.0 ni, 43.4 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 0.3 us, 0.0 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu18 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22 : 0.0 us, 0.0 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu23 : 0.0 us, 2.2 sy, 0.0 ni, 97.1 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
KiB Mem : 65965828 total, 63945296 free, 394004 used, 1626528 buff/cache
KiB Swap: 67096572 total, 67096572 free, 0 used. 65039256 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20962 myuser 20 0 236696 2208 1924 R 60.6 0.0 0:01.83 iperf -s
125 root 20 0 0 0 0 S 1.0 0.0 0:00.87 [ksoftirqd/23]
34 root 20 0 0 0 0 S 0.7 0.0 0:00.26 [ksoftirqd/5]
14 root 20 0 0 0 0 S 0.3 0.0 0:00.29 [ksoftirqd/1]
55 root 20 0 0 0 0 S 0.3 0.0 0:00.36 [ksoftirqd/9]
16776 root 20 0 0 0 0 S 0.3 0.0 0:03.47 [kworker/9:1]
[2.4.3-RELEASE][admin@pfSense]/root: iperf -c 10.10.10.20
------------------------------------------------------------
Client connecting to 10.10.10.20, TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 3] local 10.10.10.254 port 52919 connected with 10.10.10.20 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 10.8 GBytes 9.26 Gbits/sec
Any ideas?
-
What are the NICs in the Linux box? If the NICs are the same, most likely the drivers are different.
Also try:
- disabling Hyperthreading on the pfsense box.
- disabling flow control everywhere, including the switch
- increasing the maximum interrupt rate
To be honest, it's a bit pointless to test throughput. It's better to test PPS through the pfsense box. This will expose your PPS limit based on the CPU/NIC/Settings/Firewall Configuration combination.
You can check with netstat -ihw 1 where the drop happens.
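To put the PPS point in numbers, here is a small back-of-the-envelope sketch of the theoretical packet rate on a 10 Gbps Ethernet link. It assumes standard Ethernet framing, where every frame costs an extra 20 bytes on the wire (8 bytes of preamble/SFD plus a 12-byte inter-frame gap). The function name is just for illustration.

```python
# Per-frame overhead on the wire beyond the Ethernet frame itself:
# 8 bytes preamble/SFD + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def max_pps(link_bps: float, frame_bytes: int) -> float:
    """Theoretical packets/sec for back-to-back frames of frame_bytes
    (frame size includes the Ethernet header and FCS)."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


# Minimum-size frames versus full-size frames at a 1500-byte MTU.
for frame in (64, 1518):
    print(f"{frame}-byte frames at 10 Gbps: {max_pps(10e9, frame):,.0f} pps")
```

The gap between the two cases is why a bulk iperf run with large packets says little about how the box copes with small-packet traffic.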
-
The load due to pf shows in those values in pfSense. Is the Linux box running any sort of firewall?
Try disabling pf temporarily as a test.
But this is still not a test of the firewall throughput. I'm still unsure what you're trying to achieve here. Your WAN is 1Gbps and you are able to see that fully from a client behind pfSense. If you want to test more than that use a 10Gbps WAN to see what it can pass.
Steve
-
@stephenw10 with pf disabled it shows slightly better performance:
pf disabled
[2.4.3-RELEASE][admin@pfSense]/root:
[2.4.3-RELEASE][admin@pfSense]/root: iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 4] local 10.10.10.254 port 5001 connected with 10.10.10.20 port 54942
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 3.91 GBytes 3.35 Gbits/sec
4 flows test gives:
[SUM] 0.0-10.0 sec 8.03 GBytes 6.88 Gbits/sec
I don't think the ISP can provide us with a 10Gbps link at the moment, but it may be possible later. And it wouldn't be great to face such an issue when all systems are in production.
I'm trying to figure out why the speed is not what I expect. The next steps on the list are to disable hyperthreading and upgrade the CPU. I'll post the results here.
Anyway, if you have any other ideas why the CPU is so slow compared to the Linux box, I'd be more than happy to check them.
-
I believe the devs should remove iperf from base installs....
These iperf threads keep popping up every month & conclusion is always the same:
Don't run iperf on pfsense
The only way to measure throughput is like this:
(Iperf-server)----(pfsense)----(iperf-client)
All other measurements are pointless and inaccurate.
-
@heper I hope the devs will not follow your suggestion. It's like "we've got a headache, let's cut the head off". Very wise.
-
I don't think iperf will be removed any time soon.
But I agree with heper, what you're testing is not anything that can ever happen in normal use.
It can be useful to run iperf on the firewall to test a single interface at a time if you are seeing very bad throughput testing through the firewall.
You have two 10GbE interfaces there. Just set up another device connected to one of those interfaces and run an iperf server on it. Then test to it from a client on the other interface.
Steve
-
@heper said in 10gbps performance issue:
I believe the devs should remove iperf from base installs…
It's not part of the base install, is it? If it is, what is the point of the iperf package? Are you suggesting that the package be removed as an install option?
-
I see no point in having it available on pfsense.
Time and time again, it's used to reach the wrong conclusions anyways.
-
@heper said in 10gbps performance issue:
Time and time again, it’s used to reach the wrong conclusions anyways.
I won't disagree with you there. But there are use cases where you understand that you might not see full speed on your interface using the tool. It is still useful for people who won't draw those wrong conclusions, because they understand that the point of a router is to route, not to act as an endpoint device for such a tool.
So I'm not sure I agree with removal. Removing it would just have users asking how to install it from the FreeBSD ports/packages, even if it's not part of the pfSense repository.
-
I personally would not want to see either the package removed or iperf3 removed from our repo. I regularly use those for testing. There are many legitimate use cases.
For example, I often use another pfSense box as a client/server, since most of my test network is pfSense boxes.
Steve
-
Removing access to a tool that can be misused by some while being massively-useful to others sort of reeks of the "thinking" behind control.
pkg add iperf3
please. (wth we still have a real gun emoji. someone's slacking.)
-
@stephenw10 we've finally replaced the CPU with a Xeon X5560, but the issue is still there. Here are the latest measurements:
Single flow:
[2.4.3-RELEASE][admin@pfSense]/root: iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.20, port 40256
[ 5] local 10.10.10.254 port 5201 connected to 10.10.10.20 port 40258
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 150 MBytes 1.26 Gbits/sec
[ 5] 1.00-2.00 sec 219 MBytes 1.83 Gbits/sec
[ 5] 2.00-3.00 sec 227 MBytes 1.90 Gbits/sec
[ 5] 3.00-4.00 sec 258 MBytes 2.16 Gbits/sec
[ 5] 4.00-5.00 sec 298 MBytes 2.50 Gbits/sec
[ 5] 5.00-6.00 sec 298 MBytes 2.50 Gbits/sec
[ 5] 6.00-7.00 sec 298 MBytes 2.50 Gbits/sec
[ 5] 7.00-8.00 sec 298 MBytes 2.50 Gbits/sec
[ 5] 8.00-9.00 sec 298 MBytes 2.50 Gbits/sec
[ 5] 9.00-10.00 sec 299 MBytes 2.51 Gbits/sec
[ 5] 10.00-10.01 sec 1.99 MBytes 2.48 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.01 sec 2.58 GBytes 2.22 Gbits/sec receiver
4 flows (-P 4):
[2.4.3-RELEASE][admin@pfSense]/root: iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.20, port 40426
[ 5] local 10.10.10.254 port 5201 connected to 10.10.10.20 port 40428
[ 8] local 10.10.10.254 port 5201 connected to 10.10.10.20 port 40430
[ 10] local 10.10.10.254 port 5201 connected to 10.10.10.20 port 40432
[ 12] local 10.10.10.254 port 5201 connected to 10.10.10.20 port 40434
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 45.7 MBytes 383 Mbits/sec
[ 8] 0.00-1.00 sec 48.9 MBytes 410 Mbits/sec
[ 10] 0.00-1.00 sec 40.2 MBytes 337 Mbits/sec
[ 12] 0.00-1.00 sec 47.4 MBytes 397 Mbits/sec
[SUM] 0.00-1.00 sec 182 MBytes 1.53 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 1.00-2.00 sec 46.3 MBytes 389 Mbits/sec
[ 8] 1.00-2.00 sec 108 MBytes 909 Mbits/sec
[ 10] 1.00-2.00 sec 49.1 MBytes 412 Mbits/sec
[ 12] 1.00-2.00 sec 38.7 MBytes 325 Mbits/sec
[SUM] 1.00-2.00 sec 243 MBytes 2.03 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 2.00-3.00 sec 46.9 MBytes 394 Mbits/sec
[ 8] 2.00-3.00 sec 108 MBytes 907 Mbits/sec
[ 10] 2.00-3.00 sec 36.6 MBytes 307 Mbits/sec
[ 12] 2.00-3.00 sec 25.9 MBytes 217 Mbits/sec
[SUM] 2.00-3.00 sec 218 MBytes 1.83 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 3.00-4.00 sec 58.5 MBytes 491 Mbits/sec
[ 8] 3.00-4.00 sec 94.0 MBytes 788 Mbits/sec
[ 10] 3.00-4.00 sec 44.5 MBytes 374 Mbits/sec
[ 12] 3.00-4.00 sec 37.4 MBytes 314 Mbits/sec
[SUM] 3.00-4.00 sec 234 MBytes 1.97 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 4.00-5.00 sec 56.7 MBytes 475 Mbits/sec
[ 8] 4.00-5.00 sec 79.0 MBytes 663 Mbits/sec
[ 10] 4.00-5.00 sec 44.4 MBytes 372 Mbits/sec
[ 12] 4.00-5.00 sec 38.5 MBytes 323 Mbits/sec
[SUM] 4.00-5.00 sec 219 MBytes 1.83 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 5.00-6.00 sec 61.9 MBytes 520 Mbits/sec
[ 8] 5.00-6.00 sec 70.0 MBytes 587 Mbits/sec
[ 10] 5.00-6.00 sec 48.5 MBytes 407 Mbits/sec
[ 12] 5.00-6.00 sec 42.3 MBytes 354 Mbits/sec
[SUM] 5.00-6.00 sec 223 MBytes 1.87 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 6.00-7.00 sec 68.5 MBytes 575 Mbits/sec
[ 8] 6.00-7.00 sec 54.1 MBytes 454 Mbits/sec
[ 10] 6.00-7.00 sec 54.6 MBytes 458 Mbits/sec
[ 12] 6.00-7.00 sec 47.7 MBytes 400 Mbits/sec
[SUM] 6.00-7.00 sec 225 MBytes 1.89 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 7.00-8.00 sec 65.1 MBytes 546 Mbits/sec
[ 8] 7.00-8.00 sec 55.4 MBytes 464 Mbits/sec
[ 10] 7.00-8.00 sec 49.2 MBytes 413 Mbits/sec
[ 12] 7.00-8.00 sec 49.9 MBytes 419 Mbits/sec
[SUM] 7.00-8.00 sec 220 MBytes 1.84 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 8.00-9.00 sec 67.0 MBytes 562 Mbits/sec
[ 8] 8.00-9.00 sec 51.9 MBytes 435 Mbits/sec
[ 10] 8.00-9.00 sec 48.3 MBytes 405 Mbits/sec
[ 12] 8.00-9.00 sec 56.3 MBytes 472 Mbits/sec
[SUM] 8.00-9.00 sec 224 MBytes 1.88 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 9.00-10.00 sec 65.1 MBytes 546 Mbits/sec
[ 8] 9.00-10.00 sec 52.0 MBytes 436 Mbits/sec
[ 10] 9.00-10.00 sec 54.7 MBytes 459 Mbits/sec
[ 12] 9.00-10.00 sec 65.1 MBytes 546 Mbits/sec
[SUM] 9.00-10.00 sec 237 MBytes 1.99 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 10.00-10.01 sec 636 KBytes 432 Mbits/sec
[ 8] 10.00-10.01 sec 636 KBytes 432 Mbits/sec
[ 10] 10.00-10.01 sec 663 KBytes 450 Mbits/sec
[ 12] 10.00-10.01 sec 764 KBytes 519 Mbits/sec
[SUM] 10.00-10.01 sec 2.64 MBytes 1.83 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.01 sec 582 MBytes 488 Mbits/sec receiver
[ 8] 0.00-10.01 sec 722 MBytes 605 Mbits/sec receiver
[ 10] 0.00-10.01 sec 471 MBytes 395 Mbits/sec receiver
[ 12] 0.00-10.01 sec 450 MBytes 377 Mbits/sec receiver
[SUM] 0.00-10.01 sec 2.17 GBytes 1.86 Gbits/sec receiver
-----------------------------------------------------------
top output during the tests:
[2.4.3-RELEASE][admin@pfSense]/root: top -aSH
last pid: 97946; load averages: 0.72, 0.28, 0.12 up 2+12:43:41 09:10:01
329 processes: 17 running, 241 sleeping, 71 waiting
CPU: 0.1% user, 0.5% nice, 2.5% system, 5.4% interrupt, 91.6% idle
Mem: 214M Active, 565M Inact, 830M Wired, 232M Buf, 30G Free
Swap: 3712M Total, 3712M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 256K CPU7 7 60.6H 100.00% [idle{idle: cpu7}]
11 root 155 ki31 0K 256K CPU1 1 60.6H 100.00% [idle{idle: cpu1}]
11 root 155 ki31 0K 256K CPU2 2 60.5H 100.00% [idle{idle: cpu2}]
11 root 155 ki31 0K 256K CPU9 9 60.5H 100.00% [idle{idle: cpu9}]
11 root 155 ki31 0K 256K CPU11 11 60.5H 100.00% [idle{idle: cpu11}]
11 root 155 ki31 0K 256K CPU13 13 60.5H 100.00% [idle{idle: cpu13}]
11 root 155 ki31 0K 256K CPU6 6 60.6H 99.99% [idle{idle: cpu6}]
11 root 155 ki31 0K 256K CPU4 4 60.6H 99.88% [idle{idle: cpu4}]
11 root 155 ki31 0K 256K CPU3 3 60.5H 98.83% [idle{idle: cpu3}]
11 root 155 ki31 0K 256K RUN 12 60.5H 98.45% [idle{idle: cpu12}]
11 root 155 ki31 0K 256K CPU10 10 60.5H 94.15% [idle{idle: cpu10}]
11 root 155 ki31 0K 256K CPU14 14 60.5H 89.01% [idle{idle: cpu14}]
12 root -92 - 0K 1136K WAIT 0 2:07 84.99% [intr{irq277: bxe3:fp00}]
11 root 155 ki31 0K 256K CPU5 5 60.6H 84.03% [idle{idle: cpu5}]
24259 root 52 0 19752K 5628K select 9 0:20 75.33% iperf3 -s
11 root 155 ki31 0K 256K CPU8 8 60.5H 70.66% [idle{idle: cpu8}]
11 root 155 ki31 0K 256K CPU15 15 60.5H 52.12% [idle{idle: cpu15}]
11 root 155 ki31 0K 256K CPU0 0 60.5H 14.83% [idle{idle: cpu0}]
254 root 23 0 266M 44468K accept 12 0:28 1.20% php-fpm: pool nginx (php-fpm){php-fpm}
97017 root 40 20 728M 570M bpf 8 4:14 0.57% /usr/local/bin/snort -R 41368 -D -q --suppress-config-log -l /var/
12 root -100 - 0K 1136K WAIT 0 0:53 0.25% [intr{irq20: hpet0 uhci3}]
12 root -60 - 0K 1136K WAIT 9 3:06 0.12% [intr{swi4: clock (0)}]
82170 root 20 0 22116K 4796K CPU12 12 0:00 0.10% top -aSH
12 root -92 - 0K 1136K WAIT 1 1:36 0.09% [intr{irq273: bxe2:fp01}]
10462 root 20 0 20356K 6412K select 11 0:10 0.07% /usr/local/sbin/openvpn --config /var/etc/openvpn/server1.conf
12 root -92 - 0K 1136K WAIT 1 1:52 0.07% [intr{irq268: bxe1:fp01}]
12 root -92 - 0K 1136K WAIT 0 2:19 0.06% [intr{irq267: bxe1:fp00}]
12 root -92 - 0K 1136K WAIT 1 1:30 0.06% [intr{irq263: bxe0:fp01}]
12 root -92 - 0K 1136K WAIT 2 2:15 0.06% [intr{irq264: bxe0:fp02}]
12 root -92 - 0K 1136K WAIT 2 1:59 0.06% [intr{irq279: bxe3:fp02}]
60178 www 20 0 58924K 12688K kqread 9 0:01 0.05% /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -p /var/ru
12 root -92 - 0K 1136K WAIT 3 2:28 0.05% [intr{irq280: bxe3:fp03}]
60322 www 20 0 58924K 12632K kqread 11 0:01 0.04% /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -p /var/ru
Do you still see a CPU bottleneck here?
-
To me it looks like each adapter (bxeX) is using only one queue. The queues seem to be there, but they are not in use.
12 root -92 - 0K 1136K WAIT 0 2:07 84.99% [intr{irq277: bxe3:fp00}]
12 root -92 - 0K 1136K WAIT 1 1:36 0.09% [intr{irq273: bxe2:fp01}]
12 root -92 - 0K 1136K WAIT 1 1:52 0.07% [intr{irq268: bxe1:fp01}]
12 root -92 - 0K 1136K WAIT 0 2:19 0.06% [intr{irq267: bxe1:fp00}]
12 root -92 - 0K 1136K WAIT 1 1:30 0.06% [intr{irq263: bxe0:fp01}]
12 root -92 - 0K 1136K WAIT 2 2:15 0.06% [intr{irq264: bxe0:fp02}]
12 root -92 - 0K 1136K WAIT 2 1:59 0.06% [intr{irq279: bxe3:fp02}]
I would expect to see the load distributed between all of them.
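To make the imbalance easier to see, here is a rough sketch that groups the interrupt-thread rows quoted above by NIC and queue. The regular expression is my own guess at the row format shown in this thread, and the helper name is hypothetical.

```python
import re
from collections import defaultdict

# Interrupt-thread rows quoted above from top -aSH.
INTR_ROWS = """\
12 root -92 - 0K 1136K WAIT 0 2:07 84.99% [intr{irq277: bxe3:fp00}]
12 root -92 - 0K 1136K WAIT 1 1:36 0.09% [intr{irq273: bxe2:fp01}]
12 root -92 - 0K 1136K WAIT 1 1:52 0.07% [intr{irq268: bxe1:fp01}]
12 root -92 - 0K 1136K WAIT 0 2:19 0.06% [intr{irq267: bxe1:fp00}]
12 root -92 - 0K 1136K WAIT 1 1:30 0.06% [intr{irq263: bxe0:fp01}]
12 root -92 - 0K 1136K WAIT 2 2:15 0.06% [intr{irq264: bxe0:fp02}]
12 root -92 - 0K 1136K WAIT 2 1:59 0.06% [intr{irq279: bxe3:fp02}]
"""

# Captures WCPU, NIC name (e.g. bxe3) and queue name (e.g. fp00).
INTR_RE = re.compile(r"(\d+\.\d+)% \[intr\{irq\d+: (\w+):(fp\d+)\}\]")


def queue_load(top_text: str) -> dict[str, dict[str, float]]:
    """Map NIC name -> {queue name -> WCPU percent} for NIC interrupt threads."""
    load: dict[str, dict[str, float]] = defaultdict(dict)
    for line in top_text.splitlines():
        match = INTR_RE.search(line)
        if match:
            wcpu, nic, queue = float(match.group(1)), match.group(2), match.group(3)
            load[nic][queue] = wcpu
    return dict(load)


for nic, queues in sorted(queue_load(INTR_ROWS).items()):
    busiest = max(queues, key=queues.get)
    print(nic, queues, "busiest:", busiest)
```

On these rows bxe3:fp00 carries nearly all of the interrupt load while the other visible queues sit near zero, which is the imbalance described above.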
-
Have you tried a test through the firewall as opposed to terminating on it?
Steve