SG-3100 slow not getting gigabit.
-
You are not CPU limited: both CPU cores are ~60% idle, which is about what I would expect at 400Mbps.
Yes, the client and server both running on pfSense is about the only way to explain that 3Gbps result.
The mvneta NICs in the SG-3100 are single queue so it is possible to hit a problem if both NIC queues are using the same CPU core. That doesn't appear to be the case here though.
What does the output of top aSH at the CLI actually show when you're testing?
Steve
-
/root: top aSH
last pid: 82903; load averages: 0.38, 0.29, 0.22 up 6+17:21:22 11:23:18
55 processes: 1 running, 54 sleeping
CPU: 0.2% user, 0.0% nice, 0.0% system, 43.5% interrupt, 56.3% idle
Mem: 26M Active, 692M Inact, 201M Wired, 77M Buf, 1076M Free
Speedtest.net 430 Mbps Dn / 340 Mbps Up
-
@stephenw10
I'm thinking you maybe meant "top -aSH". Results below.
mvneta2 being the internet connection via VLAN.
mvneta1 being the lan connection.
last pid: 42738; load averages: 0.56, 0.51, 0.40 up 6+21:04:20 15:06:16
153 threads: 4 running, 128 sleeping, 21 waiting
CPU: 0.4% user, 0.0% nice, 0.0% system, 48.1% interrupt, 51.6% idle
Mem: 26M Active, 695M Inact, 201M Wired, 77M Buf, 1073M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
10 root 155 ki31 0B 16K RUN 0 162.0H 58.83% [idle{idle: cpu0}]
11 root -92 - 0B 176K WAIT 0 28:28 56.61% [intr{mpic0: mvneta2}]
10 root 155 ki31 0B 16K CPU1 1 161.9H 49.75% [idle{idle: cpu1}]
11 root -92 - 0B 176K CPU1 1 24:45 33.30% [intr{mpic0: mvneta1}]
65464 root 20 0 93M 44M accept 0 0:46 0.56% php-fpm: pool nginx (php-fpm){php-fpm}
85835 root 20 0 6936K 3472K CPU1 1 0:00 0.24% top -aSH
11 root -60 - 0B 176K WAIT 1 15:31 0.16% [intr{swi4: clock (0)}]
66868 root 20 0 5716K 2952K bpf 0 2:12 0.09% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
58289 root 20 0 11M 7612K kqread 1 0:11 0.07% nginx: worker process (nginx)
28090 root 20 0 4868K 2456K select 1 4:09 0.07% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /etc/syslog.conf
13742 root 20 0 105M 101M nanslp 0 6:10 0.06% /usr/local/sbin/pcscd{pcscd}
8 root -16 - 0B 8192B pftm 1 7:19 0.05% [pf purge]
16260 unbound 20 0 41M 25M kqread 1 1:38 0.03% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
16260 unbound 20 0 41M 25M kqread 1 1:07 0.02% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
39429 dhcpd 20 0 13M 9332K select 1 0:57 0.02% /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid mvneta1 mvneta1.666
59340 root 20 0 11M 5936K select 1 1:04 0.02% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid{ntpd}
69715 root 20 0 12M 8852K select 0 0:05 0.02% sshd: root@pts/1 (sshd)
-
Yes, sorry, that's exactly what I meant.
So you can see the cpu loading is interrupt load from the NICs. But neither NIC queue is at 100% and both CPU cores have available cycles. Nothing is obviously a problem there.
Check Status > Interfaces for errors or collisions.
How are the interfaces connected when you're testing?
Steve
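The per-NIC interrupt figures discussed above can be pulled out of a `top -aSH` capture with a few lines of Python. This is only a minimal sketch: the function name `interrupt_load` and the 95% "saturated" cut-off are assumptions made for illustration, and the sample rows are copied from the output posted earlier in the thread.

```python
import re

# Matches interrupt-thread rows from FreeBSD `top -aSH`, for example:
# "11 root -92 - 0B 176K WAIT 0 28:28 56.61% [intr{mpic0: mvneta2}]"
INTR_ROW = re.compile(r"(\d+\.\d+)%\s+\[intr\{[^:}]+:\s*([^}]+)\}\]")


def interrupt_load(top_output):
    """Return {interrupt source: WCPU percent} for every [intr{...}] row."""
    loads = {}
    for line in top_output.splitlines():
        match = INTR_ROW.search(line)
        if match:
            loads[match.group(2).strip()] = float(match.group(1))
    return loads


# Rows copied from the top output posted in this thread.
SAMPLE = """\
10 root 155 ki31 0B 16K RUN 0 162.0H 58.83% [idle{idle: cpu0}]
11 root -92 - 0B 176K WAIT 0 28:28 56.61% [intr{mpic0: mvneta2}]
10 root 155 ki31 0B 16K CPU1 1 161.9H 49.75% [idle{idle: cpu1}]
11 root -92 - 0B 176K CPU1 1 24:45 33.30% [intr{mpic0: mvneta1}]
11 root -60 - 0B 176K WAIT 1 15:31 0.16% [intr{swi4: clock (0)}]
"""

if __name__ == "__main__":
    # Assumed threshold: treat >= 95% of one core as a saturated queue.
    for name, pct in sorted(interrupt_load(SAMPLE).items()):
        state = "saturated" if pct >= 95.0 else "has headroom"
        print(f"{name}: {pct:.2f}% ({state})")
```

With the sample rows, neither mvneta thread comes near a full core, which matches the reading given in the post.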
-
Interfaces look ok. If CPU isn't overloaded on the SG-3100, that really points to the problem being somewhere else. I am going through my switches on the LAN/WAN network side to see if anything is wrong there, although the ISP box worked fine with the same switch configuration.
WAN Interface (wan, mvneta2.35)
Status up
DHCP up Relinquish Lease
MAC Address 00:08:a2:0d:51:4a
IPv4 Address xxx.xxx.232.220
Subnet mask IPv4 255.255.248.0
Gateway IPv4 xxx.xxx.232.1
IPv6 Link Local fe80::208:a2ff:fe0d:514a%mvneta2.35
MTU 1500
Media 1000baseT <full-duplex>
In/out packets 197969313/149760972 (212.99 GiB/74.46 GiB)
In/out packets (pass) 197969313/149760972 (212.99 GiB/74.46 GiB)
In/out packets (block) 991258/9 (39.31 MiB/756 B)
In/out errors 0/0
Collisions 0
LAN Interface (lan, mvneta1)
Status up
MAC Address 00:08:a2:0d:51:49
IPv4 Address 192.168.9.1
Subnet mask IPv4 255.255.254.0
IPv6 Link Local fe80::208:a2ff:fe0d:5149%mvneta1
MTU 1500
Media 2500Base-KX <full-duplex>
In/out packets 155378805/191656958 (84.70 GiB/198.58 GiB)
In/out packets (pass) 155378805/191656958 (84.70 GiB/198.58 GiB)
In/out packets (block) 1831791/10 (562.40 MiB/637 B)
In/out errors 0/0
Collisions 0
-
You are correct: the SG-3100 is no longer gigabit with the new firmware.
-
@david_moo Do you have any packages installed? If you have packages that put the NICs in promiscuous mode, throughput will drop significantly.
Examples could be: Darkstat, NTopNG, Snort, Suricata or any other IDS/IPS, statistics and traffic totals/aggregation packages.
-
@keyser
I have used squid and some other packages in the past, but removed them ages ago.
The only thing running in PROMISC mode is pflog, which is supposed to be, I think?
[21.05-RELEASE][root@pfSense]/root: ifconfig
mvneta0: flags=8a02<BROADCAST,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: OPT1usedwithHomeHob3000whenweneeded
    options=800bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
    ether 00:08:a2:0d:51:48
    inet6 fe80::208:a2ff:fe0d:5148%mvneta0 prefixlen 64 tentative scopeid 0x1
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mvneta1: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: LAN
    options=bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM>
    ether 00:08:a2:0d:51:49
    inet6 fe80::208:a2ff:fe0d:5149%mvneta1 prefixlen 64 scopeid 0x2
    inet 192.168.9.1 netmask 0xfffffe00 broadcast 192.168.9.255
    media: Ethernet 2500Base-KX <full-duplex>
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mvneta2: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=800bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
    ether 00:08:a2:0d:51:4a
    inet6 fe80::208:a2ff:fe0d:514a%mvneta2 prefixlen 64 scopeid 0x8
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
    groups: enc
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
    options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa
    inet 127.0.0.1 netmask 0xff000000
    groups: lo
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=100<PROMISC> metric 0 mtu 33184
    groups: pflog
pfsync0: flags=0<> metric 0 mtu 1500
    groups: pfsync
mvneta2.35: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: WAN
    options=80003<RXCSUM,TXCSUM,LINKSTATE>
    ether 00:08:a2:0d:51:4a
    inet6 fe80::208:a2ff:fe0d:514a%mvneta2.35 prefixlen 64 scopeid 0xd
    inet xxx.yyy.232.220 netmask 0xfffff800 broadcast xxx.yyy.239.255
    groups: vlan
    vlan: 35 vlanpcp: 0 parent interface: mvneta2
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mvneta1.666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: OpenVlan666
    options=3<RXCSUM,TXCSUM>
    ether 00:08:a2:0d:51:49
    inet6 fe80::208:a2ff:fe0d:5149%mvneta1.666 prefixlen 64 scopeid 0xe
    inet 172.16.0.1 netmask 0xffffff00 broadcast 172.16.0.255
    groups: vlan
    vlan: 666 vlanpcp: 0 parent interface: mvneta1
    media: Ethernet Other <full-duplex>
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
    options=80000<LINKSTATE>
    inet6 fe80::208:a2ff:fe0d:5148%ovpns1 prefixlen 64 scopeid 0xf
    inet 172.18.0.1 --> 172.18.0.2 netmask 0xffff0000
    groups: tun openvpn
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    Opened by PID 58270
[21.05-RELEASE][root@pfSense]/root:
-
@david_moo Yep, that looks normal so that is not it :-(
Hope you can find the bottleneck, because it should handle that without issues.
-
I keep testing and am getting a weird result now. My internet comes over a 60GHz wireless link. Wireless, of course, is not rock solid like wired, so I have avoided using it in my tests for the SG-3100 problem as much as possible.
I'm back to running iperf3 on the SG-3100. This maxes out at about 650Mbit with 100% CPU, fine.
My network is: Internet-Switch-AP(23)----------AP(22)-Switch--Switch-pfsense.
I have a linux box (192.168.9.2) on the same switch as the pfsense box (192.168.9.1), so testing to either is the same path.
Testing from the close AP (not going over the wireless part of the link) gives the following to the pfsense and linux boxes.
GP# iperf3 -c 192.168.9.1 -t 40
Connecting to host 192.168.9.1, port 5201
[ 4] local 192.168.8.22 port 53356 connected to 192.168.9.1 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 82.5 MBytes 689 Mbits/sec 17 346 KBytes
[ 4] 1.00-2.00 sec 74.6 MBytes 627 Mbits/sec 3 376 KBytes
[ 4] 2.00-3.00 sec 77.7 MBytes 653 Mbits/sec 4 294 KBytes
[ 4] 3.00-4.00 sec 73.9 MBytes 620 Mbits/sec 5 318 KBytes
[ 4] 4.00-5.01 sec 82.8 MBytes 691 Mbits/sec 3 372 KBytes
[ 4] 5.01-6.00 sec 79.8 MBytes 672 Mbits/sec 12 293 KBytes
[ 4] 6.00-7.01 sec 75.6 MBytes 631 Mbits/sec 1 320 KBytes
^C[ 4] 7.01-7.08 sec 5.53 MBytes 599 Mbits/sec 0 334 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-7.08 sec 552 MBytes 654 Mbits/sec 45 sender
[ 4] 0.00-7.08 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
GP# iperf3 -c 192.168.9.2 -t 40
Connecting to host 192.168.9.2, port 5201
[ 4] local 192.168.8.22 port 57128 connected to 192.168.9.2 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 114 MBytes 950 Mbits/sec 22 355 KBytes
[ 4] 1.00-2.00 sec 111 MBytes 936 Mbits/sec 11 385 KBytes
[ 4] 2.00-3.00 sec 112 MBytes 943 Mbits/sec 33 342 KBytes
[ 4] 3.00-4.00 sec 112 MBytes 938 Mbits/sec 0 537 KBytes
[ 4] 4.00-5.00 sec 112 MBytes 939 Mbits/sec 44 382 KBytes
[ 4] 5.00-6.00 sec 111 MBytes 936 Mbits/sec 10 402 KBytes
^C[ 4] 6.00-6.98 sec 110 MBytes 938 Mbits/sec 10 443 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-6.98 sec 782 MBytes 940 Mbits/sec 130 sender
[ 4] 0.00-6.98 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
As one can see, everything is normal for a 1Gbit network, with pfsense being maxed out CPU-wise for iperf3.
Now if I test again, but from the far side of the wireless link:
GP# iperf3 -c 192.168.9.1 -t 40
Connecting to host 192.168.9.1, port 5201
[ 4] local 192.168.8.23 port 51104 connected to 192.168.9.1 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 56.1 MBytes 469 Mbits/sec 0 776 KBytes
[ 4] 1.00-2.01 sec 64.7 MBytes 542 Mbits/sec 0 776 KBytes
[ 4] 2.01-3.02 sec 37.5 MBytes 310 Mbits/sec 0 776 KBytes
[ 4] 3.02-4.02 sec 38.8 MBytes 325 Mbits/sec 0 776 KBytes
[ 4] 4.02-5.03 sec 38.8 MBytes 321 Mbits/sec 0 776 KBytes
[ 4] 5.03-6.01 sec 36.3 MBytes 310 Mbits/sec 0 776 KBytes
[ 4] 6.01-7.02 sec 38.8 MBytes 322 Mbits/sec 0 776 KBytes
[ 4] 7.02-8.02 sec 38.8 MBytes 325 Mbits/sec 0 776 KBytes
[ 4] 8.02-9.02 sec 36.3 MBytes 305 Mbits/sec 0 776 KBytes
[ 4] 9.02-10.01 sec 38.8 MBytes 327 Mbits/sec 0 776 KBytes
[ 4] 10.01-11.00 sec 54.1 MBytes 458 Mbits/sec 0 776 KBytes
[ 4] 11.00-12.00 sec 55.5 MBytes 465 Mbits/sec 43 580 KBytes
[ 4] 12.00-13.00 sec 58.2 MBytes 488 Mbits/sec 0 650 KBytes
[ 4] 13.00-14.02 sec 57.4 MBytes 474 Mbits/sec 0 694 KBytes
[ 4] 14.02-15.05 sec 50.4 MBytes 411 Mbits/sec 0 740 KBytes
[ 4] 15.05-16.01 sec 36.3 MBytes 316 Mbits/sec 0 740 KBytes
^C[ 4] 16.01-16.56 sec 21.3 MBytes 323 Mbits/sec 0 740 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-16.56 sec 758 MBytes 384 Mbits/sec 43 sender
[ 4] 0.00-16.56 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
GP# iperf3 -c 192.168.9.2 -t 40
Connecting to host 192.168.9.2, port 5201
[ 4] local 192.168.8.23 port 38320 connected to 192.168.9.2 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.01 sec 83.4 MBytes 695 Mbits/sec 0 1.50 MBytes
[ 4] 1.01-2.00 sec 92.8 MBytes 783 Mbits/sec 0 1.59 MBytes
[ 4] 2.00-3.00 sec 74.9 MBytes 627 Mbits/sec 0 1.59 MBytes
[ 4] 3.00-4.00 sec 94.0 MBytes 789 Mbits/sec 0 1.68 MBytes
[ 4] 4.00-5.02 sec 73.5 MBytes 607 Mbits/sec 0 1.68 MBytes
[ 4] 5.02-6.00 sec 52.5 MBytes 449 Mbits/sec 0 1.68 MBytes
[ 4] 6.00-7.01 sec 48.8 MBytes 406 Mbits/sec 0 1.68 MBytes
[ 4] 7.01-8.02 sec 48.8 MBytes 405 Mbits/sec 0 1.68 MBytes
[ 4] 8.02-9.00 sec 45.0 MBytes 384 Mbits/sec 0 1.68 MBytes
[ 4] 9.00-10.00 sec 48.8 MBytes 408 Mbits/sec 0 1.68 MBytes
[ 4] 10.00-11.00 sec 63.1 MBytes 531 Mbits/sec 0 1.68 MBytes
[ 4] 11.00-12.03 sec 86.0 MBytes 703 Mbits/sec 0 1.68 MBytes
[ 4] 12.03-13.02 sec 47.5 MBytes 400 Mbits/sec 0 1.68 MBytes
[ 4] 13.02-14.02 sec 48.8 MBytes 409 Mbits/sec 0 1.68 MBytes
[ 4] 14.02-15.00 sec 52.1 MBytes 445 Mbits/sec 0 1.68 MBytes
[ 4] 15.00-16.01 sec 46.3 MBytes 386 Mbits/sec 0 1.68 MBytes
[ 4] 16.01-17.02 sec 48.8 MBytes 406 Mbits/sec 0 1.68 MBytes
[ 4] 17.02-18.00 sec 54.8 MBytes 465 Mbits/sec 0 1.68 MBytes
[ 4] 18.00-19.00 sec 85.3 MBytes 719 Mbits/sec 0 2.53 MBytes
[ 4] 19.00-20.00 sec 93.0 MBytes 780 Mbits/sec 0 2.53 MBytes
[ 4] 20.00-21.00 sec 93.0 MBytes 781 Mbits/sec 0 2.53 MBytes
[ 4] 21.00-22.01 sec 88.3 MBytes 736 Mbits/sec 0 2.53 MBytes
[ 4] 22.01-23.00 sec 92.5 MBytes 780 Mbits/sec 0 2.53 MBytes
[ 4] 23.00-24.01 sec 94.4 MBytes 788 Mbits/sec 0 2.53 MBytes
[ 4] 24.01-25.01 sec 89.0 MBytes 747 Mbits/sec 0 2.53 MBytes
[ 4] 25.01-26.01 sec 89.7 MBytes 751 Mbits/sec 0 2.53 MBytes
[ 4] 26.01-27.00 sec 92.0 MBytes 778 Mbits/sec 0 2.53 MBytes
^C[ 4] 27.00-27.17 sec 16.2 MBytes 804 Mbits/sec 0 2.53 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-27.17 sec 1.90 GBytes 600 Mbits/sec 0 sender
[ 4] 0.00-27.17 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
GP#
I don't understand this result. If I let it run longer, the results don't really change. I am looking at what they peak at. The linux box peaks at about 780Mbit, which is around the max on the link. The pfsense box peaks at 540Mbit (best case) or so.
I would assume the pfsense and linux boxes should give identical answers until we hit the CPU limit of iperf3 on the pfsense box, but that's not the case. The TCP congestion control seems to be behaving differently if it talks to the pfsense box vs the linux box? Is that expected?
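Comparing peaks by eye across long iperf3 runs is error-prone, so the per-interval rates can be extracted and summarised instead. The following is a minimal sketch, assuming the plain-text iperf3 client output shown in this thread with rates reported in Mbits/sec; the names `interval_rates` and `summarize` are made up for this example, and the sample rows are the first few intervals of the far-side test against 192.168.9.1.

```python
import re

# Matches per-interval rows such as
# "[ 4] 0.00-1.00 sec 56.1 MBytes 469 Mbits/sec 0 776 KBytes"
# Assumption: the rate column is always printed in Mbits/sec.
INTERVAL = re.compile(
    r"\[\s*\d+\]\s+\d+\.\d+-\d+\.\d+\s+sec\s+[\d.]+\s+\w?Bytes\s+([\d.]+)\s+Mbits/sec"
)


def interval_rates(iperf_output):
    """Return the per-interval rates in Mbit/s, skipping summary rows."""
    rates = []
    for line in iperf_output.splitlines():
        # The sender/receiver totals use the same layout, so drop them.
        if "sender" in line or "receiver" in line:
            continue
        match = INTERVAL.search(line)
        if match:
            rates.append(float(match.group(1)))
    return rates


def summarize(iperf_output):
    """Return (peak, mean) of the per-interval rates in Mbit/s."""
    rates = interval_rates(iperf_output)
    return max(rates), sum(rates) / len(rates)


SAMPLE = """\
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 56.1 MBytes 469 Mbits/sec 0 776 KBytes
[ 4] 1.00-2.01 sec 64.7 MBytes 542 Mbits/sec 0 776 KBytes
[ 4] 2.01-3.02 sec 37.5 MBytes 310 Mbits/sec 0 776 KBytes
[ 4] 3.02-4.02 sec 38.8 MBytes 325 Mbits/sec 0 776 KBytes
[ 4] 0.00-16.56 sec 758 MBytes 384 Mbits/sec 43 sender
"""

if __name__ == "__main__":
    peak, mean = summarize(SAMPLE)
    print(f"peak {peak:.0f} Mbit/s, mean {mean:.1f} Mbit/s")
```

Running both targets through the same summary makes it easier to compare like with like, since a single good interval can otherwise dominate the impression of a run.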
-
@david_moo said in SG-3100 slow not getting gigabit.:
The TCP congestion control seems to be behaving differently if it talks to the pfsense box vs the linux box? Is that expected?
Yes. pfSense is not optimised as a server, a TCP end point, so running iperf on it directly will almost always give a lower result than actually testing through it, even allowing for the fact that iperf itself uses a lot of CPU cycles.
Steve
-
@stephenw10
Great thanks, explained!
-
I cannot get more than 480Mbps routed out of the SG-3100. Getting ~600 would be an improvement for me.
So, why does process "[intr{mpic0: nmvneta1}]" use 100% of CPU when running iperf?
-
It's interrupt load from the NIC.
Are you running iperf3 on the 3100 directly? Or that's just when running through it?
A number of things will appear as interrupt load like that, notably pf. So if you have a very large number of rules, traffic shaping, or something complex in the ruleset, that's where it will show when loaded.
Steve
-
I'm running iperf between 2 nodes on different vlans... i.e., using the SG-3100 as a router/firewall only. With that I'm still maxed out at 480Mbps.... If I turn Suricata back on, it drops down to ~450. :(
-
VLANs on the same interface? Try between different NICs if you can.
-
@stephenw10
That's brilliant... OK, using separate interfaces for VLANs, I was able to get 760Mbps with iperf. Still significantly shy of advertised performance, but probably as good as the current network design can sustain (i.e., using a single trunk port).
Also, it's the same thing (different PID/NIC) that maxes out the CPU on the SG-3100....
[intr{mpic0: nmvneta0}]
[intr{mpic0: nmvneta1}]
-
@msf2000
I think we need more of an explanation.....
If I am understanding correctly we have:
vlan #1 -> port 1 -> SG-3100 -> port 2 -> vlan #2.
If that is the case, the SG-3100 is routing in a very standard way and should be pushing in/out 940Mbps (max for a 1Gbit port). It's not doing that, why? Can the SG-3100 not handle it?
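The roughly 940Mbps ceiling for a 1Gbit port comes from per-frame overhead rather than from the router. A rough sketch of the arithmetic follows, assuming untagged IPv4 frames at a 1500-byte MTU with TCP timestamps enabled (12 bytes of options); the function name and its defaults are illustrative only.

```python
def tcp_goodput_mbps(line_rate_mbps=1000.0, mtu=1500,
                     ip_header=20, tcp_header=20, tcp_options=12):
    """Estimate the best-case TCP payload rate on an Ethernet link.

    Assumptions: untagged frames, IPv4, no loss, every frame full size.
    """
    # Bytes on the wire per frame beyond the MTU:
    # preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12).
    ethernet_overhead = 8 + 14 + 4 + 12
    payload = mtu - ip_header - tcp_header - tcp_options
    wire_bytes = mtu + ethernet_overhead
    return line_rate_mbps * payload / wire_bytes


if __name__ == "__main__":
    print(f"with timestamps:    {tcp_goodput_mbps():.1f} Mbit/s")
    print(f"without timestamps: {tcp_goodput_mbps(tcp_options=0):.1f} Mbit/s")
```

Under those assumptions each 1538-byte slot on the wire carries 1448 bytes of payload, which lines up with the 936 to 950 Mbits/sec intervals in the wired iperf3 run to the linux box earlier in the thread.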
-
If both VLANs are using the switch ports they are sharing a single parent NIC.
The mvneta NIC/driver is single queue so only one CPU core can service it in any direction.
If you test between a VLAN on LAN and a VLAN on OPT, for example, you are using two NICs and hence two queues that both CPU cores can service.
I would not expect anything to have changed there between 2.4.5 and 21.0X.
Steve
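Whether two interfaces share a parent NIC, and therefore a single queue, can be read from the `parent interface:` lines of `ifconfig`. Here is a minimal sketch of that check, assuming multi-line ifconfig output in the usual FreeBSD layout; `vlan_parents` and `shares_parent_nic` are hypothetical helper names, and the sample is trimmed from the output posted earlier in the thread.

```python
import re


def vlan_parents(ifconfig_output):
    """Map each VLAN interface name to its parent NIC."""
    parents = {}
    current = None
    for line in ifconfig_output.splitlines():
        header = re.match(r"^(\S+): flags=", line)
        if header:
            # A new interface stanza starts here.
            current = header.group(1)
            continue
        parent = re.search(r"parent interface:\s*(\S+)", line)
        if parent and current:
            parents[current] = parent.group(1)
    return parents


def shares_parent_nic(if_a, if_b, parents):
    """True when both interfaces resolve to the same physical NIC."""
    return parents.get(if_a, if_a) == parents.get(if_b, if_b)


SAMPLE = """\
mvneta1: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: LAN
mvneta2.35: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    vlan: 35 vlanpcp: 0 parent interface: mvneta2
mvneta1.666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    vlan: 666 vlanpcp: 0 parent interface: mvneta1
"""

if __name__ == "__main__":
    parents = vlan_parents(SAMPLE)
    for pair in (("mvneta1", "mvneta1.666"), ("mvneta2.35", "mvneta1.666")):
        verdict = "same NIC" if shares_parent_nic(*pair, parents) else "different NICs"
        print(f"{pair[0]} <-> {pair[1]}: {verdict}")
```

A pair reported as the same NIC is the case described above, where traffic in both directions lands on one single-queue interface.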
-
The 760Mbps figure was routing between OPT1 and a LAN port. CPU was maxed with the nmvneta0 & 1 taking all of a core each.
I.e., this was my test setup:
Linux node 1 --> vlan #1 --> port 1 --> sg-3100 --> opt1 --> vlan 2 --> Linux node 2