Netgate Discussion Forum

    6100 10g port and vlans maxing at 1g speed

    • S
      SpaceBass
      last edited by SpaceBass

      hey folks,
      I've got a newish 6100. Mostly I quite like it. However I'm battling an issue with the 10g port.

      TL;DR - I can't get more than 1g speeds between subnets or to/from the 6100 itself.

      Details

      • I'm using ix0 for my LAN port - 10Gbase-Twinax <full-duplex,rxpause,txpause>
      • Switch shows 10g connection
      • Using a DAC cable rated for 10g, tried multiple DACs, including different brands
      • packages include: avahi, acme, iperf, ntopng, nut, pfBlockerNG-devel, pimd, Telegraf

      What I'm seeing

      • iperf3 between two hosts on same subnet - ~9 Gbits/sec
      • iperf3 between two hosts on different subnets - 941Mbits/sec
      • iperf3 from pfSense Subnet A interface to host on Subnet A ~ 940Mbits/sec
      • iperf3 from pfSense Subnet A to host on Subnet B ~ 940Mbits/sec
      • If I do multiple streams in iperf3 I get about 1.75Gbits/sec in all tests

      Observations

      • processor spikes to 55-65% on the 6100 for each test I run from the device itself

      The big questions
      Can the 6100 actually route (or even RX/TX) 10g traffic?
      What other troubleshooting tips would y'all suggest?
      Is there a hidden setting somewhere I need to tweak to get 10g routing and traffic?

      Thanks everyone!

      Examples

      • LAN is 10.15.1.0/24
      • SERVERS is 10.15.100.0/24

      From (presumably) LAN to host on LAN

      iperf3 -c 10.15.1.5
      - - - - - - - - - - - - - - - - - - - - - - - - -
      [ ID] Interval           Transfer     Bitrate         Retr
      [  5]   0.00-10.00  sec  1.16 GBytes   998 Mbits/sec    0             sender
      [  5]   0.00-10.21  sec  1.16 GBytes   977 Mbits/sec                  receiver
      

      From (presumably) LAN to host on Servers

      iperf3 -c 10.15.100.5
      [ ID] Interval           Transfer     Bitrate         Retr
      [  5]   0.00-10.00  sec   908 MBytes   762 Mbits/sec    0             sender
      [  5]   0.00-10.22  sec   908 MBytes   746 Mbits/sec                  receiver
      

      From (presumably) LAN to host on Servers with 4 streams

      iperf3 -c 10.15.100.5 -P 4
      [SUM]   0.00-10.01  sec  1.70 GBytes  1.46 Gbits/sec    0             sender
      [SUM]   0.00-10.22  sec  1.70 GBytes  1.43 Gbits/sec                  receiver
      

      From Servers interface to host on Servers with 4 streams

      iperf3 -B 10.15.100.1 -c 10.15.100.5 -P 4
      [SUM]   0.00-10.02  sec  1.73 GBytes  1.48 Gbits/sec    0             sender
      [SUM]   0.00-10.23  sec  1.73 GBytes  1.45 Gbits/sec                  receiver
      
      • D
        dnavas
        last edited by

        @spacebass I get 7G one way and 5G the other between two subnets (I have no idea why there's an asymmetry there, but the boxes are different), and I get well over 8G from gateway to subnet.
        I assume you've set MTU high on the 6100 LAN interface itself?

        I've noticed that if you bridge the 10G with the Ethernet ports, the bridged network struggles mightily. I had that setup while I was bringing up the 6100 and didn't have all the switches connected yet. Don't leave the bridge on if you want decent throughput.
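
        A quick console check to make sure nothing is still bridged (just a sketch; it assumes the usual FreeBSD behaviour where if_bridge interfaces join the "bridge" group):

        ifconfig -g bridge      # lists any bridge interfaces; ideally prints nothing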

        Are there any other interfaces in between that might be a problem? I ask because, while testing this, I reset my server's default VLAN on the switch, and when I went to set it back I lost connectivity. I rebooted the server, rebooted the switch, and wound up having to create a dummy untagged VLAN on that port and then remove it before the port was happy again (the port next to it was happy as a clam). Weird. Also, when I first plugged my gateway into the SFP28 switch it connected at 1G, which was ... not swell. Eventually that problem went away, but I didn't really do anything; it just resolved itself (it was 10G fiber then, DAC now). Double-check the switch-to-gateway connection?

        • S
          SteveITS Galactic Empire
          last edited by

          Netgate always says not to run the test on pfSense itself, since that will use CPU and slow things down.

          They list the 6100 as
          "IPERF3 Traffic: 9.93 Gbps
          IMIX Traffic: 2.73 Gbps"

          Per the docs, are you testing through WAN 3 and WAN 4, the 10 Gb ports? You mention "between subnets", but are those on other ports? 940 Mbps is pretty much the max on a gigabit port. LAN 1-4 are 2.5 Gbps, so if you're using those, at what speed are they connecting to the switch? (Visible on the dashboard.)
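
          If you'd rather check from the console than the dashboard, something like this shows the negotiated link speed and MTU (assuming ix0/ix1 are the 10 Gb SFP+ ports):

          ifconfig ix0 | grep -E 'media|mtu'
          # expect something like: media: Ethernet autoselect (10Gbase-Twinax <full-duplex,rxpause,txpause>)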

          Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
          When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
          Upvote 👍 helpful posts!

          • D
            dnavas @SteveITS
            last edited by

            Good point. In my case I was just hairpinning through a single 10Gb port across vlans, but OP should probably indicate what exists between A and B.

            • stephenw10S
              stephenw10 Netgate Administrator
              last edited by

              Mmm, 941Mbps is too close to 1G 'line rate' to be a coincidence IMO.

              Running iperf to/from the firewall itself will always give a worse result than testing through the firewall. But it can be a useful test as long as you realise the limitations.
              So here we can see the Servers interface must be linked at more than 1G, but we can't actually tell from the results shown whether the LAN is.
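
              For comparison, a test that goes through the firewall rather than to it would look something like this (just a sketch using the host addresses from your earlier posts; neither endpoint is the 6100 itself):

              # on a host in SERVERS (10.15.100.0/24)
              iperf3 -s
              # on a host in LAN (10.15.1.0/24); add -R to repeat in the reverse direction
              iperf3 -c 10.15.100.5 -t 30
              iperf3 -c 10.15.100.5 -t 30 -R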

              Steve

              • S
                SpaceBass @stephenw10
                last edited by SpaceBass

                 Here are some additional details:

                 All of these subnets are VLANs off the ix0 10g interface.
                 I know that means I'm running "router on a stick" (so to speak), but shouldn't I at least get close to half of the port's speed?

                 Here's a test from LAN (10.15.1.0/24) to VLAN 1012 (10.15.100.0/24). There are ANY/ANY rules for IPv4* on both interfaces.

                ╰─○ iperf3 -c 10.15.100.18
                
                [ ID] Interval           Transfer     Bitrate
                [  5]   0.00-10.00  sec   521 MBytes   437 Mbits/sec                  sender
                [  5]   0.00-10.00  sec   518 MBytes   434 Mbits/sec                  receiver
                

                Here's a test from LAN to another host on LAN

                ╰─○ iperf3 -c 10.15.1.5
                [ ID] Interval           Transfer     Bitrate
                [  5]   0.00-10.00  sec  10.9 GBytes  9.36 Gbits/sec                  sender
                [  5]   0.00-10.00  sec  10.9 GBytes  9.35 Gbits/sec                  receiver
                

                These remote hosts are on the same 10g switch (Unifi Aggregation Switch) as the pfSense appliance.

                I'm using default 1500 MTU.

                • D
                  dnavas
                  last edited by

                  It's fine, I've got the same setup. I also often use "-R" to run the traffic back, depending on the speed of the hosts involved.

                   I would definitely up your MTU to 9k. The CPU is capable of a little over 600k packets per second per core, which at 1500-byte packets is about what you are seeing. It would be exciting if pfSense ever got TNSR routing speeds for handling LAN routing, but until then, MTU is your friend for managing peak bandwidth.
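
                   For what it's worth, a quick console check of what the 10G parent interface is actually using (a sketch; the MTU is normally set on the interface page in the GUI, and every switch port and host in the path has to carry 9000-byte frames for jumbo to help):

                   ifconfig ix0 | grep -E 'mtu|media'     # current MTU and link speed on the 10G parent
                   ping -D -s 8972 10.15.100.5            # don't-fragment ping sized for a 9000-byte MTU path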

                  • stephenw10S
                    stephenw10 Netgate Administrator
                    last edited by

                    Hmm, I expect to see more than that. What does the CPU usage look like when you're testing?

                    What does top -HaSP at the command line show?

                    Steve

                    • S
                      SpaceBass @stephenw10
                      last edited by

                      @stephenw10 said in 6100 10g port and vlans maxing at 1g speed:

                      top -HaSP

                       Looks like at least two cores get pegged. I ran a test between two hosts, one on 10.15.1.0/24 and one on 10.15.100.0/24.

                      /root: top -HaSP
                      last pid: 53556;  load averages:  1.27,  0.75,  0.55    up 1+21:18:12  22:35:13
                      656 threads:   8 running, 631 sleeping, 17 waiting
                      CPU 0:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
                      CPU 1:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
                      CPU 2:  1.2% user,  0.0% nice, 51.0% system,  0.0% interrupt, 47.8% idle
                      CPU 3:  1.2% user,  0.0% nice,  2.7% system,  0.0% interrupt, 96.1% idle
                      Mem: 297M Active, 570M Inact, 715M Wired, 6225M Free
                      ARC: 357M Total, 88M MFU, 265M MRU, 32K Anon, 1149K Header, 3206K Other
                           132M Compressed, 291M Uncompressed, 2.20:1 Ratio
                      
                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        Hmm, you see that load with only ~500Mbps passing?

                        Does it show what process is using that in the full top output? ntop-ng perhaps?

                        • S
                          SpaceBass
                          last edited by

                          @stephenw10 said in 6100 10g port and vlans maxing at 1g speed:

                          Hmm, you see that load with only ~500Mbps passing?

                          Does it show what process is using that in the full top output? ntop-ng perhaps?

                           It looks like it's if_io_tqg ... not sure what that is?
                           I ensured all SNMP is disabled and uninstalled ntop-ng; still getting less than 1Gbit/sec.

                          PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
                            0 root        -76    -     0B   736K CPU2     2 191:10  99.79% [kernel{if_io_tqg_2}]
                            0 root        -76    -     0B   736K CPU3     3 190:40  99.79% [kernel{if_io_tqg_3}]
                            0 root        -76    -     0B   736K -        0 223:36  95.08% [kernel{if_io_tqg_0}]
                            0 root        -76    -     0B   736K -        1 182:50  72.32% [kernel{if_io_tqg_1}]
                           11 root        155 ki31     0B    64K RUN      1  57.7H  19.83% [idle{idle: cpu1}]
                          
                          • stephenw10S
                            stephenw10 Netgate Administrator
                            last edited by

                             Those are the NIC driver queues, which is where the load for routing and filtering should appear.
                            But you should easily be able to pass 1Gbps there.

                            Do you see the same restriction between other interfaces?
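
                             If it helps, a couple of console views while a test is running (a sketch, assuming the ix0 queues shown in your top output):

                             vmstat -i | grep ix0      # per-queue interrupt counts for the 10G NIC
                             netstat -I ix0 -w 1       # live packets and bytes per second on ix0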

                            • S
                              SpaceBass @stephenw10
                              last edited by

                              @stephenw10 I'm only using one 10g interface. I'll try the 2nd one and report back.

                              • D
                                dnavas
                                last edited by dnavas

                                Learning a lot watching this thread.
                                That said, I'm not expecting more than 620ish kpps. Would be happy to be proven wrong!
                                https://ipng.ch/s/articles/2021/11/26/netgate-6100.html

                                • S
                                  SpaceBass @dnavas
                                  last edited by

                                  @dnavas
                                   interesting! A lot of that post is beyond my level of expertise. But is it fair to infer from your findings that you never got full 10G speed when routing across networks?

                                  • stephenw10S
                                    stephenw10 Netgate Administrator
                                    last edited by

                                     The 6100 will not pass 10Gbps. There are many variables, but I expect to see something in the 3-4Gbps range between the two 10G NICs.
                                     But you are seeing a restriction at a far lower level. Even given the single TCP stream and that it's between VLANs on the same NIC, I expect to see more.
                                    I'm setting up my own test now...

                                    • S
                                      SpaceBass @stephenw10
                                      last edited by SpaceBass

                                      @stephenw10 said in 6100 10g port and vlans maxing at 1g speed:

                                      The 6100 will not pass 10Gbps.

                                      Interesting
                                       Does that mean, in the context of this marketing language: IPERF3 Traffic: 18.50 Gbps, that 18.50 Gbps only refers to LAN <-> WAN? I mean, that'd be bottlenecked by the 10g ports, right? So is that some sort of WAN LAGG setup?

                                       What about this: IPERF3 Traffic: 9.93 Gbps? Does that just mean LAN <-> WAN with firewall rules, not routing across subnets?

                                      • stephenw10S
                                        stephenw10 Netgate Administrator
                                        last edited by

                                         The 18.5Gbps figure is a total throughput (all interfaces) value for forwarding large-packet traffic (iperf, 1500B). Without filtering.

                                         For a single TCP stream hairpinned on the same interface using VLANs you hit the additional complication of how the queues load the cores. You will probably find you get different results if you repeat the test with a single iperf stream; throughput is better when the send and receive queues fall on different CPU cores. Testing with multiple streams avoids that. I usually use -P 4 since it's a 4-core CPU.
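
                                         Something like this on your side shows the difference (assuming your 10.15.100.5 host is still running the iperf3 server):

                                         iperf3 -c 10.15.100.5 -P 1 -t 30     # single stream: send and receive queues can land on the same core
                                         iperf3 -c 10.15.100.5 -P 4 -t 30     # four streams: spreads the flows across the NIC queues/cores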

                                        This is the loading I see when testing between VLANs on the ix0 port:

                                        last pid: 81510;  load averages:  1.36,  0.81,  0.56                                                                                      up 16+18:48:27  18:33:43
                                        670 threads:   10 running, 623 sleeping, 4 zombie, 33 waiting
                                        CPU 0:  0.4% user,  0.0% nice, 20.4% system, 15.7% interrupt, 63.5% idle
                                        CPU 1:  2.7% user,  0.0% nice,  1.2% system, 67.8% interrupt, 28.2% idle
                                        CPU 2:  7.5% user,  0.0% nice,  5.1% system, 58.4% interrupt, 29.0% idle
                                        CPU 3:  0.4% user,  0.0% nice, 10.2% system, 19.2% interrupt, 70.2% idle
                                        Mem: 1233M Active, 211M Inact, 1230M Laundry, 761M Wired, 4272M Free
                                        ARC: 357M Total, 242M MFU, 106M MRU, 296K Anon, 1610K Header, 6869K Other
                                             117M Compressed, 291M Uncompressed, 2.48:1 Ratio
                                        Swap: 1024M Total, 364M Used, 660M Free, 35% Inuse
                                        
                                          PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
                                           12 root        -72    -     0B   560K CPU3     3   3:41  81.58% [intr{swi1: netisr 0}]
                                           12 root        -72    -     0B   560K CPU0     0   2:47  72.43% [intr{swi1: netisr 3}]
                                           11 root        155 ki31     0B    64K RUN      3 385.9H  70.77% [idle{idle: cpu3}]
                                           11 root        155 ki31     0B    64K RUN      0 387.1H  66.14% [idle{idle: cpu0}]
                                           11 root        155 ki31     0B    64K RUN      1 385.7H  32.37% [idle{idle: cpu1}]
                                           11 root        155 ki31     0B    64K RUN      2 386.0H  30.00% [idle{idle: cpu2}]
                                            0 root        -76    -     0B   960K CPU0     0   1:40  17.55% [kernel{if_io_tqg_0}]
                                            0 root        -76    -     0B   960K -        3   1:06  11.07% [kernel{if_io_tqg_3}]
                                        

                                        That's between two 1G clients but with the port linked to a switch at 10G.
                                        It passes 1G as expected:

                                        [ ID] Interval           Transfer     Bitrate
                                        [  5]   0.00-60.00  sec  1.60 GBytes   230 Mbits/sec                  receiver
                                        [  8]   0.00-60.00  sec  1.65 GBytes   237 Mbits/sec                  receiver
                                        [ 10]   0.00-60.00  sec  1.62 GBytes   232 Mbits/sec                  receiver
                                        [ 12]   0.00-60.00  sec  1.62 GBytes   232 Mbits/sec                  receiver
                                        [SUM]   0.00-60.00  sec  6.50 GBytes   930 Mbits/sec                  receiver
                                        

                                        ntop-ng is also running on that box but not on either interface in the test.

                                        Steve

                                        • S
                                          SpaceBass @stephenw10
                                          last edited by

                                          @stephenw10
                                          I do see a difference with -P 4 vs -P 1

                                           With 4 streams I get about 1.5 Gbits/sec, and with 1 stream I get 650 Mbits/sec.

                                          Does that lead us to conclude that, when using a single 10g interface for VLANs, I will never get more than about 1.5 Gbits/sec because of processor constraints?

                                           Would I see better performance if I put some of the VLANs on ix1 (the other 10g interface)?

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            Yes, I would expect to see better performance routing between different NICs. Especially for single TCP connections.

                                             One interesting thing, though, is that your top output does not appear to show the interrupt load like mine does. It could just be missing from your screenshot, but that would also imply it is using less CPU time than the NIC queues, unlike in my test. I wonder if you have something else running that appears as load there. Traffic shaping maybe?
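
                                             A couple of quick console checks would rule that out (a sketch; the dnctl subcommand is from memory, and pfSense also lists limiters under Diagnostics > Limiter Info):

                                             pfctl -s queue       # lists any ALTQ traffic-shaper queues
                                             dnctl pipe show      # lists any dummynet limiters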

                                            Steve
