Netgate Discussion Forum

Slow VLAN-to-VLAN Performance on C2758 (Supermicro 5018A-FTN4)

Downloadski

Well, try 8 streams then.

It could be that the packet flow is not optimized for traffic originating from or terminating on the firewall itself, but rather for pushing data through from LAN to WAN.

I assume that when pfSense is the server and you send data to it, all rules have to be inspected, and you hit the last one (logically).
Perhaps move it to the top for a test and see if it improves; then you know whether going through the rules has such an impact.

With pfSense as the client it is pushing data out, so no filters at all, I assume.

And that 219 is the total of the 4 sessions that ran in parallel.

Thinking about it, the pfSense lagg seems to be working: if the total is above 125 MB/sec, it must be going over more than one link.

If that does not work for a PC connected to your switch, the switch is not distributing the sessions over multiple links. That is the default behaviour for a single connected PC.

You could test in parallel with 2 clients against the pfSense server.
If the total for the 2 clients goes above 125 MB/sec, the traffic goes over more than 1 link.

      (You will never see 125 MB/sec in real world over 1GE links)
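A minimal sketch of that two-client test, assuming iperf 2.x and the 192.168.10.1 pfSense address used elsewhere in this thread (stream counts are just an example):

    # On pfSense (server side):
    iperf -s -p 5001 -f M

    # On each of the two client PCs, started at the same time:
    iperf -c 192.168.10.1 -P 4 -p 5001 -f M -t 10

If the two client totals add up to more than ~125 MB/sec, the traffic is being spread over more than one member link.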

oddworld19

        Okay. Here are the results. What do they mean?

        8 streams w/ pfsense as server  [SUM]  0.0-10.0 sec  1133 MBytes  113 MBytes/sec
        12 streams w/ pfsense as server  [SUM]  0.0-10.1 sec  1138 MBytes  113 MBytes/sec
        32 streams w/ pfsense as server [SUM]  0.0-10.2 sec  1151 MBytes  113 MBytes/sec
        64 streams w/ pfsense as server  [SUM]  0.0-10.3 sec  1163 MBytes  113 MBytes/sec

        8 streams w/ pfsense as client  [SUM]  0.0-10.0 sec  2160 MBytes  216 MBytes/sec
        12 streams w/ pfsense as client  [SUM]  0.0-10.0 sec  2848 MBytes  284 MBytes/sec
        32 streams w/ pfsense as client  [SUM]  0.0-10.3 sec  3249 MBytes  314 MBytes/sec
        64 streams w/ pfsense as client  [SUM]  0.0-10.5 sec  3269 MBytes  312 MBytes/sec

        Supermicro SYS-5018A-FTN4 (Atom c2758)
        pfSense 2.3.2

oddworld19

When I test pfSense acting as the server, with two clients sending data at the same time using this:

          iperf -c 192.168.10.1 -P 64 -i 1 -p 5001 -f M -t 10
          

Then each client gets the following result:

          Client #1
          [SUM]  0.0-10.2 sec  1153 MBytes  113 MBytes/sec

          Client #2
          [SUM]  0.0-10.3 sec  1162 MBytes  113 MBytes/sec

When I run the following instead, each client gets only 56.9 MB/sec:

          iperf -c 192.168.10.1 -P 1 -i 1 -p 5001 -f M -t 10
          

That tells me LACP is working, because I could saturate the line with -P 64.

          Why does a single -P stream not saturate the line?
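One thing worth ruling out first (my suggestion, not something tested above): a single TCP stream can be capped by the default TCP window rather than by the wire, so the same single-stream run can be retried with a larger window via iperf's -w option:

    # Hypothetical re-test: single stream, 512 KB TCP window on both ends
    iperf -s -p 5001 -w 512k                                       # on pfSense
    iperf -c 192.168.10.1 -P 1 -i 1 -p 5001 -f M -t 10 -w 512k     # on the client

If that single stream then reaches ~113 MB/sec, the limit was the window size, not the hardware or the lagg.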


Downloadski

It seems to me the LACP is working on the pfSense side.

I do not know Juniper, but with a Cisco you can only do LACP if the physical member interfaces have the same configuration: port speed, duplex, MDI-X, etc.
Perhaps you can check that?

            Read this one: http://www.juniper.net/techpubs/en_US/junos15.1/topics/concept/interfaces-hashing-lag-ecmp-understanding.html

Standard hashing is on the payload, it seems; all the iperf packets might have the same payload, so they end up on one of the members of the link.
Change it to Layer 2 info; your clients have different MAC addresses.
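On the pfSense side, the layers the lagg hashes on can be inspected, and temporarily changed, from a shell; a sketch, noting that pfSense normally manages lagg0 itself and such a change will not persist:

    # Show which layers lagg0 hashes on (l2 = MACs, l3 = IPs, l4 = ports):
    ifconfig lagg0 | grep lagghash

    # Experiment: hash on L2 only, so clients with different MAC addresses
    # land on different member links:
    ifconfig lagg0 lagghash l2

Note that this lagghash only governs the transmit direction from pfSense; the receive direction is decided by the switch's own hashing.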

oddworld19

              I see more info on hashing here:  https://forums.juniper.net/t5/Ethernet-Switching/EX2200-LACP-hashing-algorithm/td-p/107844

I don't see any issue with LACP. I'm expecting it to use only one gigabit link for each of four separate clients. I think it's strange that I need more than one stream of iperf to saturate the line. In my tests on the management interface, I plugged a Linux machine directly into the management port. No switch involved.

Any idea why iperf needs -P 32 (32 streams) to saturate the line?

Information about my LACP link is included below:

              root> show interfaces ae0
              Physical interface: ae0, Enabled, Physical link is Up
                Interface index: 128, SNMP ifIndex: 599
                Description: pfsense
                Link-level type: Ethernet, MTU: 1514, Speed: 4Gbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled,
                Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth needed: 0
                Device flags   : Present Running
                Interface flags: SNMP-Traps Internal: XXXXXXX
                Current address: XXXXX, Hardware address: XXXXX
                Last flapped   : 2016-04-14 17:32:53 CDT (20:17:55 ago)
                Input rate     : 113307208 bps (13345 pps)
                Output rate    : 113834880 bps (13366 pps)
              
                Logical interface ae0.0 (Index 65) (SNMP ifIndex 603)
                  Flags: SNMP-Traps 0xc0004000 Encapsulation: ENET2
                  Statistics        Packets        pps         Bytes          bps
                  Bundle:
                      Input :         14611          0        916446            0
                      Output:       2293348          0     245371565            0
                  Adaptive Statistics:
                      Adaptive Adjusts:          0
                      Adaptive Scans  :          0
                      Adaptive Updates:          0
                  Protocol eth-switch
                    Flags: Is-Primary, Trunk-Mode
              
              


Downloadski

What OS are the client PCs running?
It might be an issue with the iperf client on that OS version.

My test from a PC with Windows 7 to a FreeBSD 10.1 server was 350 MB/sec with one session in iperf.
It also depends on the buffer size, the packet size tested, etc., I think.

This was over a 10GE link on Intel cards.
The Wintel combination did not want to go faster than that, it seems.

FreeBSD to FreeBSD was simply close to line rate with one session.

Guest

I don't see any issue with LACP. I'm expecting it to use only one gigabit link for each of four separate clients.

Normally, four single lines are aggregated into one fat pipe that is then 4x (400%) the bandwidth of a single line, showing up here as 4 Gbit/s aggregated.

                  I think it's strange that I need more than one stream of iperf to saturate the line.

How many streams will you need to saturate one single line?

In my tests on the management interface, I plugged a Linux machine directly into the management port. No switch involved.

And no LAG, VLAN, or QoS anywhere, right?

Any idea why iperf needs -P 32 (32 streams) to saturate the line?

Each line has its own speed limit, but throughput mostly also depends on other circumstances besides.

                  Link-level type: Ethernet, MTU: 1514, Speed: 4Gbps, BPDU Error: None, MAC-REWRITE Error:
                  

1. What is the MTU size on all devices in that test?
2. How did you configure the LAG?
   - 2 lines sending and 2 lines receiving, or all 4 lines sending and receiving?
   - active/active (all lines in use), or active/passive (one line in use and the rest spare for failover)?

Normally you should have no need for such workarounds with your setup.
You can do the following things, in my eyes:
1. Set up a static (manual) LAG, use the round-robin method, and on top of that run 2 lines for sending and 2 lines for receiving (active/active).
2. Use your Layer 3 switch to route between the VLANs entirely inside the switch; that will be closer to wire speed, and the freed-up capacity on the pfSense box can perhaps be used for other things, or kept as a silent reserve (see the Junos sketch after this list).
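A minimal Junos sketch of option 2, with made-up VLAN names, IDs, and gateway addresses, assuming an EX-series switch that supports routed VLAN interfaces:

    set interfaces vlan unit 10 family inet address 192.168.10.254/24
    set interfaces vlan unit 20 family inet address 192.168.20.254/24
    set vlans LAN10 vlan-id 10 l3-interface vlan.10
    set vlans LAN20 vlan-id 20 l3-interface vlan.20

With routed VLAN interfaces in place, hosts use the switch as their gateway for VLAN-to-VLAN traffic, and pfSense only sees traffic that actually needs firewalling.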

oddworld19

How many streams will you need to saturate one single line?

                    It looks like "-P 2" will saturate the line, but "-P 1" will not.

In my tests on the management interface, I plugged a Linux machine directly into the management port. No switch involved.

And no LAG, VLAN, or QoS anywhere, right?

Correct. The management port does not have any LAGG, VLAN, or other tags. Just one computer plugged directly into the pfSense machine.

1. What is the MTU size on all devices in that test?
2. How did you configure the LAG?
   - 2 lines sending and 2 lines receiving, or all 4 lines sending and receiving?
   - active/active (all lines in use), or active/passive (one line in use and the rest spare for failover)?

To answer #1:
MTU on the Juniper switch is 1514 (Junos counts the 14-byte Ethernet header, so this matches the clients' 1500).
MTU on the Linux clients is 1500.
MTU on the pfSense LAGG is 1500.
MTU on pfSense igb0 / igb1 / igb2 / igb3 is 1500 on each.
A detailed ifconfig is below.

To answer #2:
The LAGG is configured as LACP over 4 lines. Each of the 4 lines both sends and receives. If one line goes down, the Juniper will ignore it and use the remaining good lines. Only one line is necessary to maintain a working connection.
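For reference, a sketch of the hand-built FreeBSD equivalent of that LAGG (pfSense creates this itself from the web GUI; shown only to make the configuration explicit):

    ifconfig lagg0 create
    ifconfig lagg0 laggproto lacp laggport igb0 laggport igb1 laggport igb2 laggport igb3
    ifconfig lagg0 inet 192.168.10.1 netmask 255.255.255.0 up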

                    ifconfig on pfsense:
                    
                    THIS IS THE LAGG
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
        ether XX:XX:XX:XX:XX:XX
        inet6 XXXXXXXXXXXXX%lagg0 prefixlen 64 scopeid 0xb
        inet 192.168.10.1 netmask 0xffffff00 broadcast 192.168.10.255
        inet 10.10.10.1 netmask 0xffffffff broadcast 10.10.10.1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        laggproto lacp lagghash l2,l3,l4
        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

THIS IS MANAGEMENT PORT
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWTSO>
        ether XXXXXXXXXXXX
        inet6 XXXXXXXXXXXX%em1 prefixlen 64 scopeid 0x2
        inet 192.168.5.1 netmask 0xffffff00 broadcast 192.168.5.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier

THIS IS ONE OF THE PORTS INCLUDED IN THE LAGG
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
        ether XXXXXXXXXXXX
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
                    


mikeisfly

PC1: Core i5-3470 3.2 GHz w/ 4GB of RAM, Samsung SSD 830 EVO, OS Windows 10 Pro (10.0.10586)
PC2: Core i5-2400 3.1 GHz w/ 8GB of RAM, Samsung SSD 840 EVO, OS Windows 10 Pro (10.0.10586)
A network share was configured on PC1. The test file:

                      Spartacus Season 1 Episode 1 Past Transgressions.mkv:
                      file size is 4,583,539 KB (About 4.3 GB)

Both PCs are connected to an HP ProCurve 2810-24G, and I have a 4-port LAGG (LACP) going back to a Brocade FastIron 648P. From the Brocade I have a single gigabit port going to my pfSense firewall, which uses the built-in Intel NIC on the motherboard as the LAN port. The LAN port is sub-interfaced with 5 virtual ports.
              GbE                              x4 GbE (LAG)
[pfSense]----------[Brocade FastIron 648P]-------------------[ProCurve]------[PC1]
                                                                       |-----[PC2]
pfSense is a Core i5-3470 running at 3.2 GHz with 4GB of RAM. My current version of pfSense is 2.3 Release, 64-bit. I have 10 OpenVPN tunnels with not much traffic going across them at the moment, and my CPU is usually at 1% from what I can observe. At the time of the test the only other traffic was YouTube from a Chromecast.

Test 1:
PC1 to PC2 on the same subnet
Trial 1 took 41.01 seconds to transfer the test file indicated above, which works out to 873.17 Mbps.

Test 2:
PC1 to PC2 on different subnets
Trial 1 took 45.28 seconds to transfer the test file indicated above, which works out to 790.83 Mbps.

These are the fastest times for each test. I ran 3 trials for each test to get a more accurate idea of how your network might perform. I have more data that I hope to publish later today.
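For anyone checking the arithmetic: the 873.17 figure matches if 1 KB is read as 1024 bytes and "Mbps" as 2^20 bits per second (my reading of the numbers, not stated in the post):

    # 4,583,539 KB x 1024 bytes x 8 bits, over 41.01 s, in 2^20-bit units:
    echo '4583539 * 1024 * 8 / 41.01 / 1048576' | bc -l
    # => 873.17...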

oddworld19

                        Thanks. That's interesting. You're not maxing out either.


mikeisfly

I would say that I'm pretty close, and if you look at trial 1, I'm not routing at all and I'm still not getting line rate. I'm pretty sure that has to do with the VLAN tags and also the TCP overhead.

Guest

I would say that I'm pretty close, and if you look at trial 1, I'm not routing at all and I'm still not getting line rate.

873 Mbit/s, plus TCP overhead, VLAN tags, QoS, and all the other running services, narrows down the entire throughput of your pfSense appliance.

I'm pretty sure that has to do with the VLAN tags and also the TCP overhead.

Each OpenVPN tunnel takes one core of the CPU or SoC, and all other packets also "eat" some CPU power, as far as I know. So what other packets and services are you running on that pfSense machine?

mikeisfly

                              Final Results:

Test 1 - No routing, both machines on the same subnet

        Time (s)  Speed (Mbps)
Pass 1  41.69     858.93
Pass 2  80.43     445.22
Pass 3  41.01     873.17

Test 2 - PCs on different subnets, pfSense routing across VLANs

        Time (s)  Speed (Mbps)
Pass 1  45.28     790.83
Pass 2  45.68     783.91
Pass 3  55.70     642.89

Test 3 - Cisco 2821 router inserted, handling the routing between the two subnets

        Time (s)  Speed (Mbps)
Pass 1  44.36     807.23
Pass 2  44.12     811.63
Pass 3  44.94     796.82

Summary - What I did here is drop the highest and lowest pass of each test (leaving the middle pass) and then compare tests 2 and 3 against test 1 (which is switching performance); the check below reproduces the numbers.

Performance hit:
Test 2: 8.73%
Test 3: 6.02%
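The percentages follow from comparing each test's middle pass against Test 1's middle pass:

    # Test 2 vs Test 1 (middle passes: 783.91 and 858.93 Mbps):
    echo '(858.93 - 783.91) / 858.93 * 100' | bc -l    # ~8.73
    # Test 3 vs Test 1 (middle pass: 807.23 Mbps):
    echo '(858.93 - 807.23) / 858.93 * 100' | bc -l    # ~6.02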

Conclusions:

Switching is faster than routing (duh!), but the ASICs in the Cisco router let it perform at nearly the same level as my pfSense firewall, which has higher-end hardware. From the results we can see that the Cisco router has about 2% better routing performance, which in my mind is well worth the trade-off for what pfSense gives me. I have done nothing in terms of optimization, which could bring pfSense even closer to the Cisco router, and as others have stated, a NIC with custom silicon might narrow the gap further. The purpose of this test was not to prove one platform better than another; I have always wanted to see charts with numbers across various hardware so people can decide what is best for them.

Lastly, the CPU in my pfSense firewall went from 1-2% load to 10-13% when routing across VLANs. At first that scared me, since a couple of routed streams across VLANs could add up to a big hit, so I added simultaneous transfers, which did not push the CPU above the 10-13% load (nice!).
