Slow VLAN-to-VLAN Performance on C2758 (Supermicro 5018A-FTN4)



  • Brief Summary

    I am only getting ~500 Mbit/s in iperf between the pfSense LAN NIC and a client on the LAN side. Any ideas what is wrong? Lots more detail below. Thanks so much for your help.

    Description of Network

    High-level overview of network:

    WAN -> pfSense -> Juniper EX2200 switch -> clients

    OS: pfSense 2.3

    Hardware: Supermicro 5018A-FTN4 with 8 GB of dual-channel ECC RAM

    One PCIe Intel PRO/1000 PT dual-port card: one port is dedicated to WAN, the other to management access

    The 4x onboard igb NICs are in a LAGG on the LAN side

    The LAGG connects to the Juniper switch, which is in active LACP mode.

    There are 7 VLANs. I understand this is a Layer 3 switch, but I am not using its Layer 3 services (still learning how to use the switch). At the moment, all VLAN-to-VLAN traffic is routed, not switched. With 4x NICs in the LAGG, I thought this was fine until I learn how to use the switch at Layer 3.

    (http://imgur.com/J7VLlBX)

    The only packages I'm running are OpenVPN export, iperf, and pfBlockerNG.

    Description of Problem

    Everything functions as intended, but I noticed that:

    (1) SMB copies between VLANs reach only 65 MB/s,

    (2) an iperf test against the LAGG shows about 500 Mbit/s when the pfSense LAN side is the server (and a LAN client is the client),

    (3) about 700 Mbit/s when pfSense is the client (and another Linux computer is the server), and

    (4) 3.7 Gbit/s from LAN to WAN.

    Question

    Why are VLAN-to-VLAN transfers slow? I understand that pfSense is routing instead of switching, but it should be capable of achieving the full gigabit, right? 500 Mbit/s seems like a configuration issue to me.

    Thank you for your help! Please let me know if I can provide any more information.

    **iperf data below**

    ```
    (1) LAN to WAN

    [2.3-RELEASE][admin@pfSense.localdomain]/root: iperf -s -B EXTERNAL_WAN_IP_REDACTED
        ------------------------------------------------------------
        Server listening on TCP port 5001
        Binding to local address EXTERNAL_WAN_IP_REDACTED
        TCP window size: 63.7 KByte (default)
        ------------------------------------------------------------
        [  4] local EXTERNAL_WAN_IP_REDACTED port 5001 connected with 192.168.10.1 port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  4]  0.0-10.0 sec  4.29 GBytes  3.68 Gbits/sec

    [2.3-RELEASE][admin@pfSense.localdomain]/root: iperf -B 192.168.10.1 -c EXTERNAL_WAN_IP_REDACTED
        ------------------------------------------------------------
        Client connecting to EXTERNAL_WAN_IP_REDACTED, TCP port 5001
        Binding to local address 192.168.10.1
        TCP window size: 63.8 KByte (default)
        ------------------------------------------------------------
        [  3] local 192.168.10.1 port 5001 connected with EXTERNAL_WAN_IP_REDACTED port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  3]  0.0-10.0 sec  4.29 GBytes  3.68 Gbits/sec

    (2) DebLab

    [2.3-RELEASE][admin@pfSense.localdomain]/root: iperf -s -B 192.168.10.1
        ------------------------------------------------------------
        Server listening on TCP port 5001
        Binding to local address 192.168.10.1
        TCP window size: 63.7 KByte (default)
        ------------------------------------------------------------
        [  4] local 192.168.10.1 port 5001 connected with 192.168.10.66 port 35507
        [ ID] Interval      Transfer    Bandwidth
        [  4]  0.0-10.1 sec  669 MBytes  558 Mbits/sec

    USERNAME_REDACTED@DebLab:~$ iperf -c 192.168.10.1
        ------------------------------------------------------------
        Client connecting to 192.168.10.1, TCP port 5001
        TCP window size: 85.0 KByte (default)
        ------------------------------------------------------------
        [  3] local 192.168.10.66 port 35507 connected with 192.168.10.1 port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  3]  0.0-10.0 sec  669 MBytes  560 Mbits/sec

    (3) Chewy

    [2.3-RELEASE][admin@pfSense.localdomain]/root: iperf -s -B 192.168.70.1
        ------------------------------------------------------------
        Server listening on TCP port 5001
        Binding to local address 192.168.70.1
        TCP window size: 63.7 KByte (default)
        ------------------------------------------------------------
        [  4] local 192.168.70.1 port 5001 connected with 192.168.70.10 port 46337
        [ ID] Interval      Transfer    Bandwidth
        [  4]  0.0-10.1 sec  585 MBytes  488 Mbits/sec

    USERNAME_REDACTED@chewy:~$ iperf -c 192.168.70.1
        ------------------------------------------------------------
        Client connecting to 192.168.70.1, TCP port 5001
        TCP window size: 85.0 KByte (default)
        ------------------------------------------------------------
        [  3] local 192.168.70.10 port 46337 connected with 192.168.70.1 port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  3]  0.0-10.0 sec  585 MBytes  490 Mbits/sec

    (4) Chewy UDP

    USERNAME_REDACTED@chewy:~$ iperf -c 192.168.70.1 -u -b 1000m
        ------------------------------------------------------------
        Client connecting to 192.168.70.1, UDP port 5001
        Sending 1470 byte datagrams
        UDP buffer size:  208 KByte (default)
        ------------------------------------------------------------
        [  3] local 192.168.70.10 port 60311 connected with 192.168.70.1 port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  3]  0.0-10.0 sec  970 MBytes  814 Mbits/sec
        [  3] Sent 692106 datagrams
        [  3] Server Report:
        [  3]  0.0-10.0 sec  950 MBytes  797 Mbits/sec  0.014 ms 14552/692105 (2.1%)
        [  3]  0.0-10.0 sec  1 datagrams received out-of-order

    (5) Inter-VLAN

    [2.3-RELEASE][admin@pfSense.localdomain]/root: iperf -s -B 192.168.70.1
        ------------------------------------------------------------
        Server listening on TCP port 5001
        Binding to local address 192.168.70.1
        TCP window size: 63.7 KByte (default)
        ------------------------------------------------------------
        [  4] local 192.168.70.1 port 5001 connected with 192.168.10.1 port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  4]  0.0-10.0 sec  4.29 GBytes  3.68 Gbits/sec

    [2.3-RELEASE][admin@pfSense.localdomain]/root: iperf -B 192.168.10.1 -c 192.168.70.1
        ------------------------------------------------------------
        Client connecting to 192.168.70.1, TCP port 5001
        Binding to local address 192.168.10.1
        TCP window size: 63.8 KByte (default)
        ------------------------------------------------------------
        [  3] local 192.168.10.1 port 5001 connected with 192.168.70.1 port 5001
        [ ID] Interval      Transfer    Bandwidth
        [  3]  0.0-10.0 sec  4.29 GBytes  3.69 Gbits/sec

    ```


  • Did you enable PowerD? (System/Advanced/Miscellaneous)



  • Thanks for the reply. Yes, PowerD is enabled, and MBUF has also been raised to 1,000,000.



  • A couple of things I see that could be your problem (or might not be):

    1. How fast are the hard drives in the machines that you are testing? ~60 MB/s seems right.
    2. The VLAN tag on each packet adds overhead relative to the payload, which will lower your speeds.
    3. Try using SMBv3; there is a dramatic performance boost (Windows 8+ to Windows 8+ or Server 2012 R2 supports SMBv3).



  • Thanks for the thoughts:

    1. HDD speed is not the limiter. I'm copying between an NVMe SSD that does 2050 MB/s on computer #1 and a 4x SATA SSD RAID 10 array that reads at 1000 MB/s. I've tested it on a separate 10G link and I get about 700 megabytes per second (not megabits).

    2. How much does VLAN tagging affect speed? Understood that a portion of the packet is occupied by the tag.

    3. Good point re: Samba. That's why I wanted to test with iperf.
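
To put a rough number on question 2, here is a small sketch of the arithmetic. It assumes standard Ethernet framing with a 1500-byte MTU, TCP over IPv4 with no TCP options, and a single 4-byte 802.1Q tag; none of these values were measured on the network in this thread.

```python
# Per-frame byte counts for standard Ethernet (assumed, not measured here).
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
ETH_HEADER = 14       # destination MAC, source MAC, EtherType
FCS = 4               # frame check sequence
INTERFRAME_GAP = 12   # mandatory idle time between frames, in byte times
VLAN_TAG = 4          # one 802.1Q tag
MTU = 1500            # IP packet size carried per frame
IP_TCP_HEADERS = 40   # 20 bytes IPv4 + 20 bytes TCP, no options


def tcp_goodput_mbit(line_rate_mbit: float, tagged: bool) -> float:
    """Best-case TCP payload rate on a link, with or without a VLAN tag."""
    wire_bytes = PREAMBLE_SFD + ETH_HEADER + MTU + FCS + INTERFRAME_GAP
    if tagged:
        wire_bytes += VLAN_TAG
    payload_bytes = MTU - IP_TCP_HEADERS
    return line_rate_mbit * payload_bytes / wire_bytes


untagged = tcp_goodput_mbit(1000, tagged=False)
tagged = tcp_goodput_mbit(1000, tagged=True)
print(f"untagged: {untagged:.1f} Mbit/s")
print(f"tagged:   {tagged:.1f} Mbit/s")
print(f"cost of the tag: {100 * (1 - tagged / untagged):.2f}%")
```

Under these assumptions the tag costs about a quarter of a percent on full-size frames (roughly 949 versus 947 Mbit/s on a gigabit link), so tagging by itself would not explain a drop to ~500 Mbit/s.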



  • Quick update on the matter.
    I ran a separate iperf test on the interface reserved for management. This interface is unique because (1) it is not switched through the Juniper ex2200 and (2) it is on the Intel PRO/1000 PT card, not the onboard NICs.

    Speeds were roughly the same at 570 Mbit/s. Although this does not solve the problem, it eliminates a few variables, which is useful.

    Your advice is appreciated.

    [2.3-RELEASE][root@pfSense.localdomain]/root: iperf -s -B 192.168.5.1
    ------------------------------------------------------------
    Server listening on TCP port 5001
    Binding to local address 192.168.5.1
    TCP window size: 63.7 KByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.5.1 port 5001 connected with 192.168.5.12 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0-10.0 sec   688 MBytes   575 Mbits/sec
    [  5] local 192.168.5.1 port 5001 connected with 192.168.5.12 port 43223
    [  5]  0.0-10.1 sec   681 MBytes   568 Mbits/sec
    
    

    I am also showing data from "top -SH"

    last pid: 42589;  load averages:  0.97,  0.84,  0.59                                      up 0+00:23:49  17:55:45
    235 processes: 16 running, 160 sleeping, 59 waiting
    CPU:  0.6% user,  0.0% nice, 19.8% system,  3.4% interrupt, 76.1% idle
    Mem: 105M Active, 47M Inact, 297M Wired, 208M Buf, 7419M Free
    Swap: 16G Total, 16G Free
    
      PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
       11 root     155 ki31     0K   128K CPU2    2  22:25 100.00% idle{idle: cpu2}
       11 root     155 ki31     0K   128K CPU5    5  22:29  95.56% idle{idle: cpu5}
       11 root     155 ki31     0K   128K RUN     6  21:34  95.36% idle{idle: cpu6}
       11 root     155 ki31     0K   128K RUN     7  22:42  90.97% idle{idle: cpu7}
       11 root     155 ki31     0K   128K CPU0    0  22:00  89.70% idle{idle: cpu0}
       11 root     155 ki31     0K   128K RUN     4  22:02  76.86% idle{idle: cpu4}
       11 root     155 ki31     0K   128K CPU3    3  22:05  63.96% idle{idle: cpu3}
       11 root     155 ki31     0K   128K CPU1    1  22:02  55.76% idle{idle: cpu1}
    98914 root      52    0 36548K  3456K CPU2    2   0:11  51.37% iperf{iperf}
        0 root     -92    -     0K   736K CPU6    6   0:17  46.78% kernel{igb0 que}
       12 root     -92    -     0K  1024K WAIT    3   0:26  23.29% intr{irq262: igb0:que}
       12 root     -92    -     0K  1024K WAIT    3   0:03   4.88% intr{irq289: igb3:que}
       12 root     -92    -     0K  1024K RUN     6   0:40   2.49% intr{irq283: igb2:que}
       12 root     -92    -     0K  1024K CPU0    0   0:43   2.39% intr{irq277: igb2:que}
       12 root     -92    -     0K  1024K RUN     6   0:34   2.10% intr{irq274: igb1:que}
       12 root     -92    -     0K  1024K RUN     6   0:33   2.10% intr{irq265: igb0:que}
       12 root     -92    -     0K  1024K WAIT    5   0:32   2.10% intr{irq291: igb3:que}
       12 root     -92    -     0K  1024K WAIT    3   0:33   1.86% intr{irq280: igb2:que}
       12 root     -92    -     0K  1024K WAIT    0   0:19   0.88% intr{irq259: igb0:que}
       12 root     -92    -     0K  1024K WAIT    2   0:19   0.78% intr{irq270: igb1:que}
       12 root     -92    -     0K  1024K WAIT    2   0:17   0.59% intr{irq288: igb3:que}
       12 root     -92    -     0K  1024K WAIT    7   0:17   0.49% intr{irq284: igb2:que}
        0 root     -92    -     0K   736K -       1   0:51   0.00% kernel{igb0 que}
        0 root     -16    -     0K   736K swapin  4   0:48   0.00% kernel{swapper}
        0 root     -92    -     0K   736K -       0   0:39   0.00% kernel{em1 que}
       12 root     -92    -     0K  1024K WAIT    2   0:19   0.00% intr{irq261: igb0:que}
        0 root     -92    -     0K   736K -       1   0:15   0.00% kernel{igb2 que}
       12 root     -92    -     0K  1024K WAIT    4   0:11   0.00% intr{irq281: igb2:que}
       12 root     -60    -     0K  1024K WAIT    1   0:09   0.00% intr{swi4: clock}
        0 root     -92    -     0K   736K -       3   0:07   0.00% kernel{igb0 que}
       12 root     -92    -     0K  1024K WAIT    5   0:03   0.00% intr{irq264: igb0:que}
       12 root     -92    -     0K  1024K WAIT    5   0:03   0.00% intr{irq273: igb1:que}
       12 root     -92    -     0K  1024K WAIT    3   0:03   0.00% intr{irq271: igb1:que}
       12 root     -92    -     0K  1024K WAIT    0   0:03   0.00% intr{irq268: igb1:que}
       12 root     -92    -     0K  1024K WAIT    6   0:03   0.00% intr{irq292: igb3:que}
    53945 root      20    0 21616K  5752K select  0   0:02   0.00% openvpn
       12 root     -92    -     0K  1024K WAIT    7   0:02   0.00% intr{irq293: igb3:que}
       12 root     -92    -     0K  1024K WAIT    4   0:02   0.00% intr{irq263: igb0:que}
       12 root     -92    -     0K  1024K WAIT    7   0:02   0.00% intr{irq275: igb1:que}
    95329 root      20    0   224M 33636K nanslp  6   0:01   0.00% php
       12 root     -92    -     0K  1024K WAIT    1   0:01   0.00% intr{irq278: igb2:que}
       12 root     -52    -     0K  1024K WAIT    3   0:01   0.00% intr{swi6: task queue}
       12 root     -92    -     0K  1024K WAIT    5   0:01   0.00% intr{irq282: igb2:que}
        0 root     -92    -     0K   736K -       7   0:01   0.00% kernel{em0 que}
    
    


  • What kind of hardware are you running? It's interesting that when you ran the test from a port with no VLAN tagging you saw a 100 Mbps increase in performance. However, I would think you want to see something closer to wire speed. I will run the test on my hardware to see what I get. I am running a Core i5 with 4 GB of RAM. My LAN NIC is built in and I'm using VLANs as well. I also have an Intel dual-port Gigabit PCIe NIC.

    I also have a Cisco Router 2821 with two Gigabit ports. I will set up sub-interfaces on that router, connect it to a gigabit switch, and see what kind of speeds I get with that. I will run the iperf test between a Windows Server 2012 R2 machine and a Windows 10 machine, also with a Core i5 processor. I should have the results in a day.

    It will be interesting to see if I can get routing at wire speed.



  • A Cisco 2821 will not do line rate when routing.
    It is best seen as a CPU running software, with some ASICs added for certain functions.
    The CPU is a 64-bit RISC processor at 466 MHz (http://pmcs.com/cgi-bin/download_p.pl?res_id=4607&filename=2020578_004405.pdf)

    At work I have a few hundred of them in the field, and there we use them at up to 80 Mbit/s.
    But we run QoS, some ACLs, BGP, IP SLA and shaping on them.

    To add something useful: I tested my Supermicro A1SRi-2758F running pfSense from LAN to WAN with no packages and saw about line speed (with version 2.2.4). That was with 2 of the onboard NICs. I simply inserted the pfSense box in front of my download PC to test all the PC applications.
    The copy was from PC to NAS.

    A LAGG port will not load-balance a single session by default. It will use one link only, unless you run a round-robin algorithm. So a test from station to station will by default use 1 link out of your group of 4.
    A LAGG is only useful with multiple stations going over it, or with Windows and SMB 3.
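
The point about a single session staying on one link can be sketched in a few lines. This is only an illustration: the CRC32 hash over the address/port 4-tuple is a stand-in chosen for the example, and `pick_link` is a hypothetical helper. The real hash and which header fields it covers depend on the LAGG implementation and its settings on each end.

```python
import zlib


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links=4):
    """Map a flow to one LAGG member. CRC32 is a stand-in, not the real hash."""
    key = f"{src_ip}-{dst_ip}-{src_port}-{dst_port}".encode()
    return zlib.crc32(key) % n_links


# One TCP session: every packet carries the same 4-tuple,
# so every packet is sent over the same member link.
single = {pick_link("192.168.10.66", "192.168.10.1", 35507, 5001)
          for _ in range(1000)}
print("links used by one session:", len(single))

# Many sessions that differ in source port can land on different links.
many = {pick_link("192.168.10.66", "192.168.10.1", port, 5001)
        for port in range(35000, 35064)}
print("links used by 64 sessions:", len(many))
```

With any per-flow hash of this kind, one iperf stream can never exceed the speed of one member link, while several parallel streams may or may not spread out, depending on how their headers happen to hash.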



  • I also have a Cisco Router 2821 with two Gigabit ports

    That router does the real work in silicon. Cisco and other big vendors often use ASICs or FPGAs
    to do that job very fast, offloading many tasks from the CPU inside. That is not a fair comparison,
    because pfSense is a pure software firewall. If you want to compare something like this against a
    pfSense machine, buy a Chelsio adapter, which also has an ASIC/FPGA on it.

    There are two different kinds of LAG. One uses LACP and is the dynamic one; the
    other is set up manually and is the static one.

    You can also set them up as active/active or active/passive, and this might be
    more important than you think. Active/passive works as follows:
    if the first of the 2 to 8 links is broken or not working, the next one is put into
    use, and so on. With active/active, how it works depends on the algorithm,
    because normally the first of the 2 to 8 links must be fully saturated
    before the next one is used. If you use your 4 links as
    2 sending and 2 receiving, together with the round-robin algorithm that spreads all packets
    evenly over all links in the LAG, you will see different numbers and behavior.



  • pfSense is set to LAGG protocol "LACP"

    And the Juniper EX2200 is set to "LACP active"

    From what I can gather from the packet stats on the Juniper, packets seem to flow somewhat evenly across all four links. Yes, I am aware that 4x in LACP means up to four separate one-gigabit connections, not one four-gigabit connection.

    If anyone wants a few Juniper printouts/data/commands, I can provide them. I think the issue is my pfSense configuration, however.

    I've run iperf with 4-5 different hosts now. All yield about the same result.

    I was able to get a 100 MB/s transfer over Samba yesterday, but iperf is still showing 500 Mbit/s.

    Any suggestions for what's going on?



  • Perhaps you are testing a single session with iperf.
    I did some testing between FreeBSD NAS servers:

    iperf -c 192.168.3.153 -P 1 -i 1 -p 5001 -f M -t 10
    iperf -s -P 0 -i 1 -p 5001 -f M

    That increases the buffers somewhat, and I got about 1 GB/s.
    That was over 10GbE NICs on systems with an E3-1220.

    
    [root@zfsguru2 /]# iperf -c 192.168.3.153 -P 1 -i 1 -p 5001 -f M -t 10
    ------------------------------------------------------------
    Client connecting to 192.168.3.153, TCP port 5001
    TCP window size: 2.01 MByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.3.152 port 50742 connected with 192.168.3.153 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0- 1.0 sec   988 MBytes   988 MBytes/sec
    [  3]  1.0- 2.0 sec  1052 MBytes  1052 MBytes/sec
    [  3]  2.0- 3.0 sec  1135 MBytes  1135 MBytes/sec
    [  3]  3.0- 4.0 sec  1183 MBytes  1183 MBytes/sec
    [  3]  4.0- 5.0 sec  1075 MBytes  1075 MBytes/sec
    [  3]  5.0- 6.0 sec  1181 MBytes  1181 MBytes/sec
    [  3]  6.0- 7.0 sec  1180 MBytes  1180 MBytes/sec
    [  3]  7.0- 8.0 sec  1179 MBytes  1179 MBytes/sec
    [  3]  8.0- 9.0 sec  1180 MBytes  1180 MBytes/sec
    [  3]  9.0-10.0 sec  1180 MBytes  1180 MBytes/sec
    [  3]  0.0-10.0 sec  11346 MBytes  1134 MBytes/sec
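
Since buffers came up, here is a rough sketch of why the TCP window can matter for a single stream: at most one window of data can be in flight per round trip. The round-trip times below are hypothetical (nobody in this thread measured them), and the window iperf prints is not necessarily the effective window once the OS autotunes it, so treat this as an upper-bound illustration rather than a diagnosis.

```python
def window_limited_mbit(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on one TCP stream: one window of data per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


# Window sizes as printed by iperf in this thread (KByte/MByte are 1024-based).
WINDOWS = {
    "63.7 KByte (pfSense default above)": 63.7 * 1024,
    "2.01 MByte (this 10GbE test)": 2.01 * 1024 * 1024,
}
HYPOTHETICAL_RTTS_MS = (0.2, 0.5, 1.0)  # assumed values; measure yours with ping

for label, window in WINDOWS.items():
    for rtt in HYPOTHETICAL_RTTS_MS:
        limit = window_limited_mbit(window, rtt)
        print(f"{label}, RTT {rtt} ms: <= {limit:,.0f} Mbit/s")
```

If the real round-trip time on a LAN is well under a millisecond, a 64 KByte window alone may not be the limit, which is why measuring the RTT comes before drawing conclusions from this.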
    
    


  • Thank you for the reply. I have replicated the test as follows:

    [2.3-RELEASE][root@pfSense.localdomain]/root: iperf -s -P 0 -i 1 -p 5001 -f M -B 192.168.10.1
    ------------------------------------------------------------
    Server listening on TCP port 5001
    Binding to local address 192.168.10.1
    TCP window size: 0.06 MByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.10.1 port 5001 connected with 192.168.10.66 port 35989
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0- 1.0 sec  68.4 MBytes  68.4 MBytes/sec
    [  4]  1.0- 2.0 sec  68.6 MBytes  68.6 MBytes/sec
    [  4]  2.0- 3.0 sec  68.7 MBytes  68.7 MBytes/sec
    [  4]  3.0- 4.0 sec  68.4 MBytes  68.4 MBytes/sec
    [  4]  4.0- 5.0 sec  68.0 MBytes  68.0 MBytes/sec
    [  4]  5.0- 6.0 sec  68.1 MBytes  68.1 MBytes/sec
    [  4]  6.0- 7.0 sec  67.0 MBytes  67.0 MBytes/sec
    [  4]  7.0- 8.0 sec  68.1 MBytes  68.1 MBytes/sec
    [  4]  8.0- 9.0 sec  68.0 MBytes  68.0 MBytes/sec
    [  4]  9.0-10.0 sec  68.5 MBytes  68.5 MBytes/sec
    [  4]  0.0-10.1 sec   685 MBytes  68.2 MBytes/sec
    
    
    XXXXXX@YYYYYY:~$ iperf -c 192.168.10.1 -P 1 -i 1 -p 5001 -f M -t 10
    ------------------------------------------------------------
    Client connecting to 192.168.10.1, TCP port 5001
    TCP window size: 0.08 MByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.10.66 port 35989 connected with 192.168.10.1 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0- 1.0 sec  71.1 MBytes  71.1 MBytes/sec
    [  3]  1.0- 2.0 sec  69.1 MBytes  69.1 MBytes/sec
    [  3]  2.0- 3.0 sec  69.2 MBytes  69.2 MBytes/sec
    [  3]  3.0- 4.0 sec  67.4 MBytes  67.4 MBytes/sec
    [  3]  4.0- 5.0 sec  68.8 MBytes  68.8 MBytes/sec
    [  3]  5.0- 6.0 sec  68.1 MBytes  68.1 MBytes/sec
    [  3]  6.0- 7.0 sec  66.2 MBytes  66.2 MBytes/sec
    [  3]  7.0- 8.0 sec  68.4 MBytes  68.4 MBytes/sec
    [  3]  8.0- 9.0 sec  68.5 MBytes  68.5 MBytes/sec
    [  3]  9.0-10.0 sec  68.4 MBytes  68.4 MBytes/sec
    [  3]  0.0-10.0 sec   685 MBytes  68.4 MBytes/sec
    
    

    And going the other direction:

    [2.3-RELEASE][root@pfSense.localdomain]/root: iperf -c 192.168.10.66 -P 1 -i 1 -p 5001 -f M -t 10
    ------------------------------------------------------------
    Client connecting to 192.168.10.66, TCP port 5001
    TCP window size: 0.06 MByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.10.1 port 36323 connected with 192.168.10.66 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0- 1.0 sec   104 MBytes   104 MBytes/sec
    [  3]  1.0- 2.0 sec   107 MBytes   107 MBytes/sec
    [  3]  2.0- 3.0 sec   108 MBytes   108 MBytes/sec
    [  3]  3.0- 4.0 sec   107 MBytes   107 MBytes/sec
    [  3]  4.0- 5.0 sec   107 MBytes   107 MBytes/sec
    [  3]  5.0- 6.0 sec   108 MBytes   108 MBytes/sec
    [  3]  6.0- 7.0 sec   107 MBytes   107 MBytes/sec
    [  3]  7.0- 8.0 sec   108 MBytes   108 MBytes/sec
    [  3]  8.0- 9.0 sec   108 MBytes   108 MBytes/sec
    [  3]  9.0-10.0 sec   108 MBytes   108 MBytes/sec
    [  3]  0.0-10.0 sec  1071 MBytes   107 MBytes/sec
    [2.3-RELEASE][root@pfSense.localdomain]/root:
    
    
    XXXXX@YYYYYY:~$ iperf -s -P 0 -i 1 -p 5001 -f M
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 0.08 MByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.10.66 port 5001 connected with 192.168.10.1 port 36323
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0- 1.0 sec   102 MBytes   102 MBytes/sec
    [  4]  1.0- 2.0 sec   107 MBytes   107 MBytes/sec
    [  4]  2.0- 3.0 sec   108 MBytes   108 MBytes/sec
    [  4]  3.0- 4.0 sec   107 MBytes   107 MBytes/sec
    [  4]  4.0- 5.0 sec   107 MBytes   107 MBytes/sec
    [  4]  5.0- 6.0 sec   107 MBytes   107 MBytes/sec
    [  4]  6.0- 7.0 sec   107 MBytes   107 MBytes/sec
    [  4]  7.0- 8.0 sec   108 MBytes   108 MBytes/sec
    [  4]  8.0- 9.0 sec   108 MBytes   108 MBytes/sec
    [  4]  9.0-10.0 sec   108 MBytes   108 MBytes/sec
    [  4]  0.0-10.0 sec  1071 MBytes   107 MBytes/sec
    
    

    Rules for Management Link (192.168.5.XXX)

    Rules for LAN (192.168.10.XXX)



  • OK, that is still a single thread.

    Use -P 4 for 4 parallel streams.

    See http://www.jamescoyle.net/cheat-sheets/581-iperf-cheat-sheet



  • I set the "-P 0" flag on the server and the "-P 4" flag on the client. Is that right?

    With pfSense as server, I get 112 MBytes/s and with pfSense as client, I get 219 MBytes/s. That's strange... I'm not sure how I even reached 219. To be honest, I'm not sure how to interpret these results.
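
One way to read these numbers is to convert them back to Mbit/s and compare against a single gigabit link. The sketch below assumes iperf's "MByte" is 2**20 bytes, which is consistent with the first post's output (669 MBytes in 10 seconds reported as 560 Mbits/sec); `mbytes_to_mbits` is just a helper name made up for this example.

```python
MIB = 1024 * 1024  # assumed: iperf counts an "MByte" as 2**20 bytes


def mbytes_to_mbits(mbytes_per_sec: float) -> float:
    """Convert iperf '-f M' output (MBytes/sec) to decimal Mbit/s."""
    return mbytes_per_sec * MIB * 8 / 1_000_000


results = [
    ("pfSense as server, -P 4", 112),
    ("pfSense as client, -P 4", 219),
]
for label, rate in results:
    mbit = mbytes_to_mbits(rate)
    print(f"{label}: {rate} MBytes/s = {mbit:.0f} Mbit/s "
          f"({mbit / 1000:.2f} x 1 GbE)")
```

Read this way, 112 MBytes/s is about 940 Mbit/s, which is close to what one gigabit link can carry, while 219 MBytes/s is about 1.84 Gbit/s, more than a single gigabit link can carry. The second figure would therefore suggest that the four streams were spread over more than one LAGG member in that direction.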

    With pfsense acting as server

    [2.3-RELEASE][root@pfSense.localdomain]/root: iperf -s -P 0 -i 1 -p 5001 -f M -B 192.168.10.1
    ------------------------------------------------------------
    Server listening on TCP port 5001
    Binding to local address 192.168.10.1
    TCP window size: 0.06 MByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.10.1 port 5001 connected with 192.168.10.66 port 35998
    [  5] local 192.168.10.1 port 5001 connected with 192.168.10.66 port 35999
    [  7] local 192.168.10.1 port 5001 connected with 192.168.10.66 port 36001
    [  6] local 192.168.10.1 port 5001 connected with 192.168.10.66 port 36000
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0- 1.0 sec  26.6 MBytes  26.6 MBytes/sec
    [  5]  0.0- 1.0 sec  30.6 MBytes  30.6 MBytes/sec
    [  7]  0.0- 1.0 sec  26.5 MBytes  26.5 MBytes/sec
    [  6]  0.0- 1.0 sec  28.4 MBytes  28.4 MBytes/sec
    [SUM]  0.0- 1.0 sec   112 MBytes   112 MBytes/sec
    [  4]  1.0- 2.0 sec  26.8 MBytes  26.8 MBytes/sec
    [  5]  1.0- 2.0 sec  30.6 MBytes  30.6 MBytes/sec
    [  6]  1.0- 2.0 sec  29.1 MBytes  29.1 MBytes/sec
    [  7]  1.0- 2.0 sec  25.6 MBytes  25.6 MBytes/sec
    [SUM]  1.0- 2.0 sec   112 MBytes   112 MBytes/sec
    [  4]  2.0- 3.0 sec  27.6 MBytes  27.6 MBytes/sec
    [  5]  2.0- 3.0 sec  29.8 MBytes  29.8 MBytes/sec
    [  7]  2.0- 3.0 sec  25.9 MBytes  25.9 MBytes/sec
    [  6]  2.0- 3.0 sec  28.7 MBytes  28.7 MBytes/sec
    [SUM]  2.0- 3.0 sec   112 MBytes   112 MBytes/sec
    [  4]  3.0- 4.0 sec  26.7 MBytes  26.7 MBytes/sec
    [  5]  3.0- 4.0 sec  29.1 MBytes  29.1 MBytes/sec
    [  7]  3.0- 4.0 sec  28.3 MBytes  28.3 MBytes/sec
    [  6]  3.0- 4.0 sec  28.1 MBytes  28.1 MBytes/sec
    [SUM]  3.0- 4.0 sec   112 MBytes   112 MBytes/sec
    [  4]  4.0- 5.0 sec  26.6 MBytes  26.6 MBytes/sec
    [  5]  4.0- 5.0 sec  29.4 MBytes  29.4 MBytes/sec
    [  7]  4.0- 5.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  6]  4.0- 5.0 sec  27.9 MBytes  27.9 MBytes/sec
    [SUM]  4.0- 5.0 sec   112 MBytes   112 MBytes/sec
    [  4]  5.0- 6.0 sec  27.0 MBytes  27.0 MBytes/sec
    [  5]  5.0- 6.0 sec  28.4 MBytes  28.4 MBytes/sec
    [  7]  5.0- 6.0 sec  29.0 MBytes  29.0 MBytes/sec
    [  6]  5.0- 6.0 sec  27.7 MBytes  27.7 MBytes/sec
    [SUM]  5.0- 6.0 sec   112 MBytes   112 MBytes/sec
    [  4]  6.0- 7.0 sec  28.4 MBytes  28.4 MBytes/sec
    [  5]  6.0- 7.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  7]  6.0- 7.0 sec  27.8 MBytes  27.8 MBytes/sec
    [  6]  6.0- 7.0 sec  27.7 MBytes  27.7 MBytes/sec
    [SUM]  6.0- 7.0 sec   112 MBytes   112 MBytes/sec
    [  4]  7.0- 8.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  5]  7.0- 8.0 sec  27.3 MBytes  27.3 MBytes/sec
    [  7]  7.0- 8.0 sec  28.3 MBytes  28.3 MBytes/sec
    [  6]  7.0- 8.0 sec  28.3 MBytes  28.3 MBytes/sec
    [SUM]  7.0- 8.0 sec   112 MBytes   112 MBytes/sec
    [  4]  8.0- 9.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  5]  8.0- 9.0 sec  28.3 MBytes  28.3 MBytes/sec
    [  7]  8.0- 9.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  6]  8.0- 9.0 sec  27.6 MBytes  27.6 MBytes/sec
    [SUM]  8.0- 9.0 sec   112 MBytes   112 MBytes/sec
    [  4]  9.0-10.0 sec  28.0 MBytes  28.0 MBytes/sec
    [  7]  9.0-10.0 sec  28.4 MBytes  28.4 MBytes/sec
    [  6]  9.0-10.0 sec  28.2 MBytes  28.2 MBytes/sec
    [  5]  9.0-10.0 sec  27.4 MBytes  27.4 MBytes/sec
    [SUM]  9.0-10.0 sec   112 MBytes   112 MBytes/sec
    [  4]  0.0-10.1 sec   276 MBytes  27.4 MBytes/sec
    [  5]  0.0-10.1 sec   291 MBytes  28.9 MBytes/sec
    [  7]  0.0-10.1 sec   278 MBytes  27.6 MBytes/sec
    [  6]  0.0-10.1 sec   284 MBytes  28.2 MBytes/sec
    [SUM]  0.0-10.1 sec  1129 MBytes   112 MBytes/sec
    
    XXXXXXXX@yyyyYYYyyYY:~$ iperf -c 192.168.10.1 -P 4 -i 1 -p 5001 -f M -t 10
    ------------------------------------------------------------
    Client connecting to 192.168.10.1, TCP port 5001
    TCP window size: 0.08 MByte (default)
    ------------------------------------------------------------
    [  5] local 192.168.10.66 port 36000 connected with 192.168.10.1 port 5001
    [  3] local 192.168.10.66 port 35998 connected with 192.168.10.1 port 5001
    [  4] local 192.168.10.66 port 35999 connected with 192.168.10.1 port 5001
    [  6] local 192.168.10.66 port 36001 connected with 192.168.10.1 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0- 1.0 sec  32.1 MBytes  32.1 MBytes/sec
    [  6]  0.0- 1.0 sec  27.6 MBytes  27.6 MBytes/sec
    [  5]  0.0- 1.0 sec  30.1 MBytes  30.1 MBytes/sec
    [  3]  0.0- 1.0 sec  28.0 MBytes  28.0 MBytes/sec
    [SUM]  0.0- 1.0 sec   118 MBytes   118 MBytes/sec
    [  5]  1.0- 2.0 sec  29.0 MBytes  29.0 MBytes/sec
    [  3]  1.0- 2.0 sec  27.1 MBytes  27.1 MBytes/sec
    [  4]  1.0- 2.0 sec  31.0 MBytes  31.0 MBytes/sec
    [  6]  1.0- 2.0 sec  26.1 MBytes  26.1 MBytes/sec
    [SUM]  1.0- 2.0 sec   113 MBytes   113 MBytes/sec
    [  3]  2.0- 3.0 sec  27.1 MBytes  27.1 MBytes/sec
    [  5]  2.0- 3.0 sec  28.8 MBytes  28.8 MBytes/sec
    [  4]  2.0- 3.0 sec  29.8 MBytes  29.8 MBytes/sec
    [  6]  2.0- 3.0 sec  26.0 MBytes  26.0 MBytes/sec
    [SUM]  2.0- 3.0 sec   112 MBytes   112 MBytes/sec
    [  5]  3.0- 4.0 sec  27.9 MBytes  27.9 MBytes/sec
    [  4]  3.0- 4.0 sec  28.8 MBytes  28.8 MBytes/sec
    [  3]  3.0- 4.0 sec  27.1 MBytes  27.1 MBytes/sec
    [  6]  3.0- 4.0 sec  28.6 MBytes  28.6 MBytes/sec
    [SUM]  3.0- 4.0 sec   112 MBytes   112 MBytes/sec
    [  6]  4.0- 5.0 sec  27.5 MBytes  27.5 MBytes/sec
    [  3]  4.0- 5.0 sec  26.5 MBytes  26.5 MBytes/sec
    [  4]  4.0- 5.0 sec  29.8 MBytes  29.8 MBytes/sec
    [  5]  4.0- 5.0 sec  28.2 MBytes  28.2 MBytes/sec
    [SUM]  4.0- 5.0 sec   112 MBytes   112 MBytes/sec
    [  3]  5.0- 6.0 sec  26.9 MBytes  26.9 MBytes/sec
    [  6]  5.0- 6.0 sec  29.0 MBytes  29.0 MBytes/sec
    [  5]  5.0- 6.0 sec  27.5 MBytes  27.5 MBytes/sec
    [  4]  5.0- 6.0 sec  28.9 MBytes  28.9 MBytes/sec
    [SUM]  5.0- 6.0 sec   112 MBytes   112 MBytes/sec
    [  5]  6.0- 7.0 sec  27.6 MBytes  27.6 MBytes/sec
    [  3]  6.0- 7.0 sec  28.6 MBytes  28.6 MBytes/sec
    [  4]  6.0- 7.0 sec  27.5 MBytes  27.5 MBytes/sec
    [  6]  6.0- 7.0 sec  27.8 MBytes  27.8 MBytes/sec
    [SUM]  6.0- 7.0 sec   112 MBytes   112 MBytes/sec
    [  5]  7.0- 8.0 sec  28.2 MBytes  28.2 MBytes/sec
    [  3]  7.0- 8.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  4]  7.0- 8.0 sec  27.4 MBytes  27.4 MBytes/sec
    [  6]  7.0- 8.0 sec  28.4 MBytes  28.4 MBytes/sec
    [SUM]  7.0- 8.0 sec   112 MBytes   112 MBytes/sec
    [  5]  8.0- 9.0 sec  27.6 MBytes  27.6 MBytes/sec
    [  3]  8.0- 9.0 sec  28.1 MBytes  28.1 MBytes/sec
    [  4]  8.0- 9.0 sec  28.2 MBytes  28.2 MBytes/sec
    [  6]  8.0- 9.0 sec  28.1 MBytes  28.1 MBytes/sec
    [SUM]  8.0- 9.0 sec   112 MBytes   112 MBytes/sec
    [  3]  9.0-10.0 sec  27.9 MBytes  27.9 MBytes/sec
    [  3]  0.0-10.0 sec   276 MBytes  27.6 MBytes/sec
    [  5]  9.0-10.0 sec  28.8 MBytes  28.8 MBytes/sec
    [  5]  0.0-10.0 sec   284 MBytes  28.3 MBytes/sec
    [  4]  9.0-10.0 sec  27.9 MBytes  27.9 MBytes/sec
    [  4]  0.0-10.0 sec   291 MBytes  29.1 MBytes/sec
    [  6]  9.0-10.0 sec  29.0 MBytes  29.0 MBytes/sec
    [SUM]  9.0-10.0 sec   114 MBytes   114 MBytes/sec
    [  6]  0.0-10.0 sec   278 MBytes  27.8 MBytes/sec
    [SUM]  0.0-10.0 sec  1129 MBytes   113 MBytes/sec
    
    

    With pfsense acting as client

    [2.3-RELEASE][root@pfSense.localdomain]/root: iperf -c 192.168.10.66 -P 4 -i 1 -p 5001 -f M -t 10
    ------------------------------------------------------------
    Client connecting to 192.168.10.66, TCP port 5001
    TCP window size: 0.06 MByte (default)
    ------------------------------------------------------------
    [  6] local 192.168.10.1 port 15069 connected with 192.168.10.66 port 5001
    [  5] local 192.168.10.1 port 4301 connected with 192.168.10.66 port 5001
    [  4] local 192.168.10.1 port 10417 connected with 192.168.10.66 port 5001
    [  3] local 192.168.10.1 port 46991 connected with 192.168.10.66 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  6]  0.0- 1.0 sec  56.9 MBytes  56.9 MBytes/sec
    [  5]  0.0- 1.0 sec  56.9 MBytes  56.9 MBytes/sec
    [  4]  0.0- 1.0 sec  74.4 MBytes  74.4 MBytes/sec
    [  3]  0.0- 1.0 sec  31.2 MBytes  31.2 MBytes/sec
    [SUM]  0.0- 1.0 sec   219 MBytes   219 MBytes/sec
    [  6]  1.0- 2.0 sec  56.4 MBytes  56.4 MBytes/sec
    [  5]  1.0- 2.0 sec  56.4 MBytes  56.4 MBytes/sec
    [  4]  1.0- 2.0 sec  71.5 MBytes  71.5 MBytes/sec
    [  3]  1.0- 2.0 sec  35.2 MBytes  35.2 MBytes/sec
    [SUM]  1.0- 2.0 sec   220 MBytes   220 MBytes/sec
    [  6]  2.0- 3.0 sec  56.4 MBytes  56.4 MBytes/sec
    [  5]  2.0- 3.0 sec  56.4 MBytes  56.4 MBytes/sec
    [  4]  2.0- 3.0 sec  80.6 MBytes  80.6 MBytes/sec
    [  3]  2.0- 3.0 sec  26.4 MBytes  26.4 MBytes/sec
    [SUM]  2.0- 3.0 sec   220 MBytes   220 MBytes/sec
    [  6]  3.0- 4.0 sec  56.4 MBytes  56.4 MBytes/sec
    [  5]  3.0- 4.0 sec  56.4 MBytes  56.4 MBytes/sec
    [  4]  3.0- 4.0 sec  82.1 MBytes  82.1 MBytes/sec
    [  3]  3.0- 4.0 sec  24.6 MBytes  24.6 MBytes/sec
    [SUM]  3.0- 4.0 sec   220 MBytes   220 MBytes/sec
    [  6]  4.0- 5.0 sec  56.2 MBytes  56.2 MBytes/sec
    [  5]  4.0- 5.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  4]  4.0- 5.0 sec  75.9 MBytes  75.9 MBytes/sec
    [  3]  4.0- 5.0 sec  30.6 MBytes  30.6 MBytes/sec
    [SUM]  4.0- 5.0 sec   219 MBytes   219 MBytes/sec
    [  6]  5.0- 6.0 sec  56.2 MBytes  56.2 MBytes/sec
    [  5]  5.0- 6.0 sec  56.2 MBytes  56.2 MBytes/sec
    [  4]  5.0- 6.0 sec  78.8 MBytes  78.8 MBytes/sec
    [  3]  5.0- 6.0 sec  28.1 MBytes  28.1 MBytes/sec
    [SUM]  5.0- 6.0 sec   219 MBytes   219 MBytes/sec
    [  6]  6.0- 7.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  5]  6.0- 7.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  4]  6.0- 7.0 sec  72.8 MBytes  72.8 MBytes/sec
    [  3]  6.0- 7.0 sec  34.2 MBytes  34.2 MBytes/sec
    [SUM]  6.0- 7.0 sec   219 MBytes   219 MBytes/sec
    [  6]  7.0- 8.0 sec  56.0 MBytes  56.0 MBytes/sec
    [  5]  7.0- 8.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  4]  7.0- 8.0 sec  79.4 MBytes  79.4 MBytes/sec
    [  3]  7.0- 8.0 sec  27.2 MBytes  27.2 MBytes/sec
    [SUM]  7.0- 8.0 sec   219 MBytes   219 MBytes/sec
    [  6]  8.0- 9.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  5]  8.0- 9.0 sec  56.0 MBytes  56.0 MBytes/sec
    [  4]  8.0- 9.0 sec  70.8 MBytes  70.8 MBytes/sec
    [  3]  8.0- 9.0 sec  36.2 MBytes  36.2 MBytes/sec
    [SUM]  8.0- 9.0 sec   219 MBytes   219 MBytes/sec
    [  4]  9.0-10.0 sec  72.4 MBytes  72.4 MBytes/sec
    [  6]  9.0-10.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  6]  0.0-10.0 sec   563 MBytes  56.3 MBytes/sec
    [  5]  9.0-10.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  5]  0.0-10.0 sec   563 MBytes  56.3 MBytes/sec
    [  4]  0.0-10.0 sec   759 MBytes  75.9 MBytes/sec
    [  3]  9.0-10.0 sec  34.4 MBytes  34.4 MBytes/sec
    [SUM]  9.0-10.0 sec   219 MBytes   219 MBytes/sec
    [  3]  0.0-10.0 sec   308 MBytes  30.8 MBytes/sec
    [SUM]  0.0-10.0 sec  2193 MBytes   219 MBytes/sec
    [2.3-RELEASE][root@pfSense.localdomain]/root:
    
    
    XXXXX@YYYYY:~$ iperf -s -P 0 -i 1 -p 5001 -f M
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 0.08 MByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.10.66 port 5001 connected with 192.168.10.1 port 46991
    [  5] local 192.168.10.66 port 5001 connected with 192.168.10.1 port 10417
    [  6] local 192.168.10.66 port 5001 connected with 192.168.10.1 port 15069
    [  7] local 192.168.10.66 port 5001 connected with 192.168.10.1 port 4301
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0- 1.0 sec  31.0 MBytes  31.0 MBytes/sec
    [  5]  0.0- 1.0 sec  74.1 MBytes  74.1 MBytes/sec
    [  6]  0.0- 1.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  0.0- 1.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  0.0- 1.0 sec   217 MBytes   217 MBytes/sec
    [  5]  1.0- 2.0 sec  71.4 MBytes  71.4 MBytes/sec
    [  6]  1.0- 2.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  1.0- 2.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  4]  1.0- 2.0 sec  35.2 MBytes  35.2 MBytes/sec
    [SUM]  1.0- 2.0 sec   219 MBytes   219 MBytes/sec
    [  4]  2.0- 3.0 sec  26.2 MBytes  26.2 MBytes/sec
    [  5]  2.0- 3.0 sec  80.6 MBytes  80.6 MBytes/sec
    [  6]  2.0- 3.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  2.0- 3.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  2.0- 3.0 sec   219 MBytes   219 MBytes/sec
    [  5]  3.0- 4.0 sec  82.0 MBytes  82.0 MBytes/sec
    [  6]  3.0- 4.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  3.0- 4.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  4]  3.0- 4.0 sec  24.8 MBytes  24.8 MBytes/sec
    [SUM]  3.0- 4.0 sec   219 MBytes   219 MBytes/sec
    [  4]  4.0- 5.0 sec  30.5 MBytes  30.5 MBytes/sec
    [  5]  4.0- 5.0 sec  75.9 MBytes  75.9 MBytes/sec
    [  6]  4.0- 5.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  4.0- 5.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  4.0- 5.0 sec   219 MBytes   219 MBytes/sec
    [  4]  5.0- 6.0 sec  28.2 MBytes  28.2 MBytes/sec
    [  5]  5.0- 6.0 sec  78.7 MBytes  78.7 MBytes/sec
    [  6]  5.0- 6.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  5.0- 6.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  5.0- 6.0 sec   219 MBytes   219 MBytes/sec
    [  4]  6.0- 7.0 sec  34.2 MBytes  34.2 MBytes/sec
    [  5]  6.0- 7.0 sec  72.8 MBytes  72.8 MBytes/sec
    [  6]  6.0- 7.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  6.0- 7.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  6.0- 7.0 sec   219 MBytes   219 MBytes/sec
    [  4]  7.0- 8.0 sec  27.2 MBytes  27.2 MBytes/sec
    [  5]  7.0- 8.0 sec  79.4 MBytes  79.4 MBytes/sec
    [  6]  7.0- 8.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  7.0- 8.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  7.0- 8.0 sec   219 MBytes   219 MBytes/sec
    [  4]  8.0- 9.0 sec  36.2 MBytes  36.2 MBytes/sec
    [  5]  8.0- 9.0 sec  70.7 MBytes  70.7 MBytes/sec
    [  6]  8.0- 9.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  8.0- 9.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  8.0- 9.0 sec   219 MBytes   219 MBytes/sec
    [  4]  9.0-10.0 sec  34.5 MBytes  34.5 MBytes/sec
    [  5]  9.0-10.0 sec  72.4 MBytes  72.4 MBytes/sec
    [  6]  9.0-10.0 sec  56.1 MBytes  56.1 MBytes/sec
    [  7]  9.0-10.0 sec  56.1 MBytes  56.1 MBytes/sec
    [SUM]  9.0-10.0 sec   219 MBytes   219 MBytes/sec
    [  4]  0.0-10.0 sec   308 MBytes  30.8 MBytes/sec
    [  5]  0.0-10.0 sec   759 MBytes  75.8 MBytes/sec
    [  7]  0.0-10.0 sec   563 MBytes  56.1 MBytes/sec
    [  6]  0.0-10.0 sec   563 MBytes  56.1 MBytes/sec
    [SUM]  0.0-10.0 sec  2193 MBytes   219 MBytes/sec
    
    


  • Well, try 8 streams then.

    It could be that the packet path is not optimized for traffic that terminates on the firewall itself, but rather for pushing data through from LAN to WAN.

    I assume that if pfsense is the server and you send data to it, all rules have to be inspected, and you hit on the last one (logically).
    Perhaps move that rule to the top as a test and see if it improves; then you know whether going through the rules has such an impact.

    With pfsense as the client it is pushing data out, so I assume no filters apply at all.

    And that 219 is the total of the 4 sessions that run in parallel.

    Thinking about it, the lagg on the pfsense side seems to work: if the total is above 125 MB/sec, it must be going over more than one link.

    If that does not work for a PC connected to your switch, the switch is not distributing the sessions over multiple links. That is the default for a single connected PC.

    You could test in parallel with 2 clients against the pfsense server.
    If the total of the 2 clients goes above 125 MB/sec, the traffic is going over more than 1 link.

    (You will never see the full 125 MB/sec in the real world over 1GE links.)
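
For reference, here is a rough sketch of why a single 1GE link tops out below 125 MB/sec for TCP payload. It assumes a 1500-byte MTU, IPv4, no VLAN tag, and that iperf's "MBytes" means 1024 * 1024 bytes; none of these are confirmed in the thread, so treat the numbers as an estimate.

```python
# Rough ceiling for TCP payload throughput on one gigabit Ethernet link.
# Assumptions (not from the thread): 1500-byte MTU, IPv4, no VLAN tag,
# and iperf "MBytes" = 1024 * 1024 bytes.

LINK_BITS_PER_SEC = 1_000_000_000
MTU = 1500
# Per-frame bytes on the wire that carry no IP data:
# preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12)
WIRE_OVERHEAD = 8 + 14 + 4 + 12
IP_HEADER = 20
TCP_HEADER = 20


def tcp_goodput_bytes_per_sec(tcp_option_bytes=0):
    """TCP payload bytes per second when the link is completely full."""
    payload = MTU - IP_HEADER - TCP_HEADER - tcp_option_bytes
    wire_bytes = MTU + WIRE_OVERHEAD
    return LINK_BITS_PER_SEC / 8 * payload / wire_bytes


def to_iperf_mbytes(bytes_per_sec):
    """Convert bytes/sec to the 1024-based MBytes/sec assumed for iperf."""
    return bytes_per_sec / (1024 * 1024)


raw = LINK_BITS_PER_SEC / 8
plain = tcp_goodput_bytes_per_sec()
with_timestamps = tcp_goodput_bytes_per_sec(tcp_option_bytes=12)

print(f"raw line rate        : {raw / 1e6:.1f} MB/sec")
print(f"TCP, no options      : {to_iperf_mbytes(plain):.1f} MBytes/sec")
print(f"TCP, with timestamps : {to_iperf_mbytes(with_timestamps):.1f} MBytes/sec")
```

Under these assumptions the ceiling comes out at roughly 112 to 113 MBytes/sec in iperf's units, so a [SUM] of about 113 MBytes/sec is consistent with one fully saturated gigabit link.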



  • Okay. Here are the results. What do they mean?

    8 streams w/ pfsense as server  [SUM]  0.0-10.0 sec  1133 MBytes  113 MBytes/sec
    12 streams w/ pfsense as server  [SUM]  0.0-10.1 sec  1138 MBytes  113 MBytes/sec
    32 streams w/ pfsense as server [SUM]  0.0-10.2 sec  1151 MBytes  113 MBytes/sec
    64 streams w/ pfsense as server  [SUM]  0.0-10.3 sec  1163 MBytes  113 MBytes/sec

    8 streams w/ pfsense as client  [SUM]  0.0-10.0 sec  2160 MBytes  216 MBytes/sec
    12 streams w/ pfsense as client  [SUM]  0.0-10.0 sec  2848 MBytes  284 MBytes/sec
    32 streams w/ pfsense as client  [SUM]  0.0-10.3 sec  3249 MBytes  314 MBytes/sec
    64 streams w/ pfsense as client  [SUM]  0.0-10.5 sec  3269 MBytes  312 MBytes/sec



  • When I test pfsense acting as server, and two clients sending data at the same time using this:

    iperf -c 192.168.10.1 -P 64 -i 1 -p 5001 -f M -t 10
    

    Then the result is that both clients each have the following result:

    Client #1
    [SUM]  0.0-10.2 sec  1153 MBytes  113 MBytes/sec

    Client #2
    [SUM]  0.0-10.3 sec  1162 MBytes  113 MBytes/sec

    When I run the following, they each only get 56.9 MB/s

    iperf -c 192.168.10.1 -P 1 -i 1 -p 5001 -f M -t 10
    

    That tells me that LACP is working because I could saturate the line with -P 64

    Why does a single -P stream not saturate the line?



  • It seems to me the lacp is working on the pfsense side.

    I do not know Juniper, but with a Cisco you can only do LACP if the physical member interfaces have the same configuration: port speed, duplex, MDI-X, etc.
    Perhaps you can check that?

    Read this one: http://www.juniper.net/techpubs/en_US/junos15.1/topics/concept/interfaces-hashing-lag-ecmp-understanding.html

    Standard hashing is on payload, it seems. All the iperf packets might have the same payload, so they end up on 1 of the members of the link.
    Change to layer 2 info; your clients have different MAC addresses.
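
To illustrate the general idea of hash-based link selection in a LAG, here is a minimal sketch. The hash (CRC32 over a few header fields) and the `pick_link` helper are made up for illustration; they are not the algorithm the EX2200 or FreeBSD's lagg actually uses. The point is only that a fixed set of header fields always maps to the same member link.

```python
import zlib


def pick_link(flow_fields, n_links=4):
    """Map a tuple of header fields to one member link.

    Hypothetical hash for illustration only, not a vendor algorithm.
    """
    key = "|".join(str(field) for field in flow_fields).encode()
    return zlib.crc32(key) % n_links


# One iperf stream keeps the same addresses and ports for its whole life,
# so every packet of that stream lands on the same member link.
one_stream = ("192.168.10.66", "192.168.10.1", 46991, 5001)
links_for_one_stream = {pick_link(one_stream) for _ in range(1000)}

# Many streams differ in source port, so they can spread over several links.
many_streams = [("192.168.10.66", "192.168.10.1", port, 5001)
                for port in range(40000, 40064)]
links_for_many_streams = sorted({pick_link(flow) for flow in many_streams})

print("links used by one stream  :", sorted(links_for_one_stream))
print("links used by 64 streams  :", links_for_many_streams)
```

With this kind of scheme a single flow can never exceed the speed of one member link, no matter how many links are in the bundle.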



  • I see more info on hashing here:  https://forums.juniper.net/t5/Ethernet-Switching/EX2200-LACP-hashing-algorithm/td-p/107844

    I don't see any issue with LACP. I'm expecting it to only do one gigabit links for four separate clients. I think it's strange that I need more than one stream of iperf to saturate the line. In my tests on the management interface, I plugged a linux machine directly into the management port. No switch involved.

    Any idea why iperf needs -P 32 or 32 Streams to saturate the line?

    Included information about my LACP link below:

    root> show interfaces ae0
    Physical interface: ae0, Enabled, Physical link is Up
      Interface index: 128, SNMP ifIndex: 599
      Description: pfsense
      Link-level type: Ethernet, MTU: 1514, Speed: 4Gbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled,
      Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth needed: 0
      Device flags   : Present Running
      Interface flags: SNMP-Traps Internal: XXXXXXX
      Current address: XXXXX, Hardware address: XXXXX
      Last flapped   : 2016-04-14 17:32:53 CDT (20:17:55 ago)
      Input rate     : 113307208 bps (13345 pps)
      Output rate    : 113834880 bps (13366 pps)
    
      Logical interface ae0.0 (Index 65) (SNMP ifIndex 603)
        Flags: SNMP-Traps 0xc0004000 Encapsulation: ENET2
        Statistics        Packets        pps         Bytes          bps
        Bundle:
            Input :         14611          0        916446            0
            Output:       2293348          0     245371565            0
        Adaptive Statistics:
            Adaptive Adjusts:          0
            Adaptive Scans  :          0
            Adaptive Updates:          0
        Protocol eth-switch
          Flags: Is-Primary, Trunk-Mode
    
    


  • What OS are the client PCs running?
    The iperf client build for that OS version might be the issue.

    My test from a PC with Windows 7 to a FreeBSD 10.1 server gave 350 MB/sec with one session in iperf.
    It also depends on buffer size, packet size tested, etc., I think.

    This was with a 10GE link over Intel cards.
    The Wintel combination did not want to go faster than that, it seems.

    FreeBSD to FreeBSD was simply close to line rate with one session.
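
On the buffer size point: one possible factor (not confirmed anywhere in this thread) is that a single TCP stream cannot move more than one window of data per round trip. A small sketch of that bound follows, using the roughly 64 KiB default window that iperf reports above and some purely hypothetical round-trip times.

```python
def window_limited_mbytes_per_sec(window_bytes, rtt_seconds):
    """Upper bound for one TCP stream: at most one window per round trip.

    Result is in 1024-based MBytes/sec, the unit assumed for iperf -f M.
    """
    return window_bytes / rtt_seconds / (1024 * 1024)


WINDOW = 64 * 1024  # roughly the "0.06 MByte (default)" shown by iperf

# Hypothetical RTTs in milliseconds; the thread does not report any.
for rtt_ms in (0.25, 0.5, 1.0, 2.0):
    bound = window_limited_mbytes_per_sec(WINDOW, rtt_ms / 1000)
    print(f"RTT {rtt_ms:4.2f} ms -> at most {bound:6.1f} MBytes/sec per stream")
```

If the effective round-trip time under load were around 1 ms, a 64 KiB window would cap one stream near 62.5 MBytes/sec, which is why trying a larger window (iperf's `-w` option) is a cheap test. Whether that is what is happening here would need to be measured.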



  • I don't see any issue with LACP. I'm expecting it to only do one gigabit links for four separate clients.

    Normally, 4 single lines are aggregated into one fat pipe, which in numbers is 4x (400%) of a single
    line; in this example it then shows up as 4 GBit/s aggregated.

    I think it's strange that I need more than one stream of iperf to saturate the line.

    How much you will need to saturate one single line?

    In my tests on the management interface, I plugged a linux machine directly into the management port. No switch involved.

    And no LAG, VLAN and QoS over all or?

    Any idea why iperf needs -P 32 or 32 Streams to saturate the line?

    Each line has its speed limit, but the result mostly also depends on other circumstances.

    Link-level type: Ethernet, MTU: 1514, Speed: 4Gbps, BPDU Error: None, MAC-REWRITE Error:
    

    1.- What is the MTU size on all devices in that test?
    2.- What does you configure the LAG?
    – (2 Lines sending and 2 lines receiving or 4 lines sending and receiving)
    -- (active / active all lines are in usage or active passive one line is in usage and the rest is as spare for failover)

    Normally you should not have to go through this with your setup.
    In my eyes, you can do the following things.
    1.- Set up a static (manual) LAG using the round-robin method, with 2 lines for sending and 2 lines
    for receiving, in active / active mode.
    2.- Use your Layer3 switch to route between the VLANs inside the switch only. That will be
    closer to wire speed, and the capacity freed up on the pfSense box can perhaps be used for
    other things, or kept as a silent reserve.



  • How much you will need to saturate one single line?

    It looks like "-P 2" will saturate the line, but "-P 1" will not.

    In my tests on the management interface, I plugged a linux machine directly into the management port. No switch involved.

    And no LAG, VLAN and QoS over all or?

    Correct. The management port does not have any LAGG, VLAN or any other tags. Just one computer plugged directly into the pfsense machine.

    1.- What is the MTU size on all devices in that test?
    2.- What does you configure the LAG?
    – (2 Lines sending and 2 lines receiving or 4 lines sending and receiving)
    -- (active / active all lines are in usage or active passive one line is in usage and the rest is as spare for failover)

    To answer #1
    MTU on Juniper switch is 1514.
    MTU on linux clients are 1500.
    MTU on pfsense LAGG is 1500.
    MTU on pfsense igb0 / igb1 / igb2 / igb3 are each 1500
    Detailed ifconfig is below

    To answer #2
    LAGG is configured as LACP over 4 lines. Each of the 4 lines both send and receive. If one line goes down, Juniper will ignore it and then use the remaining acceptable lines. Only one line is necessary to maintain satisfactory connection.

    ifconfig on pfsense:
    
    THIS IS THE LAGG
    lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
            ether XX:XX:XX:XX:XX:XX
            inet6 XXXXXXXXXXXXX%lagg0 prefixlen 64 scopeid 0xb
            inet 192.168.10.1 netmask 0xffffff00 broadcast 192.168.10.255
            inet 10.10.10.1 netmask 0xffffffff broadcast 10.10.10.1
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect
            status: active
            laggproto lacp lagghash l2,l3,l4
            laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

    THIS IS MANAGEMENT PORT
    em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWTSO>
            ether XXXXXXXXXXXX
            inet6 XXXXXXXXXXXX%em1 prefixlen 64 scopeid 0x2
            inet 192.168.5.1 netmask 0xffffff00 broadcast 192.168.5.255
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect
            status: no carrier

    THIS IS ONE OF THE PORTS INCLUDED IN THE LAGG
    igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
            ether XXXXXXXXXXXX
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    


  • PC1: Core i5-3470 3.2 GHz w/ 4GB of RAM, SAMSUNG SSD 830 EVO, OS Windows 10 Pro (10.0.10586)
    PC2: Core i5-2400 3.1 GHz w/ 8GB of RAM, SAMSUNG SSD 840 EVO, OS Windows 10 Pro (10.0.10586)
    A network share was configured on PC1. The test file:

    Spartacus Season 1 Episode 1 Past Transgressions.mkv:
    file size is 4,583,539 KB (about 4.3 GB)

    Both PCs are connected to an HP Procurve 2810-24G and I have 4 port LAGG (LACP) going back to a Brocade FastIron 648P. From the Brocade I have a single Gigabit port going to my PfSense Firewall which is using the built in Intel NIC on the motherboard as the LAN port. The LAN port is sub-interfaced with 5 virtual ports.
                 GbE                             x4 GbE (LAG)
    [PfSense]-----------[Brocade FastIron 648P]--------------[ProCurve]-----[PC1]
                                                                  |---------[PC2]
    PfSense is a core i5-3470 running at 3.2GHz with 4GB of RAM. My current version of PfSense is 2.3 Release 64bit. I have 10 Open VPN tunnels with not much traffic going across them at the moment, and my CPU usually is at 1% from what I can observe. At the time the test is being done the only traffic is YouTube from a Chromecast.

    Test 1:
    PC1 to PC2 on the same subnet
    Trial 1 took 41.01 sec to transfer the test file described above, which works out to 873.17 Mbps.

    Test 2:
    PC1 to PC2 on different subnets
    Trial 1 took 45.28 sec to transfer the test file described above, which works out to 790.83 Mbps.

    These are the fastest times for each test. I ran 3 trials for each test to try to get a more accurate idea of how your network might perform. I have more data that I hope to publish later today.
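
As a cross-check of the arithmetic, the reported figures can be reproduced from the file size and the transfer time. The sketch below assumes that "KB" means 1024 bytes (as Windows reports it) and that the quoted "Mbps" divides by 1024 * 1024 rather than 1,000,000; both are assumptions inferred from the numbers, not something the post states.

```python
FILE_SIZE_KB = 4_583_539  # test file size from the post


def throughput(size_kb, seconds):
    """Return (binary Mbit/s, decimal Mbit/s) for one file transfer.

    Assumes 1 KB = 1024 bytes.
    """
    bits = size_kb * 1024 * 8
    binary_mbit = bits / (1024 * 1024) / seconds   # 1 Mbit = 2**20 bits
    decimal_mbit = bits / 1_000_000 / seconds      # 1 Mbit = 10**6 bits
    return binary_mbit, decimal_mbit


for label, seconds in (("same subnet", 41.01), ("routed", 45.28)):
    binary_mbit, decimal_mbit = throughput(FILE_SIZE_KB, seconds)
    print(f"{label:12s}: {binary_mbit:7.2f} (1024-based)  "
          f"{decimal_mbit:7.2f} (1000-based) Mbit/s")
```

With those assumptions the 41.01 sec transfer gives 873.17 in 1024-based units, matching the post, and about 915.6 Mbit/s in the decimal units that link speeds are normally quoted in.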



  • Thanks. That's interesting. You're not maxing out either.



  • I would say that I'm pretty close and if you look on trial 1 I'm not routing at all and I'm still not getting line rate. I'm pretty sure that has to do with the VLAN tags and also the overhead with TCP.



  • I would say that I'm pretty close and if you look on trial 1 I'm not routing at all and I'm still not getting line rate.

    873 MBit/s + TCP overhead + VLAN TAG + QoS + all other running services that narrow down the
    entire throughput of your pfSense appliance.

    I'm pretty sure that has to do with the VLAN tags and also the overhead with TCP.

    Each OpenVPN tunnel takes one core of the CPU or SoC, and as far as I know all other packets also "eat"
    some CPU power. So what other packages and services are you running on that pfSense machine?



  • Final Results:

    Test 1 - No routing, both machines on the same subnet

              Time (Seconds)   Speed (Mbps)
    Pass 1    41.69            858.9325603
    Pass 2    80.43            445.2181827
    Pass 3    41.01            873.1747973

    Test 2 - PCs on different subnets, PfSense doing the routing across VLANs

              Time (Seconds)   Speed (Mbps)
    Pass 1    45.28            790.8325627
    Pass 2    45.68            783.907584
    Pass 3    55.7             642.8886614

    Test 3 - Cisco 2821 router inserted, handling the routing between the two subnets

              Time (Seconds)   Speed (Mbps)
    Pass 1    44.36            807.2339594
    Pass 2    44.12            811.6250779
    Pass 3    44.94            796.8157196

    Summary - For each test I dropped the highest and lowest pass, then compared what was left for test 2 and test 3 against test 1 (which is switching performance).

    Performance Hit
    Test 2 8.73%
    Test 3 6.02%
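
The performance-hit percentages can be reproduced from the pass results. Dropping the high and low of three passes leaves the median, so this sketch compares medians; the dictionary layout and the `performance_hit` helper are just for illustration.

```python
from statistics import median

# Speeds in Mbps as reported for each pass
results = {
    "Test 1": [858.9325603, 445.2181827, 873.1747973],  # switched, same subnet
    "Test 2": [790.8325627, 783.907584, 642.8886614],   # routed by PfSense
    "Test 3": [807.2339594, 811.6250779, 796.8157196],  # routed by Cisco 2821
}

# With three passes, removing the high and the low leaves the median.
baseline = median(results["Test 1"])


def performance_hit(test_name):
    """Percentage drop of a test's median speed relative to Test 1."""
    return (baseline - median(results[test_name])) / baseline * 100


for name in ("Test 2", "Test 3"):
    print(f"{name}: {performance_hit(name):.2f}% slower than switching")
```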

    Summary:

    Switching is faster than routing (duh!), but the ASICs in the Cisco router allow it to perform at nearly the same level as my PfSense firewall with its higher-end hardware. From the results here we can see that the Cisco router's routing performance is better by roughly 2.7 percentage points (a 6.02% hit versus 8.73%), which in my mind is well worth the trade-off for what PfSense gives me! I have done nothing in terms of optimizations, which could bring PfSense even closer to my Cisco router, and as others have stated, if I put in a NIC with custom silicon the gap may get even smaller. The purpose of this test was not to prove that one platform is better than another. I have always wanted to see charts with numbers for various hardware, so people can decide what is best for them.

    Lastly, the CPU in my PfSense firewall went from 1-2% load to 10-13% when routing across VLANs. At first that scared me, because a couple of routed streams going across VLANs could be a big hit, so I decided to add simultaneous transfers, and they did not bring the CPU above the 10-13% load (nice!).

