Another thread about low bandwidth with VMware ESXi



  • Hi,

    First of all, let me apologize for making another thread on the same subject. I've already read all the info I could gather through the forum and tried several things, and I can't seem to figure out where the issue is.

    I have a home ESXi lab. I'll try to explain everything and what I want to achieve with pfSense. But let's start with the simplest explanation and scenario:

    A VM on my network (VLAN20) connects to the internet at full speed (40 Mbps up / 40 Mbps down). The same VM behind pfSense (VLAN40) has only 5 Mbps upload.

    What I tried:

    Swapping vNICs: E1000 and VMXNET3
    Swapping pNICs: I have Intel PRO/1000 server adapters and Realtek 1 Gbps adapters
    Modify Hardware Offloading: Checksum, segmentation, large receive
    Open VM Tools: Installed and uninstalled
    Installing a new pfSense VM


    My hardware:
    Ubiquiti EdgeRouter Lite: well… my edge router. v1.7.0. In charge of inter-VLAN routing. Connected to WAN.
    TP Link TL-SG2424: my access switch, where the VLANs and trunks are defined.
    4 ESXi 6.5: 2 clusters of 2. Xeons E5 2670 with 64GB RAM and Xeons X3363 with 16GB RAM. Everything on 6.5. 4 pNICs each, 2 for iSCSI.
    Not hardware but every VM is behind a Distributed Virtual Switch.

    Logical Network:
    VLAN10: All of my Servers. They don't have access to Internet.
    VLAN20: My desktops. Full access.
    VLAN11/12: iSCSI VLANs.
    VLAN40: pfSense VLAN for LAN: Guests VMs (VDIs), Guest WiFi. Everything that I want to isolate from my main Network has to talk to pfSense first.
    VLAN50: pfSense WAN network -> connects to router (which is another firewall). So, LAN and WAN on pfSense are actually both LAN.

    What I want to achieve is pfSense managing my guest network AND acting as a front end for my servers (reverse proxy) and as a proxy for my Update Managers and the servers that need to contact the internet (so inbound and outbound traffic for VLAN10 passes through pfSense).

    It's working. I have Squid reverse proxy, I can connect to my VMware Horizon Lab, to my NextCloud, webmail, etc. The edgerouter points to pfsense, pfsense works as a frontend, and talks to VLAN10 backends. Everything works! Problem? When I download anything I get ~150KB/s max.

    I tried several things, convinced that the problem was my lack of routing skills... to no avail. Disabled firewalls in between. Nothing. Then I thought... let's make it simple. I connected a VM behind a newly installed pfSense, which is doing nothing but giving internet access on VLAN40, and that VM gets 5 Mbps upload (on a 40/40 Mbps connection).

    Conclusion: there is an issue, and I can replicate it on a smaller configuration. So I'm open to suggestions...



  • It's not a limitation with ESXi. I've got a pfSense VM with stock em interfaces that does over 500 Mb both ways.
    Mine is a simple setup- ESXi is trunked to a switch, one vlan is the LAN, another is WAN- coming in off the switch to the provider. No proxies or other routers in the mix- I'd guess your problem lies there. For reference, VM has 1 CPU, 4 GB, x64. ESXi is 6.0 running on server grade hardware.



  • Inter-VLAN speed is at link speed, 1 Gbps up and down. The Ubiquiti router is quite capable of routing ~1 Gbps full duplex.

    VLAN20 Server - VLAN10 Client TCP:

    [SUM]  0.00-10.00  sec  1.06 GBytes  909 Mbits/sec                  sender
    [SUM]  0.00-10.00  sec  1.06 GBytes  909 Mbits/sec                  receiver

    pfSense Server - VLAN20 Client:

    Server listening on TCP port 5001
    TCP window size: 63.7 KByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.40.3 port 5001 connected with 192.168.20.2 port 56582
    [ ID] Interval      Transfer    Bandwidth
    [  4]  0.0-10.0 sec  920 MBytes  772 Mbits/sec

    VLAN20 Server - pfSense Client:

    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size:  208 KByte (default)

    [  4] local 192.168.20.2 port 5001 connected with 192.168.40.3 port 36320
    [ ID] Interval      Transfer    Bandwidth
    [  4]  0.0-10.9 sec  2.00 MBytes  1.54 Mbits/sec

    So, when pfSense sends data it crawls. This is newly installed.

    I have white boxes… not server grade. But I don't think it's a hardware issue. pfSense has 2C/4GB. Tried 1C. Zero usage.

    Trace:

    pathping pfsense01t

    Tracing route to pfsense01t.lab.tst [192.168.40.3]
    over a maximum of 30 hops:
      0  Haswell.lab.tst [192.168.20.2]
      1  edge01.lab.tst [192.168.20.1]
      2  pfsense01t.lab.tst [192.168.40.3]

    Trace back (from pfSense to the desktop):

    traceroute 192.168.20.2
    traceroute to 192.168.20.2 (192.168.20.2), 64 hops max, 40 byte packets
    1  192.168.40.2 (192.168.40.2)  0.387 ms  0.339 ms  0.337 ms
    2  haswell (192.168.20.2)  0.549 ms *  0.612 ms

    There's a dropped packet!  :o


    EDIT


    On the same ESXi Server

    VM on VLAN40 Client - pfSense VLAN40 Server

    ------------------------------------------------------------
    Client connecting to 192.168.40.3, TCP port 5001
    TCP window size:  208 KByte (default)

    [  3] local 192.168.40.100 port 65097 connected with 192.168.40.3 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [  3]  0.0-10.0 sec  2.34 GBytes  2.01 Gbits/sec

    pfSense VLAN40 Client - VM on VLAN40 Server

    ------------------------------------------------------------
    Client connecting to 192.168.40.100, TCP port 5001
    TCP window size: 65.0 KByte (default)

    [  3] local 192.168.40.3 port 15470 connected with 192.168.40.100 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [  3]  0.0-10.0 sec  5.73 GBytes  4.92 Gbits/sec


    Now the same but with different ESXi hosts. Now communications have to go to the NIC and the TPLink Switch.

    pfSense Server - VM VLAN40 Client:

    [  4] local 192.168.40.3 port 5001 connected with 192.168.40.200 port 52995
    [  4]  0.0-10.0 sec  233 MBytes  195 Mbits/sec

    VM VLAN40 Server - pfSense Client:

    ------------------------------------------------------------
    Client connecting to 192.168.40.200, TCP port 5001
    TCP window size: 65.0 KByte (default)

    [  3] local 192.168.40.3 port 38347 connected with 192.168.40.200 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [  3]  0.0-10.0 sec  641 MBytes  537 Mbits/sec

    Conclusions (kinda):
    When pfSense actually has to use the physical NIC, bandwidth is affected. The EdgeRouter makes matters even worse. But VLAN to VLAN on my setup is working at 900 Mbps between VMs and desktops. The pfSense VM does not like something on my network! And its vNIC performance is sub-par compared with other guest OSes.



  • I'm not a "networking guy", this is beyond my skill set, please bear with me. But I like troubleshooting things, and this is proving to be… entertaining.

    At least this post serves as my captain's log  :P

    I fired up Wireshark, and started to capture packets listening on port 5001, to isolate the iperf traffic and to try and figure something out.

    First, I set the pfsense as server, and uploaded at 758mbps, to have an example of a "good" TCP conversation. I immediately noticed some duplicated ACKs here and there... But I can't tell if that's common or not.

    Ok, so then I proceeded to capture with pfSense uploading data to my Desktop set up as server... and it's a mess.

    I got TCP Dup ACKs all over, and TCP Fast Retransmission here and there. But something caught my eye: packet length. ALL packets are ~1514 bytes, but on the other capture I have 21954 bytes and even bigger!

    :o

    Now what?
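
    The two packet sizes are what you would expect if the sending side uses TCP segmentation offload: a capture taken on the sender sees the large buffer handed to the NIC before it is cut into MTU-sized frames, while a capture on the receiver sees the individual frames. That reading is an assumption, since the thread does not say where each capture was taken. As a rough check, assuming 14 bytes of Ethernet header, 20 of IPv4 and 20 of TCP with no options (the function and variable names below are made up for illustration), the 21954-byte capture works out to a whole number of full-size segments:

```shell
#!/bin/sh
# Assumed header sizes: 14 Ethernet + 20 IPv4 + 20 TCP (no TCP options).
HEADER_BYTES=54
# Payload per wire frame with a 1500-byte MTU: 1500 - 20 - 20.
MSS=1460

# Print how many wire frames a captured "packet" of the given size represents.
wire_segments() {
    payload=$(( $1 - HEADER_BYTES ))
    echo $(( (payload + MSS - 1) / MSS ))
}

wire_segments 21954   # the large packet seen in the fast capture
wire_segments 1514    # the packet size seen in the slow capture
```

    Under these assumptions the large packet is 15 ordinary frames bundled together and the 1514-byte packet is a single frame, so the size difference by itself does not show that anything is being fragmented.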



  • I am in a similar situation running pfSense 2.3.4-RELEASE in VMware. I have narrowed my connection speed limitation down to the pfSense VM. iperf from other VMs to physical hosts and the reverse all get 900+ Mbps throughput both ways on the same VLAN ID. iperf from a physical machine to the local VLAN IP of the pfSense VM gets 900+ Mbps throughput. Once pfSense has to route to a different VLAN I drop to ~500Mbps throughput. This is the same result whether I use the E1000 or VMXNET3 NIC for the pfSense VM. I'm running Dell server hardware, and the pfSense VM is quad core with 4 GB of RAM, with no Snort or other packages installed. I even destroyed the VM and rebuilt it, thinking there was corruption somewhere. I will do a packet capture and see if I get the same issue with ACK packets.


  • Rebel Alliance Global Moderator

    "Once pfSense has to route to a different VLAN I drop to ~500Mbps through put"

    Are you hairpinning those intervlan tests?  Why are you testing TO pfsense when you should be testing through pfsense..

    "pfSense Server - VM VLAN40 Client:"

    Sounds like pfSense is the endpoint in this iperf test. You're also using a small 65k window size.



  • For me, I was testing from a physical client to the closest endpoint, working my way out to the furthest endpoint, to see where my breakage was happening. I then did the same testing from a virtual machine on the same ESX host to see where the breakage was. At first I was not sure if it was the VMware network configuration, the switch configuration, or pfSense causing my problem. Doing this narrowed it straight down to pfSense being the problem child. I also did other testing from VM to VM on the ESX host.


  • Rebel Alliance Global Moderator

    And how are you connecting between VLANs? I take it when you say VLAN you have your VLAN tags sitting on one physical interface, so you're in and out of the same physical interface for traffic between the VLANs. That is a hairpin, and yes, you just cut the bandwidth in half…

    You need to be clear on how you're testing...

    Are you on something like the top connection with no hairpin, or more like the bottom with a hairpin when devices talk to each other? Without understanding how you have this all connected together, my guess is yes, you're doing a hairpin, and yes, that happens when you use VLAN tags and share the same equipment for traffic flowing between these VLANs.




  • I don't think we are having the same issue. We both see a penalty when pfSense has to route, but you get 500 Mbps, and my connection drops from 750 Mbps to 1.5 Mbps.

    I know the TCP window was lower (65 KBytes vs 208 KBytes). My Ubiquiti router can route at 1 Gbps with 65 KBytes, anyway. The question is: with a window size of 65 KBytes, as I understand it, shouldn't the packets be up to 65 KBytes in size?

    As I said before, I captured the packets:

    To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
    To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes.
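
    The TCP window limits how much unacknowledged data can be in flight, not how large each packet is; on the wire, packet size is bounded by the MTU. A window does cap throughput at roughly one window per round trip, so it is worth checking whether 65 KByte could explain the slowdown. A minimal sketch, assuming a round trip of about 0.6 ms (taken from the traceroute earlier in the thread) and iperf's 1024-byte KByte; `max_mbps` is a hypothetical helper name:

```shell
#!/bin/sh
# Upper bound on TCP throughput: at most one window in flight per round trip.
# Bits per microsecond equal Mbit/s, so integer arithmetic is enough here.
max_mbps() {
    window_bytes=$1
    rtt_us=$2
    echo $(( window_bytes * 8 / rtt_us ))
}

max_mbps 66560 600    # 65.0 KByte window, assumed 0.6 ms round trip
max_mbps 212992 600   # 208 KByte window, same assumed round trip
```

    Both ceilings come out in the hundreds of Mbit/s or more, far above 1.5 Mbit/s, so under these assumptions the smaller window alone does not account for the drop.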



  • Thanks for the follow-ups, by the way. Below is more on how my setup works; sorry it's a bit rough, I was throwing it together while drinking my morning coffee. I can go from Physical -> VM1 without a network performance drop. I can go from Physical -> pfSense local gateway without a performance drop. When I go from Physical -> pfSense VLAN 10 or VLAN 12 gateway address, I experience a performance drop. When I go from Physical -> VM2, I experience a performance drop. I can swap all this testing around by using VM1 as my source and get the same results. All my indicators point to pfSense limiting my bandwidth somehow. I have trashed my original VM and started from scratch with a vanilla install, with the same results. I have followed the network optimization thread, but that has not changed my results either.

    ESX Host Specs:
    Dual Xeon Quad Core 2.4Ghz
    16GB RAM
    8 x 10K SAS drives in Raid 10
    Dual onboard Broadcom Gigabit NICs.

    ![Example Drawing.jpg](/public/imported_attachments/1/Example Drawing.jpg)



  • Why are you using portchannel? Could you try disabling it? Route based on virtual port ID should be enough to load balance 2 1gbps pNICs. LACP/PortChannel/ChannelBonding/whatever only gives headaches…


    Packet segmentation happens on layer 3, am I right? Because in my environment, when pfSense talks to a client on its own VLAN (no routing), performance is okay-ish. But when pfSense has to route to another VLAN, performance drops like a brick to 1.5 Mbps.

    In my capture, it's evident that packets are being segmented. iperf running on pfSense sends 65 KB packets, but on the interface I only see 1514 bytes.

    Is it possible to completely disable packet segmentation? If it's needed, the Ubiquiti router should be in charge.

    Edit: several edits. I need my morning coffee.


  • Rebel Alliance Global Moderator

    "To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
    To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes."

    What part did you not understand about not testing to pfSense as the endpoint?  pfSense is NOT a file server, it's a router. If you want to test its routing/firewall/NAT performance, the test has to go THROUGH pfSense, not to it.

    As to your load sharing/port channel: it's kind of pointless unless you have LOTS of clients talking to lots of servers, because any single client talking to any single server is going to go through the same interface.  One thing I will agree with is that this sort of setup normally makes troubleshooting bandwidth issues far more complex.



  • For me this is a home test lab that I just mess around and learn with. I would normally agree about the port channel; what I didn't mention before is that I had already taken ESX down to a single trunked port connection. Sadly, this didn't change my throughput results either. I did test to the pfSense interfaces, but that was simply to test each hop in the chain. In my testing, performance always suffered once pfSense began to route the traffic.



  • @johnpoz:

    "To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
    To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes."

    What part did you not understand about not testing to pfSense as the endpoint?  pfSense is NOT a file server, it's a router. If you want to test its routing/firewall/NAT performance, the test has to go THROUGH pfSense, not to it.

    Ok. Care to explain the logic behind that statement? I'm not testing against a VM behind pfSense just to narrow down the issue. I don't see how testing against pfSense can be an issue at all.

    Do you REALLY need me to make that test and show you the exact same results?

    When pfSense has to ROUTE traffic, that's it… from one of my VLANs to another, packet size is 1514bytes and bandwidth is 1.5Mbps.
    When pfSense is at Layer 2, that is, on the same VLAN as the other VM, packet size is whatever the application wants it to be and bandwidth is 500+ Mbps.

    EDIT:

    Just to prove a point:


    Server listening on TCP port 5001
    TCP window size:  208 KByte (default)

    [  4] local 192.168.20.2 port 5001 connected with 192.168.20.217 port 49772
    [ ID] Interval      Transfer    Bandwidth
    [  4]  0.0-10.0 sec  1.09 GBytes  934 Mbits/sec
    [  4] local 192.168.20.2 port 5001 connected with 192.168.40.202 port 49818
    [  4]  0.0-11.9 sec  5.50 MBytes  3.89 Mbits/sec

    192.168.40.202 is a VM behind pfSense. Exact same issue as pfSense. When it has to SEND data to the server on VLAN20 it crawls at 3.89mbps

    BUT

    [  3] local 192.168.20.2 port 52252 connected with 192.168.40.202 port 5001
    [ ID] Interval      Transfer    Bandwidth
    [  3]  0.0-10.0 sec  945 MBytes  792 Mbits/sec

    it RECEIVES data at 792mbps

    Packet captured by Wireshark:

    From VLAN20 (my Desktop) to VLAN40 (pfSense): 800Mbps ~30KBytes per data packet
    From VLAN40 (pfSense) to VLAN20 (Desktop): 4Mbps 1514bytes per packet

    Clear and concise question:

    How can I completely disable packet segmentation on pfSense?
    Why is pfSense segmenting packets when my router is not?

    MTU on everything is 1500.



  • @agbiront:

    I don't see how testing against pfSense can be an issue at all.

    It's simply a poor test because there is no real-world counterpart.

    That said, I can somewhat corroborate this issue.  I too have pfSense running in ESXi 6.x at home and can verify that inter-VLAN throughput with pfSense as the router is very poor.  Much worse, in fact, than LAN to WAN throughput.  As an example, I've tested with two Debian VMs running on the same host as pfSense in separate VLANs.

    A typical iperf result with default settings looks like this:

    [  3] local 10.22.44.201 port 52033 connected with 10.22.11.121 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.1 sec  11.2 MBytes  9.35 Mbits/sec
    

    iperf from the same client to a physical machine on the same VLAN (and even on a different physical ethernet switch) shows what I'd expect from 1Gbps NICs and switches:

    [  3] local 10.22.44.201 port 39339 connected with 10.22.44.200 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
    

    and just as a sanity check, between 2 VMs on the ESXi host but in the same VLAN:

    [  3] local 10.22.44.201 port 49256 connected with 10.22.44.207 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec  6.32 GBytes  5.43 Gbits/sec
    

    finally, and this is just for reference, pfSense as the iperf client to an external box:

    [  3] local 10.22.44.88 port 27152 connected with 10.22.44.200 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.1 sec  23.2 MBytes  19.4 Mbits/sec
    

    The difference with pfSense as the client is the default TCP window size of 65.0 KByte rather than the 85 KByte that is the default on Debian.

    LAN to WAN throughput is fine; though I only have a 50Mbps connection at the moment.

    I thought traffic shaping might be playing a role, so I removed my shaper config and cleared the state table before running any tests.

    I also tested a SCP file transfer between Debian hosts on either side of pfSense, but I'm not sure how valid that is given that SCP incurs significant encryption overhead and none of these VMs is terribly fast in the CPU department.  I don't have any physical machines in my DMZ subnet, and my guest subnet is all wireless, so I'm unable to test a real-world workload between subnets with physical machines on either side, if that makes sense.

    EDIT:

    As a counterpoint, here's an iperf result between two CentOS VMs running in a completely different environment, with the same pfSense version (2.3.4-RELEASE) on ESXi 5.x.  The VMs are on different hosts, with the server VM on 10 year old hardware.  The iperf client VM and the pfSense VM are on the same host.

    [  3] local 192.168.66.15 port 49985 connected with 192.168.56.66 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec   561 MBytes   470 Mbits/sec
    
    

    All pfSense VMs in my tests are running with VMXNET NICs



  • @agbiront:

    MTU on everything is 1500.

    You saw this value under ifconfig or set it under gui?

    I was having big performance issues with Xen + BSD/pfSense. VLAN interfaces were really slow and, IIRC, it was sending a lot of ICMP rejects asking for fragmentation. In the end, it was an MTU problem on all the VLAN networking.

    I could work around it by setting the MTU to 1504 on the real interface and setting all VLANs to 1500.
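
    The numbers follow from the 4-byte 802.1Q tag: if the parent interface is held to 1500 and the driver cannot carry the slightly longer tagged frame, the VLAN interface is left with 1496, and raising the parent to 1504 gives the VLAN its full 1500 back. Whether a given driver behaves this way is an assumption here, and the helper names are hypothetical:

```shell
#!/bin/sh
# An 802.1Q tag adds 4 bytes to every Ethernet frame.
VLAN_TAG_BYTES=4

# MTU left for a VLAN when the parent cannot carry the extra tag bytes.
vlan_mtu_for_parent() {
    echo $(( $1 - VLAN_TAG_BYTES ))
}

# Parent MTU needed so the VLAN can keep the given MTU.
parent_mtu_for_vlan() {
    echo $(( $1 + VLAN_TAG_BYTES ))
}

vlan_mtu_for_parent 1500
parent_mtu_for_vlan 1500
```

    The first call gives the reduced VLAN MTU for a 1500-byte parent, the second the parent MTU needed to keep a 1500-byte VLAN.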



  • @marcelloc:

    I could work around it by setting the MTU to 1504 on the real interface and setting all VLANs to 1500.

    How did you accomplish this on the parent interface?  I've tried @ the CLI with 'ifconfig vmx1 mtu 1504' and get 'operation not permitted'

    The choice isn't available in the GUI that I can see because vmx1 doesn't have an interface assignment; it's just the parent interface for three VLANs.



  • @whosmatt:

    How did you accomplish this on the parent interface?  I've tried @ the CLI with 'ifconfig vmx1 mtu 1504' and get 'operation not permitted'

    On Xen it was accepted. Maybe the vmx driver is not fully implemented, or it needs the guest binaries?

    This is the code I'm running from cron.  xn0 and xn1 are the interfaces on which I'm configuring VLANs.

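    #!/bin/sh
    # Raise the parent interfaces to 1504 so a 4-byte 802.1Q tag fits.
    for a in xn0 xn1; do
        ifconfig "$a" | grep -q 'mtu 1500' &&
            ifconfig "$a" mtu 1504 up
    done

    # Any interface left at 1496 (the VLANs) is set back to 1500.
    ifconfig | sed 's/:/ /' | grep 'mtu 1496' |
        while read -r c d; do
            ifconfig "$c" mtu 1500 up
        done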
    
    


  • @marcelloc:


    Thanks.  vmx is well supported and the open-vm-tools package is installed.  I'm using the same drivers and tools in my work pfSense VMs and not experiencing the same problem.

    Just so we're clear, your pfSense VM in xen is doing the tagging?  The ifconfig you're running above is in pfSense and not the host OS?  My familiarity with xen is minimal.



  • @whosmatt:

    Just so we're clear, your pfSense VM in xen is doing the tagging?  The ifconfig you're running above is in pfSense and not the host OS?

    Yes, I'm running this script on pfSense  and configuring vlan on it too.

    IIRC, Xen limits guest VMs to 5 or 6 interfaces.



  • @marcelloc:

    I'm running this script on pfSense  and configuring vlan on it too.

    Ok, I'm a dummy.  Long ago, I set a very strong password for admin/root on pfSense and never use that account.  I created a separate admin account with a not-easily-guessed user name for day-to-day admin.  I used that account at the CLI to run ifconfig, which failed.  Long story short, admin access in the GUI != root at the CLI. Running as the actual root account lets me set the MTU.



  • Yes, it needs root.
    Check whether any pfSense routine sets it back to 1496. If so, add that script to cron, changing the parent interfaces to match your setup.



  • I don't know.. I've experimented with MTU as far as up to 9000 on the virtual NIC and the VMware switch and have seen throughput up to about 150Mbps but it's wildly inconsistent.

    I really got into this thread because I found the original post interesting and was able to replicate a similar issue in my personal (home) setup.  It is not, in fact, a real problem for me as I do so little inter-vlan routing on my home network that it doesn't affect me one way or the other.

    One thing I can verify is that my shaper config limits inter-vlan traffic; disabling the root queue on the DMZ interface allows much faster throughput during a sustained NFS -> local storage transfer on a DMZ machine from a NFS mount in LAN.  But that's for a different forum.