Another thread about low bandwidth with VMware ESXi
-
And how are you connecting between VLANs? I take it that when you say VLAN, you have your VLAN tags sitting on a physical interface, so traffic between the VLANs goes in and out of the same physical port. That is a hairpin, and yes, it cuts your bandwidth in half…
You need to be clear on how you're testing…
Are you set up like the top connection, with no hairpin, or more like the bottom one, where devices talking to each other hairpin through the router? Without knowing how you have this all connected together, my guess is that you are doing a hairpin, which is what happens when you use VLAN tags and share the same equipment for traffic flowing between those VLANs.
-
I don't think we are having the same issue. We both see a penalty when pfSense has to route, but you still get 500Mbps, while my connection drops from 750Mbps to 1.5Mbps.
I know the TCP window was lower (65 KBytes vs 208 KBytes), but my Ubiquiti router can route at 1Gbps with a 65 KByte window anyway. And as I understand it, with a window size of 65 KBytes the packets should be able to be up to 65 KBytes in size, shouldn't they?
As I said before, I captured the packets:
To pfSense from Desktop @ 750Mbps = packets bigger than 30 KBytes
To Desktop from pfSense @ 1.5Mbps = all packets are 1514 bytes.
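(For reference, this is roughly the kind of capture that shows it; a sketch only, the interface name and port are just examples from my setup:)

```sh
# print the iperf segments and their lengths as seen on the wire
# (em1.40 and port 5001 are placeholders; substitute your VLAN interface and port)
tcpdump -i em1.40 -nn -q 'tcp port 5001'
```
-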
Thanks for the follow-ups, btw. Below is more of how my setup works; sorry it's a bit rough, I was throwing it together while drinking my morning coffee.
I can go from Physical -> VM1 without a network performance drop.
I can go from Physical -> the pfSense local gateway without a performance drop.
When I go from Physical -> the pfSense vlan10 or vlan12 gateway address, I see the performance drop.
When I go from Physical -> VM2, I see the performance drop.
I can swap all of this testing around by using VM1 as my source and get the same results. All my indicators point to pfSense limiting my bandwidth somehow. I have trashed my original VM and started from scratch with a vanilla install, with the same results. I have followed the network optimization thread, but that has not changed my results either.
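(Each of those tests is just a plain iperf run between the two endpoints, roughly like this; the address is a placeholder for whichever target I'm hitting:)

```sh
# on the target (VM1, VM2, or a physical box)
iperf -s
# on the source, pointed at the target or at one of the pfSense gateway addresses
iperf -c 192.168.10.20 -t 10
```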
ESX Host Specs:
Dual quad-core Xeon @ 2.4GHz
16GB RAM
8 x 10K SAS drives in RAID 10
Dual onboard Broadcom Gigabit NICs
![Example Drawing.jpg](/public/imported_attachments/1/Example Drawing.jpg)
-
Why are you using a port channel? Could you try disabling it? Route based on originating virtual port ID should be enough to load-balance two 1Gbps pNICs. LACP/PortChannel/ChannelBonding/whatever only gives headaches…
Packet segmentation happens at layer 3, am I right? Because in my environment, when pfSense talks to a client on its own VLAN (no routing), performance is okay-ish. But when pfSense has to route to another VLAN, performance drops like a brick to 1.5Mbps.
In my capture it's evident that packets are being segmented: iperf running on pfSense sends 65KB packets, but on the interface I only see 1514 bytes.
Is it possible to completely disable packet segmentation? If segmentation is needed, the Ubiquiti router should be the one doing it.
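(If it is just segmentation offload doing this, something along these lines should show it and let me turn it off at runtime; these are the FreeBSD ifconfig flags as I understand them, and vmx1 is just an example parent interface:)

```sh
# look at the "options=" line for TSO4 / LRO on the VLAN parent interface
ifconfig vmx1
# temporarily turn both off on that interface to test
ifconfig vmx1 -tso -lro
```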
Edit: several edits. I need my morning coffee.
-
"To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes."What part did you not understand about not testing to pfsense as the endpoint? Pfsense is NOT a file server its a router - if you want to test the performance of it routing/firewall/natting/etc it will all be THROUGH pfsense not too it.
As to your load sharing/port channel - kind of utterly pointless unless you have LOTS of clients talking to lots of servers - any single cllient talking to any single server is going to go through the same interface. One thing I will agree with is yes that sort of setup normally makes it way more complex in troubleshooting bandwidth issues.
-
For me this is a home test lab that I just mess around with and learn on. While I would normally agree about the port channel, one thing I didn't mention before is that I had already taken ESX down to a single trunked port connection. Sadly, that didn't change my throughput results either. I did test to the pfSense interfaces, but that was simply to test each hop in the chain. In my testing, throughput always suffered once pfSense began to route the traffic.
-
"To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes."What part did you not understand about not testing to pfsense as the endpoint? Pfsense is NOT a file server its a router - if you want to test the performance of it routing/firewall/natting/etc it will all be THROUGH pfsense not too it.
Ok. Care to explain the logic behind that statement? I'm not testing against a VM behind pfSense just to narrow down the issue. I don't see how testing against pfSense can be an issue at all.
Do you REALLY need me to make that test and show you the exact same results?
When pfSense has to ROUTE traffic, that's it… from one of my VLANs to another, packet size is 1514bytes and bandwidth is 1.5Mbps.
When pfSense is at Layer 2, that's pfSense on the same VLAN as the other VM, packet size is whatever the application want's it to be and bandwidth is 500+mbpsEDIT:
Just to prove a point:
Server listening on TCP port 5001
TCP window size: 208 KByte (default)
[ 4] local 192.168.20.2 port 5001 connected with 192.168.20.217 port 49772
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.09 GBytes 934 Mbits/sec
[ 4] local 192.168.20.2 port 5001 connected with 192.168.40.202 port 49818
[ 4] 0.0-11.9 sec 5.50 MBytes 3.89 Mbits/sec
192.168.40.202 is a VM behind pfSense, and it shows exactly the same issue as pfSense itself: when it has to SEND data to the server on VLAN20 it crawls at 3.89Mbps,
BUT
[ 3] local 192.168.20.2 port 52252 connected with 192.168.40.202 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 945 MBytes 792 Mbits/sec
it RECEIVES data at 792Mbps.
Packets captured by Wireshark:
From VLAN20 (my Desktop) to VLAN40 (pfSense): 800Mbps, ~30 KBytes per data packet
From VLAN40 (pfSense) to VLAN20 (Desktop): 4Mbps, 1514 bytes per packet
Clear and concise questions:
How can I completely disable packet segmentation on pfSense?
Why is pfSense segmenting packets when my router is not?
MTU on everything is 1500.
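(For anyone searching later: the knobs I'm planning to try are the offload checkboxes under System > Advanced > Networking, "Disable hardware TCP segmentation offload" and "Disable hardware large receive offload", if I'm reading the docs right. After a reboot, something like this should confirm it from the shell; the sysctl name is from the FreeBSD docs and vmx1 is just my parent interface:)

```sh
# 0 here should mean TSO is disabled system-wide
sysctl net.inet.tcp.tso
# TSO4 / LRO should no longer appear on the "options=" line
ifconfig vmx1
```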
-
I don't see how testing against pfSense can be an issue at all.
It's simply a poor test because there is no real-world counterpart.
That said, I can somewhat corroborate this issue. I too have pfSense running in ESXi 6.x at home and can verify that inter-VLAN throughput with pfSense as the router is very poor. Much worse, in fact, than LAN to WAN throughput. As an example, I've tested with two Debian VMs running on the same host as pfSense in separate VLANs.
A typical iperf result with default settings looks like this:
[ 3] local 10.22.44.201 port 52033 connected with 10.22.11.121 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 11.2 MBytes 9.35 Mbits/sec
iperf from the same client to a physical machine on the same VLAN (and even on a different physical ethernet switch) shows what I'd expect from 1Gbps NICs and switches:
[ 3] local 10.22.44.201 port 39339 connected with 10.22.44.200 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.09 GBytes 936 Mbits/sec
and just as a sanity check, between 2 VMs on the ESXi host but in the same VLAN:
[ 3] local 10.22.44.201 port 49256 connected with 10.22.44.207 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 6.32 GBytes 5.43 Gbits/sec
finally, and this is just for reference, pfSense as the iperf client to an external box:
[ 3] local 10.22.44.88 port 27152 connected with 10.22.44.200 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 23.2 MBytes 19.4 Mbits/sec
The difference with pfSense as the client is the default TCP window size of 65.0 KByte rather than the 85 KByte that is the default on Debian.
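(I haven't ruled the window size out; pinning it on both ends would be a quick check, something like the following, with 10.22.44.200 being the same external box as above and the window size arbitrary:)

```sh
# server on the external box; -w pins the TCP window on both sides
iperf -s -w 208K
# client run from a pfSense shell
iperf -c 10.22.44.200 -w 208K -t 10
```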
LAN to WAN throughput is fine; though I only have a 50Mbps connection at the moment.
I thought traffic shaping might be playing a role, so I removed my shaper config and cleared the state table before running any tests.
I also tested a SCP file transfer between Debian hosts on either side of pfSense, but I'm not sure how valid that is given that SCP incurs significant encryption overhead and none of these VMs is terribly fast in the CPU department. I don't have any physical machines in my DMZ subnet, and my guest subnet is all wireless, so I'm unable to test a real-world workload between subnets with physical machines on either side, if that makes sense.
EDIT:
As a counterpoint, here's an iperf result between two CentOS VMs running in a completely different environment, with the same pfSense version (2.3.4-RELEASE) on ESXi 5.x. The VMs are on different hosts, with the server VM on 10 year old hardware. The iperf client VM and the pfSense VM are on the same host.
[ 3] local 192.168.66.15 port 49985 connected with 192.168.56.66 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 561 MBytes 470 Mbits/sec
All pfSense VMs in my tests are running with VMXNET NICs
-
MTU on everything is 1500.
Did you see this value under ifconfig, or did you set it in the GUI?
I was having big performance issues with xen + BSD/pfSense. VLAN interfaces were really slow and, IIRC, the box was sending a lot of ICMP rejects asking for fragmentation. In the end it turned out to be an MTU problem on all the VLAN networking.
I could work around it by setting the MTU to 1504 on the real (parent) interface and setting all the VLANs to 1500.
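(By hand that was just something along these lines; xn0 is the parent interface on my xen box, and the VLAN interface name is whatever ifconfig shows for it, so adjust both:)

```sh
# the parent carries the 4-byte 802.1Q tag, so give it 1504
ifconfig xn0 mtu 1504 up
# then the VLAN child can keep a full 1500 (interface name is just an example)
ifconfig xn0.10 mtu 1500 up
```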
-
I could work around it by setting the MTU to 1504 on the real (parent) interface and setting all the VLANs to 1500.
How did you accomplish this on the parent interface? I've tried @ the CLI with 'ifconfig vmx1 mtu 1504' and get 'operation not permitted'
The choice isn't available in the GUI that I can see because vmx1 doesn't have an interface assignment; it's just the parent interface for three VLANs.
-
How did you accomplish this on the parent interface? I've tried @ the CLI with 'ifconfig vmx1 mtu 1504' and get 'operation not permitted'
On xen it was accepted. Maybe the vmx driver is not fully implemented, or it needs the guest binaries?
This is the code I'm running from cron. xn0 and xn1 are the interfaces I'm configuring the VLANs on:
#!/bin/sh
# bump the parent interfaces to 1504 so the 4-byte VLAN tag fits
for a in xn0 xn1; do
  ifconfig $a | grep mtu.1500 && ifconfig $a mtu 1504 up
done
# then bring any VLAN interface that dropped to 1496 back up to 1500
ifconfig | sed "s/:/ /" | grep mtu.1496 | while read c d; do
  ifconfig $c mtu 1500 up
done
-
How did you accomplish this on the parent interface? I've tried @ the CLI with 'ifconfig vmx1 mtu 1504' and get 'operation not permitted'
On xen it was accepted. Maybe the vmx driver is not fully implemented, or it needs the guest binaries?
This is the code I'm running from cron. xn0 and xn1 are the interfaces I'm configuring the VLANs on:
#!/bin/sh
for a in xn0 xn1; do
  ifconfig $a | grep mtu.1500 && ifconfig $a mtu 1504 up
done
ifconfig | sed "s/:/ /" | grep mtu.1496 | while read c d; do
  ifconfig $c mtu 1500 up
done
Thanks. vmx is well supported and the open-vm-tools package is installed. I'm using the same drivers and tools in my work pfSense VMs and not experiencing the same problem.
Just so we're clear, your pfSense VM in xen is doing the tagging? The ifconfig you're running above is in pfSense and not the host OS? My familiarity with xen is minimal.
-
Just so we're clear, your pfSense VM in xen is doing the tagging? The ifconfig you're running above is in pfSense and not the host OS?
Yes, I'm running this script on pfSense and configuring the VLANs on it too.
IIRC, xen limits guest VMs to 5 or 6 interfaces.
-
I'm running this script on pfSense and configuring vlan on it too.
Ok, I'm a dummy. Long ago I set a very strong password for admin/root on pfSense and never use that account; I created a separate admin account with a not-easily-guessed user name for day-to-day admin. I used that account at the CLI to run ifconfig, which failed. Long story short, admin access in the GUI != root at the CLI. Running as the actual root account lets me set the MTU.
-
Yes, it needs root.
Check whether any pfSense routine sets it back to 1496. If so, adapt that script to your parent interfaces and run it from cron.
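(Something like this in cron would do it; the path and interval are placeholders, and the pfSense Cron package works just as well:)

```sh
# /etc/crontab entry: re-check the MTUs every 5 minutes as root
*/5  *  *  *  *  root  /root/fix_vlan_mtu.sh
```
-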
I don't know… I've experimented with MTUs as high as 9000 on the virtual NIC and the VMware switch, and have seen throughput up to about 150Mbps, but it's wildly inconsistent.
I really got into this thread because I found the original post interesting and was able to replicate a similar issue in my personal (home) setup. It is not, in fact, a real problem for me as I do so little inter-vlan routing on my home network that it doesn't affect me one way or the other.
One thing I can verify is that my shaper config limits inter-VLAN traffic; disabling the root queue on the DMZ interface allows much faster throughput during a sustained NFS -> local storage transfer on a DMZ machine from an NFS mount on the LAN. But that's a topic for a different forum.
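(For anyone who wants to check whether their own shaper is involved, watching the queue counters while a transfer runs is enough, assuming ALTQ queues:)

```sh
# verbose queue statistics; look for growing "dropped" counters on the DMZ/LAN queues while iperf runs
pfctl -vsq
```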