Incredibly slow throughput
-
Hello guys and girls,
I've got an Atom D2500CC (2x onboard Gbit + 1x PCI Gbit) setup with ESXi and one VM on it (2 vCPU, 2 GB RAM) running pfSense 2.1 (latest build). I can't get network throughput to exceed 20 MB/s (~140 Mbit/s) in any direction.
Network info:
- 24 Port Gbit Managed Switch
- 9 VLANs
- Firewall rules for every VLAN (set to * for the test = allow all)
- No Snort / anything else running
Tested the following:
- VLAN X to VLAN Y
- WAN to VLAN X / Y
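For the record, I measured it with iperf rather than just watching file copies (a quick sketch; the addresses are just examples for a client in each VLAN):

# on a host in VLAN Y, start the server
iperf -s

# on a host in VLAN X, run a 30-second test through pfSense
iperf -c 192.168.2.10 -t 30

Both directions top out around the same ~140 Mbit/s.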
The pfSense load shows ~70% at 140 Mbit/s of throughput, but in ESXi I can see something like 90% usage. I've just set up my new NAS box with 18 TB and have to transfer a LOT of data, and now I'm running into this problem.
Can I fix this by moving from ESXi to a bare-metal pfSense install without the hypervisor? Or is this problem caused by the different VLANs and the routing between them? I didn't have these problems while running pfSense 2.0 (NOT 2.1!) on my other ESXi host (quad-core i5 / 16 GB RAM / quad-port Intel Gbit NIC).
Thanks in advance
Florian
Update: Here is the System Activity while backing up files through pfSense 2.1 to my NAS (max transfer speed ~140 Mbit/s):
last pid: 33441; load averages: 1.31, 1.04, 0.83 up 0+07:22:59 01:03:11
99 processes: 4 running, 78 sleeping, 17 waiting
Mem: 53M Active, 14M Inact, 52M Wired, 1144K Cache, 28M Buf, 1871M Free
Swap: 2048M Total, 2048M Free

PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
0 root -68 0 0K 80K CPU1 1 63:40 66.99% kernel{em0 taskq}
11 root 171 ki31 0K 16K RUN 0 405:08 51.95% idle{idle: cpu0}
11 root 171 ki31 0K 16K RUN 1 365:52 16.99% idle{idle: cpu1}
0 root -68 0 0K 80K - 0 4:56 9.96% kernel{em1 taskq}
83552 root 76 0 73584K 19876K piperd 0 0:02 4.98% php
12 root -32 - 0K 136K WAIT 0 5:10 0.00% intr{swi4: clock}
14 root -16 - 0K 8K - 1 0:51 0.00% yarrow
0 root -16 0 0K 80K sched 0 0:50 0.00% kernel{swapper}
260 root 76 20 3416K 1184K kqread 1 0:44 0.00% check_reload_status
75674 root 76 20 3708K 1436K wait 1 0:36 0.00% sh
45272 root 44 0 3328K 1296K select 1 0:34 0.00% apinger
0 root -68 0 0K 80K - 0 0:11 0.00% kernel{em2 taskq}
70909 root 44 0 8368K 6748K select 0 0:08 0.00% bsnmpd
51049 root 76 0 73584K 23332K accept 0 0:08 0.00% php
61152 dhcpd 44 0 8448K 5300K select 0 0:07 0.00% dhcpd
49255 root 44 0 10096K 7012K kqread 0 0:06 0.00% lighttpd
12 root -32 - 0K 136K WAIT 0 0:05 0.00% intr{swi4: clock}
61779 dhcpd 44 0 8448K 5632K select 0 0:05 0.00% dhcpd
-
Have you installed VM tools, either the open package or the manually installed vendor-supplied tools?
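If you're not sure whether they're active, a quick check from the pfSense shell would be something like this (assuming the Open-VM-Tools package, whose daemon is vmtoolsd):

# is the guest tools daemon running?
ps ax | grep vmtoolsd

# are any VMware paravirtual drivers loaded?
kldstat | grep -i vmx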
-
Does an Atom CPU actually have enough power for this, and does the system have full VT-d capabilities enabled? If it has to emulate your NICs, it will max out quickly, I'd guess.
If I had to build an ESX host, I'd go for at least a modern Core iX, a mainboard that does Intel VT-d, and a fast HDD.
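You can also check from inside the VM what NIC it is actually seeing; the em0 in your top output points at the emulated Intel e1000 rather than a paravirtualized vmxnet adapter. Something like this (pciconf is standard FreeBSD, output will vary):

# show what the guest thinks its NICs are
pciconf -lv | grep -B3 network
# the emulated e1000 identifies itself as an Intel 82545EM;
# a vmxnet adapter would show up as a VMware device
-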
- VMware Tools not installed
- Yes, a dual-core Atom should be capable of running VLANs + firewall without any extra stuff at 400+ Mbit/s.
I'll install VMware Tools now and see if this changes anything. If it doesn't, I'll go bare metal and install pfSense on the hardware directly.
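If the tools help, I'll also try switching the vNICs from the emulated e1000 to vmxnet3. As far as I know that's just a per-adapter line in the .vmx file (edited with the VM powered off; ethernet0 is the first adapter), though whether the driver behaves on pfSense 2.1 is something I'd have to test:

# per adapter in the VM's .vmx file
ethernet0.virtualDev = "vmxnet3"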
-
- VMware Tools not installed
- Yes, a dual-core Atom should be capable of running VLANs + firewall without any extra stuff at 400+ Mbit/s.
Mmmm yes, but I'd be surprised if it can when running via ESX.
-
Yes, I would agree with drewy and athurdent. When you install bare metal on the Atom, you can use the hardware offload options on your NICs and significantly reduce CPU overhead. When you install pfSense in a VM, it's using virtual NICs, and all of the networking has to be processed by the CPU. The Atom doesn't support VT-d; otherwise you might be able to pass the NICs directly through to your pfSense VM and gain the benefits of hardware offload.
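For reference, on a bare-metal install you can inspect and toggle those offload features per NIC from the shell (a sketch; em0 is the first Intel port, and which flags actually help varies by driver):

# show the offload capabilities the NIC advertises
ifconfig -m em0

# enable checksum offload and TCP segmentation offload
ifconfig em0 rxcsum txcsum tso4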
From the VMware ESXi guide:
In a native environment, CPU utilization plays a significant role in network throughput. To process higher levels of throughput, more CPU resources are needed. The effect of CPU resource availability on the network throughput of virtualized applications is even more significant. Because insufficient CPU resources will limit maximum throughput, it is important to monitor the CPU utilization of high-throughput workloads.
Use separate virtual switches, each connected to its own physical network adapter, to avoid contention between the VMkernel and virtual machines, especially virtual machines running heavy networking workloads.
To establish a network connection between two virtual machines that reside on the same ESXi system, connect both virtual machines to the same virtual switch. If the virtual machines are connected to different virtual switches, traffic will go through wire and incur unnecessary CPU and network overhead.
www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf
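If you want to double-check how your vSwitches and uplinks are laid out, the ESXi 5.x shell can list them:

# list standard vSwitches with their uplinks and portgroups
esxcli network vswitch standard list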
I think when you install bare metal and test the same workload, you might see half the CPU utilization. Given the 90% number you saw on the ESXi box itself, it sounds like all of the CPU is being used up: 70% by pfSense, and the other 20% by the hypervisor to process the virtual switch and actually send the data out on the network. (Just a guess.)
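One way to verify that guess would be to run esxtop on the host while a transfer is going (it's interactive):

# on the ESXi shell during a transfer
esxtop
# press 'c' for the CPU view and 'n' for the network view;
# comparing %USED of the pfSense VM against the vmkernel
# worlds shows where the cycles are going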