HYPERVISOR performance testing

jsone

I recently purchased some c2758 from the pfsense guys, 2 for production and 2 for our test lab. while i had all 4 in my test lab i took the opportunity to set up some tests to see which hypervisors performed best, or if baremetal was really the only option.

the lab setup consisted of 3 c2758: 1 with centos 7 kvm, 1 with esxi 6, and 1 baremetal install. 2 quad-core centos 7 client end points. a desktop cisco switch for the wan and another for the lan. all interfaces are 1gig

all testing was done wan->router->lan

the version of pfsense i used was 2.2.4

guests and installs use 8cores, 8gigs of ram.

thruput results were as follows

#thruput tests
iperf3 -c 192.168.88.202 -b 0 -P 20 -t 30

baremetal - 960mbit
vmware esxi 6 vmxnet3 driver - 930mbit
vmware esxi 6 e1000 driver - 700mbit
centos 7.1 e1000 driver - 150mbit
centos 7.1 virtio (default single core) - 600mbit
centos 7.1 virtio (8 cores) - 930mbit

multicore virtio notes: this requires you to manually run virsh edit and place <driver name="vhost" queues="8"/> into each interface. this only works for virtio, and we saw folks struggle to make this work in other builds such as arch linux and ubuntu.

another note: for now in centos guests disable all hardware offload, some of them cause hypervisor panic style dmesg spew when under load.

#flood tests
hping3 192.168.88.202 -c 10000000 -d 120 -S -w 64 -p 22 -s 22 --flood --rand-source

This flood scenario basically creates states as fast as the router can possibly go, so you are testing both states and throughput.

no test env we tried could survive it. we set our state limit to 12 million in each case. most of the baremetal and centos tests survived upwards of 5-8 million states.

baremetal took the biggest volume, sustained roughly 150mbit of flood before dying shortly after (less than 30seconds)
centos 7.1 8core virtio did the next best with 60mbit, then dies about 10seconds later
vmware esxi vmxnet3 takes about 35mbit then dies within about 1-2 seconds

#flood tests
hping3 192.168.88.202 -c 10000000 -d 120 -S -w 64 -p 22 -s 22 --flood

This test floods using the same states, so we aren't limiting ourselves with cpu and state creation

baremetal could push about 500mbit
centos 7.1 virtiox8 could do 300mbit. (pfsense guest pps claimed 120k pps wan 120k pps lan)
vmware esxi vmxnet3 175mbit
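
for anyone wanting to turn those mbit numbers into a rough packets-per-second figure, here is a minimal sketch. it assumes every flood packet is 160 bytes at the IP layer (the 120 bytes from -d 120 plus a 20 byte TCP header and a 20 byte IP header), and mbit_to_pps is a made-up helper name, not part of hping3 or pfsense. the result is only an estimate and may not line up with what the pfsense counters show, since that depends on where the bandwidth was measured and which headers are counted.

```shell
#!/bin/sh
# Rough packets-per-second estimate from a bandwidth figure.
# Assumption: 160 bytes per packet at the IP layer
# (120 bytes of hping3 data + 20 byte TCP header + 20 byte IP header).
# mbit_to_pps is a hypothetical helper name.
mbit_to_pps() {
    mbit=$1     # measured bandwidth in Mbit/s
    bytes=$2    # packet size in bytes
    echo $(( mbit * 1000000 / (bytes * 8) ))
}

# The three flood rates reported above.
for rate in 500 300 175; do
    echo "${rate}mbit ~ $(mbit_to_pps "$rate" 160) pps"
done
```

swap the 160 for the full on-wire frame size if you want to include ethernet overhead.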

both baremetal and centos fought thru this style flood with SERIOUS lag, while the vmware esxi would ping timeout.

it was interesting to note when flooding the baremetal and centos, we were starting to lag our poor little desktop cisco switches, probably maxing out their pps limits. the lag was affecting my management access to the units, while this did not occur when flooding the vmware setup.

i hope someone finds this info useful, it took weeeeeeks of time testing it all out.

jsone

all of the above iperf tests were TCP

i also did UDP tests, but no matter what test setup i used they were pushing 950mbit of udp, so uninteresting i forgot to mention it :D

#update 8-23-15

tested hyper-V 2012 and pfsense 2.2.4 as gen1 guest

guest 8 cores 5982megs of ram
2 vswitches, 1 lan 1 wan

test 1. iperf got 644mbit

test 2. the random state generation hping test not only killed the guest instantly, it also continued to grind the CPUs at 100% for over 5 minutes before i forced the vm off.

test 3. flood without random ports, died instantly at 37% cpu hypervisor reports disco from guest after 10-15seconds, had to force off as well.

windows firewall was turned off for setup and test run

it should be noted that microsoft makes it impossible to even do the testing on their free hyper-v 2012 without having some paid access to a 2012 server or paid 8 pro. feels a little unwelcoming. after seeing hyper-v crap the boat this bad, its not just their licensing disaster that is unwelcoming.

attempted to disable all hardware offloading and retest, did not make any performance or reliability difference. even the boot up looks a little slower than either kvm, vmware or xen, with some strange messages when starting cores and other such. its safe to say it works, although just barely

/usr/bin/openssl engine -t -c
enabling AES-NI hardware encryption, it says it supports RSA,DSA,DH,AES-128,192,256. iperf stopped flowing after about 10 seconds. ping from in the pfsense guest to lan host reports "not enough buffer"

disabled AES-NI hardware encryption and rebooted to get traffic flowing again

i enabled the windows firewall to see if that would against belief possibly increase performance, i can still reach the vm guest via the webui , iperf produced roughly the same performance 666mbit. hping flood only still mortally wounds the pfsense guest.

Keljian

Very interesting, so if seeking the best in virtualization for PFsense, your experience would suggest using KVM?

Halvsvenskeren

I push 900+ mbit/s easily in a production scenario running in a VmWare env.

Both ways testing on www.speedtest.net

So you have underperforming hypervisors running that are not set up to handle massive loads.

http://www8.hp.com/us/en/products/networking-switches/product-detail.html?oid=4220267#!tab=specs

This handles app. 15MM PPS and shouldnt bog down under load.

We do see a lot better performance under load from the E1000 driver than the VMXNET3. It could be the guest OS that has better adaptation of the hypervisor drivers. (Windows server).

jsone

@Keljian:

Very interesting, so if seeking the best in virtualization for PFsense, your experience would suggest using KVM?

This is a simple question with a complex answer. centos and vmware both appear to be able to do a reasonable job at getting near the physical limits of the hardware while using pfsense 2.2.4

centos 5 and centos 6 went completely off the rails between 5.7 and 5.9, where network performance completely bombed out and you couldnt even run pfsense reliably. now centos 7 claims they are gearing for ddos prevention and other good things, im hoping they dont forget the lessons they learned the hard way many years ago.

as a free user to vmware i was provided with 2 downloads, the iso to install the server and the vsphere client, while using the vsphere client it kept spewing messages after everything i clicked saying i had to use some web based editor nonsense to further customize the settings, when i tried to find this application it said it required java be installed… well we dont install java or flash here .. for obvious reasons, this to me was a red flag to steer back towards centos, i mean why does the vsphere client not support editing your server? sounds a little worrisome.

the centos install is pretty quick and simple, although i have yet to test some more advanced features in virtio which in the past have kernel paniced both the hypervisor and pfsense guests in certain scenarios.

at this point i am leaning towards centos but may go baremetal if i run into sketchy virtio issues with more advanced network standards.

another great feature i found in centos 7, that i had not seen in other distros or even older centos versions, was the tuned-adm command

you can basically run "tuned-adm profile virtual-host" and have it optimize sysctl "things" perfectly for running pfsense, while other distros make you shoot in the dark to some degree as to which sysctls should be set, and what they should be set to.

Keljian

I have to admit I haven't had any issue whatsoever editing the ESXi Server using the vSphere client.. maybe there is some other dependency missing on your test environment box running the client.

I know that pfsense doesn't support SR-IOV yet as BSD doesn't yet. I assume it's coming in the future with the number of patches coming through, but that may turn the tables in terms of latency/ability to handle packets etc.

Is there any chance you can run some Hyper-V tests to compare? I'm curious.

johnpoz

current version of esxi wants everyone to use the vserver sort of setup versus direct editing of the host with the esxi client. If your vm hardware version is above 9 I think it will tell you that you should use the vserver, sure.. It used to be a popup, but pretty sure currently it's not a big deal.

These are only warnings about using vclient vs vserver. And to be honest unless you're in high end enterprise the stuff you can not edit with the client is of no concern..

(attachment: vsphereclient.png)

KOM

Then there's the fact that nobody likes using the vCenter Web Client. Slow and confusing. Today's fun for me is upgrading our entire virtual infrastructure to vSphere 6.0.

redpine

Any thoughts on how Xen would compare?

jsone

@redpine:

Any thoughts on how Xen would compare?

just finished a xen center 6.5 test run. after some digging it appears xen center's internal guts are mostly centos or fedora based

iperf produced 18mbit

no way in the vm editor to change the network drivers, maybe you can hack around to find a way. id bet e1k would be slightly faster and virtio even more so.

cpu in the vm only used two of the cores(100% of each), which was exactly what occurred with centos without multi core virtio drivers

i did not find any clear indication that xen center supported bsd, you probably want to avoid xen for bsd based operating systems such as pfsense

prob want to stick with centos 7 with kvm for bsd virtio support.

i did not test the flood commands, who knows maybe thats where it shines? ;D but prob not!

it should be noted that i did disable hardware checksum offloads and rebooted the vm guest prior to the test.

update:8-21-15
so it may appear the scores were so dismally low related to: https://forum.pfsense.org/index.php?topic=88467.0
while in kvm, disabling HCO solves this issue maybe in xen you may need to mess with the hypervisor nics as prior post states. after scanning the post i feel safe saying xencenter isnt the place for bsd guest, it appears to specialize in linux and windows.

redpine

prob want to stick with centos 7 with kvm for bsd virtio support.

i did not test the flood commands, who knows maybe thats where it shines? ;D but prob not!

Thanks. I was all setup to install centos 6 and Xen (waiting for my SSD). Guess I'll change my dhcp server to serve up a centos 7 install via PXE instead. Really don't like KVM… ugh!!!

Is your configuration/setup with centos 7 KVM documented anywhere? Hate to setup pfSense through trial and error.

jsone

@redpine:

Is your configuration/setup with centos 7 KVM documented anywhere? Hate to setup pfSense through trial and error.

installing pfsense on centos 7.1

1. install centos. i choose server with GUI, then i check all the boxes on the right for virtualization, and check the box for devel tools
a. storage is always tricky. i LOVE lvm for kvm storage, its never failed me when snapshotting and doing dd backups, so you may want to shrink the install drive down to 20gig and leave the rest of your main VG open for vm guests.
i. vgdisplay
ii. lvcreate --name pfsense -L 5G vgnamefromvgdisplay
b. tuning sysctl: in centos 7 run "tuned-adm profile virtual-host" and reboot.

2. copy/gunzip the pfsense iso to your /iso/ using winscp

3. using xming(youll need the donation version of xming to use x11 forwarding with virt-manager) or xwindows do "virt-manager" when creating the VM choose "other" then it will populate more options choose freebsd 10x
a. virt-manager is where youll setup your br0, br1, br2, etc for each interface (youll assign these to your vm interfaces later), make sure you tell all interfaces to start on boot. youll want to reboot once its all setup to see if you missed something

4. tell the vm to boot from the iso, at the end , check the box for customize before launch.

5. add your extra network interfaces as needed, drag all the drivers types to "Virtio" and hit save.

6. launch the vm, force the vm off.

7. set how many queues you want each interface to use (usually the same as your cores)
to do this you get on a terminal, type "virsh edit pfsense", find each virtio network interface
add <driver name="vhost" queues="8"/> to the bottom, just above each closing </interface> tag, then exit the editor with a save

8. start the vm

9. once you are into the pfsense UI, check all boxes for disable hardware offload under system > advanced > networking

10. increase the state limit to at least 1 million.

11. start to ddos yourself, responsibly.
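
a quick way to confirm the queues setting from step 7 actually stuck could look like the sketch below. it just counts the vhost driver lines carrying a queues attribute in the domain XML. count_multiqueue is a made-up helper name, and the guest name "pfsense" is taken from the virsh edit command above. it matches both quote styles, since virsh normally prints attributes with single quotes.

```shell
#!/bin/sh
# Count <driver name="vhost" queues="N"/> lines in libvirt domain XML read from stdin.
# count_multiqueue is a hypothetical helper name.
count_multiqueue() {
    grep -Ec "<driver name=[\"']vhost[\"'] queues=[\"'][0-9]+[\"']"
}

# Only query libvirt when virsh is actually installed on this box.
if command -v virsh >/dev/null 2>&1; then
    virsh dumpxml pfsense 2>/dev/null | count_multiqueue || true
fi
```

the number printed should equal the number of virtio interfaces you edited. if it is lower, one of the interfaces is still running single queue.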

update: 9-7-15

on centos 7, if we issued a network restart, NM greedily removes the guest nic interfaces and doesnt add them back

run these commands, then hypervisor network restart wont permanently cripple your pfsense guest

systemctl disable NetworkManager
systemctl stop NetworkManager

side notes:

1. centos 7 in my tests automatically disabled its firewalld system for all bridged interfaces, you do want to make sure thats happening in your install too.

you should see the following lines in the following file
/usr/lib/sysctl.d/00-system.conf

# Disable netfilter on bridges.

net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

2. it should be completely fine to leave selinux and firewalld running

3. i did play with increasing the buffer on the interfaces in centos, it didnt make a huge difference in over all performance, tho it did appear that initial bursts were handled much faster with higher tx and rx queues. i left this all at default 256 for my tests and production. you may want to play with it in your setups

ethtool --show-ring enp0s20f0
Ring parameters for enp0s20f0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 256
RX Mini: 0
RX Jumbo: 0
TX: 256

#to change it (do this for each interface)
ethtool --set-ring enp0s20f0 rx 4096 tx 4096

4. depending on the nature of your setup you may want it less power save and more cpu ready, to do that you can tell speed step to be less conservative.

#to see where you are at
grep -i mhz /proc/cpuinfo
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

#to change it you can use this command
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do [ -f "$CPUFREQ" ] || continue; echo -n performance > "$CPUFREQ"; done
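
for side note 1, a small check like the one below can confirm the three bridge netfilter lines are present and set to 0. this is only a sketch: bridge_nf_disabled is a made-up helper name, and it only reads the config file, so it will not notice another sysctl.d file overriding the values. running "sysctl net.bridge.bridge-nf-call-iptables" shows the live value.

```shell
#!/bin/sh
# Return 0 when a sysctl config file sets all three bridge-nf-call hooks to 0.
# bridge_nf_disabled is a hypothetical helper; the default path is the file from side note 1.
bridge_nf_disabled() {
    conf=${1:-/usr/lib/sysctl.d/00-system.conf}
    for key in ip6tables iptables arptables; do
        # Allow optional spaces around "=" and require the value 0.
        grep -Eq "^[[:space:]]*net\.bridge\.bridge-nf-call-${key}[[:space:]]*=[[:space:]]*0[[:space:]]*$" "$conf" || return 1
    done
    return 0
}

# Only run the check when the file exists on this machine.
if [ -r /usr/lib/sysctl.d/00-system.conf ]; then
    if bridge_nf_disabled; then
        echo "bridge netfilter disabled"
    else
        echo "bridge netfilter still enabled"
    fi
fi
```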

jsone

@Keljian:

Is there any chance you can run some Hyper-V tests to compare? I'm curious.

i had hyper v setup in pfsense 2.1, it was a disaster, no benchmarks from then, it was just a problem with nic driver compatibility. since then and windows 8+10 i decided not to renew my msdn, which despite the fact that ms said id keep access to all the software i paid for in the prior subscription, they lied its all locked down, the fact i even have to pay over $1000/y to test microsofts shit software is absurd. id love to help by testing it, but microsofts business methods lead me to say forget about microsoft as a hypervisor, can you even imagine having your entire network go down every 2nd tuesday of the month because of windows update on your pfsense hypervisor? :)

thats what carp is for? lol no.

whats that you say? windows srv 2012 doesnt reboot on the 2nd tuesday after updates anymore? you are right LOL it sits there vulnerable for days until you reboot it!

windows as a router hypervisor is really no less absurd than saying you need java or flash installed in your browser to manage your VMs.

just forget about ms all together as a production env os.

heper

@jstar1:

just forget about ms all together as a production env os.

you do that in your reality, while the rest of us are stuck in this reality ;)

jsone

@heper:

@jstar1:

just forget about ms all together as a production env os.

you do that in your reality, while the rest of us are stuck in this reality ;)

hey, ive got my share of prod windows servers like everyone else, everyday is another opportunity for me to phase them out / move user interaction away from them ;)

ill be over here in my nice soft padded reality, just remember, microsoft wants to be an ASP and grab market share, every time you pay them for software, you are paying your competitor to allow you to compete with them, if you arent providing products that would suggest a conflict of interest, at a minimum you are stuck supporting a monopoly.

i found a Hyper-V Server 2012 R2 Evaluations | Unlimited, i might give that a try if i get some freetime, although it sounds like a major waste of time

http://www.microsoft.com/en-us/evalcenter/evaluate-hyper-v-server-2012-r2?i=1

https://technet.microsoft.com/en-us/library/dn792027.aspx

claims 2012hyperv supports freebsd, might have some interesting test results

apparently hyper-v has no UI to speak of, and to manage it remotely i would need a 2012 install with hyper-v mmc along with a ton of other nonsense i read about here.
http://pc-addicts.com/12-steps-to-remotely-manage-hyper-v-server-2012-core/

i think ill give up on testing this nightmare for now.

Keljian

I should mention I've noticed a lot less latency in ESXI with the new open-vm-tools which was released a day or two ago

jsone

some things you might need to know to get hyper v installed and working on a c2758

to install successfully i switched the c2758 to ide sata mode. (prob could just install the driver?)

install intel network drivers

http://www.supermicro.com/products/motherboard/Atom/X10/A1SRM-2758F.cfm

PnPUtil -i -a d:\PRO1000\Winx64\NDIS64\e1s64x64.inf
PnPUtil -e

To Turn Off:
NetSh Advfirewall set allprofiles state off
To Turn On:
NetSh Advfirewall set allprofiles state on
To check the status of Windows Firewall:
Netsh Advfirewall show allprofiles

on some 2012 client somewhere do this to get hyperv tools
Install-WindowsFeature RSAT-Hyper-V-Tools -IncludeAllSubFeature

upload the iso to the hyperv by \\192.168.x.x\c$

left firewall off for testing and setup.

redpine

3. using xming(youll need the donation version of xming to use x11 forwarding with virt-manager) or xwindows do "virt-manager" when creating the VM choose "other" then it will populate more options choose freebsd 10x

First install of Centos 7. I have an xming license, but like opennx better. Unfortunately neither works with gnome in Centos 7, so I have to use KDE. Probably will install xfce later. Now on to installing pfSense in a VM.

Thanks for the help.

jsone

@redpine:

First install of Centos 7. I have an xming license, but like opennx better. Unfortunately neither work with gnome in Centos 7, so I have to use KDE. Probably will install xfce latter. Now on to installing pfSense in a VM.

Thanks for the help.

ya, xming is what you would want to use to do x11 forwarding in ssh to a windows computer for virt-manager. on linux just start xwindows and click on the virt-manager icon within gnome

jsone

my next step was to test centos 7 kvm+pfsense 2.2.4 and baremetal in a lagg configuration.

lagg allows you to bond multiple interfaces together, using mode 4, gives you additional thru put AND redundancy

our switches support LACP, 3+4 so we setup all of the clients this way.

lan and wan clients have 2 1g nics, 2 bridge interfaces with separate ips each, then we run 2 iperfs at the same time to two different iperf backend ips

the pfsense setup has 1 bond interface with a wan vlan on it, lagg0 is the lan.

centos 7 guest worked like a charm, we got 1.5gbit thruput using the prior mentioned iperf3 tests.

switched over to the baremetal unit, had to add a igb2 to the "lan" on its own subnet to configure the lagg group from the webui

once the lagg was up i performed the same test, the baremetal maxes out at 950mbit,

attempted to adjust the lagghash

ifconfig lagg0 lagghash l3,l4
ifconfig lagg0 lagghash l2
ifconfig lagg0 lagghash l2,l3,l4

no improvement

watching "systat -ifstat 1"

shows laggmember0 igb0 is maxed out, while laggmember1 igb1 has 2mbit inbound, but no traffic going out.

lagg0_vlan100 114MB in, 2.1MB out
lagg0 114MB in, 114MB out
lo0 0 in, 0 out
igb2 0 in, 0 out
igb1 2mb in, 0 out
igb0 114mb in, 114mb out
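
to put the systat numbers next to the iperf numbers, the sketch below converts MB/s into mbit. it assumes the 114MB figure is megabytes per second with 1 MB = 1,000,000 bytes (if systat means 1,048,576 byte units the result is closer to 956mbit), and mbytes_to_mbit is a made-up helper name. either way igb0 is sitting at roughly the line rate of a single 1gig port.

```shell
#!/bin/sh
# Convert a rate in MB/s (as read off systat -ifstat) to Mbit/s.
# Assumption: 1 MB = 1,000,000 bytes. mbytes_to_mbit is a hypothetical helper name.
mbytes_to_mbit() {
    echo $(( $1 * 8 ))
}

echo "igb0: $(mbytes_to_mbit 114) mbit"
```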

removed all vlans and interfaces not used by the lagg, rebooted. performed same test, still had results above, no performance gain.

should also be noted we attempted to adjust the strict tunable without any improvement.

i just plugged the cables into a different lagg on the switch, when the lagg came up, we saw the same issue in reverse, igb1 would pass all the traffic, while igb0 did 2mbit max with 0 send

###giving up on lagg with baremetal###
after reviewing the mac info, it turns out that as of 2.2.4 pfsense reports only 1 mac to the switch for LACP on a vlan, while it knows to report both macs on vlan1. as a result when the switch returns traffic for vlan100, it only knows about 1 port, so the router in a lacp will never go faster than 1 interface unless you bandaid it in the switch somehow. either way, LACP doesnt work in 2.2.4 with vlans. id assume redundancy remains, although i did not test it. the centos 7 hypervisor managing the lagg with a pfsense guest reports both laggmember macs on all vlans.

you can successfully get pfsense to route its traffic out of both lagg members using the following tunables, which are not set by default in 2.2.4
sysctl net.link.lagg.default_use_flowid=0
sysctl net.link.lagg.0.use_flowid=0
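
to double check that the tunables took effect (for example after a reboot), something like this sketch could be used. it reads the "name: value" lines that sysctl prints on the pfsense box and lists any lagg use_flowid tunable that is not 0. flowid_still_on is a made-up helper name, and it only looks at tunables matching the two names above.

```shell
#!/bin/sh
# Read "name: value" lines from stdin and print every lagg use_flowid tunable
# whose value is not 0. Prints nothing when all of them are 0.
# flowid_still_on is a hypothetical helper name.
flowid_still_on() {
    awk -F': *' '/^net\.link\.lagg\.(default_use_flowid|[0-9]+\.use_flowid):/ && $2 != 0 { print $1 }'
}

# On the pfsense box itself; output is empty elsewhere.
if command -v sysctl >/dev/null 2>&1; then
    sysctl net.link.lagg 2>/dev/null | flowid_still_on
fi
```

no output means both settings are in place.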