Pfsense 2.1 vmware cpu host high usage



  • Hello,
    We have several pfSense 2.1 instances installed on VMware hosts (5.1 and 5.5).
    pfSense shows about 3% CPU usage, but in VMware we constantly see about 35% CPU usage (4000+ MHz) per pfSense VM.
    We have the latest open-vm-tools installed on each pfSense instance.

    Is there any way to solve this huge waste of resources?



  • Just to chime in on this thread: I am also facing this issue. I'm using ESXi 5.1 build 1157734 on a Dell PowerEdge R720 with Intel i350 NICs. I'm running pfSense 2.1 with open-vm-tools installed, 6 E1000 NICs and 1 vCPU.

    pfSense itself does not detect a high CPU load, but the performance graph in vSphere shows a significantly higher CPU load. When generating high network load between two interfaces of the pfSense VM, the maximum throughput I get is 600 Mbit and the CPU hits 100% in VMware. CPU load in pfSense is around 20-25%.

    Another thread on this forum (http://forum.pfsense.org/index.php/topic,41647.75.html) indicates that it might be the VMware vSwitches eating up the CPU cycles. Judging by the age of that topic, the problem has been around for quite a while.

    Someone also opened a bug in redmine (https://redmine.pfsense.org/issues/2618), but the bug got rejected because apparently not every ESX install is experiencing the issue.

    So, is this a VMWare bug in specific hardware environments? Would be great if someone from the pfSense support team could provide some additional info…



  • I am seeing this issue as well.

    Pfsense with e1000 VNICS
    Cisco UCS chassis with B200-M3 blades
    40GB backplane

    CPU use appears to be about 10x what is shown in the dashboard and RRD graphs for CPU use. It seems logical that the extra network traffic would take some CPU, but the load seems high. I'm only pushing 20 Mb max through the firewall; it should be able to handle that virtualized.



  • I'm seeing the same kind of problems on two ESXi hosts (5.1 and 5.5) and pfSense 2.1

    ~300 MHz on VMware while idle (0.3% load on pfSense), maxing out at ~4000 MHz with about 500 Mbit throughput. pfSense indicates the usage should be about 25%, but VMware doesn't seem to agree.

    The load also jumps up when transferring files over Samba from a fileserver on the same host, probably due to promiscuous mode on the VM switch.

    Host 1 is an AMD-based whitebox server, Host 2 is an Intel NUC i5. Both hosts are connected through Intel network adapters, and both pfSense VMs are handling 7 VLANs over this interface. I tried booting the VMs without network adapters, but idle was still about 180 MHz.

    Tried installing VMWare tools and open-vm-tools, but noticed no improvement.

    All suggestions are welcome  :'(



  • @Corn:

    If you run up a new 2.1 instance without installing vm tools do you see the same problem?



  • Yup, just tried a brand new 32 and 64 bit. 10x CPU usage in esxtop vs the guest os.



  • I'm having similar issues running pfSense in VirtualBox on a Linux Mint 16 Petra host.

    At first I thought it was because my hardware was underpowered.  The machine in question is my homebrew nas / backup server and the hardware was chosen for low noise and low power consumption over performance.

    Machine's specs:

    1.86Ghz dual core Intel Atom processor
    2GB Ram
    1 TB 2.5" hard drive
    Integrated graphics (don't remember the chipset)
    Linux Mint 16 Petra, XFCE desktop
    Oracle Virtualbox
    pfSense 2.1 running in Virtualbox
    3x NICs.  One WAN, two lan, with each lan being served by a SOHO router that's acting solely as a switch

    At first CPU usage was running around 90% when the VM was under load downloading large files.  I installed virtio drivers on pfSense and got it to use paravirtualized NICs.  This seemed to help it a little, but it still routinely sits on 80 percent while under network load, although at times it hovers more around 65 to 70 percent.  It also matters which utility I'm using to measure CPU consumption.  Mint's Task Manager application shows cpu usage hovering around 70 percent, but I think it aggregates from both cores.  htop from the terminal tends to show somewhat heavier load, with one core fairly regularly pegging out at 95%+.  The core that pegs out does change, it's not just the one that the VM is using.

    The VM is configured to use a single core of the dual core CPU (I am not given an option to allocate the second core to it).

    The paravirtualization of the NICs seemed to help things some, I haven't leveraged virtio yet for hard drive access, though I have doubts about that having much effect.

    Now, maybe it IS that my hardware is underpowered, but I know that pfSense will run well in a small network (less than 8 clients or so) on a 500 MHz Geode processor.  I'm kind of amazed that VirtualBox adds that much overhead and that the overhead is apparently not mostly from virtualized NICs (if it were, going to virtio for the NICs should've solved the problem outright).

    As far as real world performance, I'd say my pfsense VM still, despite everything, compares favorably to my old Netgear SOHO router, and it has allowed me to set up two lans on different subnets (one with restrictive firewall rules; that one's for guests that don't need to be able to talk to my personal desktop machine).  Latency pinging out to the internet is about 10 ms higher on average than on my old router, but since I don't play much Quake anymore I can tolerate that ;D  Ping latency goes quite high under load, but my aforementioned Netgear router would do that too, so that's clearly a different problem.

    The last time I checked, the heavy CPU load only happened when pulling in data from the WAN.  CPU usage was minimal when my Roku streamed music or video from the plex server, though the VM router might not have much direct involvement in that traffic (the Linux Mint host has a connection, through a physical interface, to the same lan that the Roku has a connection to)

    Maybe I should just run it and not worry about it, but I don't really want to run the thing hard all the time if I can help it.



  • I have two ESX hosts and I can run two nodes in CARP, I don't wanna give up redundancy by going to a physical box. I wish we could figure this out, almost everyone I talk to who runs 2.x in ESX has this problem. I tried vmxnet3 NICs today with the latest driver from vmwaretools 9 and still 10x cpu in ESXTOP.



  • I am also experiencing this issue on my ESXi box with pfSense 2.1 running in a VM with the latest open-vm-tools installed.

    The ESXi host is a Dell R620 with Intel NICs.

    I'm running ESXi 5.5. Will be upgrading to 5.5 U1 this weekend to see if that helps any.



  • Exactly the same here with pfSense 2.1.2 on top of ESXi 5.1u2 … Has this already been identified as a bug? Are the developers aware of this issue?



  • @kenshirothefist:

    Exactly the same here with pfSense 2.1.2 on top of ESXi 5.1u2 … Has this already been identified as a bug? Are the developers aware of this issue?

    No one seems to acknowledge this. The base recommendation is to use Intel NICs, but in my case I'm using all Intel NICs and the problem persists. I'm also using HCL servers using the official OEM (Dell) install image.

    This problem is still occurring on ESXi 5.5 U1.

    If it matters, I'm using Intel I350 1G NICs.



  • Has anybody also noticed increased latency when using pfSense in ESXi 5.x? I haven't had time to test it, but I'm really curious whether this high ESXi CPU load is just a "CPU load issue" or whether it also affects firewall efficiency (especially when multiple TCP connections are being handled simultaneously).



  • Bump. Any news on this? Anybody solved this?



  • This keeps popping up.  It might be helpful if people experiencing this problem posted a few standard things about their setup.  Then we might be able to see whether there is something common between them:

    • What is the ESXi host machine and processor?
    • Which version of pfSense and whether 32 or 64-bit?
    • How many vCPUs have you allocated to the VM?
    • How much memory have you allocated to the VM?
    • Have you  installed the pfSense packaged VM tools or the VMware-supplied tools?
    • Are you using the e1000 adapter type or something else?


    • What is the ESXi host machine and processor?

    IBM, Intel Xeon X5650

    • Which version of pfSense and whether 32 or 64-bit?

    2.1.3 64-bit

    • How many vCPUs have you allocated to the VM?

    1 CPU @ 2.67 GHz

    • How much memory have you allocated to the VM?

    512 MB

    • Have you  installed the pfSense packaged VM tools or the VMware-supplied tools?

    pfSense packaged VM tools

    • Are you using the e1000 adapter type or something else?

    E1000

    BTW: in my particular case there is very low and constant bandwidth (approx. 2 Mbps) but with thousands of active TCP connections (many small packets); currently I have only about 2% CPU load inside pfSense (approx. 50 MHz), but approx. 1800 MHz Consumed Host CPU.
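
    To put a number on that gap, here is a minimal sketch of the arithmetic. It assumes the guest's CPU percentage refers to the single 2.67 GHz vCPU (2670 MHz) listed above; the function name is made up for illustration, and this is not how vSphere itself derives its figures.

```python
def host_to_guest_ratio(guest_pct, vcpu_mhz, host_consumed_mhz):
    """Return (MHz implied by the guest's CPU percentage, host/guest ratio).

    Assumption: guest_pct is a percentage of one vCPU running at vcpu_mhz.
    """
    guest_mhz = guest_pct / 100.0 * vcpu_mhz
    return guest_mhz, host_consumed_mhz / guest_mhz


# Figures quoted in the post: ~2% in the guest, ~1800 MHz consumed on the host.
guest_mhz, ratio = host_to_guest_ratio(
    guest_pct=2.0, vcpu_mhz=2670.0, host_consumed_mhz=1800.0
)
print(f"guest ~{guest_mhz:.0f} MHz, host consumes ~{ratio:.0f}x that")
```

    With these inputs the guest figure works out to roughly 53 MHz, which matches the "approx. 50 MHz" reported, while the host consumes over 30 times as much.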



  • kenshirothefist,

    Forgot to ask:

    • Any other pfSense packages running?

    I assume you have seen the last post in this thread https://forum.pfsense.org/index.php?topic=41647.0.  Anything like that going on in your system?

    I should also say that I've never experienced this problem, even though I've run multiple 32 and 64-bit versions of pfSense on at least four different (HP) hardware platforms since ESXi 3.5 was released.



    • Any other pfSense packages running?

    Open-VM-Tools, OpenVPN, pfBlocker, remote logging … however, even if I disable all these packages, cpu host usage still high



    • What is the ESXi host machine and processor?
      Tried many builds of 5.1 and 5.5 with same result
      Supermicro X8SIL
      Intel(R) Xeon(R) CPU X3440 @ 2.53GHz (Lynnfield)

    • Which version of pfSense and whether 32 or 64-bit?
      Tried 2.1.1 x64, then tried 2.1.2-3 x86

    • How many vCPUs have you allocated to the VM?
      Tried 1, had to bump up to 2 because of this issue, 50 Mbit throughput = 70-80% of one physical core

    • How much memory have you allocated to the VM?
      Tried 512-2048 MB

    • Have you  installed the pfSense packaged VM tools or the VMware-supplied tools?
      Tried packaged tools in the past but have since read not to use them. Then tried VMware-supplied, no difference

    • Are you using the e1000 adapter type or something else?
      Tried both e1000 and vmxnet3 (w/VMware-supplied driver), no difference.

    Packages - Avahi, OpenVPN export util, Cron, RRD Summary.
    It also happens on fresh install.

    Just to be clear, you have to watch esxtop to see this issue, it doesn't show up in the guest.



  • @biggsy, any news regarding this topic?



  • I can't see anything common between these configs and haven't been able to reproduce it any way.  Only have one machine to play with now though.

    Have you guys checked that link about speed mismatch?



  • @biggsy:

    Have you guys checked that link about speed mismatch?

    I have auto-negotiate enabled and it negotiates at 1000 Full … Anyway, I have 20+ running VMs on this host and only this pfSense appliance is having these issues with high pCPU load, although pfSense is the only FreeBSD-based VM (the others are CentOS and Ubuntu based).



  • The worst I can do is about 93% CPU running a 120 Mbit/s download from AARNET (it's local).

    That's with a single vCPU on a Xeon E3-1265L v2 @ 2.5 GHz inside a Gen8 MicroServer.

    Idle the VM runs along at about 1.5% CPU  :-[



  • That's the thing, it has no business doing 93% of one core at 120Mbit, virtualization overhead should be minimal like with other OSes.

    I'm starting to think that people who "don't have" this problem aren't really seeing it.



  • Is it possible that this is related to VMware virtual machine monitor mode?

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1036775

    For example:

    datetime| vmx| MONITOR MODE: allowed modes : BT HV HWMMU
    datetime| vmx| MONITOR MODE: user requested modes : BT HV HWMMU
    datetime| vmx| MONITOR MODE: guestOS preferred modes: BT HWMMU HV
    datetime| vmx| MONITOR MODE: filtered list : BT HWMMU HV

    Where:

    allowed modes – Refers to the mode that the underlying hardware is capable of.
    user requested modes – Refers to the setting defined in the virtual machine configuration.
    guestOS preferred modes – Refers to the default values for the selected guest operating system.
    filtered list – Refers to the actual monitor modes acceptable for use by the Hypervisor, with the left-most mode being utilized.

    I have "automatic" for my pfsense VM and it reads like this:

    datetime| vmx| I120: MONITOR MODE: allowed modes          : BT32 HV HWMMU
    datetime| vmx| I120: MONITOR MODE: user requested modes  : BT32 HV HWMMU
    datetime| vmx| I120: MONITOR MODE: guestOS preferred modes: HWMMU HV BT32
    datetime| vmx| I120: MONITOR MODE: filtered list          : HWMMU HV BT32

    Therefore it is using hardware MMU and hardware instruction set virtualization … can't change it right now, but can someone test with different settings and post results?
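
    For anyone who wants to compare their own vmware.log against this, here is a minimal sketch that pulls the four MONITOR MODE lines out of a log excerpt and reports the left-most entry of the filtered list. The sample text is copied from the lines above; the regular expression and function name are illustrative assumptions, and other ESXi builds may format these lines differently.

```python
import re

# Excerpt copied from the log lines quoted above.
LOG = """\
datetime| vmx| I120: MONITOR MODE: allowed modes          : BT32 HV HWMMU
datetime| vmx| I120: MONITOR MODE: user requested modes  : BT32 HV HWMMU
datetime| vmx| I120: MONITOR MODE: guestOS preferred modes: HWMMU HV BT32
datetime| vmx| I120: MONITOR MODE: filtered list          : HWMMU HV BT32
"""

# Assumed layout: "MONITOR MODE: <label> : <space-separated modes>".
PATTERN = re.compile(r"MONITOR MODE:\s*(?P<key>[^:]+?)\s*:\s*(?P<modes>.+)$")


def parse_monitor_modes(text):
    """Map each MONITOR MODE label to its list of modes, in log order."""
    result = {}
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            result[match.group("key")] = match.group("modes").split()
    return result


modes = parse_monitor_modes(LOG)
active = modes["filtered list"][0]  # left-most mode is the one in use
print(active)
```

    Run against the excerpt, this reports HWMMU as the mode in use.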


    Also, is it possible that this is related to using a distributed vSwitch? Can someone test by using a regular vSwitch vs. a distributed vSwitch (again, I can't change my environment to a regular vSwitch right now)?



  • Distributed vSwitch is just an abstraction for many standard vSwitches.



  • FYI: there is even more overhead when you go from 1 vCPU to 2 vCPUs … I had 2.3 GHz CPU usage when my pfSense VM was configured with 1 vCPU (approaching the 2.67 GHz limit), then I reconfigured the VM to use 2 vCPUs and now I have 3.0 GHz CPU usage (probably from CPU thread thrashing) ... this is really annoying ... and I really don't want to go back to physical ...



  • I've been following this thread but didn't think it was affecting me.  Then I took a look.  pfSense tells me it's using 2% CPU.  VMware tells me it's using almost nothing.  ESXTop tells me 20%…

    My config:

    Dell Powervault NX3000 (8 x L5520 @ 2.26 GHz)
    ESXi 5.5U1
    pfSense 2.1.3 i386
    2 x vCPU, 2GB RAM
    VM version 8 hardware
    Intel E1000 vNICs



  • I have 2 pfSense VMs running (one 2.1.3 [143 MHz used] and the other 2.2-Alpha [95 MHz used]) and both are running with very low CPU, as reported by both ESX and pfSense.
    Are you guys running powerd? I am.

    Edit:
    Meant to add my config:
    ASUS Server
    AMD Opteron 12 core processor
    Intel NICs on 2.1.3
    VMXNET3 on 2.2
    1024MB of memory
    VMTools package is installed
    No other packages running.



  • KOM and podilarius,

    Under what sort of network load?

    My CPU numbers are also low - until I start a heavy (120 Mb/s) download then they diverge very quickly.  The ESXi/esxtop numbers go through the roof while pfSense sees little change.



  • I pumped up a few torrents to try and saturate my 90/90 link and could only manage about 10Mb/s.  Even then, ESXTop showed pfsense taking from 50-103% of %USED.

    I don't use powerd.



  • I am about to load test my 2.2 so I will let you know.



  • @podilarius:

    I am about to load test my 2.2 so I will let you know.

    Okay, so pushing 500 Mbit in and out of pfSense 2.2, I am getting 5987 MHz, and ESXTOP is showing 225-266% CPU usage for just pfSense 2.2. I hope that is out of 1000%.
    The VM itself is showing 89-91% usage. I have an AMD 6234 with 12 cores at 2.4 GHz per core.
    As you all know, on the 2.1 series and earlier, pf and, I think, ipfw are Giant-locked and will only use 1 processor.
    To me it seems that 2.2 is Giant-locked to 2 CPUs. This could be because I have 2 NICs involved, so I am not sure if it is one CPU per NIC or just locked to dual CPUs. I have asked in another thread with no answer.
    I am trying to get to 10GbE speed, but I seem locked to 1GbE; alas, that is another issue for another topic.
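
    As a cross-check of those figures, here is a small sketch of the arithmetic. It assumes esxtop's percentage is expressed relative to one physical core (100% = one core), which is an interpretation and not something confirmed in this thread; the core count and clock speed are the ones quoted above, and the function name is made up for illustration.

```python
def cores_equivalent(consumed_mhz, core_mhz):
    """Number of physical cores' worth of cycles that consumed_mhz represents."""
    return consumed_mhz / core_mhz


CONSUMED_MHZ = 5987   # reported by vSphere for the pfSense 2.2 VM
CORE_MHZ = 2400       # per-core clock quoted above
HOST_CORES = 12       # core count quoted above

cores = cores_equivalent(CONSUMED_MHZ, CORE_MHZ)
pct_of_one_core = cores * 100
pct_of_host = CONSUMED_MHZ / (HOST_CORES * CORE_MHZ) * 100

print(f"{cores:.2f} cores, {pct_of_one_core:.0f}% of one core, "
      f"{pct_of_host:.1f}% of the whole host")
```

    Under that assumption, 5987 MHz is about 2.5 cores' worth of cycles, or roughly 249% of one core, which sits inside the 225-266% range esxtop showed. Against all 12 cores it is about 21% of the host.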



  • Hi,
    is this problem solved or still present?

    I want to move my pfSense to ESXi when purchasing new hardware, but now I'm not really sure about it.



  • I seem to have the same problem. 1.5 GHz in ESX, ~14% on pfSense, about 1 Mbit (!) of traffic… :(
    It happens to both VMs of an HA pair. Using Intel NICs, E1000 vNIC



    • What is the ESXi host machine and processor?  Supermicro X8DTU / E5620
    • Which version of pfSense and whether 32 or 64-bit?  64-bit
    • How many vCPUs have you allocated to the VM?  1
    • How much memory have you allocated to the VM?  1 GB
    • Have you  installed the pfSense packaged VM tools or the VMware-supplied tools? Open-VM-Tools
    • Are you using the e1000 adapter type or something else? E1000


  • I have the same problem with pfsense 2.1.5 running on KVM on Ubuntu Server 14.04.

    See attached screenshot with pfsense running top on the right and the host machine running the VMs on the left.

    • What is the ESXi host machine and processor?  Thinkserver TS140 / Intel Xeon CPU E3-1225 v3 @ 3.20GHz
    • Which version of pfSense and whether 32 or 64-bit?  pfsense 2.1.5 - 32-bit
    • How many vCPUs have you allocated to the VM?  4
    • How much memory have you allocated to the VM?  2 GB
    • Have you  installed the pfSense packaged VM tools or the VMware-supplied tools? No
    • Are you using the e1000 adapter type or something else? E1000




  • I have the same issue running pfsense 2.1.5 within a proxmox (kvm) virtualization.

    • What is the ESXi host machine and processor?  PCEngines APU (AMD G-T40E, 2*1GHZ, 4GB RAM), running proxmox 3.2 under debian 7
    • Which version of pfSense and whether 32 or 64-bit?  pfsense 2.1.5 - 64-bit
    • How many vCPUs have you allocated to the VM?  2
    • How much memory have you allocated to the VM?  2 GB
    • Are you using the e1000 adapter type or something else? Tested all kinds of virtual NICs, including virtio

    pfSense idle: while pfSense reports less than 10% on both CPUs, the host sees about 50-60% on both cores.
    pfSense busy: while pfSense reports about 30% on both CPUs, the host sees about 70-80% on both cores. Throughput is limited to approx. 80 Mbit/s.
    Other guests like a Debian installation consume only 1-2 % of host CPU during idle state.

    I also tried the latest 2.2 snapshot. CPU consumption decreased to 20-30%, which is still too much but much better than 2.1.5. However, throughput was limited to ~40 Mbit/s, so this is not an option since my internet connection is 100 Mbit/s.

    Another issue is that I have to emulate the CPU as a qemu64 CPU, because using "host" causes pfSense to crash during bootup (other guests are fine with the "host" option). I also had to turn off all kinds of checksum offloading to reach these throughputs. With checksum offloading enabled, the throughput is less than 1 Mbit/s.

    I have no packages installed.



  • Just to help.

    • What is the ESXi host machine and processor?  Supermicro H8DCL / AMD 4386 / ESX 5.5.0 2068190
    • Which version of pfSense and whether 32 or 64-bit?  32-bit - 2.1.5 (no tweak)
    • How many vCPUs have you allocated to the VM?  1
    • How much memory have you allocated to the VM?  1 GB
    • Have you  installed the pfSense packaged VM tools or the VMware-supplied tools? Open-VM-Tools
    • Are you using the e1000 adapter type or something else? E1000

    Idle time: 371 MHz / 10% in the VMware Performance tab / 0% in the pfSense dashboard
    High load (download at full speed): 5857 MHz / 100% in the VMware Performance tab / 98% in the pfSense dashboard



  • More to help

    • I have the same issue on several VMs on our ESXi 5.5 cluster.
    • 100% CPU with only 3-5 Mbit of traffic.
    • I also have a problem with OpenVPN: under heavy load (>200 Mbit) the IP stack crashes and I get DUP! ICMP pings.
    • I have now installed a new pfSense 2.1.5 with a clean config and E1000 NICs.
    • I will switch over to the new VM this evening.
    • All VMs have 2 GB RAM and 1 vCPU.
    • The new one is our border router with BGP, which should route 1000 Mbit.
    • I will report the results next week.
    • We have ESXi 5.5 U1.

    Regards, Alexander



  • so here the results of my test

    The current pfSense 2.1.5 definitely has a bug under VMware 5.5 U1.

    In one test the failure occurred 2 minutes after the restart; I think the reason was the high traffic load (400 Mbit).

    After some time something crashes and I get DUP! replies when I ping.

    Throughput is capped at about 100 Mbit on each interface.

    I tested with 8 cores, then with 4 cores. My VM has 4 NICs, all Intel E1000.

    here the ping:

    PING 193.84.xxx.xxx (193.84.178.161) 56(84) bytes of data.
    64 bytes from 193.84.xxx.xxx: icmp_seq=1 ttl=64 time=8.10 ms
    64 bytes from 193.84.xxx.xxx: icmp_seq=1 ttl=64 time=8.10 ms (DUP!)
    64 bytes from 193.84.xxx.xxx: icmp_seq=2 ttl=64 time=8.38 ms
    64 bytes from 193.84.xxx.xxx: icmp_seq=2 ttl=64 time=8.38 ms (DUP!)
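
    To quantify how often this happens over a longer run, a saved ping log can be scanned for the (DUP!) marker. Below is a minimal sketch using the four reply lines above as sample input; the function name is made up for illustration, and it assumes the ping implementation appends "(DUP!)" to duplicate replies, as it does in this output.

```python
# Sample input: the reply lines quoted above (addresses already masked).
PING_OUTPUT = """\
64 bytes from 193.84.xxx.xxx: icmp_seq=1 ttl=64 time=8.10 ms
64 bytes from 193.84.xxx.xxx: icmp_seq=1 ttl=64 time=8.10 ms (DUP!)
64 bytes from 193.84.xxx.xxx: icmp_seq=2 ttl=64 time=8.38 ms
64 bytes from 193.84.xxx.xxx: icmp_seq=2 ttl=64 time=8.38 ms (DUP!)
"""


def duplicate_stats(text):
    """Return (unique replies, duplicate replies) found in ping output."""
    replies = 0
    dups = 0
    for line in text.splitlines():
        if "icmp_seq=" not in line:
            continue  # skip the header and summary lines
        replies += 1
        if line.rstrip().endswith("(DUP!)"):
            dups += 1
    return replies - dups, dups


unique, dups = duplicate_stats(PING_OUTPUT)
print(f"{unique} unique replies, {dups} duplicates")
```

    In this sample every reply is duplicated exactly once (2 unique, 2 duplicates).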

    After a reboot of the pfSense VM everything is OK.

    Now, after a day of tests, I have news.

    I tested several configurations with 8, 6 and 4 cores, but only 2 cores is stable.

    Now, with 1 socket and 2 cores, no error has occurred for 6 hours.

    The performance is poor but there are no errors. The CPU is constantly at 70%.

    last pid: 12575;  load averages:  0.10,  0.17,  0.19  up 0+06:28:28    16:14:20
    98 processes:  3 running, 78 sleeping, 17 waiting

    Mem: 64M Active, 25M Inact, 111M Wired, 408K Cache, 24M Buf, 7698M Free
    Swap: 2048M Total, 2048M Free

      PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
       11 root      171 ki31     0K    32K CPU0    0 351:45 82.96% [idle{idle: cpu0}]
       11 root      171 ki31     0K    32K RUN     1 343:38 73.00% [idle{idle: cpu1}]
        0 root      -68    0     0K   224K -       1  40:29 13.96% [kernel{em0 taskq}]
        0 root      -68    0     0K   224K -       0  30:51 10.99% [kernel{em4 taskq}]
       12 root      -32    -     0K   272K WAIT    0   0:02  1.95% [intr{swi4: clock}]
       12 root      -32    -     0K   272K WAIT    0   0:58  0.00% [intr{swi4: clock}]
       14 root      -16    -     0K    16K -       1   0:37  0.00% [yarrow]
        0 root      -16    0     0K   224K sched   1   0:36  0.00% [kernel{swapper}]
        0 root      -68    0     0K   224K -       1   0:24  0.00% [kernel{em3 taskq}]
      256 root       76   20  6908K  1380K kqread  1   0:17  0.00% /usr/local/sbin/check_reload_status
    79750 root       44    0 59596K  6756K select  1   0:06  0.00% /usr/local/bin/vmtoolsd -c /usr/local/share
    20203 root       44    0 24232K  5420K kqread  0   0:06  0.00% /usr/local/sbin/lighttpd -f /var/etc/lighty
    15152 root       44    0  5784K  1464K select  0   0:05  0.00% /usr/local/sbin/apinger -c /var/etc/apinger
        0 root      -68    0     0K   224K -       0   0:05  0.00% [kernel{em1 taskq}]
    86756 root       52    0   150M 38940K piperd  1   0:05  0.00% /usr/local/bin/php
    79831 root       44    0   146M 33480K accept  1   0:04  0.00% /usr/local/bin/php
       16 root      -16    -     0K    16K pftm    1   0:02  0.00% [pfpurge]
    53737 root       44    0  6960K  1652K select  1   0:02  0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/va

    Really good shit. If I can't find a solution, I'll have to say bye bye pfSense :(

