Netgate Discussion Forum

    VMWare Pentest lab: Extremely high CPU on host

    Scheduled Pinned Locked Moved Virtualization
    85 Posts 29 Posters 70.8k Views
      Fmstrat

      @tommyboy180:

      It looks like you do have the multiprocessor kernel installed. Is your CPU still spiking for a long period of time?

      Yes, it's very repeatable. All I need to do is fire up a few downloads or uploads, run a portscan, or do anything else that makes a lot of connections, and CPU on the host OS shoots up while the guest OS (pfSense) CPU appears low.

      Thanks.

        NetJunkie

        I came here looking for a solution to the same problem. I'm running pfSense 2.0 under VMware vSphere 5.0. At idle it's fine, but under load the CPU use shown by vCenter spikes way up. Inside the guest (pfSense) the load is almost nothing, maybe 5% CPU, and the load average is well under 1. In vCenter it'll be 80% - 90% of a single vCPU; two vCPUs cut that in half, and four cut it to a quarter. I've switched network cards from e1000 to VMXNET to VMXNET2 with the same results.
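The scaling with vCPU count described here is consistent with vCenter showing usage as a share of the VM's total vCPU capacity. A minimal sketch of that arithmetic, assuming the workload keeps roughly 0.85 of one core busy regardless of vCPU count (the function name and the 0.85 figure are illustrative, not measurements from this thread):

```python
def vm_cpu_percent(busy_core_equivalents, vcpus):
    """Usage as a percentage of the VM's total vCPU capacity."""
    if vcpus < 1:
        raise ValueError("a VM needs at least one vCPU")
    return 100.0 * busy_core_equivalents / vcpus


# Assumed load: about 85% of one core, the midpoint of the 80% - 90% range.
for n in (1, 2, 4):
    print(f"{n} vCPU(s): {vm_cpu_percent(0.85, n):.2f}%")
```

If that assumption holds, adding vCPUs lowers the displayed percentage without lowering the actual work done on the host.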

          RootWyrm

          Confirmed with NetJunkie via Twitter; I'm also seeing unusually high CPU utilization even at low loads with 2.0-RELEASE. Averaging >10% in esxtop at <100KB/s, with systat -vmstat disagreeing vehemently: <3% total CPU utilization.
          I thought it was pf itself not reporting or under-reporting CPU, but it's not. I'm on ESXi 4.1U1, 2 vCPU, 1GB, with decently large reservation. I'm not seeing exceptionally high INTR loading either; it's more or less exactly where I'd expect it with em(4)'s. I switched to POLLING, gave it a swift reboot to the rear, and relative CPU utilization is MUCH worse than expected - 50% system reported by systat, and ESXi reporting one core at 80%, one at 75%, one at 70% and one at 20% - constant on both. Never below 50%. This is at <20KB/s of traffic, as well.

          Something is definitely broken here.

          EDIT: How weirdly broken? Try this interesting setup: two em(4) interfaces, enable POLLING, reboot. CPU utilization is insane, no? Now, disable POLLING, apply but do not reboot. Suddenly, the CPU utilization appears to be much, much better. The difference here was narrowed to pfSense reporting <1% and ESXi reporting <4%.

            tester_02

            Running 2.0 release (64-bit) on VMware Server. No CPU load issue.
            Squid/squidguard/snort installed and 2 NICs.

              sullrich

              From a shell, post the output of:

              top -SH

                billm

                I'm not seeing this on my ESXi 4.1.0 install with pfSense 2.1-development (upgraded right after the v6 branch was merged in, so this is 2.0 w/ v6). The VM is configured as FreeBSD 64-bit, running the AMD64 release of pfSense. I handed off a single CPU to pfSense but am running an SMP kernel. I ran 8 Mbit of small frames through the firewall and only saw host CPU usage a hair over what pfSense reported (25% in guest, 30% of one core in host).

                Are you running the open-vm-tools package?  Also, paste the output of

                sysctl kern.timecounter.choice kern.timecounter.hardware kern.hz

                Thanks

                –Bill

                pfSense core developer
                blog - http://www.ucsecurity.com/
                twitter - billmarquette

                  RootWyrm

                  @sullrich:

                  From a shell, post the output of:

                  top -SH

                  last pid: 11421;  load averages:  0.07,  0.03,  0.01    up 0+22:57:12  15:50:45
                  96 processes:  3 running, 77 sleeping, 16 waiting
                  CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                  Mem: 40M Active, 73M Inact, 131M Wired, 88K Cache, 110M Buf, 740M Free
                  Swap: 4096M Total, 4096M Free

                  PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
                    11 root    171 ki31    0K    16K CPU0    0  22.7H 100.00% {idle: cpu0}
                    11 root    171 ki31    0K    16K RUN    1  22.6H 100.00% {idle: cpu1}
                      0 root      76    0    0K    64K sched  1 1045.8  0.00% {swapper}
                    21 root      76 ki-6    0K    8K pollid  1  14:27  0.00% idlepoll
                    12 root    -44    -    0K  128K WAIT    0  8:07  0.00% {swi1: netisr 0}
                    12 root    -32    -    0K  128K WAIT    0  3:50  0.00% {swi4: clock}
                    12 root    -32    -    0K  128K WAIT    1  0:37  0.00% {swi4: clock}
                  31198 root      64  20  4524K  3032K bpf    1  0:30  0.00% arpwatch
                    14 root    -16    -    0K    8K -      1  0:20  0.00% yarrow
                  22066 root      44    0  4948K  2520K select  1  0:18  0.00% syslogd
                  32469 nobody    64  20  3572K  2344K select  0  0:16  0.00% darkstat
                  13799 root      64  20  3316K  1348K select  1  0:15  0.00% apinger
                  21140 root      76  20  3656K  1508K wait    0  0:13  0.00% sh
                  53332 root      44    0 26140K  5012K select  1  0:12  0.00% vmtoolsd
                  23900 root      44    0  3316K  924K piperd  0  0:09  0.00% logger
                  23696 root      44    0  6936K  3708K bpf    1  0:06  0.00% tcpdump
                  27742 root      44    0  3352K  1352K select  1  0:05  0.00% miniupnpd

                  Looks pretty normal, right? Right. So here's the interesting part.

                  2 users    Load  0.01  0.02  0.00                  Oct 14 15:52

                  Mem:KB    REAL            VIRTUAL                      VN PAGER  SWAP PAGER
                          Tot  Share      Tot    Share    Free          in  out    in  out
                  Act  57064  20596  298256    56456  757432  count
                  All  83272  25284  3511012    76448          pages
                  Proc:                                                            Interrupts
                    r  p  d  s  w  Csw  Trp  Sys  Int  Sof  Flt        cow    800 total
                              43      496    4  256      3133            zfod        atkbd0 1
                                                                            ozfod      fdc0 irq6
                  0.1%Sys  0.2%Intr  0.0%User  0.0%Nice 99.8%Idle        %ozfod      ata1 irq15
                  |    |    |    |    |    |    |    |    |    |    |      daefr      mpt0 irq17
                                                                            prcfr  400 cpu0: time
                                                          28 dtbuf          totfr  400 cpu1: time
                  Namei    Name-cache  Dir-cache    69211 desvn          react
                    Calls    hits  %    hits  %      835 numvn          pdwak
                        7      7 100                    65 frevn          pdpgs
                                                                            intrn
                  Disks  da0  md0 pass0                            134544 wire
                  KB/t  16.00  0.00  0.00                            41080 act
                  tps      0    0    0                            75208 inact
                  MB/s  0.00  0.00  0.00                                92 cache
                  %busy    0    0    0                            757340 free

                  Notice something missing? Yup. This is with polling disabled by the checkbox.

                  em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                          options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
                  em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                          options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>

                  Notice a problem here? Yes. POLLING is still enabled. Checkbox in pfSense is UNCHECKED, but POLLING is on. Here's what happens when you check that POLLING box again.

                  em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                          options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
                  em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                          options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>

                  last pid: 29327;  load averages:  0.87,  0.31,  0.12    up 0+23:07:15  16:00:48
                  96 processes:  4 running, 76 sleeping, 16 waiting
                  CPU:  0.0% user,  0.0% nice, 49.7% system,  0.0% interrupt, 50.3% idle
                  Mem: 40M Active, 76M Inact, 130M Wired, 92K Cache, 110M Buf, 738M Free
                  Swap: 4096M Total, 4096M Free

                  PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
                    21 root    171 ki-6    0K    8K CPU1    1  16:15 98.97% idlepoll
                    11 root    171 ki31    0K    16K RUN    0  22.8H 96.97% {idle: cpu0}
                    11 root    171 ki31    0K    16K RUN    1  22.7H  9.96% {idle: cpu1}
                      0 root      76    0    0K    64K sched  1 1045.8  0.00% {swapper}
                    12 root    -44    -    0K  128K WAIT    0  8:09  0.00% {swi1: netisr 0}
                    12 root    -32    -    0K  128K WAIT    0  3:51  0.00% {swi4: clock}
                    12 root    -32    -    0K  128K WAIT    1  0:37  0.00% {swi4: clock}
                  31198 root      64  20  4524K  3032K bpf    0  0:31  0.00% arpwatch
                    14 root    -16    -    0K    8K -      0  0:20  0.00% yarrow
                  22066 root      44    0  4948K  2520K select  0  0:18  0.00% syslogd
                  32469 nobody    64  20  3572K  2344K select  0  0:16  0.00% darkstat
                  13799 root      64  20  3316K  1348K select  0  0:15  0.00% apinger
                  21140 root      76  20  3656K  1508K wait    1  0:13  0.00% sh
                  53332 root      44    0 26140K  5012K select  0  0:12  0.00% vmtoolsd
                  23900 root      44    0  3316K  924K piperd  0  0:09  0.00% logger
                  23696 root      44    0  6936K  3708K bpf    0  0:06  0.00% tcpdump
                  27742 root      44    0  3352K  1352K select  0  0:05  0.00% miniupnpd

                  2 users    Load  0.96  0.45  0.18                  Oct 14 16:01

                  Mem:KB    REAL            VIRTUAL                      VN PAGER  SWAP PAGER
                          Tot  Share      Tot    Share    Free          in  out    in  out
                  Act  57188  20676  298332    56460  755756  count
                  All  83408  25364  3511088    76452          pages
                  Proc:                                                            Interrupts
                    r  p  d  s  w  Csw  Trp  Sys  Int  Sof  Flt        cow    805 total
                    1          42        3M    7  259    5 3107    3      3 zfod        atkbd0 1
                                                                            ozfod      fdc0 irq6
                  50.0%Sys  0.0%Intr  0.0%User  0.0%Nice 50.0%Idle        %ozfod      ata1 irq15
                  |    |    |    |    |    |    |    |    |    |    |      daefr    5 mpt0 irq17
                  =========================                                prcfr  400 cpu0: time
                                                          8 dtbuf        2 totfr  400 cpu1: time
                  Namei    Name-cache  Dir-cache    69211 desvn          react
                    Calls    hits  %    hits  %      890 numvn          pdwak
                        11      11 100                    65 frevn          pdpgs
                                                                            intrn
                  Disks  da0  md0 pass0                            133376 wire
                  KB/t  17.19  0.00  0.00                            41368 act
                  tps      5    0    0                            77764 inact
                  MB/s  0.09  0.00  0.00                                92 cache
                  %busy    1    0    0                            755664 free

                  8:01:24pm up 78 days  3:19, 200 worlds; CPU load average: 0.29, 0.16, 0.08
                  PCPU USED(%): 3.5 3.0  22  14  69 1.2 2.5 4.1 AVG:  15
                  PCPU UTIL(%): 3.6 3.2  22  12  66 1.2 2.4 2.7 AVG:  14
                  CORE UTIL(%): 6.7      34      67    5.0    AVG:  28

                  ID    GID NAME            NWLD  %USED    %RUN    %SYS  %WAIT    %RDY
                        1      1 idle                8  273.89  800.00    0.00    0.00  800.00
                  1537396 1537396 earthmother - p    5  102.58  97.93    0.04  380.97    0.07

                  This is with OpenVM Tools 313025. Timecounter looks like this:
                  kern.timecounter.choice: TSC(-100) ACPI-safe(850) i8254(0) dummy(-1000000)
                  kern.timecounter.hardware: ACPI-safe
                  kern.hz: 100

                  Pretty much exactly as expected; all other FreeBSD guests are exactly the same. ACPI-safe over TSC, no stepwarnings, and frequency 3579545 - no exceptions on any of them. (They're all 8.1-RELEASE currently.) This is on 32-bit, too, I forgot to mention.
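The mismatch between the unchecked POLLING checkbox and what the interfaces report can be checked mechanically. The following is a minimal sketch, assuming `ifconfig` output in the FreeBSD format pasted in this post; the function name `polling_interfaces` and the `SAMPLE` text are illustrative, not part of pfSense.

```python
import re


def polling_interfaces(ifconfig_text):
    """Return names of interfaces whose options list includes POLLING."""
    found = []
    current = None
    for line in ifconfig_text.splitlines():
        # An interface block starts with "<name>: flags=..."
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # The options line looks like "options=db<RXCSUM,...,POLLING,...>"
        opts = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", line)
        if opts and current and "POLLING" in opts.group(1).split(","):
            found.append(current)
    return found


# Sample text modeled on the output quoted in this post.
SAMPLE = """\
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
"""

print(polling_interfaces(SAMPLE))  # ['em0', 'em1']
```

On a live guest the same function could be fed the text captured from running `ifconfig`, so the result can be compared against the state of the checkbox.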

                    timotl

                    I have been seeing this under ESXi 5 also.
                    I installed the vendor supplied tools and am using a single trunked E1000.

                    After trying all of the NIC settings, I happened to disable powerd and the CPU usage went down by more than half.
                    Can anyone else confirm this?

                    -timotl

                      loftyDan

                      I too have this issue, using ESXi 5 (and previously on 4.1).  Changing the powerd settings did not resolve the issue for me.  I've tried 2.0 i386, my primary config, 2.1 i386 and 2.1 AMD64.  For both dev builds I tried with my config backup, and a clean install, and the results were always the same.  pfSense reports 16-20% CPU load, while ESXi reports a 62% load (on a Xeon X3440 @ 2.53GHz).  This is with a download speed of about 3.6 MB/sec (29 Mb/sec).  In every case Open-VM-Tools has been installed and I've been using the E1000 NIC.  Speeds directly connected to the modem yield 31 Mb/sec.
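As a quick check of the figures in this post, here is a small sketch of the unit conversion and of the host-to-guest CPU ratio. The helper names are illustrative; the inputs are the numbers quoted above.

```python
def mbytes_to_mbits(mbytes_per_sec):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mbytes_per_sec * 8


def host_to_guest_ratio(host_percent, guest_percent):
    """How many times higher the host-reported CPU is than the guest-reported CPU."""
    return host_percent / guest_percent


print(round(mbytes_to_mbits(3.6), 1))  # 28.8, i.e. about 29 Mb/sec

# ESXi reports 62% while pfSense reports 16-20%.
for guest in (16, 20):
    print(guest, round(host_to_guest_ratio(62, guest), 1))
```

With these numbers the host-side figure is roughly three to four times what the guest reports.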

                      If there is anything else I can test, or any more information I can provide, please let me know.  I'd love for this problem to get resolved.

                        Veni

                        @loftyDan:

                        I too have this issue, using ESXi 5 (and previously on 4.1).  Changing the powerd settings did not resolve the issue for me.  I've tried 2.0 i386, my primary config, 2.1 i386 and 2.1 AMD64.  For both dev builds I tried with my config backup, and a clean install, and the results were always the same.  pfSense reports 16-20% CPU load, while ESXi reports a 62% load[…]

                        I'm seeing the same thing, but on a single X5650 @ 2.67 GHz.

                        If I try to limit the CPU usage from the vSphere client, I don't get the performance I'm after (approx. 150 Mbps). Instead I get around 20-22 Mbps.
                        So it sounds as if the usage is real somehow; otherwise, why would I see performance issues when giving pfSense a maximum of 1-1.5 GHz?

                          kkrauth

                          Just to chime in on this thread, as I'm seeing the same issues. I'm running the following release:
                          [2.0.1-RELEASE][root@pfSense.localdomain]/root(7): uname -a
                          FreeBSD pfSense.localdomain 8.1-RELEASE-p6 FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 18:15:35 EST 2011    root@FreeBSD_8.0_pfSense_2.0-AMD64.snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8  amd64

                          within ESXi 5. I installed open-vm-tools and VMware's provided drivers for the VMXNET3 adapter. Both internal/external NICs are running with the VMXNET3 driver. The problem was exactly the same using E1000 drivers.

                          The attached screenshot shows what is happening when the network is pretty much idle. During load, this spikes up even higher, even though pfSense top reports almost no usage whatsoever. I tried both with powerd turned on and off.

                          pfsense.png

                            marsboer

                            Same issue on a fresh pfSense 2.0.1 install running on KVM (Proxmox VE) with the SMP kernel. With only a couple of Mbit/s of traffic, the CPU usage increases massively on the physical host (above 50%), running on a single virtual CPU and 512 MB RAM.

                            pfSense does not support virtio (the paravirtualized devices for KVM), so I thought using emulated NICs was the main reason for the bad CPU performance even under light load, but now I am starting to think that this may be a more generic problem with pfSense in virtualized setups.

                              clayton_ross

                              I too am having the same problem: pfSense 2.0 64-bit, ESXi 5.0, 2 cores, 2 NICs, vmtools.

                                iFloris

                                Like most others on this thread, I too have run into this problem.
                                Something that is not clear to me is whether using e1000 is the source of the increased CPU usage on ESX.
                                And if that is the case, does switching to another adapter, such as flexible or vmxnet 2/3, help in reducing load for any of you?

                                one layer of information
                                removed

                                  kkrauth

                                  @iFloris:

                                  Like most others on this thread, I too have run into this problem.
                                  Something that is not clear to me is whether using e1000 is the source of the increased CPU usage on ESX.
                                  And if that is the case, does switching to another adapter, such as flexible or vmxnet 2/3, help in reducing load for any of you?

                                  I tried all three virtual adapters and the behaviour was the same.

                                    Mattofsweden

                                    I'm seeing the same issues here on a DELL PowerEdge R310 Quad Core Xeon:
                                    Using ESXi 4.1 and pfSense 2.0, 2.0.1, old-2.1-dev in i386/amd64 flavors
                                    Using ESXi 5.0 and pfSense 2.0.1 and 2.1-dev in i386/amd64 flavors from feb/march/april.

                                    Same results on other host hardware also (Two DELL Servers with virtualized environment at home for testing purposes.)

                                    Have not tried the VMXNET due to others not seeing any performance gain, only been using virtualized E1000 so far.

                                    What I'm using a lot is VLANs, which might be a contributing culprit for some of us? Assigning VLANs directly in switch configuration in vSphere, or natively in pfSense has had "largely" the same results.

                                    I absolutely love pfSense, now that I've got the hang of it, and have deployed quite a few in different scenarios over the past few months. But, not to sound negative here, there has got to be something we can do about these high loads in virtualized environments. I had to switch over to bare metal, on slightly aged HW, on our lab network, which is a bit unsatisfying. I lose a bit of my redundancy (if one VM or host fails, just fire up the copy or use HA Sync).

                                    I suppose it's an underlying FreeBSD issue?
                                    I don't really know how to set up something similar in any of the *BSD flavors, and honestly can't find the time to learn currently, but surely one of you guys could test a simple routing setup using FreeBSD/OpenBSD/NetBSD and see if there's the same performance issue? (Maybe with/without VLAN incl. trunking/non-native.)

                                    Regards,
                                    Mattias

                                    IT Teacher & Networking Consultant

                                      goodspeedal

                                      Just to let you know, here is one more case for reference.

                                      I have tested pfSense with the following spec:

                                      1. DELL 9200
                                         i) Built-in 82566DC Gigabit LAN card
                                         ii) 3 Intel 82541 Gigabit LAN cards
                                      2. VM under ESXi-5-U1
                                         i) Setting: only one pfSense VM, configured as FreeBSD 64-bit
                                         ii) 2 e1000 virtual LAN cards
                                         iii) 1 vCPU, 1024MB RAM
                                      3. pfSense with the "Open VM Tools package" and "Snort" installed
                                         i) One LAN card assigned to each interface (WAN, LAN)

                                      Result:
                                      I have only started the machine to test its stability and have not even used it. It freezes after a day. The freeze is only at the VM level and does not affect ESXi.

                                      Please let me know if you need any information about my settings as well. This is only a test machine; I want to put pfSense on the DELL R610 later, but the migration is on hold at the moment. Thanks for any fix for the issue.

                                        dLockers

                                        Have you tried enabling VT-d and passing the Intel NICs directly to pfSense?

                                          Supermule Banned

                                          So you want to risk a frontend firewall having direct contact with the physical NICs on the server?

                                          Uninstall the vmtools package and reboot. See if it solves the issue…

                                          1.2.3 doesn't have any of this at all. It runs at about 3% on the physical server.

                                            goodspeedal

                                            @dLockers:

                                            Have you tried enabling vt-d and passing the intel nics directly to pfsense?

                                            I just checked: the test system (DELL 9200) does not support passthrough, even though VT-d is enabled on the motherboard.
                                            But why can't I just use 2 virtual LAN cards, connect each of them to a separate vSwitch, and let the other 2 real LAN cards connect to the vSwitches?
                                            It would be the same setup as the one you suggested.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.