Netgate Discussion Forum

VMWare Pentest lab: Extremely high CPU on host

Scheduled Pinned Locked Moved Virtualization
85 Posts 29 Posters 70.6k Views
  • F
    Fmstrat
    last edited by Oct 1, 2011, 8:39 PM

    Hi all,

    I'm having an odd issue with a pfSense 2.0 install in a pentest lab. I'm running VMware Workstation 7 on an i7 920 with 16GB of RAM, and the pfSense instance has plenty of RAM and access to two of the processors.

    The VM session has two network interfaces, one bridged to the local network through the on-board 1000Mb NIC, the other bridged to what we'll call the "external" network through a PCI 100Mb NIC.

    There is another VM instance running Backtrack that is bridged to the local network through the on-board 1000Mb NIC as well. When running nmap on the Backtrack instance against another machine on the "external" network (which is really just a remote lab machine), the pfSense instance, which is routing the traffic, spikes the host CPU up to 125% CPU usage (1.25 processors). The Backtrack instance is barely using any CPU at this point. CPU usage INSIDE the pfSense VM is low, perhaps 10%.

    I've also noticed high utilization, around 80% CPU, just from running a number of concurrent downloads from any hardware or virtual machine I route through the pfSense VM.

    Any ideas?

    Thanks,
    B.

    • T
      tommyboy180
      last edited by Oct 1, 2011, 11:52 PM

      My first reaction would be to run top and see what is actually using that CPU. Did you select the multiprocessing kernel at install?

      -Tom Schaefer
      SuperMicro 1U 2X Intel pro/1000 Dual Core Intel 2.2 Ghz - 2 Gig RAM

      Please support pfBlocker | File Browser | Strikeback

      • F
        Fmstrat
        last edited by Oct 12, 2011, 8:27 PM

        @tommyboy180:

        My first reaction would be to run top and see what is actually using that CPU. Did you select the multiprocessing kernel at install?

        top on the client shows almost no CPU use (2%). You bring up a good point: I used the pre-created VMware image, and I'm not sure its default kernel supports multiprocessing.

        Output of uname -a is: FreeBSD pfsense.coronium 8.1-RELEASE-p4 FreeBSD 8.1-RELEASE-p4 #0: Tue Jun 21 16:48:23 EDT 2011    sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.8  i386

        If this isn't what I should have, is it possible to change the kernel post install?

        • T
          tommyboy180
          last edited by Oct 12, 2011, 8:40 PM

          It looks like you do have the multiprocessor kernel installed. Is your CPU still spiking for a long period of time?

          -Tom Schaefer
          SuperMicro 1U 2X Intel pro/1000 Dual Core Intel 2.2 Ghz - 2 Gig RAM

          Please support pfBlocker | File Browser | Strikeback

          • F
            Fmstrat
            last edited by Oct 12, 2011, 8:57 PM

            @tommyboy180:

            It looks like you do have the multiprocessor kernel installed. Is your CPU still spiking for a long period of time?

            Yes, it's very repeatable. All I need to do is fire up a few downloads or uploads, or run a portscan or anything else that makes a lot of connections, and CPU on the host OS shoots up while the guest OS (pfSense) CPU appears low.

            Thanks.

            • N
              NetJunkie
              last edited by Oct 13, 2011, 3:08 AM

              I came here looking for a solution to the same problem. I'm running pfSense 2.0 under VMware vSphere 5.0. At idle it's fine, but under load the CPU use shown by vCenter spikes way up. Inside the guest (pfSense) the load is almost nothing, maybe 5% on CPU. Load average is well under 1. In vCenter it'll be 80% - 90% of a single vCPU; two vCPUs cut that in half, four to a quarter. I've switched network cards from e1000 to VMXNET to VMXNET2 with the same results.

              • R
                RootWyrm
                last edited by Oct 13, 2011, 9:15 PM Oct 13, 2011, 9:02 PM

                Confirmed with NetJunkie via Twitter; I'm also seeing unusually high CPU utilization even at low loads with 2.0-RELEASE. It averages >10% in esxtop at <100KB/s, while systat -vmstat disagrees vehemently: <3% total CPU utilization.
                I thought it was pf itself not reporting or under-reporting CPU, but it's not. I'm on ESXi 4.1U1, 2 vCPU, 1GB, with decently large reservation. I'm not seeing exceptionally high INTR loading either; it's more or less exactly where I'd expect it with em(4)'s. I switched to POLLING, gave it a swift reboot to the rear, and relative CPU utilization is MUCH worse than expected - 50% system reported by systat, and ESXi reporting one core at 80%, one at 75%, one at 70% and one at 20% - constant on both. Never below 50%. This is at <20KB/s of traffic, as well.

                Something is definitely broken here.

                EDIT: How weirdly broken? Try this interesting setup: two em0 interfaces, enable POLLING, reboot. CPU utilization is insane, no? Now, disable POLLING, apply but do not reboot. Suddenly, the CPU utilization appears to be much, much better. The difference here was narrowed to pfSense reporting <1% and ESXi reporting <4%.

                • T
                  tester_02
                  last edited by Oct 14, 2011, 8:57 AM

                  Running 2.0 release (64 bit) on vmware server.  No cpu load issue.
                  Squid/squidguard/snort installed and 2 nic's.

                  • S
                    sullrich
                    last edited by Oct 14, 2011, 3:30 PM

                    From a shell post the output of:

                    top -SH
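For anyone collecting this output from several machines, here is a minimal Python sketch that ranks the non-idle threads in `top -SH` output by WCPU. The function names are hypothetical, and the sample lines are copied from the output RootWyrm posts later in this thread; the parser assumes the 11-column layout shown there and is not a general-purpose `top` parser.

```python
def parse_top_threads(text):
    """Return (command, wcpu_percent) pairs from `top -SH` process lines."""
    rows = []
    for line in text.splitlines():
        # COMMAND may contain spaces, so split into at most 11 fields.
        fields = line.split(None, 10)
        # Process lines have 11 columns and start with a numeric PID.
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        wcpu = fields[9]
        if not wcpu.endswith("%"):
            continue
        rows.append((fields[10].strip(), float(wcpu.rstrip("%"))))
    return rows


def busiest_threads(text, limit=3):
    """Rank non-idle threads by WCPU, highest first."""
    rows = [r for r in parse_top_threads(text) if not r[0].startswith("{idle")]
    return sorted(rows, key=lambda r: r[1], reverse=True)[:limit]


# Sample lines taken from the top -SH output posted in this thread.
SAMPLE = """\
PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
  21 root    171 ki-6    0K    8K CPU1    1  16:15 98.97% idlepoll
  11 root    171 ki31    0K    16K RUN    0  22.8H 96.97% {idle: cpu0}
  11 root    171 ki31    0K    16K RUN    1  22.7H  9.96% {idle: cpu1}
  12 root    -44    -    0K  128K WAIT    0  8:09  0.00% {swi1: netisr 0}
"""

if __name__ == "__main__":
    for command, wcpu in busiest_threads(SAMPLE):
        print(f"{wcpu:6.2f}%  {command}")
```

With the sample above, the `idlepoll` kernel thread comes out on top, which is the thread to watch when polling is enabled.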

                    • B
                      billm
                      last edited by Oct 14, 2011, 4:23 PM

                      I'm not seeing this on my ESXi 4.1.0 install with pfSense 2.1-development (upgraded right after v6 branch was merged in, so this is 2.0 w/ v6)  VM is configured as FreeBSD 64bit, running AMD64 release of pfSense.  Handed off a single CPU to pfSense but running an SMP kernel.  Ran 8mbit of small frames through the firewall and only saw host CPU usage a hair over what pfSense reported (25% in guest 30% of one core in host).

                      Are you running the open-vm-tools package?  Also, paste the output of

                      sysctl kern.timecounter.choice kern.timecounter.hardware kern.hz

                      Thanks

                      –Bill

                      pfSense core developer
                      blog - http://www.ucsecurity.com/
                      twitter - billmarquette
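As a rough aid for reading the `sysctl` output requested above, the following Python sketch parses the three values and reports which timecounter has the highest quality score. The helper names are hypothetical, the sample text is the reply posted later in this thread, and the idea that the kernel normally prefers the highest-quality entry is my understanding of FreeBSD's behavior rather than something stated here.

```python
import re


def parse_sysctl(text):
    """Turn `name: value` lines from sysctl into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def timecounter_choices(choice):
    """Parse 'TSC(-100) ACPI-safe(850) ...' into {name: quality}."""
    return {name: int(q) for name, q in re.findall(r"(\S+)\((-?\d+)\)", choice)}


# Sample values as posted later in this thread.
SAMPLE = """\
kern.timecounter.choice: TSC(-100) ACPI-safe(850) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-safe
kern.hz: 100
"""

if __name__ == "__main__":
    info = parse_sysctl(SAMPLE)
    choices = timecounter_choices(info["kern.timecounter.choice"])
    best = max(choices, key=choices.get)
    print("highest-quality timecounter:", best)
    print("in use:", info["kern.timecounter.hardware"])
    print("kern.hz:", int(info["kern.hz"]))
```

If the timecounter in use differs from the highest-quality one, or `kern.hz` is unusually high for a guest, that would be worth mentioning alongside the CPU figures.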

                      • R
                        RootWyrm
                        last edited by Oct 14, 2011, 8:06 PM

                        @sullrich:

                        From a shell post the output of:

                        top -SH

                        last pid: 11421;  load averages:  0.07,  0.03,  0.01    up 0+22:57:12  15:50:45
                        96 processes:  3 running, 77 sleeping, 16 waiting
                        CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                        Mem: 40M Active, 73M Inact, 131M Wired, 88K Cache, 110M Buf, 740M Free
                        Swap: 4096M Total, 4096M Free

                        PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
                          11 root    171 ki31    0K    16K CPU0    0  22.7H 100.00% {idle: cpu0}
                          11 root    171 ki31    0K    16K RUN    1  22.6H 100.00% {idle: cpu1}
                            0 root      76    0    0K    64K sched  1 1045.8  0.00% {swapper}
                          21 root      76 ki-6    0K    8K pollid  1  14:27  0.00% idlepoll
                          12 root    -44    -    0K  128K WAIT    0  8:07  0.00% {swi1: netisr 0}
                          12 root    -32    -    0K  128K WAIT    0  3:50  0.00% {swi4: clock}
                          12 root    -32    -    0K  128K WAIT    1  0:37  0.00% {swi4: clock}
                        31198 root      64  20  4524K  3032K bpf    1  0:30  0.00% arpwatch
                          14 root    -16    -    0K    8K -      1  0:20  0.00% yarrow
                        22066 root      44    0  4948K  2520K select  1  0:18  0.00% syslogd
                        32469 nobody    64  20  3572K  2344K select  0  0:16  0.00% darkstat
                        13799 root      64  20  3316K  1348K select  1  0:15  0.00% apinger
                        21140 root      76  20  3656K  1508K wait    0  0:13  0.00% sh
                        53332 root      44    0 26140K  5012K select  1  0:12  0.00% vmtoolsd
                        23900 root      44    0  3316K  924K piperd  0  0:09  0.00% logger
                        23696 root      44    0  6936K  3708K bpf    1  0:06  0.00% tcpdump
                        27742 root      44    0  3352K  1352K select  1  0:05  0.00% miniupnpd

                        Looks pretty normal, right? Right. So here's the interesting part.

                        2 users    Load  0.01  0.02  0.00                  Oct 14 15:52

                        Mem:KB    REAL            VIRTUAL                      VN PAGER  SWAP PAGER
                                Tot  Share      Tot    Share    Free          in  out    in  out
                        Act  57064  20596  298256    56456  757432  count
                        All  83272  25284  3511012    76448          pages
                        Proc:                                                            Interrupts
                          r  p  d  s  w  Csw  Trp  Sys  Int  Sof  Flt        cow    800 total
                                    43      496    4  256      3133            zfod        atkbd0 1
                                                                                  ozfod      fdc0 irq6
                        0.1%Sys  0.2%Intr  0.0%User  0.0%Nice 99.8%Idle        %ozfod      ata1 irq15
                        |    |    |    |    |    |    |    |    |    |    |      daefr      mpt0 irq17
                                                                                  prcfr  400 cpu0: time
                                                                28 dtbuf          totfr  400 cpu1: time
                        Namei    Name-cache  Dir-cache    69211 desvn          react
                          Calls    hits  %    hits  %      835 numvn          pdwak
                              7      7 100                    65 frevn          pdpgs
                                                                                  intrn
                        Disks  da0  md0 pass0                            134544 wire
                        KB/t  16.00  0.00  0.00                            41080 act
                        tps      0    0    0                            75208 inact
                        MB/s  0.00  0.00  0.00                                92 cache
                        %busy    0    0    0                            757340 free

                        Notice something missing? Yup. This is with polling disabled by the checkbox.

                        em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
                        em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>

                        Notice a problem here? Yes. POLLING is still enabled. Checkbox in pfSense is UNCHECKED, but POLLING is on. Here's what happens when you check that POLLING box again.

                        em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
                        em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>

                        last pid: 29327;  load averages:  0.87,  0.31,  0.12    up 0+23:07:15  16:00:48
                        96 processes:  4 running, 76 sleeping, 16 waiting
                        CPU:  0.0% user,  0.0% nice, 49.7% system,  0.0% interrupt, 50.3% idle
                        Mem: 40M Active, 76M Inact, 130M Wired, 92K Cache, 110M Buf, 738M Free
                        Swap: 4096M Total, 4096M Free

                        PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
                          21 root    171 ki-6    0K    8K CPU1    1  16:15 98.97% idlepoll
                          11 root    171 ki31    0K    16K RUN    0  22.8H 96.97% {idle: cpu0}
                          11 root    171 ki31    0K    16K RUN    1  22.7H  9.96% {idle: cpu1}
                            0 root      76    0    0K    64K sched  1 1045.8  0.00% {swapper}
                          12 root    -44    -    0K  128K WAIT    0  8:09  0.00% {swi1: netisr 0}
                          12 root    -32    -    0K  128K WAIT    0  3:51  0.00% {swi4: clock}
                          12 root    -32    -    0K  128K WAIT    1  0:37  0.00% {swi4: clock}
                        31198 root      64  20  4524K  3032K bpf    0  0:31  0.00% arpwatch
                          14 root    -16    -    0K    8K -      0  0:20  0.00% yarrow
                        22066 root      44    0  4948K  2520K select  0  0:18  0.00% syslogd
                        32469 nobody    64  20  3572K  2344K select  0  0:16  0.00% darkstat
                        13799 root      64  20  3316K  1348K select  0  0:15  0.00% apinger
                        21140 root      76  20  3656K  1508K wait    1  0:13  0.00% sh
                        53332 root      44    0 26140K  5012K select  0  0:12  0.00% vmtoolsd
                        23900 root      44    0  3316K  924K piperd  0  0:09  0.00% logger
                        23696 root      44    0  6936K  3708K bpf    0  0:06  0.00% tcpdump
                        27742 root      44    0  3352K  1352K select  0  0:05  0.00% miniupnpd

                        2 users    Load  0.96  0.45  0.18                  Oct 14 16:01

                        Mem:KB    REAL            VIRTUAL                      VN PAGER  SWAP PAGER
                                Tot  Share      Tot    Share    Free          in  out    in  out
                        Act  57188  20676  298332    56460  755756  count
                        All  83408  25364  3511088    76452          pages
                        Proc:                                                            Interrupts
                          r  p  d  s  w  Csw  Trp  Sys  Int  Sof  Flt        cow    805 total
                          1          42        3M    7  259    5 3107    3      3 zfod        atkbd0 1
                                                                                  ozfod      fdc0 irq6
                        50.0%Sys  0.0%Intr  0.0%User  0.0%Nice 50.0%Idle        %ozfod      ata1 irq15
                        |    |    |    |    |    |    |    |    |    |    |      daefr    5 mpt0 irq17
                        =========================                                prcfr  400 cpu0: time
                                                                8 dtbuf        2 totfr  400 cpu1: time
                        Namei    Name-cache  Dir-cache    69211 desvn          react
                          Calls    hits  %    hits  %      890 numvn          pdwak
                              11      11 100                    65 frevn          pdpgs
                                                                                  intrn
                        Disks  da0  md0 pass0                            133376 wire
                        KB/t  17.19  0.00  0.00                            41368 act
                        tps      5    0    0                            77764 inact
                        MB/s  0.09  0.00  0.00                                92 cache
                        %busy    1    0    0                            755664 free

                        8:01:24pm up 78 days  3:19, 200 worlds; CPU load average: 0.29, 0.16, 0.08
                        PCPU USED(%): 3.5 3.0  22  14  69 1.2 2.5 4.1 AVG:  15
                        PCPU UTIL(%): 3.6 3.2  22  12  66 1.2 2.4 2.7 AVG:  14
                        CORE UTIL(%): 6.7      34      67    5.0    AVG:  28

                        ID    GID NAME            NWLD  %USED    %RUN    %SYS  %WAIT    %RDY
                              1      1 idle                8  273.89  800.00    0.00    0.00  800.00
                        1537396 1537396 earthmother - p    5  102.58  97.93    0.04  380.97    0.07

                        This is with OpenVM Tools 313025. Timecounter looks like this:
                        kern.timecounter.choice: TSC(-100) ACPI-safe(850) i8254(0) dummy(-1000000)
                        kern.timecounter.hardware: ACPI-safe
                        kern.hz: 100

                        Pretty much exactly as expected; all other FreeBSD guests are exactly the same. ACPI-safe over TSC, no stepwarnings, and frequency 3579545 - no exceptions on any of them. (They're all 8.1-RELEASE currently.) This is on 32-bit, too, I forgot to mention.
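The mismatch described in this post (checkbox unchecked, POLLING still listed by ifconfig) is easy to check for in a script. Below is a minimal Python sketch that reads ifconfig-style text and reports, per interface, whether POLLING appears in the options list. The function name is hypothetical and the sample mirrors the em0/em1 output above; it only looks at the option names between the angle brackets and does not decode the hex mask.

```python
import re

IFACE_RE = re.compile(r"^(\w+): flags=")
OPTIONS_RE = re.compile(r"options=[0-9a-f]+\s*<([^>]*)>")


def polling_interfaces(ifconfig_text):
    """Map each interface name to True/False for the POLLING option."""
    result = {}
    current = None
    for line in ifconfig_text.splitlines():
        iface = IFACE_RE.match(line)
        if iface:
            current = iface.group(1)
            result[current] = False
            continue
        opts = OPTIONS_RE.search(line)
        if opts and current is not None:
            names = opts.group(1).upper().split(",")
            result[current] = "POLLING" in names
    return result


# Sample mirrors the ifconfig output quoted in this post.
SAMPLE = """\
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
"""

if __name__ == "__main__":
    for name, polling in polling_interfaces(SAMPLE).items():
        print(f"{name}: POLLING {'on' if polling else 'off'}")
```

Comparing this result against the state of the GUI checkbox would show whether the setting was actually applied to the interfaces.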

                        • T
                          timotl
                          last edited by Oct 20, 2011, 2:12 PM

                          I have been seeing this under ESXi 5 also.
                          I installed the vendor supplied tools and am using a single trunked E1000.

                          After trying all of the nic settings, I happened to disable powerd and the CPU usage went down by more than half.
                          Can anyone else confirm this?

                          -timotl

                          • L
                            loftyDan
                            last edited by Oct 26, 2011, 2:31 AM

                            I too have this issue, using ESXi 5 (and previously on 4.1).  Changing the powerd settings did not resolve the issue for me.  I've tried 2.0 i386, my primary config, 2.1 i386 and 2.1 AMD64.  For both dev builds I tried with my config backup, and a clean install, and the results were always the same.  pfSense reports 16-20% CPU load, while ESXi reports a 62% load (on a Xeon X3440 @ 2.53GHz).  This is with a download speed of about 3.6 MB/sec (29 Mb/sec).  In every case Open-VM-Tools has been installed and I've been using the E1000 NIC.  Speeds directly connected to the modem yield 31 Mb/sec.

                            If there is anything else I can test, or any more information I can provide, please let me know.  I'd love for this problem to get resolved.
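To put the figures in this post side by side, here is a small Python sketch of the arithmetic: converting the download rate from megabytes to megabits per second, and expressing the host-reported CPU load as a multiple of the guest-reported load. The function names are hypothetical and the inputs are just the numbers quoted above.

```python
def mbytes_to_mbits(mbytes_per_sec):
    """Convert MB/s to Mb/s (8 bits per byte)."""
    return mbytes_per_sec * 8


def host_to_guest_ratio(host_pct, guest_pct):
    """How many times higher the host-reported load is than the guest's."""
    return host_pct / guest_pct


if __name__ == "__main__":
    # 3.6 MB/s is the download rate quoted in the post.
    print(f"throughput: {mbytes_to_mbits(3.6):.1f} Mb/s")
    # Guest reports 16-20% while ESXi reports 62%.
    for guest in (16, 20):
        ratio = host_to_guest_ratio(62, guest)
        print(f"guest {guest}% vs host 62%: {ratio:.1f}x")
```

This puts the rate at 28.8 Mb/s, matching the roughly 29 Mb/s quoted, and the host figure at about three to four times what the guest reports.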

                            • V
                              Veni
                              last edited by Nov 29, 2011, 6:21 PM

                              @loftyDan:

                              I too have this issue, using ESXi 5 (and previously on 4.1).  Changing the powerd settings did not resolve the issue for me.  I've tried 2.0 i386, my primary config, 2.1 i386 and 2.1 AMD64.  For both dev builds I tried with my config backup, and a clean install, and the results were always the same.  pfSense reports 16-20% CPU load, while ESXi reports a 62% load[…]

                              I'm seeing the same thing but on a single x5650 @ 2.67 GHz.

                              If I try to limit the CPU usage from the vSphere client, I don't get the performance I'm after (approx. 150 Mbps). Instead I get around 20-22 Mbps.
                              So it sounds as if the usage is real somehow; otherwise, why would I see performance issues when giving pfSense a maximum of 1-1.5 GHz?

                              • K
                                kkrauth
                                last edited by Feb 12, 2012, 4:02 PM

                                Just to chime in on this thread, as I'm seeing the same issues. I'm running the following release:
                                [2.0.1-RELEASE][root@pfSense.localdomain]/root(7): uname -a
                                FreeBSD pfSense.localdomain 8.1-RELEASE-p6 FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 18:15:35 EST 2011    root@FreeBSD_8.0_pfSense_2.0-AMD64.snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8  amd64

                                within ESXi 5. I installed open-vm-tools and VMware's provided drivers for the VMXNET3 adapter. Both internal/external NICs are running with the VMXNET3 driver. The problem was exactly the same using E1000 drivers.

                                The attached screenshot shows what is happening when the network is pretty much idle. During load, this spikes up even higher, even though pfSense top reports almost no usage whatsoever. I tried both with powerd turned on and off.

                                [Attachment: pfsense.png]

                                • M
                                  marsboer
                                  last edited by Feb 13, 2012, 1:29 PM

                                  Same issue on a fresh pfSense 2.0.1 install running on KVM (Proxmox VE) with the SMP kernel. With only a couple of Mbit/s of traffic, CPU usage on the physical host increases massively (above 50%) with a single virtual CPU and 512 MB RAM.

                                  pfSense does not support virtio (the paravirtualized devices for KVM), so I thought using emulated NICs was the main reason for the bad CPU performance even under light load, but now I am starting to think that this may be a more generic problem with pfSense in virtualized setups in general.

                                  • C
                                    clayton_ross
                                    last edited by Apr 1, 2012, 5:51 AM

                                    I too am having the same problem: pfSense 2.0 64-bit on ESXi 5.0, 2 cores, 2 NICs, vmtools installed.

                                    • I
                                      iFloris
                                      last edited by Apr 4, 2012, 11:51 AM

                                      As most others on this thread, I too have run into this problem.
                                      Something that is not clear to me is if using e1000 is the source of such increased cpu usage on esx.
                                      And if that is the case, does switching to another adapter, such as flexible or vmxnet 2/3 help in reducing load for any of you?

                                      one layer of information
                                      removed

                                      • K
                                        kkrauth
                                        last edited by Apr 4, 2012, 1:04 PM

                                        @iFloris:

                                        As most others on this thread, I too have run into this problem.
                                        Something that is not clear to me is if using e1000 is the source of such increased cpu usage on esx.
                                        And if that is the case, does switching to another adapter, such as flexible or vmxnet 2/3 help in reducing load for any of you?

                                        I tried all three virtual adapters and the behaviour was the same.

                                        • M
                                          Mattofsweden
                                          last edited by Apr 26, 2012, 10:16 PM

                                          I'm seeing the same issues here on a DELL PowerEdge R310 Quad Core Xeon:
                                          Using ESXi 4.1 and pfSense 2.0, 2.0.1, old-2.1-dev in i386/amd64 flavors
                                          Using ESXi 5.0 and pfSense 2.0.1 and 2.1-dev in i386/amd64 flavors from feb/march/april.

                                          Same results on other host hardware also (Two DELL Servers with virtualized environment at home for testing purposes.)

                                          Have not tried the VMXNET due to others not seeing any performance gain, only been using virtualized E1000 so far.

                                          What I'm using a lot is VLANs, which might be a contributing culprit for some of us? Assigning VLANs directly in switch configuration in vSphere, or natively in pfSense has had "largely" the same results.

                                          I absolutely love pfSense, now that I've got the hang of it, and have deployed quite a few in different scenarios over the past few months. But, not to sound negative here, there's got to be something we can do about these high loads in virtualized environments. I had to switch over to bare metal, on slightly aged hardware, on our lab network, which is a bit unsatisfying. I lose a bit of my redundancy (if one VM or host fails, just fire up the copy or use HA Sync).

                                          I suppose it's an underlying FreeBSD issue?
                                          I don't really know how to set up something similar in any of the *BSD flavors, and honestly can't find the time to learn currently, but surely one of you guys could test a simple routing setup using FreeBSD/OpenBSD/NetBSD and see if there's the same performance issue? (Maybe with/without VLANs, incl. trunking/non-native.)

                                          Regards,
                                          Mattias

                                          IT Teacher & Networking Consultant

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.