100% CPU Usage



  • Hi

    I've been running my pfSense install as a VM for a few months now with no issues. The VM has a single core, 1GB of RAM and a dedicated dual-port HP (Intel) NIC. There are 5 interfaces: 1 WAN (on em1) and 4 on em0 (3 VLANs). I'm running 2.2 with a few packages: bandwidthd, darkstat, Open-VM-Tools, pfBlockerNG, Postfix (not configured), RRD Summary, snort, squid3 and squidGuard.

    The CPU usage on the pfSense Dashboard shows 100%, and vSphere shows 3.4GHz of consumed host CPU, even though there is almost zero traffic across any of the interfaces. I've attached a copy of the System Activity screen, which shows squid at the top with WCPU 1.98%, but I'm not entirely sure that's a bad thing.

    Does anyone have any suggestions for where I can look next for more clues?

    Thanks

    ![Screen Shot 04 -02-15 at 11.23 AM.JPG](/public/imported_attachments/1/Screen Shot 04 -02-15 at 11.23 AM.JPG)



  • Try running 'top -S'. This will show kernel activity. Also, try 'ps -aux | head' to see which processes are using the most CPU.
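
    As a rough sketch of the same idea, the helper below keeps only the busiest rows of `ps aux`-style output by sorting on the %CPU column. The name `top_cpu` is made up for illustration, and it assumes the first line is a header and %CPU is the third column, as in the usual `ps aux` layout.

    ```shell
    # Hypothetical helper: print the N busiest rows of `ps aux`-style input.
    # Assumes the first line is a header and column 3 is %CPU.
    top_cpu() {
        n="${1:-5}"
        # Drop the header, sort numerically on %CPU (descending), keep N rows.
        tail -n +2 | sort -k3,3nr | head -n "$n"
    }

    # On the firewall shell this might be used as:
    #   ps aux | top_cpu 10
    ```

    Sorting explicitly means the result does not depend on how a particular `ps` orders its rows by default.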



  • I hadn't noticed your reply. I killed the squid service and the CPU usage dropped back down to 0%. Perhaps this thread needs to be moved to Packages?



  • Yeah - Sending a screen shot of very low CPU utilization doesn't exactly support your thread title…



  • @muswellhillbilly:

    Try running 'top -S'. This will show kernel activity. Also, try 'ps -aux | head' to see which processes are using the most CPU.

    I ran the two commands; the outputs are attached.

    Added a screen shot of the Summary > resources tab in vSphere.

    ![Screen Shot 04 -02-15 at 02.20 PM.JPG](/public/imported_attachments/1/Screen Shot 04 -02-15 at 02.20 PM.JPG)
    ![Screen Shot 04 -02-15 at 02.19 PM.JPG](/public/imported_attachments/1/Screen Shot 04 -02-15 at 02.19 PM.JPG)
    ![Screen Shot 04 -02-15 at 02.29 PM 001.JPG](/public/imported_attachments/1/Screen Shot 04 -02-15 at 02.29 PM 001.JPG)



  • While the individual percentages show about 6%, all from squid, the load averages show 400% on a single core. top is counting that load but isn't showing what is actually creating it, so my first guess would be a strange VM interaction with the network or storage drivers. Of the two, I'd suspect the storage driver first, because squid is the only thing listed as using CPU.

    I have only a theoretical understanding of how VMs work and no practical experience, so I could be wrong with my guesses.
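
    To make the load-versus-cores comparison concrete, here is a minimal sketch that expresses a load average as a percentage of the available cores. `load_percent` is a hypothetical helper; on FreeBSD the inputs could be read from `sysctl -n vm.loadavg` and `sysctl -n hw.ncpu`, which is an assumption about the environment rather than something shown in the thread.

    ```shell
    # Hypothetical helper: express a load average as a percentage of core count.
    # Usage: load_percent LOAD NCPU
    load_percent() {
        awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.0f%%\n", load / ncpu * 100 }'
    }

    # A 1-minute load average of 4.00 on a single core:
    load_percent 4.00 1   # prints 400%
    ```

    Load average counts threads that are runnable rather than CPU time consumed, so a figure like this only says that roughly four threads wanted the CPU at once on average.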



  • What does the Dashboard on pfSense show now? (Status/Dashboard) Does this reflect the same usage you're seeing in vSphere?

    Another thought: Squid or some other process may be spawning numerous other processes which individually don't take up much CPU but collectively do. Can you run 'ps -aux > cpu.txt' and post the 'cpu.txt' file?
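
    One way to check for that pattern in a saved capture is to total %CPU per command name. The sketch below assumes the usual `ps aux` column layout (%CPU in column 3, the command starting at column 11); `sum_cpu_by_command` is a hypothetical name, not a pfSense tool.

    ```shell
    # Hypothetical helper: total %CPU per command name from `ps aux`-style input.
    # Assumes a header line, %CPU in column 3 and the command in column 11.
    sum_cpu_by_command() {
        awk 'NR > 1 { cpu[$11] += $3 }
             END { for (c in cpu) printf "%.1f %s\n", cpu[c], c }' |
            sort -k1,1nr
    }

    # Against the saved capture this might be used as:
    #   sum_cpu_by_command < cpu.txt | head
    ```

    Many small entries under one name would then show up as a single large total at the top of the list.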



  • Attached is a screenshot of both.

    ![Screen Shot 04 -02-15 at 03.05 PM.JPG](/public/imported_attachments/1/Screen Shot 04 -02-15 at 03.05 PM.JPG)



  • It is very interesting that you're at 100% load but pfSense says your 3.3GHz CPU is at 411MHz. What are your PowerD settings?



  • Yet another thought: Are you running VMware Tools on this guest?



  • Open-VM-Tools is installed and powerd is off.



  • Ok, I've done a bit of Googling (wonderful thing, Google!), and found that this is a more common problem than perhaps you thought. There are a number of posts on this very forum which raise the issue, eg: https://forum.pfsense.org/index.php?topic=70092.0. You might have a look at some of the other forum entries to see if anyone else has come up with a solution.

    One thing which did come up was the use of the VMXNET NIC driver as an alternative to the E1000 driver. Apparently there have been instances where the use of the wrong NIC driver has resulted in this sort of CPU behaviour. I don't know which ones you may be using on your VM, but you might try changing these as a test. Apparently pfSense will work with the VMXNET3 driver, so perhaps give that a go.

    See also: https://forum.pfsense.org/index.php?topic=56858.0



  • The VM has a dedicated dual-port HP (Intel) NIC, so it's not using the software E1000 or VMXNET3 NICs/drivers. The spike in CPU has only appeared since upgrading to 2.2 and/or when squid is running. Attached are two outputs from 'ps -aux'.

    Thanks

    cpu_no_squid.txt
    cpu_with_squid.txt



  • The VM has to have the NICs defined within the VM settings in VMware. Your hardware is using the Intel equipment, sure, but the VM will have to have its own NICs defined in the guest parameters.

    The output files you've posted seem to show Squid running in both cases, so I can't see any difference between them from that perspective.



  • I apologise for not making myself clear: I use VT-d to pass a dual-port HP NIC through to my VM, so no NICs are defined in the VM settings.

    I've attached two 'ps -aux' outputs: high_cpu.txt is from when squid was installed and running, and low_cpu.txt is from after uninstalling squid and squidGuard. To me the output files do not suggest that the squid process(es) are the issue. However, monitoring the actual CPU usage from the hypervisor before and after uninstalling squid/squidGuard does suggest that squid may be at fault.
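
    For comparing two captures like these, a minimal sketch is to list the command names that appear in one file but not the other. `only_in_first` is a hypothetical helper, and it assumes both files are plain `ps aux` output with the command starting in column 11.

    ```shell
    # Hypothetical helper: command names present in the first ps capture
    # but absent from the second. Assumes the command is in column 11.
    only_in_first() {
        a=$(mktemp) && b=$(mktemp) || return 1
        # Extract the command column from each capture, skipping the header.
        awk 'NR > 1 { print $11 }' "$1" | sort -u > "$a"
        awk 'NR > 1 { print $11 }' "$2" | sort -u > "$b"
        # comm -23 keeps lines that are unique to the first sorted file.
        comm -23 "$a" "$b"
        rm -f "$a" "$b"
    }

    # With the attached files this might be used as:
    #   only_in_first high_cpu.txt low_cpu.txt
    ```

    This only shows which processes differ between the captures. It says nothing about load that `ps` does not attribute to a process.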

    Does anyone else agree? Does this thread need moving?



  • Given you have a VMware stack available, have you tried installing a test pfSense system in parallel to see whether you can duplicate these effects on a different guest in the same VM environment? It might be worth setting this up using NICs defined within the VM config instead, to see whether the same issue arises. I've heard of a number of issues concerning upgrading to 2.2, and a clean install might be a useful way to establish whether the problem is down to the (in-place?) upgrade.

