pfSense high CPU usage on KVM (unRAID)



  • Maybe I should just try to reinstall it. Shouldn't be that hard to do. I'll post more after some more testing.



  • A reinstall made no change; the CPU usage still went up on 1 of the cores. During this test I even gave it 8 CPU cores (4.0 GHz) and 4 GB of RAM. Download speed was 150 Mbit/s. So I have no clue what the cause is, other than the virtual NIC or something...
    Sadly I don't have any other NICs available to test with. Any suggestions on a step I might try?

    Thanks!


  • Netgate Administrator

    With vmx NICs you will need to add the following line to /boot/loader.conf.local to get multiple queue support:
    hw.pci.honor_msi_blacklist=0

    Reboot to apply that. Check the output of vmstat -i to be sure it's creating multiple queues.
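
    A minimal sketch of those steps from a pfSense shell (assuming the file does not exist yet; the grep only matches once the NICs are vmx):

    echo 'hw.pci.honor_msi_blacklist=0' >> /boot/loader.conf.local
    reboot
    # after the reboot, multiple queues per NIC should show up here:
    vmstat -i | grep vmx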

    Be sure all hardware offloading support is disabled in Sys > Adv > Networking.

    Steve



  • @stephenw10

    Hi, Thanks for your reply,

    I tried to find the /boot/loader.conf.local file but could only find /boot/loader.conf.
    I tried adding the line there (hw.pci.honor_msi_blacklist=0), but still no change.
    It did do something, because the line moved up in the file.

    During a speedtest I get these results with vmstat -i:
    [screenshot]
    And when using the top -S -H command I still get the same results.

    Any other suggestions?

    Thanks!



  • You need to create the file
    /boot/loader.conf.local
    if it's missing.
    Copy this inside:
    hw.pci.honor_msi_blacklist=0
    Save and reboot.


  • Netgate Administrator

    Yup, create the file if it doesn't exist. If you put it in loader.conf it may get overwritten.

    However, that will only do anything for vmx NICs. You have em NICs there currently.

    Steve



  • @stephenw10 Alright, I will set them to VMXNET3, reboot, create the file with the line, and report back if there are any changes.

    Thanks for the help @kiokoman & @stephenw10 !

    Creating the config file:
    [screenshot]
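
    For reference, in a KVM/libvirt setup like unRAID's the NIC model is chosen per interface in the VM's XML. An illustrative sketch (the bridge name br0 is an assumption; your interface section will differ):

    <interface type='bridge'>
      <source bridge='br0'/>
      <!-- 'e1000' gives FreeBSD em NICs; 'vmxnet3' gives vmx NICs -->
      <model type='vmxnet3'/>
    </interface>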



  • Okay, further testing will come later, but for now I seem to reach my maximum provider speed on my Linux server behind the firewall:
    [screenshot]

    BUT it did drop back down to 14.4 megabytes per second and goes up and down all the time:
    [screenshot]
    CPU usage seems to have settled a bit:
    [screenshot]

    Using the SMB protocol I get this when moving a file from WAN to LAN:
    [screenshot]

    Its 2 virtual cores are running at nearly full load (CPU 6/7; CPU 4 is being used by the server on the LAN side):
    [screenshot]

    I don't know if this is just a performance bug, but speeds seem to have increased, although CPU usage is still high (compared to pfSense's hardware requirements).

    Changing to a quad-core (virtual) processor did not change much either; CPU usage stays high on 2 cores:
    [screenshot]

    I wish I could put my finger on the issue.


  • Netgate Administrator

    I still only see one tx queue and one rx queue on each NIC. Does vmstat -i show more?

    I assume you created that file in /boot.

    Steve



  • @stephenw10

    Yep, it's placed under /boot/loader.conf.local:
    [screenshot]

    vmstat -i during a speedtest on a server on the LAN side:
    [screenshot]



  • I actually don't know how to read the vmstat -i output, but I hope you might know more, @stephenw10.



  • One queue:

    vmx0: tq0 (transmit queue 0)
    vmx0: rq0 (receive queue 0)

    With multiple queues you should see tq0 / tq1, etc.
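
    Purely illustrative (the IRQ numbers and counts below are made up), with 2 queues per direction vmstat -i would show something like:

    irq264: vmx0: rq0             123456        100
    irq265: vmx0: rq1              65432         50
    irq266: vmx0: tq0              23456         20
    irq267: vmx0: tq1              12345         10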


  • Netgate Administrator

    Yeah, that. Though I don't have anything vmx to test against right now.
    I think it probably is working, as you are seeing the high-numbered IRQs which MSI uses.
    Try removing that line or commenting it out and rebooting. Do you see any change?

    On other NICs you might see something like:

    [2.4.4-RELEASE][root@5100.stevew.lan]/root: vmstat -i
    interrupt                          total       rate
    irq7: uart0                          432          0
    irq16: sdhci_pci0                    536          0
    cpu0:timer                      68688188       1001
    cpu3:timer                       1069435         16
    cpu2:timer                       1060293         15
    cpu1:timer                       1086989         16
    irq264: igb0:que 0                 68630          1
    irq265: igb0:que 1                 68630          1
    irq266: igb0:que 2                 68630          1
    irq267: igb0:que 3                 68630          1
    irq268: igb0:link                      3          0
    irq269: igb1:que 0                 68630          1
    irq270: igb1:que 1                 68630          1
    irq271: igb1:que 2                 68630          1
    irq272: igb1:que 3                 68630          1
    irq273: igb1:link                      1          0
    irq274: ahci0:ch0                   4473          0
    irq290: xhci0                         85          0
    irq291: ix0:q0                    216643          3
    irq292: ix0:q1                     47933          1
    irq293: ix0:q2                    325480          5
    irq294: ix0:q3                    514752          7
    irq295: ix0:link                       2          0
    irq301: ix2:q0                     74629          1
    irq302: ix2:q1                       507          0
    irq303: ix2:q2                      1703          0
    irq304: ix2:q3                     89446          1
    irq305: ix2:link                       1          0
    irq306: ix3:q0                     70295          1
    irq307: ix3:q1                      4985          0
    irq308: ix3:q2                    186433          3
    irq309: ix3:q3                    413486          6
    irq310: ix3:link                       1          0
    Total                           74405771       1084
    

    https://www.freebsd.org/cgi/man.cgi?query=vmx#MULTIPLE_QUEUES

    Steve



  • Try adding this to your loader.conf.local:

    hw.vmx.txnqueue="4"
    hw.vmx.rxnqueue="4"
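
    Combined with the earlier tunable, the complete /boot/loader.conf.local would then read:

    hw.pci.honor_msi_blacklist=0
    hw.vmx.txnqueue="4"
    hw.vmx.rxnqueue="4"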
    


  • @kiokoman & @stephenw10

    I added the lines:
    hw.vmx.txnqueue="4"
    hw.vmx.rxnqueue="4"

    I did not see any change whatsoever in vmstat -i:
    [screenshot]

    and commenting out the first line also did not change anything:
    [screenshot]

    Edit:

    Even when doing a download on a server on the LAN and using top -S -H, I get this outcome:
    [screenshot]


  • Netgate Administrator

    You are seeing load on all CPUs there, and none is at 100%, so it's not CPU-limited at that point.



  • @stephenw10 I have increased it before to 4 cores running at 4 GHz. Right now I don't know what to do at all :( I really like the easy way of working with pfSense, but I don't know what further investigation I can do, because the CPU usage is sky-high at 250 Mbit/s.


  • Netgate Administrator

    Yes, there is something significantly wrong with your virtualisation setup there. You can pass 250 Mbps with something ancient and slow like a 1st-gen APU at 1 GHz.

    Steve



  • @stephenw10 Poor me then; I will see if I can try some other things with this setup.



  • To me, the problem should be investigated on the VM side more than from inside pfSense. I see on Google that people tend to bridge the interface instead of using passthrough on unRAID.
    Personally, for example, I was never able to make pfSense work reliably under VirtualBox, and I had to change the VM to QEMU/KVM.
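
    For comparison, a passed-through NIC shows up in the libvirt XML as a hostdev entry like the sketch below (the PCI address is an assumption); bridging means replacing it with an <interface type='bridge'> block like the one shown earlier:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <!-- host PCI address of the physical NIC (example value) -->
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
    </hostdev>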