Problem with Slow Speeds



  • Hello All!

    I'm new to pfsense and I am having a bit of an issue with throughput.

    I thought that I would leave it to the experts instead of tearing my hair out trying to rectify this problem.

    So the issue that I am having is that my setup does not seem to be using my CPU to its full potential.

    When I am transferring files from WAN -> LAN, I get speeds of around 390 Mbps and 50% CPU usage.

    When I go from LAN -> WAN, I get speeds in the range of 460 Mbps and 80% CPU usage.

    My hardware is as follows:

    ASUS DSBV-DX MB
    Dual Core Intel Xeon 5110 1.6 GHz 1066 FSB processor
    FB DDR2 533MHz 1 GB
    Intel 82563EB Onboard Gigabit Interfaces

    I have also tried it between two PCI-X Intel Pro1000 adapters and I get the same results.

    Memory usage shows at around 8%.

    The source of the files that I am testing with gets around 600 Mbps download over a direct connection, so I am wondering why I do not seem to be using the full capacity of my processor. Or is there something that I am missing?

    Thanks!
    Dan



  • There are a lot of details missing, so it's impossible to provide anything like a definitive answer.

    Your motherboard appears to be at least a couple of generations old, so I'll assume the onboard NIC is attached by PCI. Standard PCI operates at 33MHz and is capable of moving 4 bytes every clock cycle. Data can be moved by a transaction which begins with an address and is followed by four bytes of data. Some PCI devices are capable of operating in "burst" mode, in which the address is followed by a chunk of 4xN (N an integer) bytes of data. N is limited to ensure no single device hogs the bus for long periods. If a NIC does NOT support burst mode, then the absolute best transfer rate would be 4 bytes every two cycles (address, then data), or 33,000,000*(4/2)*8 bits/sec, or about 530Mbps.

    Your configuration seems to involve bulk data coming into the pfSense box and leaving it, so the data needs to be transferred between wire and memory twice on the pfSense box: once on reception, once on transmission. Hence the best throughput that could be seen with a pair of NICs that don't support burst mode and share the same PCI bus is 530/2Mbps, or about 265Mbps. If the two NICs share the same bus and can fully utilise the bus in burst mode, then your maximum throughput would be somewhat less than 530Mbps.
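
    The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch under the same assumptions as the post (33MHz clock, 4 bytes per data cycle, one address cycle per data cycle when burst mode is not used); the function name and the `bus_crossings` parameter are made up for illustration.

```python
# Best-case non-burst PCI throughput, using the figures quoted in the post.
PCI_CLOCK_HZ = 33_000_000   # standard PCI clock
BYTES_PER_DATA_CYCLE = 4    # a 32-bit bus moves 4 bytes per data cycle
CYCLES_PER_TRANSFER = 2     # non-burst: one address cycle plus one data cycle


def non_burst_mbps(bus_crossings: int = 1) -> float:
    """Best-case rate in Mbps when each packet crosses the shared bus
    `bus_crossings` times (hypothetical parameter: 2 means receive on
    one NIC and transmit on another NIC on the same bus)."""
    bits_per_sec = PCI_CLOCK_HZ * BYTES_PER_DATA_CYCLE / CYCLES_PER_TRANSFER * 8
    return bits_per_sec / bus_crossings / 1_000_000


print(non_burst_mbps(1))  # one crossing: the "about 530Mbps" figure
print(non_burst_mbps(2))  # two crossings: roughly half of that
```

    Real hardware will fall short of these numbers because of arbitration and other bus overhead, so treat them as upper bounds rather than predictions.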

    Without a lot more detailed information about the pfSense box's IO bus configuration(s) you have tested, I would guess that the throughputs you have observed are limited by the IO configuration of the box rather than by CPU or memory.

    A general problem with using file transfers to measure throughput is that file transfers are also limited by the read speed of the source disk, the write speed of the target disk (don't forget to include the overhead of allocating new space) and by other activities that might be difficult to control (e.g. swapping, paging, file system 'garbage collection' such as marking the space of deleted files so that it's available for reallocation, writing the file system journal, etc.).



  • Thanks for the reply!

    Sorry for the lack of info; I guess I overlooked it. Please let me know if I left anything out that would be of use in making a diagnosis.

    This is a lab setting, the WAN that I am testing is a gigabit network and my target is a NAS server. I can point it at a RAM disk on the server to achieve wirespeed but for now it has difficulty maxing out just one of the drives on the server. The switch that I am using is a Procurve 1800-8g, an 8 port managed gigabit switch. I have looked at the switch during transfers and it is not showing any errors on the interfaces in use during the transfer. I have also searched this forum and found a similar post http://forum.pfsense.org/index.php/topic,22855.0.html

    I followed the advice in that thread and looked at the shell during transfers to monitor errors or stray interrupts but everything is error-free and smooth, except for the lack of scaling to my max processor capacity!

    The client that is the "master" of the transactions is a Phenom x4 at 3.3 GHz. I have done tests in the same server-computer-switch configuration minus the pfSense box, and I can get 120+ MB/sec copying to/from RAM disks on the machines.

    Something else to note: I tried hooking up a 10/100 laptop to the pfSense box in place of the Phenom and was expecting to see wirespeed at 100 Mbits/sec, but I was only coming close to 70 Mbits/sec! Tested without the box, I can achieve wirespeed of 100 Mbits/sec in this configuration!

    I believe that your assumption that my speed is limited by the PCI bus is a good one. I will explore getting a PCIe NIC and hope that solves the problem.

    This box is purely for home use; I just wanted to learn more about pfSense. It will probably make its way to LAN parties about 3x per year though. I am interested in the ability to bond WAN connections.

    Thanks for the help! Please let me know if there is anything else you can think of.

    Dan



  • Which packages are currently installed?



  • It's just the default install right now. I just loaded it up and began testing.



  • Suggest you read http://forum.pfsense.org/index.php/topic,22855.0.html and http://forum.pfsense.org/index.php/topic,23811.0.html

    You could try some pings between the client and pfSense and the client and server to see how the round trip time changes with the increased number of hops.



  • Thanks! I tried the tips in the post but everything looks normal.

    I did some research on the onboard gigabit ports and they share a 4x PCIe link.

    Any opinions on what else it may be?



  • What differences did you observe in ping times?



  • Here is an idea, I'm not sure if it is valid though. Your bottleneck could be your CPU still. I don't know if pfSense/FreeBSD will make a connection multi-threaded on your CPU. If it won't then when you are seeing 50% usage it's because one core is being maxed out and that's all the single thread can do. Just an idea though.


  • Rebel Alliance Developer Netgate

    At the moment, pf is not multi-threaded, so the firewall process itself doesn't scale across cores. (Other system processes, like the GUI and services, do.)

    There is work being done to alleviate this limitation, but it's not a simple problem and probably won't happen anytime soon.



  • @CaseyBlackburn:

    Here is an idea, I'm not sure if it is valid though. Your bottleneck could be your CPU still. I don't know if pfSense/FreeBSD will make a connection multi-threaded on your CPU. If it won't then when you are seeing 50% usage it's because one core is being maxed out and that's all the single thread can do. Just an idea though.

    It would be worth taking some snapshots of the output of the shell command top -S or watching the RRD graph to see if you are maxing out a CPU.



  • @wallabybob:

    What differences did you observe in ping times?

    My ping times are:

    Connected to switch:

    avg = 0.301 ms
    min = 0.270 ms
    max = 0.420 ms

    pfbox:

    WAN->LAN

    avg = 0.651 ms
    min = 0.513 ms
    max = 0.869 ms
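
    To put the two averages side by side, the extra round-trip time added by the pfSense hop can be worked out directly. A minimal sketch using only the averages reported above; the function and variable names are illustrative.

```python
def added_latency_ms(direct_avg_ms: float, routed_avg_ms: float) -> float:
    """Extra average round-trip time introduced by the additional hop."""
    return routed_avg_ms - direct_avg_ms


switch_only_avg_ms = 0.301     # client and server on the same switch
through_pfsense_avg_ms = 0.651  # WAN -> LAN through the pfSense box

extra = added_latency_ms(switch_only_avg_ms, through_pfsense_avg_ms)
print(f"pfSense hop adds about {extra:.3f} ms per round trip")
```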



  • @wallabybob:

    @CaseyBlackburn:

    Here is an idea, I'm not sure if it is valid though. Your bottleneck could be your CPU still. I don't know if pfSense/FreeBSD will make a connection multi-threaded on your CPU. If it won't then when you are seeing 50% usage it's because one core is being maxed out and that's all the single thread can do. Just an idea though.

    It would be worth taking some snapshots of the output of the shell command top -S or watching the RRD graph to see if you are maxing out a CPU.

    That's it!

    top -S shows that one CPU has an idle time of 5% during the transfer, while the other has 95% idle time.

    So apparently I am maxing out a single core.

    Thanks for the help everyone!!!
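
    This also lines up with the 50% overall CPU usage reported at the start of the thread: on a dual-core box, one nearly saturated core and one nearly idle core average out to about half. A small sketch of that averaging, assuming the overall figure is a plain mean across cores (the function name is illustrative):

```python
def overall_cpu_usage(idle_percent_per_core: list[float]) -> float:
    """Average busy percentage across cores, given each core's idle percentage."""
    busy = [100 - idle for idle in idle_percent_per_core]
    return sum(busy) / len(busy)


# Idle figures reported from top -S during the transfer: 5% and 95%.
print(overall_cpu_usage([5, 95]))  # 50.0
```

    So an overall reading near 50% on two cores can hide a single-threaded bottleneck, which is why per-CPU figures are more informative here.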



  • What processes are the top CPU consumers shown by top -S?


  • Rebel Alliance Developer Netgate

    @wallabybob:

    What processes are the top CPU consumers shown by top -S?

    top -SH is better. It will show both system and kernel threads.

