Slow throughput



  • So I decided to give pfSense a try after working with lots of different firewalls/routers/UTMs etc. (I do realize pfSense is not a UTM, yet). I had a number of older Dell Optiplex desktops (Pentium 4 2.8-3.0 GHz with Hyper-Threading) with 1-2 GB of RAM, as well as a number of dual-port Intel PCI-X NICs, so I installed pfSense using my LAN as the WAN connection to test. I put a desktop with a gigabit NIC behind the firewall and ran some tests using iperf, and the fastest transfer I got was 36 Mbits/sec. When I take the pfSense box out of the equation I get between 290 and 520 Mbits/sec depending on the window size. I was not expecting to get over 200 Mbits/sec with pfSense installed with default settings, but I expected to at least break 100 Mbits/sec. Any suggestions on changes to make in pfSense (or hardware) to make things run faster?

    Long term I am going to build a rackmount server with PCIe NICs, but if I am not getting performance in a basic test on my test hardware then it's a hard sell to start spending money, since I know plenty of stuff that works, and works well, for not much more.

    Thanks in advance for any help/suggestions.

    -J



  • Does no one have any ideas?

    I am going to try Vyatta and see how the performance is with that, at least try and figure out if I have a bad nic or something.



  • What was the % CPU utilization during the test?  Also, is it possible you have a speed/duplex mismatch somewhere?


  • You might also try disabling checksum offloading; it's under Advanced Options.

    Or try out a build of pfSense 2.0, see if it's any better/worse.



  • CPU was less than 10%. Thanks for the suggestions. I will give it a look. I also tried Vyatta and got similar results. I am also going to try another Intel NIC to see if it might be a bad NIC (they are pulls from other systems), just to make sure that's not it either.



  • So I tried Vyatta and it has the same result. Maybe bad nic? I am going to try a different intel nic next as well as play with some of the nic settings.



  • Do you have any packages installed, namely Squid?


  • Some other things to do from the console/ssh while the traffic is flowing:

    Let top -SH run for a bit and see whether anything that plain top doesn't otherwise show (e.g. interrupts and kernel threads) is using a lot of CPU time

    Watch the output of netstat -ni to see if the error counters are increasing for any interface

    Watch the error counters on your switch ports if you have a managed switch, or router interface if you have a modem/router which can give you stats.

    Watch the output of vmstat -i to see if there are any devices with way too many interrupts, or devices sharing interrupts that maybe shouldn't be doing so (USB is notorious for this).

    Let systat -vmstat run for a while, watch for anything crazy (ctrl-c or :q to quit)

    Let systat -iostat run for a while, watch for anything crazy (ctrl-c or :q to quit)

    Similarly, other views in systat might help, such as -netstat, -ip, -ifstat, -tcp, and -mbufs. You can switch views while it's running by typing :name such as :netstat or :ip



  • So I did some more testing today but did not find any answers. I made the changes noted, checked for errors, and didn't see any. I am going to try 2.0 as well as another identical setup I have to see if there are any differences. Also, while I was logging I saw that the CPU only went up to 4% but throughput was only 34 Mbps.



  • How does your iperf test relate to what you ultimately want to do with the pfSense box?

    For example, suppose I had a 100Mbps connection to the internet. Suppose iperf gives a bandwidth figure for a single TCP connection (I don't know that it does, but let's assume so for the sake of illustration) and that figure is 30Mbps between a LAN system and a system on the Internet. This wouldn't bother me because I don't currently have any serious use for a 30Mbps download from the internet over a single TCP connection. I have a home system and the predominant uses of the internet are gaming, web surfing and email. From time to time I download a CD image, but for that I almost always use bittorrent, which (I believe) uses multiple TCP connections. I'm usually doing other work on my netbook while the bittorrent runs in the background, so even if I could stream only 30Mbps across the whole torrent I suspect it's unlikely that the disk would be able to keep up. (The disk is likely to be doing lots of seeks to the swap file, the browser cache, the downloaded file, etc.)

    If you are interested in doing VoIP through the pfSense box then a much more interesting metric might be: what's the distribution of delays in the VoIP stream introduced by pfSense?

    Though the iperf figures are probably interesting I suspect they are irrelevant to most pfSense users because they don't spend most of their time running iperf through their pfSense box.



  • I will be using pfSense as a firewall in a dual-WAN config. We have a 10Mb symmetrical fiber connection and a 25/5 business cable connection. Our fiber connection can be upgraded up to 1Gb per link. While I don't plan on going to 1Gb, upgrading to 20, 30 or even 50Mb on the fiber in the next year or so is not unheard of. Plus we might add a third WAN link (20Mb Ethernet) to the mix for more diversity. The goal is high-speed diversity: a dedicated fiber network (dual-direction fiber ring), a cable network and traditional telco.

    We host some applications in house that we write, demo products for clients, and use VoIP for our phone system (VoIP phones + VoIP service).

    The reason I was starting with iperf testing was to see what throughput I could get. I plan on doing some throughput tests over VPN, VoIP testing, and adding in more WAN links to see how that works out, but I was just really surprised to see the results.



  • I can understand your surprise.

    Here's a possible explanation (assumes iperf uses TCP connections): The throughput of a TCP connection depends on the speed of the communications channel, the TCP window size and the round trip time (the time it takes to send data and receive the corresponding acknowledgement). TCP can send at most a window full of data before it has to wait for an acknowledgement of already sent data. When you increase the number of hops in the communications channel you almost always increase the round trip time, hence you may need to increase the TCP window size to compensate for the larger round trip time. If you add 1 ms to a round trip time of a few hundred milliseconds, it's unlikely anyone will notice. If you add 1 ms to a round trip time of a couple hundred microseconds, it may be very noticeable.
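
To put rough numbers on that explanation, here is a minimal Python sketch of the "one window per round trip" bound. The 8 KiB window, the 200 microsecond LAN round trip and the 300 ms WAN round trip are illustrative assumptions, not measurements from this thread, and the function name is made up for the example.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP connection: at most one full window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


WINDOW = 8 * 1024      # assumed 8 KiB TCP window
LAN_RTT = 200e-6       # assumed 200 microsecond round trip on a LAN
WAN_RTT = 300e-3       # assumed 300 ms round trip on a long WAN path
ADDED_DELAY = 1e-3     # the extra 1 ms from the example above

lan_before = max_tcp_throughput_mbps(WINDOW, LAN_RTT)
lan_after = max_tcp_throughput_mbps(WINDOW, LAN_RTT + ADDED_DELAY)
wan_before = max_tcp_throughput_mbps(WINDOW, WAN_RTT)
wan_after = max_tcp_throughput_mbps(WINDOW, WAN_RTT + ADDED_DELAY)

print(f"LAN: {lan_before:.1f} -> {lan_after:.1f} Mbit/s")
print(f"WAN: {wan_before:.3f} -> {wan_after:.3f} Mbit/s")
```

Under these assumptions the LAN bound falls from about 328 Mbit/s to about 55 Mbit/s, while the WAN bound barely moves. That is the sense in which an extra millisecond can be "very noticeable" on a fast local path.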



  • I tried another laptop just to rule that out and without pfsense got 500-700 MBits/sec dependent on window size and got 36-40 MBits/sec with pfsense. Next I am going to try other hardware and try 2.0.



  • Alright, so I tried different hardware as well as the latest 2.0 build. With an 8k window I got 441 MBits/s from the laptop to my network, and 31.5 MBits/s when I put the 2.0 pfSense box in the mix.



  • Can you please post a diagram of what your physical connectivity looks like?  What sits between your PC testing machine on the WAN side and the LAN side?  Any network switches?  Are your testing boxes connected directly to the pfSense box?



  • I will draw some up now. I have tried on the lan side to include a gb switch as well as direct connection to pfsense.



  • Here you go. The pfSense box has a dual-port Intel Pro/1000 64-bit PCI card (in a 32-bit slot). While I realize this will not yield top numbers, it should do a heck of a lot better than it is. Also, I am running these tests using part of my LAN as the WAN instead of putting this box into my WAN links (I wanted to get the performance stuff licked before going any further).

    Like I said before I have tried using a switch between the pfSense box and the laptop but that didn't make any difference in performance.

    Thanks in advance.
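
As a sanity check on the "64-bit card in a 32-bit slot" point, the theoretical peak of a parallel PCI bus is simply bus width times clock. The sketch below assumes the 32-bit slot runs at the nominal 33 MHz, which is typical for desktop PCI slots but is not stated in the thread. It also assumes routed traffic crosses the shared bus twice, once in and once out. The function name is hypothetical.

```python
def pci_peak_mbps(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel PCI bus: one bus-width transfer per clock.

    bits * MHz gives Mbit/s directly. Real throughput is lower because of
    arbitration, protocol overhead and other devices sharing the bus.
    """
    return bus_width_bits * clock_mhz


slot_32bit = pci_peak_mbps(32, 33.33)   # assumed: 32-bit slot at nominal 33 MHz
slot_64bit = pci_peak_mbps(64, 66.66)   # 64-bit slot at nominal 66 MHz

# Assumption: each forwarded packet is DMA'd in on one port and out on the
# other, so it uses the shared bus twice.
forwarding_ceiling_32bit = slot_32bit / 2

observed_mbps = 36.0                    # figure reported earlier in the thread

print(f"32-bit/33 MHz peak: {slot_32bit:.0f} Mbit/s")
print(f"64-bit/66 MHz peak: {slot_64bit:.0f} Mbit/s")
print(f"Rough forwarding ceiling in the 32-bit slot: {forwarding_ceiling_32bit:.0f} Mbit/s")
print(f"Observed: {observed_mbps:.0f} Mbit/s")
```

Even with generous allowance for overhead, the 32-bit slot's ceiling is an order of magnitude above the 36 Mbit/s observed, which supports the expectation that the slot alone should not be the limiting factor.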



  • OK - the name of the game is isolation and elimination.  Here is what I would do:

    Setup base systems:
    --------------------------

    • Get three test machines, install your favorite OS (Linux, BSD, Win) on all 3
    • Using a pair of machines at a time, connect two test machines back-to-back in the same subnet to ensure all the NICs are working properly (system 1 <--> system 2,  system-1 <--> system-3,  system-2 <--> system-3)
    • Don't use any additional hardware (no network switches, etc)
    • Use a tool to measure download/upload performance (iPerf or use a web server on test machine 1 and download files to test machine 2)
    • Do this for all three machines until you have verified they all work properly
    • Get baseline traffic measured

    Note: You mentioned your pfSense has two NICs.  Use BOTH NICs in this test to ensure both ports are operating properly and at rated speeds

    Test back-to-back with pfSense (see attached image):
    -------------------------------------------------------------------

    • On machine one, perform a fresh install of pfSense 1.2.3-RELEASE (no additional packages, etc)
    • Configure the LAN and WAN ports on the pfSense box as necessary
    • Configure test box two with an ip address on the WAN side (don't change any OS stuff)
    • Configure test box three with an ip address on the LAN side (don't change any OS stuff)
    • Don't use any additional hardware (no network switches, use same cables, etc)
    • Use the same tool to measure download/upload performance

    If you have problems with the back-to-back tests with pfSense, you have narrowed down the problem to the pfSense box:

    • Go into BIOS on pfSense box and disable any power-saving features (APIC, etc).  Look for any adjustments to the PCI bus - re-run tests
    • Go into pfSense and disable any h/w offloading, r/x checksumming, h/w VLANs, etc - re-run tests
    • Find a completely different machine to run pfSense - re-run tests

    Remember, the name of the game is isolation and elimination.  Start with a known good and work from there.  This could easily be a BIOS issue on your pfSense box and the PCI-X NIC, or it could be some incompatibility between your Dell network switch and the pfSense box (jumbo frames etc).

    Let us know what you find...




  • Worked out the issue. For some reason our current FW/router was slowing down traffic: our DHCP and static zones are on separate subnets, and our existing FW is slow and was limiting the throughput. I put the pfSense box and my test server on the same subnet and voila, everything worked fine. With a single dual-port Pro/1000 card in a 64-bit 66 MHz slot I am getting about 260 Mbits/s with an 8k window, which is spot on. Your testing procedure made me realize what was happening, so thanks for your help.
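
For anyone comparing their own numbers, the figures in this thread can be turned around to estimate the effective round trip time they imply. This is only a sketch: it assumes the iperf run was limited by the window (one 8 KiB window per round trip) rather than by CPU or the bus, and the function name is made up for the example.

```python
def implied_rtt_us(window_bytes: int, throughput_mbps: float) -> float:
    """Round trip time (microseconds) implied by a window-limited TCP transfer."""
    return window_bytes * 8 / throughput_mbps   # bits / (Mbit/s) = microseconds


WINDOW = 8 * 1024                     # assumed: "8k" means an 8 KiB window

fast_path = implied_rtt_us(WINDOW, 260.0)   # same-subnet result
slow_path = implied_rtt_us(WINDOW, 36.0)    # original result via the old firewall

print(f"260 Mbit/s implies roughly {fast_path:.0f} us per round trip")
print(f" 36 Mbit/s implies roughly {slow_path:.0f} us per round trip")
```

Under that assumption the same-subnet path works out to about a quarter of a millisecond per round trip, and the original path to roughly 1.8 ms, which is consistent with a slow device sitting in the middle.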



  • Glad you got it sorted out!



  • i'm having the same problem but still having problems tracing it out using the troubleshooting method (using iperf)

    Here's the list of setups i've tried

    Ubuntu -> Pfsense - 814Mb/s
    Pfsense -> Ubuntu - 1Gb/s
    Win2k8R2 -> Ubuntu - 727Mb/s
    Win2k8R2 -> Pfsense - 460Mb/s
    Win2k8R2 -> Win2k8R2 - 910Mb/s
    Win2k8R2 -> FreeNAS - 502Mb/s
    FreeNAS -> Pfsense - 293Mb/s
    Pfsense -> FreeNAS - 400Mb/s

    All of the devices above are equipped with gigabit ethernet interfaces with CAT6 cables.



  • i have narrowed down the problem.. and it seems that the tcp window size on each of these machines is different.. i've tried to manually put in a fixed tcp window size on the windows machines but it doesn't commit to the changes no matter what  :-\



  • @calvinz:

    i have narrowed down the problem.. and it seems that the tcp window size on each of these machines is different.. i've tried to manually put in a fixed tcp window size on the windows machines but it doesn't commit to the changes no matter what  :-\

    Have you tried the "DrTCP" utility from http://www.dslreports.com ?



  • somehow the tweaks from DrTCP don't apply to Windows7/Windows Server 2008 R2.. :(



  • Try running the following command in an elevated Command Prompt for Win7/ 2k8:

    netsh int tcp set global rss=enabled autotuninglevel=experimental congestionprovider=ctcp netdma=enabled

