Netgate Discussion Forum

    Load testing methods, PPS & Bandwidth - performance with igb/em

Hardware
6 Posts 3 Posters 12.4k Views
• ben_uk

      Hi guys,

I'm replacing my existing firewalls with some pfSense boxes and I'm trying to get an idea of performance and how it should be tested. To give a quick overview of the configuration, I'm using the hardware below. The servers are probably overkill - but it's what I have spare. I should note that I tested this with two different motherboards, one with igb and one with em, and saw the same results. The servers listed below use a Supermicro X9SCD-F with 2x integrated Intel 82580DB.

      2 pfsense servers: 3.4 GHz Intel Xeon E3-1240v2 / 16GB RAM / 2x 10KRPM SATAIII HDD (RAID1 gmirror)
      2 test servers: 3.4 GHz Intel Xeon E3-1240v2 / 16GB RAM / 1x 10KRPM SATAIII HDD
      2 switches: 1Gbit Juniper EX-3200

The configuration is a router-on-a-stick setup providing firewalling and inter-VLAN routing, with a single trunked 1Gbit interface to the switch (carrying the WAN VLAN and the internal VLANs).

      Bandwidth
I've been using iperf for inter-VLAN testing, with the following commands:

      Server 1: iperf -s
      Server 2: iperf -c 188.94.17.130 -d
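
(For reference, -d runs the two directions simultaneously. A variant worth trying, to rule out a single TCP connection being the limit, is several parallel streams - these are just standard iperf 2 options, nothing pfSense-specific:)

Server 1: iperf -s
Server 2: iperf -c 188.94.17.130 -P 4 -t 30 (4 parallel streams, 30 second run)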

      Packets per second

pfSense 1: netstat -w 1 -I igb0 (to view packets/second)
Server 1: hping 10.0.1.1 -q -i u2 --data 64 --icmp | tail -n10
Server 2: hping 10.0.2.1 -q -i u2 --data 64 --icmp | tail -n10

(i.e. pinging each other).
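
(One caveat for anyone copying this: -i u2 tells hping to wait 2 microseconds between packets, so each sender tops out around 500,000 pps and the two senders together can't offer much more than ~1M pps. With hping3, flood mode removes that per-sender cap - this is just the standard flag, shown as a sketch rather than something I've benchmarked here:)

Server 1: hping3 10.0.1.1 -q --flood --icmp -d 64
Server 2: hping3 10.0.2.1 -q --flood --icmp -d 64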

Are there any better / more accurate ways of performing this testing? I'm not quite getting the output that I expect.

E.g. bandwidth results:

      
      [ ID] Interval       Transfer     Bandwidth
      [  4]  0.0-10.0 sec    459 MBytes    385 Mbits/sec
      [ ID] Interval       Transfer     Bandwidth
      [  5]  0.0-10.0 sec    642 MBytes    538 Mbits/sec
      
      

I would have expected (with full duplex 1Gb, and auto-negotiation turned off on everything) to see 1Gb in each direction, not ~1Gb in total?

E.g. packets per second:

The theoretical max over a 1Gbit connection should be about 1.4 million 64-byte packets per second, but I'm falling well short of this:
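
(For anyone checking that figure: a minimum-size frame costs 64 bytes + 8 bytes preamble + 12 bytes inter-frame gap = 84 bytes = 672 bits on the wire, and 1,000,000,000 bits/s / 672 bits ≈ 1,488,095 packets per second.)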

      
                  input         (igb0)           output
         packets  errs idrops      bytes    packets  errs      bytes colls
          731579     0     0   77851066     489258     0   56387720     0
      
      

      During the test, top -aSCHIP shows

      
      last pid: 55400;  load averages:  0.38,  0.14,  0.09        up 0+00:21:53  14:30:55
      157 processes: 10 running, 110 sleeping, 37 waiting
      CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
      CPU 1:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
      CPU 2:  0.0% user,  0.0% nice,  0.0% system, 28.2% interrupt, 71.8% idle
      CPU 3:  0.0% user,  0.0% nice,  0.0% system, 27.4% interrupt, 72.6% idle
      CPU 4:  0.0% user,  0.0% nice,  6.0% system,  0.0% interrupt, 94.0% idle
      CPU 5:  0.0% user,  0.0% nice, 12.8% system,  0.0% interrupt, 87.2% idle
      CPU 6:  0.0% user,  0.0% nice, 13.2% system,  0.0% interrupt, 86.8% idle
      CPU 7:  0.0% user,  0.0% nice,  9.8% system,  0.0% interrupt, 90.2% idle
      Mem: 52M Active, 15M Inact, 434M Wired, 72K Cache, 34M Buf, 15G Free
      Swap: 32G Total, 32G Free
      
      

      And vmstat -i shows

      
      interrupt                          total       rate
      irq1: atkbd0                          18          0
      irq16: ehci0                        2014          1
      irq19: atapci0                     11985          9
      irq23: ehci1                        2015          1
      cpu0: timer                      2584464       1995
      irq256: igb0:que 0                778068        600
      irq257: igb0:que 1                740291        571
      irq258: igb0:que 2              10529010       8130
      irq259: igb0:que 3              10489491       8099
      irq260: igb0:que 4                830229        641
      irq261: igb0:que 5                762681        588
      irq262: igb0:que 6                798454        616
      irq263: igb0:que 7                887188        685
      irq264: igb0:link                      3          0
      cpu1: timer                      2564435       1980
      cpu4: timer                      2564434       1980
      cpu3: timer                      2564434       1980
      cpu5: timer                      2564434       1980
      cpu6: timer                      2564434       1980
      cpu2: timer                      2564434       1980
      cpu7: timer                      2564434       1980
      Total                           46366950      35804
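
(One thing that stands out in the vmstat output: nearly all the NIC interrupt load lands on que 2 and que 3 (~8,100/s each versus ~600/s on the rest), which matches top, where one CPU is pegged at 100% interrupt and a couple of others sit around 28%. That's expected with only two flows - RSS hashes each flow onto a single RX queue, so a two-host ping test can only ever exercise a couple of the 8 queues/cores. On driver builds that expose per-queue counters, something like the command below shows the imbalance directly; treat it as a sketch, since the exact sysctl names vary by igb version:)

sysctl dev.igb.0 | grep packets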
      
      

In terms of BSD tunables:

      
      /etc/sysctl.conf
      
      dev.igb.0.enable_lro=0
      dev.igb.1.enable_lro=0
      kern.random.sys.harvest.interrupt=0
      kern.random.sys.harvest.ethernet=0
      net.inet.ip.fastforwarding=1
      kern.timecounter.hardware=HPET
      dev.igb.0.rx_processing_limit=480
      dev.igb.1.rx_processing_limit=480
      kern.ipc.nmbclusters=512000
      
      /boot/loader.conf
      
      autoboot_delay="3"
      vm.kmem_size="435544320"
      vm.kmem_size_max="535544320"
      kern.ipc.nmbclusters="655356"
      hw.igb.num_queues="8"
      hw.igb.max_interrupt_rate="30000"
      hw.igb.rxd="3096"
      hw.igb.txd="3096"
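
(A sanity check worth doing after a reboot is reading the values back to confirm they actually applied - nothing pfSense-specific here, just sysctl. The hw.igb.* loader tunables show up as read-only sysctls on the igb driver versions I've seen, so the grep should list them too:)

sysctl dev.igb.0.rx_processing_limit dev.igb.0.enable_lro net.inet.ip.fastforwarding
sysctl -a | grep '^hw.igb'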
      
      

What I want to know is:

1. Are the testing methods I am using accurate?
2. Are the results I am seeing good/average/poor?
3. Is there anything else I should be doing?

      NB. There is no NAT/rate limiting, just pure firewalling and VLAN routing.

      Incidentally, I did contact the consultancy wing of pfsense for paid professional support - but after 3 emails without a response, I'm not sure that anyone actually supports it?

• ben_uk

Actually, regarding bandwidth, I've managed to answer my own question. It appears the rate is normal (i.e. ~1Gbit in total). After reviewing systat, I can see that the trunk interface is already running at maximum throughput:

        
        # systat -ifstat
        
              Interface           Traffic               Peak                Total
        
                   igb0  in    115.311 MB/s        115.311 MB/s            6.273 GB
                         out   115.497 MB/s        115.497 MB/s            6.033 GB
        
        

I split the WAN off from the VLAN trunk and put them on igb0 and igb1 respectively, ran iperf again and saw a full 1Gbps in each direction. So the trunk was certainly the limiting factor.
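
(The arithmetic backs this up: every routed packet crosses the trunk twice, so the two iperf streams share each direction of the link. 385 + 538 = 923 Mbit/s of combined goodput is roughly one 1Gbit link's worth, and the ~115 MB/s in and out that systat reports works out to roughly 920-970 Mbit/s per direction - i.e. the single trunk port was saturated both ways.)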

• wallabybob

          @ben_uk:

          1. Are the testing methods I am using accurate?

They measure what they measure - they are accurate in that respect. But perhaps what they measure is not particularly relevant to your particular circumstance. Consider a motor car: 0-60 km/h is probably a very relevant metric if you want to drag race other people at the traffic lights, but probably not particularly relevant to an elderly person buying a car for trips on suburban roads to destinations at most a few suburbs away.

          Your ping statistic is interesting, but how much of your "real life" traffic is continuous pings?

          Some people have reported that putting a pfSense box between two systems results in significant loss of bandwidth over a single TCP connection between the systems. This might be relevant if they are looking primarily to reduce the time of a single bulk transfer (e.g. a large backup) through a pfSense box but is perhaps of much less relevance if they are more concerned that the pfSense box is adequate to support large numbers of concurrent web page downloads. What attributes of a pfSense box are most important to you?

• ben_uk

At the moment, pfSense is already suitable. The key task is to route <20Mbps over a 1000Mb bearer - but as it is an edge appliance, there is a requirement to be able to cope in non-normal situations (small DoS attacks and high levels of inter-VLAN traffic). Note, I say cope - this is not the purpose of the firewall, but it is going to be best if the firewall is tuned to the best of its ability.

So to answer your question: no, they won't be under continuous ping nor sustained transfers.

But I was actually just aiming to work towards a target of 1.4M pps - and I wonder if the single VLAN trunk (i.e. just 1x 1Gb interface) is actually the limiting factor, due to tx and rx occurring 4 times over (hence the halved iperf results seen above).

            Given the server has 2x 1Gb interfaces, what would be an optimal configuration?

            igb0: wan
            igb1: vlan trunk

            or

            igb0 + igb1 (lacp lagg): vlan trunk

            or

            something else?

Regarding the testing methods - what I actually wanted to know is how people test PPS rates. I've really struggled to find examples of what testing/tools/commands people use when coming up with a figure for 64-byte packet forwarding. I.e. whether they use hping or not, whether it is UDP or not, whether it is ICMP or not, etc.

• ben_uk

Again, to follow up here. I set up a LAGG with LACP and bonded the two interfaces for the VLAN trunk to see if it altered the bandwidth test. Between 2 servers, it didn't change anything - but when testing with 4 servers, the extra bandwidth showed up. There's a good explanation of the limitations of LACP here: https://supportforums.cisco.com/thread/2132362

If you are just transferring between 2 addresses, that conversation will only flow down a single port within that port channel; that's the way port channels work. As you get more inputs from different addresses, the port channels will be more evened out due to the way the switch hashes the traffic from different sources down each port in the port channel. A single given conversation will only go down a single port.
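
(For anyone wanting to see what this looks like at the OS level: on pfSense the LAGG is created through the web GUI, but underneath it boils down to FreeBSD ifconfig commands along these lines - the VLAN ID here is made up purely for illustration:)

ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport igb0 laggport igb1 up
ifconfig vlan100 create vlan 100 vlandev lagg0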

• SeventhSon

                I would go for the LAGG option, for redundancy (at least for NIC/cable).

As for PPS testing, just lower the MTU on the sending side and run iperf again?
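
Or keep the MTU and use UDP with a small datagram size instead - standard iperf 2 options, just a sketch (a single iperf sender usually can't actually generate anywhere near 1.4M pps on its own, so check the sender's own report as well as the receiver's):

Server 1: iperf -s -u
Server 2: iperf -u -c 188.94.17.130 -b 1000M -l 64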
