Netgate Discussion Forum

    Fabiatech FX5625 improving throughput

    • SimonB256
      last edited by SimonB256

      @stephenw10

      After manually pushing some data through to generate this load, the main process responsible is 'intr{irq257: em0:rx0}', with similar processes for the other interfaces alongside it, though not quite as high (understandably, as em0 is the WAN interface).

      Sample output:

      PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
         11 root     155 ki31     0K    64K CPU1    1  23.6H  87.26% [idle{idle: cpu1}]
         12 root     -92    -     0K   832K CPU0    0 233:12  79.73% [intr{irq257: em0:rx0}]
         11 root     155 ki31     0K    64K RUN     3  23.9H  76.87% [idle{idle: cpu3}]
         11 root     155 ki31     0K    64K CPU2    2  23.2H  49.23% [idle{idle: cpu2}]
         12 root     -92    -     0K   832K WAIT    2   4:05  34.74% [intr{irq278: em5:rx0}]
          0 root     -92    -     0K   816K -       3   7:47  20.59% [kernel{em0 rxq (cpuid 0)}]
         11 root     155 ki31     0K    64K RUN     0  18.6H  13.41% [idle{idle: cpu0}]
         12 root     -92    -     0K   832K WAIT    2  51:47  11.30% [intr{irq261: em1:rx0}]
         12 root     -92    -     0K   832K WAIT    0 107:33   5.89% [intr{irq265: em2:rx0}]
          0 root     -92    -     0K   816K -       2  23:04   5.05% [kernel{dummynet}]
         12 root     -92    -     0K   832K WAIT    1  16:14   4.75% [intr{irq258: em0:tx0}]
         12 root     -92    -     0K   832K WAIT    3   0:16   4.39% [intr{irq279: em5:tx0}]
         12 root     -92    -     0K   832K WAIT    3   6:09   1.87% [intr{irq262: em1:tx0}]
         12 root     -92    -     0K   832K WAIT    2  13:49   1.40% [intr{irq269: em3:rx0}]
          0 root     -92    -     0K   816K -       1   1:41   0.75% [kernel{em5 rxq (cpuid 2)}]
         12 root     -92    -     0K   832K WAIT    1  15:43   0.58% [intr{irq266: em2:tx0}]
      74844 root      20    0  9868K  4700K CPU3    3   0:00   0.53% top -aSH
          0 root     -92    -     0K   816K -       1   2:42   0.46% [kernel{em1 rxq (cpuid 2)}]
         12 root     -92    -     0K   832K WAIT    0  11:15   0.42% [intr{irq281: em6:rx0}]
         12 root     -60    -     0K   832K WAIT    1   3:25   0.27% [intr{swi4: clock (0)}]
         12 root     -92    -     0K   832K WAIT    3   2:18   0.26% [intr{irq270: em3:tx0}]
      

      Checking things like mbuf usage, there appears to be plenty of room there:

      35554/14801/50355 mbufs in use (current/cache/total)
      33501/13093/46594/249500 mbuf clusters in use (current/cache/total/max)
      33501/13051 mbuf+clusters out of packet secondary zone in use (current/cache)
      0/34/34/124749 4k (page size) jumbo clusters in use (current/cache/total/max)
      0/0/0/36962 9k jumbo clusters in use (current/cache/total/max)
      0/0/0/20791 16k jumbo clusters in use (current/cache/total/max)
      75890K/30022K/105912K bytes allocated to network (current/cache/total)
      0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
      0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
      0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
      0/0/0 requests for jumbo clusters denied (4k/9k/16k)
      0 sendfile syscalls
      0 sendfile syscalls completed without I/O request
      0 requests for I/O initiated by sendfile
      0 pages read by sendfile as part of a request
      0 pages were valid at time of a sendfile request
      0 pages were requested for read ahead by applications
      0 pages were read ahead by sendfile
      0 times sendfile encountered an already busy page
      0 requests for sfbufs denied
      0 requests for sfbufs delayed
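
      (Those statistics come from netstat -m; the denied/delayed counters staying at zero is what indicates mbufs aren't being exhausted. A quick way to watch just those indicators is something like:)

      netstat -m | grep -E 'denied|delayed'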
      

      Current MBUF limit set as:

      [2.4.5-RELEASE][admin@firewall1.midlandcomputers.com]/root: sysctl kern.ipc.nmbclusters
      kern.ipc.nmbclusters: 249500
      
      • stephenw10 Netgate Administrator
        last edited by

        em uses a single receive and transmit queue so you're unlikely to exhaust the mbufs.

        What throughput were you seeing when that was taken?
        Between which interfaces?

        What throughput do you see without any of those loader variables, just using the em defaults?

        What output do you get from vmstat -i and sysctl net.isr?

        Steve

        • SimonB256 @stephenw10
          last edited by

          Output from sysctl net.isr:

          net.isr.numthreads: 4
          net.isr.maxprot: 16
          net.isr.defaultqlimit: 256
          net.isr.maxqlimit: 10240
          net.isr.bindthreads: 0
          net.isr.maxthreads: 4
          net.isr.dispatch: direct
          

          Output from vmstat -i:

          interrupt                          total       rate
          irq18: uhci2+                     304106          3
          cpu0:timer                     108772857       1036
          cpu1:timer                      68073061        648
          cpu2:timer                       9281390         88
          cpu3:timer                      19118159        182
          irq257: em0:rx0                194215751       1850
          irq258: em0:tx0                229258370       2183
          irq259: em0:link                       1          0
          irq261: em1:rx0                 48310327        460
          irq262: em1:tx0                 82599543        787
          irq263: em1:link                       1          0
          irq265: em2:rx0                113082535       1077
          irq266: em2:tx0                193176467       1840
          irq267: em2:link                       1          0
          irq269: em3:rx0                 23497096        224
          irq270: em3:tx0                 39913436        380
          irq271: em3:link                       1          0
          irq273: em4:rx0                   157084          1
          irq274: em4:tx0                   104642          1
          irq275: em4:link                       1          0
          irq277: pcib8                          1          0
          irq278: em5:rx0                  3537702         34
          irq279: em5:tx0                  3615446         34
          irq280: em5:link                       1          0
          irq281: em6:rx0                 11959127        114
          irq282: em6:tx0                 15965140        152
          irq283: em6:link                       1          0
          irq284: em7:rx0                   421216          4
          irq285: em7:tx0                    21775          0
          irq286: em7:link                       9          0
          Total                         1165385247      11098
          

          In the example I posted above I was simply downloading large files to two hosts without bandwidth caps, where em0 is the WAN interface and em1 & em5 are the interfaces the hosts were sitting behind.
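
          (If a more repeatable load generator is ever wanted than plain file downloads, something like iperf3 between the two hosts would do; the address below is just a placeholder:)

          # On the host behind em1, run an iperf3 server:
          iperf3 -s

          # On the host behind em5, push several parallel TCP streams
          # through the firewall for 30 seconds (placeholder server address):
          iperf3 -c 192.168.1.10 -P 4 -t 30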

          I will remove what I have entered from the loader.conf, reboot and retry, but rebooting the firewall during office hours is a pain to arrange. I'll get this done this evening.

          • stephenw10 Netgate Administrator
            last edited by

            You might try setting:
            net.isr.bindthreads=1

            The core affinity might give you better distribution.
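
            A minimal sketch of how that could be applied, assuming it is set as a boot-time tunable in /boot/loader.conf.local (which pfSense leaves alone, unlike /boot/loader.conf):

            # /boot/loader.conf.local -- read at boot, so a reboot is required
            # Pin each netisr worker thread to its own CPU core
            net.isr.bindthreads=1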

            • SimonB256 @stephenw10
              last edited by

              Hi,

              I've set that and rebooted, and will test over the weekend.

              I might be going completely down the wrong train of thought, but would net.isr.direct=1 possibly also help?

              • stephenw10 Netgate Administrator
                last edited by

                @SimonB256 said in Fabiatech FX5625 improving throughput:

                net.isr.direct

                That sysctl doesn't exist in FreeBSD after 9 (pfSense 2.4.5 is built on FreeBSD 11.3); net.isr.dispatch=direct is its replacement and does the same thing.
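
                A quick way to see what the box is currently doing (a sketch; direct, hybrid and deferred are the possible policies, and this one can be changed at runtime without a reboot):

                sysctl net.isr.dispatch
                net.isr.dispatch: direct

                # e.g. to experiment with deferred dispatch:
                sysctl net.isr.dispatch=deferred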

                Steve

                • SimonB256
                  last edited by

                  Just to update, it appears that I am now getting better throughput after adding net.isr.bindthreads=1.
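
                  (For anyone following along, one way to confirm the binding actually took effect after the reboot, and to see how the netisr queues are behaving, is roughly:)

                  # Tunable should now read 1
                  sysctl net.isr.bindthreads

                  # netstat -Q prints the netisr configuration, including
                  # whether the worker threads are bound to CPUs
                  netstat -Q | head -n 12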

                  Thank you for your help.

                  • stephenw10 Netgate Administrator
                    last edited by

                    Ah, good to hear. What sort of improvement are you seeing?

                    • SimonB256
                      last edited by

                      In terms of throughput I'm only seeing a 15-20Mbps increase (so we're up to 470Mbps). But we're seeing far less packet loss at the top end of these speeds.

                      Looking further at the kind of traffic we're handling, we're talking around 600-700 flows at any given time (according to an ntop instance I have running elsewhere in the network) and around 15k-20k states listed on the firewall itself.

                      So I imagine that for this small device, handling a reasonable number of small connections at any one time might explain why we wouldn't be getting to the 600Mbps+ theoretical max.
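
                      (For reference, the state count can also be read directly on the firewall from the shell; roughly:)

                      # State table summary (current entries, searches, inserts, removals)
                      pfctl -si | grep -A 3 'State Table'

                      # Or count the entries directly
                      pfctl -ss | wc -l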

                      • stephenw10 Netgate Administrator
                        last edited by

                        Yes, that seems reasonable. You would only see >600Mbps with all full-size packets.
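
                        Rough numbers to illustrate, assuming the box is roughly packet-rate limited at its current load (a back-of-envelope sketch):

                        # ~470 Mbit/s of 1500-byte packets is about 39,000 packets/s
                        echo '470000000 / (1500 * 8)' | bc

                        # the same packet rate made of 512-byte packets carries only ~160 Mbit/s
                        echo '39166 * 512 * 8 / 1000000' | bc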

                        Steve
