    CPU Usage when network used

qwaven:

      Hi all,

I've got a fairly new install of pfSense with the latest update. I have not installed or set up anything major like IDS, etc. It's just fairly basic firewall rules and some NAT.

I have installed a 10G SFP+ PCIe card, a Chelsio Communications T520-SO-CR, which I understand to be compatible with pfSense. On first boot it appeared to update some firmware on the card and was detected fine.

I do have a few VLANs configured on the SFP+ port.

Generally speaking the card "works"; however, I've noticed that my CPU utilization goes way up while doing just about nothing on the box except transferring data.

For example, I've set up iperf3 as a client on pfSense, pushing data to my NAS.

pfSense SFP+ (NAS VLAN) --> directly connected switch --> NAS on the same switch
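
For reference, the test was essentially the following (the NAS address is a placeholder, not the real value):

# on the NAS
iperf3 -s

# on pfSense (Diagnostics > Command Prompt or an SSH shell)
iperf3 -c <nas-ip>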

      I notice two things:

1. The speed is pretty weak for 10G, at about 445 Mbits/sec.
2. The CPU graph on pfSense climbs from 0-5% to around 60-70% utilization once iperf starts.

Something does not seem to be working very well, and I am trying to determine where the issue is. Throughput-wise it could be any number of things external to the firewall, but the CPU utilization, I would imagine, is all pfSense. I am hoping to correct both, but for now I would be happy to see the CPU go back to something a little more normal.

      My hardware is:

      CPU Type Intel(R) Pentium(R) CPU N3700 @ 1.60GHz
      4 CPUs: 1 package(s) x 4 core(s)
      AES-NI CPU Crypto: Yes (active)
      16GB of RAM

All graphs on the dashboard show minimal to no usage when idle.

Interface status for the SFP+ VLAN interface (cxl1.254):

Status: up
MAC Address: 00:07:43:4c:66:48
IPv4 Address: 10.10.254.1
IPv4 Subnet Mask: 255.255.255.128
IPv6 Link Local: fe80::207:43ff:fe4c:6648%cxl1.254
MTU: 1500
Media: 10Gbase-Twinax <full-duplex,rxpause,txpause>
In/out packets: 74293466/127365445 (33.39 GiB/124.34 GiB)
In/out packets (pass): 74293466/127365445 (33.39 GiB/124.34 GiB)
In/out packets (block): 2825/72 (252 KiB/4 KiB)
In/out errors: 0/199
Collisions: 0

Also note I have not set up anything like jumbo frames, etc. either. I have pretty much left the network settings/tuning at system defaults. I do not see any errors on the network.

Hoping someone can assist with digging deeper to determine whether anything can be corrected.

      Cheers!

stephenw10 (Netgate Administrator):

Hmm, you're running a low-power laptop CPU. Is it running in its turbo mode?

        You might try enabling powerd in Sys > Adv > Misc.

        Try running at the command line whilst you test: top -aSH

        See what is using your CPU cycles and how that's spread across the cores.
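
To see whether the CPU is actually clocking up, you can also check the frequency sysctls while the test runs (this assumes the standard FreeBSD cpufreq driver is attached):

# current frequency and the available frequency steps
sysctl dev.cpu.0.freq dev.cpu.0.freq_levels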

        Do you have other copper NICs you can test against?

        Steve

qwaven:

          Thanks for the reply.

I am running a low-power CPU but it is not a laptop. :)

I am not aware of how to check turbo mode, unless this is also part of powerd? I have that set to hiadaptive.

It's come to my attention that part of my issue might be that the adapter card is plugged into a PCIe slot too slow for its speed (10G), which might be the problem. I would have figured that would just give me less speed when transferring, though. I do not recall having CPU issues before when using the built-in 1G ports.

          I will try some tests when possible via the command you gave.

          Cheers!

stephenw10 (Netgate Administrator):

Plugging it into a narrower PCIe slot, say an x8 card in an x1 slot, should still work. The total throughput will obviously be lower, but the bandwidth of a single PCIe lane is still well above 500Mbps, even for v1.
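
Rough per-lane numbers, assuming the usual PCIe figures:

PCIe 1.x: 2.5 GT/s with 8b/10b encoding -> ~2 Gbit/s (~250 MB/s) usable per lane
PCIe 2.0: 5.0 GT/s with 8b/10b encoding -> ~4 Gbit/s (~500 MB/s) usable per lane

So even a single v1 lane is well above the 445 Mbit/s being measured; a narrow slot would cap a 10G card well below 10G, but it should not explain this result.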

Some CPUs/boards need powerd to use turbo mode, and on some it is disabled by default in the BIOS.
Turbo mode frequencies are not usually reported directly. You might see it shown as 1600MHz normally and 1601MHz when turbo is enabled.

            For example:

            Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
            Current: 2600 MHz, Max: 3201 MHz
            4 CPUs: 1 package(s) x 4 core(s)
            AES-NI CPU Crypto: Yes (active) 
            

            Steve

qwaven:

              I will do some more tests to confirm this.

              For reference:

The board is: https://www.supermicro.com/products/motherboard/X11/X11SBA-LN4F.cfm
It has a PCIe 2.0 x1 slot, which is about 500 MB/s of throughput. Unfortunately this means I need a better board at some point if I ever want to actually utilize my 10G setup, since running dual 10G interfaces seems kind of silly right now. :)

CPU-wise, though, I do see that mine is running at the lowest range compared to what is listed on Intel's site, which says burst can be 2.56 GHz.

              https://ark.intel.com/content/www/us/en/ark/products/91830/intel-pentium-processor-n3710-2m-cache-up-to-2-56-ghz.html

              Cheers!

Grimson @qwaven:

                @qwaven said in CPU Usage when network used:

                CPU wise though I do see that mine is running at the lowest range compared to what is listed on intels site which says burst can be 2.56Ghz.

Burst is up to 2.56 GHz, but it's up to the board/BIOS designer to decide if and how high the CPU is allowed to burst. Those decisions are usually based on how the cooling is designed; with passive cooling you often won't get the full burst speed, or any burst at all. Consult the documentation of your hardware, or their support, to see if and how far the CPU can burst on that board.

qwaven:

So yeah, I tried a few things: enabled/disabled powerd, set the powerd options to maximum, etc., and changed anything I could find in the BIOS to speed up the CPU.

Nothing seems to change it from 1.6 GHz. It looks likely that I cannot change this further.

Here is a random snippet of my CPU usage; it was more around 50% utilization this time.

                  PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
                  0 root -92 - 0K 688K - 3 1:03 95.53% [kernel{igb0 que (qid 0)}]
                  11 root 155 ki31 0K 64K RUN 1 19:19 78.63% [idle{idle: cpu1}]
                  11 root 155 ki31 0K 64K CPU0 0 19:23 67.33% [idle{idle: cpu0}]
                  11 root 155 ki31 0K 64K CPU2 2 19:13 64.38% [idle{idle: cpu2}]
                  11 root 155 ki31 0K 64K CPU3 3 19:12 30.00% [idle{idle: cpu3}]
                  12 root -92 - 0K 816K WAIT 1 0:19 19.35% [intr{irq265: t5nex0:1a0}]
                  12 root -92 - 0K 816K WAIT 2 0:15 17.80% [intr{irq267: t5nex0:1a2}]
                  12 root -92 - 0K 816K WAIT 1 0:15 16.78% [intr{irq266: t5nex0:1a1}]
                  12 root -92 - 0K 816K WAIT 3 0:15 15.07% [intr{irq268: t5nex0:1a3}]
                  12 root -92 - 0K 816K WAIT 0 0:16 14.61% [intr{irq269: igb0:que 0}]

                  Cheers!

stephenw10 (Netgate Administrator):

                    What were you doing at that point? You have nearly 100% on one NIC queue. Other load looks to be spread nicely though.

                    Steve

qwaven:

This was just a simple download from the internet.

Internet --> pfSense WAN --> (NAT) --> pfSense 10G interface --> switch --> destination host

igb0 would be the WAN NIC (1G port).
I believe t5nex0 is the Chelsio 10G card.

                      Cheers!

stephenw10 (Netgate Administrator):

                        Hmm, well it looks like if the restriction is anywhere it's on WAN.

                        Were you able to test with just the igb NICs? Remove the 10G from the test?

                        Steve

qwaven:

OK, so it took some changing things around a bit, but I have now switched to using only 1G interfaces.

Unfortunately I am not sure the results are much different.

                          PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
                          0 root -92 - 0K 688K CPU3 3 5:25 94.14% [kernel{igb0 que (qid 0)}]
                          11 root 155 ki31 0K 64K CPU1 1 24.8H 58.08% [idle{idle: cpu1}]
                          11 root 155 ki31 0K 64K CPU2 2 24.8H 55.88% [idle{idle: cpu2}]
                          11 root 155 ki31 0K 64K RUN 0 24.8H 44.88% [idle{idle: cpu0}]
                          12 root -92 - 0K 816K WAIT 0 1:15 36.41% [intr{irq287: igb3:que 0}]
                          11 root 155 ki31 0K 64K RUN 3 24.8H 32.95% [idle{idle: cpu3}]
                          12 root -92 - 0K 816K CPU1 1 1:15 30.30% [intr{irq288: igb3:que 1}]
                          78054 root 34 0 266M 218M bpf 2 0:48 22.20% /usr/local/bin/ntopng -d

For reference, the transfer was going at about 40 megabytes/sec.

                          Cheers!

stephenw10 (Netgate Administrator):

If you expand the window to get more output from top, do you actually see more than one queue on igb0?

You said you have mostly default settings; I assume you did not set the number of igb queues, or any other loader tunable?
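
For reference, a quick way to check whether any such tunable has been set (the hw.igb.num_queues name assumes the legacy igb(4) driver on this FreeBSD 11-based release):

# nothing should be returned on a default install
grep -i igb /boot/loader.conf /boot/loader.conf.local 2>/dev/null

# if you ever did want to force a queue count it would look like this in loader.conf.local:
# hw.igb.num_queues="4"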

                            Steve

qwaven:

                              Hi Steve,

I still had the shell open from the same transfer. Here is a more complete view.
I am not clear whether kernel{igb0 que (qid 0)} is different from intr{irq269: igb0:que 0}. However, for igb3 I see [intr{irq288: igb3:que 1}] and [intr{irq287: igb3:que 0}], which still seems low given I have 4 cores, no? I have not adjusted anything like this manually.

                              PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
                              11 root 155 ki31 0K 64K CPU3 3 25.4H 74.96% [idle{idle: cpu3}]
                              11 root 155 ki31 0K 64K RUN 1 25.4H 54.03% [idle{idle: cpu1}]
                              11 root 155 ki31 0K 64K RUN 0 25.3H 41.49% [idle{idle: cpu0}]
                              0 root -92 - 0K 688K CPU2 2 10:46 35.19% [kernel{igb0 que (qid 0)}]
                              11 root 155 ki31 0K 64K RUN 2 25.3H 33.86% [idle{idle: cpu2}]
                              12 root -92 - 0K 816K CPU1 1 3:36 31.32% [intr{irq288: igb3:que 1}]
                              12 root -92 - 0K 816K WAIT 0 3:40 29.27% [intr{irq287: igb3:que 0}]
                              12 root -92 - 0K 816K WAIT 0 5:50 17.34% [intr{irq269: igb0:que 0}]
                              78054 root 30 0 266M 221M RUN 1 2:13 16.83% /usr/local/bin/ntopng -d /v
                              78054 root 22 0 266M 221M uwait 3 0:12 9.10% /usr/local/bin/ntopng -d /v
                              78054 root 25 0 266M 221M uwait 0 0:11 7.71% /usr/local/bin/ntopng -d /v
                              78054 root 23 0 266M 221M uwait 3 0:11 7.62% /usr/local/bin/ntopng -d /v
                              78054 root 23 0 266M 221M nanslp 3 1:31 4.48% /usr/local/bin/ntopng -d /v
                              78054 root 21 0 266M 221M nanslp 1 0:48 4.16% /usr/local/bin/ntopng -d /v
                              78054 root 20 0 266M 221M nanslp 0 0:39 1.45% /usr/local/bin/ntopng -d /v
                              41253 unbound 20 0 65412K 44220K kqread 0 0:01 0.67% /usr/local/sbin/unbound -c
                              36170 root 21 0 98680K 39040K accept 3 0:06 0.62% php-fpm: pool nginx (php-fp
                              20 root -16 - 0K 16K - 0 0:37 0.57% [rand_harvestq]
                              0 root -92 - 0K 688K - 1 0:04 0.42% [kernel{igb3 que (qid 0)}]
                              12 root -92 - 0K 816K WAIT 3 0:34 0.34% [intr{irq290: igb3:que 3}]
                              78054 root 20 0 266M 221M bpf 1 0:03 0.25% /usr/local/bin/ntopng -d /v
                              198 root 20 0 9860K 4776K CPU0 0 0:07 0.25% top -aSH
                              75724 root 20 0 8428K 4984K kqread 0 0:04 0.21% redis-server: /usr/local/bi
                              23537 root 20 0 12912K 13032K usem 0 0:00 0.16% /usr/local/sbin/ntpd -g -c
                              12 root -72 - 0K 816K WAIT 3 0:14 0.14% [intr{swi1: netisr 0}]
                              50030 root 20 0 9464K 5868K select 3 0:10 0.14% /usr/local/sbin/miniupnpd -
                              22585 root 20 0 23592K 8804K kqread 3 0:01 0.12% nginx: worker process (ngin
                              12 root -60 - 0K 816K WAIT 0 1:21 0.11% [intr{swi4: clock (0)}]
                              65534 root 20 0 6600K 2356K bpf 3 0:07 0.08% /usr/local/sbin/filterlog -
                              0 root -92 - 0K 688K - 2 0:00 0.07% [kernel{igb3 que (qid 1)}]
                              339 root 36 0 98552K 39340K accept 1 0:13 0.07% php-fpm: pool nginx (php-fp
                              74721 root 20 0 50888K 35668K nanslp 3 0:02 0.07% /usr/local/bin/php -f /usr/
                              81162 root 20 0 6392K 2540K select 1 0:04 0.06% /usr/sbin/syslogd -s -c -c
                              78054 root 20 0 266M 221M nanslp 0 0:00 0.05% /usr/local/bin/ntopng -d /v
                              49333 dhcpd 20 0 12576K 7924K select 3 0:01 0.05% /usr/local/sbin/dhcpd -user
                              12 root -92 - 0K 816K RUN 2 0:20 0.04% [intr{irq289: igb3:que 2}]
                              78054 root 20 0 266M 221M select 0 0:00 0.04% /usr/local/bin/ntopng -d /v
                              19 root -16 - 0K 16K pftm 0 0:22 0.03% [pf purge]
                              44931 root 20 0 12904K 8152K select 0 0:01 0.03% sshd: root@pts/0 (sshd)
                              23537 root 20 0 12912K 13032K select 0 0:08 0.03% /usr/local/sbin/ntpd -g -c
                              36968 root 20 0 6900K 2444K nanslp 1 0:00 0.02% [dpinger{dpinger}]
                              36442 root 20 0 6900K 2444K nanslp 1 0:00 0.02% [dpinger{dpinger}]
                              12 root -88 - 0K 816K WAIT 0 0:06 0.01% [intr{irq257: xhci0}]
                              37136 root 20 0 6900K 2444K nanslp 1 0:00 0.01% [dpinger{dpinger}]
                              15 root -68 - 0K 80K - 3 0:05 0.01% [usb{usbus0}]
                              78054 root 20 0 266M 221M nanslp 0 0:00 0.01% /usr/local/bin/ntopng -d /v
                              15 root -68 - 0K 80K - 2 0:05 0.01% [usb{usbus0}]

                              Cheers!

stephenw10 (Netgate Administrator):

                                @qwaven said in CPU Usage when network used:

                                [intr{irq290: igb3:que 3}]

It looks like you have 4 queues for igb3, which is what I'd expect for a 4-core CPU, but I only see one for igb0.
You might try running vmstat -i to confirm you do have the expected queues for each NIC. I thought they were all on-chip in that CPU, but maybe igb0 is different, in which case you might try using igb3, or one of the others, as WAN.
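
For a quick look at just the NIC queues (assuming the interface names above):

vmstat -i | grep igb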

                                Steve

qwaven:

So with vmstat I see the correct number of queues:

                                  irq269: igb0:que 0 57225866 135
                                  irq270: igb0:que 1 421673 1
                                  irq271: igb0:que 2 425910 1
                                  irq272: igb0:que 3 421212 1
                                  irq273: igb0:link 11 0

                                  irq287: igb3:que 0 94141932 223
                                  irq288: igb3:que 1 45221540 107
                                  irq289: igb3:que 2 27199303 64
                                  irq290: igb3:que 3 35826209 85
                                  irq291: igb3:link 5 0

                                  Cheers!

stephenw10 (Netgate Administrator):

                                    Mmm, but all the interrupt loading is on one queue. Do you have a PPPoE WAN?

                                    The single thread performance of the N3700 is... not good. And potentially much worse if turbo/burst is not working.

                                    Do you see any significant improvement if you disable ntop-ng?

                                    Steve

qwaven:

Yes, the WAN is PPPoE. Would there be something I can do to use more queues properly?

                                      I can try and turn ntop off later to see what happens.

                                      Cheers!

stephenw10 (Netgate Administrator):

                                        Ah! OK then, currently, you are limited to a single queue on the PPPoE interface and hence a single core.

                                        See: https://redmine.pfsense.org/issues/4821

                                        And the upstream: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203856

                                        You can probably get some performance by setting the sysctl net.isr.dispatch to deferred in Sys > Adv > System Tunables. That will require a reboot.

                                        https://docs.netgate.com/pfsense/en/latest/hardware/tuning-and-troubleshooting-network-cards.html#pppoe-with-multi-queue-nics
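
To confirm it took effect after the reboot, checking the sysctl from a shell should show:

sysctl net.isr.dispatch
net.isr.dispatch: deferred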

                                        Steve

qwaven:

I tried the dispatch change:
                                          sysctl net.isr.dispatch
                                          net.isr.dispatch: deferred

CPU seemed to be at about 50% utilization.

                                          interrupt total rate
                                          cpu0:timer 122117 254
                                          cpu2:timer 121707 253
                                          cpu3:timer 116674 243
                                          cpu1:timer 115728 241
                                          irq256: ahci0 11720 24
                                          irq257: xhci0 2850 6
                                          irq258: hdac0 2 0
                                          irq260: t5nex0:evt 2 0
                                          irq269: igb0:que 0 659069 1372
                                          irq270: igb0:que 1 1457 3
                                          irq271: igb0:que 2 516 1
                                          irq272: igb0:que 3 515 1
                                          irq273: igb0:link 3 0
                                          irq274: pcib5 1 0
                                          irq280: pcib6 1 0
                                          irq286: pcib7 1 0
                                          irq287: igb3:que 0 453042 943
                                          irq288: igb3:que 1 573830 1194
                                          irq289: igb3:que 2 755133 1572
                                          irq290: igb3:que 3 438318 912
                                          irq291: igb3:link 3 0
                                          irq292: pcib8 1 0
                                          Total 3372690 7020

qwaven:

I have also now tried disabling ntop; CPU usage looks to be maybe 8-10% less.

stephenw10 (Netgate Administrator):

Was that 50% total CPU? Did throughput increase?

                                              Steve

qwaven:

That was what was shown on the dashboard for CPU usage. If utilization is stuck on one core, I am not sure there is anything else we can do.

As for throughput, it was about the same, but I am not worrying about that since the source for the transfer may affect it as well. Ideally it would be great to see it closer to my actual speed, but I'm not sure about testing it reliably.

                                                Cheers!

qwaven:

                                                  Hi again,

I'm assuming we've exhausted trying to improve the CPU utilization here, but I just wanted to say thanks for the help and effort. I am still open to trying anything, though.

                                                  Cheers!

stephenw10 (Netgate Administrator):

I suspect it might be. The single-thread performance of that CPU is about equal to that of the Pentium M I used to run, and that was good for ~650Mbps. At least according to this:
https://www.cpubenchmark.net/compare/Intel-Core2-Duo-E4500-vs-Intel-Pentium-N3700-vs-Intel-Pentium-M-1.73GHz/936vs2513vs1160
Obviously that's synthetic and there are many variables, etc. There is no PPPoE overhead in that test either.
The E4500 can pass Gigabit, just barely (at full-size TCP packets... many variables, etc.!).

If that is to be believed then it probably is running in burst mode, and I'm not sure there's much we can do before RSS is reworked in FreeBSD to allow multiple cores.

You could probably see better performance by off-loading the PPPoE to another device. Unfortunately that would probably mean a double-NAT scenario.

                                                    Steve

qwaven:

                                                      Hi Steve,

It's unfortunate about this RSS issue. I have another board that I plan to try out, though it's quite overkill, especially if only one core is going to be used for PPPoE. It does have some better onboard hardware that may help overall, but it is still just 2 GHz per core.

                                                      https://www.supermicro.com/products/motherboard/atom/A2SDi-H-TP4F.cfm

                                                      Cheers!

stephenw10 (Netgate Administrator):

Yes. I have a PPPoE WAN but fortunately/unfortunately it's nowhere near fast enough to worry about this. 😉

There are no benchmarks for the C3958, but if we assume it's the same as the C3858 with 4 more cores, then it should give about 40% better single-thread performance.

                                                        It does seem like a waste of cores unless you virtualise it.

                                                        Steve

qwaven:

                                                          Hi Steve,

So I flipped it over. Performance so far looks drastically better. CPU in the GUI was about 5-6% while transferring over PPPoE. I believe it is still just the one core.

                                                          PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
                                                          11 root 155 ki31 0K 256K CPU1 1 7:39 97.26% [idle{idle: cpu1}]
                                                          11 root 155 ki31 0K 256K CPU10 10 7:41 97.12% [idle{idle: cpu10}]
                                                          11 root 155 ki31 0K 256K CPU13 13 7:33 96.96% [idle{idle: cpu13}]
                                                          11 root 155 ki31 0K 256K CPU7 7 7:45 96.85% [idle{idle: cpu7}]
                                                          11 root 155 ki31 0K 256K CPU11 11 7:38 96.51% [idle{idle: cpu11}]
                                                          11 root 155 ki31 0K 256K RUN 4 7:43 96.46% [idle{idle: cpu4}]
                                                          11 root 155 ki31 0K 256K CPU3 3 7:44 96.46% [idle{idle: cpu3}]
                                                          11 root 155 ki31 0K 256K CPU9 9 7:36 96.26% [idle{idle: cpu9}]
                                                          11 root 155 ki31 0K 256K CPU5 5 7:42 95.99% [idle{idle: cpu5}]
                                                          11 root 155 ki31 0K 256K RUN 8 7:19 95.56% [idle{idle: cpu8}]
                                                          11 root 155 ki31 0K 256K CPU6 6 7:42 95.12% [idle{idle: cpu6}]
                                                          11 root 155 ki31 0K 256K CPU2 2 7:42 94.98% [idle{idle: cpu2}]
                                                          11 root 155 ki31 0K 256K CPU12 12 7:40 93.93% [idle{idle: cpu12}]
                                                          11 root 155 ki31 0K 256K RUN 15 7:35 87.04% [idle{idle: cpu15}]
                                                          11 root 155 ki31 0K 256K CPU14 14 7:31 82.95% [idle{idle: cpu14}]
                                                          11 root 155 ki31 0K 256K RUN 0 7:24 79.60% [idle{idle: cpu0}]

                                                          irq298: ix0:q0 2716423 6058
                                                          irq299: ix0:q1 244578 545
                                                          irq300: ix0:q2 461159 1029
                                                          irq301: ix0:q3 243416 543
                                                          irq302: ix0:q4 378891 845
                                                          irq303: ix0:q5 124788 278
                                                          irq304: ix0:q6 478729 1068
                                                          irq305: ix0:q7 125913 281
                                                          irq306: ix0:link 1 0
                                                          irq307: ix1:q0 326596 728
                                                          irq308: ix1:q1 254938 569
                                                          irq309: ix1:q2 614196 1370
                                                          irq310: ix1:q3 250402 558
                                                          irq311: ix1:q4 388996 868
                                                          irq312: ix1:q5 128709 287
                                                          irq313: ix1:q6 492403 1098
                                                          irq314: ix1:q7 130143 290
                                                          irq315: ix1:link 1 0

ix0 is PPPoE and ix1 is the internal LANs.

I was thinking about virtualizing. However, I've seen so much discussion suggesting this is not a great choice for a firewall. I'm open to exploring it more, though. Do you have any thoughts? Proxmox was my first choice.

                                                          Cheers!

stephenw10 (Netgate Administrator):

                                                            Nice, what sort of throughput were you seeing at that point?

I can't really advise on hypervisors; I'm not using anything right now.

                                                            A lot of people here are using Proxmox though. ESXi is also popular.

                                                            Steve

qwaven:

Same throughput, but I believe this is more because of the source. I have not had a chance to test the internal network to see if anything there has improved. I will update once I have.

qwaven:

So, testing with iperf3, I still don't seem to be getting anywhere close to 10G bandwidth.

It looks about spot on for 1G.

                                                                [ 41] 0.00-10.00 sec 56.4 MBytes 47.4 Mbits/sec 3258 sender
                                                                [ 41] 0.00-10.00 sec 56.4 MBytes 47.3 Mbits/sec receiver
                                                                [ 43] 0.00-10.00 sec 58.1 MBytes 48.8 Mbits/sec 3683 sender
                                                                [ 43] 0.00-10.00 sec 58.0 MBytes 48.6 Mbits/sec receiver
                                                                [SUM] 0.00-10.00 sec 1.10 GBytes 943 Mbits/sec 69930 sender
                                                                [SUM] 0.00-10.00 sec 1.10 GBytes 941 Mbits/sec receiver

                                                                Any ideas?

This is literally the SFP+ 10G interface on pfSense, to the switch, to the file server. The file server has two bonded 10G links. Nothing else is running.

                                                                Cheers!

stephenw10 (Netgate Administrator):

How many parallel streams are you running there?

You have 8 queues, so I don't expect to see any advantage beyond 8.

Is that result testing over 1G? What do you actually see over 10G?
I would anticipate something like ~4Gbps, maybe. Though if you're running iperf on the firewall itself, it may reduce that.

                                                                  Steve

qwaven:

My test with iperf was using 20 parallel connections (copying an example I saw on the internet), and it looks to pretty much saturate the link as if it were 1G.

This is not a 1G path; it is my internal network. pfSense reports the interface as 10G, the switch is all 10G, and the file server has 2x10G.

Curious: why would iperf on the firewall reduce this?

FYI, the CPU did not appear stressed in any way.

                                                                    Cheers!

stephenw10 (Netgate Administrator):

                                                                      That seems far too much like a 1G link limit to be coincidence.

                                                                      Check that each part is actually linked at 10G.

                                                                      Steve

qwaven:

So on my pfSense I can see all my internal interface VLANs are listed with:

                                                                        media: Ethernet autoselect (10Gbase-T <full-duplex>)

On my NAS I see the bonded interfaces:

                                                                        Settings for eth4:
                                                                        Supported ports: [ FIBRE ]
                                                                        Supported link modes: 1000baseKX/Full
                                                                        10000baseKR/Full
                                                                        Supported pause frame use: Symmetric Receive-only
                                                                        Supports auto-negotiation: No
                                                                        Advertised link modes: 1000baseKX/Full
                                                                        10000baseKR/Full
                                                                        Advertised pause frame use: Symmetric
                                                                        Advertised auto-negotiation: No
                                                                        Speed: 10000Mb/s
                                                                        Duplex: Full

                                                                        Port: Direct Attach Copper
                                                                        PHYAD: 0
                                                                        Transceiver: internal
                                                                        Auto-negotiation: off
                                                                        Cannot get wake-on-lan settings: Operation not permitted
                                                                        Current message level: 0x00000014 (20)
                                                                        link ifdown
                                                                        Link detected: yes

                                                                        Settings for eth5:
                                                                        Supported ports: [ FIBRE ]
                                                                        Supported link modes: 1000baseKX/Full
                                                                        10000baseKR/Full
                                                                        Supported pause frame use: Symmetric Receive-only
                                                                        Supports auto-negotiation: No
                                                                        Advertised link modes: 1000baseKX/Full
                                                                        10000baseKR/Full
                                                                        Advertised pause frame use: Symmetric
                                                                        Advertised auto-negotiation: No
                                                                        Speed: 10000Mb/s
                                                                        Duplex: Full

                                                                        Port: Direct Attach Copper
                                                                        PHYAD: 0
                                                                        Transceiver: internal
                                                                        Auto-negotiation: off
                                                                        Cannot get wake-on-lan settings: Operation not permitted
                                                                        Current message level: 0x00000014 (20)
                                                                        link ifdown
                                                                        Link detected: yes

                                                                        On the switch:

                                                                        0/3 PC Mbr Enable Auto D 10G Full Up Enable Enable Disable (nas)
                                                                        0/4 PC Mbr Enable Auto D 10G Full Up Enable Enable Disable (nas)
                                                                        ...
                                                                        0/16 Enable Auto 10G Full Up Enable Enable Disable (pfsense)

Grimson:

                                                                          Do you use traffic shaping/limiters?

qwaven:

Unless there is something configured by a default install, I have not set anything myself. Going into the traffic shaper area, it does not appear to have anything set.

For reference, I have dismantled my NAS bonded interfaces and am just using one interface now. Results are about the same, showing about 1G speed.

                                                                            Thanks!

qwaven:

Update: I have now separated the NAS from the rest of the VLANs I had, to try and ensure nothing is going on there. Now it's on its own 10G interface. Results are about the same.

Another interesting fact: if I reverse the iperf direction (NAS to pfSense), I can see the bandwidth spike up to more around the 2G range.

Doing -P 20 (20 transfers at once):
[SUM] 0.00-10.00 sec 2.71 GBytes 2.33 Gbits/sec receiver

Without it, it will drop down to a little over 1G.

                                                                              Any ideas?

stephenw10 (Netgate Administrator):

                                                                                Is that using the -R switch? Can you try running the actual client on the NAS and server on pfSense? That will open firewall states differently.

You could also try disabling pf as a test. If there is still a CPU restriction, that should show far higher throughput.
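
For reference, the pf on/off test from a shell looks like this (do it from the console or SSH, and remember that firewall rules and NAT stop being applied while it is disabled):

pfctl -d   # disable pf
pfctl -e   # re-enable pf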

                                                                                Steve

qwaven:

I had not used -R before, but I tried it with and without -P 20 and the results seem to be about the same.

                                                                                  I have also tried replacing the SFP+ cables with brand new ones. No difference.

Disabling pf (the firewall) did not appear to make any noticeable difference.

Two things I have noticed now:

1. Transferring with pfSense as the client and the file server as the server, the speed is best; using parallel connections (-P 20) it gets a little over 2G.
   However, when I reverse this and have pfSense as the server and the file server as the client, the speeds are drastically worse.

2. There do appear to be a lot of retries on the iperf sending side. I am not sure whether this is a "normal" result or not. It appears to happen regardless of direction, but it is always the sender.

                                                                                  [ ID] Interval Transfer Bitrate Retr
                                                                                  [ 5] 0.00-10.00 sec 304 MBytes 255 Mbits/sec 15 sender
                                                                                  [ 5] 0.00-10.00 sec 302 MBytes 253 Mbits/sec receiver
                                                                                  [ 7] 0.00-10.00 sec 25.5 MBytes 21.4 Mbits/sec 9 sender
                                                                                  [ 7] 0.00-10.00 sec 24.2 MBytes 20.3 Mbits/sec receiver
                                                                                  [ 9] 0.00-10.00 sec 210 MBytes 176 Mbits/sec 15 sender
                                                                                  [ 9] 0.00-10.00 sec 208 MBytes 174 Mbits/sec receiver
                                                                                  [ 11] 0.00-10.00 sec 116 MBytes 97.5 Mbits/sec 9 sender
                                                                                  [ 11] 0.00-10.00 sec 114 MBytes 95.9 Mbits/sec receiver
                                                                                  [ 13] 0.00-10.00 sec 35.9 MBytes 30.1 Mbits/sec 19 sender
                                                                                  [ 13] 0.00-10.00 sec 34.2 MBytes 28.7 Mbits/sec receiver
                                                                                  [ 15] 0.00-10.00 sec 104 MBytes 87.1 Mbits/sec 17 sender
                                                                                  [ 15] 0.00-10.00 sec 102 MBytes 85.5 Mbits/sec receiver
                                                                                  [ 17] 0.00-10.00 sec 127 MBytes 106 Mbits/sec 13 sender
                                                                                  [ 17] 0.00-10.00 sec 124 MBytes 104 Mbits/sec receiver
                                                                                  [ 19] 0.00-10.00 sec 449 MBytes 377 Mbits/sec 11 sender
                                                                                  [ 19] 0.00-10.00 sec 447 MBytes 375 Mbits/sec receiver
                                                                                  [ 21] 0.00-10.00 sec 64.1 MBytes 53.8 Mbits/sec 18 sender
                                                                                  [ 21] 0.00-10.00 sec 62.4 MBytes 52.3 Mbits/sec receiver
                                                                                  [ 23] 0.00-10.00 sec 261 MBytes 219 Mbits/sec 19 sender
                                                                                  [ 23] 0.00-10.00 sec 258 MBytes 216 Mbits/sec receiver
                                                                                  [ 25] 0.00-10.00 sec 182 MBytes 153 Mbits/sec 15 sender
                                                                                  [ 25] 0.00-10.00 sec 180 MBytes 151 Mbits/sec receiver
                                                                                  [ 27] 0.00-10.00 sec 129 MBytes 108 Mbits/sec 13 sender
                                                                                  [ 27] 0.00-10.00 sec 127 MBytes 106 Mbits/sec receiver
                                                                                  [ 29] 0.00-10.00 sec 288 MBytes 242 Mbits/sec 13 sender
                                                                                  [ 29] 0.00-10.00 sec 285 MBytes 239 Mbits/sec receiver
                                                                                  [ 31] 0.00-10.00 sec 48.7 MBytes 40.8 Mbits/sec 11 sender
                                                                                  [ 31] 0.00-10.00 sec 47.3 MBytes 39.6 Mbits/sec receiver
                                                                                  [ 33] 0.00-10.00 sec 332 MBytes 279 Mbits/sec 13 sender
                                                                                  [ 33] 0.00-10.00 sec 330 MBytes 277 Mbits/sec receiver
                                                                                  [ 35] 0.00-10.00 sec 76.5 MBytes 64.2 Mbits/sec 17 sender
                                                                                  [ 35] 0.00-10.00 sec 74.6 MBytes 62.6 Mbits/sec receiver
                                                                                  [ 37] 0.00-10.00 sec 233 MBytes 196 Mbits/sec 16 sender
                                                                                  [ 37] 0.00-10.00 sec 230 MBytes 193 Mbits/sec receiver
                                                                                  [ 39] 0.00-10.00 sec 78.1 MBytes 65.5 Mbits/sec 16 sender
                                                                                  [ 39] 0.00-10.00 sec 76.6 MBytes 64.3 Mbits/sec receiver
                                                                                  [ 41] 0.00-10.00 sec 58.4 MBytes 49.0 Mbits/sec 16 sender
                                                                                  [ 41] 0.00-10.00 sec 57.1 MBytes 47.9 Mbits/sec receiver
                                                                                  [ 43] 0.00-10.00 sec 67.5 MBytes 56.6 Mbits/sec 18 sender
                                                                                  [ 43] 0.00-10.00 sec 65.8 MBytes 55.2 Mbits/sec receiver
                                                                                  [SUM] 0.00-10.00 sec 3.11 GBytes 2.68 Gbits/sec 293 sender
                                                                                  [SUM] 0.00-10.00 sec 3.08 GBytes 2.64 Gbits/sec receiver

                                                                                  iperf Done.

I have started engaging support with the file server manufacturer to see if they have any thoughts. It's looking more and more likely that pfSense is not the issue here. But as always, I'm open to any suggestions...

                                                                                  Cheers!

stephenw10 (Netgate Administrator):

                                                                                    Use iperf3 if you can. That's available for installing from the command line in pfSense.

                                                                                    pfSense is not optimised to be a server (or client in this case). It will almost certainly perform better testing through it rather than to it.
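
A sketch of testing through the firewall rather than to it (host addresses are placeholders; it assumes iperf3 is installed on two hosts sitting on different pfSense interfaces):

# host A, e.g. on the 10G LAN
iperf3 -s

# host B, on another interface, routed through pfSense
iperf3 -c <host-A-ip> -P 8 -t 30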

                                                                                    Steve
