Netgate Discussion Forum

hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz

General pfSense Questions
    SpaceBass @SpaceBass
    last edited by May 4, 2023, 4:06 PM

    After two days of pulling my hair out trying to restore the config from the 6100 to the R230 (thanks, pfBlockerNG, for borking the firewall during the restore :/ ), I have the R230 up and running. And the results are really quite poor.

    My ISP has an iperf3 server three hops away from me. Details below on how I run iperf3 tests.

    To the ISP's iperf3 server
    6100 - ~9.6 Gbits/sec
    R230 - ~692 Mbits/sec

    Inter-VLAN (routed through pfSense)
    6100 - ~2 Gbits/sec
    R230 - 1.61 Mbits/sec ... in fact, after the first few packets, it drops to 0 Mbits/sec

    For baseline, same hosts on the same subnet (VLAN): 9.27 Gbits/sec

    DETAILS

    Hardware

    pfSense box

    • Dell R230 Xeon E3-1270 v5 @ 3.6GHz
    • 16GB
    • 2x Samsung 850 SSD in ZFS redundant pool
    • HP NC523SFP NIC in PCIe slot 2 (which I believe is a full 16 lanes)

    switches & cables & optics

    • UniFi aggregation 10G switches
    • Intel 850nm SFP+ (multimode) optics
    • multimode patch cables (the same ones used to get faster results with the 6100)

    Testing
    iperf3:
    iperf3 -c server.fqdn.foo.bar -P 10
    iperf3 -c server.fqdn.foo.bar -P 10 -R
    iperf3 -c server.fqdn.foo.bar -P 10 -6

    NOTE: while I have tried testing right from pfSense, all posted results are from one LAN host to another, and as indicated above the LAN host has no problem sending/receiving 10Gbps traffic.
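
    For reference, the LAN-to-LAN baseline above can be reproduced with plain iperf3 between two hosts on the same VLAN (the address below is a placeholder):

    # on the receiving LAN host: run an iperf3 server
    iperf3 -s

    # on the sending LAN host: 10 parallel streams, then the reverse direction
    iperf3 -c 10.0.10.20 -P 10
    iperf3 -c 10.0.10.20 -P 10 -R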

    system monitoring:
    top -aSH

    pfSense

    • WAN - static IPv4, dynamic IPv6
    • Hardware Checksum Offloading - tried both on and off
    • Hardware TCP Segmentation Offloading - tried both on and off
    • hn ALTQ support - on
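
    For the offloading toggles above, they can also be checked and flipped from a shell for quick testing (ql0 is an assumption here, based on the interface names that show up later in top; the GUI checkboxes under System > Advanced > Networking are what make the change persistent):

    # show which offload capabilities are currently enabled on the NIC
    ifconfig ql0 | grep options

    # temporarily disable checksum offload, TSO and LRO for testing
    ifconfig ql0 -rxcsum -txcsum -tso4 -tso6 -lro
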
      SpaceBass @SpaceBass
      last edited by May 4, 2023, 4:08 PM

      **A more successful result from LAN to the ISP's iperf3**
      Clear to see the single thread maxing out :/

      0 root        -92    -     0B  1376K CPU7     7   0:54  99.87% [kernel{ql1 rcvq}]
      
      iperf3 -c 198.60.x.x -P 10
      
      [SUM]   0.00-10.00  sec  1.89 GBytes  1.62 Gbits/sec                  sender
      [SUM]   0.00-10.03  sec  1.88 GBytes  1.61 Gbits/sec                  receiver
      
        stephenw10 Netgate Administrator
        last edited by May 4, 2023, 6:43 PM

        @spacebass said in hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz:

         NC523SFP

         What NIC chipset/driver is that? Those numbers seem really low.

         Make sure you have multiple queues attached for each NIC.

        Steve

          keyser Rebel Alliance @SpaceBass
          last edited by May 4, 2023, 8:22 PM

          @spacebass It's your 10GbE NIC that's causing issues.

          I have worked with HPE hardware for 20 years, and the NC523 card is a dud. I can't quite remember the details, but the card took more than a year and a half's worth of Windows driver updates to its QLogic controller before it would reliably deliver about 2 Gbit/s. Until then it fluctuated wildly and stalled to zero any time you tried to push it above about a gigabit.

          I can only guess how bad the driver state in FreeBSD is, but I'm quite sure the card is your culprit. Ditch it and get an Intel-based 10GbE NIC that has good driver support in pfSense.

          Love the no-fuss experience of using the official appliances :-)

            SpaceBass @stephenw10
            last edited by May 4, 2023, 8:53 PM

            @stephenw10 said in hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz:

            What NIC chipset/driver is that? Those numbers seems really low.

            Thanks @stephenw10 for replying!
            It is a Qlogic 3200 which uses the qlxgb drivers.

            I applied the following tunables (but no real change)

            kern.ipc.nmbjumbo9=262144
            net.inet.tcp.recvbuf_max=262144
            net.inet.tcp.recvbuf_inc=16384
            kern.ipc.nmbclusters=1000000
            kern.ipc.maxsockbuf=2097152
            net.inet.tcp.recvspace=131072
            net.inet.tcp.sendbuf_max=262144
            net.inet.tcp.sendspace=65536
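
            (For anyone following along: these usually go under System > Advanced > System Tunables in pfSense, but they can be tried live from a shell first; a quick sketch using two of the values above:)

            # set a value at runtime, then read it back to confirm it took effect
            sysctl net.inet.tcp.recvbuf_max=262144
            sysctl kern.ipc.maxsockbuf=2097152
            sysctl -n net.inet.tcp.recvbuf_max kern.ipc.maxsockbuf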
            

            Make sure you have multiple queues attached for each NIC.

            Can you elaborate? I'm not using traffic shaping, and that's the only context I have for queues.

              SpaceBass @keyser
              last edited by May 4, 2023, 8:55 PM

              @keyser the good news is that I have Intel cards on order...

              That said, I have a ton of these HP cards and have no problem getting 10Gbps on Linux-based boxes... it could be drivers, but the qlxgb driver in FreeBSD is pretty tried and true.

                stephenw10 Netgate Administrator
                last edited by May 4, 2023, 8:56 PM

                Ah, I wasn't sure if the ql1 there was that card. OK, then at the very least I'd start by disabling all the hardware offloading options. But if you know exactly which driver it is, you can start looking for known bugs/workarounds.

                But, yeah, if you can use an Intel NIC, you should.

                  SpaceBass @stephenw10
                  last edited by May 4, 2023, 9:11 PM

                  @stephenw10 thanks - if I want to add the full (supposed) 8 queues, how would I go about it?

                    stephenw10 Netgate Administrator
                    last edited by May 4, 2023, 9:13 PM

                    How many queues are you getting?

                    It's probably a sysctl or loader tunable for that driver. Without having one to test, it's difficult to say.
                    Try sysctl hw.ql or maybe sysctl hw.qlxgb and see what values exist.

                    Also check sysctl dev.ql.0
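
                    A generic way to probe for whatever the driver actually exposes, in case the names differ (a sketch; nothing here is specific to qlxgb):

                    # list ql-related sysctls, plus descriptions for the device node
                    sysctl -a | grep -i '\.ql\.'
                    sysctl -d dev.ql.0

                    # loader tunables that were set at boot show up in the kernel environment
                    kenv | grep -i ql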

                      SpaceBass @stephenw10
                      last edited by May 4, 2023, 11:07 PM

                      @stephenw10 said in hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz:

                      sysctl dev.ql.0

                      that's the key... no multi-queue (mq) options there though, and I don't see any listed in the readme.txt in the driver's source.

                      I'll wait for the Intel NIC and see what I can get out of that

                        stephenw10 Netgate Administrator
                        last edited by May 5, 2023, 11:10 AM

                        How many queues is it using by default? If it's just 1, that would explain the single-threaded performance.

                          SpaceBass @stephenw10
                          last edited by May 5, 2023, 3:42 PM

                          @stephenw10 I'll fire the box up and check ... just curious, what am I looking for in the output of sysctl? I didn't see anything with 'mq' or 'queues' in the output when I first checked.

                          Perhaps related - what does it tell us that I get closer to 4Gbps with the -R flag on iperf3 (e.g. inbound) vs 2Gbps without the flag?

                          I'm not overly determined to make this Qlogic card work, but this is a really good learning opportunity and I'm enjoying the process.

                            stephenw10 Netgate Administrator
                            last edited by May 5, 2023, 5:07 PM

                            You might have more Rx queues than Tx queues, for example. Most drivers show that in the boot logs but I'm not sure qlxgb does. vmstat -i may also show it.
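
                            For example (ql0/ql1 assumed from the earlier top output):

                            # one interrupt vector per queue usually shows up here
                            vmstat -i | grep ql

                            # and check whether the driver logged anything about queues/MSI-X at attach
                            dmesg | grep -i ql | grep -iE 'queue|msi'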

                              SpaceBass @stephenw10
                              last edited by May 5, 2023, 6:12 PM

                              @stephenw10
                              Thanks for the continued help...

                              Here's what I see

                              dev.ql.0.wake: 0
                              dev.ql.0.num_sds_rings: 4
                              dev.ql.0.num_rds_rings: 2
                              dev.ql.0.free_pkt_thres: 1024
                              dev.ql.0.snd_pkt_thres: 16
                              dev.ql.0.rcv_pkt_thres_d: 32
                              dev.ql.0.rcv_pkt_thres: 128
                              dev.ql.0.jumbo_replenish: 2
                              dev.ql.0.std_replenish: 8
                              dev.ql.0.debug: 0
                              dev.ql.0.fw_version: 4.16.50.1401759177
                              dev.ql.0.stats: 0
                              dev.ql.0.%parent: pci2
                              dev.ql.0.%pnpinfo: vendor=0x1077 device=0x8020 subvendor=0x103c subdevice=0x3733 class=0x020000
                              dev.ql.0.%location: slot=0 function=0 dbsf=pci0:2:0:0 handle=\_SB_.PCI0.PEG1.PEGP
                              dev.ql.0.%driver: ql
                              dev.ql.0.%desc: Qlogic ISP 80xx PCI CNA Adapter-Ethernet Function v1.1.36
                              

                              I'm having trouble finding much documentation online for this driver... would snd_pkt_thres be the number of threads it is able to use, or is currently using, for the outbound queues?

                              Here's the output from a TrueNAS box that has the same card and no trouble moving 10Gbps traffic:

                              dev.ql.0.%desc: Qlogic ISP 80xx PCI CNA Adapter-Ethernet Function v1.1.36
                              root@matterhorn[~]# sysctl sysctl dev.ql.1
                              dev.ql.1.num_sds_rings: 4
                              dev.ql.1.num_rds_rings: 2
                              dev.ql.1.free_pkt_thres: 1024
                              dev.ql.1.snd_pkt_thres: 16
                              dev.ql.1.rcv_pkt_thres_d: 32
                              dev.ql.1.rcv_pkt_thres: 128
                              dev.ql.1.jumbo_replenish: 2
                              dev.ql.1.std_replenish: 8
                              dev.ql.1.debug: 0
                              dev.ql.1.fw_version: 4.20.1.1429931003
                              dev.ql.1.stats: 0
                              
                                stephenw10 Netgate Administrator
                                last edited by May 5, 2023, 6:18 PM

                                Mmm, so likely those are the default values. There should be a description of each tunable if you run: sysctl -d dev.ql.0

                                  SpaceBass @stephenw10
                                  last edited by SpaceBass May 5, 2023, 6:29 PM (posted May 5, 2023, 6:23 PM)

                                  @stephenw10 said in hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz:

                                  sysctl -d dev.ql.0

                                  well that's a helpful command! thanks

                                  still not seeing anything about queues though :/

                                  dev.ql.0:
                                  dev.ql.0.wake: Device set to wake the system
                                  dev.ql.0.num_sds_rings: Number of Status Descriptor Rings
                                  dev.ql.0.num_rds_rings: Number of Rcv Descriptor Rings
                                  dev.ql.0.free_pkt_thres: Threshold for # of packets to free at a time
                                  dev.ql.0.snd_pkt_thres: Threshold for # of snd packets
                                  dev.ql.0.rcv_pkt_thres_d: Threshold for # of rcv pkts to trigger indication defered
                                  dev.ql.0.rcv_pkt_thres: Threshold for # of rcv pkts to trigger indication isr
                                  dev.ql.0.jumbo_replenish: Threshold for Replenishing Jumbo Frames
                                  dev.ql.0.std_replenish: Threshold for Replenishing Standard Frames
                                  dev.ql.0.debug: Debug Level
                                  dev.ql.0.fw_version: firmware version
                                  dev.ql.0.stats: Statistics
                                  dev.ql.0.%parent: parent device
                                  dev.ql.0.%pnpinfo: device identification
                                  dev.ql.0.%location: device location relative to parent
                                  dev.ql.0.%driver: device driver name
                                  dev.ql.0.%desc: device description
                                  

                                  Also interesting: when I do an iperf3 with -R, here's the top output on the pfSense box...

                                  CPU 0:  0.0% user,  0.0% nice,  1.9% system, 86.5% interrupt, 11.6% idle
                                  CPU 1:  0.4% user,  0.0% nice,  8.1% system,  2.7% interrupt, 88.8% idle
                                  CPU 2:  0.0% user,  0.0% nice,  5.8% system, 54.8% interrupt, 39.4% idle
                                  CPU 3:  0.4% user,  0.0% nice,  8.5% system,  6.6% interrupt, 84.6% idle
                                  CPU 4:  0.8% user,  0.0% nice, 25.9% system,  6.2% interrupt, 67.2% idle
                                  CPU 5:  0.4% user,  0.0% nice, 21.6% system,  7.3% interrupt, 70.7% idle
                                  CPU 6:  0.4% user,  0.0% nice,  0.8% system, 51.4% interrupt, 47.5% idle
                                  CPU 7:  0.4% user,  0.0% nice,  5.4% system,  8.9% interrupt, 85.3% idle
                                  Mem: 305M Active, 201M Inact, 872M Wired, 14G Free
                                  ARC: 404M Total, 63M MFU, 336M MRU, 32K Anon, 1218K Header, 3853K Other
                                       122M Compressed, 283M Uncompressed, 2.31:1 Ratio
                                  Swap: 1024M Total, 1024M Free
                                  
                                    PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
                                     12 root        -92    -     0B   528K CPU0     0   0:32  90.84% [intr{irq265: ql0}]
                                     11 root        155 ki31     0B   128K CPU1     1  45:28  87.99% [idle{idle: cpu1}]
                                     11 root        155 ki31     0B   128K CPU7     7  45:32  86.71% [idle{idle: cpu7}]
                                     11 root        155 ki31     0B   128K CPU3     3  45:28  83.73% [idle{idle: cpu3}]
                                     11 root        155 ki31     0B   128K RUN      5  45:17  71.47% [idle{idle: cpu5}]
                                     11 root        155 ki31     0B   128K CPU4     4  45:15  70.76% [idle{idle: cpu4}]
                                     12 root        -92    -     0B   528K CPU2     2   0:24  60.84% [intr{irq266: ql0}]
                                      0 root        -92    -     0B  1376K -        6   0:31  54.00% [kernel{ql1 txq}]
                                     11 root        155 ki31     0B   128K RUN      6  44:26  48.31% [idle{idle: cpu6}]
                                     12 root        -92    -     0B   528K WAIT     6   0:49  43.46% [intr{irq268: ql1}]
                                     11 root        155 ki31     0B   128K RUN      2  44:57  37.84% [idle{idle: cpu2}]
                                     12 root        -92    -     0B   528K WAIT     6   0:49  23.37% [intr{irq264: ql0}]
                                     11 root        155 ki31     0B   128K RUN      0  45:02  12.88% [idle{idle: cpu0}]
                                      0 root        -92    -     0B  1376K -        7   0:07  12.82% [kernel{ql0 rcvq}]
                                      0 root        -92    -     0B  1376K -        1   0:12   5.35% [kernel{ql1 rcvq}]
                                  99487 avahi        20    0    13M  4500K select   5   0:06   2.82% avahi-daemon: running [washington.local]
                                     12 root        -92    -     0B   528K WAIT     4   0:20   2.34% [intr{irq267: ql0}]
                                      0 root        -92    -     0B  1376K -        1   0:11   1.83% [kernel{ql0 txq}]
                                  
                                    stephenw10 Netgate Administrator
                                    last edited by May 5, 2023, 6:36 PM

                                    Well, at least 4 IRQs for ql0 there. Does vmstat -i show those?

                                    Nothing about queues there, I agree. That's the sort of setting that would usually be a loader value though. Those are usually shown under hw.ql, but only values that are actually set are shown.
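
                                    If such a loader value did exist, it would go in /boot/loader.conf.local and need a reboot; a sketch of the format only - the tunable name below is made up purely for illustration:

                                    # /boot/loader.conf.local
                                    # NOTE: "hw.ql.num_queues" is a hypothetical name used only to show the
                                    # syntax -- check the driver source/readme for what it actually accepts
                                    hw.ql.num_queues="8"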

                                      SpaceBass
                                      last edited by May 6, 2023, 10:15 PM

                                      @stephenw10

                                      Intel X520-da2 update

                                      tl;dr better performance for sure, but still not 10Gbps. 8 CPU cores, each NIC using 4 queues.

                                      I'm increasingly of the opinion that even with a beefy CPU pfSense just doesn't like doing 10Gbps 😂

                                      iperf3
                                      iperf3 -c ISP's server -P 10

                                      [SUM]   0.00-10.00  sec  5.64 GBytes  4.84 Gbits/sec  1455             sender
                                      [SUM]   0.00-10.03  sec  5.63 GBytes  4.82 Gbits/sec                  receiver
                                      

                                      iperf3 -c ISP's server -P 10 -R

                                      [SUM]   0.00-10.03  sec  5.18 GBytes  4.43 Gbits/sec  4033             sender
                                      [SUM]   0.00-10.00  sec  5.14 GBytes  4.42 Gbits/sec                  receiver
                                      

                                      iperf3 -c local server on other vLAN -P 18

                                      [SUM]   0.00-10.00  sec  5.40 GBytes  4.64 Gbits/sec  11944             sender
                                      [SUM]   0.00-10.01  sec  5.39 GBytes  4.62 Gbits/sec                  receiver
                                      

                                      top

                                      last pid: 52809;  load averages:  0.71,  0.41,  0.36                             up 0+00:28:16  16:09:59
                                      742 threads:   10 running, 699 sleeping, 33 waiting
                                      CPU 0:  0.0% user,  0.0% nice, 31.1% system,  0.0% interrupt, 68.9% idle
                                      CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                                      CPU 2:  0.4% user,  0.0% nice,  3.9% system,  0.0% interrupt, 95.7% idle
                                      CPU 3:  0.4% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.2% idle
                                      CPU 4:  0.0% user,  0.0% nice, 75.2% system,  0.0% interrupt, 24.8% idle
                                      CPU 5:  0.4% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.6% idle
                                      CPU 6:  0.4% user,  0.0% nice, 75.6% system,  0.0% interrupt, 24.0% idle
                                      CPU 7:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
                                      Mem: 297M Active, 178M Inact, 920M Wired, 14G Free
                                      ARC: 384M Total, 53M MFU, 326M MRU, 32K Anon, 1086K Header, 3216K Other
                                           118M Compressed, 268M Uncompressed, 2.27:1 Ratio
                                      Swap: 1024M Total, 1024M Free
                                      
                                        PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
                                         11 root        155 ki31     0B   128K CPU7     7  27:56  99.89% [idle{idle: cpu7}]
                                         11 root        155 ki31     0B   128K CPU5     5  27:55  99.73% [idle{idle: cpu5}]
                                         11 root        155 ki31     0B   128K CPU3     3  27:56  99.16% [idle{idle: cpu3}]
                                         11 root        155 ki31     0B   128K CPU1     1  27:55  99.14% [idle{idle: cpu1}]
                                         11 root        155 ki31     0B   128K RUN      2  26:31  95.65% [idle{idle: cpu2}]
                                          0 root        -76    -     0B  1376K -        6   1:13  78.85% [kernel{if_io_tqg_6}]
                                          0 root        -76    -     0B  1376K CPU4     4   1:16  72.63% [kernel{if_io_tqg_4}]
                                         11 root        155 ki31     0B   128K RUN      0  26:34  69.36% [idle{idle: cpu0}]
                                          0 root        -76    -     0B  1376K -        0   1:27  30.30% [kernel{if_io_tqg_0}]
                                         11 root        155 ki31     0B   128K RUN      4  26:34  27.30% [idle{idle: cpu4}]
                                         11 root        155 ki31     0B   128K CPU6     6  26:57  21.08% [idle{idle: cpu6}]
                                          0 root        -76    -     0B  1376K -        2   1:40   3.32% [kernel{if_io_tqg_2}]
                                      67535 unbound      20    0   107M    54M kqread   4   0:02   0.40% /usr/local/sbin/unbound -c /var/unbox
                                      

                                      dmesg

                                      root: dmesg | grep queues
                                      ix0: Using 4 RX queues 4 TX queues
                                      ix0: allocated for 4 queues
                                      ix0: allocated for 4 rx queues
                                      ix0: netmap queues/slots: TX 4/2048, RX 4/2048
                                      ix1: Using 4 RX queues 4 TX queues
                                      ix1: allocated for 4 queues
                                      ix1: allocated for 4 rx queues
                                      ix1: netmap queues/slots: TX 4/2048, RX 4/2048
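
                                      One thing that may be worth trying here, assuming this build's ix driver honors the iflib loader overrides, is raising the per-port queue count from 4 to 8 to match the core count via /boot/loader.conf.local (reboot required):

                                      # /boot/loader.conf.local -- per-port iflib queue overrides (assumption:
                                      # supported by the ix driver on this version; verify after reboot)
                                      dev.ix.0.iflib.override_nrxqs="8"
                                      dev.ix.0.iflib.override_ntxqs="8"
                                      dev.ix.1.iflib.override_nrxqs="8"
                                      dev.ix.1.iflib.override_ntxqs="8"

                                      If the override took, dmesg | grep queues after the reboot should report 8 RX/TX queues per port instead of 4.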
                                      
                                        ogghi @SpaceBass
                                        last edited by May 7, 2023, 7:27 AM

                                        @spacebass
                                        Pretty much exactly the same on my end. I will now try to get in touch with our ISP again to make sure it's not their core router that's the culprit here!

                                        https://www.reddit.com/r/PFSENSE/comments/137iv07/comment/jj6oqw4/?utm_source=share&utm_medium=web2x&context=3

                                          Cool_Corona
                                          last edited by Cool_Corona May 7, 2023, 8:22 AM

                                          Are you guys using SATA on your hardware??

                                          Remember there is a 6 Gbit/s limit to that when writing to the disk subsystem.

                                          And I bet that is what you see.

                                          In short... your NIC is pushing the limits of the disk subsystem.
