Netgate Discussion Forum

    hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz

    General pfSense Questions
    37 Posts 8 Posters 4.8k Views
    • stephenw10 Netgate Administrator

      Ah, I wasn't sure if that's what ql1 was there. OK, then at the very least I'd start by disabling all the hardware offloading options. But if you know exactly which driver it is, you can start looking for known bugs/workarounds.
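
      For a quick test, the persistent switches in pfSense are the Hardware Checksum / TCP Segmentation / Large Receive Offloading checkboxes under System > Advanced > Networking. From a shell, something like this should work per interface; a sketch assuming the card attaches as ql0, and note that ifconfig changes don't survive a reboot:

      # turn off checksum, TSO and LRO offloads on the suspect NIC
      ifconfig ql0 -txcsum -rxcsum -tso -lro
      # confirm which options remain enabled
      ifconfig ql0 | grep options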

      But, yeah, if you can use an Intel NIC, you should.

      • SpaceBass @stephenw10

        @stephenw10 thanks - if I want to enable the full (supposed) 8 queues, how would I go about it?

        • stephenw10 Netgate Administrator

          How many queues are you getting?

          It's probably a sysctl or loader tunable for that driver. Without having one to test it's difficult to say.
          Try sysctl hw.ql or maybe sysctl hw.qlxgb and see what values exist.

          Also check sysctl dev.ql.0

          • SpaceBass @stephenw10

            @stephenw10 said in hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz:

            sysctl dev.ql.0

            that's the key ... no mq options there though, and I don't see any listed in the readme.txt in the driver's source.

            I'll wait for the Intel NIC and see what I can get out of that

            • stephenw10 Netgate Administrator

              How many queues is it using by default? If it's just 1 that would explain the single threaded performance.

              • SpaceBass @stephenw10

                @stephenw10 I'll fire the box up and check ... just curious, what am I looking for in the output of sysctl? I didn't see anything with 'mq' or 'queues' in the output when I first checked.

                perhaps related - what does it tell us that I get closer to 4Gbps with the -R flag on iperf3 (i.e. inbound) vs 2Gbps without the flag?
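
                For reference: -R reverses the test so the server sends and the client receives, which separates the receive path from the transmit path on the box in between. A sketch with a placeholder server address:

                # client sends to server: exercises the transmit direction
                iperf3 -c 203.0.113.10
                # server sends to client (-R): exercises the receive direction
                iperf3 -c 203.0.113.10 -R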

                I'm not overly determined to make this Qlogic card work, but this is a really good learning opportunity and I'm enjoying the process.

                • stephenw10 Netgate Administrator

                  You might have more Rx queues than Tx queues, for example. Most drivers show that in the boot logs, but I'm not sure qlxgb does. vmstat -i may also show it.
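
                  For example, assuming the ports attach as ql0/ql1, each MSI-X queue vector usually shows up as its own interrupt line:

                  # count the per-queue interrupt vectors
                  vmstat -i | grep ql
                  # and check whether the driver logged queue counts at attach time
                  grep -i queue /var/run/dmesg.boot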

                  • SpaceBass @stephenw10

                    @stephenw10
                    Thanks for the continued help...

                    Here's what I see

                    dev.ql.0.wake: 0
                    dev.ql.0.num_sds_rings: 4
                    dev.ql.0.num_rds_rings: 2
                    dev.ql.0.free_pkt_thres: 1024
                    dev.ql.0.snd_pkt_thres: 16
                    dev.ql.0.rcv_pkt_thres_d: 32
                    dev.ql.0.rcv_pkt_thres: 128
                    dev.ql.0.jumbo_replenish: 2
                    dev.ql.0.std_replenish: 8
                    dev.ql.0.debug: 0
                    dev.ql.0.fw_version: 4.16.50.1401759177
                    dev.ql.0.stats: 0
                    dev.ql.0.%parent: pci2
                    dev.ql.0.%pnpinfo: vendor=0x1077 device=0x8020 subvendor=0x103c subdevice=0x3733 class=0x020000
                    dev.ql.0.%location: slot=0 function=0 dbsf=pci0:2:0:0 handle=\_SB_.PCI0.PEG1.PEGP
                    dev.ql.0.%driver: ql
                    dev.ql.0.%desc: Qlogic ISP 80xx PCI CNA Adapter-Ethernet Function v1.1.36
                    

                    I'm having trouble finding much documentation online for this driver... would snd_pkt_thres be the number of threads it is able to use, or is currently using, for the outbound queues?

                    Here's the output from a TrueNAS box with the same card, which has no trouble moving 10Gbps traffic:

                    root@matterhorn[~]# sysctl dev.ql.1
                    dev.ql.1.num_sds_rings: 4
                    dev.ql.1.num_rds_rings: 2
                    dev.ql.1.free_pkt_thres: 1024
                    dev.ql.1.snd_pkt_thres: 16
                    dev.ql.1.rcv_pkt_thres_d: 32
                    dev.ql.1.rcv_pkt_thres: 128
                    dev.ql.1.jumbo_replenish: 2
                    dev.ql.1.std_replenish: 8
                    dev.ql.1.debug: 0
                    dev.ql.1.fw_version: 4.20.1.1429931003
                    dev.ql.1.stats: 0
                    
                    • stephenw10 Netgate Administrator

                      Mmm, so likely those are the default values. There should be a description of each tunable if you run: sysctl -d dev.ql.0

                      • SpaceBass @stephenw10 (last edited by SpaceBass)

                        @stephenw10 said in hoping for 10Gbps, getting sub 1Gbps speed Xeon E3-1270 v5 3.6GHz:

                        sysctl -d dev.ql.0

                        well that's a helpful command! thanks

                        still not seeing anything about queues though :/

                        dev.ql.0:
                        dev.ql.0.wake: Device set to wake the system
                        dev.ql.0.num_sds_rings: Number of Status Descriptor Rings
                        dev.ql.0.num_rds_rings: Number of Rcv Descriptor Rings
                        dev.ql.0.free_pkt_thres: Threshold for # of packets to free at a time
                        dev.ql.0.snd_pkt_thres: Threshold for # of snd packets
                        dev.ql.0.rcv_pkt_thres_d: Threshold for # of rcv pkts to trigger indication defered
                        dev.ql.0.rcv_pkt_thres: Threshold for # of rcv pkts to trigger indication isr
                        dev.ql.0.jumbo_replenish: Threshold for Replenishing Jumbo Frames
                        dev.ql.0.std_replenish: Threshold for Replenishing Standard Frames
                        dev.ql.0.debug: Debug Level
                        dev.ql.0.fw_version: firmware version
                        dev.ql.0.stats: Statistics
                        dev.ql.0.%parent: parent device
                        dev.ql.0.%pnpinfo: device identification
                        dev.ql.0.%location: device location relative to parent
                        dev.ql.0.%driver: device driver name
                        dev.ql.0.%desc: device description
                        

                        Also interesting: when I do an iperf3 with -R, here's the top output on the pfSense box...

                        CPU 0:  0.0% user,  0.0% nice,  1.9% system, 86.5% interrupt, 11.6% idle
                        CPU 1:  0.4% user,  0.0% nice,  8.1% system,  2.7% interrupt, 88.8% idle
                        CPU 2:  0.0% user,  0.0% nice,  5.8% system, 54.8% interrupt, 39.4% idle
                        CPU 3:  0.4% user,  0.0% nice,  8.5% system,  6.6% interrupt, 84.6% idle
                        CPU 4:  0.8% user,  0.0% nice, 25.9% system,  6.2% interrupt, 67.2% idle
                        CPU 5:  0.4% user,  0.0% nice, 21.6% system,  7.3% interrupt, 70.7% idle
                        CPU 6:  0.4% user,  0.0% nice,  0.8% system, 51.4% interrupt, 47.5% idle
                        CPU 7:  0.4% user,  0.0% nice,  5.4% system,  8.9% interrupt, 85.3% idle
                        Mem: 305M Active, 201M Inact, 872M Wired, 14G Free
                        ARC: 404M Total, 63M MFU, 336M MRU, 32K Anon, 1218K Header, 3853K Other
                             122M Compressed, 283M Uncompressed, 2.31:1 Ratio
                        Swap: 1024M Total, 1024M Free
                        
                          PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
                           12 root        -92    -     0B   528K CPU0     0   0:32  90.84% [intr{irq265: ql0}]
                           11 root        155 ki31     0B   128K CPU1     1  45:28  87.99% [idle{idle: cpu1}]
                           11 root        155 ki31     0B   128K CPU7     7  45:32  86.71% [idle{idle: cpu7}]
                           11 root        155 ki31     0B   128K CPU3     3  45:28  83.73% [idle{idle: cpu3}]
                           11 root        155 ki31     0B   128K RUN      5  45:17  71.47% [idle{idle: cpu5}]
                           11 root        155 ki31     0B   128K CPU4     4  45:15  70.76% [idle{idle: cpu4}]
                           12 root        -92    -     0B   528K CPU2     2   0:24  60.84% [intr{irq266: ql0}]
                            0 root        -92    -     0B  1376K -        6   0:31  54.00% [kernel{ql1 txq}]
                           11 root        155 ki31     0B   128K RUN      6  44:26  48.31% [idle{idle: cpu6}]
                           12 root        -92    -     0B   528K WAIT     6   0:49  43.46% [intr{irq268: ql1}]
                           11 root        155 ki31     0B   128K RUN      2  44:57  37.84% [idle{idle: cpu2}]
                           12 root        -92    -     0B   528K WAIT     6   0:49  23.37% [intr{irq264: ql0}]
                           11 root        155 ki31     0B   128K RUN      0  45:02  12.88% [idle{idle: cpu0}]
                            0 root        -92    -     0B  1376K -        7   0:07  12.82% [kernel{ql0 rcvq}]
                            0 root        -92    -     0B  1376K -        1   0:12   5.35% [kernel{ql1 rcvq}]
                        99487 avahi        20    0    13M  4500K select   5   0:06   2.82% avahi-daemon: running [washington.local]
                           12 root        -92    -     0B   528K WAIT     4   0:20   2.34% [intr{irq267: ql0}]
                            0 root        -92    -     0B  1376K -        1   0:11   1.83% [kernel{ql0 txq}]
                        
                        • stephenw10 Netgate Administrator

                          Well, at least 4 IRQs for ql0 there. Does vmstat -i show those?

                          Nothing about queues there, I agree. That's the sort of setting that would usually be a loader value, though. Those usually appear under hw.ql, but only values that have actually been set are shown.
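
                          For reference, a loader value goes in /boot/loader.conf.local and takes effect at the next boot. A sketch of what that would look like; the tunable name here is purely hypothetical, since nothing like it is documented for this driver:

                          # /boot/loader.conf.local
                          # HYPOTHETICAL tunable name -- verify against the ql(4) source first
                          hw.ql.num_queues="8"

                          # list only the ql loader values that are actually set
                          kenv | grep -i ql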

                          • SpaceBass

                            @stephenw10

                            Intel X520-DA2 update

                            tl;dr: better performance for sure, but still not 10Gbps. 8 CPU cores, each NIC using 4 queues.

                            I'm increasingly of the opinion that even with a beefy CPU, pfSense just doesn't like doing 10Gbps 😂

                            iperf3
                            iperf3 -c ISP's server -P 10

                            [SUM]   0.00-10.00  sec  5.64 GBytes  4.84 Gbits/sec  1455             sender
                            [SUM]   0.00-10.03  sec  5.63 GBytes  4.82 Gbits/sec                  receiver
                            

                            iperf3 -c ISP's server -P 10 -R

                            [SUM]   0.00-10.03  sec  5.18 GBytes  4.43 Gbits/sec  4033             sender
                            [SUM]   0.00-10.00  sec  5.14 GBytes  4.42 Gbits/sec                  receiver
                            

                            iperf3 -c local server on other VLAN -P 18

                            [SUM]   0.00-10.00  sec  5.40 GBytes  4.64 Gbits/sec  11944             sender
                            [SUM]   0.00-10.01  sec  5.39 GBytes  4.62 Gbits/sec                  receiver
                            

                            top

                            last pid: 52809;  load averages:  0.71,  0.41,  0.36                             up 0+00:28:16  16:09:59
                            742 threads:   10 running, 699 sleeping, 33 waiting
                            CPU 0:  0.0% user,  0.0% nice, 31.1% system,  0.0% interrupt, 68.9% idle
                            CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
                            CPU 2:  0.4% user,  0.0% nice,  3.9% system,  0.0% interrupt, 95.7% idle
                            CPU 3:  0.4% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.2% idle
                            CPU 4:  0.0% user,  0.0% nice, 75.2% system,  0.0% interrupt, 24.8% idle
                            CPU 5:  0.4% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.6% idle
                            CPU 6:  0.4% user,  0.0% nice, 75.6% system,  0.0% interrupt, 24.0% idle
                            CPU 7:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
                            Mem: 297M Active, 178M Inact, 920M Wired, 14G Free
                            ARC: 384M Total, 53M MFU, 326M MRU, 32K Anon, 1086K Header, 3216K Other
                                 118M Compressed, 268M Uncompressed, 2.27:1 Ratio
                            Swap: 1024M Total, 1024M Free
                            
                              PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
                               11 root        155 ki31     0B   128K CPU7     7  27:56  99.89% [idle{idle: cpu7}]
                               11 root        155 ki31     0B   128K CPU5     5  27:55  99.73% [idle{idle: cpu5}]
                               11 root        155 ki31     0B   128K CPU3     3  27:56  99.16% [idle{idle: cpu3}]
                               11 root        155 ki31     0B   128K CPU1     1  27:55  99.14% [idle{idle: cpu1}]
                               11 root        155 ki31     0B   128K RUN      2  26:31  95.65% [idle{idle: cpu2}]
                                0 root        -76    -     0B  1376K -        6   1:13  78.85% [kernel{if_io_tqg_6}]
                                0 root        -76    -     0B  1376K CPU4     4   1:16  72.63% [kernel{if_io_tqg_4}]
                               11 root        155 ki31     0B   128K RUN      0  26:34  69.36% [idle{idle: cpu0}]
                                0 root        -76    -     0B  1376K -        0   1:27  30.30% [kernel{if_io_tqg_0}]
                               11 root        155 ki31     0B   128K RUN      4  26:34  27.30% [idle{idle: cpu4}]
                               11 root        155 ki31     0B   128K CPU6     6  26:57  21.08% [idle{idle: cpu6}]
                                0 root        -76    -     0B  1376K -        2   1:40   3.32% [kernel{if_io_tqg_2}]
                            67535 unbound      20    0   107M    54M kqread   4   0:02   0.40% /usr/local/sbin/unbound -c /var/unbox
                            

                            dmesg

                            root: dmesg | grep queues
                            ix0: Using 4 RX queues 4 TX queues
                            ix0: allocated for 4 queues
                            ix0: allocated for 4 rx queues
                            ix0: netmap queues/slots: TX 4/2048, RX 4/2048
                            ix1: Using 4 RX queues 4 TX queues
                            ix1: allocated for 4 queues
                            ix1: allocated for 4 rx queues
                            ix1: netmap queues/slots: TX 4/2048, RX 4/2048
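
                            If one wanted to try matching the queue count to the 8 cores, iflib-based drivers such as ix(4) accept per-device override tunables in /boot/loader.conf.local (see iflib(4)). A sketch, untested here; the hardware may not grant the extra MSI-X vectors:

                            # /boot/loader.conf.local -- request 8 Tx/Rx queues per ix port
                            dev.ix.0.iflib.override_ntxqs="8"
                            dev.ix.0.iflib.override_nrxqs="8"
                            dev.ix.1.iflib.override_ntxqs="8"
                            dev.ix.1.iflib.override_nrxqs="8"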
                            
                            • ogghi @SpaceBass

                              @spacebass
                              Pretty much exactly the same on my end. I'll now try to get in touch with our ISP again to make sure it's not their core router that's the culprit here!

                              https://www.reddit.com/r/PFSENSE/comments/137iv07/comment/jj6oqw4/?utm_source=share&utm_medium=web2x&context=3

                              • Cool_Corona (last edited by Cool_Corona)

                                Are you guys using SATA on your hardware?

                                Remember there is a 6 Gbit/s limit on that when writing to the disk subsystem.

                                And I bet that is what you are seeing.

                                In short... your NIC is pushing the limits of the disk subsystem.

                                • ogghi

                                  It is, but pfSense should not be writing data to disk while transferring?!
                                  Or rather, not the data it is routing through!

                                  • Cool_Corona @ogghi

                                    @ogghi But you're downloading a file to test iperf. Guess where that is written?

                                    • SpaceBass @Cool_Corona

                                      @cool_corona if that were the case, hosts on the same network would also be bottlenecked.

                                      • SteveITS Galactic Empire @Cool_Corona

                                        @cool_corona Don't run speed tests on pfSense if at all possible; use a host behind it. Then it (also) isn't using CPU cycles on the test.
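
                                        In practice that means putting both iperf3 endpoints on hosts on opposite sides of the firewall, so pfSense only routes and filters. A minimal sketch with placeholder addresses:

                                        # on a host behind pfSense (e.g. 192.168.1.50), run the server
                                        iperf3 -s
                                        # on a host on another interface/VLAN, drive traffic through the firewall
                                        iperf3 -c 192.168.1.50 -P 8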

                                        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                                        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
                                        Upvote 👍 helpful posts!

                                        • Cool_Corona @SpaceBass

                                          @spacebass Why?? Doesn't it pass through pfSense?

                                          • rcoleman-netgate Netgate @Cool_Corona

                                            @cool_corona It's a single-NIC route. If you want to test throughput, you should test the THROUGH part of it.

                                            Ryan
                                            Repeat, after me: MESH IS THE DEVIL! MESH IS THE DEVIL!
                                            Requesting firmware for your Netgate device? https://go.netgate.com
                                            Switching: Mikrotik, Netgear, Extreme
                                            Wireless: Aruba, Ubiquiti
