Performance – Throughput / Speed Issues
-
Hi All,
FYI - This is my first time using pfSense, so hopefully I'm not wasting everyone’s time because I missed something stupid. Apologies in advance if this turns out to be a gross oversight on my part!!!
This is a new pfSense deployment. This build is intended to replace a production FortiGate 100D that cannot keep up with the recent service upgrade. The 100D can only manage about 600Mbps of straight, unencrypted real-world throughput - nowhere near the 2.5 Gbps in its sales specs.
The issue is that my average speed loss when testing through the new pfSense server is about 15.5%, and I've seen individual tests with as much as 27% loss.
Hopefully the following is enough information about the hardware and topology involved. If I am missing anything, please let me know and I will try and get the information as soon as possible.
Speeds with ComCast Gateway connected directly to HP ProCurve Switch
Latency (ms)    Download (Mbps)    Upload (Mbps)
9.561 918.2455 36.91273
10.924 950.1666 48.54074
9.573 949.4188 24.99929

Speeds with ComCast Gateway connected to pfSense Server
Latency (ms)    Download (Mbps)    Upload (Mbps)
9.442 787.2579 38.67714
11.421 797.3936 50.65541
9.565 795.5139 28.34676
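(For what it's worth, the 15.5% average figure lines up with the download columns above:
average download through the gateway + switch: (918.2455 + 950.1666 + 949.4188) / 3 ≈ 939.3 Mbps
average download through pfSense: (787.2579 + 797.3936 + 795.5139) / 3 ≈ 793.4 Mbps
loss: (939.3 - 793.4) / 939.3 ≈ 15.5%)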
Things I've tried…

I've tried the default kern.ipc.nmbclusters setting, as well as setting it to 1000000.
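(If anyone wants to confirm that tunable is actually in effect, I believe it can be checked from the shell with something like the following; the netstat -m output further down in this post is where the usage numbers come from:)

sysctl kern.ipc.nmbclusters    # should report 1000000 if the tunable applied
netstat -m | head -2           # current/cache/total(/max) mbuf and cluster usage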
I've tried the following combinations under System / Advanced / Networking:
Hardware Checksum Offloading = UnChecked
Hardware TCP Segmentation Offloading = Checked
Hardware Large Receive Offloading = Checked

Hardware Checksum Offloading = Checked
Hardware TCP Segmentation Offloading = Checked
Hardware Large Receive Offloading = Checked

Hardware Checksum Offloading = UnChecked
Hardware TCP Segmentation Offloading = UnChecked
Hardware Large Receive Offloading = UnChecked
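(For quick testing, the same offloads can also be toggled temporarily from a shell - just a sketch, and these manual changes don't persist across a reboot or interface reconfiguration, so the GUI checkboxes remain the authoritative setting. igb0/igb1 are simply the two interfaces carrying the traffic in the top output further down:)

ifconfig igb0 -rxcsum -txcsum -tso -lro    # disable checksum, TSO and LRO offload on one port
ifconfig igb1 -rxcsum -txcsum -tso -lro    # and on the other
ifconfig igb0 | grep options               # verify which offload flags are currently enabled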
Connection
ComCast Business Internet 1 Gig

Network Equipment
ComCast DOCSIS 3.1 Gateway
Vendor: Technicolor
Model: CGA4131COM

HP ProCurve V1910-48G - JE009A Switch
104 Gbps switching capacity, max.
77.4 Mpps forwarding rate, max.

pfSense Server Hardware
Supermicro SuperServer 5018D-FN8T
Intel Xeon processor D-1518 2.2GHz Quad Core
16GB 2400MHZ DDR4 ECC Reg CL17 DIMM 1RX4
Dual 10G SFP+ ports from D-1500 SoC
Quad 1GbE with Intel I350-AM4
Dual 1GbE with Intel I210

Added
Intel I350-T2 Server NIC (WAN & LAN Connected here)
1 x Samsung SSD 960 EVO NVMe M.2 250GB
1 x Samsung SSD 860 EVO 250GB (pfSense installed here for now on ZFS)

Network Topology
ComCast Gateway <--> pfSense <--> HP ProCurve Switch <--> Workstations

pfSense Configuration
(It's basically a default install right out of the box right now for testing)

System / Advanced / Networking
Hardware Checksum Offloading = UnChecked
Hardware TCP Segmentation Offloading = Checked
Hardware Large Receive Offloading = Checked

System / Routing / Gateways
GW_WAN (default) only

System / Advanced / System Tunables
kern.ipc.nmbclusters = 1000000

Interfaces / WAN
Block private networks and loopback addresses = Checked
Block bogon networks = Checked

Interfaces / LAN
Block private networks and loopback addresses = UnChecked
Block bogon networks = UnChecked

Firewall / NAT / Outbound
Auto Created mappings only

Firewall / Rules / WAN
1 rule added after default installation
Action: Pass / Interface: wan / Protocol: icmp / ICMP Subtypes: any / Source: any / Destination: any

Firewall / Rules / LAN
Default Anti-Lockout Rule
Default allow LAN to any rule
Default allow LAN IPv6 to any rule

pfSense Server Hardware Performance Details
IPerf Results - Verify Cables, Switch, and ComCast Gateway connectivity
(pfSense running IPerf is more than capable of hitting 940 Mbits/sec on both the WAN & LAN Ports on the network)
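(For reference, the iperf runs behind these numbers were nothing fancy - roughly the following, with pfSense on one end and a workstation on the other; the address is just a placeholder for whichever interface is being tested:)

iperf -s                      # listener on one end
iperf -c 192.168.1.1 -t 10    # 10-second TCP test from the other end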
------------------------------------------------------------
External NIC (WAN Port)
TCP window size: 64.2 KByte (default)
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.10 GBytes 947 Mbits/sec
------------------------------------------------------------
Internal NIC (LAN Port)
TCP window size: 64.2 KByte (default)
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.10 GBytes 948 Mbits/sec

TOP Command Results
Three sample "top" outputs captured on pfSense while running speed tests.
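(These were captured with something along the lines of the command below; exact flags may differ, but -S includes kernel/system processes and -H breaks them out per thread, which is why the idle and igb interrupt threads show up individually:)

top -aSH    # -a full command names, -S system processes, -H show threads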
------------------------------------------------------------
last pid: 55894; load averages: 0.50, 0.34, 0.21  up 0+23:49:18  13:22:17
601 processes: 10 running, 471 sleeping, 8 zombie, 112 waiting
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 31M Active, 129M Inact, 549M Wired, 132K Buf, 15G Free
ARC: 155M Total, 423K MFU, 151M MRU, 156K Anon, 534K Header, 2556K Other
41M Compressed, 113M Uncompressed, 2.74:1 Ratio
Swap: 2048M Total, 2048M Free

PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 128K CPU0 0 23.8H 99.85% [idle{idle: cpu0}]
11 root 155 ki31 0K 128K CPU1 1 23.8H 99.56% [idle{idle: cpu1}]
11 root 155 ki31 0K 128K CPU5 5 23.8H 98.39% [idle{idle: cpu5}]
11 root 155 ki31 0K 128K CPU2 2 23.8H 98.10% [idle{idle: cpu2}]
11 root 155 ki31 0K 128K CPU6 6 23.8H 97.85% [idle{idle: cpu6}]
11 root 155 ki31 0K 128K CPU7 7 23.8H 95.07% [idle{idle: cpu7}]
11 root 155 ki31 0K 128K RUN 4 23.8H 94.09% [idle{idle: cpu4}]
11 root 155 ki31 0K 128K CPU3 3 23.8H 82.67% [idle{idle: cpu3}]
12 root -92 - 0K 1808K CPU3 3 0:09 17.38% [intr{irq290: igb0:que 3}]
12 root -92 - 0K 1808K WAIT 4 0:04 5.76% [intr{irq300: igb1:que 4}]
12 root -92 - 0K 1808K WAIT 7 0:08 4.69% [intr{irq294: igb0:que 7}]
12 root -92 - 0K 1808K WAIT 2 0:08 4.69% [intr{irq289: igb0:que 2}]
12 root -92 - 0K 1808K WAIT 0 0:07 1.86% [intr{irq296: igb1:que 0}]
12 root -92 - 0K 1808K WAIT 5 0:14 1.27% [intr{irq292: igb0:que 5}]
12 root -92 - 0K 1808K WAIT 0 0:21 1.17% [intr{irq287: igb0:que 0}]
12 root -92 - 0K 1808K WAIT 2 0:04 0.98% [intr{irq298: igb1:que 2}]
12 root -92 - 0K 1808K WAIT 6 0:04 0.68% [intr{irq302: igb1:que 6}]
12 root -92 - 0K 1808K WAIT 6 0:16 0.59% [intr{irq293: igb0:que 6}]
53541 root 20 0 39364K 4148K bpf 7 0:03 0.49% /usr/local/bandwidthd/bandwidthd
53015 root 20 0 39364K 4368K bpf 0 0:03 0.49% /usr/local/bandwidthd/bandwidthd
52709 root 20 0 39364K 4592K bpf 1 0:03 0.39% /usr/local/bandwidthd/bandwidthd
53227 root 20 0 39364K 4352K bpf 4 0:03 0.39% /usr/local/bandwidthd/bandwidthd
53853 root 20 0 39364K 4136K bpf 2 0:03 0.39% /usr/local/bandwidthd/bandwidthd
52658 root 20 0 39364K 4604K bpf 6 0:03 0.29% /usr/local/bandwidthd/bandwidthd
48506 root 21 0 268M 37828K accept 2 0:09 0.10% php-fpm: pool nginx (php-fpm){php-fpm}
53454 root 20 0 39364K 4164K bpf 1 0:03 0.10% /usr/local/bandwidthd/bandwidthd
0 root -16 - 0K 5632K swapin 0 18.6H 0.00% [kernel{swapper}]
24 root -16 - 0K 16K pftm 4 0:20 0.00% [pf purge]
12 root -60 - 0K 1808K WAIT 0 0:18 0.00% [intr{swi4: clock (0)}]
327 root 52 0 266M 35800K accept 5 0:11 0.00% php-fpm: pool nginx (php-fpm){php-fpm}
326 root 52 0 266M 35052K accept 0 0:11 0.00% php-fpm: pool nginx (php-fpm){php-fpm}
12 root -92 - 0K 1808K WAIT 1 0:10 0.00% [intr{irq288: igb0:que 1}]
22 root -8 - 0K 128K tx->tx 1 0:10 0.00% [zfskern{txg_thread_enter}]
47952 root 20 0 10488K 2568K select 5 0:10 0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /etc/syslog.con
25 root -16 - 0K 16K - 0 0:09 0.00% [rand_harvestq]
12 root -92 - 0K 1808K WAIT 4 0:08 0.00% [intr{irq291: igb0:que 4}]
9287 root 20 0 12736K 1904K bpf 0 0:07 0.00% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
0 root -12 - 0K 5632K - 7 0:06 0.00% [kernel{zio_write_issue_0}]
0 root -12 - 0K 5632K - 5 0:06 0.00% [kernel{zio_write_issue_4}]
0 root -12 - 0K 5632K - 1 0:06 0.00% [kernel{zio_write_issue_5}]
0 root -12 - 0K 5632K - 1 0:06 0.00% [kernel{zio_write_issue_2}]
0 root -12 - 0K 5632K - 2 0:06 0.00% [kernel{zio_write_issue_1}]
0 root -12 - 0K 5632K - 4 0:06 0.00% [kernel{zio_write_issue_3}]
0 root -16 - 0K 5632K - 4 0:05 0.00% [kernel{zio_write_intr_7}]
0 root -16 - 0K 5632K - 5 0:05 0.00% [kernel{zio_write_intr_4}]
0 root -16 - 0K 5632K - 0 0:05 0.00% [kernel{zio_write_intr_3}]
0 root -16 - 0K 5632K - 2 0:05 0.00% [kernel{zio_write_intr_0}]
0 root -16 - 0K 5632K - 1 0:05 0.00% [kernel{zio_write_intr_6}]
0 root -16 - 0K 5632K - 6 0:05 0.00% [kernel{zio_write_intr_5}]
0 root -16 - 0K 5632K - 3 0:05 0.00% [kernel{zio_write_intr_1}]
0 root -16 - 0K 5632K - 5 0:05 0.00% [kernel{zio_write_intr_2}]
------------------------------------------------------------
last pid: 96083; load averages: 0.22, 0.15, 0.13  up 0+23:46:11  13:19:10
601 processes: 9 running, 471 sleeping, 8 zombie, 113 waiting
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 39M Active, 123M Inact, 548M Wired, 132K Buf, 15G Free
ARC: 155M Total, 423K MFU, 151M MRU, 568K Anon, 535K Header, 2551K Other
41M Compressed, 113M Uncompressed, 2.74:1 Ratio
Swap: 2048M Total, 2048M Free

PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 128K CPU1 1 23.7H 100.00% [idle{idle: cpu1}]
11 root 155 ki31 0K 128K CPU2 2 23.7H 100.00% [idle{idle: cpu2}]
11 root 155 ki31 0K 128K CPU6 6 23.7H 100.00% [idle{idle: cpu6}]
11 root 155 ki31 0K 128K CPU0 0 23.7H 100.00% [idle{idle: cpu0}]
11 root 155 ki31 0K 128K CPU3 3 23.7H 99.07% [idle{idle: cpu3}]
11 root 155 ki31 0K 128K CPU5 5 23.7H 97.66% [idle{idle: cpu5}]
11 root 155 ki31 0K 128K CPU7 7 23.7H 97.27% [idle{idle: cpu7}]
11 root 155 ki31 0K 128K RUN 4 23.7H 92.48% [idle{idle: cpu4}]
12 root -92 - 0K 1808K WAIT 4 0:07 6.69% [intr{irq291: igb0:que 4}]
12 root -92 - 0K 1808K WAIT 5 0:09 4.79% [intr{irq292: igb0:que 5}]
12 root -92 - 0K 1808K WAIT 7 0:05 3.96% [intr{irq294: igb0:que 7}]
12 root -92 - 0K 1808K WAIT 0 0:04 3.47% [intr{irq296: igb1:que 0}]
12 root -92 - 0K 1808K WAIT 4 0:01 2.78% [intr{irq300: igb1:que 4}]
12 root -92 - 0K 1808K WAIT 0 0:13 2.49% [intr{irq287: igb0:que 0}]
12 root -92 - 0K 1808K WAIT 3 0:04 2.20% [intr{irq290: igb0:que 3}]
12 root -92 - 0K 1808K WAIT 6 0:11 1.27% [intr{irq293: igb0:que 6}]
12 root -92 - 0K 1808K WAIT 2 0:05 1.27% [intr{irq289: igb0:que 2}]
12 root -92 - 0K 1808K WAIT 7 0:01 0.78% [intr{irq303: igb1:que 7}]
12 root -92 - 0K 1808K WAIT 3 0:02 0.68% [intr{irq299: igb1:que 3}]
12 root -92 - 0K 1808K WAIT 1 0:02 0.39% [intr{irq297: igb1:que 1}]
53541 root 20 0 39364K 4148K bpf 1 0:02 0.29% /usr/local/bandwidthd/bandwidthd
53853 root 20 0 39364K 4136K bpf 7 0:02 0.29% /usr/local/bandwidthd/bandwidthd
52709 root 20 0 39364K 4592K bpf 4 0:02 0.20% /usr/local/bandwidthd/bandwidthd
52658 root 20 0 39364K 4604K bpf 1 0:02 0.20% /usr/local/bandwidthd/bandwidthd
327 root 22 0 268M 37660K accept 5 0:11 0.10% php-fpm: pool nginx (php-fpm){php-fpm}
53227 root 20 0 39364K 4352K bpf 6 0:02 0.10% /usr/local/bandwidthd/bandwidthd
53537 root 20 0 39364K 4152K bpf 0 0:02 0.10% /usr/local/bandwidthd/bandwidthd
0 root -16 - 0K 5632K swapin 2 18.6H 0.00% [kernel{swapper}]
24 root -16 - 0K 16K pftm 0 0:19 0.00% [pf purge]
12 root -60 - 0K 1808K WAIT 3 0:18 0.00% [intr{swi4: clock (0)}]
326 root 52 0 266M 35048K accept 4 0:10 0.00% php-fpm: pool nginx (php-fpm){php-fpm}
22 root -8 - 0K 128K tx->tx 3 0:10 0.00% [zfskern{txg_thread_enter}]
47952 root 20 0 10488K 2568K select 3 0:10 0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /etc/syslog.con
48506 root 52 0 268M 37828K accept 3 0:09 0.00% php-fpm: pool nginx (php-fpm){php-fpm}
25 root -16 - 0K 16K - 0 0:09 0.00% [rand_harvestq]
9287 root 20 0 12736K 1904K bpf 0 0:06 0.00% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
0 root -12 - 0K 5632K - 4 0:06 0.00% [kernel{zio_write_issue_0}]
0 root -12 - 0K 5632K - 2 0:06 0.00% [kernel{zio_write_issue_4}]
0 root -12 - 0K 5632K - 3 0:06 0.00% [kernel{zio_write_issue_5}]
0 root -12 - 0K 5632K - 1 0:06 0.00% [kernel{zio_write_issue_2}]
0 root -12 - 0K 5632K - 5 0:06 0.00% [kernel{zio_write_issue_1}]
0 root -12 - 0K 5632K - 0 0:06 0.00% [kernel{zio_write_issue_3}]
12 root -92 - 0K 1808K WAIT 1 0:06 0.00% [intr{irq288: igb0:que 1}]
0 root -16 - 0K 5632K - 2 0:05 0.00% [kernel{zio_write_intr_7}]
0 root -16 - 0K 5632K - 3 0:05 0.00% [kernel{zio_write_intr_4}]
0 root -16 - 0K 5632K - 7 0:05 0.00% [kernel{zio_write_intr_3}]
0 root -16 - 0K 5632K - 5 0:05 0.00% [kernel{zio_write_intr_0}]
0 root -16 - 0K 5632K - 5 0:05 0.00% [kernel{zio_write_intr_6}]
0 root -16 - 0K 5632K - 6 0:05 0.00% [kernel{zio_write_intr_5}]
0 root -16 - 0K 5632K - 3 0:05 0.00% [kernel{zio_write_intr_1}]
0 root -16 - 0K 5632K - 4 0:05 0.00% [kernel{zio_write_intr_2}]
------------------------------------------------------------
last pid: 3002; load averages: 0.21, 0.15, 0.13  up 0+23:46:17  13:19:16
601 processes: 9 running, 471 sleeping, 8 zombie, 113 waiting
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 39M Active, 123M Inact, 548M Wired, 132K Buf, 15G Free
ARC: 155M Total, 423K MFU, 151M MRU, 152K Anon, 535K Header, 2551K Other
41M Compressed, 113M Uncompressed, 2.74:1 Ratio
Swap: 2048M Total, 2048M Free

PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 128K CPU1 1 23.7H 97.85% [idle{idle: cpu1}]
11 root 155 ki31 0K 128K CPU6 6 23.7H 96.19% [idle{idle: cpu6}]
11 root 155 ki31 0K 128K CPU2 2 23.7H 95.46% [idle{idle: cpu2}]
11 root 155 ki31 0K 128K CPU0 0 23.7H 94.58% [idle{idle: cpu0}]
11 root 155 ki31 0K 128K RUN 3 23.7H 94.09% [idle{idle: cpu3}]
11 root 155 ki31 0K 128K CPU7 7 23.7H 92.68% [idle{idle: cpu7}]
11 root 155 ki31 0K 128K CPU5 5 23.7H 92.19% [idle{idle: cpu5}]
11 root 155 ki31 0K 128K CPU4 4 23.7H 87.35% [idle{idle: cpu4}]
12 root -92 - 0K 1808K WAIT 4 0:07 7.67% [intr{irq291: igb0:que 4}]
12 root -92 - 0K 1808K WAIT 5 0:09 6.69% [intr{irq292: igb0:que 5}]
12 root -92 - 0K 1808K WAIT 0 0:04 3.96% [intr{irq296: igb1:que 0}]
12 root -92 - 0K 1808K WAIT 0 0:13 3.86% [intr{irq287: igb0:que 0}]
12 root -92 - 0K 1808K WAIT 7 0:05 3.76% [intr{irq294: igb0:que 7}]
12 root -92 - 0K 1808K WAIT 4 0:01 3.27% [intr{irq300: igb1:que 4}]
12 root -92 - 0K 1808K WAIT 3 0:04 3.17% [intr{irq290: igb0:que 3}]
12 root -92 - 0K 1808K WAIT 2 0:05 3.08% [intr{irq289: igb0:que 2}]
12 root -92 - 0K 1808K WAIT 6 0:12 1.86% [intr{irq293: igb0:que 6}]
12 root -92 - 0K 1808K WAIT 7 0:01 1.56% [intr{irq303: igb1:que 7}]
12 root -92 - 0K 1808K WAIT 3 0:02 0.78% [intr{irq299: igb1:que 3}]
12 root -92 - 0K 1808K WAIT 1 0:03 0.49% [intr{irq297: igb1:que 1}]
53541 root 20 0 39364K 4148K bpf 3 0:02 0.29% /usr/local/bandwidthd/bandwidthd
53853 root 20 0 39364K 4136K bpf 0 0:02 0.29% /usr/local/bandwidthd/bandwidthd
326 root 22 0 266M 35048K accept 4 0:10 0.20% php-fpm: pool nginx (php-fpm){php-fpm}
52709 root 20 0 39364K 4592K bpf 0 0:02 0.20% /usr/local/bandwidthd/bandwidthd
52658 root 20 0 39364K 4604K bpf 6 0:02 0.20% /usr/local/bandwidthd/bandwidthd
53227 root 20 0 39364K 4352K bpf 3 0:02 0.10% /usr/local/bandwidthd/bandwidthd
53015 root 20 0 39364K 4368K bpf 3 0:02 0.10% /usr/local/bandwidthd/bandwidthd
53537 root 20 0 39364K 4152K bpf 6 0:02 0.10% /usr/local/bandwidthd/bandwidthd
0 root -16 - 0K 5632K swapin 6 18.6H 0.00% [kernel{swapper}]
24 root -16 - 0K 16K pftm 6 0:19 0.00% [pf purge]
12 root -60 - 0K 1808K WAIT 0 0:18 0.00% [intr{swi4: clock (0)}]
327 root 52 0 268M 37660K accept 1 0:11 0.00% php-fpm: pool nginx (php-fpm){php-fpm}
22 root -8 - 0K 128K tx->tx 7 0:10 0.00% [zfskern{txg_thread_enter}]
47952 root 20 0 10488K 2568K select 3 0:10 0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /etc/syslog.con
48506 root 52 0 268M 37828K accept 3 0:09 0.00% php-fpm: pool nginx (php-fpm){php-fpm}
25 root -16 - 0K 16K - 3 0:09 0.00% [rand_harvestq]
9287 root 20 0 12736K 1904K bpf 7 0:06 0.00% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
0 root -12 - 0K 5632K - 1 0:06 0.00% [kernel{zio_write_issue_0}]
0 root -12 - 0K 5632K - 4 0:06 0.00% [kernel{zio_write_issue_4}]
0 root -12 - 0K 5632K - 1 0:06 0.00% [kernel{zio_write_issue_5}]
0 root -12 - 0K 5632K - 2 0:06 0.00% [kernel{zio_write_issue_2}]
0 root -12 - 0K 5632K - 7 0:06 0.00% [kernel{zio_write_issue_1}]
0 root -12 - 0K 5632K - 2 0:06 0.00% [kernel{zio_write_issue_3}]
12 root -92 - 0K 1808K WAIT 1 0:06 0.00% [intr{irq288: igb0:que 1}]
0 root -16 - 0K 5632K - 7 0:05 0.00% [kernel{zio_write_intr_7}]
0 root -16 - 0K 5632K - 7 0:05 0.00% [kernel{zio_write_intr_4}]
0 root -16 - 0K 5632K - 0 0:05 0.00% [kernel{zio_write_intr_3}]
0 root -16 - 0K 5632K - 1 0:05 0.00% [kernel{zio_write_intr_0}]
0 root -16 - 0K 5632K - 2 0:05 0.00% [kernel{zio_write_intr_6}]
0 root -16 - 0K 5632K - 0 0:05 0.00% [kernel{zio_write_intr_5}]
0 root -16 - 0K 5632K - 6 0:05 0.00% [kernel{zio_write_intr_1}]
------------------------------------------------------------

NETSTAT Command Results
Three sample "netstat -m" outputs captured on pfSense while running speed tests.
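(The samples were simply taken a few seconds apart while a speed test was running, e.g. something like:)

for i in 1 2 3; do netstat -m; sleep 5; done    # three snapshots, a few seconds apart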
------------------------------------------------------------
17202/6333/23535 mbufs in use (current/cache/total)
16790/3210/20000/1000000 mbuf clusters in use (current/cache/total/max)
16790/3197 mbuf+clusters out of packet secondary zone in use (current/cache)
0/837/837/504803 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/149571 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84133 16k jumbo clusters in use (current/cache/total/max)
37880K/11351K/49231K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
120 sendfile syscalls
59 sendfile syscalls completed without I/O request
86 requests for I/O initiated by sendfile
809 pages read by sendfile as part of a request
874 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

17206/6329/23535 mbufs in use (current/cache/total)
16794/3206/20000/1000000 mbuf clusters in use (current/cache/total/max)
16794/3193 mbuf+clusters out of packet secondary zone in use (current/cache)
0/837/837/504803 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/149571 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84133 16k jumbo clusters in use (current/cache/total/max)
37889K/11342K/49231K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
120 sendfile syscalls
59 sendfile syscalls completed without I/O request
86 requests for I/O initiated by sendfile
809 pages read by sendfile as part of a request
874 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

17194/6341/23535 mbufs in use (current/cache/total)
16782/3218/20000/1000000 mbuf clusters in use (current/cache/total/max)
16782/3205 mbuf+clusters out of packet secondary zone in use (current/cache)
0/837/837/504803 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/149571 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84133 16k jumbo clusters in use (current/cache/total/max)
37862K/11369K/49231K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
120 sendfile syscalls
59 sendfile syscalls completed without I/O request
86 requests for I/O initiated by sendfile
809 pages read by sendfile as part of a request
874 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

Thank you very much for taking time out of your day to read my post!
Cheers,
Clint
-
Seems like decent hardware. I assume Hyperthreading is enabled, since the queues are numbered 0-7. Try disabling HT. I'm not sure how much of a difference it may or may not make, but I've heard bad things about HT for certain workloads, like firewalls, and FreeBSD doesn't seem to like much more than 4 threads for networking.
-
Harvy66,
First off, thank you so very much for responding, any help at all is greatly appreciated at this point!!!
I disabled hyper-threading and re-ran the tests. Unfortunately, there really was no discernible difference in the results. The throughput seems to max out just shy of 800Mbps per interface (1.6Gbps total) no matter what I try.
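(For the record, I turned HT off in the BIOS. I believe the same thing can be done from the loader side with the tunable below, but I went the BIOS route, so treat this as untested here:)

machdep.hyperthreading_allowed="0"    # /boot/loader.conf.local - keep the scheduler off the HT logical CPUs (reboot required)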
Latency (ms)    Download (Mbps)    Upload (Mbps)
10.209 769.0828 30.01639
11.294 794.7166 42.77486
11.709 797.9307 45.43179

last pid: 60228; load averages: 0.41, 0.22, 0.10  up 0+00:06:56  07:57:25
49 processes: 1 running, 46 sleeping, 2 zombie
CPU 0: 2.0% user, 0.0% nice, 0.8% system, 7.9% interrupt, 89.4% idle
CPU 1: 2.0% user, 0.0% nice, 0.4% system, 7.5% interrupt, 90.2% idle
CPU 2: 1.2% user, 0.0% nice, 0.0% system, 20.1% interrupt, 78.7% idle
CPU 3: 1.2% user, 0.0% nice, 0.8% system, 13.0% interrupt, 85.0% idle
Mem: 109M Active, 12M Inact, 450M Wired, 132K Buf, 15G Free
ARC: 119M Total, 176K MFU, 117M MRU, 16K Anon, 382K Header, 1672K Other
34M Compressed, 86M Uncompressed, 2.52:1 Ratio
Swap: 2048M Total, 2048M Free

last pid: 84188; load averages: 0.13, 0.17, 0.09  up 0+00:08:24  07:58:53
49 processes: 1 running, 46 sleeping, 2 zombie
CPU 0: 0.5% user, 0.0% nice, 0.0% system, 15.8% interrupt, 83.8% idle
CPU 1: 0.5% user, 0.0% nice, 0.5% system, 14.9% interrupt, 84.2% idle
CPU 2: 0.5% user, 0.0% nice, 0.5% system, 9.5% interrupt, 89.6% idle
CPU 3: 0.5% user, 0.0% nice, 0.5% system, 0.0% interrupt, 99.1% idle
Mem: 109M Active, 12M Inact, 451M Wired, 132K Buf, 15G Free
ARC: 119M Total, 176K MFU, 117M MRU, 160K Anon, 383K Header, 1673K Other
34M Compressed, 86M Uncompressed, 2.52:1 Ratio
Swap: 2048M Total, 2048M Free

last pid: 83979; load averages: 0.18, 0.18, 0.09  up 0+00:07:40  07:58:09
49 processes: 1 running, 46 sleeping, 2 zombie
CPU 0: 0.0% user, 0.0% nice, 0.5% system, 14.3% interrupt, 85.2% idle
CPU 1: 1.6% user, 0.0% nice, 0.5% system, 0.0% interrupt, 97.9% idle
CPU 2: 0.5% user, 0.0% nice, 0.0% system, 13.8% interrupt, 85.7% idle
CPU 3: 1.1% user, 0.0% nice, 0.0% system, 11.6% interrupt, 87.3% idle
Mem: 109M Active, 12M Inact, 451M Wired, 132K Buf, 15G Free
ARC: 119M Total, 176K MFU, 117M MRU, 160K Anon, 383K Header, 1673K Other
34M Compressed, 86M Uncompressed, 2.52:1 Ratio
Swap: 2048M Total, 2048M Free
Displaying per-CPU statistics.
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
52257 root 1 20 0 39364K 3888K bpf 2 0:00 0.72% bandwidthd
51875 root 1 20 0 39364K 3888K bpf 1 0:00 0.71% bandwidthd
50749 root 1 20 0 39364K 4056K bpf 0 0:00 0.70% bandwidthd
51594 root 1 20 0 39364K 3896K bpf 0 0:00 0.68% bandwidthd
50861 root 1 20 0 39364K 3896K bpf 1 0:00 0.67% bandwidthd
51329 root 1 20 0 39364K 3896K bpf 0 0:00 0.65% bandwidthd
52168 root 1 20 0 39364K 3888K bpf 1 0:00 0.65% bandwidthd
50483 root 1 20 0 39364K 4068K bpf 2 0:00 0.62% bandwidthd
83979 root 1 20 0 20068K 3152K CPU3 3 0:00 0.08% top
67292 root 1 20 0 78872K 7248K select 0 0:00 0.02% sshd
45690 root 1 20 0 10488K 2564K select 1 0:00 0.01% syslogd
9036 root 5 52 0 13036K 1920K uwait 2 0:00 0.01% dpinger
14638 root 2 20 0 24660K 12500K select 1 0:00 0.01% ntpd
7102 root 1 20 0 12736K 1764K bpf 1 0:00 0.00% filterlog
321 root 1 20 0 259M 18412K kqread 3 0:00 0.00% php-fpm
23744 root 1 20 0 96068K 47132K select 0 0:00 0.00% bsnmpd
323 root 1 22 0 261M 28944K accept 3 0:00 0.00% php-fpm
322 root 1 20 0 261M 27900K accept 1 0:00 0.00% php-fpm
26063 root 1 52 20 13096K 2420K wait 1 0:00 0.00% sh
71280 root 1 20 0 13400K 3376K pause 3 0:00 0.00% tcsh
93481 root 1 52 0 39440K 2704K wait 1 0:00 0.00% login
336 root 1 40 20 19456K 2956K kqread 0 0:00 0.00% check_reload_status
67826 root 1 52 0 13096K 2584K wait 0 0:00 0.00% sh
94899 root 1 52 0 13096K 2704K wait 0 0:00 0.00% sh
94787 root 2 20 0 10588K 2320K piperd 2 0:00 0.00% sshlockout_pf
14169 root 1 20 0 12504K 1832K nanslp 3 0:00 0.00% cron
374 root 1 20 0 9176K 4692K select 0 0:00 0.00% devd
95106 root 1 52 0 13096K 2584K ttyin 1 0:00 0.00% sh
94666 root 1 52 0 10396K 2132K ttyin 2 0:00 0.00% getty
13522 root 1 20 0 53524K 4584K select 3 0:00 0.00% sshd
94078 root 1 52 0 10396K 2132K ttyin 3 0:00 0.00% getty
94421 root 1 52 0 10396K 2132K ttyin 1 0:00 0.00% getty
94237 root 1 52 0 10396K 2132K ttyin 2 0:00 0.00% getty
93518 root 1 52 0 10396K 2132K ttyin 0 0:00 0.00% getty
94167 root 1 52 0 10396K 2132K ttyin 0 0:00 0.00% getty
93789 root 1 52 0 10396K 2132K ttyin 3 0:00 0.00% getty
13806 root 1 52 0 25416K 5024K kqread 3 0:00 0.00% nginx
14093 root 1 52 0 25416K 5020K kqread 0 0:00 0.00% nginx
13731 root 1 52 0 25416K 4552K pause 2 0:00 0.00% nginx
83834 root 1 52 20 6180K 1936K nanslp 0 0:00 0.00% sleep
48054 root 1 52 0 8232K 2012K wait 3 0:00 0.00% minicron
48186 root 1 20 0 8232K 2028K nanslp 3 0:00 0.00% minicron
49134 root 1 52 0 8232K 2012K wait 1 0:00 0.00% minicron
48418 root 1 52 0 8232K 2012K wait 3 0:00 0.00% minicron
338 root 1 52 20 19456K 2860K kqread 1 0:00 0.00% check_reload_status
48856 root 1 52 0 8232K 2028K nanslp 1 0:00 0.00% minicron
49525 root 1 52 0 8232K 2028K nanslp 1 0:00 0.00% minicron

I'm going to install Windows 10 on the hardware so I can run an "apples to apples" speed test (current workstation vs. the new pfSense hardware). If those results show the hardware can in fact get into the 930-950Mbps range, my next step will be to do a fresh install of pfSense, but this time I won't run the WAN/LAN links on the same Intel I350-T2 Server NIC. I doubt splitting the links will solve it, since an Intel I350-T2 NIC should have more than enough horsepower to run two unencrypted links wide open, but I still need to rule it out just the same.
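(When I get to splitting the links, I'll want to be sure which igbX name belongs to which physical card before moving cables. My plan is to check that from the console with something like this - just a sketch:)

dmesg | grep ^igb             # shows each igb interface and where it attached on the PCI bus
pciconf -lv | grep -A3 ^igb   # vendor/device strings, to tell the add-on I350-T2 from the onboard ports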
Again, thank you very much for the reply!!!
Cheers,
Clint
-
Seems like decent hardware.
That's the same thought that came to me while reading… I wished the OP good luck as well.
-
I'm not much experienced in troubleshooting other hardware, but let's give this a shot while no one else is posting. Back up your current config and do a fresh install. Don't make any changes to the default settings, other than the bare minimum, and see how it performs. Weed out all the variables. Out of the box, pfSense has pretty good performance.
-
First, again, thank you to Harvy66 and also NollipfSense, I appreciate any insight you and others may have!
Ok, so here’s what I’ve done so far:
• Installed Windows 10 on the pfSense hardware and ran the same speed tests I've been running from my workstation; the results are on par with my workstation, 920Mbps to just over 950Mbps.

• Installed Windows Server 2016 on the pfSense hardware and ran the same speed tests; again, the results are on par with my workstation, 920Mbps to just over 950Mbps.
(Unfortunately, this only tests one of the 2 Intel I350-T2 NIC ports at a time, so while it is on par with my workstation, it's not a true throughput test of both NIC ports simultaneously.)

• Performed a fresh install of pfSense 2.4.3; the only changes made were those from the initial setup wizard, plus enabling Secure Shell.
(Unfortunately, after the fresh, unaltered install, there was no difference. The results were exactly as before: a speed loss of at least 15%, up to the mid/high 20% range.)

So, I did some more research to make sure my math was correct regarding the available PCIe bandwidth and the Intel I350-T2 Server NIC. My sanity check appears correct: PCIe 2.x has a theoretical max bandwidth of 500MB/s per lane, so a 4-lane PCI Express 2.1 card should have 2,000MB/s, or 16,000 Mbps, available to it (in a perfect world). Obviously this is way more than the card (or I) need, so we're good there.
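(Sanity-checking that math:
4 lanes x 500 MB/s = 2,000 MB/s in each direction
2,000 MB/s x 8 bits/byte = 16,000 Mbps
two gigabit ports at full duplex need at most 2 ports x 2 directions x 1,000 Mbps = 4,000 Mbps
so even allowing for protocol overhead, the x4 slot itself shouldn't be the limit.)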
But… From the Intel specs:
"Intel Ethernet Controller I350 with PCI Express* V2.1 (5 GT/s) Support enables customers to take full advantage of 1 GbE by providing maximum bi-directional throughput per port on a single adapter"Could this be more typical marketing speak??? They say “providing maximum bi-directional throughput per port”, they do not specifically say a full 1 GbE per port simultaneously! Hummmmmmm. Also, I noticed the “V2.1 (5 GT/s)”. I have the motherboard currently set to Gen 3 (8 GT/s).
Next Steps:
• Change the BIOS settings for Slot 7 on the motherboard to Gen 2 (5 GT/s) and retest.

• Split the WAN/LAN links like I mentioned before, but I haven't had time to get to this one yet.
If changing the PCI-E settings in the BIOS for Slot 7 does it, great. But if splitting the WAN/LAN fixes it, what does that tell us? That the FreeBSD 11.1 release doesn't have the greatest support/driver for Intel I350-T2 NICs? If so, has no one ever caught this before? These are just some questions to think about…
Thanx for reading and hanging in there with me!!!
Cheers,
Clint
-
Hi -
I have been using essentially the exact same hardware setup with a symmetric gigabit fiber connection for about a year now, and it has worked great. The hardware in the 5018D-FN8T has no trouble saturating a gigabit link, and that's even with the Snort IDS package also running on the interfaces. The only difference in my case is that I used the onboard NIC ports (I210 and I350) for the WAN and LAN connections vs. an add-on card in the PCI Express slot. As you have already suggested, I would first try the other system network ports to see if that makes a difference in speed. If not, there are also some NIC parameters we can tune in FreeBSD to improve performance with very high speed internet connections.
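To give a rough idea of what I mean, these are the kinds of igb(4) loader tunables people commonly experiment with (the values below are only illustrative, not a recommendation; on pfSense they would normally go in /boot/loader.conf.local and take effect after a reboot):

hw.igb.rx_process_limit="-1"       # remove the per-interrupt cap on received packets
hw.igb.max_interrupt_rate="16000"  # allow a higher per-queue interrupt rate
hw.igb.num_queues="0"              # 0 = let the driver match queues to CPU cores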
Hope this helps.
-
The FreeBSD 11.1 release doesn't have the greatest support/driver for Intel I350-T2 NICs?
I suspected this and filed a bug report; however, it was regarding an issue with Suricata in inline mode and netmap. It would be great to share your findings with them (https://bugs.freebsd.org/bugzilla/enter_bug.cgi).
-
Hi All,
(Sorry I don’t have more at this time, I have been slammed with other issues at work.)
Here's the latest update. I'm pretty sure the issue is the add-on Intel I350-T2 Server NIC in PCIe Slot 7. Changing the BIOS settings for Slot 7 on the motherboard to Gen 2 (5 GT/s), if anything, appears to have had a slightly negative effect, although a pretty negligible one.
So I looked into some NIC performance tuning for FreeBSD; most of what I found was for 10GbE, but I was hoping it might help. Here's a summary of the settings I tried (I'm not listing every combination I tested, but suffice it to say, most if not all of these changes actually made things worse by an additional 5-7% or so):
/boot/loader.conf
kern.ipc.nmbclusters="1000000"System Tunables
# set to at least 16MB for 10GE hosts
kern.ipc.maxsockbuf=16777216
# set autotuning maximum to at least 16MB too
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
# enable send/recv autotuning
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.recvbuf_auto=1
# increase autotuning step size
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_inc=524288

As I mentioned, I'm slammed at work right now, but as soon as I get a chance, my next (and I believe final) step is to do yet another fresh, unaltered install and test splitting the WAN/LAN links between the onboard I210, the onboard I350-AM4, and the add-on I350-T2. All signs point to the add-on Intel I350-T2. Now, whether it's a design limitation of the NIC itself, a faulty NIC, or the FreeBSD driver, I'm not sure how to go about verifying that part. I have to check and see if I have any additional 2-port server NICs here I can test with.
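(If anyone wants to sanity-check whether tunables like the ones listed above are actually in effect on their own box, they can be read back with sysctl, e.g.:)

sysctl kern.ipc.maxsockbuf net.inet.tcp.sendbuf_max net.inet.tcp.recvbuf_max
sysctl net.inet.tcp.sendbuf_auto net.inet.tcp.recvbuf_auto net.inet.tcp.sendbuf_inc net.inet.tcp.recvbuf_inc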
-
tman222, thank you very much for your input, it’s nice to know I picked the right hardware! Are the NIC params you mentioned the same as the ones I tried (listed above)?
-
NollipfSense, I don’t have an ID for the site you referenced, but I would be more than happy to share anything you think may help!
Cheers,
Clint
-
-
Hi Clint,
Here are some helpful threads and resources on network tuning to get you started. There are some other parameters you can tweak as well, beyond the ones that you have already adjusted:
https://forum.pfsense.org/index.php?topic=113496.0
https://forum.pfsense.org/index.php?topic=132345
https://calomel.org/freebsd_network_tuning.html
For further troubleshooting, please also see the "Where is the bottleneck ?" section here:
https://bsdrp.net/documentation/technical_docs/performance

Hope this helps - please let us know if you have more questions.