NIC buffer tuning

stvboyle

I'm running pfSense 2.0.1, amd64. I have systems with 4 NICs: 2 bce and 2 igb. The systems are fairly busy, pushing >20Kpps in and out all the time. Sometimes I see inbound errors or drops. I've been playing with the suggestions here:
http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards

However, there are some other buffer tuning settings that seem like they might help.

For the igb interfaces:
hw.igb.rxd=4096
hw.igb.txd=4096

For the bce interfaces:
hw.bce.rx_pages=8
hw.bce.tx_pages=8

Does anyone have any experience or recommendations related to these buffer settings?

Thanks!!

Nachtfalke

Perhaps this could help you to improve performance.
I just copied and pasted this here. The description is from somewhere on the web:

net.inet.tcp.sendbuf_max 		16777216 	

net.inet.tcp.recvbuf_max 		16777216 	

kern.ipc.somaxconn 	The kern.ipc.somaxconn sysctl variable limits the size of the listen queue for accepting new TCP connections. The default value of 128 is typically too low for robust handling of new connections in a heavily loaded web server environment. For such environments, it is recommended to increase this value to 1024 or higher. 	2048 	

kern.ipc.nmbclusters 	The NMBCLUSTERS kernel configuration option dictates the amount of network Mbufs available to the system. A heavily-trafficked server with a low number of Mbufs will hinder FreeBSD's ability to handle that traffic. Each cluster represents approximately 2 KB of memory, so a value of 1024 represents 2 MB of kernel memory reserved for network buffers. A simple calculation can be done to figure out how many are needed. If you have a web server which maxes out at 1000 simultaneous connections, and each connection eats a 16 KB receive and 16 KB send buffer, you need approximately 32 MB worth of network buffers to cover the web server. A good rule of thumb is to multiply by 2, so 2 x 32 MB / 2 KB = 64 MB / 2 KB = 32768. We recommend values between 4096 and 32768 for machines with greater amounts of memory. 	131072 	

kern.maxfilesperproc 	Set maximum files allowed open per process 	32768 	

kern.maxfiles 	Set maximum files allowed open 	262144 	

net.inet.ip.intr_queue_maxlen 	Maximum size of the IP input queue 	3000
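
The rule of thumb quoted for kern.ipc.nmbclusters can be worked through in a few lines of shell. This is only a sketch: the function name estimate_nmbclusters and its arguments are made up for illustration, and it uses exact kilobytes, so it gives 32000 where the quoted text rounds up to 32 MB and arrives at 32768.

```shell
#!/bin/sh
# Estimate kern.ipc.nmbclusters from the rule of thumb quoted above.
# Arguments (hypothetical, for illustration only):
#   $1 = peak simultaneous connections
#   $2 = receive buffer per connection, in KB
#   $3 = send buffer per connection, in KB
estimate_nmbclusters() {
    conns=$1
    recv_kb=$2
    send_kb=$3
    cluster_kb=2    # each mbuf cluster is roughly 2 KB
    safety=2        # rule-of-thumb multiplier from the quoted text
    total_kb=$(( conns * (recv_kb + send_kb) * safety ))
    echo $(( total_kb / cluster_kb ))
}

# The web server example from the description: 1000 connections, 16 KB each way.
estimate_nmbclusters 1000 16 16
```

Anything the calculation returns is a starting point, not a guarantee; the mbuf counters from netstat -m are what show whether the value is actually sufficient.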

I have this in system tunables but I am not sure - someone in the forum said that this will not work there and you need to put this into:
/boot/loader.conf
or
/boot/loader.conf.local (this will not be overwritten after a firmware update)
and then reboot pfSense.

stvboyle

Thanks for the reply. I'm already using most of those.

I have the following in /boot/loader.conf.local:
kern.ipc.somaxconn="4096"
kern.ipc.nmbclusters="262144"
kern.ipc.maxsockets="204800"
kern.ipc.nmbjumbop="192000"
kern.maxfiles="204800"
kern.maxfilesperproc="200000"
net.inet.icmp.icmplim="50"
net.inet.icmp.maskrepl="0"
net.inet.icmp.drop_redirect="1"
net.inet.icmp.bmcastecho="0"
net.inet.tcp.tcbhashsize="4096"
net.inet.tcp.msl="7500"
net.inet.tcp.inflight.enable="1"
net.inet.tcp.inflight.debug="0"
net.inet.tcp.inflight.min="6144"
net.inet.tcp.blackhole="2"
net.inet.udp.blackhole="1"
net.inet.ip.rtexpire="2"
net.inet.ip.rtminexpire="2"
net.inet.ip.rtmaxcache="256"
net.inet.ip.accept_sourceroute="0"
net.inet.ip.sourceroute="0"

I have the following under System Tunables:
debug.pfftpproxy (0)
vfs.read_max (32)
net.inet.ip.portrange.first (1024)
net.inet.tcp.blackhole (2)
net.inet.udp.blackhole (1)
net.inet.ip.random_id (1)
net.inet.tcp.drop_synfin (1)
net.inet.ip.redirect (1)
net.inet6.ip6.redirect (1)
net.inet.tcp.syncookies (1)
net.inet.tcp.recvspace (65228)
net.inet.tcp.sendspace (65228)
net.inet.ip.fastforwarding (0)
net.inet.tcp.delayed_ack (0)
net.inet.udp.maxdgram (57344)
net.link.bridge.pfil_onlyip (0)
net.link.bridge.pfil_member (1)
net.link.bridge.pfil_bridge (0)
net.link.tap.user_open (1)
kern.randompid (347)
net.inet.ip.intr_queue_maxlen (1000)
hw.syscons.kbd_reboot (0)
net.inet.tcp.inflight.enable (1)
net.inet.tcp.log_debug (0)
net.inet.icmp.icmplim (0)
net.inet.tcp.tso (1)
kern.ipc.maxsockbuf (4262144)

As far as I can tell, it is the NIC dropping packets. I do not see any mbuf misses:
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

What I am seeing is this:
dev.igb.0.mac_stats.missed_packets: 1043856
dev.igb.0.mac_stats.recv_no_buff: 8191

I found this article from Intel that indicates it could be a processing issue, where the CPU is not returning buffers to the NIC fast enough:
http://communities.intel.com/community/wired/blog/2011/06/24/parameter-talk-tx-and-rx-descriptors

I still need to tweak the igb queues from 4 down to 1. However, I do not see excessive CPU load:
last pid: 26963; load averages: 0.27, 0.28, 0.22 up 126+21:00:11 08:24:38
143 processes: 7 running, 110 sleeping, 26 waiting
CPU: 0.1% user, 0.1% nice, 1.2% system, 17.9% interrupt, 80.7% idle
Mem: 512M Active, 66M Inact, 686M Wired, 140K Cache, 102M Buf, 2652M Free
Swap: 8192M Total, 8192M Free

Since these are production systems, my ability to experiment is somewhat limited. As time permits I'll be testing the various settings and will report back here if I find settings that help.

Thanks.

dhatz

There have been several commits to the driver in recent months, check http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?view=log

stvboyle

Thanks for the pointer to the driver changes. Any idea when some of those might be included in pfSense?

I'm not likely to create my own build that tries to pull in driver changes.

stvboyle

Wanted to follow up here. I did two things that have really improved the packet loss situation I've been seeing:

1. I disabled pfsync - this eliminated a ton of traffic on the CARP interfaces and I have no more packet loss there. We don't care about making the states redundant in our case. I see this as more of a workaround than a solution.

2. On our Intel NICs I set the following in /boot/loader.conf.local:
hw.igb.rxd=4096
hw.igb.txd=4096

That eliminated all the packet loss I was seeing on my Intel NICs.
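
For anyone scripting this across several firewalls, here is a minimal sketch of setting a tunable in a loader.conf-style file without ending up with duplicate lines. The function name set_loader_tunable is made up, and the example writes to a scratch file from mktemp rather than the real /boot/loader.conf.local:

```shell
#!/bin/sh
# Set name="value" in a loader.conf-style file, replacing any existing
# line for the same name so repeated runs do not create duplicates.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    tmp="${file}.tmp.$$"
    touch "$file"
    # Keep every line that does not assign this tunable.
    awk -F= -v name="$name" '$1 != name' "$file" > "$tmp"
    # Append the new assignment in the quoted form loader.conf uses.
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Example against a scratch file (not the live config):
conf=$(mktemp)
set_loader_tunable "$conf" hw.igb.rxd 4096
set_loader_tunable "$conf" hw.igb.txd 4096
cat "$conf"
```

The values only take effect after a reboot, since the driver reads them when it attaches.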

Nachtfalke

What is this command doing?
What are the default values?

Tikimotel

I have a dual Intel NIC setup (em0/em1).

Here are my system tunables:


net.inet.tcp.sendbuf_max;Set autotuning maximum to at least 16MB;16777216
net.inet.tcp.recvbuf_max;Set autotuning maximum to at least 16MB;16777216
net.inet.tcp.sendbuf_auto;Enable send/recv autotuning;1
net.inet.tcp.recvbuf_auto;Enable send/recv autotuning;1
net.inet.tcp.sendbuf_inc;Increase autotuning step size;524288
net.inet.tcp.recvbuf_inc;Increase autotuning step size;524288
net.inet.tcp.slowstart_flightsize;Squid optimize: It would be more beneficial to increase the slow-start flightsize via the net.inet.tcp.slowstart_flightsize sysctl rather than disable delayed acks. (default = 1, --> 64) 262144/1460=maxnumber;64
net.inet.udp.recvspace;Optimized.;65536
net.local.stream.recvspace;Optimized. (10x (mtu 16384+40));164240
net.local.stream.sendspace;Optimized. (10x (mtu 16384+40));164240
kern.ipc.somaxconn;Optimized for squid;4096
net.inet.tcp.mssdflt;Optimized. (default = 512, --> 1460);1460
net.inet.tcp.inflight.min;FreeBSD Manual recommended. 6144;6144
net.inet.tcp.local_slowstart_flightsize;Loopback optimized. (for MTU 16384) see "net.local.stream.****space";10
net.inet.tcp.nolocaltimewait;Loopback optimized.;1
net.inet.tcp.delayed_ack;Optimized. see "net.inet.tcp.slowstart_flightsize";1
net.inet.tcp.delacktime;Optimized.;100

";" = the table separator
Check with ifconfig to see if device "lo0" has an MTU of 16384.
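
The arithmetic behind a few of the values in the table can be reproduced like this. It is just a sketch: the helper names are made up, and the MTU parser assumes the usual ifconfig layout where the word "mtu" is followed by the number.

```shell
#!/bin/sh
# Extract the MTU from "ifconfig lo0" output supplied on stdin.
mtu_from_ifconfig() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}

# net.local.stream.sendspace / recvspace as 10 x (MTU + 40),
# following the note in the table above.
local_stream_space() {
    echo $(( 10 * ($1 + 40) ))
}

# Upper bound noted for slowstart_flightsize: 262144 bytes / 1460-byte segments.
max_flightsize() {
    echo $(( 262144 / 1460 ))
}

# On the firewall (not executed here):
#   local_stream_space "$(ifconfig lo0 | mtu_from_ifconfig)"
local_stream_space 16384
max_flightsize
```

If lo0 reports a different MTU, the two net.local.stream values would need to be recalculated with that number.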

My "loader.conf.local":


# Increase nmbclusters for Squid and intel
kern.ipc.nmbclusters="131072"

# Max. backlog size
kern.ipc.somaxconn="4096"

# On some systems HPET is almost 2 times faster than default ACPI-fast
# Useful on systems with lots of clock_gettime / gettimeofday calls
# See http://old.nabble.com/ACPI-fast-default-timecounter,-but-HPET-83--faster-td23248172.html
# After revision 222222 HPET became default: http://svnweb.freebsd.org/base?view=revision&revision=222222
kern.timecounter.hardware="HPET"

# Tweaks hardware
coretemp_load="yes"
legal.intel_wpi.license_ack="1"
legal.intel_ipw.license_ack="1"

# Useful if you are using Intel-Gigabit NIC
hw.em.rxd="4096"
hw.em.txd="4096"
hw.em.tx_int_delay="512"
hw.em.rx_int_delay="512"
hw.em.tx_abs_int_delay="1024"
hw.em.rx_abs_int_delay="1024"
hw.em.enable_msix="1"
hw.em.msix_queues="2"
hw.em.rx_process_limit="100"
hw.em.fc_setting="0"

stvboyle

Hey Nachtfalke,

You can check the current setting with:
kenv -q | grep hw.igb

I believe the default is 2048 and maxes out at 4096 in pfSense. I put the config entries I mentioned previously in /boot/loader.conf.local - a reboot will be required. I think the documentation for the igb driver is here:
http://www.freebsd.org/cgi/man.cgi?query=igb&sektion=4&manpath=FreeBSD+8.1-RELEASE
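
To confirm after the reboot that the loader actually picked up the values, something like the following could help. It is a sketch that assumes kenv prints one name="value" pair per line; the helper name kenv_value is made up.

```shell
#!/bin/sh
# Print the value of one loader variable from name="value" lines on stdin,
# with the surrounding double quotes removed.
kenv_value() {
    awk -F= -v name="$1" '$1 == name { gsub(/"/, "", $2); print $2 }'
}

# On the firewall (not executed here):
#   kenv -q | kenv_value hw.igb.rxd
```

An empty result would mean the variable was never set by the loader, for example because of a typo in /boot/loader.conf.local.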

Some of the Intel NICs use the em driver; it has similar settings:
http://www.freebsd.org/cgi/man.cgi?query=em&sektion=4&manpath=FreeBSD+8.1-RELEASE

Regards,
Steve