Loader.conf.local tuning for modern hardware
I've been looking to make best use of my fairly modern hardware with decent specs. Haswell i5 with 8GiB of ram and i350-T2 NIC
If packets come in too fast to be processed, if there's no room to place them, they will get dropped. Most places say 1k-2k, but hey, if I got lots of memory, what's 8MiB per rx/tx? Doesn't sound like it would hurt, so why not. Of course if your system can't handle the packet rate coming in, you're screwed anyway.
I recently found about this one. Must be a power of two and defaults to something small, like 8k. This is not the max number of states, but it seems to work as a cache to quickly look up the states. I saw some benchmarks where leaving the default while allowing 1mil states showed a large drop in performance, like 3x-4x, any time the number of active states went over the states_hashsize. The box did "support" all of the states, but there was a large performance hit for not also increasing this setting to be at least the same size. Unfortunately it needs to be a power of two, so it's not as simple to configure like states. Rule of thumb, states_hashsize should be the same size or larger than your max number of states.
I was not able to find out much info about this one, but assuming it's similar(was located next to net.pf.states_hashsize in the man pages), I increased it from the default of something like 8k. Only costs memory and I got plenty.
Flow control sucks unless you're point-to-point, which my LAN is not. The last thing I want is a client claiming it's overloaded and cause my router to stop sending packets to EVERYONE.
These were a recent find from a presentation from BSDCan 1016 about handling DDOS attacks with FreeBSD. The default is 100, and this limit how many packets can be processed per interrupt. I guess back before proper msi-x support, constantly processing packets could get itself get interrupted. To reduce the chance of interrupts interrupting interrupts, they set a max work done. msi-x pretty much fixes this, assuming your NIC and motherboard support MSI-x correctly. -1 just tells the NIC to process as many packets as it wants per interrupt, reducing context switching and the number of interrupts.
I'm not sure if these apply to the firewall or to application running on the server, so they may not be useful, but it came from a guide talking about tweaking FreeBSD 10.3 for a server getting hammered with lots of TCP connections.
hey nice post, i have a standard expi930ctblk single port consume gigabit nics from intel, not sure which setting I would use for flow control ect, would I also set the igb setting?
I assume it would be "igb" and stands for "Intel 1Gb". "ixgbe" stands for "Intel 10Gb"
According to this list, these three values are important for responding to legitimate traffic during a SYN flood.
This quote from the paper explains the 16 second timeouts that I recommend for "System/Advanced/Firewall & NAT" "TCP First" and "TCP Opening"
… SYN,ACK should be retransmitted to the remote system, and defaults to 3. Three retransmits corresponds to 1+2+4+8 = 15 seconds, and the odds are that if a connection cannot be established by then, the user has given up.
edit: Addition link to read http://blog.cochard.me/2016/05/playing-with-freebsd-packet-filter.html
edit2: This is what I'm using for my conf now, with no benchmarks to back anything up
hw.igb.rxd=2048 <– Lowered this because I was able to get line rate packet processing and see no reason to have larger buffers if it's keeping up
hw.igb.txd=2048 <-- Lowered this because I was able to get line rate packet processing and see no reason to have larger buffers if it's keeping up
net.pf.states_hashsize=524288 <-- I lowered my states to only 256k, so I lowered the hash size
net.pf.source_nodes_hashsize=524288 <-- Still not entirely sure what this does, but one of the DOS issues with FreeBSD is looking up sources to remove them. Not sure if related, so increased to hold state table.
net.inet.tcp.syncache.hashsize="2048" <-- Increased this. Larger hashes eat more memory, but I can make the buckets smaller while supporting the same number of cookies
net.inet.tcp.syncache.bucketlimit="16" <-- This is linear scaling. Larger buckets result in longer worst case times
net.inet.tcp.syncache.cachelimit="32768" <-- roughly hash size times bucket limit
"process_limit" becomes very important with more cores.
edit: I just found out about RST Cookies
RST cookies—for the first request from a given client, the server intentionally sends an invalid SYN-ACK. This should result in the client generating an RST packet, which tells the server something is wrong. If this is received, the server knows the request is legitimate, logs the client, and accepts subsequent incoming connections from it.
Great post! Using your parameters I was able to remove the hw.igb.num_queues="1" workaround for my HP375T-4 (aka Intel i340T4) NIC
hey nice post, i have a standard expi930ctblk single port consume gigabit nics from intel, not sure which setting I would use for flow control ect.
does this still excist i only find something like