Loader.conf.local tuning for modern hardware

  • I've been looking to make the best use of my fairly modern hardware with decent specs: a Haswell i5 with 8GiB of RAM and an i350-T2 NIC.


    If packets come in faster than they can be processed and there's no room to queue them, they get dropped. Most places say 1k-2k, but hey, if I've got lots of memory, what's 8MiB per rx/tx? It doesn't sound like it would hurt, so why not. Of course, if your system can't handle the incoming packet rate, you're screwed anyway.


    I recently found out about this one. It must be a power of two and defaults to something small, like 8k. This is not the max number of states; it seems to work as a cache for quickly looking up states. I saw some benchmarks where leaving the default while allowing 1 million states showed a large drop in performance, around 3x-4x, any time the number of active states went over states_hashsize. The box did "support" all of the states, but there was a large performance hit for not also increasing this setting to at least the same size. Unfortunately, because it needs to be a power of two, it's not as simple to configure as the state limit. Rule of thumb: states_hashsize should be the same size as or larger than your max number of states.
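
    The rule of thumb above boils down to rounding the state limit up to the next power of two. A minimal sketch of that arithmetic follows; the function name and the example state limits are made up for illustration and are not pfSense or FreeBSD settings.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() is the number of bits needed to reach n, rounded up.
    return 1 << (n - 1).bit_length()


# Hypothetical state limits, chosen only for illustration.
for max_states in (256_000, 262_144, 1_000_000):
    print(f"max states {max_states} -> hash size {next_power_of_two(max_states)}")
```

    For a limit of 1,000,000 states this gives 1048576, and a limit that is already a power of two (262144) is returned unchanged.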


    I was not able to find much info about this one, but assuming it's similar (it was located next to net.pf.states_hashsize in the man pages), I increased it from the default of something like 8k. It only costs memory, and I've got plenty.


    Flow control sucks unless you're point-to-point, which my LAN is not. The last thing I want is a client claiming it's overloaded and causing my router to stop sending packets to EVERYONE.


    These were a recent find from a BSDCan 2016 presentation about handling DDoS attacks with FreeBSD. The default is 100, and this limits how many packets can be processed per interrupt. I guess that before proper MSI-X support, constant packet processing could itself get interrupted. To reduce the chance of interrupts interrupting interrupts, they set a max amount of work done. MSI-X pretty much fixes this, assuming your NIC and motherboard support MSI-X correctly. -1 just tells the NIC to process as many packets as it wants per interrupt, reducing context switching and the number of interrupts.


    I'm not sure if these apply to the firewall or to applications running on the server, so they may not be useful, but they came from a guide about tweaking FreeBSD 10.3 for a server getting hammered with lots of TCP connections.

  • Hey, nice post. I have a standard expi930ctblk single-port consumer gigabit NIC from Intel. I'm not sure which setting I would use for flow control etc. Would I also set the igb setting?

  • I assume it would be "igb", which stands for "Intel 1Gb". "ixgbe" stands for "Intel 10Gb".

  • https://people.freebsd.org/~jlemon/papers/syncache.pdf

    According to this list, these three values are important for responding to legitimate traffic during a SYN flood.


    This quote from the paper explains the 16 second timeouts that I recommend for "System/Advanced/Firewall & NAT" "TCP First" and "TCP Opening"

    … SYN,ACK should be retransmitted to the remote system, and defaults to 3. Three retransmits corresponds to 1+2+4+8 = 15 seconds, and the odds are that if a connection cannot be established by then, the user has given up.
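
    The 1+2+4+8 sum in the quote can be reproduced with a few lines. This is only a sketch of the arithmetic, assuming a 1-second first timeout that doubles after every retransmit, as the quoted figures imply; the function name is hypothetical.

```python
def syn_ack_wait_seconds(retransmits: int, first_timeout: int = 1) -> int:
    """Total seconds waited when the timeout doubles after each retransmit.

    The initial SYN,ACK plus `retransmits` retransmissions give
    retransmits + 1 timeout intervals: 1, 2, 4, ... seconds.
    """
    intervals = [first_timeout * 2**i for i in range(retransmits + 1)]
    return sum(intervals)


print(syn_ack_wait_seconds(3))  # the paper's default of three retransmits
```

    With three retransmits the total is 15 seconds, so a 16 second timeout covers the whole sequence with one second to spare.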

    edit: Additional link to read: http://blog.cochard.me/2016/05/playing-with-freebsd-packet-filter.html

    edit2: This is what I'm using for my conf now, with no benchmarks to back anything up

    hw.igb.rxd=2048 <-- Lowered this because I was able to get line rate packet processing and see no reason to have larger buffers if it's keeping up
    hw.igb.txd=2048 <-- Lowered this because I was able to get line rate packet processing and see no reason to have larger buffers if it's keeping up
    net.pf.states_hashsize=524288 <-- I lowered my states to only 256k, so I lowered the hash size
    net.pf.source_nodes_hashsize=524288 <-- Still not entirely sure what this does, but one of the DOS issues with FreeBSD is looking up sources to remove them. Not sure if related, so increased to hold state table.
    net.inet.tcp.syncache.hashsize="2048" <-- Increased this. Larger hashes eat more memory, but I can make the buckets smaller while supporting the same number of cookies
    net.inet.tcp.syncache.bucketlimit="16" <-- This is linear scaling. Larger buckets result in longer worst case times
    net.inet.tcp.syncache.cachelimit="32768" <-- roughly hash size times bucket limit
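
    The relationships noted in the comments above (a power-of-two hash size, and a cache limit of roughly hash size times bucket limit) can be sanity-checked with a short script. This is a sketch only: the parser is a hypothetical helper that handles simple name=value lines, not a full loader.conf parser, and the values are copied from the list above with the comments stripped.

```python
CONF = '''
net.pf.states_hashsize=524288
net.pf.source_nodes_hashsize=524288
net.inet.tcp.syncache.hashsize="2048"
net.inet.tcp.syncache.bucketlimit="16"
net.inet.tcp.syncache.cachelimit="32768"
'''


def parse_tunables(text: str) -> dict:
    """Parse simple name=value lines, ignoring blanks and # comments."""
    tunables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        tunables[name.strip()] = int(value.strip().strip('"'))
    return tunables


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


t = parse_tunables(CONF)
hash_ok = is_power_of_two(t["net.pf.states_hashsize"])
expected_cache = (t["net.inet.tcp.syncache.hashsize"]
                  * t["net.inet.tcp.syncache.bucketlimit"])
print("states_hashsize is a power of two:", hash_ok)
print("hashsize * bucketlimit =", expected_cache)
```

    Here 2048 * 16 is exactly 32768, which matches the cachelimit line.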

  • https://www.bsdcan.org/2016/schedule/attachments/365_Improving PF

    "process_limit" becomes very important with more cores.

    edit: I just found out about RST Cookies

    RST cookies—for the first request from a given client, the server intentionally sends an invalid SYN-ACK. This should result in the client generating an RST packet, which tells the server something is wrong. If this is received, the server knows the request is legitimate, logs the client, and accepts subsequent incoming connections from it.
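
    To make the sequence in that description concrete, here is a toy model of the decision logic. It is not packet-handling code and not how any particular firewall implements it; the class and method names are hypothetical, and a real implementation would presumably carry the challenge in the packet itself rather than keep a per-client pending set.

```python
class RstCookieGate:
    """Toy model of the RST-cookie exchange described above."""

    def __init__(self):
        self.pending = set()   # clients that were sent a deliberately invalid SYN-ACK
        self.verified = set()  # clients that answered the challenge with an RST

    def on_syn(self, client: str) -> str:
        if client in self.verified:
            return "accept"            # known-good client: normal handshake
        self.pending.add(client)
        return "send-invalid-syn-ack"  # first contact: challenge the client

    def on_rst(self, client: str) -> bool:
        """Mark the client as legitimate if the RST answers an outstanding challenge."""
        if client in self.pending:
            self.pending.discard(client)
            self.verified.add(client)
            return True
        return False


gate = RstCookieGate()
first = gate.on_syn("192.0.2.10")   # documentation-range address, illustration only
answered = gate.on_rst("192.0.2.10")
second = gate.on_syn("192.0.2.10")
print(first, answered, second)
```

    A spoofed source address never sees the invalid SYN-ACK, so it never sends the RST and never reaches the verified set.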

  • Great post! Using your parameters I was able to remove the hw.igb.num_queues="1" workaround for my HP375T-4 (aka Intel i340T4) NIC


  • @Harvy66:


    Does this still exist? I only find something like

  • @harvy66 said in Loader.conf.local tuning for modern hardware:

    Haswell i5

    Hey Harvey66,

    With the recent 11.2 update of FreeBSD, do you have any new tweaks, tunables, or loader.conf entries? If so, would you mind sharing?