TCP timeouts

  • I was tossing around the idea of lowering some of the timeouts involved with TCP/UDP when a connection is not established. To me it seems odd to have a 120sec timeout when you should be getting back responses in tens of milliseconds. A 3-way handshake should complete in about 30 seconds at worst. Most TCP implementations will assume that no response within 1 second means the packet was lost and will retry; if there is no response within 2 seconds, they try again; and after another 4 seconds, they assume the connection is dead. So at worst, if the other side does not respond in under 7 seconds (6.9999999999), the connection is considered dead and TCP throws a timeout exception.

    This is done per stage of the 3-way handshake, which gives you an upper bound of 21 seconds. Unless I'm getting insanely high bufferbloat or I'm communicating with Mars, packets should be responded to in a timely fashion, except for possibly 1-3 seconds of bufferbloat. But I'm not looking to reduce the timeouts for established connections, just for connections still being established: First and Opening. Maybe setting First to 25 and Opening to 15.
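    The arithmetic above can be sketched roughly as follows. This is only an illustration under the post's own assumptions (a 1 second first wait that doubles on each retry, three attempts per stage, three stages); the function names are made up for the example and real TCP stacks differ in the details.

```python
def retry_schedule(initial_wait, attempts):
    """Seconds waited on each attempt, doubling every time."""
    return [initial_wait * 2 ** i for i in range(attempts)]


def give_up_after(initial_wait, attempts):
    """Total seconds waited before the connection is declared dead."""
    return sum(retry_schedule(initial_wait, attempts))


# Assumed values from the post: 1 s first wait, 3 attempts per stage,
# 3 stages in the handshake.
per_stage = give_up_after(1.0, 3)      # 1 + 2 + 4
handshake_bound = 3 * per_stage

print(per_stage, handshake_bound)
```

    Changing the first argument shows how sensitive the bound is to the initial wait, which is the number the rest of the thread ends up arguing about.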

  • How long does it take to initiate a connection over an LTE data link? TCP timeouts in general have a fair bit of interconnectedness; sometimes subtle, not-so-obvious breakage can occur.

    If everything is on the same LAN, yes things should respond in a timely fashion, but the overall time is governed by the slowest (lowest bit rate) link.

    Not saying there isn't room for tuning, but just keep an eye out for weird behavior.
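    One way to answer the "how long does it take" question on a given link is simply to time a connect. A minimal sketch, assuming Python's standard socket module; `time_connect` is a name invented for this example, and the local listener is only there so the snippet runs without touching the network.

```python
import socket
import time


def time_connect(host, port, timeout=30.0):
    """Return the seconds taken to complete a TCP connect to host:port.

    Raises OSError (including a timeout) if the connection fails.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start


if __name__ == "__main__":
    # Local listener so the sketch is self-contained; swap in a real
    # host and port to measure an actual LTE, satellite, or LAN path.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        elapsed = time_connect("127.0.0.1", port)
        print(f"connect took {elapsed * 1000:.2f} ms")
```

    Running it a few dozen times over the slow link and looking at the worst case, not the average, gives a better idea of how much headroom a lowered timeout would leave.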

  • I guess the 1 sec I quoted is a default for something, but I can't find what. Looks like 3 sec is more common.

    3sec retry + 6sec retry + 12sec retry = 21sec upper bound.

    21sec is a maximum per step of the 3-way handshake. If the first packet timeout of 120sec is meant to cover all 3 steps, then 120sec is reasonable. I took first to mean the state of the connection is such that one packet has been sent. As soon as the first response comes back (step 2 of the handshake), it is now "establishing", which has a 30sec timeout. So to me, first should also be 30 seconds. 30sec for a 21sec upper bound isn't bad.
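    To put numbers on the comparison above, here is a small sketch of how much slack each state timeout leaves over the 21sec bound. The 3/6/12 schedule and the 120 and 30 second values are taken from the posts; the dictionary keys are just labels for this example, not configuration names.

```python
# Assumed retry schedule from the post: 3 s, then 6 s, then 12 s.
schedule = [3 * 2 ** i for i in range(3)]
per_step_bound = sum(schedule)

# State timeouts discussed in the thread, in seconds. "first (proposed)"
# is the 30 s suggested above, not a shipped default.
timeouts = {"first": 120, "establishing": 30, "first (proposed)": 30}

# Seconds each timeout keeps the state around beyond the retry bound.
headroom = {name: value - per_step_bound for name, value in timeouts.items()}

for name, spare in headroom.items():
    print(f"{name}: {timeouts[name]} s timeout, {spare} s past the {per_step_bound} s bound")
```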

  • Default: tcp.closed                  90s

    I wanted to look this up and found out it is recommended to be 2 minutes to allow "out of order packets" to make it to their destination. You know, if you're using 1970s hardware where packets can have up to 90 seconds of delay. Nothing says awesome like a 90,000ms ping because of some random route.

    Some of these TCP timeouts are dynamic, but several of these have fixed maximums based on the number of retries, like 21sec as I've posted before.

    Mer mentioned LTE data link, which is another way to say "what about other connection technologies"? Unless the remote server you're connecting to is configured for longer timeouts as well, you're still pretty much going to have the same issue. Even if your device was configured to handle, say, a 60sec TCP SYN timeout, the other side probably won't be. Your device will be waiting up to 60 seconds, but the other side is still going to be waiting at most 21 seconds.

    As long as you're using or connecting to a BSD, Linux, Apple, or Windows system, you're still going to be limited to about 21 seconds unless the remote system has a custom configuration for its TCP settings. If your ISP's network is running a transparent TCP proxy, then it would make a difference. I'm pretty sure TCP proxies are common with sat links.

    That closed timeout is a funny one. It exists for the sole purpose of handling out-of-order packets that have not arrived yet but were in flight before the connection got closed. For any Earth-bound routes, even 10 seconds would be incredibly long. The default 90s seems like an eternity. China has a 250ms RTT to me in Midwest USA, which means about 125ms one way. In 90 seconds a packet could cross the distance between Midwest USA and China 720 times. I don't know about others, but my packets have a 64 hop limit. If you assume the average hop is a ghastly 1000ms, that puts a 64sec max limit.

    New York City to London is about 70ms and that's a long single hop. 70ms * 64 hop limit is about 4.5 seconds.

    I'm just saying, some of these values are a bit large. Of course, that's with my limited understanding and knowledge.
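    The back-of-the-envelope numbers above work out as follows. Every value is one quoted in the post (90s closed timeout, 250ms RTT, 64 hop limit, 1000ms and 70ms per hop); nothing here is measured.

```python
closed_timeout_s = 90.0        # default tcp.closed quoted in the post
rtt_s = 0.250                  # Midwest USA <-> China round trip
one_way_s = rtt_s / 2          # about 125 ms each way

# How many one-way trips fit inside the closed timeout.
crossings = closed_timeout_s / one_way_s

hop_limit = 64
ghastly_path_s = hop_limit * 1.0       # every hop takes 1000 ms
long_hop_path_s = hop_limit * 0.070    # every hop as slow as NYC-London

print(crossings, ghastly_path_s, round(long_hop_path_s, 2))
```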

  • It's been a while since I've had to look at the exact usage of TCP Close from a programming view. While the TCP protocol officially allows data to be received during and after a close has been initiated and even completed, the common TCP APIs do not allow this. Calling close on TCP is a blocking call and you cannot read from the stream once closed. For all intents and purposes, once a connection is closed, it's useless.

    The only reason I see for leaving the state of a closed connection around for a long period is for situations where one side closes the connection because of a timeout but the other side still thinks it's open. In these cases, leaving the state in the firewall allows the remote system to potentially send a RST packet when an unexpected packet comes in. This would allow the connection to quickly die on the remaining computer's side of things. But really, one side has already timed out and closed the connection without a complete exchange, so I personally see no reason to be so courteous. Let them wait the 21sec timeout.

    If someone has some other opinions, facts, or corrections to add to this, please do. I feel like I'm muttering to myself.
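    On the point about reading after close, here is a small sketch of the difference between a half-close and a full close at the API level. It assumes Python's standard socket module and uses socket.socketpair() as a stand-in for a TCP connection (on Unix this is a Unix-domain stream socket, not TCP), so it shows the API behavior only, not what happens on the wire.

```python
import socket

# socketpair() gives two connected stream sockets without any network.
a, b = socket.socketpair()

# Half-close: a stops sending but can still receive.
a.shutdown(socket.SHUT_WR)
assert b.recv(16) == b""            # b sees end-of-stream from a
b.sendall(b"late data")
received_after_half_close = a.recv(16)

# Full close: the descriptor is gone, so any further read fails.
a.close()
try:
    a.recv(16)
    read_after_close_failed = False
except OSError:
    read_after_close_failed = True

b.close()
print(received_after_half_close, read_after_close_failed)
```

    So data can still arrive after one direction has been shut down, but once close() has been called the application has no way to read it, which matches the "once a connection is closed, it's useless" view above.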