You are not logged in.
Taken from the original FAQ.
Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the server goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be "wandering duplicates" left on the net that must be dealt with if they are delivered.
Andrew Gierth ( firstname.lastname@example.org) helped to explain the closing sequence in the following usenet posting:
Assume that a connection is in ESTABLISHED state, and the client is about to do an orderly release. The client's sequence no. is Sc, and the server's is Ss. The pipe is empty in both directions.
Client Server ====== ====== ESTABLISHED ESTABLISHED (client closes) ESTABLISHED ESTABLISHED ------->> FIN_WAIT_1 <<-------- FIN_WAIT_2 CLOSE_WAIT <<-------- (server closes) LAST_ACK , ------->> TIME_WAIT CLOSED (2*msl elapses...) CLOSED
Note: the +1 on the sequence numbers is because the FIN counts as one byte of data. (The above diagram is equivalent to fig. 13 from RFC 793).
Now consider what happens if the last of those packets is dropped in the network. The client has done with the connection; it has no more data or control info to send, and never will have. But the server does not know whether the client received all the data correctly; that's what the last ACK segment is for. Now the server may or may not care whether the client got the data, but that is not an issue for TCP; TCP is a reliable rotocol, and must distinguish between an orderly connection close where all data is transferred, and a connection abort where data may or may not have been lost.
So, if that last packet is dropped, the server will retransmit it (it is, after all, an unacknowledged segment) and will expect to see a suitable ACK segment in reply. If the client went straight to CLOSED, the only possible response to that retransmit would be a RST, which would indicate to the server that data had been lost, when in fact it had not been.
(Bear in mind that the server's FIN segment may, additionally, contain data.)
DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.
More commentarty from Vic:
The second issue was addressed by Richard Stevens ( email@example.com, author of "Unix Network Programming). I have put together quotes from some of his postings and email which explain this. I have brought together paragraphs from different postings, and have made as few changes as possible.
From Richard Stevens ( firstname.lastname@example.org):
If the duration of the TIME_WAIT state were just to handle TCP's full-duplex close, then the time would be much smaller, and it would be some function of the current RTO (retransmission timeout), not the MSL (the packet lifetime).
A couple of points about the TIME_WAIT state.
A wandering duplicate is a packet that appeared to be lost and was retransmitted. But it wasn't really lost ... some router had problems, held on to the packet for a while (order of seconds, could be a minute if the TTL is large enough) and then re-injects the packet back into the network. But by the time it reappears, the application that sent it originally has already retransmitted the data contained in that packet.
Because of these potential problems with TIME_WAIT assassinations, one should not avoid the TIME_WAIT state by setting the SO_LINGER option to send an RST instead of the normal TCP connection termination (FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it's your friend and it's there to help you :-)
I have a long discussion of just this topic in my just-released "TCP/IP Illustrated, Volume 3". The TIME_WAIT state is indeed, one of the most misunderstood features of TCP.
I'm currently rewriting "Unix Network Programming" and will include lots more on this topic, as it is often confusing and misunderstood.
An additional note from Andrew:
Closing a socket: if SO_LINGER has not been called on a socket, then close() is not supposed to discard data. This is true on SVR4.2 (and, apparently, on all non-SVR4 systems) but apparently not on SVR4; the use of either shutdown() or SO_LINGER seems to be required to guarantee delivery of all data.
From: Stephen Satchell
On the question of using SO_LINGER to send a RST on close to avoid the TIME_WAIT state: I've been having some problems with router access servers (names withheld to protect the guilty) that have problems dealing with back-to-back connections on a modem dedicated to a specific channel. What they do is let go of the connection, accept another call, attempt to connect to a well-known socket on a host, and the host refuses the connection because there is a connection in TIME_WAIT state involving the well-known socket. (Steve's book TCP Illustrated Vol 1 discusses this problem in more detail.) In order to avoid the connection-refused problem, I've had to install an option to do reset-on-close in the server when the server initiates the disconnection.
My server is a Linux system running 2.0.34 if that level of detail is important to the discussion.
The IP address is usually the same, but the socket number is always different -- I've logged the socket numbers used by the router access servers and they are indeed different. I don't have a log for refused connections, however. (Interested in how to record this information, by the way.)