1. Actual Problems
A preliminary check showed that, at the time new outbound TCP connections could not be created, the online server was holding a very large number of TCP connections in the TIME_WAIT state (more than 10 million connections on a single host at peak, of which the module that raised the alarm accounted for about 20,000 in TIME_WAIT). As a result, the server could not establish new TCP connections to the downstream module.
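One way to confirm this kind of diagnosis on a Linux host is to count connections per state. The sketch below assumes the Linux /proc/net/tcp text format, in which the fourth column ("st") holds a hexadecimal state code and 06 means TIME_WAIT. The function name count_tcp_states and the SAMPLE text are made up for illustration and are not taken from the incident.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(proc_net_tcp_text):
    """Count connections per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts

# Hypothetical two-entry sample (illustration only).
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:1F90 0100007F:C350 06 00000000:00000000
   1: 0100007F:1F90 00000000:0000 0A 00000000:00000000
"""

print(count_tcp_states(SAMPLE))
```

On a real host the same function could be fed the contents of /proc/net/tcp (IPv4 sockets only); a TIME_WAIT count in the tens of thousands or more would match the symptom described above.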
TIME_WAIT is tied to the state transitions TCP goes through when a connection is released, and to the effect that specific socket API calls have on the TCP state. These concepts are introduced step by step below.
2. TCP state transitions
TCP is a connection-oriented protocol: a connection must be established before two peers can communicate. A connection can be abstracted as a 4-tuple (also known as a socket pair): (local_ip, local_port, remote_ip, remote_port). These four elements uniquely identify a TCP connection.
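The 4-tuple can be read directly from a connected socket. The following is a minimal sketch, assuming IPv4 and a loopback connection created only for the demonstration; the helper name four_tuple is made up here.

```python
import socket

def four_tuple(sock):
    """Return (local_ip, local_port, remote_ip, remote_port) for a connected IPv4 socket."""
    local_ip, local_port = sock.getsockname()
    remote_ip, remote_port = sock.getpeername()
    return (local_ip, local_port, remote_ip, remote_port)

# Listen on an ephemeral loopback port, then connect to it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

client_view = four_tuple(client)
server_view = four_tuple(server_side)
print("client view:", client_view)
print("server view:", server_view)

for s in (client, server_side, listener):
    s.close()
```

Both ends describe the same connection; the server's view is simply the client's view with the local and remote halves swapped.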
1) TCP connection establishment
TCP connection establishment is usually called the "three-way handshake". It can be explained as follows:
A. The client sends a SYN to the server and specifies its initial sequence number, J.
B. The server sends its own SYN and specifies its initial sequence number, K. In the same segment it acknowledges the client's SYN J with ACK J+1 (note: J+1 means that the next sequence number the server expects from the client is J+1).
C. After the client receives the SYN+ACK from the server, it sends ACK K+1. The TCP connection is now established.
In fact, during the three-way handshake the two sides also use the SYN segments to negotiate parameters such as the MSS and timestamps. These are protocol details; this article is only meant as an overview and does not go into them.
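The sequence and acknowledgment numbers in steps A to C can be traced with a small simulation. This is only a sketch of the arithmetic: no packets are sent, 32-bit sequence number wraparound is ignored, and the function name and tuple layout are made up for illustration.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    # A. The client's SYN carries its initial sequence number J.
    syn = ("client", "SYN", client_isn, None)
    # B. The server's SYN carries K and acknowledges J + 1,
    #    the next sequence number it expects from the client.
    syn_ack = ("server", "SYN+ACK", server_isn, client_isn + 1)
    # C. The client's SYN consumed one sequence number, so it now
    #    sends from J + 1 and acknowledges K + 1.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```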
2) TCP connection termination
In contrast to the three-way handshake that establishes a connection, releasing a TCP connection takes four steps (also known as the "four-way wave"). It can be explained as follows:
A. One side of the connection first calls close() to initiate the active close. This call causes the TCP transport layer to send a FIN segment to the remote peer, indicating that the application initiating the active close will not send any more data. (Note: the promise of "no more data" is made from the application's point of view. The TCP transport layer still sends any unsent data left in the kernel's TCP send buffer for that application.)
After the remote peer receives the FIN, it has to complete the passive close, which takes two steps:
B. First, the TCP transport layer sends an ACK for the peer's FIN (the ACK number is the sequence number of the peer's FIN plus 1).
C. Then the application receives EOF (end-of-file) from the peer: the peer's FIN is delivered to the application as EOF. Knowing that no more data will arrive from the peer on this connection, the application also calls close() to close the connection, which causes the TCP transport layer to send a FIN.
D. The peer that initiated the active close sends an ACK after receiving the remote peer's FIN. At this point the TCP connection is closed.
Note 1: Either side of a TCP connection can call close() first to initiate the active close. The client is used as the active closer here only as an example; it does not mean that only the client can initiate it.
Note 2: The description of TCP connection establishment and release above ignores protocol details such as retransmission (for whatever reason) and congestion control. If you are interested, see the TCP RFCs, for example RFC 793.
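Steps A and C can be observed from the application's side with ordinary sockets. The sketch below assumes a loopback connection set up only for the demonstration; the variable names active and passive are illustrative. It shows that data written before close() is still delivered, and that the peer's FIN then appears to the application as an empty read (EOF).

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

active = socket.create_connection(listener.getsockname())
passive, _ = listener.accept()
passive.settimeout(5)          # avoid hanging forever if something goes wrong

active.sendall(b"last bytes")
active.close()                 # step A: active close, FIN follows the buffered data

chunks = []
while True:
    chunk = passive.recv(1024)
    if chunk == b"":           # step C: an empty read is EOF, the peer's FIN has arrived
        break
    chunks.append(chunk)
received = b"".join(chunks)

passive.close()                # step C continued: the passive side sends its own FIN
listener.close()
print(received)
```

The ACKs in steps B and D are generated by the kernel and are not visible at this level.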
3) TCP state transition diagram
The sections above describe how a TCP connection is established and released. Here is a general description of how the TCP state machine moves between states, based on the state transition diagram in RFC 793.
The TCP state machine has 11 states, and transitions between them are driven by socket API calls and by the segments received. Although the diagram looks complicated, it is easy to understand for anyone with some TCP network programming experience. For details of the transitions, see section 2.6 of UNIX Network Programming, Volume 1.
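The state machine can be written down as a transition table. The following is a simplified sketch that covers only the main paths of the RFC 793 diagram (no resets, timeouts during setup, or simultaneous open); the event names such as "recv_fin" are made up for illustration.

```python
# (current_state, event) -> next_state, main paths only.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",      # active close
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",   # passive close
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",       # simultaneous close
    ("FIN_WAIT_1", "recv_fin_ack"): "TIME_WAIT",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}

STATES = sorted({state for (state, _event) in TRANSITIONS} | set(TRANSITIONS.values()))

def run(events, state="CLOSED"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

active_closer = run(["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin"])
passive_closer = run(["passive_open", "recv_syn", "recv_ack", "recv_fin", "close", "recv_ack"])
print(len(STATES), active_closer, passive_closer)
```

In this table all three paths out of FIN_WAIT_1 end in TIME_WAIT, while the passive closer goes through CLOSE_WAIT and LAST_ACK straight back to CLOSED.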
3. TIME_WAIT state
With the groundwork above in place, we can finally turn to the main topic of this article.
The TCP state transition diagram shows that only the side that calls close() to initiate the active close enters the TIME_WAIT state, and that it must enter it: the three transition paths in the lower left corner of the diagram all pass through TIME_WAIT before returning to the initial CLOSED state.
The diagram also shows that a connection that enters TIME_WAIT must wait 2MSL before returning to the initial state. MSL is the Maximum Segment Lifetime, that is, the longest time a segment can survive in the network. Every TCP implementation must choose an MSL value. RFC 1122 recommends 2 minutes, while Berkeley-derived implementations have traditionally used 30 seconds. The typical duration of TIME_WAIT is therefore between 1 and 4 minutes.
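The 1 to 4 minute range follows directly from doubling the two MSL values mentioned above. A trivial sketch of that arithmetic (the function name is illustrative):

```python
def time_wait_seconds(msl_seconds):
    """TIME_WAIT lasts twice the Maximum Segment Lifetime (2MSL)."""
    return 2 * msl_seconds

for label, msl in [("Berkeley-derived, MSL = 30 s", 30),
                   ("RFC 1122 recommendation, MSL = 120 s", 120)]:
    print(label, "->", time_wait_seconds(msl) // 60, "minute(s) in TIME_WAIT")
```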
There are two main reasons for the TIME_WAIT state:
1) Reliable release of the full-duplex TCP connection
Recall the connection release described earlier, and suppose the final ACK (the last of the four segments) sent by the side that initiated the active close (the client in the figure) is lost in the network. Because of TCP's retransmission mechanism, the side performing the passive close (the server in the figure) will resend its FIN. Until that retransmitted FIN reaches the client, the client, as the active closer, must keep the connection state even though it has already called close(). Specifically, the resources of the connection (local_ip, local_port) cannot be released or reassigned immediately. Only after the FIN retransmitted by the remote peer has arrived and the client has re-sent its ACK can the connection return to the initial CLOSED state. If the active closer did not enter TIME_WAIT to keep this state, its TCP transport layer would answer the retransmitted FIN with an RST, which the other side would treat as an error (even though this is a normal connection close, not an exception).
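The difference between the two outcomes can be captured in a toy model. This sketch does not touch the network; it only encodes the rule described above, and the function name is made up for illustration.

```python
def reply_to_retransmitted_fin(active_closer_state):
    """What the active closer answers when the peer's FIN is retransmitted."""
    if active_closer_state == "TIME_WAIT":
        # Connection state still exists, so the final ACK is simply re-sent.
        return "ACK"
    if active_closer_state == "CLOSED":
        # No matching connection is known any more: the FIN is answered with a reset.
        return "RST"
    raise ValueError("this sketch only models TIME_WAIT and CLOSED")

print("with TIME_WAIT:   ", reply_to_retransmitted_fin("TIME_WAIT"))
print("without TIME_WAIT:", reply_to_retransmitted_fin("CLOSED"))
```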
2) Letting old segments expire in the network
To illustrate this, first assume that TCP had no TIME_WAIT restriction, and that there is a TCP connection (local_ip, local_port, remote_ip, remote_port). For some reason the connection is closed and then a new connection with the same 4-tuple is created right away. As described earlier, a TCP connection is uniquely identified by its 4-tuple, so under this assumption the TCP stack cannot tell the two connections apart. As far as it is concerned they are the same connection; the release and re-establishment in between is invisible to it. The following can then happen: data sent by the local peer on the previous connection reaches the remote peer late, and the remote peer's TCP transport layer accepts it as normal data of the current connection and passes it up to the application. (In the scenario we assume, the old connection was torn down and a new connection with the same 4-tuple was established before the old data arrived, so this old data should not be passed up to the application.) The result is corrupted data and all kinds of unpredictable behavior. As a reliable transport protocol, TCP must take this situation into account and prevent it at the protocol level. This is the second reason the TIME_WAIT state exists.
Specifically, after the local peer actively calls close(), the connection enters the TIME_WAIT state, and while it is in that state a new connection with the same 4-tuple cannot be established; that is, the local port held by the active closer cannot be reassigned during TIME_WAIT. Because TIME_WAIT lasts 2MSL, any old segments from either direction of the old connection will have expired (a segment does not live longer than MSL). After that, a new connection can be created with the same 4-tuple without data from the two connections getting mixed up.
Another, more in-depth explanation
There are two reasons for the TIME_WAIT state:
(1) It makes the four-step close more reliable. The last ACK of the four steps is sent by the active closer. If this ACK is lost, the passive closer sends its FIN again. If the active closer stays in TIME_WAIT for 2MSL, there is a much better chance that the lost ACK can be sent again.
(2) It prevents a lost duplicate from corrupting a new, healthy connection. Lost duplicates are common in real networks. They are often caused by a router failure that keeps routes from converging, so that a packet bounces around in something like an endless loop between routers A, B, and C. The IP header has a TTL, which limits the maximum number of hops a packet can take. Such a packet therefore has one of two fates: either its TTL reaches 0 and it disappears from the network, or the routes converge before the TTL reaches 0 and the packet uses its remaining hops to reach its destination. Unfortunately for that packet, TCP has already retransmitted an identical segment after a timeout, and the retransmission arrived first, so the late copy is destined to be discarded by the TCP stack.
A related concept is the "incarnation connection": a new connection with the same socket pair as an earlier one is called an "incarnation of the previous connection". A lost duplicate combined with an incarnation connection causes a fatal transmission error. TCP is a byte stream: segments may arrive out of order, and the TCP stack reassembles them by sequence number. Suppose an incarnation connection is expecting seq = 1000 when a lost duplicate with seq = 1000, len = 1000 arrives. TCP considers this lost duplicate valid and puts it in the receive buffer, which corrupts the data. A TIME_WAIT period of 2MSL ensures that all lost duplicates have disappeared, so they cannot cause errors on new connections.
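The seq = 1000 case can be made concrete with a simplified acceptability check. This is a sketch under loud assumptions: it only tests whether a segment starts inside the receive window, it ignores 32-bit sequence number wraparound and the other checks a real stack performs, and the names accepts, rcv_nxt, and rcv_wnd are illustrative (the window size of 4096 is an arbitrary value).

```python
def accepts(rcv_nxt, rcv_wnd, seg_seq, seg_len):
    """Simplified check: does a data segment start inside the receive window?"""
    return seg_len > 0 and rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd

# The new incarnation expects seq = 1000 next.
rcv_nxt, rcv_wnd = 1000, 4096

# A lost duplicate from the previous connection with the same 4-tuple shows up.
lost_duplicate_accepted = accepts(rcv_nxt, rcv_wnd, seg_seq=1000, seg_len=1000)

# A stale segment far outside the window would be rejected instead.
far_segment_accepted = accepts(rcv_nxt, rcv_wnd, seg_seq=900000, seg_len=1000)

print(lost_duplicate_accepted, far_segment_accepted)
```

Nothing in the segment itself says which incarnation it belongs to, which is why the protection has to come from waiting 2MSL before the 4-tuple is reused.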
Q: What does SO_REUSEADDR mean when writing a TCP (SOCK_STREAM) server program?
A: This socket option tells the kernel that if the port is busy but the TCP state is TIME_WAIT, the port can be reused. If the port is busy and the TCP state is anything else, reusing the port still fails with an error indicating that the address is already in use. The option is very useful when you want to restart a server immediately after stopping it and the new socket uses the same port. Be aware that any unexpected data arriving at that point could confuse the server's responses, but this is only a possibility; in practice it is very unlikely.
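A minimal sketch of setting the option on a listening socket is shown below. It assumes a loopback address and port 0 (kernel-chosen port) purely so the example can run anywhere; a real server would bind to its fixed service port. The option has to be set before bind() to affect that bind.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Allow bind() to succeed even if old connections on this port are still in TIME_WAIT.
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

server.bind(("127.0.0.1", 0))   # port 0 is a demo choice: the kernel picks a free port
server.listen(16)

reuse_enabled = server.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0
port = server.getsockname()[1]
print("listening on port", port, "with SO_REUSEADDR =", reuse_enabled)

server.close()
```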