Reference articles
Those things about TCP (part 2)
http://coolshell.cn/articles/11609.html
TCP/IP Illustrated - congestion control: slow start, congestion avoidance, fast retransmit, fast recovery
http://blog.csdn.net/kinger0/article/details/48206999
TCP window full
http://blog.csdn.net/abccheng/article/details/50503457
Terminology
MTU: maximum transmission unit, the largest unit the link layer can carry, fixed by the hardware; Ethernet's MTU, for example, is 1500 bytes, meaning the IP layer may hand down at most 1500 bytes including the IP header and data. Larger IP datagrams must be fragmented. Fragmentation typically concerns UDP rather than TCP: TCP is connection-oriented and cares about the order in which segments arrive and whether errors occur in transit, so TCP traffic often requires that packets not be fragmented (the DF bit).
MSS: maximum segment size, the largest amount of data carried in each TCP segment. Each end of a TCP connection announces to its peer the maximum amount of TCP data it can accept per segment. The MSS value is the MTU minus the IPv4 header (in bytes) and the TCP header (in bytes), and may be further reduced by TCP options.
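As a quick check on the arithmetic above (assuming the minimal 20-byte IPv4 and TCP headers, with no options):

```python
# MSS derived from MTU, assuming minimal 20-byte IPv4 and TCP headers.
MTU = 1500            # Ethernet MTU in bytes
IPV4_HEADER = 20      # IPv4 header without options
TCP_HEADER = 20       # TCP header without options

mss = MTU - IPV4_HEADER - TCP_HEADER
print(mss)  # 1460, the MSS typically advertised on Ethernet
```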
TTL (time to live): the lifetime of a datagram in the network, set by the datagram's source. It is designed to prevent undeliverable datagrams from circling the internet without limit and wasting network resources: each router decrements the TTL by 1, and when the TTL reaches 0 the datagram is discarded. TTL operates at the IP layer.
RTO (retransmission timeout): the time the sender waits for an acknowledgement before retransmitting. RTT (round-trip time) is made up of three parts: the propagation delay of the links, the processing time at the end systems, and the queuing and processing time in router buffers (queuing delay). For a given TCP connection the first two parts are relatively fixed, while the queuing delay varies with the overall congestion level of the network. Changes in RTT therefore reflect, to some extent, the degree of network congestion.
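The text defines RTO and RTT but not how one is computed from the other. A minimal sketch of the standard estimator from RFC 6298, which derives RTO from a smoothed RTT plus a variance term:

```python
# Sketch of the RFC 6298 estimator: RTO = SRTT + 4 * RTTVAR, where SRTT
# is an exponentially smoothed RTT and RTTVAR a smoothed RTT deviation.
class RtoEstimator:
    ALPHA, BETA = 1 / 8, 1 / 4            # smoothing gains from RFC 6298

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        if self.srtt is None:             # first RTT sample
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return max(1.0, self.srtt + 4 * self.rttvar)  # 1-second floor per RFC 6298

est = RtoEstimator()
print(est.update(0.1))  # 1.0 -- the floor dominates on a fast link
```

Because the queuing-delay component of RTT rises as the network congests, each new sample pulls SRTT and RTTVAR up, and the RTO grows with them.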
CWnd: congestion window, maintained by the sender
Rwnd: receive window, advertised by the receiver
Flow control: end-to-end; the receiving application's processing speed is independent of the sender's speed, so the receiver controls the sender through the Rwnd it returns.
Congestion control: the sender actively limits its own CWnd. It comprises slow start (CWnd starts at 1 and grows exponentially), congestion avoidance (after CWnd reaches ssthresh, growth becomes linear to avoid congestion), fast retransmit (the receiver acknowledges the highest in-order byte for every segment it receives; once the sender sees three duplicate acknowledgements it knows a segment was lost and retransmits it immediately, and in the original scheme TCP also cuts CWnd back to 1), and fast recovery (CWnd resumes linear growth directly from ssthresh).
Upper bound of the send window = min(Rwnd, CWnd)
When Rwnd < CWnd, the receiver's capacity limits the send window.
When CWnd < Rwnd, network congestion limits the send window.
Flow control (from a Zhihu answer)
Link: https://www.zhihu.com/question/32255109/answer/68558623
Source: Zhihu
Copyright belongs to the author; contact the author for permission before reproducing.
1) The TCP sliding window is divided into a receive window and a send window.
The sliding window protocol is one of the transport layer's flow-control measures: the receiver advertises its own window size to the sender, thereby controlling the sender's rate and preventing a fast sender from overwhelming it.
Understanding ACKs: an ACK is normally the confirmation sent after data is received, and it carries two very important pieces of information.
The first is n, the sequence number of the next byte expected, which tells the sender that the receiver has received every byte up to n-1. If the receiver then gets later bytes while byte n is still missing, it does not acknowledge them: for example, having received bytes 1-1024, the receiver sends an ACK with acknowledgement number 1025; if it then receives bytes 2049-3072, it does not send an ACK with acknowledgement number 3073 but keeps sending ACKs for 1025.
The second is the current window size m. From these the sender can work out how many more bytes it may send: if x is the next byte the sender would transmit, the number of bytes it may still send is y = m - (x - n). This is the basic principle of flow control with a sliding window.
Key point: from an ACK the sender takes the next expected byte n and the window m, combines them with its own send position x, and computes how many bytes it can still send.
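A minimal sketch of that calculation, assuming x denotes the sequence number of the next byte the sender would transmit (so bytes n through x-1 are in flight); the function name and example values are illustrative:

```python
def usable_window(ack_next, window, send_next):
    """Bytes the sender may still transmit after an ACK.

    ack_next  -- n: next byte the receiver expects
    window    -- m: receive window advertised in the ACK
    send_next -- x: next byte the sender would transmit
    """
    in_flight = send_next - ack_next      # sent but unacknowledged
    return window - in_flight             # y = m - (x - n)

# Receiver has everything up to byte 1024, advertises a 4096-byte window,
# and the sender has already sent bytes up to (but not including) 2049:
print(usable_window(1025, 4096, 2049))  # 3072
```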
The first byte of the send window is always the next byte the peer expects, as reported in the ACK. In the example, bytes 52-55 are the bytes that may be newly sent.
Everything before the first byte of the receive window has been fully received; the data inside the window is expected to be received; the data beyond the window is not yet accepted.
http://blog.chinaunix.net/uid-20778955-id-539945.html
http://www.netis.com.cn/flows/2012/08/tcp-%E6%BB%91%E5%8A%A8%E7%AA%97%E5%8F%A3%E7%9A%84%E7%AE%80%E4%BB%8B/
The TCP sliding window is divided into the receiving window and the sending window
It is not appropriate to discuss the two kinds of window without distinguishing them.
TCP's sliding window serves two main purposes: it provides TCP's reliability, and it provides TCP's flow control. The sliding window mechanism also embodies TCP's byte-stream design. The window has its own field in the TCP segment header.
The TCP window field is 16 bits wide and expresses the window capacity in bytes, so the standard TCP window is at most 2^16 - 1 = 65535 bytes.
In addition, the TCP options field can carry a window scale factor: option-kind is 3, option-length is 3 bytes, and the option data (the shift count) ranges from 0 to 14. The scale factor enlarges the TCP window, extending the effective window from the original 16 bits to as many as 30 bits.
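The scaling can be illustrated directly: shifting the 16-bit field by the maximum shift count of 14 gives an effective window just under 1 GiB, a 30-bit value.

```python
# The 16-bit window field scaled by the window-scale option (kind = 3).
def effective_window(window_field, shift_count):
    assert 0 <= window_field <= 0xFFFF    # 16-bit header field
    assert 0 <= shift_count <= 14         # shift count is capped at 14
    return window_field << shift_count

print(effective_window(65535, 14))  # 1073725440 bytes, just under 1 GiB
```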
Fundamentals of sliding windows
1) For the sender of a TCP session, the data in its send buffer at any moment falls into 4 categories: "sent and acknowledged", "sent but not yet acknowledged", "not sent but the peer is ready to receive", and "not sent and the peer is not ready to receive". The middle two parts, "sent but not yet acknowledged" and "not sent but the peer is ready to receive", together form the send window.
When an ACK arrives for further bytes in the send window, the window slides forward; the sliding principle is always the same. For example, the window slides when ACK = 36 is received.
2) For the receiver of a TCP session, the data in its receive buffer at any moment falls into 3 categories: "received", "not received but ready to receive", and "not received and not ready to receive". (Because ACKs are sent directly by the TCP stack, with no application delay by default, there is no "received but not yet acknowledged" category.) The "not received but ready to receive" portion is called the receive window.
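The sender-side bookkeeping in 1) can be sketched with three pointers that partition the byte stream into the four categories. The snd_una/snd_nxt names follow common TCP convention and are an assumption, not taken from the text above:

```python
class SendWindow:
    def __init__(self, window):
        self.snd_una = 0            # oldest sent-but-unacknowledged byte
        self.snd_nxt = 0            # next byte to send
        self.window = window        # size advertised by the receiver

    def category(self, seq):
        if seq < self.snd_una:
            return "sent and acknowledged"
        if seq < self.snd_nxt:
            return "sent, not yet acknowledged"       # in the send window
        if seq < self.snd_una + self.window:
            return "not sent, receiver ready"         # in the send window
        return "not sent, receiver not ready"

    def on_ack(self, ack, advertised):
        # An ACK for bytes inside the window slides the left edge forward;
        # the advertised value resets the window size.
        self.snd_una = max(self.snd_una, ack)
        self.window = advertised

w = SendWindow(window=20)
w.snd_nxt = 40                      # bytes 0-39 have been sent
print(w.category(5))                # sent, not yet acknowledged
w.on_ack(36, 20)                    # the window slides on ACK = 36
print(w.category(5))                # sent and acknowledged
```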
Relationship between the send window and the receive window
TCP is a full-duplex protocol: both sides of a session can send and receive at the same time, so each side maintains both a "send window" and a "receive window". Each side's "receive window" size depends on application, system, and hardware limits (the TCP transfer rate cannot exceed the application's data-processing rate). Each side's "send window" is required to equal the "receive window" advertised by its peer.
The sliding window and stream-oriented reliability
1) The most basic transmission reliability comes from the "acknowledge and retransmit" mechanism.
2) The reliability of the TCP sliding window is likewise based on "acknowledge and retransmit".
3) The send window moves its left edge only when it receives an ACK acknowledging bytes inside the window.
4) The receive window moves its left edge only when every earlier segment has been received. When an earlier byte is missing but later bytes have arrived, the window does not move and the later bytes are not acknowledged; this guarantees that the peer retransmits the missing data.
Flow control characteristics of sliding window
The TCP sliding window is dynamic. Picture a classic grade-school math problem: a pool of volume V, with water flowing in at V1 per hour and out at V2 per hour. When the pool is full, no more water may be added; if a control system can adjust the inflow according to the pool's state, both the injection rate and the amount can be regulated. Such a pool behaves much like a TCP window: the receiving application adjusts the receive window according to its own processing capacity, and this in turn throttles the peer's send window.
When necessary (for example, under memory pressure), the application asks the TCP stack through an API to shrink the receive window. The stack then includes the new window size in the next segment it sends; on receiving the notification, the peer shrinks its send window accordingly, reducing its sending rate.
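An application cannot set the TCP window directly, but one concrete knob it does have is the socket receive buffer, which bounds the window the stack can advertise. A small illustration using Python's standard socket API (the requested size of 16384 is an arbitrary example):

```python
import socket

# Ask the kernel for a smaller receive buffer; the window the TCP stack
# advertises to the peer cannot exceed what this buffer can hold.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 16384)
# The value actually granted may differ from the request (Linux, for
# instance, doubles it to leave room for bookkeeping).
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```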
Congestion control
1. Congestion: demand for resources exceeds the available resources. If many resources in the network are in short supply at the same time, network performance deteriorates markedly, and the throughput of the whole network falls as the load increases.
Congestion control: prevents excessive data from being injected into the network, so that routers and links are not overloaded. Its premise is that the network must be able to withstand the existing load. Congestion control is a global process involving all hosts, all routers, and every factor that degrades network transmission performance.
Flow control: point-to-point traffic control; an end-to-end problem. Flow control restrains the rate at which the sender transmits so that the receiver has time to receive.
Cost of congestion control: information about traffic distribution inside the network must be gathered, and before control is applied, nodes must exchange information and commands to choose and carry out a control strategy, all of which adds overhead. Congestion control may also have to reserve resources for individual users, preventing network resources from being shared as well as they otherwise could be.
2. Several congestion control methods
Slow start (Slow-start), congestion avoidance (congestion avoidance), fast retransmit, and fast recovery (fast recovery).
2.1 Slow start and congestion avoidance
The sender maintains a state variable called the congestion window, CWnd. Its size depends on the degree of congestion in the network and changes dynamically. The sender sets its send window equal to the congestion window.
The principle by which the sender governs the congestion window: as long as the network is not congested, the congestion window grows so that more packets can be sent; as soon as the network becomes congested, the congestion window shrinks to reduce the number of packets injected into the network.
Slow start algorithm: when a host begins sending data, injecting a large volume of bytes into the network at once could cause congestion, because the network's current load is unknown. A better approach is to probe first, growing the send window from small to large; that is, gradually increasing the congestion window. Typically, CWnd is initialized to one MSS when transmission begins, and each time an acknowledgement for a new segment arrives, the congestion window grows by at most one MSS. Increasing the sender's CWnd gradually in this way injects packets into the network at a more reasonable rate.
With each transmission round, the congestion window doubles. A transmission round takes one round-trip time, RTT; more precisely, a "round" lasts from the moment all the segments permitted by CWnd have been sent until the acknowledgement for the last byte sent is received.
The "slow" in slow start does not mean that CWnd grows slowly; it refers to TCP setting cwnd = 1 when it starts sending, so that the sender transmits only one segment at first (to probe the network's congestion) and then grows CWnd from there.
To prevent the congestion window CWnd from growing so large that it causes network congestion, a slow start threshold, the state variable ssthresh, must also be set (how ssthresh is set is described below). The slow start threshold ssthresh is used as follows:
When CWnd < Ssthresh, use the slow-start algorithm described above.
When CWnd > Ssthresh, stop using the slow start algorithm and use the congestion avoidance algorithm instead.
When CWnd = Ssthresh, either the slow start algorithm or the congestion avoidance algorithm may be used.
Congestion avoidance algorithm: let the congestion window grow slowly, adding 1 to CWnd per round-trip time RTT instead of doubling it. CWnd thus grows linearly, much more slowly than under the slow start algorithm.
Whenever the sender judges that the network is congested (because an acknowledgement fails to arrive), whether during slow start or congestion avoidance, ssthresh is set to half the send window's value at the moment congestion occurred (but not less than 2), CWnd is reset to 1, and the slow start algorithm runs again. The aim is to cut the number of packets the host sends into the network quickly, giving the congested routers enough time to drain the backlog of packets in their queues.
The congestion control process above is illustrated with concrete values, taking the send window to be the same size as the congestion window.
<1>. When the TCP connection is initialized, the congestion window CWnd is set to 1. (As noted earlier, for ease of understanding, the window is measured in segments rather than bytes.) The initial slow start threshold is set to 16 segments, i.e. ssthresh = 16.
<2>. While the slow start algorithm runs, CWnd starts at 1; each time the sender receives an acknowledgement for new segments it increases the congestion window by 1, and then the next round of transmission begins (the horizontal axis of the figure is the transmission round). The congestion window CWnd therefore grows exponentially with the transmission rounds. When CWnd reaches the slow start threshold ssthresh (i.e. when cwnd = 16), the congestion avoidance algorithm takes over and the window grows linearly.
<3>. Suppose that when the congestion window has grown to 24, a network timeout occurs (very likely indicating congestion). The updated ssthresh becomes 12 (half the congestion window value of 24 at the moment of the timeout), the congestion window is reset to 1, and slow start runs again. When cwnd = ssthresh = 12, the congestion avoidance algorithm takes over and the window grows linearly, by one MSS per round-trip time.
It must be stressed that "congestion avoidance" does not mean congestion can be avoided entirely; these measures cannot make network congestion impossible. "Congestion avoidance" means that during this phase the window grows linearly, making the network less prone to congestion.
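The worked example in <1>-<3> can be reproduced with a small simulation, assuming the classic timeout reaction (cwnd back to 1, ssthresh halved); the function is illustrative, not a real TCP implementation:

```python
# Illustrative simulation: cwnd in segments, ssthresh initially 16.
# Slow start doubles cwnd each round (capped at ssthresh); congestion
# avoidance adds 1 per round; a timeout at cwnd == timeout_at halves
# ssthresh and resets cwnd to 1.
def simulate(rounds, timeout_at):
    cwnd, ssthresh, trace = 1, 16, []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd == timeout_at:                # timeout: multiplicative decrease
            ssthresh = max(cwnd // 2, 2)      # becomes 12 in the example
            cwnd = 1                          # slow start again
        elif cwnd < ssthresh:                 # slow start: exponential growth
            cwnd = min(cwnd * 2, ssthresh)
        else:                                 # congestion avoidance: linear growth
            cwnd += 1
    return trace

print(simulate(16, timeout_at=24))
# [1, 2, 4, 8, 16, 17, 18, 19, 20, 21, 22, 23, 24, 1, 2, 4]
```

The trace shows all three phases of the example: exponential growth up to ssthresh = 16, linear growth to 24, then the reset after the timeout.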
2.2 Fast retransmission and fast recovery
If the sender's timeout timer expires without an acknowledgement arriving, the network is probably congested, and the segment was likely dropped somewhere along the way. TCP then immediately reduces the congestion window CWnd to 1 and runs the slow start algorithm, while halving the slow start threshold ssthresh. This is the behavior when fast retransmission is not used.
The fast retransmit algorithm first requires the receiver to send a duplicate acknowledgement immediately whenever it receives an out-of-order segment (so that the sender learns as early as possible that a segment failed to arrive), rather than waiting to piggyback the acknowledgement on its own outgoing data.
Having received M1 and M2, the receiver acknowledged each. Now suppose the receiver does not receive M3 but then receives M4. Clearly, the receiver cannot acknowledge M4, because M4 is out of order. Under the basic rules of reliable transmission, the receiver could do nothing, or send an acknowledgement of M2 at some convenient time. But the fast retransmit algorithm requires the receiver to send a duplicate acknowledgement of M2 promptly, so that the sender learns early that segment M3 did not arrive. The sender then sends M5 and M6, and on receiving each of them the receiver again issues a duplicate acknowledgement of M2. The sender thus receives four acknowledgements of M2 in total, the last three being duplicates. The fast retransmit algorithm further stipulates that as soon as the sender receives three duplicate acknowledgements in a row, it must retransmit the unacknowledged segment M3 immediately, without waiting for M3's retransmission timer to expire. Because the sender retransmits unacknowledged segments as early as possible, fast retransmit can raise overall network throughput by about 20%.
Fast recovery is used together with fast retransmit; it has two main points:
<1>. When the sender receives three duplicate acknowledgements in a row, it executes the "multiplicative decrease" step, halving the slow start threshold ssthresh. This guards against network congestion. Note: the slow start algorithm is not executed next.
<2>. Since the sender now believes the network is probably not congested, it differs from slow start in that it does not run the slow start algorithm (that is, it does not set CWnd to 1). Instead it sets CWnd to the halved slow start threshold ssthresh and then runs the congestion avoidance algorithm ("additive increase"), so that the congestion window grows slowly and linearly.
This combination of fast retransmit and fast recovery is known as the TCP Reno version.
The difference: after fast retransmit, the newer TCP Reno runs the fast recovery algorithm rather than the slow start algorithm.
Some fast retransmit implementations also start by inflating the congestion window slightly, setting CWnd to ssthresh + 3 × MSS. The reasoning: the three duplicate acknowledgements the sender received indicate that three segments have left the network. Those three segments no longer consume network resources but sit in the receiver's buffer, so the network holds three fewer segments rather than a backlog, and the congestion window can reasonably be enlarged by that amount.
With fast recovery in use, the slow start algorithm runs only when a TCP connection is established and when a network timeout occurs.
These congestion control methods improve TCP's performance markedly.
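The two loss reactions described above can be summarized, in units of MSS, as a pair of illustrative functions (the inflate flag models the ssthresh + 3 × MSS variant mentioned earlier):

```python
def on_timeout(cwnd):
    """Timeout: ssthresh halves, cwnd restarts slow start from 1."""
    ssthresh = max(cwnd // 2, 2)
    return 1, ssthresh

def on_triple_dup_ack(cwnd, inflate=False):
    """Three duplicate ACKs: fast recovery, no return to slow start."""
    ssthresh = max(cwnd // 2, 2)
    new_cwnd = ssthresh + 3 if inflate else ssthresh  # +3: segments that left the network
    return new_cwnd, ssthresh

print(on_timeout(24))                       # (1, 12)
print(on_triple_dup_ack(24))                # (12, 12)
print(on_triple_dup_ack(24, inflate=True))  # (15, 12)
```

The contrast between the two return values is the whole point of Reno: only the timeout path falls all the way back to cwnd = 1.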
The receiver sets the receive window Rwnd according to its own receiving capacity, writes the value into the Window field of the TCP header, and transmits it to the sender; the receive window is therefore also called the advertised window. From the standpoint of the receiver's flow control over the sender, the sender's send window must not exceed the Rwnd given by the peer.
Upper bound of the send window = min(Rwnd, CWnd)
When Rwnd < CWnd, the receiver's capacity limits the send window.
When CWnd < Rwnd, network congestion limits the send window.
Relationship among the TCP header's window field, the sliding window, flow control, and congestion control