Those things of the TCP protocol (concluding article)

Source: Internet
Author: User

Transport Layer Overview

TCP protocol features: connection-oriented, byte stream, reliable transmission

Connection-oriented:

1. Both parties that use TCP must first establish a connection, and each must allocate kernel resources for it. TCP connections are full-duplex, which means both parties can read and write over a single connection.

Byte stream:

1. When the sending application performs several writes, the TCP module first places the data in the send buffer. When TCP actually transmits, the data in the send buffer may be packed into one segment or into several. So the number of write operations performed by the application has no fixed relationship to the number of segments TCP sends.

2. When the receiving end receives one or more segments, the TCP module places the application data they carry into the TCP receive buffer in sequence-number order and notifies the application that data is available. The application may read it all at once or in several reads, depending on the size of its read buffer. So there is no fixed relationship between the number of read operations and the number of segments received.

Summary: The number of writes on the sending side has no relation to the number of reads on the receiving side; the data exchanged by the applications has no message boundaries. UDP is different: it preserves message boundaries, so each send produces one datagram that is read as a unit.
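As a rough illustration of the byte-stream behavior described above, the following sketch sends three separate writes over a loopback TCP connection and reads them back with a deliberately small buffer. The helper name send_in_pieces and the 4-byte read size are assumptions made for this example only.

```python
import socket

def send_in_pieces(pieces, read_size=4):
    """Send each piece with its own call, then read back in small chunks."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server, _ = listener.accept()
    try:
        for piece in pieces:
            client.sendall(piece)         # one write per piece
        client.shutdown(socket.SHUT_WR)   # signal end of stream to the reader
        chunks = []
        while True:
            chunk = server.recv(read_size)
            if not chunk:                 # empty bytes means the peer finished
                break
            chunks.append(chunk)
        return chunks
    finally:
        client.close()
        server.close()
        listener.close()

chunks = send_in_pieces([b"hello ", b"tcp ", b"stream"])
print(len(chunks), "reads for 3 writes:", chunks)
```

The bytes always arrive complete and in order, but how they are split across reads depends on the read buffer and on timing, not on how the sender divided its writes.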

TCP byte stream service process:

The number of send calls need not equal the number of receive calls. All data passes through a buffer first; it is then packed into segments for sending, or unpacked from segments on receipt.

Reliable transmission

The TCP protocol is a reliable transport protocol, and the mechanisms that support this reliable transmission are:

    • Acknowledgment mechanism: every data segment sent by the sender must be acknowledged by the receiver.
    • Timeout retransmission mechanism: after sending a segment, the sender starts a timer. If no acknowledgment arrives from the receiver within the specified time, the segment is sent again.

To understand the role of TCP in the TCP/IP protocol stack, we need to analyze the TCP header structure:

    • The header appears in every TCP segment
    • The header consists of a 20-byte fixed part and an options field of up to 40 bytes
    • 16-bit source port: identifies where the segment comes from.
    • 16-bit destination port: identifies which upper-layer protocol or application the segment is delivered to. Clients generally use an ephemeral port number, while servers use a well-known service port number. On Linux, the well-known service port numbers are listed in the /etc/services file.
      For example, HTTP uses port 80, DNS uses port 53, and FTP uses port 21.
    • 32-bit sequence number: within one TCP connection (from establishment to teardown), TCP splits the data into segments according to the actual transmission capacity. The sequence number identifies the position in the byte stream of the first data byte carried by the segment.
    • 32-bit acknowledgment number: the sequence number of the next byte the receiver expects, that is, the sequence number of the last byte received in order plus 1. It tells the sender which data has arrived; data that is never acknowledged is retransmitted, which is how the integrity of the data is guaranteed.
    • 4-bit header length: the length of the header in 32-bit (4-byte) words. The largest value is 15, so a header is at most 60 bytes.
    • URG: indicates whether the urgent pointer is valid.
    • ACK: indicates whether the acknowledgment number is valid.
    • PSH: tells the receiving application to read the data from the receive buffer immediately, making room for subsequent data.
    • RST: asks the peer to reset the connection.
    • SYN: requests the establishment of a connection.
    • FIN: tells the peer that this side is closing the connection.
    • 16-bit window size: used for TCP flow control. It tells the peer how many more bytes the local receive buffer can accept.
    • 16-bit checksum: filled in by the sender; the receiver recomputes it to verify that the segment was not damaged in transit. It is a 16-bit ones' complement checksum rather than a CRC, and it covers both the header and the data.
    • 16-bit urgent pointer: a positive offset. Added to the sequence number, it gives the sequence number of the byte following the last byte of urgent data.
    • Header options: at most 40 bytes
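To make the field layout above concrete, here is a minimal sketch that unpacks the 20-byte fixed header with Python's struct module. The function name parse_tcp_header and the sample SYN segment are made up for illustration; the sketch does not validate the checksum.

```python
import struct

# Flag names ordered from the least significant bit upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (offset_and_flags >> 12) * 4   # 4-bit field counts 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if offset_and_flags & (1 << bit)]
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len],      # empty when header_len == 20
    }

# A hand-built SYN segment: header length 5 words, only the SYN bit set.
sample = struct.pack("!HHIIHHHH", 54321, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header)
```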

Each option consists of: 1. kind: the option type. 2. length: the total length of the option. 3. info: the specific contents of the option.
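The kind/length/info layout can be walked with a short loop. The sketch below is an illustration under stated assumptions: the function name parse_tcp_options is hypothetical, and it relies on the standard convention that kind 0 (end of option list) and kind 1 (no-operation) are single bytes with no length field.

```python
def parse_tcp_options(raw: bytes):
    """Return a list of (kind, length, info) tuples from a raw options field."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:            # end of option list
            break
        if kind == 1:            # no-operation, used as padding
            i += 1
            continue
        if i + 1 >= len(raw):    # truncated: no length byte
            break
        length = raw[i + 1]      # total length, including kind and length bytes
        if length < 2 or i + length > len(raw):
            break                # malformed option, stop parsing
        options.append((kind, length, raw[i + 2:i + length]))
        i += length
    return options

# MSS option (kind 2, value 1460), one NOP, window scale option (kind 3, shift 7)
raw_options = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7])
parsed = parse_tcp_options(raw_options)
print(parsed)
```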

Kind type:

TCP connection state transitions

Typical server state transitions

1. The server first enters the LISTEN state through the listen system call. This is a passive open: the server waits passively for clients to connect. When the server receives a connection request from a client (this request is called a synchronization segment, or SYN segment), TCP places the connection in a kernel queue and sends the client a SYN segment that also carries an acknowledgment; the connection is then in the SYN_RCVD state. If the server then receives the client's ACK segment, the connection enters the ESTABLISHED state, in which both parties can transmit data.

Server state transitions when the connection is closed

1. When the client actively closes the connection by calling close, the server receives the client's FIN segment and returns an ACK segment; the connection enters the CLOSE_WAIT state. As the name suggests, this state waits for the server application to close the connection. When the server closes, it sends its own FIN segment to the client and the connection enters the LAST_ACK state, where it waits for the client's final acknowledgment.

Client state transitions when connecting

1. The client first sends a SYN segment to the server through the connect system call, which puts the connection into the SYN_SENT state. There are then two possibilities. If the connection attempt fails, the connection returns to the CLOSED state. If the client receives the server's SYN segment together with its acknowledgment, the client has connected successfully and the connection moves to the ESTABLISHED state.

Client state transitions when closing

1. When the client closes, it sends a FIN segment and the connection enters the FIN_WAIT_1 state. When the server's ACK is received, the connection enters the FIN_WAIT_2 state. When the server then sends its own FIN segment, the client replies with an ACK and enters the TIME_WAIT state.
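The four walkthroughs above can be summarized as a small lookup table. This is a sketch covering only the transitions described in this article, not the full TCP state machine; the event names (recv_syn, timeout_2msl and so on) are labels invented for the example.

```python
# (current state, event) -> next state
TRANSITIONS = {
    # server side, connection setup
    ("CLOSED", "listen"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    # server side, passive close
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    # client side, connection setup
    ("CLOSED", "connect"): "SYN_SENT",
    ("SYN_SENT", "connect_failed"): "CLOSED",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    # client side, active close
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}

def run(events, state="CLOSED"):
    """Apply events in order and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]   # KeyError means an unmodelled event
        path.append(state)
    return path

server_path = run(["listen", "recv_syn", "recv_ack", "recv_fin", "close", "recv_ack"])
client_path = run(["connect", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout_2msl"])
print(" -> ".join(server_path))
print(" -> ".join(client_path))
```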

TCP connection state transition diagram:

Refer to the diagram for the complete set of connection state transitions.

The process by which TCP establishes and closes connections

First, let's review what we covered earlier: the TCP header has 6 flag bits. Three of them matter here:

    • SYN: set only during the three-way handshake; marks the segment as a synchronization segment.
    • ACK: indicates that the segment carries a valid acknowledgment number.
    • FIN: used to end a TCP connection; marks the segment as a finish segment.

That is what we need to know for the discussion below:

Three-way handshake to establish a connection

Let's start with the three-way handshake process:

    • First handshake: the client sends the server a segment with the SYN flag set. In the picture this is host A -> host B (SYN=1, seq=i), where i is the sequence number. Host A then enters the SYN_SENT state and waits for the server's confirmation.
    • Second handshake: after receiving the SYN segment, the server must acknowledge it and also send its own SYN. It does both in one segment, the second segment in the picture: SYN=1 marks it as a synchronization segment, ACK=1 marks it as an acknowledgment, the acknowledgment number is i+1, and the sequence number is j.
    • Third handshake: on receiving the SYN+ACK segment, the client sends an acknowledgment to the server, the third segment in the picture: ACK=1 marks it as an acknowledgment, and the acknowledgment number is j+1.

After the three-way handshake completes, the client and server begin transmitting data.
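The sequence and acknowledgment numbers in the three steps above follow a simple rule: each acknowledgment number is the peer's sequence number plus 1. A minimal sketch of that arithmetic follows; the function name and the dictionary layout are assumptions for this example, and no real packets are sent.

```python
SEQ_MODULO = 2 ** 32   # sequence numbers are 32 bits wide and wrap around

def three_way_handshake(i, j):
    """Build the three segments for initial sequence numbers i (client), j (server)."""
    syn = {"from": "client", "flags": {"SYN"}, "seq": i}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": j, "ack": (syn["seq"] + 1) % SEQ_MODULO}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (i + 1) % SEQ_MODULO, "ack": (syn_ack["seq"] + 1) % SEQ_MODULO}
    return [syn, syn_ack, ack]

segments = three_way_handshake(i=1000, j=5000)
for segment in segments:
    print(segment)
```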

Three-way handshake:

Four-way handshake to close a connection

Next, let's look at the four-way close:

    • First handshake: client A sends the server a finish segment with the FIN flag set, to close the data transfer from client A to server B. This is the first segment in the picture: FIN=1, seq=i.
    • Second handshake: after receiving this finish segment, server B returns an acknowledgment, the second segment in the picture: ACK=1, ack=i+1, meaning an acknowledgment segment whose acknowledgment number is i+1.
    • Third handshake: server B closes its side of the connection to client A by sending a finish segment with the FIN flag set, the third segment in the picture: FIN=1, seq=j.
    • Fourth handshake: client A returns an acknowledgment whose acknowledgment number is j+1, the fourth segment in the picture: ACK=1, ack=j+1.
Summary: Why does it take three handshakes to establish a connection? Why does it take four to close one?
    • Establishing a connection would take four segments if every step were sent separately. But when a server in the LISTEN state receives the client's SYN, it can combine its own SYN and its acknowledgment of the client's SYN into a single segment. The acknowledgment part confirms and the SYN part synchronizes, so the second segment of the three-way handshake plays two roles.

    • When closing a connection, receiving the peer's FIN only means that the peer has no more data to send to you. Your own data may not all have been sent yet, so you acknowledge the FIN first, possibly continue sending data, and only then send your own FIN to tell the peer that you agree to disconnect. The acknowledgment and the FIN are therefore separate segments, and closing takes four.

    • A deadlock can occur if the three-way handshake is reduced to a two-way handshake. For example, if the server's reply is lost, the server considers the connection established and waits for data, while the client, which never received a confirmation, keeps waiting for the handshake to complete.
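The reason the close takes four segments, namely that one side can finish sending while the other still has data to send, can be observed with a half-close. The sketch below assumes a loopback TCP connection; the function name half_close_demo and the payloads are invented for the example.

```python
import socket

def read_until_eof(sock):
    """Read until the peer's FIN arrives (recv returns empty bytes)."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return data
        data += chunk

def half_close_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server, _ = listener.accept()
    try:
        client.sendall(b"request")
        client.shutdown(socket.SHUT_WR)    # client FIN: nothing more to send
        request = read_until_eof(server)   # server sees the data, then end of stream
        server.sendall(b"late reply")      # server direction is still open
        server.close()                     # server FIN
        reply = read_until_eof(client)     # client can still receive
        return request, reply
    finally:
        client.close()
        listener.close()

request, reply = half_close_demo()
print(request, reply)
```

Between the client's shutdown and the server's close, the connection is open in one direction only, which is exactly the window that the separate ACK and FIN segments allow for.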

Four-way handshake:

Meaning of the TIME_WAIT state

As mentioned in TCP details (ii), when the client receives the server's FIN segment it does not enter the CLOSED state immediately. It moves to the TIME_WAIT state, where it waits for twice the MSL (Maximum Segment Lifetime) before closing completely. Waiting 2MSL is enough for a segment travelling in one direction to survive at most MSL seconds before being discarded, and for its reply in the other direction to survive at most MSL seconds before being discarded. With this rule, whenever a new TCP connection is successfully established, all stale duplicate segments from a previous incarnation of that connection have already disappeared from the network.

There are two reasons why the TIME_WAIT state exists:
    • Reliable termination of the TCP connection:

      For example, in the four-way close, if the client's final ACK is lost, the server will resend its FIN segment, so the client needs to remain in a state where it can handle the duplicate. Otherwise the client would answer with a reset (RST) segment, which the server would treat as an error.

    • Ensuring that delayed TCP segments have enough time to be recognized and discarded:

      On a Linux system, a TCP port cannot be opened more than once at the same time. While a TCP connection is in the TIME_WAIT state, a new connection cannot immediately be established using the port occupied by that connection.
      Conversely, consider what would happen without this mechanism: as soon as the connection was closed, a new connection with the same addresses and ports could be established, becoming an "incarnation" of the original connection. That incarnation might receive delayed data segments belonging to the original connection, which must not be allowed to happen.

Disadvantages of too many TIME_WAIT connections:
    • As described above, if too many connections are in the TIME_WAIT state, they tie up a large number of port numbers.
How to deal with too many TIME_WAIT connections:
    • Modify kernel parameters

    • Close connections passively where possible (the side that closes actively is the one that enters TIME_WAIT)

    • Replace short-lived connections with long-lived (persistent) ones

    • Use socket options, such as SO_REUSEADDR, to avoid the negative effects of the TIME_WAIT state where possible
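As one example of the socket-option approach, a server can set SO_REUSEADDR before calling bind, so that a restart does not fail with "address already in use" while old connections on the port are still in TIME_WAIT. This is only a sketch: the article does not say which option it has in mind, so SO_REUSEADDR is an assumption here, and the function name make_listener is made up for the example.

```python
import socket

def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Create a listening TCP socket that tolerates TIME_WAIT leftovers on its port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() to have any effect on the bind itself.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))    # port 0 lets the OS choose; use a fixed port in practice
    sock.listen(backlog)
    return sock

listener = make_listener()
reuse_enabled = listener.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0
bound_port = listener.getsockname()[1]
print("listening on port", bound_port, "SO_REUSEADDR enabled:", reuse_enabled)
listener.close()
```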

Function of the RST reset segment

As covered in TCP protocol explained in detail (a), the TCP header contains 6 flag bits, one of which is the RST flag. A TCP segment with RST = 1 is an RST segment, that is, a reset segment. In some situations one end of a TCP connection sends an RST segment to the other end to tell it to close the connection or to reconnect.

An RST segment resets the connection. It can be sent at the following times:

    • When the connection is established
    • While data is being transferred
    • When the connection is closed

In all three cases, it is possible to send an RST reset message segment.

RST segments appear for many reasons, and in network programming it is often hard to work out why one was sent. Here are some common causes:
    • The port is not open: nothing is listening on the server port that the client tries to connect to.
    • Request timeout: for example, when a client connects to a server, the connect system call fails with a timeout error, yet ping shows no packet loss. If a packet capture shows that the client receives the server's SYN segment and then replies with an RST, the likely cause is that the request had already timed out on the client when the server's reply arrived.
    • Early shutdown: the server shuts down or terminates the connection abnormally and, because of a network failure, the client never receives the FIN segment, so the client still holds the original connection. The server is then restarted, but the restarted server has no information about that connection. If the client now writes data to the server, the server responds with an RST segment.
    • Receiving data on a closed socket: this is less common. For example, segments still in flight reach the destination after the connection has been closed; the destination finds the connection closed and sends a reset segment to the peer.
    • Abnormal termination of a connection: after the data exchange is complete, one side sends an RST segment to abort the connection.
    • Sending data to a port in the LISTEN state: the sender receives an RST segment from the peer.
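The first cause in the list, connecting to a port with no listener, is easy to reproduce. In the sketch below a port is reserved with bind but listen is never called, so the local kernel is expected to answer the SYN with an RST, which Python reports as ConnectionRefusedError. The function name connect_refused is made up, and the behavior assumes no firewall silently drops the SYN.

```python
import socket

def connect_refused():
    """Return True if connecting to a bound, non-listening port is refused."""
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))      # reserve a port but never call listen()
    port = holder.getsockname()[1]
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)               # do not hang if the SYN is dropped instead
    try:
        client.connect(("127.0.0.1", port))
        return False                   # unexpected: something accepted the connection
    except ConnectionRefusedError:
        return True                    # the peer answered the SYN with an RST
    finally:
        client.close()
        holder.close()

refused = connect_refused()
print("connection refused:", refused)
```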
Understanding the above is helpful when debugging network programs.

TCP reliable transmission mechanism

TCP timeout retransmission

When the network misbehaves, with timeouts or packet loss, TCP must retransmit any segment whose acknowledgment is not received within the timeout period.

    • The TCP module maintains a retransmission timer for each TCP segment. The timer starts when the segment is first sent. If no acknowledgment arrives from the receiver within the timeout period, the TCP module retransmits the segment and resets the timer.
    • On each timeout: retransmit and reset the timer.
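The retransmit-and-reset rule can be illustrated with a toy model. This is a deliberately simplified sketch: it sends one segment at a time (stop-and-wait), replaces the real timer with a set of attempts whose acknowledgments are declared lost, and uses names (send_with_retransmit, lost_attempts, max_retries) that are invented for the example.

```python
def send_with_retransmit(segments, lost_attempts, max_retries=5):
    """Simulate sending segments one by one, retransmitting on timeout.

    lost_attempts is a set of (segment_index, attempt_number) pairs for which
    no acknowledgment arrives before the timer expires.
    Returns the log of every transmission as (segment_index, attempt_number).
    """
    log = []
    for index, _segment in enumerate(segments):
        for attempt in range(1, max_retries + 2):
            log.append((index, attempt))           # (re)send and restart the timer
            if (index, attempt) not in lost_attempts:
                break                              # ACK arrived in time
        else:
            raise TimeoutError(f"segment {index} was never acknowledged")
    return log

log = send_with_retransmit(["a", "b", "c"], lost_attempts={(1, 1), (1, 2)})
print(log)
```

Here the second segment times out twice and is acknowledged on its third transmission, while the other two segments get through on the first attempt.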
TCP Congestion Control

Tasks of the TCP module

    • Improve network utilization
    • Reduce the packet loss rate
    • Congestion control

      Congestion control, like TCP flow control, is a control mechanism that supports TCP's reliable transmission. Its task is to ensure that the subnet can carry the traffic offered to it. This is a global problem involving the behavior of every party on the network (interested readers can look up the details; we won't go into them here).

      We now look at congestion control in detail:

      Feedback process for congestion control:

The quantity that congestion control ultimately controls is the amount of data the sender may write to the network in one burst, which we call the send window (SWND). The data in the send window is ultimately sent as TCP segments, so the send window limits the number of TCP segments the sender can send back to back. The maximum length of the data in each segment is called the MSS (Maximum Segment Size). The sender needs to choose a reasonable send window: if it is too small, noticeable network delay results; if it is too large, it can easily cause network congestion. The receiver can control the sender's window through the receive window it advertises (RWND), but that alone is not enough, so the sender also keeps a state variable called the congestion window (CWND). The actual send window is the smaller of RWND and CWND. The closed-loop feedback of congestion control is shown in the image.
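As a small worked example of how the windows interact, TCP conventionally takes the send window as the smaller of the congestion window and the advertised receive window. The sketch below applies that rule; the function name and the MSS of 1460 bytes are assumptions chosen for illustration.

```python
def send_window(cwnd_bytes, rwnd_bytes, mss=1460):
    """Return (SWND in bytes, number of full-size segments that fit in it)."""
    swnd = min(cwnd_bytes, rwnd_bytes)   # limited by whichever window is smaller
    return swnd, swnd // mss

swnd, full_segments = send_window(cwnd_bytes=14600, rwnd_bytes=8760)
print(swnd, full_segments)
```

In this example the receiver's advertised window is the tighter limit, so the sender may have 8760 bytes, or six full segments, in flight.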

Several congestion control methods:

    • Slow start
    • Congestion avoidance
    • Fast retransmit
    • Fast recovery

Slow start and congestion avoidance diagram:

We know that the sender maintains a state variable called the congestion window (CWND). Its size depends on the level of congestion in the network and changes dynamically. The principle is: as long as the network is not congested, the congestion window grows so that more segments can be sent; as soon as the network becomes congested, the congestion window shrinks to reduce the number of segments injected into the network.

Let's now analyze the congestion control algorithms:

    • Slow start: when a host starts sending data, injecting a large number of bytes into the network at once may cause congestion, because the sender does not yet know the load on the network. A better approach is to probe: increase the congestion window gradually from small to large. Typically, when transmission begins the congestion window is set to one maximum segment size (MSS), and each time an acknowledgment of new data is received the congestion window grows by 1 MSS. As a result the window doubles every round trip, that is, it grows exponentially. Increasing the sender's congestion window step by step in this way makes the rate at which data is injected into the network more reasonable.

      In the diagram, the span from 0 to 3 on the horizontal axis is the slow start phase. To keep the growing congestion window from causing network congestion, a slow start threshold, ssthresh, is also needed. It is used as follows: while the congestion window is smaller than ssthresh, the slow start algorithm is applied; once the congestion window is larger than ssthresh, slow start stops and the congestion avoidance algorithm is used instead.

    • Congestion avoidance: the congestion avoidance algorithm makes the congestion window grow slowly. After each round trip the sender's congestion window increases by 1 MSS rather than doubling, so it grows slowly and linearly. In the image, the part after the obvious inflection point is where the congestion avoidance algorithm runs.
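The two growth phases can be traced with a few lines of code. The sketch below counts the congestion window in units of MSS and advances one round trip per step; capping the doubling at ssthresh is a textbook simplification, and the function name and default values are assumptions for this example.

```python
def cwnd_per_round(rounds, ssthresh=16, initial_cwnd=1):
    """Return the congestion window (in MSS) at the start of each round trip."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double every round trip
        else:
            cwnd += 1                        # congestion avoidance: +1 MSS per round trip
    return history

history = cwnd_per_round(8, ssthresh=16)
print(history)
```

The printed history shows the inflection point: exponential growth up to ssthresh, then linear growth afterwards.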

Slow start and congestion avoidance alone are not enough to control network congestion.

Next, we introduce fast retransmit and fast recovery.

In many situations the TCP sender can receive duplicate acknowledgment segments, for example when a TCP segment is lost. If the sender receives three duplicate acknowledgments in a row, it can conclude that the network is congested, and it then uses the fast retransmit and fast recovery algorithms. Fast retransmit requires the receiver to send a duplicate acknowledgment immediately whenever it receives an out-of-order segment, so that the sender learns of the loss early and can retransmit without waiting for the retransmission timer to expire.

After congestion is detected, processing falls into three steps:

    • When three duplicate acknowledgments are received: set the slow start threshold ssthresh to half of the current congestion window, then immediately retransmit the missing segment.
    • When a further duplicate acknowledgment is received: since ssthresh has already been halved, the sender continues with the congestion avoidance algorithm instead of returning to the slow start algorithm described above.
    • When an acknowledgment of new data is received: set the congestion window equal to the current ssthresh value.
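The steps after congestion can be sketched as a small event handler. This is a simplified Reno-style model under stated assumptions, not the article's exact procedure: the window is counted in MSS, ssthresh is set to half the current congestion window (with a floor of 2), and the window is temporarily inflated by one MSS per duplicate acknowledgment during fast recovery, as in RFC 5681. The class and method names are invented for the example.

```python
class RenoSender:
    """Toy model of fast retransmit and fast recovery (window in MSS units)."""

    def __init__(self, cwnd=10, ssthresh=64):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_fast_recovery = False
        self.retransmitted = []          # sequence numbers resent by fast retransmit

    def on_dup_ack(self, missing_seq):
        self.dup_acks += 1
        if self.dup_acks == 3 and not self.in_fast_recovery:
            self.ssthresh = max(self.cwnd // 2, 2)   # halve the threshold
            self.retransmitted.append(missing_seq)   # resend without waiting for the timer
            self.cwnd = self.ssthresh + 3            # three segments have left the network
            self.in_fast_recovery = True
        elif self.in_fast_recovery:
            self.cwnd += 1                           # each further duplicate ACK inflates cwnd

    def on_new_ack(self):
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh                # deflate: continue in congestion avoidance
            self.in_fast_recovery = False
        self.dup_acks = 0

sender = RenoSender(cwnd=10, ssthresh=64)
for _ in range(4):
    sender.on_dup_ack(missing_seq=42)
window_during_recovery = sender.cwnd
sender.on_new_ack()
print(sender.ssthresh, window_during_recovery, sender.cwnd, sender.retransmitted)
```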

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
