TCP protocol
Overview
TCP and UDP sit at the same level, the transport layer, but they differ considerably. TCP has the following characteristics:
- TCP provides a reliable data transfer service and is connection-oriented: a connection must be established before data is exchanged and released when communication ends. These are the three-way handshake and the four-way wave described below;
- TCP is point-to-point: each TCP connection has exactly two endpoints;
- TCP delivers data without errors, without loss, without duplication, and in order;
- TCP provides full-duplex communication, so both parties can send data at any time. Each end of the connection has a send buffer and a receive buffer;
- TCP is byte-stream-oriented: the data it carries is a plain sequence of bytes with no boundary information between one data block and the next. To TCP all data looks the same, and it cannot interpret the meaning of the data.
TCP segment structure:
The header of a TCP segment starts with a fixed 20-byte part, followed by 4n bytes (n is an integer) of option fields that are added as needed.
The fields of the 20-byte fixed part are described below:
- Source and destination ports : 2 bytes each, holding the source port number and the destination port number. This is similar to the UDP header because both are transport layer protocols.
- Sequence number : 4 bytes, with range [0, 2^32-1]. After the sequence number reaches 2^32-1, the next one wraps around to 0. TCP is byte-stream-oriented, so every byte in a stream is numbered in order, and the sequence number in the header is the number of the first data byte carried in this segment.
- Acknowledgment number : 4 bytes. The sequence number of the first data byte that the receiver expects in the next segment from the other side.
- Data offset : 4 bits. The length of the TCP header, counted in 4-byte units, covering the fixed 20 bytes plus the option fields.
- Reserved : 6 bits, reserved for future use and currently set to 0.
- Control bits : 6 control bits in total, indicating the nature of this segment. Their meanings are as follows:
URG (urgent) : When URG=1, the segment carries urgent data that should be handled with priority (such as an emergency shutdown). It is used together with the urgent pointer field.
ACK (acknowledgment) : The acknowledgment number field is valid only when ACK=1. Once a TCP connection is established, every segment must have ACK set to 1.
PSH (push) : If one end of the connection wants the other end to respond immediately, the PSH bit "urges" the other side to pass the data on right away instead of waiting for the buffer to fill.
RST (reset) : If a serious error occurs on the TCP connection, RST is set to 1. The connection is torn down and must then be re-established.
SYN (synchronize) : Used to synchronize sequence numbers when a connection is established, as described in more detail later.
FIN (finish) : Used to release the connection. FIN=1 indicates that the sender has finished sending data and requests release of the TCP connection.
- Window : 2 bytes. The value is the sender's own receive window size, that is, how much data it can currently accept, because the space in its receive buffer is limited.
- Checksum : 2 bytes. As with UDP, the checksum is used to detect errors introduced into the segment during transmission.
- Urgent pointer : 2 bytes. Valid only when URG=1; it indicates the number of bytes of urgent data in this segment.
- Options : variable length, up to 40 bytes. Specific option fields are introduced when they are needed.
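The field layout above can be sketched as a small parser. This is only an illustration under the layout described in this section (6 reserved bits followed by 6 control bits); the names `TcpHeader`, `parse_tcp_header`, and `next_seq` are hypothetical helpers, not part of any library.

```python
import struct
from typing import FrozenSet, NamedTuple

# Bit masks for the 6 control bits, lowest bit first.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int        # in bytes: data offset * 4
    flags: FrozenSet[str]
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(raw: bytes) -> TcpHeader:
    """Parse the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("a TCP header needs at least 20 bytes")
    # Network byte order: two 2-byte ports, two 4-byte numbers,
    # one 2-byte word holding offset/reserved/flags, then three 2-byte fields.
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", raw[:20])
    data_offset = off_flags >> 12          # top 4 bits, in 4-byte units
    flags = frozenset(n for n, bit in FLAG_BITS.items() if off_flags & bit)
    return TcpHeader(src, dst, seq, ack, data_offset * 4,
                     flags, window, checksum, urg)


def next_seq(seq: int, nbytes: int) -> int:
    """Sequence numbers wrap around after 2^32 - 1."""
    return (seq + nbytes) % 2**32
```

For example, a header packed with data offset 5 and only the SYN bit set parses to `header_len == 20` and `flags == frozenset({"SYN"})`.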
TCP connection and release:
3.1. Three-way handshake
The TCP three-way handshake proceeds as follows:
- The client sends a connection request segment with the control bit SYN=1 and an initial sequence number seq=x. The client enters the SYN-SENT state.
- After receiving the request segment, the server sends an acknowledgment segment to the client. Its header has SYN=1 and ACK=1, the acknowledgment number is ack=x+1, and the server chooses its own initial sequence number seq=y. The server enters the SYN-RCVD state.
- After receiving the acknowledgment segment from the server, the client sends its own acknowledgment to the server. This segment has ACK=1, acknowledgment number ack=y+1, and sequence number seq=x+1. It may already carry data. If it carries no data, it does not consume a sequence number, so the next segment still uses seq=x+1.
- At this point the client enters the ESTABLISHED state. When the server receives the acknowledgment, it also enters the ESTABLISHED state. The connection is now established and both sides can transmit data.
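The sequence and acknowledgment numbers exchanged in these steps can be written out as a small sketch. The function name `three_way_handshake` and the dictionary layout are illustrative assumptions; the function only computes the numbers described above and does not touch the network.

```python
from typing import Dict, List


def three_way_handshake(x: int, y: int) -> List[Dict]:
    """Return the three segments of the handshake for initial
    sequence numbers x (client) and y (server)."""
    mod = 2**32  # sequence numbers wrap around after 2^32 - 1
    return [
        # 1. client -> server: SYN, seq=x
        {"from": "client", "SYN": 1, "ACK": 0, "seq": x, "ack": None},
        # 2. server -> client: SYN+ACK, seq=y, ack=x+1
        {"from": "server", "SYN": 1, "ACK": 1, "seq": y, "ack": (x + 1) % mod},
        # 3. client -> server: ACK, seq=x+1, ack=y+1
        {"from": "client", "SYN": 0, "ACK": 1,
         "seq": (x + 1) % mod, "ack": (y + 1) % mod},
    ]


if __name__ == "__main__":
    for segment in three_way_handshake(x=100, y=300):
        print(segment)
```

With x=100 and y=300, the server's reply carries ack=101 and the client's final acknowledgment carries seq=101, ack=301.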
3.2. Four-way wave
The TCP four-way wave (connection release) proceeds as follows:
- Both ends of the TCP connection start in the ESTABLISHED state. The client stops sending data and sends a FIN segment with FIN=1 and sequence number seq=u (u equals the sequence number of the last byte the client sent, plus 1). The client enters the FIN-WAIT-1 state.
- The server replies with an acknowledgment segment with acknowledgment number ack=u+1 and sequence number seq=v (v equals the sequence number of the last byte the server sent, plus 1). The server enters the CLOSE-WAIT state. The TCP connection is now half-closed: the client no longer sends data, but if the server continues to send data, the client still receives it.
- The client receives the acknowledgment and enters the FIN-WAIT-2 state. Once the server has finished sending its data, it sends its own FIN segment with FIN=1, sequence number seq=w, and acknowledgment number ack=u+1, then enters the LAST-ACK state.
- The client replies with an acknowledgment segment with ACK=1, acknowledgment number ack=w+1 (w is the sequence number of the server's FIN, which may be larger than v if the server sent more data while the connection was half-closed), and sequence number seq=u+1. The client then enters the TIME-WAIT state.
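The state changes on each side during release can be sketched as two transition tables. The event labels (`"send FIN"`, `"recv ACK"`, `"2MSL timeout"`) and the helper `run` are hypothetical names chosen for illustration; the final TIME-WAIT to CLOSED step relies on the 2MSL timer discussed in this section.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT_CLOSE = {
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",   # client sends the final ACK here
    ("TIME-WAIT", "2MSL timeout"): "CLOSED",
}

# Transitions for the side that receives the first FIN.
SERVER_CLOSE = {
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",  # server acknowledges the FIN
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}


def run(transitions, state, events):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = transitions[(state, event)]  # KeyError on an illegal event
        visited.append(state)
    return visited


if __name__ == "__main__":
    print(run(CLIENT_CLOSE, "ESTABLISHED",
              ["send FIN", "recv ACK", "recv FIN", "2MSL timeout"]))
    print(run(SERVER_CLOSE, "ESTABLISHED",
              ["recv FIN", "send FIN", "recv ACK"]))
```

A table keyed by (state, event) keeps illegal transitions visible: any event that is not valid in the current state raises a `KeyError` instead of silently moving on.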
Note that the connection is not released at this point. The client must wait until the TIME-WAIT period ends (up to about 4 minutes) before the connection becomes CLOSED. The wait exists because the last acknowledgment segment may be lost and need to be sent again. The reasons for the TIME-WAIT state are explained below:
The TIME-WAIT state lasts for twice the maximum segment lifetime, 2MSL (MSL = 30 s to 120 s).
- Reason one: reliable termination of the full-duplex TCP connection
If the ACK that the client sends in response to the server's FIN is lost on the way, the server receives no answer and, through the timeout retransmission mechanism, resends its FIN. The lost ACK plus the retransmitted FIN can take up to 2MSL in total. When the second FIN arrives, the client must still hold the necessary state information so that it can resend the final ACK and both sides can close properly.
- Reason two: allowing old duplicate segments to die out in the network
If a connection is closed and a new one is created immediately with the same IP addresses and port numbers, a duplicate segment from the previous connection that is still in the network could be accepted by the new connection. To make sure segments from the old connection have disappeared and cannot affect the new one, TIME-WAIT is set to twice the MSL.
3.3. Impact of the TIME-WAIT state and how to handle it
Because of the TIME-WAIT state, a closed socket is retained for a considerable period of time: after a TCP connection is closed, the socket stays in TIME-WAIT for up to about 4 minutes. If many connections are opened and closed quickly, sockets in the TIME-WAIT state accumulate on the system. You can use the netstat command to see them.
The number of local ports is limited, so only a limited number of sockets can exist at the same time. If too many sockets are in the TIME-WAIT state, it becomes difficult to establish new outgoing connections because too few local ports are left, which also hurts the scalability of your program.
On Linux, you can allow address reuse by setting the SO_REUSEADDR socket option. However, TIME-WAIT exists for good reasons, and shortening the 2MSL wait or using SO_REUSEADDR is not always a good idea. If you can design your protocol so that it avoids the TIME-WAIT problem in the first place, you avoid all the problems described here.
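As a minimal sketch of the SO_REUSEADDR setting mentioned above, the option can be enabled on a socket before it is bound. The helper name `make_reusable_tcp_socket` is hypothetical; `setsockopt`, `SOL_SOCKET`, and `SO_REUSEADDR` are the standard names in Python's `socket` module.

```python
import socket


def make_reusable_tcp_socket() -> socket.socket:
    """Create a TCP socket with SO_REUSEADDR enabled.

    The option has to be set before bind() for it to affect the bind.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    return sock


if __name__ == "__main__":
    s = make_reusable_tcp_socket()
    # A non-zero value means the option is enabled.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR))
    s.close()
```

A server would typically call `bind()` and `listen()` on the returned socket. Whether enabling the option is appropriate depends on the application, for the reasons given above.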
How TCP achieves reliable transmission:
- (1) The length of a TCP segment is variable and is adjusted according to the buffer state of the sender and receiver and the state of the network.
- (2) When TCP receives data from the other end of the connection, it sends an acknowledgment.
- (3) When TCP sends a segment, it starts a timer and waits for the destination to acknowledge the segment. If no acknowledgment arrives in time, the segment is sent again. This is timeout retransmission.
- (4) TCP keeps a checksum over its header and data. If the checksum shows that a received segment is corrupted, the receiver discards the segment and the sender retransmits it after the timeout.
- (5) TCP numbers the data byte by byte, and the sequence number in each segment ensures that data is delivered in the correct order.
- (6) TCP also provides flow control. Each side of a TCP connection has a send buffer and a receive buffer, and the receiving side only allows the other end to send as much data as its receive buffer can accept. This prevents a faster host from overflowing the buffer of a slower host.
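Item (3), timeout retransmission, can be illustrated with a toy simulation. This is a deliberately simplified sketch, not how a real TCP stack is written: there is no real timer or network, and `lost_attempts` is a hypothetical parameter naming which transmissions are dropped.

```python
def send_with_retransmit(segments, lost_attempts, max_tries=5):
    """Send segments one at a time, retransmitting after a simulated timeout.

    lost_attempts is a set of (segment_index, attempt_number) pairs;
    a transmission listed there is treated as lost, so no acknowledgment
    arrives and the timer expires.
    Returns a log of (segment, attempt, outcome) tuples.
    """
    log = []
    for index, segment in enumerate(segments):
        for attempt in range(1, max_tries + 1):
            if (index, attempt) in lost_attempts:
                # No acknowledgment before the timer expired: send again.
                log.append((segment, attempt, "timeout"))
                continue
            log.append((segment, attempt, "acked"))
            break
        else:
            raise RuntimeError(f"gave up on segment {segment!r}")
    return log


if __name__ == "__main__":
    # The first transmission of the second segment is lost.
    for entry in send_with_retransmit(["A", "B", "C"], {(1, 1)}):
        print(entry)
```

In this run segment "B" times out once and is acknowledged on its second attempt, while "A" and "C" succeed on the first try.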
If the sender stopped and waited for an acknowledgment after every segment, timeout retransmission would be very slow. In practice TCP does not work this way and uses pipelined transmission instead: the sender can send several segments in a row without stopping to wait for each acknowledgment (the amount of data that can be sent continuously is called the window). Likewise, the receiver does not have to reply to every segment. It uses cumulative acknowledgment: after receiving several consecutive segments, it acknowledges only the last one, which indicates that all data before it has been received. This greatly improves transmission efficiency.
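Cumulative acknowledgment can be sketched as follows. The function `cumulative_ack` is a hypothetical helper; it assumes that the receiver buffers out-of-order segments (a common behavior, but an assumption here) and it ignores sequence number wraparound for brevity.

```python
def cumulative_ack(next_expected, segments):
    """Return the acknowledgment number after a batch of segments arrives.

    segments is an iterable of (seq, length) pairs in arrival order.
    The result is the sequence number of the next byte the receiver
    expects, which acknowledges every byte before it.
    """
    buffered = {}  # seq -> length for segments not yet delivered in order
    for seq, length in segments:
        buffered[seq] = length
        # Advance over every segment that is now contiguous.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
    return next_expected


if __name__ == "__main__":
    # Three consecutive 100-byte segments: a single ACK covers all of them.
    print(cumulative_ack(1, [(1, 100), (101, 100), (201, 100)]))
    # The middle segment is missing: the ACK stops at the gap.
    print(cumulative_ack(1, [(1, 100), (201, 100)]))
```

One acknowledgment number is enough to confirm all three segments in the first call, and in the second call the same number tells the sender exactly where the gap begins.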
UDP protocol
The User Datagram Protocol (UDP) adds only a little functionality on top of the IP datagram service. Its main features are:
(1) UDP is connectionless: no connection has to be established before data is sent (unlike TCP), which reduces overhead and delay.
(2) UDP uses best-effort delivery and does not guarantee reliable delivery.
(3) UDP is message-oriented: it adds only a very simple encapsulation (an 8-byte UDP header) to each message handed down by the application layer, so the header overhead is small.
(4) UDP has no congestion control, so the sender does not reduce its sending rate when the network is congested. This matters for some real-time applications, such as IP telephony and video conferencing, which tolerate the loss of some data during congestion, because holding on to that data instead of discarding it would most likely cause delay to accumulate.
(5) UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication.
UDP datagram:
A UDP datagram has two parts: the UDP header and the data. The data part is whatever the application layer hands down. The UDP header is 8 bytes long and is divided into 4 fields:
(1) Source port : 2 bytes. Used when the other party needs to reply; it can be 0 when not needed;
(2) Destination port : 2 bytes. It must be filled in and is the most important field;
(3) Length : 2 bytes. The value covers both the header and the data part;
(4) Checksum : 2 bytes. Used to detect errors in the UDP datagram during transmission; a datagram with errors is discarded.
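The four header fields can be packed and unpacked with Python's `struct` module, as in the sketch below. The helper names are hypothetical. `internet_checksum` shows the 16-bit one's-complement sum used by UDP and TCP; note that a real UDP checksum is also computed over a pseudo-header and the data, which this sketch leaves out, and that in IPv4 a checksum field of 0 means no checksum was computed.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack the 8-byte UDP header; length covers header plus data."""
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


def parse_udp_header(raw: bytes) -> dict:
    """Unpack the first 8 bytes of a UDP datagram."""
    src, dst, length, checksum = struct.unpack("!HHHH", raw[:8])
    return {"src_port": src, "dst_port": dst,
            "length": length, "checksum": checksum}


if __name__ == "__main__":
    datagram = build_udp_header(5000, 53, b"hello") + b"hello"
    print(parse_udp_header(datagram))
```

For a 5-byte payload the length field is 13, which is the 8-byte header plus the data.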
The difference between TCP and UDP
Message-oriented transmission means that however long a message the application layer hands to UDP, UDP sends it as is, one message at a time. UDP neither merges nor splits the messages delivered by the application layer; it preserves their boundaries. The application must therefore choose an appropriate message size. If a message is too long, the IP layer has to fragment it, which reduces efficiency. If it is too short, the IP datagram carries too little data relative to its header, which is also inefficient.
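The boundary-preserving behavior can be observed on the loopback interface with a short sketch. The function name `udp_boundary_demo` is hypothetical, and the example assumes a host where binding to 127.0.0.1 is permitted. UDP itself guarantees neither delivery nor ordering, so a timeout guards the receive calls.

```python
import socket


def udp_boundary_demo(messages):
    """Send each message as one datagram and receive them one by one."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a port
        receiver.settimeout(2.0)             # do not block forever on loss
        address = receiver.getsockname()
        for message in messages:
            sender.sendto(message, address)  # no connection setup needed
        # Each recvfrom() returns exactly one datagram, never a merged one.
        return [receiver.recvfrom(65535)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_boundary_demo([b"hello", b"world!"]))
```

Each call to `recvfrom` yields one complete message as it was sent, so the application does not need to add its own boundaries.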
With a byte stream, although the application hands data to TCP one block at a time (blocks may vary in size), TCP treats the application's data as a continuous, unstructured stream of bytes. TCP has a buffer: when a block passed down by the application is too long, TCP can split it into shorter pieces before sending. If the application sends only one byte at a time, TCP can also wait until enough bytes have accumulated before sending them in one segment.
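Because TCP keeps no message boundaries, an application that needs them has to add its own. One common approach, shown here only as a sketch and not as something the text above prescribes, is to prefix each message with its length. The names `frame` and `FrameDecoder` and the 4-byte prefix are assumptions chosen for illustration.

```python
import struct


def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message


class FrameDecoder:
    """Rebuild messages from a byte stream that arrives in arbitrary chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes and return every message completed so far."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= 4:
            (size,) = struct.unpack("!I", self._buffer[:4])
            if len(self._buffer) < 4 + size:
                break                        # message not complete yet
            messages.append(bytes(self._buffer[4:4 + size]))
            del self._buffer[:4 + size]
        return messages


if __name__ == "__main__":
    stream = frame(b"hello") + frame(b"world")
    decoder = FrameDecoder()
    # Deliver the stream in chunks that do not line up with the messages.
    for chunk in (stream[:3], stream[3:11], stream[11:]):
        print(decoder.feed(chunk))
```

The decoder returns the same two messages no matter how the stream is cut into chunks, which is the property a receiver needs when reading from a TCP socket.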
Comparison of TCP and UDP:
TCP/IP (iii): Transport layer TCP vs. UDP