1. Some TCP terminology
(1) MSS (maximum segment size): the largest amount of data that TCP places in a single segment. Each side announces its MSS in a 16-bit option field, so the maximum value is 65535. The TCP_MAXSEG socket option, used with setsockopt() and getsockopt(), sets and reads the MSS.
(2) MSL (maximum segment lifetime): the longest time a segment can exist in the network. This is a system-level parameter with no socket interface for changing it. On Windows it can be modified through the registry; it is usually 2 minutes, with a minimum of 30 seconds. On Linux it cannot be modified.
(3) TTL (time to live): set by the source host. It is not an actual time value but the maximum number of hops, that is, the maximum number of routers, that an IP datagram may pass through. The field is in the IP header and is 8 bits wide, so the maximum hop count is 255. It can be set with setsockopt().
(4) RTT (round-trip time): the time from sending a segment until its acknowledgement comes back. TCP estimates the RTT dynamically, and the RTT changes with network congestion.
(5) MTU (maximum transmission unit): the largest unit a link can carry in one frame, determined by the hardware; different networks have different MTUs.
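As a rough illustration of the socket options mentioned for MSS and TTL, the sketch below sets the TTL on a new TCP socket and reads back both values. The helper name read_mss_and_ttl is made up for this example, and it assumes a platform where Python exposes IP_TTL; TCP_MAXSEG is not available everywhere, so the lookup is guarded.

```python
import socket


def read_mss_and_ttl(ttl=64):
    """Set the TTL on a fresh TCP socket, then read back the TTL and the MSS.

    Hypothetical helper for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # TTL lives in the IP header, so it is set at the IPPROTO_IP level.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        current_ttl = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)

        # TCP_MAXSEG is not exposed on every platform, so guard the lookup.
        mss = None
        if hasattr(socket, "TCP_MAXSEG"):
            mss = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
        return current_ttl, mss
```

On an unconnected socket the reported MSS is only the system's default; the negotiated value is known once the connection is established.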
2. Meaning of the flag bits in the TCP header
(1) URG: urgent flag. When it is set to 1, the 16-bit urgent pointer in the TCP header is valid; the urgent pointer is an offset that tells how many bytes of urgent data there are.
(2) ACK: the acknowledgement number is valid, confirming that the data up to the specified sequence number has been received.
(3) PSH: the receiver should hand the segment to the application layer as soon as possible. This flag is often discussed together with the socket's TCP_NODELAY option. With TCP_NODELAY set, data is sent without delay no matter how small it is, and the receiver passes it straight to the application without waiting. Setting the option turns off the Nagle algorithm. (The Nagle algorithm allows at most one unacknowledged small segment on a TCP connection; no other small segments are sent until its acknowledgement arrives. Instead, the small pieces are buffered and sent as one larger segment when the acknowledgement arrives, which avoids flooding the network with small packets that lower transfer efficiency and cause congestion.)
(4) SYN: the flag used to establish a connection.
(5) FIN: the sender has finished sending; the flag used to close a connection.
(6) RST: resets the connection. It usually indicates an abnormal disconnection, that is, a hard error on the TCP connection.
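The six flags occupy the low bits of one byte in the TCP header. A minimal sketch of decoding that byte is shown below; the function name decode_flags is an assumption made for this example, while the bit positions are the standard ones.

```python
# Bit masks of the six classic TCP flags, lowest bit first.
TCP_FLAGS = [
    ("FIN", 0x01),
    ("SYN", 0x02),
    ("RST", 0x04),
    ("PSH", 0x08),
    ("ACK", 0x10),
    ("URG", 0x20),
]


def decode_flags(flag_byte):
    """Return the names of the flags set in the given flag byte."""
    return [name for name, mask in TCP_FLAGS if flag_byte & mask]
```

For example, the second segment of the handshake carries 0x12, which decodes to SYN and ACK.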
3. States of a TCP connection
LISTEN: listening for connection requests.
SYN_SENT: after sending a connection request (SYN), waiting for the matching reply.
SYN_RECEIVED: after receiving a connection request and acknowledging it (received SYN, sent SYN+ACK).
ESTABLISHED: the connection is established, either after receiving SYN+ACK and sending an ACK, or after receiving the ACK of the SYN+ACK.
FIN_WAIT_1: the application closes first (active close); entered after the FIN is sent.
FIN_WAIT_2: entered after the FIN has been sent and the other side's ACK of that FIN has been received.
CLOSING: both sides close at the same time. This side has sent its FIN and then receives the other side's FIN before its own FIN is acknowledged; it replies with an ACK and enters this state.
TIME_WAIT: entered after receiving the other side's FIN and sending the ACK for it, or, in a simultaneous close, after receiving the ACK of its own FIN. The connection is almost closed but must wait for 2MSL. The last ACK may be lost, in which case the other side retransmits its FIN, and that FIN arrives within one MSL, so the wait covers the longest possible round trip. If no new FIN arrives during this period, the last ACK got through. If one arrives, the last ACK was lost; the ACK is sent again and the 2MSL wait restarts.
CLOSE_WAIT: the application has not sent its own FIN yet; entered after receiving the other side's FIN and sending the ACK for it.
LAST_ACK: after acknowledging the other side's FIN, the application sends its own FIN and enters this state to wait for the ACK of that FIN.
CLOSED: entered when the 2MSL wait ends, or when the ACK of the last FIN is received.
Establishing a TCP connection normally takes a three-way handshake, and closing it normally takes a four-way wave. These states are not visible to the program: TCP maintains them and the socket API does not expose them. Each state changes as the segments listed above are sent and received.
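The state changes can be written down as a small transition table. The sketch below is a simplified model, not a full implementation: the event names (active_open, recv_fin and so on) are invented for this example, and combined cases such as receiving FIN and ACK in one segment are left out.

```python
# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",        # client sends SYN
    ("CLOSED", "passive_open"): "LISTEN",         # server starts listening
    ("LISTEN", "recv_syn"): "SYN_RECEIVED",       # server replies SYN+ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",  # client replies ACK
    ("SYN_RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",       # active close, FIN sent
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # passive close, ACK sent
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",        # simultaneous close
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",          # own FIN sent
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Apply a sequence of events and return the resulting state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state
```

Feeding it the events of an active open followed by an active close walks through SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2 and TIME_WAIT before returning to CLOSED.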
4. TCP Transfer Protocol
(1) Retransmission (from the sender's point of view)
TCP is a reliable transport, and an important guarantee of that reliability is the ability to retransmit packets lost in the network. There are two kinds of retransmission: timeout retransmission and fast retransmission. With timeout retransmission, the sender retransmits if no ACK has arrived by the time the retransmission timeout (RTO) expires. With fast retransmission, the receiver notices a gap in the sequence and answers each out-of-order segment with a duplicate ACK that still asks for the missing data. When the sender receives three identical duplicate ACKs, it triggers the fast retransmission mechanism and resends the lost segment. Fast retransmission supplements timeout retransmission and makes retransmission more timely.
a) Timeout retransmission
The RTO (retransmission timeout) is calculated from a number of RTT samples as a kind of average (the actual algorithm is more complex) and is adjusted dynamically as network conditions change. Each time data is sent, a timeout retransmission happens if the ACK has not arrived when the RTO expires. If the retransmission is not acknowledged either, the RTO is doubled for the next retransmission, until the number of retransmissions configured in the system is reached, at which point the connection is closed. By default, Windows hosts retransmit 5 times, and most Linux systems retransmit up to 15 times. Both are configurable.
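To make the "more complex" calculation concrete, here is a minimal sketch of one common way to derive the RTO from RTT samples, using a smoothed RTT and an RTT variance as standardized in RFC 6298. This is a swapped-in standard algorithm, not something taken from the text above, and the class name and the 1 second lower and 60 second upper bounds are assumptions of this example. All times are in seconds.

```python
class RtoEstimator:
    """RTO estimation from RTT samples (RFC 6298 style), with backoff."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variance
    MIN_RTO = 1.0   # assumed lower bound
    MAX_RTO = 60.0  # assumed upper bound

    def __init__(self):
        self.srtt = None    # smoothed RTT
        self.rttvar = None  # RTT variance
        self.rto = 1.0      # initial RTO before any sample exists

    def on_rtt_sample(self, rtt):
        """Update the estimate with a measured RTT and return the new RTO."""
        if self.srtt is None:
            # First measurement initializes both values.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = min(self.MAX_RTO, max(self.MIN_RTO, self.srtt + 4 * self.rttvar))
        return self.rto

    def on_timeout(self):
        """Double the RTO after a timeout retransmission (exponential backoff)."""
        self.rto = min(self.MAX_RTO, self.rto * 2)
        return self.rto
```

With a first sample of 0.5 s the RTO becomes 1.5 s, and two consecutive timeouts raise it to 3 s and then 6 s.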
b) Fast retransmission
Once the sender receives three duplicate ACKs, it triggers a fast retransmission. All other segments waiting to be sent stay in the queue until the fast retransmission has been sent.
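The trigger condition can be sketched as a counter over the stream of acknowledgement numbers the sender sees. The function name find_fast_retransmits is an assumption of this example, and the model ignores everything except the ACK numbers themselves.

```python
def find_fast_retransmits(acks, threshold=3):
    """Return the ACK numbers for which a fast retransmission would fire.

    A retransmission fires when the same ACK number has been repeated
    `threshold` times after its first appearance.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # Third duplicate: resend the segment starting at `ack`.
                retransmits.append(ack)
        else:
            # A new ACK number resets the duplicate counter.
            last_ack = ack
            dup_count = 0
    return retransmits
```

For the ACK sequence 100, 200, 300, 300, 300, 300, 700 the function reports a single fast retransmission of the segment starting at 300.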
(2) Flow control (from the receiver's point of view)
When sending data, TCP must take into account how much the receiver can accept. It cannot send arbitrary amounts, or the receiver's buffer would overflow, so TCP uses a sliding window for flow control. The 16-bit window field in the TCP header carries the amount of buffer space currently available for receiving, and the receiver controls the sending speed by setting this window according to its receive buffer.
The sender's window covers segments that have been sent but not yet acknowledged plus segments waiting to be sent; the receive window is the free space in the receive buffer. The size of each segment is limited by the MSS: the three-way handshake at the start of the connection carries each side's MSS, so that segments of the maximum size can be sent. If the receive window drops below the MSS during the transfer, the sender sends segments of the actual window size.
Sliding window protocols include the following:
a) Stop-and-wait protocol
The sender can send new data only after receiving the ACK for the previous segment, which results in low channel utilization.
b) Go-Back-N protocol
The sender can send up to N segments in a row, starting a timeout timer for each one. If one segment times out, the sender must retransmit that segment and every segment after it, and the receiver discards all segments that followed the missing one. This wastes a lot of transmission and is inefficient on a poor network.
c) Selective repeat protocol
Selective repeat improves on Go-Back-N. The sender can still send N segments in a row, but when one of them triggers a timeout or fast retransmission, only the missing segment is resent. The receiver does not discard what it has already received: segments after the gap are kept in the buffer and delivered to the application layer together once the missing segment arrives. This avoids the waste, at the cost of some requirements on the receive buffer. The sliding window usually uses this protocol.
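The difference in waste between Go-Back-N and selective repeat can be illustrated by counting transmissions in a toy model. The sketch below assumes that each segment listed in lost is dropped only on its first transmission and that retransmissions always succeed; both function names are invented for this example.

```python
def go_back_n_transmissions(total, window, lost):
    """Count segments sent by Go-Back-N for `total` segments (0..total-1)."""
    lost = set(lost)
    sent = 0
    base = 0
    while base < total:
        end = min(base + window, total)
        first_loss = None
        for seq in range(base, end):
            sent += 1
            if seq in lost:
                lost.discard(seq)  # dropped once, later copies arrive
                if first_loss is None:
                    first_loss = seq
        # After a loss, everything from the lost segment onward is resent.
        base = end if first_loss is None else first_loss
    return sent


def selective_repeat_transmissions(total, lost):
    """Count segments sent by selective repeat: only lost ones are resent."""
    return total + len({seq for seq in lost if 0 <= seq < total})
```

For 8 segments, a window of 4 and segment 1 lost once, this model gives 11 transmissions for Go-Back-N against 9 for selective repeat.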
d) Zero window
Sometimes the receiver cannot process incoming data fast enough and can no longer accept new data. It then advertises a receive window of 0, and the sender stops sending new data. When the receiver has enough buffer space again, it sends an ACK to open the window. That ACK may be lost, leaving the sender and the receiver waiting for each other in a deadlock. To avoid this, the sender sets a persist timer and periodically sends a probe segment to the receiver to see whether the window has opened. The persist timer backs off exponentially (doubling each time) up to at most 60 seconds between probes, until the window opens or the application closes the connection.
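The probe schedule of the persist timer follows directly from the doubling rule and the 60 second cap. In the sketch below the initial interval of 5 seconds is an assumption chosen for illustration; real stacks derive it from their own timers, and the function name is made up.

```python
def persist_intervals(initial=5.0, cap=60.0, probes=6):
    """Return the waiting time in seconds before each window probe.

    The interval doubles after every probe and never exceeds `cap`.
    """
    intervals = []
    interval = initial
    for _ in range(probes):
        intervals.append(min(interval, cap))
        interval *= 2
    return intervals
```

With these assumed defaults the sender waits 5, 10, 20 and 40 seconds, and then 60 seconds before every further probe.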
e) Silly window syndrome
This refers to data being sent in small segments rather than full-length ones because the receive window is small (less than one MSS), in which case bandwidth utilization is very low. It can be avoided on both the sender and the receiver side.
The receiver does not advertise small windows: once the window is 0, it is not reopened until it can hold one full MSS or half of the receive buffer.
The sender does not send small segments: using the Nagle algorithm, it sends at once only when no data is waiting for acknowledgement; otherwise it waits until it has accumulated a full-length segment (MSS) or until the data amounts to at least half of the receive window.
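Both avoidance rules can be expressed as small decision functions. This is a sketch under stated assumptions: the function and parameter names are invented here, and the receiver rule uses the smaller of one MSS and half the buffer as its threshold, which is one reading of the rule above.

```python
def sender_may_send(pending_bytes, unacked_bytes, mss, peer_window):
    """Sender-side rule: decide whether buffered data may be sent now."""
    if unacked_bytes == 0:
        return True   # nothing in flight, send immediately
    if pending_bytes >= mss:
        return True   # a full-length segment has accumulated
    if pending_bytes >= peer_window / 2:
        return True   # at least half of the receiver's window
    return False      # otherwise keep buffering (Nagle)


def receiver_advertised_window(free_bytes, mss, buffer_size):
    """Receiver-side rule: advertise 0 until the window is worth opening."""
    threshold = min(mss, buffer_size // 2)
    return free_bytes if free_bytes >= threshold else 0
```

For instance, 10 pending bytes are sent at once on an idle connection but held back while 500 bytes are still unacknowledged.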
(3) Congestion control (from the network's point of view)
Retransmission and sliding-window flow control only consider the sender and the receiver, not the state of the network. If the network is congested, retransmissions only make the congestion worse, so TCP also has a strategy for network congestion. It consists of slow start, congestion avoidance, the reaction when congestion occurs, and fast recovery.
a) Slow start, exponential growth
The slow start algorithm works by controlling the rate at which the sender puts new segments into the network, matching it to the rate at which the receiver returns acknowledgements. Slow start adds a congestion window (cwnd) to the sender. The congestion window is the sender's means of limiting traffic, while the advertised window is the receiver's.
The sender uses the smaller of the congestion window and the advertised window as the upper limit on the amount of data in flight, sending segments of MSS size. At the start of a connection, cwnd is initialized to one MSS, and each ACK received increases it by one MSS. cwnd therefore grows exponentially: as cwnd grows, more segments may be sent, each of them brings back an ACK that adds another MSS, and so cwnd doubles every round trip.
"Slow" start means that the initial limit is small, not that growth is slow. This growth cannot go on forever, so the sender also keeps a slow start threshold (ssthresh, usually 65535). When cwnd reaches ssthresh, slow start ends and the congestion avoidance phase begins.
b) Congestion avoidance, linear growth
In this phase each ACK increases cwnd by (1/cwnd) * MSS, with cwnd counted in segments. After one complete round, that is, once all segments sent in that window have been acknowledged, cwnd has grown by only one MSS ((1/cwnd) * MSS * cwnd = MSS). Growth changes from exponential to linear, which avoids the congestion that overly fast growth would cause.
c) Congestion occurs, multiplicative decrease
Packet loss indicates congestion in the network. ssthresh is then set to half of the current window (the actual send limit, that is, the smaller of cwnd and the receiver's advertised window), and the rest depends on how the loss was detected.
Timeout retransmission: congestion is severe. TCP goes back to slow start: cwnd is set to 1 MSS, the lost data is retransmitted, and cwnd grows by one MSS for each newly acknowledged segment.
Fast retransmission: congestion is probably mild. TCP does not go back to slow start but enters fast recovery (the handling that goes with fast retransmission, similar to congestion avoidance).
d) Fast recovery
Retransmit the missing segment and set cwnd to ssthresh + 3 * MSS (three duplicate ACKs mean that three segments have left the network and been received). For each further duplicate ACK, cwnd grows by one MSS and a segment is sent if the window allows. When an ACK for new data arrives, the lost segment has been acknowledged; TCP returns to congestion avoidance and sets cwnd to ssthresh.
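The four congestion control phases fit into one small state holder. The sketch below is a simplified model under several assumptions: cwnd and ssthresh are counted in MSS units, the receiver's advertised window is ignored, ssthresh is never allowed below 2 segments (a floor borrowed from RFC 5681, not from the text above), and the class and method names are invented for this example.

```python
class CongestionControl:
    """Toy model of slow start, congestion avoidance and fast recovery."""

    def __init__(self, ssthresh=64.0):
        self.cwnd = 1.0            # congestion window, in MSS units
        self.ssthresh = ssthresh   # slow start threshold, in MSS units
        self.in_fast_recovery = False

    def on_new_ack(self):
        """An ACK for new data arrived."""
        if self.in_fast_recovery:
            # The lost segment is confirmed: deflate the window.
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        elif self.cwnd < self.ssthresh:
            self.cwnd += 1.0               # slow start: +1 MSS per ACK
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance

    def on_duplicate_ack(self):
        """A further duplicate ACK arrived during fast recovery."""
        if self.in_fast_recovery:
            self.cwnd += 1.0

    def on_triple_duplicate_ack(self):
        """Fast retransmission was triggered: enter fast recovery."""
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = self.ssthresh + 3.0
        self.in_fast_recovery = True

    def on_timeout(self):
        """Timeout retransmission: fall back to slow start."""
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = 1.0
        self.in_fast_recovery = False
```

Starting with ssthresh at 4, three ACKs bring cwnd from 1 to 4; a triple duplicate ACK then halves ssthresh to 2 and sets cwnd to 5, and the next ACK for new data drops cwnd back to 2.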
(4) The keepalive mechanism
TCP provides a keepalive (heartbeat) mechanism. By default it is activated after 2 hours without any traffic between the two sides. It then sends a probe every 75 seconds, up to 10 times; if none of the 10 probes is answered, the connection is considered dead and is closed.
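Keepalive is switched on per socket with SO_KEEPALIVE. The sketch below also tries to set the three timing values quoted above; the option names TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT exist in Python's socket module on Linux but not on every platform, so each one is looked up defensively. The helper name enable_keepalive is an assumption of this example.

```python
import socket


def enable_keepalive(sock, idle=7200, interval=75, count=10):
    """Turn on TCP keepalive and, where supported, tune its timing.

    idle: seconds of silence before the first probe
    interval: seconds between probes
    count: unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        option = getattr(socket, name, None)
        if option is not None:  # skip options this platform lacks
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Applications that need to detect a dead peer sooner than the 2 hour default would pass smaller values for idle and interval.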
5. TCP server design
Incoming connection request queue: when a new connection request arrives while the server is busy with other events, TCP by default completes the three-way handshake itself and places the accepted connection in a connection queue, where it waits for the application layer to accept and process it. The length of this queue can be set; in Python this is socket.listen(backlog), where backlog is the length of the connection queue. If the queue is full, new incoming connections get no response. If a queued connection is not picked up by the application layer for a long time, it is dropped because of a timeout.
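A minimal sketch of the backlog in Python follows. It shows that a client's connect completes before the server calls accept(), because the handshake is finished by the TCP stack and the connection simply waits in the queue. The helper names make_listener and connect_before_accept, the loopback address and the 5 second timeouts are assumptions of this example.

```python
import socket


def make_listener(backlog=5, host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the system pick a port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(backlog)  # backlog = length of the connection queue
    return server


def connect_before_accept():
    """Connect a client first, accept afterwards, and compare addresses."""
    server = make_listener()
    server.settimeout(5)
    try:
        # The handshake completes here although accept() has not run yet.
        client = socket.create_connection(server.getsockname(), timeout=5)
        try:
            conn, peer = server.accept()  # take the queued connection
            conn.close()
            return peer == client.getsockname()
        finally:
            client.close()
    finally:
        server.close()
```

How the system treats connections beyond the backlog differs between platforms, so the value is best seen as a hint rather than an exact limit.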