What is TCP?
I won't explain in detail what TCP is. If you are reading this article, I assume you already know the concept and want a deeper understanding of how TCP works, so let's continue. TCP is a very complex protocol; it is the foundation of the Internet and a basic skill every programmer must have.
First, look at the OSI seven-layer model. TCP works at the fourth layer (the transport layer), IP at the third layer (the network layer), and ARP at the second layer (the data link layer). Data at the second layer is called a frame, data at the third layer is called a packet, and data at the fourth layer is called a segment. We also need to know that as data is passed down from the application layer, each layer adds its own header (encapsulation) before the data is sent to the receiver. In other words, the data is wrapped layer by layer, with each layer's unit becoming the payload of the layer below. Each layer of the OSI model has its own role and its own corresponding protocols.
TCP is a protocol, so how is it defined, and what does its data format look like? For a deeper analysis it is necessary to understand, or even memorize, the meaning of each field in the TCP header. The header format is so important, and so fundamental to everything that follows, that each of its fields is described in detail below.
- Source port and Destination port: 16 bits each, giving the source and destination port numbers. Port numbers distinguish different processes on a host, while IP addresses distinguish different hosts. The source and destination port numbers, together with the source and destination IP addresses in the IP header, uniquely identify a TCP connection;
- Sequence number: 32 bits. Identifies the byte stream sent from the TCP sender to the TCP receiver; it is the sequence number of the first data byte carried in this segment, and is mainly used to deal with segments arriving out of order;
- Acknowledgment number: 32 bits. It holds the next sequence number that the sender of the acknowledgment expects to receive, so it should be the sequence number of the last data byte successfully received plus 1. The field is valid only when the ACK flag (described below) is 1. It is mainly used to deal with packet loss;
- Offset: The length of the header, measured in 32-bit words. This value is required because the length of the options field is variable. The field occupies 4 bits, so the header can be at most 15 32-bit words, that is, 4*15=60 bytes. Without options, the normal header length is 20 bytes;
- TCP Flags: The TCP header has 6 flag bits, several of which can be set to 1 at the same time. They are used mainly to drive TCP's state machine and are, in order, URG, ACK, PSH, RST, SYN, FIN. The meaning of each flag bit is as follows:
- URG: Indicates that the urgent pointer field of the TCP segment is valid, signalling that the segment carries urgent data that should be processed as soon as possible;
- ACK: Indicates that the acknowledgment field is valid, that is, the segment carries an acknowledgment number. The flag takes the value 0 or 1: 1 means the acknowledgment field is valid, 0 means it is not;
- PSH: This flag bit represents a push operation. The so-called push operation means that the packet is delivered to the application immediately after it reaches the receiving end, rather than queued in the buffer;
- RST: Indicates a connection reset request. It is used to reset a connection that has run into an error, and also to reject erroneous or illegal packets;
- SYN: Indicates a synchronize-sequence-numbers segment, used to establish a connection. The SYN flag is used together with the ACK flag: in a connection request, SYN=1 and ACK=0; in the response to a connection request, SYN=1 and ACK=1. Segments with this flag are often used for port scanning. The scanner sends a bare SYN segment, and if the target host replies, the host has that port open. Because such a scan performs only the first step of the TCP three-way handshake, a successful scan of this kind suggests that the scanned machine is not well protected; a more secure host will insist on a strict, complete three-way handshake;
- FIN: Indicates that the sender has reached the end of its data, that is, its data transfer is complete and it has nothing more to send. After TCP segments with the FIN flag have been exchanged, the connection is closed. Segments with this flag are also often used for port scanning.
- Window: The window size, known as the sliding window, used for flow control. This is a complex topic that will not be covered in this blog post.
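
The field layout described above can be sketched in Python. This is only an illustration, not a full TCP implementation: the helper name `parse_tcp_header` and the sample values are made up for the example, and only the fixed 20-byte part of the header and the six flags listed above are decoded.

```python
import struct

# The six flag bits sit in the low bits of the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (hypothetical helper)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_flags >> 12  # top 4 bits: header length in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": data_offset * 4,  # converted to bytes, 20 to 60
        "flags": [name for name, bit in FLAG_BITS.items() if offset_flags & bit],
        "window": window,
    }


# A hand-built SYN+ACK header with no options (offset = 5 words = 20 bytes).
sample = struct.pack("!HHIIHHHH", 80, 50000, 300, 101, (5 << 12) | 0x12, 65535, 0, 0)
print(parse_tcp_header(sample))
```

Packing the header by hand keeps the example self-contained; capturing real segments would require raw sockets or a packet capture tool, which is outside the scope of this sketch.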
Well, the basics are ready, so let's start the next journey.
What is the three-way handshake?
TCP is connection-oriented: a connection must be established between the two parties before either side sends data to the other. In the TCP/IP protocol suite, TCP provides a reliable connection service, and the connection is initialized by a three-way handshake. The purpose of the three-way handshake is to synchronize the sequence numbers and acknowledgment numbers of both parties and to exchange TCP window size information. This is the TCP three-way handshake that interviewers so often ask about. Knowing only the concept will not help you get a job; you need to understand some of its details. Take a look at the diagram first. It is a very clear picture; it is not my own drawing, of course, I am only quoting it to explain the point.
- First handshake: establishes the connection. The client sends a connection request segment with the SYN flag set to 1 and Sequence Number set to X; the client then enters the SYN_SENT state and waits for the server to confirm;
- Second handshake: The server receives the client's SYN segment and must acknowledge it by setting Acknowledgment Number to X+1 (Sequence Number + 1). At the same time it sends its own SYN request, with the SYN flag set to 1 and Sequence Number set to Y. The server puts all of this information into a single segment (the SYN+ACK segment) and sends it to the client; the server then enters the SYN_RECV state;
- Third handshake: The client receives the server's SYN+ACK segment. It then sets Acknowledgment Number to Y+1 and sends an ACK segment to the server. Once this segment has been sent, both the client and the server enter the ESTABLISHED state, and the TCP three-way handshake is complete.
Once the three-way handshake is complete, the client and server can begin transmitting data. That is the general outline of the TCP three-way handshake.
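
The number bookkeeping in the three steps above can be traced with a small Python sketch. It only models the segments on paper and sends nothing over a network; the `Segment` type and the function name are invented for the illustration, and `x` and `y` stand for the initial sequence numbers X and Y chosen by the client and the server.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three segments exchanged for initial sequence numbers x and y."""
    return [
        # 1. Client asks to connect, then waits in SYN_SENT.
        Segment("client", "SYN", seq=x, ack=None),
        # 2. Server acknowledges x and sends its own SYN, then waits in SYN_RECV.
        Segment("server", "SYN+ACK", seq=y, ack=x + 1),
        # 3. Client acknowledges y; its own SYN used up sequence number x,
        #    so this segment carries seq = x + 1. Both sides are now ESTABLISHED.
        Segment("client", "ACK", seq=x + 1, ack=y + 1),
    ]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```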
What about the four-way breakup?
After the client and server have established a TCP connection through the three-way handshake and the data transfer is complete, the TCP connection has to be torn down. For closing a TCP connection, there is the mysterious "four-way breakup".
- First breakup: Host 1 (which can be either the client or the server) sets Sequence Number and Acknowledgment Number and sends a FIN segment to host 2. Host 1 then enters the FIN_WAIT_1 state; this means that host 1 has no more data to send to host 2;
- Second breakup: Host 2 receives the FIN segment sent by host 1 and replies with an ACK segment whose Acknowledgment Number is the received Sequence Number plus 1. Host 1 enters the FIN_WAIT_2 state; host 2 is telling host 1, "I agree to your close request";
- Third breakup: Host 2 sends a FIN segment to host 1 to request closing the connection, and host 2 enters the LAST_ACK state;
- Fourth breakup: Host 1 receives the FIN segment sent by host 2 and replies with an ACK segment, then enters the TIME_WAIT state. When host 2 receives host 1's ACK segment, it closes the connection. If host 1 waits for 2MSL and still receives nothing further, this shows that host 2 has closed properly, so host 1 can close the connection too.
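
The four steps above can be traced the same way. The sketch below is a paper model only: the function name is made up, `u` and `v` are assumed sequence numbers for the two FIN segments, and the flags are simplified to match the description (on a real connection a FIN segment normally carries the ACK flag as well).

```python
from typing import Dict, List


def four_way_close(u: int, v: int) -> List[Dict]:
    """Return the four segments of a close started by host 1."""
    return [
        # 1. Host 1 has no more data to send.
        {"sender": "host1", "flags": "FIN", "seq": u, "state_after": "FIN_WAIT_1"},
        # 2. Host 2 acknowledges the FIN; host 1 moves to FIN_WAIT_2 on receipt.
        {"sender": "host2", "flags": "ACK", "ack": u + 1, "state_after": "CLOSE_WAIT"},
        # 3. Host 2 has finished sending as well.
        {"sender": "host2", "flags": "FIN", "seq": v, "state_after": "LAST_ACK"},
        # 4. Host 1 acknowledges and waits 2MSL before closing for good.
        {"sender": "host1", "flags": "ACK", "ack": v + 1, "state_after": "TIME_WAIT"},
    ]


for step in four_way_close(u=500, v=900):
    print(step)
```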
At this point, TCP's four-way breakup is happily complete. Having read this far, you probably have a lot of questions and things you don't understand, and it may all feel rather messy. All right, let's continue.
Why do you have to shake hands three times?
Now that the three-way handshake has been summarized, why must there be three steps? Surely two would be enough to get the job done? Why does TCP need three? This is what Xie Xiren's Computer Networks says:
To prevent an invalid connection request segment from suddenly reaching the server and causing an error.
The book also gives an example, as follows:
"An 'invalid connection request segment' arises in the following situation: the first connection request segment sent by the client is not lost, but is held up at some network node for a long time, so that it reaches the server only some time after the connection has been released. By then it is a segment that expired long ago. When the server receives this invalid connection request segment, however, it mistakes it for a new connection request from the client, and sends the client a confirmation segment agreeing to establish a connection. If the three-way handshake were not used, the new connection would be established as soon as the server sent its confirmation. Because the client has not actually requested a connection, it ignores the server's confirmation and sends no data to the server. The server, however, believes that a new transport connection has been established and keeps waiting for the client to send data, so a lot of the server's resources are wasted. The three-way handshake prevents this. In the situation just described, the client does not confirm the server's confirmation, and because the server receives no confirmation, it knows that the client did not request a connection."
This makes it clear: the point is to keep the server from waiting indefinitely and wasting resources.
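
The book's scenario can be reduced to a toy model in Python. It is a deliberate simplification with invented names: it ignores timeouts and retransmission, and only answers the question "does the server end up holding a connection after a SYN arrives?" for a two-step and a three-step design.

```python
def server_connections(handshake_steps: int, client_wants_connection: bool) -> int:
    """Return how many connections the server holds after one SYN arrives.

    client_wants_connection is False for a stale SYN that was delayed in the
    network and arrives after the client has already finished.
    """
    # Step 1: the SYN arrives. Step 2: the server replies with its confirmation.
    if handshake_steps == 2:
        # Two-step design: the server treats the connection as established
        # as soon as it has replied, whether or not the client still cares.
        return 1
    # Three-step design: the server waits for the client's final ACK.
    # A client that never asked for this connection ignores the reply.
    client_sends_final_ack = client_wants_connection
    return 1 if client_sends_final_ack else 0


print(server_connections(2, client_wants_connection=False))  # stale SYN, two steps
print(server_connections(3, client_wants_connection=False))  # stale SYN, three steps
print(server_connections(3, client_wants_connection=True))   # genuine request
```

With two steps the stale SYN leaves the server holding one useless connection; with three steps it holds none, while a genuine request still succeeds.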
Why do you break up four times?
Why the four-way breakup? TCP is a connection-oriented, reliable, byte-stream-based transport layer protocol, and it is full-duplex. This means that when host 1 sends a FIN segment, it only indicates that host 1 has no more data to send: host 1 is telling host 2 that all of its data has been sent, but host 1 can still receive data from host 2. When host 2 returns the ACK segment, it shows that host 2 knows host 1 has no more data to send, but host 2 can still send data to host 1. When host 2 also sends a FIN segment, host 2 has no more data to send either and is telling host 1 so, after which the two sides happily tear down the TCP connection. To understand the four-way breakup properly, you need to understand the state changes that occur during it.
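
This half-closed behaviour can be observed with real sockets. The sketch below assumes a machine where loopback TCP connections are allowed; the function name and the messages are made up. It uses `socket.shutdown(socket.SHUT_WR)`, which closes only the sending direction (the operating system sends a FIN) while leaving the receiving direction open.

```python
import socket


def half_close_demo() -> tuple:
    """Return (what host 2 read before the FIN, what host 1 received afterwards)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    host1 = socket.create_connection(listener.getsockname())
    host2, _ = listener.accept()
    try:
        host1.sendall(b"last data from host 1")
        host1.shutdown(socket.SHUT_WR)  # host 1 sends FIN: nothing more to send

        received = b""
        while True:  # an empty read means host 1's FIN has arrived
            chunk = host2.recv(1024)
            if not chunk:
                break
            received += chunk

        # The direction from host 2 to host 1 is still open.
        host2.sendall(b"host 2 still talking")
        host2.shutdown(socket.SHUT_WR)  # now host 2 sends its own FIN

        reply = b""
        while True:
            chunk = host1.recv(1024)
            if not chunk:
                break
            reply += chunk
        return received, reply
    finally:
        host1.close()
        host2.close()
        listener.close()


print(half_close_demo())
```

Both ends run in one process here only to keep the example short; the kernel completes the handshake in the listen backlog, so `create_connection` returns before `accept` is called.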
FIN_WAIT_1: This state deserves a careful explanation. Both FIN_WAIT_1 and FIN_WAIT_2 really mean waiting for the other side's FIN segment. The difference is that a socket enters FIN_WAIT_1 when it is in the ESTABLISHED state, wants to close the connection actively, and sends a FIN segment to the other side. When the other side responds with an ACK, the socket moves to FIN_WAIT_2. Under normal conditions the other side should respond with an ACK immediately, whatever its circumstances, so FIN_WAIT_1 is usually hard to observe, whereas FIN_WAIT_2 can often be seen with netstat. (Active side)
FIN_WAIT_2: This state was explained above. A socket in FIN_WAIT_2 is half-closed: one side has asked to close the connection, but the other side has told it (along with the ACK), "I still have some data to send you; I will close the connection a little later." (Active side)
CLOSE_WAIT: This state means waiting to close. How should that be understood? When the other side closes its socket and sends you a FIN segment, your system will certainly respond with an ACK, and the socket then enters the CLOSE_WAIT state. What you then need to consider is whether you still have data to send to the other side; if not, you can close the socket and send a FIN segment to the other side, that is, close the connection. So in the CLOSE_WAIT state, the connection is waiting for you to close it. (Passive side)
LAST_ACK: This state is fairly easy to understand: the passive closer has sent its FIN segment and is waiting for the other side's final ACK. Once the ACK is received, the socket enters the CLOSED state. (Passive side)
TIME_WAIT: The socket has received the other side's FIN segment and sent the ACK; after waiting 2MSL it returns to the CLOSED state. If a socket in the FIN_WAIT_1 state receives a segment carrying both the FIN and ACK flags, it can enter TIME_WAIT directly without passing through FIN_WAIT_2. (Active side)
CLOSED: Indicates that the connection has been closed.
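
The state changes listed above can be collected into a small lookup table. This Python sketch is an illustration, not a complete TCP state machine: the event names are invented labels, and only the transitions described in this article are included (simultaneous close, for instance, is left out).

```python
# (current state, event) -> next state
TRANSITIONS = {
    # Active side: the host that closes first.
    ("ESTABLISHED", "close"): "FIN_WAIT_1",        # sends FIN
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv FIN+ACK"): "TIME_WAIT",   # shortcut, skips FIN_WAIT_2
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",       # sends the final ACK
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Passive side: the host that receives the first FIN.
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",     # sends ACK
    ("CLOSE_WAIT", "close"): "LAST_ACK",           # sends its own FIN
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(state: str, events: list) -> str:
    """Apply the events in order and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError marks an unmodelled step
    return state


print(run("ESTABLISHED", ["close", "recv ACK", "recv FIN", "2MSL timeout"]))
print(run("ESTABLISHED", ["recv FIN", "close", "recv ACK"]))
```

Both calls walk one side of a normal close, the active side first and then the passive side, and each ends in CLOSED.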
A brief analysis of TCP's three-way handshake and four-way breakup