Key points and difficulties of the TCP protocol (1)
1. Network protocol design
ISO proposed the OSI layered network model, which remains largely theoretical. TCP/IP is the layered protocol model that was actually implemented: each layer corresponds to a set of network protocols that together perform a specific set of functions, and the protocols of a lower layer are reused by the protocols above them. That reuse is the essence of the layered model. In the end, all of this logic is encoded as signals on cables or in electromagnetic waves.
The layered model is easy to understand, but designing the protocols for each layer is not so easy. The beauty of TCP/IP is that the higher the layer, the more complex the protocol. We define a network as a set of devices connected to one another, and the essence of a network is "end-to-end" communication. However, devices that want to communicate do not have to be directly connected, so some intermediate devices must be responsible for forwarding data. The protocol that runs over the cables connecting these intermediate devices is defined as the link layer protocol. A link, in fact, starts at one device and ends at another device across a single wire. We call one link a "hop", so an end-to-end path consists of many hops.
2. TCP and IP protocols
With the IP protocol in place, we can already carry out end-to-end communication. Why, then, do we still need TCP? This is a real question, and once it is answered we can understand why TCP has become so "complex" while IP remains so simple.
As the name suggests, TCP controls transmission, specifically end-to-end transmission. Why is this control not implemented in the IP protocol itself? The answer is simple: doing so would increase the complexity of IP, and what IP needs is simplicity. Why is that?
First, consider why IP sits at the waist of the hourglass. Beneath it lies a wide variety of link layer protocols, and these links offer completely different semantics from one another. To interconnect such heterogeneous networks, we need a network layer protocol that provides at least some adaptation. At the same time, it must not offer too many "guaranteed services", because a guarantee made by an upper layer depends on a stricter guarantee from the layer below: you can never implement, over a link with a given throughput, an IP protocol that guarantees a higher throughput.
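The throughput bound described above can be illustrated with a small sketch. The function name and the per-hop figures are hypothetical, chosen only for illustration: whatever runs on top of a path can guarantee at most the throughput of the slowest link on that path.

```python
def end_to_end_throughput_bound(link_throughputs_mbps):
    """Upper bound on the throughput any layer above these links can guarantee."""
    if not link_throughputs_mbps:
        raise ValueError("a path needs at least one link")
    # The slowest hop limits the whole path.
    return min(link_throughputs_mbps)


# Hypothetical per-hop link throughputs in Mbit/s.
path = [100.0, 10.0, 1000.0]
bound = end_to_end_throughput_bound(path)
print(bound)
```

No protocol above these three links could promise more than the value printed here, no matter how it is designed.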
IP is designed as a packet forwarding protocol, and every hop passes through an intermediate node. Routing is another major innovation of TCP/IP networks. With routing, IP itself needs no notion of direction, and routing information is no longer tightly coupled to the protocol; the two are tied together only by IP addresses. This makes IP simpler still. As an intermediate node, a router cannot be too complex, because complexity means cost. The router is therefore responsible only for routing and packet forwarding.
Transmission control must therefore be implemented at the endpoints. Before going into the details of TCP, let us first look at what it cannot do. Because IP provides no guarantees, TCP cannot provide any guarantee that depends on the links underneath IP, such as bandwidth or latency; these are determined by the link layer. What IP cannot fix, TCP cannot fix either. What TCP can do is remedy some of the "unguaranteed properties" that originate at the IP layer itself: the IP layer is unreliable, it does not preserve order, and it has no direction and no connections.
To summarize: from the bottom of the TCP/IP model upward, each layer offers more functionality and has to be implemented on fewer devices, while the complexity of those devices increases. This arrangement minimizes cost; performance and similar factors are left to software, and TCP is exactly such software. In fact, TCP originally did not take performance, efficiency, or fairness into account, and it is precisely the later consideration of these that made TCP complicated.
3. TCP protocol
TCP is a purely software protocol. Why is it designed to run at the two endpoints? See the previous section for the reasons. This section describes the TCP protocol in detail, with some brief discussion along the way.
3.1. TCP
Specifically, TCP has two identities. As a network protocol, it makes up for the shortcomings of IP's best-effort service by providing connections, reliable transmission, and in-order delivery of packets. As host software, it works together with UDP and the other transport layer protocols beside it to isolate host services from the network. In this role they can be viewed as a multiplexer/demultiplexer that multiplexes the data of host processes onto the IP layer and demultiplexes it on the way back.
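The multiplexing/demultiplexing role can be sketched as follows. This is a toy model, not how any real stack is implemented: the class `TransportDemux` and its methods are hypothetical names, and it assumes that a (protocol, port) pair is enough to pick the receiving process.

```python
class TransportDemux:
    """Toy demultiplexer: hands payloads arriving from the IP layer to per-port queues."""

    def __init__(self):
        self._queues = {}  # (protocol, local_port) -> list of payloads

    def bind(self, protocol, port):
        """Register a host process on a port and return its receive queue."""
        key = (protocol, port)
        if key in self._queues:
            raise ValueError(f"{protocol} port {port} is already bound")
        self._queues[key] = []
        return self._queues[key]

    def deliver(self, protocol, dst_port, payload):
        """Pass a payload up to the bound process; False means nobody is listening."""
        queue = self._queues.get((protocol, dst_port))
        if queue is None:
            return False
        queue.append(payload)
        return True


demux = TransportDemux()
web = demux.bind("tcp", 80)
dns = demux.bind("udp", 53)
demux.deliver("tcp", 80, b"GET /")
demux.deliver("udp", 53, b"query")
dropped = not demux.deliver("tcp", 8080, b"nobody listening")
print(web, dns, dropped)
```

Many processes share one IP layer, and the port number carried by the transport header is what separates their traffic again on arrival.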
From either angle, TCP exists as an interface. As a network protocol, it interfaces with the peer TCP to implement TCP's control logic. As a multiplexer/demultiplexer, it interfaces with the IP protocol below to implement the protocol stack's functions. This matches the basic definition of the layered network protocol model, which has two kinds of interfaces: those with the layer below and those with the peer layer.
We are used to treating TCP as the top of the protocol stack rather than counting application layer protocols as part of the stack. One reason is that, once the application layer is multiplexed over TCP/UDP, it becomes extremely varied: each application layer protocol is interpreted in its own way, and application layer protocols are customarily encoded in a manner similar to the ASN.1 standard. This underlines the importance of TCP as a multiplexer/demultiplexer. Because TCP interfaces directly with applications, applications can easily control it directly to implement different transmission control policies, which is one of the reasons TCP is designed to sit not too far from the application.
In short, TCP has four key points: connections, reliable transmission, in-order data arrival, and end-to-end flow control. Note that TCP was originally designed to guarantee only these four points. At that stage it had some problems but was still very simple. Bigger problems soon appeared, however, and TCP had to start considering matters related to the IP network itself, such as fairness and efficiency, which led to the addition of congestion control. That is how TCP became what it is today.
3.2. TCP: connections, reliable transmission, and in-order data arrival
IP has no notion of direction; whether data reaches the peer depends entirely on routing. Data travels toward the peer one hop at a time, and if any hop has no route to the peer, the transmission fails. Routing is in fact one of the cores of the Internet, and the core functions provided by the IP layer come down to two: address management and routing. TCP simply uses IP's routing function and does not have to consider routing itself, which is another reason it is designed as an end-to-end protocol.
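Hop-by-hop delivery can be sketched with per-node next-hop tables. The node names, the table layout, and the `max_hops` limit are assumptions made for this illustration; real routers match address prefixes rather than exact destination names.

```python
def forward(tables, source, destination, max_hops=16):
    """Follow next-hop entries from source to destination.

    Returns the list of nodes visited, or None if some hop has no route
    (or the hop limit is exceeded, for example because of a routing loop).
    """
    path = [source]
    node = source
    while node != destination:
        next_hop = tables.get(node, {}).get(destination)
        if next_hop is None or len(path) > max_hops:
            return None
        path.append(next_hop)
        node = next_hop
    return path


# Each node only knows the next hop toward D, not the whole path.
tables = {
    "A": {"D": "R1"},
    "R1": {"D": "R2"},
    "R2": {"D": "D"},
}
good_path = forward(tables, "A", "D")

# R1 has lost its route to D, so delivery fails at that hop.
broken = {"A": {"D": "R1"}, "R1": {}}
failed_path = forward(broken, "A", "D")
print(good_path, failed_path)
```

The endpoints never appear in this logic except as addresses, which is why a protocol running only at the endpoints does not need to deal with routing.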
Since the IP layer already does its best to get individual packets to the peer, TCP can implement other, stricter control functions on top of this best-effort network. TCP adds connections to communication over the connectionless IP network, acknowledges the status of data that has been sent, and guarantees the order of the data.
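How acknowledgment and ordering can be layered on top of best-effort delivery is sketched below. This is a simplified model under stated assumptions: the class `InOrderReceiver` is hypothetical, segments never partially overlap, and sequence number wraparound is ignored.

```python
class InOrderReceiver:
    """Toy receiver: buffers out-of-order segments and returns cumulative ACKs."""

    def __init__(self, initial_seq):
        self.next_expected = initial_seq  # sequence number of the next byte wanted
        self.out_of_order = {}            # seq -> bytes waiting for a gap to fill
        self.delivered = bytearray()      # bytes handed to the application, in order

    def receive(self, seq, data):
        """Accept one segment and return the acknowledgment number to send back."""
        if data and seq >= self.next_expected:
            self.out_of_order.setdefault(seq, data)
        # Deliver every segment that is now contiguous with what was delivered.
        while self.next_expected in self.out_of_order:
            chunk = self.out_of_order.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected


rx = InOrderReceiver(initial_seq=1000)
acks = [
    rx.receive(1005, b"world"),  # arrives early, is buffered
    rx.receive(1000, b"hello"),  # fills the gap, both chunks are delivered
    rx.receive(1000, b"hello"),  # duplicate, ignored but acknowledged again
]
print(acks, bytes(rx.delivered))
```

The sender can use the returned acknowledgment numbers to decide what still needs retransmitting, which is where reliability comes from in this model.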
3.2.1. Connections
This is the foundation of TCP, because the reliability and ordering of the subsequent transmission both depend on a connection, and a connection is the simplest way to implement them. For the same reason TCP is designed as a stream-based protocol: since a connection must be established in advance, it does not matter how much data is transmitted afterwards, as long as data belonging to the same connection can be identified.
● FAQ: three-way handshake and four-way teardown
TCP uses a three-way handshake to establish a connection. The handshake initializes the information needed for transmission reliability and data ordering, namely the initial sequence numbers in both directions; the acknowledgment numbers are derived from these initial sequence numbers. Three segments are used because three are exactly enough to prepare this information. The third segment of the handshake does not need to be sent on its own; it can be sent together with data.
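The way the acknowledgment numbers follow from the initial sequence numbers can be sketched as below. The function and the dictionary layout are hypothetical, and the sketch leaves out options, windows, and retransmission. The one protocol fact it relies on is that a SYN consumes one sequence number, so each side acknowledges the peer's initial sequence number plus one.

```python
import random

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as simple dictionaries."""
    if client_isn is None:
        client_isn = random.randrange(SEQ_MODULUS)
    if server_isn is None:
        server_isn = random.randrange(SEQ_MODULUS)

    # 1. The client announces its initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn, "ack": None}
    # 2. The server announces its own and acknowledges the client's.
    syn_ack = {
        "flags": "SYN,ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MODULUS,
    }
    # 3. The client acknowledges the server's; this segment may also carry data.
    ack = {
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_MODULUS,
        "ack": (server_isn + 1) % SEQ_MODULUS,
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
for segment in segments:
    print(segment)
```

After the third segment, both sides know both starting sequence numbers, which is all the state that reliability and ordering need to begin.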
Why does TCP need a four-way teardown to remove a connection? Because TCP is a full-duplex protocol, and each direction must be shut down separately. Note that the four-way teardown and the three-way handshake serve different purposes. Many people ask why establishing a connection takes three segments while removing it takes four.
The purpose of the three-way handshake is to allocate resources and initialize sequence numbers. No data transmission is involved at that point, so three segments are enough. The purpose of the four-way teardown is to terminate data transmission and reclaim resources. At that point the sequence numbers of the two endpoints are no longer related to each other, and the virtual link can be removed only after both ends have no more data to transmit. This is not as simple as initialization, where on seeing the SYN flag an endpoint initializes its own sequence number and acknowledges the peer's SYN sequence number. Data transmission must therefore be terminated separately in each direction.
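Closing each direction separately can be sketched with a toy model. The `Endpoint` class is hypothetical, and the sketch deliberately ignores timers, retransmission, and the case where a FIN is combined with an ACK. It only shows that one side can finish sending while the other side still has data to deliver.

```python
class Endpoint:
    """One side of a full-duplex connection, tracking only its sending direction."""

    def __init__(self, name):
        self.name = name
        self.can_send = True  # True until this side has sent its FIN

    def send(self, peer, data):
        if not self.can_send:
            raise RuntimeError(f"{self.name} has already sent FIN")
        return f"{self.name}->{peer.name} DATA {len(data)} bytes"

    def close_sending(self, peer):
        """Close only this side's direction: our FIN plus the peer's ACK."""
        self.can_send = False
        return [f"{self.name}->{peer.name} FIN", f"{peer.name}->{self.name} ACK"]


a, b = Endpoint("A"), Endpoint("B")
log = []
log += a.close_sending(b)       # segments 1 and 2: the A->B direction is closed
log.append(b.send(a, b"tail"))  # the B->A direction is still open
log += b.close_sending(a)       # segments 3 and 4: the B->A direction is closed

control_segments = [entry for entry in log if "DATA" not in entry]
fully_closed = not (a.can_send or b.can_send)
for entry in log:
    print(entry)
```

The data segment between the two FIN/ACK pairs is the reason the teardown cannot in general be compressed into three segments the way the handshake is.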