This reference article provides background for the topics covered in later articles of the series. It is kept here as a backup and for study and discussion.
The transport layer has two main protocols: TCP and UDP.
1 UDP Introduction
UDP is a transport layer protocol and sits at the same layer as TCP. Unlike TCP, however, UDP provides no timeout retransmission, no error retransmission, and no similar mechanisms. In other words, it is an unreliable protocol.
1.1 UDP protocol header
1.2 UDP port number
Because many programs use UDP, the protocol needs an identifier that tells the system which program a given packet is meant for. That is the job of the port number. For example, if UDP program A registers port 3000, then any UDP packet arriving from outside with destination port 3000 is handed to that program. The port field is 16 bits long, so in theory there can be 2^16 (65536) port numbers.
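As a rough illustration of how the destination port selects a program, the sketch below unpacks the 8-byte UDP header defined in RFC 768 (source port, destination port, length, checksum, each 16 bits) and looks the destination port up in a table. The `registered_programs` table and the program names are hypothetical and only stand in for what the operating system does internally.

```python
import struct

# Hypothetical registration table: port number -> program name.
registered_programs = {3000: "program A"}

def parse_udp_header(datagram: bytes) -> dict:
    """Unpack the four 16-bit fields of a UDP header (network byte order)."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,      # header plus data, in bytes
        "checksum": checksum,
    }

def deliver(datagram: bytes):
    """Return the program registered on the destination port, or None."""
    header = parse_udp_header(datagram)
    return registered_programs.get(header["dst_port"])
```

A datagram whose destination port is 3000 is delivered to "program A"; one addressed to an unregistered port finds no receiver.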
1.3 UDP checksum
The checksum is optional: not every system checksums UDP datagrams (for TCP it is mandatory), but the RFC standard requires that the sender should compute it.
The UDP checksum covers both the UDP header and the UDP data. This differs from the IP checksum, which covers only the IP header and none of the data. Both UDP and TCP include a pseudo-header that exists only for computing the checksum. The pseudo-header even includes IP address information from the IP header, so that UDP can double-check that the data has arrived at the correct destination. If the sender computed a checksum and the receiver detects a checksum error, the UDP datagram is silently discarded (delivery is not guaranteed) and no error message is generated. If the sender did not compute a checksum, the field is sent as 0 and the receiver has nothing to verify.
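The checksum calculation can be sketched as follows for IPv4. This is a minimal illustration, not a production implementation: the function names are made up for this example, and the pseudo-header layout (source address, destination address, a zero byte, protocol number 17, UDP length) follows RFC 768.

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum; odd-length input is padded with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def _pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    # The pseudo-header is used only for the calculation and is never transmitted.
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum field = 0
    total = ones_complement_sum(_pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF               # 0 is reserved for "no checksum computed"

def checksum_is_valid(src_ip, dst_ip, src_port, dst_port, payload: bytes, checksum: int) -> bool:
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    total = ones_complement_sum(_pseudo_header(src_ip, dst_ip, length) + header + payload)
    return total == 0xFFFF                  # a correct datagram sums to all ones
```

Because the IP addresses are part of the sum, a datagram that reaches the wrong host fails verification even if its own bytes are intact.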
1.4 UDP length
A UDP datagram can be very long: the 16-bit length field allows up to 65535 bytes. In practice, a network usually cannot carry a packet that long in one piece (this is where the MTU comes in), so the data has to be fragmented. Fragmentation is transparent to UDP and other upper-layer protocols; UDP does not need to care how the IP layer fragments the data. The next section briefly discusses the fragmentation strategy.
1.5 IP fragmentation
After IP receives data from the upper layer, it uses the destination IP address to decide (through routing) which interface the data leaves from, and then looks up that interface's MTU. If the data is larger than the MTU, it is fragmented. Fragmentation is transparent to both the upper and lower layers, and the data is reassembled only at the destination. The IP layer carries enough information for that reassembly.
In the IP header, the 16-bit identification field uniquely identifies an IP datagram, and fragments that carry the same identification are reassembled together. The 13-bit fragment offset records where the fragment sits within the whole datagram, and the 3-bit flags field between the two indicates whether more fragments follow this one. These three fields make up all the information about an IP fragment, and the receiver can use them to reassemble the IP data even when later fragments arrive before earlier ones.
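The fragment bookkeeping described above can be sketched as below. This is a simplified model under stated assumptions: fragments are plain dictionaries rather than real IP packets, the IP header is taken to be 20 bytes with no options, and the function names are invented for this example. The offset is counted in units of 8 bytes, as in the real IPv4 header.

```python
def fragment(payload: bytes, mtu: int, ident: int, ip_header_len: int = 20) -> list:
    """Split a payload into fragments that fit the MTU."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - ip_header_len) // 8 * 8
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        fragments.append({
            "id": ident,                     # same identification on every fragment
            "offset": start // 8,            # position in 8-byte units
            "more_fragments": start + len(chunk) < len(payload),
            "data": chunk,
        })
    return fragments

def reassemble(fragments: list) -> bytes:
    """Rebuild the payload from fragments received in any order."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    data = b""
    for frag in ordered:
        if frag["offset"] * 8 != len(data):
            raise ValueError("a fragment is missing")
        data += frag["data"]
    if ordered and ordered[-1]["more_fragments"]:
        raise ValueError("the last fragment has not arrived")
    return data
```

With a 1500-byte MTU, a 3000-byte payload becomes three fragments of 1480, 1480 and 40 data bytes, and `reassemble` restores it even if the fragments are passed in reverse order.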
Because fragmentation is so common on networks, forged IP fragments are a recurring tool for malicious attacks, and there is no shortage of software and people producing them.
You can use the traceroute program to perform a simple MTU discovery. Please refer to the textbook.
1.6 The interaction between UDP and ARP
This is a detail that often goes unnoticed, and it depends on the system's implementation. When the ARP cache is still empty, an ARP request must be sent to obtain the destination host's MAC address before the UDP datagram can go out. If the UDP datagram is large enough that the IP layer has to fragment it, one might imagine that the first fragment triggers an ARP request and all fragments wait until the query completes before being sent. Is this actually the case?
In practice, some systems send an ARP request for every fragment. All of the fragments wait, but when the first reply arrives the host sends only the last fragment and discards the others, which is surprising. Because the datagram can then never be reassembled, the receiving host discards the incomplete fragments after a timeout and sends an ICMP reassembly-timeout error (in fact many systems do not generate this error). This keeps the host's receive buffer from filling up with fragments that will never be assembled.
1.5 ICMP source quench error
When the target host cannot process data as fast as it arrives and its IP layer buffer is full, the host sends an ICMP source quench message, which says in effect "I can't take any more".
1.7 UDP Server Design
Some features of UDP affect how a server program is designed. They can be broadly summarized as follows:
- Client IP address and port: the server must be able to decide whether a packet is legitimate based on the client's IP address and port number (something every server seems to need).
- Destination address: the server must be able to filter out broadcast addresses.
- Data input: each port on the server system usually has one input buffer, and incoming data waits there in first-come, first-served order until the server processes it. Buffer overflow is therefore unavoidable at times; when it happens UDP datagrams may be discarded, and the server application itself is not told.
- Local IP address: the server should be able to restrict the local IP address, that is, bind itself to a given port on a particular network interface.
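Two of the points above, binding to a specific local address and checking the client's address, can be sketched with the standard `socket` module. This is only an illustration: `ALLOWED_CLIENTS`, `make_udp_server` and `serve_once` are hypothetical names, and the loopback address is used so the example does not depend on any real network.

```python
import socket

# Hypothetical allow-list of client IP addresses.
ALLOWED_CLIENTS = {"127.0.0.1"}

def make_udp_server(local_ip: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind to one local address; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((local_ip, port))
    return sock

def serve_once(sock: socket.socket, timeout: float = 2.0):
    """Handle one datagram; return its data, or None if the client is rejected."""
    sock.settimeout(timeout)
    data, (client_ip, client_port) = sock.recvfrom(65535)
    if client_ip not in ALLOWED_CLIENTS:
        return None                      # drop packets from unknown clients
    sock.sendto(data.upper(), (client_ip, client_port))
    return data
```

The kernel's receive buffer plays the role of the per-port input queue: datagrams that arrive while the program is busy wait there, and are dropped without notice once the buffer is full.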
2 Introduction to TCP protocol
TCP is a connection-oriented protocol, so the two sides must establish a connection before either sends data. This is completely different from the protocols covered so far, which simply send data and mostly do not care whether it arrives. UDP is the clearest case, and from a programming point of view UDP is also much simpler, because it does not have to deal with splitting up the data. Establishing a TCP connection is commonly called the three-way handshake, and terminating one is called the four-way handshake.
The biggest difference between TCP and UDP is that TCP provides a reliable data transmission service. Because TCP is connection-oriented, two hosts that communicate over TCP first go through a "call setup" process, start transmitting only once both ends are ready, and finally end the call. That is why TCP is more reliable than UDP: UDP sends the data directly, regardless of whether the other side receives it, and UDP itself makes no attempt to detect or recover from a failed delivery, a point already made several times.
2.1 Establishment of the connection
To establish a connection, the client first asks the server to open a connection to a port by sending a TCP segment with the SYN flag set to 1. The server replies with a segment that acknowledges the request (ACK) and carries its own SYN. The client then acknowledges the server's segment with an ACK of its own, and at that point the connection is established. This is the three-way handshake: getting both sides ready takes three segments, and three are enough. Add TCP's timeout and retransmission mechanism, and TCP can make sure that a segment reaches its destination.
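The exchange can be modelled in a few lines. The sketch below is a simulation only, with no real sockets: the `Segment` class and `three_way_handshake` function are invented for illustration, and the initial sequence numbers are arbitrary. It shows the detail that the server's reply carries both SYN and ACK, and that each acknowledgment number is the peer's sequence number plus one.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    syn: bool
    ack: bool
    seq: int
    ack_no: int = 0    # meaningful only when the ACK flag is set

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged while opening a connection."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment(syn=True, ack=False, seq=client_isn)
    # 2. Server -> client: SYN + ACK; a SYN consumes one sequence number.
    syn_ack = Segment(syn=True, ack=True, seq=server_isn, ack_no=syn.seq + 1)
    # 3. Client -> server: ACK of the server's SYN.
    ack = Segment(syn=False, ack=True, seq=syn.seq + 1, ack_no=syn_ack.seq + 1)
    return [syn, syn_ack, ack]
```

For example, with a client initial sequence number of 100 and a server initial sequence number of 300, the server acknowledges 101 and the client acknowledges 301.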
2.2 End Connection
TCP has a special concept called half-close. A TCP connection is full-duplex (it can send and receive at the same time), so closing it means closing each of the two directions separately. The client sends the server a TCP segment with the FIN flag set to 1, and the server returns an ACK to the client. The server then sends its own FIN, and once the client replies with an ACK (four segments in total, hence the four-way handshake), the connection is over.
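A half-close can be observed with ordinary sockets. The sketch below is one possible demonstration over the loopback interface, with both ends driven from a single function for brevity; `half_close_demo` is a made-up name. `shutdown(SHUT_WR)` closes only the client's sending direction, so the client can still read the server's reply.

```python
import socket

def half_close_demo(message: bytes) -> bytes:
    """Send a message, half-close, and return what the server sends back."""
    listener = socket.create_server(("127.0.0.1", 0))
    client = socket.create_connection(listener.getsockname(), timeout=2)
    server_side, _ = listener.accept()
    server_side.settimeout(2)
    try:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)      # client FIN: "I have nothing more to send"

        received = b""
        while chunk := server_side.recv(4096):   # recv() returns b"" after the FIN
            received += chunk

        # The other direction is still open, so the server can answer.
        server_side.sendall(b"got %d bytes" % len(received))
        server_side.close()                  # server FIN

        reply = b""
        while chunk := client.recv(4096):
            reply += chunk
        return reply
    finally:
        client.close()
        server_side.close()
        listener.close()
```

The message is small enough to fit in the socket buffers, which is why this single-threaded version does not block; a real server would handle the connection in its own thread or process.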
2.3 Maximum segment size
When the connection is established, each side must tell the other its maximum segment size (MSS) before they can communicate. The MSS is generally the MTU minus the fixed IP header length and the TCP header length. On Ethernet it can generally reach 1460 bytes. For a non-local IP address, the MSS may be only 536 bytes, and if a network along the path supports an even smaller size, the value becomes smaller still.
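The arithmetic behind those two figures is shown below, assuming the minimum header sizes of 20 bytes each for IPv4 and TCP (no options). The 536-byte value corresponds to the traditional 576-byte default datagram size; the function names are illustrative only.

```python
IP_HEADER_LEN = 20    # fixed IPv4 header, no options
TCP_HEADER_LEN = 20   # fixed TCP header, no options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one IP datagram of the given MTU."""
    return mtu - IP_HEADER_LEN - TCP_HEADER_LEN

def usable_mss(local_mss: int, peer_mss: int) -> int:
    """A sender must not exceed the MSS announced by its peer."""
    return min(local_mss, peer_mss)

ethernet_mss = mss_for_mtu(1500)   # 1500 - 20 - 20 = 1460
default_mss = mss_for_mtu(576)     # 576 - 20 - 20 = 536
```

If one end sits on Ethernet and the other announces 536, segments in that direction are limited to 536 bytes of data.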
2.4 Principle of reliability
- Application data is split into blocks of the size TCP considers best to send. This is completely different from UDP, where the length of each datagram produced by the application stays unchanged. The unit of information that TCP passes to IP is called a message segment, or simply a segment (see Figure 1-7). Section 18.4 shows how TCP determines the segment length.
- When TCP sends a segment, it starts a timer and waits for the destination to acknowledge receipt. If an acknowledgment is not received in time, the segment is retransmitted. Chapter 21 covers the adaptive timeout and retransmission strategy in TCP.
- When TCP receives data from the other end of the connection, it sends an acknowledgment. The acknowledgment is not sent immediately; it is typically delayed by a fraction of a second, as discussed in Section 19.3.
- TCP maintains a checksum over its header and data. This is an end-to-end checksum intended to detect any change to the data in transit. If a segment arrives with a checksum error, TCP discards it and does not acknowledge it (expecting the sender to time out and retransmit).
- Since TCP segments are carried in IP datagrams, and IP datagrams can arrive out of order, TCP segments can arrive out of order too. If necessary, TCP reorders the received data and passes it to the application layer in the correct order.
- TCP also provides flow control. Each end of a TCP connection has a fixed-size buffer. The receiving TCP only allows the other end to send as much data as its receive buffer can hold, which prevents a fast host from overrunning the buffers of a slower host.
As this passage shows, TCP maintains reliability mainly through timeout and retransmission. This makes sense: TCP could also use various ICMP messages to handle these situations, but that is not reliable. The most reliable approach is to keep resending a segment for as long as it has not been acknowledged, until the other side confirms it.
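Besides retransmission, the list above mentions that segments may arrive out of order and must be put back in sequence. A minimal sketch of that receiver-side step follows. It is a simplification, not how any particular TCP stack is written: the `ReorderBuffer` class is hypothetical, sequence numbers are plain byte offsets, and overlapping segments and sequence-number wraparound are ignored.

```python
class ReorderBuffer:
    """Hold out-of-order segments until the gap before them is filled."""

    def __init__(self, initial_seq: int = 0):
        self.next_seq = initial_seq   # sequence number of the next byte to deliver
        self.pending = {}             # seq -> data for segments that arrived early

    def receive(self, seq: int, data: bytes) -> bytes:
        """Accept one segment and return whatever can now be delivered in order."""
        if seq >= self.next_seq:      # ignore duplicates of already delivered data
            self.pending[seq] = data
        delivered = b""
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            delivered += chunk
            self.next_seq += len(chunk)
        return delivered
```

A segment that arrives early is held back; once the missing segment shows up, both are handed to the application together and in the right order.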
Like the UDP header, the TCP header carries a source port number and a destination port number. TCP clearly has more header information than UDP, though, and the TCP header provides everything needed to send data and to acknowledge it. This is described in detail on pp. 171-173. Sending TCP data can be pictured as a process like the following.
- Both sides establish the connection
- The sender sends a TCP segment to the receiver and waits for the receiver to acknowledge it. If no acknowledgment arrives, it resends the segment; if one does, it sends the next segment.
- The receiver waits for the sender's segment. If the segment arrives and passes verification, the receiver sends an ACK (acknowledgment) and waits for the next segment, until it receives a FIN (the segment that signals the end of sending).
- The connection is closed.
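The send, wait, and resend loop in the steps above can be sketched as a small simulation. It assumes a much simpler model than real TCP: one segment in flight at a time, a "channel" that is just a Python function, and a loss pattern supplied by the caller. All names here are hypothetical.

```python
def make_lossy_channel(drop_pattern, received: list):
    """Build a channel that loses a segment whenever the pattern says True."""
    drops = iter(drop_pattern)

    def channel(seq: int, data: bytes):
        if next(drops, False):
            return None               # segment lost: the sender sees a timeout
        received.append(data)         # receiver accepts the segment...
        return seq                    # ...and acknowledges it
    return channel

def send_reliably(segments, channel, max_tries: int = 5) -> int:
    """Send segments one at a time; return the total number of transmissions."""
    transmissions = 0
    for seq, data in enumerate(segments):
        for _ in range(max_tries):
            transmissions += 1
            if channel(seq, data) == seq:     # acknowledged, move on
                break
        else:
            raise TimeoutError(f"segment {seq} was never acknowledged")
    return transmissions
```

With the loss pattern `[True, False, False, True, True, False]`, three segments need six transmissions in total, and the receiver still ends up with all three in order.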
As you can imagine, to serve each TCP connection the system may create a new process (or at the very least a thread) to transfer the data.
2.5 TCP Server Design
The earlier discussion of UDP server design shows that a UDP server has no need for a concurrency mechanism as such; a single data input queue is enough. TCP is different: a TCP server typically creates a separate process (or a lighter-weight thread) for each connection to keep the conversations independent, so a TCP server is concurrent. A TCP server also needs a queue of incoming connection requests (which a UDP server does not), from which it sets up a conversation for each request. This is why TCP servers have a maximum number of connections. Using the source host's IP address and port number, the server can easily tell the sessions apart and distribute data to the right one.
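One way to get a thread per connection is the standard library's `socketserver.ThreadingTCPServer`, sketched below. The handler, the uppercase-echo behaviour, and the helper names `start_server` and `ask` are assumptions made for this example. The `request_queue_size` attribute of the server class corresponds to the incoming connection request queue mentioned above.

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread for each accepted connection."""

    def handle(self):
        # self.client_address is the (ip, port) pair that identifies this session.
        for line in self.rfile:
            self.wfile.write(line.upper())

def start_server(host: str = "127.0.0.1", port: int = 0):
    """Start a threaded TCP server in the background; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def ask(address, message: bytes) -> bytes:
    """Open a connection, send one line, and return the reply line."""
    with socket.create_connection(address, timeout=2) as conn:
        conn.sendall(message + b"\n")
        return conn.makefile("rb").readline().rstrip(b"\n")
```

Each call to `ask(server.server_address, ...)` opens a separate connection, and the server handles it in a separate thread, so one slow client does not hold up the others.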
Android Network Programming Series: the transport layer of the TCP/IP protocol family