Author: t.c. http://blog.chinaunix.net/uid-20556054-id-3165405.html.
How does TCP ensure reliable transmission?
Acknowledgment mechanism: ensures that every packet is received.
Error checking: ensures that the contents of each packet are correct.
Flow control: ensures that the receiver's buffer does not overflow.
Sequence numbers: ensure that data is delivered in order.
1. Five features of the reliable TCP/IP transport service: stream orientation, virtual-circuit connection, buffered transfer, unstructured data stream, and full-duplex connection.
2. TCP uses positive acknowledgment with retransmission as the basis of its reliable stream service.
3. To make stream transmission more efficient, a sliding window protocol is built on top of this. It allows the sender to transmit multiple packets before waiting for an acknowledgment. Only unacknowledged packets need to be retransmitted, and the maximum number of unacknowledged packets is the window size.
4. What TCP specifies: the format of the data and acknowledgments that two computers exchange for reliable transfer, and the procedures the computers follow to ensure that data arrives correctly.
5. A TCP connection is a virtual circuit identified by a pair of endpoints. An endpoint is a pair (host, port), where host is the IP address of the host and port is a TCP port number on that host.
6. TCP uses a specialized sliding window mechanism to address both transmission efficiency and flow control. The sliding window provides end-to-end flow control, but it does not by itself solve network-wide congestion control.
7. TCP allows the window size to change at any time. The advertised window tells the sender how much data the receiver can currently accept. When the advertised value increases, the sender enlarges its sliding window; when it decreases, the sender shrinks its send window.
8. TCP segment format
The segment is divided into two parts: header and data. The header carries the necessary identification and control information.
The acknowledgment number field gives the sequence number of the next byte that the local machine expects to receive;
The sequence number field gives the position of this segment's data in the sender's byte stream, that is, the sending sequence number;
The acknowledgment number refers to the stream flowing in the opposite direction from the segment that carries it.
9. TCP uses a 6-bit code field to indicate the purpose and content of the segment.
URG: the urgent pointer field is valid; ACK: the acknowledgment field is valid; PSH: this segment requests a push; RST: reset the connection; SYN: synchronize sequence numbers; FIN: the sender has reached the end of its byte stream.
10. TCP three-way handshake
To establish a TCP connection, the two systems need to synchronize their initial sequence numbers (ISNs). Sequence numbers track the order of the communication and make it possible to detect packets lost in transit. The initial sequence number is the starting sequence number chosen when a TCP connection is established.
Synchronization is achieved by exchanging segments that carry the ISN and have the 1-bit SYN control flag set.
A handshake can be initiated by one side or by both sides at the same time.
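The sliding-window idea from items 3, 6 and 7 above can be sketched as a small simulation. This is only an illustration under simplifying assumptions: packets are numbered individually rather than by byte, acknowledgments for a whole window arrive together, and a lost packet is lost only once. The function name and parameters are made up for this sketch.

```python
def sliding_window_send(num_packets, window_size, lost_once=frozenset()):
    """Simulate a sender that keeps at most `window_size` packets unacknowledged.

    Packets listed in `lost_once` are dropped on their first transmission
    and succeed when retransmitted. Returns the ordered list of actions.
    """
    base = 0              # oldest unacknowledged packet
    next_to_send = 0      # next packet that has never been sent
    acked = set()
    already_lost = set()
    log = []
    while base < num_packets:
        # Send new packets until the window is full.
        while next_to_send < num_packets and next_to_send < base + window_size:
            log.append(("send", next_to_send))
            if next_to_send in lost_once and next_to_send not in already_lost:
                already_lost.add(next_to_send)   # lost: no acknowledgment comes back
            else:
                acked.add(next_to_send)
            next_to_send += 1
        # Timeout: retransmit only the packets that were never acknowledged.
        for seq in range(base, next_to_send):
            if seq not in acked:
                log.append(("retransmit", seq))
                acked.add(seq)
        # Slide the window past everything that is now acknowledged.
        while base in acked:
            base += 1
    return log


if __name__ == "__main__":
    for action in sliding_window_send(6, window_size=3, lost_once={1}):
        print(action)
```

With six packets, a window of three, and packet 1 lost once, the sender transmits packets 0 to 2 without waiting, retransmits only packet 1, and then moves the window on to packets 3 to 5.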
Differences between TCP and UDP:
1. Connection-oriented versus connectionless
2. Demand on system resources (TCP uses more, UDP uses less)
3. Stream mode versus datagram mode
4. TCP guarantees data correctness and ordering; UDP may lose packets and does not guarantee ordering
Connection-oriented versus connectionless:
Connection-oriented and connectionless are common terms in network transmission. The relationship between them can be illustrated with a vivid analogy: making a phone call versus writing a letter.
When making a phone call, you first dial (send a connection request), wait for the other party to pick up (establish the connection), and only then exchange information. When the conversation is over, you hang up (disconnect), which completes the call. Writing a letter is different. You only need to fill in the recipient's address and drop the letter in a mailbox, and your part is done. The post office then delivers the letter to the destination based on the address.
The two are clearly very different. For a phone call, both parties must establish a connection before information can flow, and the connection is also what makes the exchange dependable, which is why connection-oriented protocols are generally described as reliable. A connectionless protocol is much less particular: it simply sends the information, whether or not the other side responds. Once a letter goes into the mailbox, you cannot track its whereabouts before it reaches the destination, and even when the recipient gets it, nobody notifies you that it has arrived. Nothing in the whole process is guaranteed, which is why connectionless protocols are usually called unreliable. Of course, the post office does its best to deliver the letter. In 99% of cases the mail arrives safely, but occasionally it does not.
Connection-oriented protocols have a clear advantage over connectionless ones in reliability. However, the sender must wait for the receiver's response before the connection is established, must confirm during transmission that the information got through, and needs another exchange of signals when the connection is torn down; all of this adds resource overhead. Comparing the TCP and UDP headers: besides the source and destination ports, TCP also carries a sequence number, acknowledgment number, data offset, control flags (URG, ACK, PSH, RST, SYN, FIN), window, checksum, urgent pointer, options, and so on, whereas UDP carries only a length and a checksum. A UDP datagram therefore has much less header overhead than a TCP segment, which means less load and more effective use of bandwidth. This is a major reason why many instant-messaging applications use UDP.
TCP (Transmission Control Protocol) provides a connection-oriented, reliable byte-stream service. Before a client and a server can exchange data, a TCP connection must be established between them. TCP provides timeout retransmission, discarding of duplicate data, data checksumming, flow control, and other functions to ensure that data gets from one end to the other.
UDP (User Datagram Protocol) is a simple, datagram-oriented transport layer protocol. UDP does not provide reliability. It hands the application's data to the IP layer and sends it, but it cannot guarantee that the data reaches its destination. Because UDP does not establish a connection between client and server before sending a datagram, and has no timeout or retransmission mechanism, transmission is very fast.
Whether you use TCP or UDP depends on what your program cares about most: reliability or speed?
TCP and UDP are two different protocols. Simply put, TCP requires the peer to acknowledge what it receives, whereas UDP does not. TCP is therefore more dependable, while UDP is commonly used for streaming media over the network.
Generally, the transport layer protocols are TCP and UDP. TCP is reliable: the protocol itself ensures that data is delivered, but this costs considerable extra network overhead. UDP is unreliable, so its transmission efficiency is relatively high. The local end is only responsible for sending the data, and there is no guarantee that the peer receives it. With UDP, reliability can instead be implemented at the application layer.
What is the difference between "stream mode" and "datagram mode" in programming?
1. TCP
Consider TCP first. Suppose you have a water tank at home. You can pour water into it, and the tank has a faucet through which you can draw the water out into all sorts of containers (cups, water bottles, pots and pans).
In this example, the number of times you pour water in is unrelated to the number of times you draw water out. You can pour once and draw ten times. Drawing water lowers the level in the tank and pouring raises it, but the level cannot exceed the tank's capacity; any excess overflows.
Mapping this onto TCP, the tank is the receive buffer, pouring water is sending data, and drawing water is reading data. For example, if you send data to the peer over a TCP connection, you may call write once with 100 bytes while the peer reads it 10 bytes at a time; or you may call write 10 times with 10 bytes each while the peer reads everything at once (assuming all the data arrives). However, the amount of unread data cannot exceed the peer's receive buffer (flow control). If you try to send more, then once the peer's buffer is full the sender has to wait: the advertised window drops to zero and further writes block (or are only partially accepted) until the receiver reads some data. Unlike the overflowing tank, TCP does not throw the excess away.
2. UDP
Unlike TCP, with UDP the receiver must call read as many times as the sender called write. UDP is packet-based: each receive returns at most one datagram, and datagrams are never merged. If the receive buffer is smaller than the datagram, the excess is discarded. In other words, unless the MSG_PEEK flag is specified, every read consumes exactly one datagram.
3. Why?
This difference follows from the characteristics of TCP and UDP. TCP is connection-oriented: for the lifetime of the connection, all data received on the socket comes from the same host (ignoring hijacking). The receiver therefore only needs to know how much data to read each time.
UDP is a connectionless protocol: any host can send data to the receiver as long as it knows the receiver's IP address and port and the network is reachable. If a single read could return data from more than one datagram, the result would be a mess. For example, suppose host A sends packet P1 and host B sends packet P2. If one read could span more than one packet, the data of P1 and P2 would be merged, and the result would be meaningless.
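The difference between the two modes can be tried out with a short script. This is a sketch under one important assumption: it uses local Unix-domain socket pairs (`socket.socketpair`) as stand-ins, with `SOCK_STREAM` playing the role of TCP and `SOCK_DGRAM` the role of UDP, so that it runs on a Unix-like system without any network setup. The function names are made up for the example.

```python
import socket


def stream_demo():
    """Ten 10-byte writes on a stream socket, read back in chunks of up to 25 bytes."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        for i in range(10):
            a.sendall(bytes([i]) * 10)      # write boundaries are not preserved
        a.shutdown(socket.SHUT_WR)          # tell the reader that no more data follows
        chunks = []
        while True:
            chunk = b.recv(25)
            if not chunk:                   # empty result means end of stream
                break
            chunks.append(chunk)
        return chunks


def datagram_demo():
    """Two datagrams; the first is read into a buffer that is too small."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with a, b:
        a.send(b"first message")
        a.send(b"second")
        truncated = b.recv(5)     # one datagram is consumed, the excess is discarded
        whole = b.recv(100)       # the next read returns the next datagram
        return [truncated, whole]


if __name__ == "__main__":
    chunks = stream_demo()
    print(len(chunks), "reads,", sum(len(c) for c in chunks), "bytes in total")
    print(datagram_demo())
```

On the stream socket the number of reads has nothing to do with the number of writes; only the total byte count and the byte order are preserved. On the datagram socket each read consumes exactly one message, even when the buffer is too small to hold it.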
TCP handshake protocol
In TCP/IP, TCP provides a reliable connection service and uses a three-way handshake to establish a connection.
First handshake: to establish a connection, the client sends a SYN packet (seq = J) to the server, enters the SYN_SENT state, and waits for the server's acknowledgment.
Second handshake: on receiving the SYN packet, the server must acknowledge the client's SYN (ack = J + 1) and also send its own SYN (seq = K), that is, a SYN+ACK packet. The server then enters the SYN_RECV state.
Third handshake: the client receives the server's SYN+ACK packet and sends an ACK packet (ack = K + 1) to the server. Once this packet is sent, the client and server enter the ESTABLISHED state and the three-way handshake is complete.
After the three-way handshake, the client and the server start to transmit data. Some important concepts are involved in the process above:
Half-open connection queue: during the three-way handshake, the server maintains a queue of half-open connections. It creates an entry for each client SYN packet (seq = J); the entry indicates that the server has received the SYN and sent its acknowledgment to the client, and is now waiting for the client's ACK. The connections identified by these entries are in the SYN_RECV state on the server. When the server receives the client's ACK, it deletes the entry and the connection enters the ESTABLISHED state.
Backlog parameter: the maximum number of entries in the half-open connection queue.
SYN-ACK retransmission count: after the server sends the SYN-ACK, if it does not receive the client's ACK it retransmits a first time; if after waiting a while it still has not received the ACK, it retransmits a second time. If the number of retransmissions exceeds the maximum specified by the system, the system deletes the connection from the half-open connection queue. Note that the wait before each retransmission is not necessarily the same.
Half-open connection lifetime: the maximum time an entry can remain in the half-open connection queue, that is, the maximum time from when the server receives the SYN until it declares the attempt invalid. It is the sum of the maximum waiting times of all the retransmissions. The half-open connection lifetime is also called the timeout or the SYN_RECV lifetime.
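The three handshake steps and the server-side queue of not-yet-completed connections can be modelled in a few lines. This is a toy model, not how any real protocol stack is written: the class and method names are invented, every connection reuses the same server sequence number K for simplicity, and retransmission timers are left out.

```python
class ToyServer:
    """Tracks connections that have sent a SYN but not yet the final ACK."""

    def __init__(self, backlog, isn):
        self.backlog = backlog      # maximum number of pending entries
        self.isn = isn              # K, the server's initial sequence number
        self.pending = {}           # client id -> acknowledgment number we expect
        self.established = set()

    def on_syn(self, client, j):
        """Second handshake: answer SYN (seq = J) with SYN+ACK (seq = K, ack = J + 1)."""
        if len(self.pending) >= self.backlog:
            return None             # queue is full, the SYN is dropped
        self.pending[client] = self.isn + 1
        return {"seq": self.isn, "ack": j + 1}

    def on_ack(self, client, ack):
        """Third handshake: ack = K + 1 moves the connection to established."""
        if self.pending.get(client) == ack:
            del self.pending[client]
            self.established.add(client)
            return True
        return False


if __name__ == "__main__":
    server = ToyServer(backlog=2, isn=5000)
    syn_ack = server.on_syn("client-1", j=100)
    print(syn_ack)
    print(server.on_ack("client-1", ack=syn_ack["seq"] + 1))
```

A client that never sends the final ACK keeps occupying a pending entry, which is why the queue limit, the retransmission count and the entry lifetime described above matter.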
================================================================================
Now let's walk through the complete process: how the connect system call establishes a connection to the peer on a TCP socket. As an example, the test host 172.16.48.2 initiates a connection request to port 5002 of 172.16.48.1.
Step 1: 172.16.48.2 initiates a connection request to 172.16.48.1 by sending a SYN segment. The segment specifies destination port 5002, advertises the initial sequence number (ISN, a 32-bit random number generated by the protocol stack), sets the acknowledgment number to 0 (because no data has been received from the peer yet), advertises its own window size as 5840 (the peer advertises 5792, which looks odd and remains to be checked), gives a window scale factor of 2 (in the header options), and advertises a maximum segment size of 1460 (local area network). The data content is shown below (with the link-layer Ethernet header and the network-layer IP header omitted):
Data content                     Meaning
Basic header
80 0e                            source port (32782)
13 8a                            destination port (5002)
00 00 07 bc                      initial sequence number (ISN)
00 00 00 00                      acknowledgment number
a                                header length (10*4 = 40, with 20-byte options)
0 02                             flags, SYN = 1
16 d0                            window size (5840)
64 9e                            checksum
00 00                            urgent pointer
TCP options
02 04 05 b4                      maximum segment size (1460)
04 02                            SACK permitted
08 0a 00 0a 79 14 00 00 00 00    timestamp (0x000a7914), echo timestamp (0)
01                               placeholder (NOP)
03 03 02                         window scale factor (2)
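The fixed 20-byte part of this SYN header can be decoded with Python's `struct` module. The sketch below takes its bytes from the rows above (the nibble `a` and the following `0 02` form the two bytes `a0 02`); the function name and the dictionary keys are choices made for this example.

```python
import struct

# Basic header of the SYN segment from step 1 (options not included).
SYN_HEADER = bytes.fromhex("80 0e 13 8a 00 00 07 bc 00 00 00 00 a0 02 16 d0 64 9e 00 00")

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 through bit 5


def parse_tcp_header(raw):
    """Decode the fixed 20-byte TCP header (network byte order)."""
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    flag_bits = off_flags & 0x3F                    # low 6 bits are the control flags
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,        # top 4 bits count 32-bit words
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


if __name__ == "__main__":
    for key, value in parse_tcp_header(SYN_HEADER).items():
        print(key, value)
```

Decoding gives source port 32782, destination port 5002, a header length of 40 bytes, only the SYN flag set, and a window of 5840, matching the annotations in the table.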
Step 2: when 172.16.48.1 receives the request, it checks the flags and finds SYN = 1, so it treats the packet as a connection request. It acknowledges this SYN and sends its own SYN in the same segment (ACK, SYN). Because a SYN consumes one sequence number (as does a FIN), the acknowledgment number is set to the ISN of 172.16.48.2 plus 1; that is, 172.16.48.1 expects the next segment from 172.16.48.2 to start at sequence number 0x07bd. At the same time it advertises its own initial sequence number, window size, window scale factor, maximum segment size, and so on. The data content is as follows:
Data content                     Meaning
Basic TCP header
13 8a                            source port (5002)
80 0e                            destination port (32782)
98 8e 40 91                      initial sequence number (ISN)
00 00 07 bd                      acknowledgment number (peer ISN + 1)
a                                header length (10*4 = 40, with 20-byte options)
0 12                             flags, ACK = 1, SYN = 1
16 a0                            window size (5792)
65 d7                            checksum
00 00                            urgent pointer
TCP options
02 04 05 b4                      maximum segment size (1460)
04 02                            SACK permitted
08 0a 00 3c 25 8a 00 0a 79 14    timestamp (0x003c258a), echo timestamp (0x000a7914)
01                               placeholder (NOP)
03 03 02                         window scale factor (2)
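The option bytes follow a simple kind/length/value layout, which makes them easy to walk through in code. The sketch below handles only the option kinds that appear in this capture (2 = maximum segment size, 3 = window scale, 4 = SACK permitted, 8 = timestamps, 1 = NOP padding); the function name and the labels in the result are choices made for this example.

```python
# Options of the SYN+ACK segment from step 2.
SYN_ACK_OPTIONS = bytes.fromhex("02 04 05 b4 04 02 08 0a 00 3c 25 8a 00 0a 79 14 01 03 03 02")


def parse_tcp_options(raw):
    """Walk the option list and return (name, value) pairs in order."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:                       # end of option list
            break
        if kind == 1:                       # NOP is a single padding byte
            options.append(("NOP", None))
            i += 1
            continue
        if i + 1 >= len(raw):               # length byte missing: truncated list
            break
        length = raw[i + 1]                 # length covers kind and length bytes too
        if length < 2:                      # malformed, stop rather than loop forever
            break
        body = raw[i + 2:i + length]
        if kind == 2:
            options.append(("MSS", int.from_bytes(body, "big")))
        elif kind == 3:
            options.append(("WINDOW_SCALE", body[0]))
        elif kind == 4:
            options.append(("SACK_PERMITTED", None))
        elif kind == 8:
            value = int.from_bytes(body[:4], "big")
            echo = int.from_bytes(body[4:8], "big")
            options.append(("TIMESTAMP", (value, echo)))
        else:
            options.append(("UNKNOWN", kind))
        i += length
    return options


if __name__ == "__main__":
    for option in parse_tcp_options(SYN_ACK_OPTIONS):
        print(option)
```

The five options add up to 4 + 2 + 10 + 1 + 3 = 20 bytes, which together with the 20-byte basic header gives the 40-byte header length recorded in the table.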
Step 3: 172.16.48.2 acknowledges the SYN segment from 172.16.48.1. At this point the TCP three-way handshake is complete and the connection is established. When 172.16.48.2 receives the SYN+ACK segment, it changes the state of its socket from TCP_SYN_SENT to TCP_ESTABLISHED. The data content is as follows:
Data content                     Meaning
80 0e                            source port (32782)
13 8a                            destination port (5002)
00 00 07 bd                      sequence number (no longer the ISN)
98 8e 40 92                      acknowledgment number (peer ISN + 1)
8                                header length (8*4 = 32, with 12-byte options)
0 10                             flags, ACK = 1
05 b4                            window size (1460; is this a problem? to be confirmed)
a5 8a                            checksum
00 00                            urgent pointer
01                               placeholder (NOP)
01                               placeholder (NOP)
08 0a 00 0a 79 14 00 3c 25 8a    timestamp (0x000a7914), echo timestamp (0x003c258a)
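One possible reading of the 1460 window value in this last segment, offered as an assumption rather than a confirmed answer to the question noted in the table: under the TCP window scale option (RFC 7323), the window field of SYN segments is never scaled, but in every later segment the field is shifted left by the scale factor that the sender announced during the handshake. The helper name below is made up for this sketch.

```python
def effective_window(window_field, shift_count):
    """Receive window in bytes once window scaling is in effect."""
    return window_field << shift_count


if __name__ == "__main__":
    # 172.16.48.2 announced a scale factor of 2 in its SYN.
    print(effective_window(0x05B4, 2))
```

With a scale factor of 2, the field value 1460 stands for 1460 * 4 = 5840 bytes, which is the same window that 172.16.48.2 advertised unscaled in its SYN.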
================================================================================
7. Briefly describe the TCP three-way handshake process and explain why a three-way handshake is required
TCP three-way handshake
TCP connections are initialized through a three-way handshake. The purpose of the three-way handshake is to synchronize the sequence numbers and acknowledgment numbers of both parties and to exchange TCP window size information. The following steps outline how a client computer normally contacts a server computer:
1. The client sends the server a TCP packet with the SYN flag set. It contains the connection's initial sequence number x and a window size (indicating the size of the client's buffer for incoming segments from the server).
2. After receiving the SYN packet from the client, the server sends the client a TCP packet with both the SYN and ACK flags set. It contains the server's chosen initial sequence number y, the acknowledgment x + 1 of the client's sequence number, and a window size (indicating the size of the server's buffer for incoming segments from the client).
3. After the client receives the SYN+ACK packet from the server, it returns an ACK packet with acknowledgment number y + 1 and sequence number x + 1 to the server. A standard TCP connection is then established.
TCP uses a similar handshake process to close a connection. This ensures that both hosts can finish transmitting and that all data is received.
TCP client                Flags                 TCP server
1 send SYN (seq = x)      ---- SYN --->         SYN received
2 SYN/ACK received        <--- SYN/ACK ----     send SYN (seq = y), ACK (x + 1)
3 send ACK (y + 1)        ---- ACK --->         ACK received, connection established
x: ISN (initial sequence number) of the client
y: ISN of the server
================================================================================
Handshake phase:
No.  Direction  SEQ     ACK
1    A -> B     10000   0
2    B -> A     20000   10000 + 1 = 10001
3    A -> B     10001   20000 + 1 = 20001
Explanation:
1: A initiates a connection request to B and initializes its own seq with a random number, assumed here to be 10000. At this time, ack = 0.
2: After B receives the connection request from A, it also initializes its own seq with a random number, assumed here to be 20000, meaning: I have received your request, and my data stream starts from this number. B's ack is A's seq plus 1, that is, 10000 + 1 = 10001.
3: After A receives the reply from B, its seq is the seq of its previous request plus 1, that is, 10000 + 1 = 10001, meaning: I have received your reply, and my data stream starts from this number. A's ack is B's seq plus 1, that is, 20000 + 1 = 20001.
Data transmission phase:
No.  Direction  SEQ     ACK                           Size
23   A -> B     40000   70000                         1514
24   B -> A     70000   40000 + 1514 - 54 = 41460     54
25   A -> B     41460   70000 + 54 - 54 = 70000       1514
26   B -> A     70000   41460 + 1514 - 54 = 42920     54
Explanation:
23: B receives from A a packet with seq = 40000, ack = 70000, and size = 1514.
24: B then sends a packet to A, telling A: I have received your last packet. B's seq is set to the ack of the packet it received. Its ack is the seq of the packet it received plus that packet's size (excluding the Ethernet header, IP header, and TCP header), confirming that all the data sent by A has been received.
25: When A receives the packet from B with ack = 41460, it sees that 41460 is exactly the seq of its last packet plus that packet's data size, so it knows the last packet arrived safely. It therefore sends another packet to B. The seq of this packet is again set to the ack of the packet A received, and its ack is the seq (70000) of the packet it received plus that packet's size (54) minus the headers, that is, ack = 70000 + 54 - 54 (the packet consists entirely of headers and carries no data).
26: Same as 24.
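The acknowledgment arithmetic in rows 23 to 26 can be written as a one-line rule. The sketch assumes that the 54 header bytes break down as 14 (Ethernet) + 20 (IP without options) + 20 (TCP without options); the source only says that the three headers are excluded. The function name is made up for the example.

```python
HEADER_BYTES = 14 + 20 + 20      # assumed split of the 54 bytes: Ethernet + IP + TCP


def next_ack(seq, frame_size, header_bytes=HEADER_BYTES):
    """Acknowledgment number that confirms a frame of `frame_size` bytes starting at `seq`."""
    payload = frame_size - header_bytes
    return (seq + payload) % 2**32      # sequence numbers are 32-bit and wrap around


if __name__ == "__main__":
    print(next_ack(40000, 1514))    # row 24: B acknowledges A's data
    print(next_ack(70000, 54))      # row 25: B's frame carried no data
    print(next_ack(41460, 1514))    # row 26: B acknowledges A's next frame
```

A frame that carries only headers, such as B's 54-byte acknowledgments, does not advance the sequence space, which is why A keeps acknowledging 70000.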