TCP/IP
1. Basic concepts
Why the TCP/IP protocols exist
All over the world, many kinds of computers run different operating systems, and they represent the same information in many different ways. It is like the Bible story of the Tower of Babel, in which God confused the languages of people everywhere so that they could no longer cooperate. Computer users realized that a computer working alone is of limited use; only when computers are connected together can they reach their full potential. So people set out to connect computers to one another with cables. But a physical connection alone is not enough, just as two people who speak different languages cannot exchange information when they meet. The machines need a common language to communicate, and TCP/IP was created for that purpose. TCP/IP is not a single protocol but a generic name for a family of protocols. It includes IP, ICMP, TCP, and the more familiar HTTP, FTP, POP3, and others. A computer that implements these protocols is like a person who has learned a foreign language: it can communicate freely with other computers.
TCP/IP protocol layering
When it comes to protocol layering, it is easy to think of the classic ISO/OSI seven-layer architecture, but the structure of the TCP/IP protocol family is slightly different.
The TCP/IP protocol family is organized in layers from top to bottom. The top layer is the application layer, which holds the protocols we are familiar with, such as HTTP and FTP. The second layer is the transport layer, where the well-known TCP and UDP protocols live. The third layer is the network layer, where the IP protocol adds the IP addresses and other information to the data (described later) to determine the destination of the transmission. The fourth layer is the data link layer, which adds an Ethernet header and a CRC to the data to prepare it for the final transfer. Below that is the hardware level, which is responsible for the actual transmission; its definitions cover cable formats, network cards, and so on (we do not care about these, since we are not building network cards). Some books therefore leave this level out of the TCP/IP protocol family, because it has almost nothing to do with the creators of TCP/IP. The sending host encapsulates the data layer by layer from top to bottom, while the receiving host unwraps the received packet layer by layer from bottom to top until it obtains the data it needs. This structure behaves very much like a stack, so some articles also refer to the TCP/IP protocol family as the TCP/IP protocol stack.
The four layers
1) The link layer, sometimes called the data link layer or the network interface layer, typically includes the device driver in the operating system and the corresponding network interface card in the computer. Together they handle the details of physically interfacing with the cable (or whatever other transmission medium is used).
Purpose: sends and receives IP datagrams for the IP module.
Sends ARP requests and receives ARP replies for the ARP module.
Sends RARP requests and receives RARP replies for the RARP module.
2) The network layer, sometimes called the internet layer, handles the movement of packets around the network, for example the routing of packets. In the TCP/IP protocol family, the network layer protocols are IP (Internet Protocol), ICMP (Internet Control Message Protocol), and IGMP (Internet Group Management Protocol).
3) The transport layer provides end-to-end communication for applications on two hosts. In the TCP/IP protocol family there are two quite different transport protocols: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). TCP provides highly reliable data communication between two hosts. Its work includes dividing the data handed to it by the application into appropriately sized chunks for the network layer below, acknowledging received packets, and setting timeouts to make sure the other end acknowledges the packets that were sent. Because the transport layer provides this reliable end-to-end communication, the application layer can ignore all of these details. UDP, on the other hand, provides a much simpler service to the application layer. It just sends packets called datagrams from one host to another, with no guarantee that a datagram reaches the other end. Any required reliability must be provided by the application layer. The two transport protocols suit different kinds of applications, as we will see later.
4) The application layer handles the details of the particular application. Almost every TCP/IP implementation provides the following common applications:
• Telnet: remote login.
• FTP: File Transfer Protocol.
• SMTP: Simple Mail Transfer Protocol.
• SNMP: Simple Network Management Protocol.
Some basic knowledge
Internet address (IP address)
Each node on the network must have its own Internet address (also known as an IP address). The commonly used IP address is a 32-bit number, defined by what we usually call the IPv4 standard. This 32-bit number is written as four groups of 8 bits each, in the familiar dotted style that runs up to 255.255.255.255. In the original IPv4 standard, addresses are divided into five classes (A through E), of which class B is the one we often use; please refer to other documentation for the details of the classes. The important point is that an IP address is a combination of a network number and a host number.
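To make the "network number + host number" split concrete, here is a minimal sketch using Python's standard ipaddress module. The address 172.16.5.9 and the 16-bit prefix (the classic class B split) are assumptions chosen only for illustration.

```python
import ipaddress

# Hypothetical host address with a 16-bit network prefix (the class B split).
iface = ipaddress.ip_interface("172.16.5.9/16")

addr = int(iface.ip)                   # the address as one 32-bit number
netmask = int(iface.network.netmask)   # 255.255.0.0 as a number

network_number = addr & netmask              # upper 16 bits
host_number = addr & ~netmask & 0xFFFFFFFF   # lower 16 bits

print(iface.network)                          # 172.16.0.0/16
print(ipaddress.ip_address(network_number))   # 172.16.0.0
print(host_number)                            # 1289, i.e. 5 * 256 + 9
```

The same 32 bits are simply read in two parts: the mask selects the network number, and whatever is left over is the host number.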
Domain Name System
The Domain Name System (DNS) is a distributed database that provides the service of converting host names (such as the host part of a URL) to IP addresses.
RFC
What is an RFC? RFCs (Requests for Comments) are the standards documents for the TCP/IP protocols. The RFC index is a long list containing more than 4,000 entries, which of course include the dozen or so protocols we are going to study.
Port number (port)
Note that this number is a logical number used by TCP and UDP, not a hardware port. When we say that a certain port has been blocked, we only mean that packets carrying that port number are being filtered out.
Application programming interface
The common programming interfaces today are sockets and TLI. The former is sometimes called "Berkeley sockets", which shows how much Berkeley contributed to the development of networking.
A few important protocols
2. TCP protocol
TCP is a connection-oriented transport layer protocol that guarantees high reliability (no data loss, no reordering, no data errors, no duplication).
1. TCP header analysis
Let's start by analyzing the format of the TCP header and the meaning of each of these fields:
(1) port number [16bit]
We know that a network provides communication between processes on different hosts. An operating system runs many processes, so when data arrives, which process should it be delivered to? This is what the port number is for. The TCP header contains a source port number (source port) and a destination port number (destination port). The source port number identifies the process on the sending host, and the destination port number identifies the process on the receiving host.
(2) Sequence numbers [32bit]
There are two of these: the sequence number (Sequence number) and the acknowledgment number (Acknowledgment number).
Sequence number: identifies the position, within the stream of data bytes sent from the TCP sender to the TCP receiver, of the first data byte in this segment. If you think of the byte stream as flowing in one direction between two applications, TCP numbers each byte in order. The sequence number is a 32-bit unsigned number that wraps around to 0 after reaching 2^32 - 1. When a new connection is established, the SYN flag is set to 1 and the sequence number field contains the initial sequence number (ISN, Initial Sequence Number) chosen by this host for the connection.
Acknowledgment number: contains the next sequence number that the sender of the acknowledgment expects to receive. It is therefore the sequence number of the last successfully received data byte plus 1. The acknowledgment number field is valid only when the ACK flag is 1. TCP provides a full-duplex service to the application layer, which means data can flow independently in both directions. Each end of the connection must therefore keep track of the sequence numbers of the data flowing in each direction.
(3) Offset [4bit]
The offset here is really the length of the TCP header, expressed as the number of 32-bit words it contains; from it you can tell where the user data in a TCP segment begins. The field occupies 4 bits. For example, if its value is 0101, the TCP header is 5 * 4 = 20 bytes long. The maximum TCP header length is therefore 15 * 4 = 60 bytes. Without options, the normal length is 20 bytes.
(4) Reserved [6bit]
Currently unused; its value is 0.
(5) Flags [6bit]
There are 6 flag bits in the TCP header. Several of them can be set to 1 at the same time.
URG The urgent pointer (urgent pointer) is valid
ACK The acknowledgment number is valid
PSH The receiver should pass this segment to the application layer as soon as possible instead of waiting for the buffer to fill
RST Reset the connection; it generally means the connection is being torn down
SYN Synchronize sequence numbers to initiate a connection
FIN The sender has finished sending data (that is, it is closing its side of the connection)
As an example of RST, suppose a TCP client tries to connect to a port on which no server is listening, and Wireshark captures the following packets:
You can see that host 192.168.63.134 sent a connection request to host 192.168.63.132, but no server on 192.168.63.132 was listening on the requested port, so 192.168.63.132 sent back a TCP packet with the RST bit set to refuse the connection.
(6) Window size (Window) [16bit]
The window size is the maximum number of bytes that the sender of this segment is currently willing to receive.
(7) checksum [16bit]
The checksum covers the entire TCP segment: the TCP header and the TCP data. This is a mandatory field that must be computed and stored by the sender and verified by the receiver.
(8) Urgent pointer [16bit]
The urgent pointer is valid only when the URG flag is set to 1. It is a positive offset that, added to the value in the sequence number field, gives the sequence number of the last byte of urgent data. TCP's urgent mode is a way for one end to send urgent data to the other end.
(9) TCP options
These are optional; we will look at them later when we capture packets.
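The field layout described above can be sketched with Python's struct module. This is only an illustration: the function name parse_tcp_header, the port numbers, and the window value in the sample are assumptions, and the checksum in the sample is left at 0 rather than computed.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_flags >> 12   # upper 4 bits: header length in 32-bit words
    flag_bits = offset_flags & 0x3F    # lower 6 bits: URG ACK PSH RST SYN FIN
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 up to bit 5
    flags = [name for bit, name in enumerate(names) if flag_bits & (1 << bit)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": data_offset * 4,   # in bytes
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }

# Hypothetical SYN segment: offset 5 (20 bytes), only the SYN bit set.
sample = struct.pack("!HHIIHHHH", 50000, 80, 1415531521, 0,
                     (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["header_len"], header["flags"])   # 20 ['SYN']
```

The offset field and the flags share one 16-bit word, which is why the sketch unpacks them together and then separates them with a shift and a mask.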
2. Key points explained
(1) Three-way handshake to establish a connection
A. The requesting end (usually called the client) sends a SYN segment specifying the port of the server it wants to connect to, along with the client's initial sequence number (ISN, 1415531521 in this example). This SYN segment is segment 1.
B. The server responds with its own SYN segment (segment 2) containing the server's initial sequence number. At the same time, it sets the acknowledgment number to the client's ISN plus 1 to acknowledge the client's SYN. A SYN consumes one sequence number.
C. The client must set the acknowledgment number to the server's ISN plus 1 to acknowledge the server's SYN (segment 3).
These three segments complete the establishment of the connection. The process is known as the three-way handshake.
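The sequence and acknowledgment numbers in the three segments can be traced with a few lines of Python. This is a sketch of the arithmetic only, not of a real TCP implementation; the client ISN comes from the example above, while the server ISN 1823083521 and the function name are assumptions made for illustration.

```python
MOD = 2 ** 32  # sequence numbers are 32-bit values and wrap around

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as simple dictionaries."""
    # Segment 1: the client's SYN carries the client's ISN.
    seg1 = {"flags": {"SYN"}, "seq": client_isn, "ack": None}
    # Segment 2: the server's SYN carries its own ISN and acknowledges client ISN + 1.
    seg2 = {"flags": {"SYN", "ACK"}, "seq": server_isn,
            "ack": (seg1["seq"] + 1) % MOD}
    # Segment 3: the client acknowledges server ISN + 1; its own SYN used up one number.
    seg3 = {"flags": {"ACK"}, "seq": seg2["ack"],
            "ack": (seg2["seq"] + 1) % MOD}
    return [seg1, seg2, seg3]

segments = three_way_handshake(1415531521, 1823083521)
for number, seg in enumerate(segments, start=1):
    print(number, sorted(seg["flags"]), seg["seq"], seg["ack"])
```

Because a SYN consumes one sequence number, each side acknowledges the other's ISN plus 1 even though no data has been sent yet.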
A Wireshark capture of the handshake looks like this:
You can see that the three-way handshake establishes each side's initial sequence number, the maximum amount of data it will accept (window), and the MSS (Maximum Segment Size).
MSS = MTU - IP header - TCP header. MTU stands for maximum transmission unit, which we will discuss in the IP header analysis; it is generally 1500 bytes. The IP header and the TCP header are each 20 bytes when they carry no options. In that case MSS = 1500 - 20 - 20 = 1460.
The MSS limits how much data one TCP segment can carry. When the application layer hands data to the transport layer for transmission over TCP and the data is larger than the MSS, it must be split into several segments, which are sent one after another.
For example, the application layer hands 4,096 bytes of data to the transport layer at once, and Wireshark captures the following packets:
The first three packets are the three-way handshake and the last three carry the data. Because the data is 4,096 bytes long, it takes three segments to send (1448 + 1448 + 1200).
The attentive reader will ask why the maximum data size of each segment is not 1460 bytes. The reason is that TCP carries options here: TCP header length = 20 + 12 (size of the options) = 32 bytes. The maximum data that can be carried is therefore 1500 - 20 - 32 = 1448 bytes.
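The arithmetic above is easy to check in code. The following sketch recomputes the MSS and the segment sizes for the 4,096-byte example; the helper names mss and segment_sizes are made up for this illustration.

```python
def mss(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one link-layer frame."""
    return mtu - ip_header - tcp_header

def segment_sizes(total: int, max_payload: int) -> list:
    """Split `total` bytes of application data into segment payload sizes."""
    sizes = []
    while total > 0:
        chunk = min(total, max_payload)
        sizes.append(chunk)
        total -= chunk
    return sizes

print(mss(1500))                   # 1460: no options in either header
print(mss(1500, tcp_header=32))    # 1448: 12 bytes of TCP options
print(segment_sizes(4096, 1448))   # [1448, 1448, 1200]
```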
(2) Four-way wave to close the connection
A. Today's network communication is built on sockets. When the client closes its socket, the kernel protocol stack automatically sends a packet with the FIN flag set to the server, requesting that the connection be closed. The side that first requests the close is said to perform the active close.
B. When the server receives the client's FIN, its kernel protocol stack immediately sends an ACK packet in reply to indicate that the request has been received.
C. After the server has run for a while longer, it closes its own socket. At that point its kernel protocol stack sends a packet with the FIN flag set to the client, requesting that the connection be closed.
D. When the client receives the server's FIN, it sends an ACK to tell the server that the request has been received.
The following analysis is based on a Wireshark capture:
(3) How TCP guarantees reliability
TCP uses a technique called "positive acknowledgment with retransmission" as the basis for its reliable data transfer service. The technique requires the receiver to send an acknowledgment (ACK) back to the source after receiving data. The sender keeps a record of each packet it sends and waits for the acknowledgment before sending the next packet. The sender also starts a timer when it sends a packet; if the timer expires before the acknowledgment arrives, it resends the packet. Figure 3-5 shows data being transferred with a positive acknowledgment protocol with retransmission, and figure 3-6 shows the timeout and retransmission caused by a lost packet. To cope with acknowledgments that are delayed or duplicated by the network, the protocol carries a sequence number in each acknowledgment so that acknowledgments can be correctly matched to packets.
As figure 3-5 shows, although the network can carry traffic in both directions at the same time, the simple positive acknowledgment protocol wastes a great deal of valuable bandwidth, because the sender must hold back the next packet until it has received the acknowledgment for the previous one. For this reason, TCP uses a sliding window mechanism, which improves network throughput and also provides end-to-end flow control.
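The simple positive acknowledgment scheme can be simulated in a few lines. The sketch below is a toy model under stated assumptions: losses are given in advance as a set of (sequence number, attempt) pairs, a lost packet and a lost acknowledgment are treated the same way, and the function name stop_and_wait is made up for the example.

```python
def stop_and_wait(packets, lost_attempts, max_tries=5):
    """Send packets one at a time, resending whenever the timer 'expires'.

    lost_attempts is a set of (sequence number, attempt number) pairs that
    the simulated network drops, so no acknowledgment comes back for them.
    """
    delivered = []
    log = []
    for seq, payload in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            if (seq, attempt) in lost_attempts:
                # No ACK before the timer expired: send the same packet again.
                log.append(f"seq={seq} attempt={attempt}: timeout, retransmit")
                continue
            delivered.append(payload)
            log.append(f"seq={seq} attempt={attempt}: ACK {seq} received")
            break
        else:
            raise RuntimeError(f"giving up on seq={seq}")
    return delivered, log

# The first transmission of packet 1 is lost; everything else gets through.
delivered, log = stop_and_wait(["a", "b", "c"], {(1, 1)})
print(delivered)
for line in log:
    print(line)
```

Note that the sender never has more than one packet outstanding, which is exactly the waste of bandwidth that the sliding window is meant to remove.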
(4) Sliding window technique
The sliding window technique is a more elaborate variant of simple positive acknowledgment with retransmission that allows the sender to transmit several packets before waiting for an acknowledgment. As shown in figure 3-7, the sender has a sequence of packets to send. The sliding window protocol places a fixed-size window over the sequence and sends all the packets inside the window. When the sender receives the acknowledgment for the first packet in the window, the window slides forward by one and the next packet is sent; as further acknowledgments arrive, the window keeps sliding.
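The sliding behaviour can be traced with a small simulation. This sketch assumes a loss-free network in which acknowledgments arrive one at a time and in order; the window size of 3 and the function name sliding_window_send are arbitrary choices for the illustration.

```python
def sliding_window_send(num_packets: int, window: int) -> list:
    """Return the ordered list of events for a loss-free sliding-window transfer."""
    events = []
    base = 0       # oldest unacknowledged packet (left edge of the window)
    next_seq = 0   # next packet that has not been sent yet
    while base < num_packets:
        # Send every packet that currently fits inside the window.
        while next_seq < num_packets and next_seq < base + window:
            events.append(("send", next_seq))
            next_seq += 1
        # The acknowledgment for the oldest packet arrives: slide by one.
        events.append(("ack", base))
        base += 1
    return events

events = sliding_window_send(num_packets=5, window=3)
for event in events:
    print(event)
```

With a window of 3, packets 0, 1 and 2 go out back to back; packet 3 is sent only after the acknowledgment for packet 0 has moved the window forward.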
3. UDP protocol
Unlike TCP, UDP provides no timeout retransmission, no retransmission on error, or similar functions; in other words, it is an unreliable protocol.
UDP is also a transport layer protocol, but it is connectionless and does not guarantee reliable delivery. Its header is relatively simple, as follows:
The port numbers need no further explanation; they have the same meaning as the TCP port numbers.
Length: occupies 2 bytes and gives the size of the whole UDP datagram, that is, the header plus the data.
Checksum: covers both the UDP header and the data.
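Because the UDP header is just four 16-bit fields, it is easy to build and parse by hand. The sketch below is an illustration only: the function names and port numbers are assumptions, and the checksum is left at 0, which in UDP over IPv4 means that no checksum was computed.

```python
import struct

UDP_HEADER_LEN = 8  # source port, destination port, length, checksum: 2 bytes each

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER_LEN + len(payload)   # header plus data
    checksum = 0                             # placeholder: no checksum computed
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(datagram: bytes) -> dict:
    src_port, dst_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_LEN])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER_LEN:length],
    }

datagram = build_udp_datagram(5000, 53, b"hello")
parsed = parse_udp_datagram(datagram)
print(parsed["length"], parsed["payload"])   # 13 b'hello'
```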
4. IP protocol
IP is the core protocol of the TCP/IP protocol family. All TCP, UDP, ICMP, and IGMP data is transmitted in IP datagrams. IP has the following characteristics:
Unreliable (unreliable) means that IP does not guarantee that a datagram will reach its destination successfully. IP provides only a best-effort delivery service. If something goes wrong, for example a router temporarily runs out of buffers, IP has a simple error-handling algorithm: discard the datagram and then try to send an ICMP message back to the source. Any required reliability must be provided by the upper layers (for example, TCP).
Connectionless (connectionless) means that IP does not maintain any state information about successive datagrams. Each datagram is handled independently of all others. This also means that IP datagrams can be delivered out of order. If a source sends two consecutive datagrams (first A, then B) to the same destination, each is routed independently and may take a different route, so B may arrive before A.
1. IP header format
(1) Version occupies 4 bits and gives the version of the IP protocol. Both communicating parties must use the same IP version. The widely used version number is 4 (that is, IPv4). Its successor, IPv6, has since been standardized and is deployed alongside IPv4.
(2) Header length occupies 4 bits, so the largest decimal value it can hold is 15. Note that the unit of this field is the 32-bit word (4 bytes), so when the header length is 1111 (decimal 15) the header is 60 bytes long. When the header of an IP packet is not an integer multiple of 4 bytes, it must be padded using the padding field at the end. The data part therefore always starts on a 4-byte boundary, which makes the IP protocol more convenient to implement. The drawback of limiting the header to 60 bytes is that it is sometimes not enough; the limit was chosen in the hope that users would keep overhead to a minimum. The most common header length is 20 bytes (a header length field of 0101), in which case no options are used.
(3) Differentiated services occupies 8 bits and is used to obtain better service. In the old standard this field was called type of service, but it was never really used in practice. In 1998 the IETF renamed the field DS (Differentiated Services). The field only has an effect when differentiated services are in use.
(4) Total length is the combined length of the header and the data, in bytes. The total length field is 16 bits, so the maximum length of a datagram is 2^16 - 1 = 65535 bytes.
Each data link layer below the IP layer has its own frame format, which includes a maximum length for the data field of the frame, called the maximum transmission unit, MTU (Maximum Transmission Unit). When a datagram is encapsulated in a link-layer frame, the total length of the datagram (header plus data) must not exceed the MTU of the data link layer below.
(5) Identification (identification) occupies 16 bits. The IP software keeps a counter in memory; each time it generates a datagram, it increments the counter by 1 and assigns the value to the identification field. This "identification" is not a sequence number, however: because IP is a connectionless service, there is no question of datagrams being received in order. When a datagram must be fragmented because it is longer than the MTU of the network, the value of the identification field is copied into the identification field of every fragment. Fragments that share the same identification value can then be correctly reassembled into the original datagram.
(6) Flags (flag) occupies 3 bits, of which only 2 currently have a meaning.
The lowest bit of the flags field is MF (More Fragment). MF=1 means that more fragments follow. MF=0 means this is the last fragment of the datagram.
The middle bit of the flags field is DF (Don't Fragment), meaning "must not be fragmented". Fragmentation is allowed only when DF=0.
(7) Fragment offset occupies 13 bits. When a longer packet has been fragmented, the fragment offset gives the relative position of a fragment within the original packet, that is, where the fragment begins relative to the start of the user data field. The offset is counted in units of 8 bytes, which means that the length of every fragment except the last must be an integer multiple of 8 bytes (64 bits).
(8) Time to live occupies 8 bits. The usual abbreviation is TTL (Time To Live), and it indicates the lifetime of the datagram in the network. The field is set by the source that sends the datagram. Its purpose is to stop undeliverable datagrams from circulating in the Internet indefinitely and consuming network resources for nothing. The original design used seconds as the unit of TTL: each router subtracted from the TTL the time the datagram had spent in that router, and if that time was less than 1 second, the TTL was reduced by 1. In practice routers simply decrement the TTL by 1, so the field acts as a hop limit. When the TTL reaches 0, the datagram is discarded.
(9) Protocol occupies 8 bits. The protocol field indicates which protocol the data carried by this datagram belongs to, so that the IP layer of the destination host knows which protocol handler to pass the data part to.
(10) Header checksum occupies 16 bits. This field checks only the header of the datagram, not the data part. The reason is that every router a datagram passes through has to recompute the header checksum (some fields, such as time to live, flags, and fragment offset, may change), and leaving out the data part reduces the amount of computation.
(11) The source IP address occupies 32 bits.
(12) The destination IP address occupies 32 bits.
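The twelve fields above can be packed and unpacked with Python's struct module, and the header checksum can be verified with the usual 16-bit one's-complement sum. This is a sketch under stated assumptions: it handles only a 20-byte header without options, the function names are made up, and the two addresses are borrowed from the capture example earlier in the text.

```python
import socket
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes: the fixed part of the header

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ipv4_header(src: str, dst: str, payload_len: int,
                      proto: int = 6, ttl: int = 64, ident: int = 1) -> bytes:
    ver_ihl = (4 << 4) | 5                   # version 4, header length 5 words
    header = struct.pack(IPV4_FORMAT, ver_ihl, 0, 20 + payload_len, ident, 0,
                         ttl, proto, 0, socket.inet_aton(src), socket.inet_aton(dst))
    checksum = internet_checksum(header)     # computed with the field set to 0
    return header[:10] + struct.pack("!H", checksum) + header[12:]

def parse_ipv4_header(header: bytes) -> dict:
    (ver_ihl, ds, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack(IPV4_FORMAT, header[:20])
    header_len = (ver_ihl & 0x0F) * 4
    return {
        "version": ver_ihl >> 4,
        "header_len": header_len,
        "total_len": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),
        "mf": bool(flags_frag & 0x2000),
        "frag_offset": (flags_frag & 0x1FFF) * 8,   # converted to bytes
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        # Summing a header that includes a correct checksum yields 0.
        "checksum_ok": internet_checksum(header[:header_len]) == 0,
    }

ip_header = build_ipv4_header("192.168.63.134", "192.168.63.132", payload_len=20)
info = parse_ipv4_header(ip_header)
print(info["version"], info["header_len"], info["total_len"], info["checksum_ok"])
```

Changing any header byte after the checksum has been filled in, for example lowering the TTL without recomputing, makes checksum_ok come out False, which is why each router has to recompute the field.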
2. Fragmentation explained
Fragmentation means that when the data to be sent is larger than the maximum transmission unit (MTU), it has to be split into several packets that are then sent to the other side. MSS came up when we discussed TCP, and many people cannot tell MSS and MTU apart. I think the diagram below makes the distinction completely clear.
In my view, data handed to the IP layer by TCP does not need to be fragmented, because TCP has already split it into segments that fit the MSS. Fragmentation is needed only when a large amount of data is sent with UDP.
Example: sending 10,240 bytes of data with UDP
As you can see, when the data is handed to the network layer it exceeds the maximum transmission unit, so it is fragmented: split into several packets that are sent to the other side over IP. The maximum number of data bytes per packet is MTU - IP header = 1500 - 20 = 1480.
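The fragment sizes for this example can be worked out in code. The sketch below assumes that the IP payload is the 10,240 data bytes plus the 8-byte UDP header, a 1500-byte MTU, and a 20-byte IP header without options; the function name fragment is made up for the illustration.

```python
def fragment(payload_len: int, mtu: int = 1500, ip_header: int = 20) -> list:
    """Describe the fragments needed to carry `payload_len` bytes of IP payload."""
    max_data = (mtu - ip_header) // 8 * 8   # data per fragment, a multiple of 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len
        fragments.append({
            "offset_units": offset // 8,    # value stored in the 13-bit offset field
            "size": size,
            "MF": 1 if more else 0,
        })
        offset += size
    return fragments

fragments = fragment(8 + 10240)   # UDP header + data
for frag in fragments:
    print(frag)
```

Six fragments carry the full 1480 bytes each and the seventh carries the remaining 1368 bytes; only the last one has MF=0, and each offset is a multiple of 185 because 1480 / 8 = 185.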
5. Ethernet header
Three components: destination MAC address | source MAC address | type of the protocol carried.
So in Ethernet, there are several formats for the packet:
The ARP protocol, the Address Resolution Protocol, obtains the MAC address that corresponds to an IP address.
The RARP protocol, the Reverse Address Resolution Protocol, obtains the IP address that corresponds to a MAC address.
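The three components of the Ethernet header map directly onto a 14-byte structure. The sketch below decodes it and uses the type field to tell IP, ARP, and RARP frames apart; the function name and the sample frame (a broadcast from a made-up MAC address) are assumptions for illustration.

```python
import struct

# Type field values for the protocols mentioned in this section.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x8035: "RARP"}

def format_mac(raw: bytes) -> str:
    return ":".join(f"{byte:02x}" for byte in raw)

def parse_ethernet_header(frame: bytes) -> dict:
    """Decode the 14-byte Ethernet header at the start of a frame."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "type": ETHERTYPES.get(ethertype, hex(ethertype)),
        "payload": frame[14:],
    }

# Hypothetical ARP request: broadcast destination, 28 bytes of zeroed payload.
sample_frame = bytes.fromhex("ffffffffffff0011223344550806") + b"\x00" * 28
eth = parse_ethernet_header(sample_frame)
print(eth["dst"], eth["src"], eth["type"])
```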
6. Working principle
How TCP/IP works:
(1) On the source host, the application layer passes a stream of bytes to the transport layer;
(2) The transport layer splits the byte stream into TCP segments, adds a TCP header to each, and passes them to the internet (IP) layer;
(3) The IP layer builds a packet, places the TCP segment in its data field, adds the IP addresses of the source and destination hosts, and passes the IP packet to the data link layer;
(4) The data link layer places the IP packet in the data portion of its frame and sends the frame to the destination host or to an IP router;
(5) On the destination host, the data link layer removes the frame header and passes the IP packet to the internet layer;
(6) The IP layer checks the IP header; if the checksum in the header does not match the computed value, the packet is discarded;
(7) If the checksum matches, the IP layer removes the IP header and passes the TCP segment to the TCP layer, which checks the sequence number to determine whether this is the expected TCP segment;
(8) The TCP layer computes the checksum over the TCP header and data. If it does not match, the TCP layer discards the segment; if it matches, it sends an acknowledgment to the source host;
(9) On the destination host, the TCP layer removes the TCP header and passes the bytes to the application;
(10) The application on the destination host receives the stream of bytes as if it had come directly from the application on the source host.
In effect, each lower layer adds its own header, and that header is transparent to the layer above: the upper layer is not aware of the headers added below it.
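The wrapping and unwrapping in the steps above can be pictured with a toy model. In the sketch below the byte strings [TCP], [IP] and [ETH] are placeholders standing in for real headers, and the function names are made up; nothing here resembles the real header formats.

```python
# Placeholder "headers", outermost first. Real headers are binary structures.
LAYERS = [b"[ETH]", b"[IP]", b"[TCP]"]

def encapsulate(data: bytes) -> bytes:
    """Sending host: each layer going down prepends its own header."""
    for header in reversed(LAYERS):   # TCP first, then IP, then Ethernet
        data = header + data
    return data

def decapsulate(frame: bytes) -> bytes:
    """Receiving host: each layer going up strips the header it understands."""
    for header in LAYERS:             # Ethernet first, then IP, then TCP
        if not frame.startswith(header):
            raise ValueError(f"expected {header!r} at the start of the data")
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"hello")
print(frame)                # b'[ETH][IP][TCP]hello'
print(decapsulate(frame))   # b'hello'
```

The application on each side only ever sees b"hello"; the headers added below it are invisible to it, which is the transparency described above.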