Original: http://www.cnblogs.com/xuanku/p/tcpip.html
The TCP/IP network protocol stack is divided into four layers. From bottom to top, they are:
Link Layer
Below the link layer there is actually a physical layer, which covers the transmission of signals over media such as the common twisted-pair cable, optical fiber, and the older coaxial cable. The design of the physical layer determines the signal's bandwidth, speed, transmission distance, and resistance to interference.
The link layer itself is mainly responsible for exchanging data with the physical layer. Its common tasks include the network card driver, frame synchronization (detecting which signal marks the start of a new frame), collision detection (retransmitting automatically when a collision occurs), and data error checking.
Common link-layer standards include Ethernet and Token Ring.
Network Layer
The IP protocol of the network layer is the foundation of the Internet. This layer is responsible for delivering data to the right destination address. A large number of routers in the network do this work, and a router typically strips the link-layer and network-layer headers from the data and re-encapsulates it. The IP layer is not responsible for reliable delivery: data may be lost in transit, and it is up to the upper-layer protocol to deal with that.
Transport Layer
The network layer handles host-to-host delivery, that is, it only gets data to a particular host. The transport layer handles end-to-end delivery, that is, it gets data to a particular process.
The typical transport protocols are TCP and UDP. TCP is a connection-oriented, reliable protocol: it takes care of error detection, splitting data into segments and reassembling them in order, automatic retransmission, and so on. UDP only delivers data to the corresponding process and has almost no other logic, so the application layer must ensure the reliability of the transfer itself.
Application Layer
This layer contains the protocols we use every day, such as HTTP and FTP.
Packets are encapsulated by the four layers as follows:
The communication process across the four layers is as follows:
Link Layer: Ethernet Data Frame
The Ethernet Data frame format is as follows:
The description is as follows:
- The destination address and source address are the hardware addresses (MAC addresses) of the network cards. Each is 48 bits long and is burned in at the factory.
- The type field identifies the upper-layer protocol. It currently takes three values: IP, ARP, and RARP.
- The data field carries the data of the upper-layer protocol. Ethernet requires it to be 46~1500 bytes long. The maximum of 1500 is the Maximum Transmission Unit (MTU) of Ethernet. Different network types have different MTUs, so data that crosses different types of links may need to be fragmented again.
- The CRC is a checksum used to detect errors in the transmitted data.
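The header fields described above can be pulled out of a raw frame with a few lines of Python. This is only a minimal sketch: the function name `parse_ethernet_header` and the sample frame are made up for illustration, and the sample omits the trailing CRC, which capture tools usually do not expose. The type values used (0x0800 for IP, 0x0806 for ARP, 0x8035 for RARP) are the standard EtherType numbers.

```python
import struct

# Standard EtherType values for the three protocols mentioned in the text
ETHER_TYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}

def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_ethernet_header(frame: bytes):
    """Split a frame into destination MAC, source MAC, type name, and payload."""
    if len(frame) < 14:
        raise ValueError("frame shorter than the 14-byte Ethernet header")
    dst, src, eth_type = struct.unpack("!6s6sH", frame[:14])
    type_name = ETHER_TYPES.get(eth_type, hex(eth_type))
    return format_mac(dst), format_mac(src), type_name, frame[14:]

# Hypothetical frame: broadcast destination, made-up source, ARP type, 46 bytes of padding
sample = bytes.fromhex("ffffffffffff" "001122334455" "0806") + b"\x00" * 46
dst, src, type_name, payload = parse_ethernet_header(sample)
print(dst, src, type_name, len(payload))
```

The `!` in the format string selects network byte order, which is how multi-byte fields are laid out on the wire.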
ARP protocol
During network communication, the application on the source host only knows the IP address of the destination; it does not know the hardware address of the other host. Before data can be sent, the sender has to find the hardware address of the target machine, and this is the job of the ARP protocol.
When the hardware address is needed, a request containing the destination IP address is broadcast on the local network and every machine receives it. The machine that finds the IP address in the request matches its own replies with its hardware address; all other machines ignore the request.
In general, each machine maintains an ARP cache table that stores recent mappings between IP addresses and hardware addresses. You can run arp -a to view the contents of the cache table.
If the destination machine is not on the same network segment as the sender, the data is handed to the gateway, which is usually a router. In this case the sender uses ARP to find the hardware address of the gateway rather than that of the final destination, and the gateway then routes the packet by IP. ARP broadcasts do not cross routers, so the same lookup is repeated on each segment along the path.
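The choice between resolving the destination itself and resolving the gateway can be sketched with Python's standard `ipaddress` module. The function name `arp_target` and all addresses below are hypothetical, and a real host consults its full routing table rather than a single gateway.

```python
import ipaddress

def arp_target(local_iface: str, dst_ip: str, gateway: str) -> str:
    """Return the IP whose hardware address the sender has to resolve via ARP."""
    iface = ipaddress.ip_interface(local_iface)
    if ipaddress.ip_address(dst_ip) in iface.network:
        return dst_ip   # same segment: resolve the destination directly
    return gateway      # different segment: resolve the gateway instead

# Hypothetical host 192.168.1.10/24 with gateway 192.168.1.1
print(arp_target("192.168.1.10/24", "192.168.1.20", "192.168.1.1"))
print(arp_target("192.168.1.10/24", "203.0.113.5", "192.168.1.1"))
```

The first call resolves the destination itself because it sits in the same /24; the second falls back to the gateway.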
IP protocol
The IP protocol packet format is as follows:
Several fields are explained as follows:
- TOS (type of service): 8 bits in total. 3 bits indicate the priority of the packet and are no longer used; 4 bits select the service type (minimum delay, maximum throughput, maximum reliability, lowest cost); the remaining bit is always 0.
- Flags: mark how IP packets relate to one another as fragments, and are used for fragmentation and reassembly of packets.
- TTL (Time to Live): the maximum number of times a packet may be forwarded in the network. Once this number is exceeded, the packet is discarded.
- Protocol (8 bits): identifies the upper-layer protocol, one of TCP, UDP, ICMP, or IGMP.
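To make the fields above concrete, here is a minimal sketch that decodes the fixed 20-byte part of an IPv4 header with `struct`. The function name `parse_ipv4_header` and the hand-built sample are illustrative assumptions; the checksum in the sample is left at 0 and options are ignored. The protocol numbers (1 ICMP, 2 IGMP, 6 TCP, 17 UDP) are the standard assigned values.

```python
import socket
import struct

PROTOCOLS = {1: "ICMP", 2: "IGMP", 6: "TCP", 17: "UDP"}

def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,           # stored in 32-bit words
        "tos": tos,
        "total_len": total_len,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": (flags_frag & 0x1FFF) * 8,  # stored in 8-byte units
        "ttl": ttl,
        "protocol": PROTOCOLS.get(proto, str(proto)),
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample: version 4, 20-byte header, DF set, TTL 64, protocol 6 (TCP)
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                     socket.inet_aton("192.168.0.2"),
                     socket.inet_aton("192.168.0.1"))
header = parse_ipv4_header(sample)
print(header["ttl"], header["protocol"], header["src"], header["dst"])
```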
IP addresses are divided into the following classes:
In the early days of the Internet, most organizations applied for Class B network addresses, so Class B addresses quickly ran out while Class A still had many idle addresses. In addition, every router has to hold information about all networks, so as the number of Class C networks grew, routing tables gained more and more entries.
In response, people realized that most machines on an internal network do not need their own public IP. These machines can share one public IP for outside connections, each machine gets a private IP within its own network, and an internal router handles addressing for the intranet IPs.
Private IPs largely solved the problem of wasted addresses. This is why addresses such as 192.168.x.x are so common in everyday use: they are valid only inside a LAN and do not consume public address space.
RFC 1918 specifies the private IP address ranges for local area networks:
- 10.*, where the first 8 bits are the network number, a total of 16,777,216 private IPs
- 172.16.* to 172.31.*, a total of 1,048,576 private IPs
- 192.168.*, a total of 65,536 private IPs
Although machines with these private IP addresses have no public IP address, they can still communicate with the public network through technologies such as NAT.
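The three RFC 1918 ranges and their sizes can be checked with Python's standard `ipaddress` module. This is a small sketch; the helper name `private_block_of` is made up for illustration, and the CIDR forms 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 are the standard notation for the ranges listed above.

```python
import ipaddress

# The three RFC 1918 ranges in CIDR notation
PRIVATE_BLOCKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]

def private_block_of(ip: str):
    """Return the RFC 1918 block containing ip, or None for any other address."""
    addr = ipaddress.ip_address(ip)
    for block in PRIVATE_BLOCKS:
        if addr in ipaddress.ip_network(block):
            return block
    return None

# Print each block with the number of addresses it contains
for block in PRIVATE_BLOCKS:
    print(block, ipaddress.ip_network(block).num_addresses)

print(private_block_of("172.20.1.5"))   # inside 172.16.* to 172.31.*
print(private_block_of("172.32.0.1"))   # just outside that range
```

The address counts printed by the loop match the totals given in the list above.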
In addition to the private IP, there are several special IP addresses:
- 127.*: used for loopback testing on the local machine. Data sent to these addresses never passes through the network card; the exchange is completed inside the kernel's protocol stack
- 255.255.255.255: a special IP that represents a broadcast on the local network, which routers do not forward
- An address whose host number is all 0s represents a network, not a host (for example, you cannot use 192.168.0.0 as the IP of a machine)
- An address whose host number is all 1s represents a broadcast within that network
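These special addresses can also be inspected with the `ipaddress` module. The sketch below assumes a /24 mask for 192.168.0.0, which the text does not state; with a different mask the network and broadcast addresses would differ.

```python
import ipaddress

# Assumption: 192.168.0.0 is used with a 24-bit network number
net = ipaddress.ip_network("192.168.0.0/24")

loopback = ipaddress.ip_address("127.0.0.1").is_loopback
usable = list(net.hosts())   # excludes the all-0s and all-1s host numbers

print(loopback)
print(net.network_address)     # host number all 0s: the network itself
print(net.broadcast_address)   # host number all 1s: broadcast within the network
print(usable[0], usable[-1], len(usable))
```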
TCP Protocol Packet format
The TCP packet format is as follows:
Some fields are explained as follows:
- Source port number and destination port number: identify the two processes that are exchanging data
- 32-bit sequence number and 32-bit acknowledgment (ACK) number: TCP is a reliable protocol, and these numbers label the data in transit so that ordering and retransmission can be guaranteed
- URG/ACK/PSH/RST/SYN/FIN: flag bits that mark which phase of a TCP connection the packet belongs to; these 6 fields are explained in detail below
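A short sketch of how these fields sit in the fixed 20-byte TCP header, decoded with `struct`. The function name `parse_tcp_header`, the port numbers, and the sample segment are made up for illustration; the flag bit positions are the standard ones, and options such as MSS are not parsed.

```python
import struct

# Bit masks of the six flags within the 16-bit offset/flags field
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     off_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,   # data offset is in 32-bit words
        "flags": [name for name, bit in FLAG_BITS.items() if off_flags & bit],
        "window": window,
    }

# Hypothetical SYN+ACK segment: sequence number 8000, acknowledgment number 1001
sample = struct.pack("!HHIIHHHH", 80, 50000, 8000, 1001,
                     (5 << 12) | 0x12, 4096, 0, 0)
segment = parse_tcp_header(sample)
print(segment["flags"], segment["seq"], segment["ack"])
```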
Interaction Process
In the diagram, the key information in each packet is written on its numbered line, for example:
SYN,1000(0),<mss 1460>
This means: the packet carries the SYN flag, its 32-bit sequence number is 1000, it contains no data, and it includes an MSS option with a value of 1460
SYN,8000(0),ACK,1001,<mss 1024>
This means: the packet carries the SYN and ACK flags, its 32-bit sequence number is 8000, it contains no data, its 32-bit ACK number is 1001, and it also includes an MSS option
Next, let us look at how a TCP exchange proceeds:
Establish a connection
- The client sends packet 1. SYN is a request to establish a connection, and the sequence number of this first packet is 1000. Sequence numbers are maintained by the operating system kernel and increase with every send by the number of bytes sent. The MSS option gives the maximum segment size, which helps avoid unnecessary fragmentation by the lower-layer protocols;
- The server returns packet 2. It contains ACK 1001, meaning that everything below sequence number 1001 has been received and the next data should start at sequence number 1001 (the SYN flag itself consumes one sequence number, which is why the ACK is 1001 even though no data was sent). The packet also contains SYN 8000(0), which works the same way as the client's SYN, except that the server's initial sequence number is 8000;
- The client returns packet 3, which contains only ACK 8001 and confirms receipt of the server's connection request.
At this point the connection is established and data can be sent. The process consists of one request and one reply from each side, with the server's reply and request combined in a single packet, so 3 packets are sent in total. This is why the process is called the three-way handshake.
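The sequence and ACK numbers of the handshake can be reproduced with a few lines of arithmetic. The sketch below is a pure simulation with no real sockets; the function name `three_way_handshake` and the tuple layout are made up for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake packets as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", client_isn, None)
    # The SYN consumes one sequence number, so the server acknowledges client_isn + 1
    syn_ack = ("server", "SYN,ACK", server_isn, client_isn + 1)
    # The client acknowledges the server's SYN in the same way
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]

# Initial sequence numbers taken from the example in the text
packets = three_way_handshake(1000, 8000)
for packet in packets:
    print(packet)
```

With initial sequence numbers 1000 and 8000, the ACK values come out as 1001 and 8001, matching packets 2 and 3 above.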
Exchanging data
- The client sends packet 4, which contains ACK 8001 and 20 bytes of data with sequence numbers 1001~1020
- The server returns packet 5, which contains ACK 1021 (because it received 20 bytes) and 10 bytes of data with sequence numbers 8001~8010
- The client returns packet 6. It has no more data to send, so the packet contains only ACK 8011
The point of this section is to understand how TCP manages sequence numbers. TCP is a full-duplex protocol, that is, the server and the client can send data to each other at the same time, so each side must maintain its own sequence number. In a half-duplex protocol, only one side would need to maintain a sequence number.
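The bookkeeping that each side performs can be sketched as a small class that tracks the next sequence number to send and the next one expected from the peer. The class name `Endpoint` and its fields are hypothetical, and the sketch ignores loss, retransmission, and out-of-order arrival.

```python
class Endpoint:
    """One side of a connection: its own sequence number plus the peer's."""

    def __init__(self, next_seq: int, next_expected: int):
        self.next_seq = next_seq            # sequence number of the next byte we send
        self.next_expected = next_expected  # ACK value we will report to the peer

    def send(self, nbytes: int) -> dict:
        """Build a packet carrying nbytes of data and advance our sequence number."""
        packet = {"seq": self.next_seq, "len": nbytes, "ack": self.next_expected}
        self.next_seq += nbytes
        return packet

    def receive(self, packet: dict) -> None:
        """Accept an in-order packet and advance the ACK value accordingly."""
        if packet["seq"] == self.next_expected:
            self.next_expected += packet["len"]

# State right after the handshake from the text
client = Endpoint(next_seq=1001, next_expected=8001)
server = Endpoint(next_seq=8001, next_expected=1001)

p4 = client.send(20)
server.receive(p4)
p5 = server.send(10)
client.receive(p5)
p6 = client.send(0)
server.receive(p6)
print(p4, p5, p6, sep="\n")
```

Each endpoint keeps two independent counters, which is the practical meaning of both sides maintaining a sequence number in a full-duplex protocol.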
Close connection
- The client sends packet 7, which contains the FIN flag and sequence number 1021
- The server returns packet 8, which only acknowledges with ACK 1022
- The server then sends packet 9, which contains the FIN flag and sequence number 8011
- The client returns packet 10 with ACK 8012
When the connection is established, the server's reply and request are merged into one packet. When closing the connection, however, they are generally sent as two separate packets: after the client closes its side it can no longer send data, but the server can still send data to the client until the server also sends its own FIN.
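The reason the closing ACK and FIN are separate can be modelled by treating the two directions of the connection independently. The class `Connection` below is a hypothetical toy model, not how a real TCP stack tracks state.

```python
class Connection:
    """Tracks which directions of a full-duplex connection are still open."""

    def __init__(self):
        self.can_send = {"client": True, "server": True}

    def send_fin(self, side: str) -> None:
        # A FIN closes only the sender's own direction
        self.can_send[side] = False

    def send_data(self, side: str, data: bytes) -> int:
        if not self.can_send[side]:
            raise RuntimeError(f"{side} has already sent FIN")
        return len(data)

    def is_closed(self) -> bool:
        return not any(self.can_send.values())

conn = Connection()
conn.send_fin("client")                          # packet 7
sent = conn.send_data("server", b"remaining")    # the server may still send data
conn.send_fin("server")                          # packet 9
print(sent, conn.is_closed())
```

Between the two FINs the connection is half-closed: one direction is finished while the other still carries data.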
Sliding window
The above is a simple back-and-forth exchange. In practice, one side may send data very fast while the other side processes it slowly. Without any control, the slower side will inevitably drop packets.
TCP uses the sliding window protocol to solve this problem. Similar to the mss value above, each packet also carries a win value that tells the other side the size of the sender's own sliding window. Before each send, a side checks whether the other end's window has enough space left; if not, it does not send. This solves the problem of one side being fast and the other slow.
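The sender's side of this rule can be illustrated with a toy simulation in which the sender never has more unacknowledged bytes in flight than the advertised window. The function name `simulate_sender` and its parameters are hypothetical, and the receiver is simplified to acknowledge everything once the sender stalls.

```python
def simulate_sender(total_bytes: int, window: int, mss: int):
    """Return the segment sizes sent in each round before the sender must wait."""
    if window <= 0 or mss <= 0:
        raise ValueError("window and mss must be positive")
    sent = 0     # bytes handed to the network so far
    acked = 0    # bytes the receiver has acknowledged
    rounds = []
    while acked < total_bytes:
        burst = []
        while sent < total_bytes:
            room = window - (sent - acked)   # free space left in the peer's window
            size = min(mss, total_bytes - sent, room)
            if size <= 0:
                break                        # window full: stop and wait for an ACK
            burst.append(size)
            sent += size
        rounds.append(burst)
        acked = sent   # simplification: the receiver drains and ACKs everything
    return rounds

rounds = simulate_sender(total_bytes=5000, window=2048, mss=1024)
print(rounds)
```

With a 2048-byte window and 1024-byte segments, the sender pauses after every two segments no matter how fast it could otherwise transmit.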
Connection Status
As shown in the figure:
UDP protocol
The UDP protocol is much simpler. Its packet basically contains only the source port, destination port, length, checksum, and data.
Unlike TCP, there is no process of establishing and closing a connection; data is simply sent each time. This leads to the following problems:
- If a packet is lost somewhere along the route after it is sent, the receiver does not know
- Packets sent over different routes may arrive in a different order from the one in which they were sent, so the receiver may see the packets out of order
- If the sender is fast and the receiver is slow, the receiving end will drop packets
So, as mentioned earlier, UDP does not guarantee reliable delivery. It is generally used in high-performance scenarios, where the application layer has to add some simple handling of its own.
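One common form of such application-layer handling is to number each datagram so that the receiver can notice loss and reordering. The sketch below shows only the detection step, without any sockets; the function name `inspect_datagrams` and the numbering scheme starting at 0 are assumptions made for illustration.

```python
def inspect_datagrams(received_seqs, sent_count: int):
    """Report which sequence numbers never arrived and whether the order changed."""
    lost = sorted(set(range(sent_count)) - set(received_seqs))
    reordered = any(a > b for a, b in zip(received_seqs, received_seqs[1:]))
    return lost, reordered

# Hypothetical arrival order for 6 datagrams numbered 0..5
lost, reordered = inspect_datagrams([0, 1, 3, 2, 5], sent_count=6)
print(lost, reordered)
```

What to do next, such as requesting retransmission or simply skipping the gap, is a decision left to the application.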
Reference
- Linux C Programming One-stop Learning. http://docs.linuxtone.org/ebooks/C&CPP/c/index.html
TCP/IP network protocol stack (reprint)