Introduction to Internet Protocols

Last Update:2018-12-07 Source: Internet

Author: User


Http://kb.cnblogs.com/page/144577/

We use the Internet every day. Have you ever wondered how it is implemented?

Billions of computers around the world are connected and communicate with each other. A network card in Shanghai sends a signal, and another network card in Los Angeles receives it, even though neither knows the physical location of the other. Isn't that amazing?

The core of the Internet is a set of protocols, collectively called the Internet protocol suite. They specify in detail how computers are connected and networked. Once you understand these protocols, you understand how the Internet works.

Below are my study notes. Because these protocols are complex and extensive, I wanted to put together a simple framework to help me grasp them as a whole. To keep things easy to follow, I have simplified a great deal, so some parts are neither complete nor precise, but they should be enough to explain the principles of the Internet.

1. Overview

1.1 Five-layer model

The Internet is implemented in several layers, each with its own function. Like the floors of a building, each layer is supported by the layer below it.

Users only touch the top layer and are not aware of the layers underneath. To understand the Internet, you have to start at the bottom and work out the function of each layer on the way up.

There are different models for dividing the layers: some use seven layers, some use four. I find that dividing the Internet into five layers is the easiest to explain.

The bottom layer is called the physical layer and the top layer is called the application layer. The three layers in between are, from the bottom up, the link layer, the network layer, and the transport layer. The lower the layer, the closer it is to the hardware; the higher the layer, the closer it is to the user.

Their names are not important. You only need to know that the Internet is divided into several layers.

1.2 Layers and protocols

Each layer exists to carry out one function. To implement these functions, everyone must follow common rules.

These shared rules are called protocols.

Each layer of the Internet defines many protocols. Collectively they are called the Internet protocols, and they are the core of the Internet. The following sections describe the function of each layer, focusing on its main protocols.

2. Physical layer

We start from the bottom layer.

What is the first thing to do when computers need to be networked? Obviously, the computers must first be connected, using optical fiber, cable, twisted pair, radio waves, or similar means.

This is the physical layer: the physical means of connecting computers. It mainly specifies the electrical characteristics of the network and is responsible for transmitting electrical signals that represent 0 and 1.

3. Link layer

3.1 Definition

0s and 1s alone have no meaning. An interpretation must be specified: how many electrical signals make up a group, and what does each signal bit mean?

This is the function of the link layer. It sits above the physical layer and determines how the 0s and 1s are grouped.

3.2 Ethernet protocol

In the early days, each company had its own way of grouping electrical signals. Gradually, a protocol called Ethernet came to dominate.

Ethernet specifies that a group of electrical signals forms a packet called a frame. Each frame has two parts: a "header" and "data".

The "header" contains descriptive information about the packet, such as the sender, the receiver, and the data type. The "data" is the actual content of the packet.

The "header" has a fixed length of 18 bytes. The "data" is at least 46 bytes and at most 1500 bytes long. The whole frame is therefore at least 64 bytes and at most 1518 bytes. If the data is longer than that, it must be split across multiple frames.
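The size limits above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a real Ethernet implementation: the function names are made up for this example, and padding short data with zero bytes is an assumption.

```python
HEADER_LEN = 18   # header length as given in the text
MIN_DATA = 46     # minimum length of the "data" part, in bytes
MAX_DATA = 1500   # maximum length of the "data" part, in bytes


def split_into_frames(payload: bytes) -> list[bytes]:
    """Cut the payload into pieces that each fit the data part of one frame."""
    chunks = [payload[i:i + MAX_DATA] for i in range(0, len(payload), MAX_DATA)]
    if not chunks:
        chunks = [b""]
    # Assumption: data shorter than the minimum is padded with zero bytes.
    return [chunk.ljust(MIN_DATA, b"\x00") for chunk in chunks]


def frame_sizes(payload: bytes) -> list[int]:
    """Total size (header + data) of every frame needed for the payload."""
    return [HEADER_LEN + len(chunk) for chunk in split_into_frames(payload)]
```

For instance, 3200 bytes of data would need three frames under these rules: two full ones and a shorter third one.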

3.3 MAC address

As mentioned above, the "header" of an Ethernet packet contains information about the sender and the receiver. How are the sender and the receiver identified?

Ethernet requires every device connected to the network to have a network interface card (NIC). Packets travel from one NIC to another. The address of a NIC is therefore the sending or receiving address of a packet, and it is called the MAC address.

Every NIC leaves the factory with a MAC address that is unique worldwide. It is 48 bits long and is usually written as 12 hexadecimal digits.

The first six hexadecimal digits are the manufacturer's ID and the last six are the serial number the manufacturer gave to that NIC. With MAC addresses, NICs can be located and the path of a packet can be determined.
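As a small illustration, the split into manufacturer ID and serial number could look like this in Python. The function name and the sample address in the usage note are made up for this sketch.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Return (manufacturer ID, serial number) of a 48-bit MAC address."""
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # First six hex digits: manufacturer; last six: serial number of the NIC.
    return digits[:6], digits[6:]
```

Calling `split_mac("00-B0-D0-86-BB-F7")` on this hypothetical address separates it into the two six-digit halves.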

3.4 Broadcast

Defining addresses is only the first step; there is more to do.

First, how does one NIC learn the MAC address of another NIC?

The answer is the ARP protocol, which is introduced later. For now, it is enough to know that an Ethernet packet cannot be sent until the receiver's MAC address is known.

Second, even with the MAC address, how does the system deliver the packet to exactly the right receiver?

The answer is that Ethernet uses a very "primitive" method. It does not deliver the packet precisely to the receiver. Instead, it sends the packet to every computer on the network and lets each computer decide for itself whether it is the receiver.

Suppose computer 1 sends a packet to computer 2. Computers 3, 4, and 5 on the same subnet also receive it. Each one reads the "header" of the packet, finds the receiver's MAC address, and compares it with its own. If the two match, it accepts the packet for further processing; otherwise, it discards the packet. This way of sending is called broadcasting.
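The accept-or-discard decision can be simulated with a toy model. This is a sketch of the idea only; the `Host` class, the `broadcast` function, and the MAC addresses used with them are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    mac: str
    inbox: list = field(default_factory=list)

    def on_frame(self, dest_mac: str, data: str) -> bool:
        """Keep the frame only if it is addressed to this host's MAC."""
        if dest_mac == self.mac:
            self.inbox.append(data)
            return True
        return False  # not for this host: discard


def broadcast(hosts: list, sender: Host, dest_mac: str, data: str) -> list:
    """Deliver the frame to every other host; return the names that accepted it."""
    accepted = []
    for host in hosts:
        if host is sender:
            continue
        if host.on_frame(dest_mac, data):
            accepted.append(host.name)
    return accepted
```

Every host sees the frame, but only the one whose MAC address matches keeps it.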

With the definition of packets, the MAC addresses of NICs, and broadcast transmission, the link layer can carry data between multiple computers.

4. Network layer

4.1 Origin of the network layer

The Ethernet protocol sends data by MAC address. In theory, MAC addresses alone would let a NIC in Shanghai find a NIC in Los Angeles; this is technically feasible.

However, this has a major drawback. Ethernet sends packets by broadcast, so everyone gets a copy. That is not only inefficient, it is also confined to the sender's subnet: if two computers are not in the same subnet, the broadcast cannot reach the other one. This design is reasonable, because otherwise every computer on the Internet would receive every packet, which would be a disaster.

The Internet is a giant network made up of countless subnets. It is hard to imagine computers in Shanghai and Los Angeles being in the same subnet; that is practically impossible.

Therefore, there must be a way to tell which MAC addresses belong to the same subnet and which do not. If two computers are in the same subnet, the packet is sent by broadcast; otherwise it is sent by routing. ("Routing" means how packets are distributed to different subnets. It is a big topic and is not covered in this article.) Unfortunately, the MAC address itself cannot do this. It depends only on the manufacturer and has nothing to do with the network the NIC is in.

This led to the birth of the network layer. Its role is to introduce a new set of addresses that let us tell whether different computers belong to the same subnet. This kind of address is called a "network address".

So once the network layer exists, every computer has two kinds of address: a MAC address and a network address. The two are unrelated. The MAC address is bound to the NIC, while the network address is assigned by an administrator; they simply happen to be paired on the same machine.

The network address identifies the subnet a computer is in, and the MAC address delivers the packet to the target NIC within that subnet. Logically, then, the network address must be processed first and the MAC address second.

4.2 IP protocol

The protocol that defines network addresses is called the IP protocol, and the addresses it defines are called IP addresses.

The version in wide use today is version 4 of the IP protocol, IPv4. It specifies that a network address consists of 32 bits.

By convention, an IP address is written as four decimal numbers separated by dots, ranging from 0.0.0.0 to 255.255.255.255.

Every computer on the Internet is assigned an IP address. The address has two parts: the first part identifies the network and the second part identifies the host. For example, 172.16.254.1 is a 32-bit address. If its network part is the first 24 bits (172.16.254), then its host part is the last 8 bits (the final .1). Computers in the same subnet must have the same network part, so 172.16.254.2 would be in the same subnet as 172.16.254.1.

The problem is that the network part cannot be determined from the IP address alone. Taking 172.16.254.1 again, nothing in the address itself tells us whether the network part is the first 24 bits, the first 16 bits, or even the first 28 bits.

So how can we tell from their IP addresses whether two computers belong to the same subnet? This requires another parameter, the subnet mask.

The subnet mask is a parameter that describes a subnet. It has the same form as an IP address: a 32-bit binary number in which the network part is all 1s and the host part is all 0s. For example, if the network part of 172.16.254.1 is known to be the first 24 bits and the host part the last 8 bits, the subnet mask is 11111111.11111111.11111111.00000000, written in decimal as 255.255.255.0.

Given the subnet mask, we can determine whether any two IP addresses are in the same subnet. Perform a bitwise AND between each IP address and the subnet mask (the result bit is 1 only if both bits are 1, otherwise 0), then compare the two results. If they are equal, the addresses are in the same subnet; otherwise they are not.

For example, suppose the IP addresses 172.16.254.1 and 172.16.254.233 both have the subnet mask 255.255.255.0. Are they in the same subnet? ANDing each with the subnet mask gives 172.16.254.0 in both cases, so they are in the same subnet.
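The AND comparison is short enough to write out by hand. The sketch below does the bit arithmetic explicitly so that each step is visible; the helper names are my own and are not part of any standard.

```python
def to_int(dotted: str) -> int:
    """Convert a dotted-decimal IPv4 address or mask to a 32-bit integer."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted-decimal IPv4 value: {dotted!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part
    return value


def to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def network_of(ip: str, mask: str) -> str:
    """Bitwise AND of an IP address and a subnet mask."""
    return to_dotted(to_int(ip) & to_int(mask))


def same_subnet(ip_a: str, ip_b: str, mask: str) -> bool:
    """Two addresses share a subnet when their AND results are equal."""
    return network_of(ip_a, mask) == network_of(ip_b, mask)
```

With the numbers from the text, `network_of("172.16.254.1", "255.255.255.0")` and `network_of("172.16.254.233", "255.255.255.0")` both give `172.16.254.0`, so `same_subnet` returns `True`.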

To sum up, the IP protocol has two main functions: assigning an IP address to each computer, and determining which addresses are in the same subnet.

4.3 IP packets

Data sent according to the IP protocol is called an IP packet. As you would expect, it must include IP address information.

However, as described earlier, an Ethernet packet contains only MAC addresses and has no field for an IP address. Does the Ethernet definition need to be changed to add a field?

The answer is no. The IP packet can be placed directly into the "data" part of the Ethernet packet, so the Ethernet specification does not need to change at all. This is the benefit of the Internet's layered structure: changes in an upper layer do not affect the structure of the lower layers.

Specifically, an IP packet also has two parts: a "header" and "data".

The "header" mainly contains the version, the length, the IP addresses, and similar information. The "data" is the actual content of the IP packet. Once the IP packet is placed inside an Ethernet packet, the Ethernet packet consists of the Ethernet header followed by the IP header and the IP data.

The "header" of an IP packet is 20 to 60 bytes long, and the whole packet can be at most 65,535 bytes. In theory, therefore, the "data" part of an IP packet can be at most 65,515 bytes. As mentioned above, the "data" part of an Ethernet packet can hold at most 1500 bytes, so an IP packet longer than 1500 bytes has to be split across several Ethernet packets and sent separately.

4.4 ARP protocol

One last point needs to be made about the network layer.

Because IP packets are sent inside Ethernet packets, we must know two addresses at once: the other party's MAC address and the other party's IP address. Usually the IP address is known (this is explained later), but the MAC address is not.

So we need a mechanism for obtaining a MAC address from an IP address.

There are two cases. In the first case, the two hosts are not in the same subnet. Then there is actually no way to obtain the other host's MAC address, and the packet can only be handed to the gateway that connects the two subnets, which takes care of it.

In the second case, the two hosts are in the same subnet, and we can use the ARP protocol to obtain the other host's MAC address. ARP also sends a packet (carried inside an Ethernet packet) that contains the IP address of the host being queried. The receiver's MAC address field is filled with FF:FF:FF:FF:FF:FF, which denotes a broadcast address. Every host in the subnet receives this packet, extracts the IP address from it, and compares it with its own. If the two match, the host replies and reports its MAC address to the sender; otherwise it discards the packet.

In short, with ARP we can obtain the MAC address of any host in the same subnet and so send packets to any such host.
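The ARP exchange inside one subnet can be modelled as a lookup that every host takes part in. This is a toy simulation, not the real ARP packet format; the function name, the dictionary layout, and the sample addresses are assumptions made for the example.

```python
from typing import Optional

BROADCAST_MAC = "FF:FF:FF:FF:FF:FF"


def arp_resolve(target_ip: str, subnet_hosts: dict) -> Optional[str]:
    """Ask every host on the subnet who owns target_ip.

    subnet_hosts maps each host's MAC address to its IP address.
    Returns the MAC of the host that answers, or None if nobody does.
    """
    request = {"dest_mac": BROADCAST_MAC, "target_ip": target_ip}
    for mac, ip in subnet_hosts.items():
        # Each host receives the broadcast and compares the queried
        # IP address with its own.
        if ip == request["target_ip"]:
            return mac  # this host replies with its MAC address
        # Any other host silently discards the request.
    return None
```

A host that gets `None` back has no reply to work with, which corresponds to the first case above: the target is not on this subnet and the packet has to go to the gateway.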

5. Transport layer

5.1 Origin of the transport layer

With MAC addresses and IP addresses, we can establish communication between any two hosts on the Internet.

The next problem is that many programs on the same host need the network. For example, you might be browsing the web while chatting with friends online. When a packet arrives from the Internet, how do you know whether it carries web page content or chat content?

In other words, we need one more parameter that says which program (process) a packet is for. This parameter is called the port, and it is effectively a number assigned to each program that uses the NIC. Every packet is sent to a specific port on a host, so different programs each get the data they need.

A port is an integer between 0 and 65535, which is exactly 16 bits. Ports 0 to 1023 are reserved for the system, so user programs can only use ports above 1023. Whether it is browsing the web or chatting online, an application picks a port at random and then contacts the corresponding port on the server.

The function of the transport layer is to establish port-to-port communication, whereas the function of the network layer is to establish host-to-host communication. Once the host and port are known, programs can communicate with each other. This is why Unix systems call the combination of host and port a socket. With sockets, you can develop network applications.

5.2 UDP protocol

Now port information has to be added to the packet, which requires a new protocol. The simplest implementation is the UDP protocol, whose format is little more than the data with port numbers in front of it.

A UDP packet also has two parts: a "header" and "data".

The "header" mainly defines the sending port and the receiving port, and the "data" is the actual content. The whole UDP packet is then placed in the "data" part of an IP packet, which, as described earlier, is in turn placed in an Ethernet packet. The Ethernet packet therefore now consists of the Ethernet header, the IP header, the UDP header, and the UDP data, in that order.

UDP packets are very simple. The "header" is only 8 bytes in total, and the total length cannot exceed 65,535 bytes, so that a UDP packet fits inside a single IP packet.
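The 8-byte header is simple enough to build by hand with Python's `struct` module. It holds four 16-bit fields: source port, destination port, total length, and checksum. In this sketch the checksum is left at 0, which in IPv4 means "not computed"; the function names are my own.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Prefix the data with an 8-byte UDP header (checksum left at 0)."""
    length = UDP_HEADER.size + len(data)
    if length > 65535:
        raise ValueError("UDP packet would exceed 65,535 bytes")
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + data


def parse_udp(packet: bytes) -> tuple[int, int, bytes]:
    """Return (source port, destination port, data) from a UDP packet."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(packet)
    return src_port, dst_port, packet[UDP_HEADER.size:length]
```

The resulting bytes are what would go into the "data" part of an IP packet.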

5.3 TCP protocol

The advantage of UDP is that it is simple and easy to implement. Its disadvantage is poor reliability: once a packet has been sent, the sender has no way of knowing whether the other party received it.

The TCP protocol was created to solve this problem and improve network reliability. It is a very complex protocol, but it can be thought of roughly as UDP with an acknowledgement mechanism. Every packet sent must be acknowledged. If a packet is lost, no acknowledgement arrives, and the sender knows it has to resend that packet.

TCP can therefore ensure that data is not lost. Its disadvantages are that the process is complex, it is hard to implement, and it consumes more resources.
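The acknowledge-and-resend idea can be shown with a toy simulation. To be clear, this is not how real TCP works internally (real TCP uses sequence numbers, timers, and sliding windows); it only sketches "send, wait for confirmation, resend if none arrives". The function name and the `lost_attempts` parameter are invented for this example.

```python
def send_reliably(packets: list, lost_attempts: set, max_tries: int = 5):
    """Send packets one at a time, resending each until it is acknowledged.

    lost_attempts is a set of (packet index, attempt number) pairs that the
    simulated network drops. Returns (received packets, total transmissions).
    """
    received = []
    transmissions = 0
    for seq, data in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            transmissions += 1
            if (seq, attempt) in lost_attempts:
                continue  # lost: no acknowledgement comes back, so resend
            received.append(data)  # delivered and acknowledged
            break
        else:
            raise RuntimeError(f"packet {seq} was never acknowledged")
    return received, transmissions
```

If the second packet is dropped twice, all three packets still arrive in order, at the cost of two extra transmissions.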

Like UDP packets, TCP packets are embedded in the "data" part of IP packets. A TCP packet has no length limit and can in theory be infinitely long. In practice, to keep the network efficient, a TCP packet is usually no longer than an IP packet, so that a single TCP packet does not have to be split.

6. Application layer

An application receives data from the transport layer and then has to interpret it. Because the Internet is an open architecture and data comes from many kinds of sources, the data format must be agreed in advance; otherwise the data cannot be interpreted.

The role of the application layer is to define the data formats used by applications.

For example, TCP can carry data for many kinds of programs, such as email, WWW, and FTP. There must therefore be different protocols that specify the formats of email, web page, and FTP data. These application protocols make up the application layer.

This is the highest layer, the one that directly faces the user. Its data is placed in the "data" part of a TCP packet, so the Ethernet packet now consists of the Ethernet header, the IP header, the TCP header, and the application-layer data, in that order.
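The nesting of the layers can be pictured as each layer putting its own header in front of whatever the layer above hands down. In the sketch below the headers are just readable placeholder strings, not real header bytes, and the function name is hypothetical.

```python
def encapsulate(app_data: bytes) -> bytes:
    """Wrap application data in placeholder TCP, IP, and Ethernet headers."""
    tcp_packet = b"[TCP header]" + app_data        # transport layer
    ip_packet = b"[IP header]" + tcp_packet        # network layer
    frame = b"[Ethernet header]" + ip_packet       # link layer
    return frame
```

Reading the result from left to right gives the order in which the receiving side peels the headers off again.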

At this point, the five-layer structure of the Internet has been explained from the bottom up. That was a system designer's view of how the Internet is built. The following sections take the user's point of view and look at how this structure works together to complete a network data exchange.

7. A summary

First, a summary of what has been covered so far.

We already know that network communication is the exchange of packets. Computer A sends a packet to computer B, which receives it and sends a reply, and that is how two computers communicate. A packet is structured as an Ethernet header, an IP header, a TCP header, and the application-layer data, each nested inside the one before it.

To send such a packet, two addresses must be known:

The MAC address of the other party

The IP address of the other party

With these two addresses, a packet can be delivered accurately to the receiver. However, as mentioned earlier, the MAC address has a limitation. If the two computers are not in the same subnet, the sender cannot learn the MAC address of the other computer, and the packet has to be forwarded through a gateway.

For example, computer 1 wants to send a packet to computer 4. It first determines whether computer 4 is in the same subnet and finds that it is not (how this is determined is described later), so it sends the packet to gateway A. Through a routing protocol, gateway A finds that computer 4 is in subnet B and sends the packet to gateway B, which then forwards it to computer 4.

When computer 1 sends the packet to gateway A, it must know gateway A's MAC address. The destination addresses of a packet therefore fall into two cases:

Scenario	Packet addresses
Same subnet	MAC address of the other party, IP address of the other party
Different subnet	MAC address of the gateway, IP address of the other party

Before sending a packet, a computer must determine whether the other party is in the same subnet and then choose the corresponding MAC address. Next, let's look at how this process works in practice.
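The two cases in the table can be sketched with Python's standard `ipaddress` module. The function name, its parameters, and the MAC strings used with it are made up for this illustration.

```python
import ipaddress
from typing import Optional


def choose_destination(local_ip: str, subnet_mask: str, peer_ip: str,
                       peer_mac: Optional[str], gateway_mac: str) -> tuple[str, str]:
    """Return the (destination MAC, destination IP) to put on a packet."""
    # strict=False lets us pass a host address together with its mask.
    network = ipaddress.ip_network(f"{local_ip}/{subnet_mask}", strict=False)
    if ipaddress.ip_address(peer_ip) in network:
        if peer_mac is None:
            raise ValueError("peer is on the same subnet; its MAC must be resolved first")
        return peer_mac, peer_ip      # same subnet: address the peer directly
    return gateway_mac, peer_ip       # different subnet: hand the packet to the gateway
```

In both cases the destination IP address stays the same; only the destination MAC address changes.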

8. User's Internet access settings

8.1 Static IP address

You buy a new computer, plug in the network cable, and switch it on. Can the computer access the Internet right away?

Usually some settings have to be made first. Sometimes the administrator (or the ISP) gives you the following four parameters. You enter them into the operating system, and the computer can connect to the Internet:

IP address of the local machine

Subnet mask

IP address of the gateway

IP address of the DNS server

(Figure: the Windows settings window for these parameters.)

All four parameters are required; later sections explain why each must be known before the computer can go online. Because the parameters are fixed, the computer gets the same IP address every time it starts, so this arrangement is called Internet access with a static IP address.

However, these settings are quite technical and are daunting for ordinary users. Moreover, if a computer's IP address never changes, no other computer can use that address, which is inflexible. For these two reasons, most users go online with a dynamic IP address.

8.2 Dynamic IP address

A dynamic IP address means that the computer is automatically assigned an IP address after it starts. The protocol used for this is DHCP.

DHCP specifies that in each subnet, one computer is responsible for managing all the IP addresses of that network. It is called the DHCP server. When a new computer joins the network, it must send a DHCP request packet to the DHCP server to apply for an IP address and the related network parameters.

As mentioned above, if two computers are in the same subnet, the sender must know the MAC address and IP address of the other party before it can send a packet. A newly joined computer knows neither address, so how can it send a packet?

DHCP handles this with some clever rules.

8.3 DHCP protocol

First of all, DHCP is an application-layer protocol built on UDP, so the whole packet consists of an Ethernet header, an IP header, a UDP header, and the DHCP data. The headers are filled in as follows:

(1) The Ethernet header at the front sets the MAC address of the sender (the local machine) and the MAC address of the receiver (the DHCP server). The former is the MAC address of the local NIC. The latter is not yet known, so the broadcast address FF-FF-FF-FF-FF-FF is used.

(2) The IP header that follows sets the IP addresses of the sender and the receiver. At this point the local machine knows neither, so the sender's IP address is set to 0.0.0.0 and the receiver's to 255.255.255.255.

(3) The UDP header at the end sets the ports of the sender and the receiver. These are fixed by the DHCP protocol: the sender uses port 68 and the receiver uses port 67.

Once constructed, the packet can be sent. Ethernet transmits by broadcast, so every computer in the same subnet receives the packet. Because the receiver's MAC address is FF-FF-FF-FF-FF-FF, the MAC address does not reveal who the packet is for, so every computer that receives it must also examine the packet's IP addresses to decide whether the packet is meant for it. On seeing that the sender's IP address is 0.0.0.0 and the receiver's is 255.255.255.255, the DHCP server knows "this packet is for me", and the other computers can discard it.
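The header values of a DHCP request, and the check a server might make on them, can be written down as plain data. This is only a sketch of the addressing described above, not the real DHCP message format; the function names and the dictionary layout are assumptions.

```python
BROADCAST_MAC = "FF-FF-FF-FF-FF-FF"


def build_dhcp_request_headers(local_mac: str) -> dict:
    """Header fields of a DHCP request, layer by layer."""
    return {
        "ethernet": {"src_mac": local_mac, "dst_mac": BROADCAST_MAC},
        "ip": {"src_ip": "0.0.0.0", "dst_ip": "255.255.255.255"},
        "udp": {"src_port": 68, "dst_port": 67},
    }


def looks_like_dhcp_request(headers: dict) -> bool:
    """Check the IP addresses, as in the text, plus the DHCP ports."""
    ip, udp = headers["ip"], headers["udp"]
    return (ip["src_ip"] == "0.0.0.0"
            and ip["dst_ip"] == "255.255.255.255"
            and udp["src_port"] == 68
            and udp["dst_port"] == 67)
```

Checking the ports as well as the IP addresses is my own addition here; the text only mentions the IP addresses.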

Next, the DHCP server reads the data in the packet, allocates an IP address, and sends back a DHCP response packet. The structure of the response is similar. The MAC addresses in the Ethernet header are the NIC addresses of the two parties, the IP addresses in the IP header are the DHCP server's IP address (sender) and 255.255.255.255 (receiver), and the ports in the UDP header are 67 (sender) and 68 (receiver). The IP address assigned to the requester and the specific parameters of the network are contained in the data part.

After receiving the response packet, the newly joined computer knows its own IP address, subnet mask, gateway address, DNS server, and other parameters.

8.4 Internet access settings: summary

The point to remember from this section is that, with either a static or a dynamic IP address, the first step in getting a computer online is to determine four parameters. They are important enough to repeat:

IP address of the local machine

Subnet mask

IP address of the gateway

IP address of the DNS server

With these values, the computer can go online. Next, let's look at an example of how the Internet protocols work when a user visits a web page.

9. An example: visiting a web page

9.1 Local parameters

Assume that, after the steps in the previous section, the user has set the network parameters of the local machine:

IP address of the local machine: 192.168.1.100

Subnet mask: 255.255.255.0

IP address of the gateway: 192.168.1.1

IP address of the DNS server: 8.8.8.8

The user then opens a browser and types www.google.com into the address bar to visit Google.

This means that the browser will send a web page request packet to Google.

9.2 DNS protocol

We know that to send a packet, we must know the other party's IP address. At this point, however, we only know the domain name www.google.com, not its IP address.

The DNS protocol converts the domain name into an IP address. The DNS server is known to be 8.8.8.8, so we send a DNS packet to that address (port 53).

The DNS server then responds and tells us that Google's IP address is 172.194.72.105. Now we know the other party's IP address.
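To give an idea of what such a DNS packet contains, here is a minimal sketch that builds the bytes of a query for an address record. It only constructs the message and does not send it. The function name and the query ID 0x1234 are arbitrary choices for the example.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query asking for the IPv4 address (A record) of hostname."""
    # Header: ID, flags (0x0100 = standard query, recursion desired),
    # then four counts: 1 question, 0 answers, 0 authority, 0 additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # Question type 1 = A record, question class 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question
```

These bytes would be the "data" of a UDP packet addressed to port 53 of the DNS server, 8.8.8.8 in this example.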

9.3 Subnet mask

Next, we need to determine whether this IP address is in the same subnet as the local machine, which requires the subnet mask.

The subnet mask is known to be 255.255.255.0. The local machine performs a bitwise AND between the mask and its own IP address 192.168.1.100 (the result bit is 1 only if both bits are 1, otherwise 0), which gives 192.168.1.0. It then performs the same AND with Google's IP address 172.194.72.105, which gives 172.194.72.0. The two results are not equal, so Google and the local machine are not in the same subnet.

Therefore, a packet to Google must be forwarded through the gateway 192.168.1.1. In other words, the receiver's MAC address will be the gateway's MAC address.

9.4 Application layer protocol

Web browsing uses the HTTP protocol. The whole packet consists of an Ethernet header, an IP header, a TCP header, and the HTTP data.

The HTTP part looks similar to the following:

GET / HTTP/1.1
Host: www.google.com
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 6.1) ......
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.3
Cookie: ......

We assume that this part is 4960 bytes long. It is embedded in a TCP packet.
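As a rough sketch, a request of this shape is just lines of text separated by CRLF and terminated by an empty line. The helper below is hypothetical and builds only a minimal request, far shorter than the 4960 bytes assumed in the example.

```python
from typing import Optional


def build_http_get(host: str, path: str = "/",
                   extra_headers: Optional[dict] = None) -> bytes:
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Header lines end with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

These bytes are what the browser hands down to the transport layer.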

9.5 TCP protocol

A TCP packet needs ports. The receiver's (Google's) HTTP port is 80 by default, and the sender's (local machine's) port is a random integer between 1024 and 65535, assumed here to be 51775.

The header of the TCP packet is 20 bytes long, so with the HTTP data embedded, the total length of the TCP packet is 4980 bytes.

9.6 IP protocol

The TCP packet is then embedded in an IP packet. The IP packet needs the IP addresses of both parties, which are known: the sender is 192.168.1.100 (the local machine) and the receiver is 172.194.72.105 (Google).

The header of the IP packet is 20 bytes long, so with the TCP packet embedded, the total length of the IP packet is 5000 bytes.

9.7 Ethernet protocol

Finally, the IP packet is embedded in Ethernet packets. An Ethernet packet needs the MAC addresses of both parties: the sender is the MAC address of the local NIC, and the receiver is the MAC address of the gateway 192.168.1.1 (obtained through ARP).

The "data" part of an Ethernet packet can hold at most 1500 bytes, but the IP packet is 5000 bytes long, so it has to be split into four packets. Because each packet has its own IP header (20 bytes), the IP packet lengths of the four packets are 1500, 1500, 1500, and 560 bytes.
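The arithmetic behind these four lengths can be checked with a short sketch. It assumes 20-byte TCP and IP headers as in the example; the constant and function names are my own.

```python
TCP_HEADER = 20          # bytes, as assumed in the example
IP_HEADER = 20           # bytes, as assumed in the example
ETHERNET_MAX_DATA = 1500  # bytes available in the "data" part of one frame


def ip_fragment_lengths(ip_packet_len: int) -> list[int]:
    """Lengths of the IP packets needed to carry one large IP packet."""
    payload = ip_packet_len - IP_HEADER
    max_payload = ETHERNET_MAX_DATA - IP_HEADER
    # In real IP fragmentation every fragment's payload except the last
    # must be a multiple of 8 bytes; 1480 already is.
    max_payload -= max_payload % 8
    lengths = []
    while payload > 0:
        chunk = min(payload, max_payload)
        lengths.append(chunk + IP_HEADER)  # every fragment gets its own header
        payload -= chunk
    return lengths


http_len = 4960
tcp_len = http_len + TCP_HEADER   # 4980
ip_len = tcp_len + IP_HEADER      # 5000
```

Calling `ip_fragment_lengths(ip_len)` gives `[1500, 1500, 1500, 560]`: three fragments carrying 1480 bytes each and a last one carrying the remaining 540 bytes, each with its own 20-byte header.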

9.8 Server response

After being forwarded by several gateways, the four Ethernet packets reach Google's server at 172.194.72.105.

Using the sequence information in the IP headers, the server reassembles the four packets, extracts the complete TCP packet, reads the HTTP request inside it, generates an HTTP response, and sends it back over TCP.

When the local machine receives the HTTP response, it can display the web page, which completes one round of network communication.

This example is simplified, but it roughly reflects the whole communication process of the Internet protocols.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
