Getting Started with Internet protocols

Last Update:2015-07-26 Source: Internet

Author: User

This article introduces the basic layered model of the Internet and the main protocols of each layer, as an entry-level guide to the Internet protocols. It draws on and adapts existing reference material.

Overview: the five-layer model

The Internet is implemented in several layers. Each layer has its own function and, like the floors of a building, each is supported by the layer beneath it. Users only touch the top layer and are unaware of the layers below. To understand the Internet, you have to start at the bottom and work out what each layer does. The model is generally divided into five layers: the bottom layer is the "Physical Layer", the top layer is the "Application Layer", and the three layers in between (from the bottom up) are the "Link Layer", the "Network Layer", and the "Transport Layer". The lower a layer is, the closer it is to the hardware; the higher it is, the closer it is to the user.

Layers and protocols

Each layer exists to perform a particular function. To perform it, everyone involved has to follow a common set of rules, and these shared rules are called "protocols". Every layer of the Internet defines many protocols. Collectively, they are called the "Internet Protocol Suite". They are the core of the Internet. The following sections describe the function of each layer, focusing on its main protocols.

Physical Layer

The physical layer is the physical means of connecting computers together, such as optical cable or wireless links. It is responsible for transmitting the electrical signals that represent 0 and 1.

Link Layer: definition

A bare stream of 0s and 1s is meaningless; there must be a rule for how to interpret it. This is the job of the link layer, which determines how the 0s and 1s are grouped.

Ethernet Protocol

Ethernet specifies that a group of electrical signals forms a packet, called a "frame". Each frame is divided into two parts: the header (head) and the data.

The "header" contains descriptive information about the packet, such as the sender's MAC address, the receiver's MAC address, and the data type; the "data" is the actual content of the packet.
The "header" has a fixed length of 18 bytes (strictly, a 14-byte header plus a 4-byte frame check sequence at the end of the frame). The "data" is at least 46 bytes and at most 1500 bytes long. The entire "frame" is therefore at least 64 bytes and at most 1518 bytes. If the data is longer than that, it must be split across multiple frames.
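The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the figures stated in the text (18 bytes of per-frame overhead, 46 to 1500 bytes of data); the function names `frames_needed` and `frame_sizes` are hypothetical.

```python
import math

HEADER_BYTES = 18      # per-frame overhead, as stated in the text
MIN_DATA_BYTES = 46    # shorter data is padded up to this length
MAX_DATA_BYTES = 1500  # longer data must be split across frames

def frames_needed(payload_len: int) -> int:
    """Number of Ethernet frames needed to carry payload_len bytes of data."""
    if payload_len <= 0:
        return 0
    return math.ceil(payload_len / MAX_DATA_BYTES)

def frame_sizes(payload_len: int) -> list:
    """Total size in bytes of each frame, padding short data to the minimum."""
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(remaining, MAX_DATA_BYTES)
        sizes.append(HEADER_BYTES + max(chunk, MIN_DATA_BYTES))
        remaining -= chunk
    return sizes

print(frames_needed(3000))  # 2
print(frame_sizes(3000))    # [1518, 1518]
print(frame_sizes(10))      # [64]
```

A 10-byte payload still produces a 64-byte frame because the data part is padded to the 46-byte minimum.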

MAC address

Ethernet specifies that every device connected to the network must have a network interface card (NIC). A packet always travels from one NIC to another. The address of the NIC is the sending and receiving address of the packet, and it is called the MAC address. Each NIC leaves the factory with a globally unique MAC address that is 48 bits long, usually written as 12 hexadecimal digits. The first 6 hexadecimal digits are the vendor number, and the last 6 are the serial number that the vendor gives the NIC. With the MAC address, you can locate the NIC and determine the path of the packet.
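As a small sketch of the structure just described, the following Python function splits a MAC address into its vendor part and serial part. The function name `split_mac` is hypothetical, and the sample address is the one quoted in this article.

```python
def split_mac(mac: str) -> tuple:
    """Split a 48-bit MAC address into (vendor digits, serial digits)."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # First 6 hex digits: vendor number; last 6: the vendor's serial number.
    return digits[:6], digits[6:]

vendor, serial = split_mac("00:90:f5:f1:68:65")
print(vendor, serial)  # 0090f5 f16865
```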

# On Ubuntu, view the local MAC address:
'/eth/{print $1,$5}'
# or
'{print $4}'
# or
sudo lshw -C network
# or
sudo lshw -c network | grep serial
# For example, my MAC address:
'/eth/{print $1,$5}'
00:90:f5:f1:68:65
# On Windows, enter the following command at the command prompt:
ipconfig /all

Broadcasting

Having a MAC address is not enough. How does one NIC learn the MAC address of another? And even with the address, how does the system deliver the packet accurately to the receiver?
The first question is answered by the ARP protocol, which lets us learn the address of another NIC; it is explained later. As for accurate delivery, Ethernet uses a very "primitive" method: it does not send the packet only to the receiver, but to every computer on the network, and lets each computer decide whether it is the intended receiver.

For example, computer 1 sends a packet to computer 2, and computers 3, 4, and 5 on the same subnet receive it as well. Each of them reads the "header" of the packet, finds the receiver's MAC address, and compares it with its own MAC address. If the two match, it accepts the packet for further processing; otherwise it discards the packet. This method of transmission is called "broadcasting". With the definition of the packet, the MAC address of the NIC, and broadcast transmission, the link layer can transfer data between multiple computers.
Note that at this point, transmission is confined to a single subnet.
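The accept-or-discard decision can be illustrated with a toy simulation. This is a sketch only: the host names and MAC addresses are made up, and the function `accepting_hosts` is hypothetical.

```python
def accepting_hosts(frame_dst_mac: str, hosts: dict) -> list:
    """Return the names of the hosts that accept a frame sent on the subnet.

    Every host in `hosts` (name -> MAC address) sees the frame; each one
    keeps it only if the destination MAC in the header equals its own MAC.
    """
    dst = frame_dst_mac.lower()
    return [name for name, mac in hosts.items() if mac.lower() == dst]

# Hypothetical subnet: computer 1 is the sender, the rest see its frame.
others = {
    "computer2": "00:00:00:00:00:02",
    "computer3": "00:00:00:00:00:03",
    "computer4": "00:00:00:00:00:04",
    "computer5": "00:00:00:00:00:05",
}
print(accepting_hosts("00:00:00:00:00:02", others))  # ['computer2']
```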

Network Layer: origin

The Ethernet protocol relies on MAC addresses to send data. In theory, relying on MAC addresses alone, a NIC in Shanghai could find a NIC in Los Angeles; technically this is achievable. However, doing so has a major drawback. Ethernet sends packets by broadcast, so every member of the network receives a copy, which is not only inefficient but also confined to the sender's subnet. In other words, if two computers are not on the same subnet, the broadcast does not reach the other one. This design is reasonable; otherwise every computer on the Internet would receive every packet, which would be a disaster. The Internet is a giant network made up of countless subnets, and it is almost impossible to imagine computers in Shanghai and Los Angeles sitting on the same subnet.
Therefore, there must be a way to tell which MAC addresses belong to the same subnet and which do not. If two computers are on the same subnet, the packet is sent by broadcast; otherwise it is sent by "routing". ("Routing" means distributing packets to different subnets; it is a large topic that is not covered in this article.) Unfortunately, the MAC address itself cannot do this. It depends only on the vendor and has nothing to do with the network the device is on.
This led to the birth of the "network layer". Its role is to introduce a new set of addresses that let us tell whether different computers belong to the same subnet. This set of addresses is called the "network address", or simply the "address".
Therefore, once the "network layer" exists, each computer has two kinds of address: a MAC address and a network address. There is no connection between the two. The MAC address is bound to the NIC, the network address is assigned by the administrator, and the two are combined only by circumstance.
The network address helps us determine the subnet where a computer resides, and the MAC address delivers the packet to the destination NIC within that subnet. It follows logically that the network address must be processed before the MAC address.

IP protocol

The protocol that specifies the network address is called the IP protocol. The address that it defines is called an IP address. At present, the widely used version is the fourth version of the IP protocol, referred to as IPv4. This version stipulates that the network address consists of 32 bits. In practice, we write an IP address as four decimal numbers separated by dots, ranging from 0.0.0.0 to 255.255.255.255.
Every computer on the Internet is assigned an IP address. The address is divided into two parts: the first part identifies the network, and the second part identifies the host. Take the IP address 172.16.254.1, which is a 32-bit address. Assuming its network part is the first 24 bits (172.16.254), the host part is the last 8 bits (the final 1). Computers on the same subnet must share the same network part in their IP addresses, so 172.16.254.2 should be on the same subnet as 172.16.254.1.
The problem is that the network part cannot be determined from the IP address alone. Taking 172.16.254.1 again, whether its network part is the first 24 bits, the first 16 bits, or even the first 28 bits cannot be seen from the address itself.
So how can we tell from their IP addresses whether two computers belong to the same subnet? This requires another parameter, the "subnet mask".
The subnet mask is a parameter that describes a subnet. In form it looks like an IP address: it is also a 32-bit binary number, in which the network part is all 1s and the host part is all 0s. For example, for the IP address 172.16.254.1, if the network part is known to be the first 24 bits and the host part the last 8 bits, then the subnet mask is 11111111.11111111.11111111.00000000, which is 255.255.255.0 in decimal.
Once we know the subnet mask, we can tell whether any two IP addresses are on the same subnet. The method is to perform a bitwise AND between each IP address and the subnet mask (the result bit is 1 only if both bits are 1, otherwise 0) and then compare the results. If they are the same, the two addresses are on the same subnet; otherwise they are not.
For example, given that the IP addresses 172.16.254.1 and 172.16.254.233 both have the subnet mask 255.255.255.0, are they on the same subnet? ANDing each of them with the subnet mask gives 172.16.254.0 in both cases, so they are on the same subnet.
To summarize, the IP protocol has two main functions: one is to assign an IP address to each computer, and the other is to determine which addresses are on the same subnet.
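The AND operation described above can be written out by hand. The following Python sketch uses only integer arithmetic so that each step is visible; the helper names (`ip_to_int`, `int_to_ip`, `network_of`, `same_subnet`) are hypothetical.

```python
def ip_to_int(ip: str) -> int:
    """Convert a dotted-decimal IPv4 address to a 32-bit integer."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    value = 0
    for p in parts:
        value = (value << 8) | p
    return value

def int_to_ip(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def network_of(ip: str, mask: str) -> str:
    """Bitwise AND of the address and the subnet mask."""
    return int_to_ip(ip_to_int(ip) & ip_to_int(mask))

def same_subnet(ip_a: str, ip_b: str, mask: str) -> bool:
    """Two addresses are on the same subnet if their AND results match."""
    return network_of(ip_a, mask) == network_of(ip_b, mask)

print(network_of("172.16.254.1", "255.255.255.0"))                     # 172.16.254.0
print(same_subnet("172.16.254.1", "172.16.254.233", "255.255.255.0"))  # True
```

In everyday code, Python's standard `ipaddress` module does the same job; the manual version is shown here only to mirror the explanation.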

IP packets

Data sent according to the IP protocol is called an IP packet. Naturally, it must include IP address information. But as mentioned earlier, the Ethernet packet contains only MAC addresses and has no field for the IP address. Does the Ethernet format need to be changed to add one?
The answer is no. We can put the IP packet directly into the "data" part of the Ethernet packet, so the Ethernet specification does not need to change at all. This is the benefit of the layered structure of the Internet: changes in the upper layers do not affect the lower layers.
Specifically, an IP packet is also divided into two parts, the "header" and the "data".

The "header" mainly contains the version, the length, the IP addresses, and similar information; the "data" is the actual content of the IP packet. When the IP packet is placed inside an Ethernet packet, the Ethernet packet consists of the Ethernet header, followed by the IP header, followed by the IP data.

The "header" of an IP packet is 20 to 60 bytes long, and the total length of the packet is at most 65,535 bytes. In theory, therefore, the "data" part of an IP packet can be up to 65,515 bytes long. As mentioned earlier, the "data" part of an Ethernet packet is at most 1500 bytes long. So if an IP packet exceeds 1500 bytes, it has to be split across several Ethernet packets, which are sent separately.

ARP protocol

There is one last point to explain about the network layer.

Because an IP packet is sent inside an Ethernet packet, we must know two addresses: the other party's MAC address and the other party's IP address. Normally the other party's IP address is known (this is explained later), but its MAC address is not.

So we need a mechanism for obtaining the MAC address from the IP address.

There are two cases. In the first case, the two hosts are not on the same subnet. Then there is in fact no way to obtain the other party's MAC address; the packet can only be handed to the "gateway" that connects the two subnets, and the gateway deals with it.

In the second case, the two hosts are on the same subnet, and we can use the ARP protocol to obtain the other party's MAC address. ARP also sends out a packet (carried inside an Ethernet packet) that contains the IP address of the host being queried, with the destination MAC address field set to FF:FF:FF:FF:FF:FF, which marks it as a "broadcast" address. Every host on the subnet receives this packet, extracts the IP address from it, and compares it with its own IP address. If the two match, the host replies and reports its own MAC address to the sender; otherwise it discards the packet.

In short, with the ARP protocol we can obtain the MAC address of any host on the same subnet and can therefore send packets to any of them.
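The query-and-reply exchange can be sketched as a toy simulation. No real network traffic is involved: the subnet is modeled as a dictionary, the addresses are made up, and the function `arp_resolve` is hypothetical.

```python
BROADCAST_MAC = "FF:FF:FF:FF:FF:FF"

def arp_resolve(target_ip: str, subnet_hosts: dict):
    """Simulate an ARP query on a single subnet.

    `subnet_hosts` maps each host's IP address to its MAC address.
    The request is addressed to the broadcast MAC, so every host sees it;
    only the host whose IP matches replies with its MAC address.
    Returns None when no host on the subnet owns the address.
    """
    request = {"dst_mac": BROADCAST_MAC, "query_ip": target_ip}
    for host_ip, host_mac in subnet_hosts.items():
        if host_ip == request["query_ip"]:
            return host_mac  # the matching host replies
        # every other host silently discards the request
    return None

# Hypothetical subnet contents.
subnet = {
    "192.168.1.1": "00:11:22:33:44:01",
    "192.168.1.7": "00:11:22:33:44:07",
}
print(arp_resolve("192.168.1.7", subnet))  # 00:11:22:33:44:07
print(arp_resolve("192.168.1.9", subnet))  # None
```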

Transport Layer: origin

With the MAC address and the IP address, we can already establish communication between any two hosts on the Internet.

The next problem is that many programs on the same host need to use the network. For example, you may be browsing the web and chatting with friends online at the same time. When a packet arrives from the Internet, how do you know whether it carries the content of a web page or of an online chat?

In other words, we need one more parameter to indicate which program (process) the packet is intended for. This parameter is called the "port", which is in effect a number assigned to each program that uses the NIC. Each packet is sent to a specific port on the host, so different programs can each pick up the data they need.

A "port" is an integer between 0 and 65535, which fits in exactly 16 bits. Ports 0 to 1023 are reserved for the system, so users can only choose ports greater than 1023. Whether you are browsing the web or chatting online, the application picks a random port and then contacts the appropriate port on the server.

The function of the "transport layer" is to establish "port-to-port" communication. By contrast, the function of the "network layer" is to establish "host-to-host" communication. Once the host and the port are determined, programs can communicate with each other. For this reason, UNIX systems call the combination of host and port a "socket". With sockets, you can develop network applications.

UDP protocol

Now we have to include the port information in the packet, which requires a new protocol. The simplest implementation is called the UDP protocol, whose format is little more than the port numbers placed in front of the data.

A UDP packet also consists of a "header" and "data".

The "header" mainly defines the sending port and the receiving port, and the "data" is the actual content. The whole UDP packet is then placed in the "data" part of an IP packet, and, as described earlier, the IP packet is placed inside an Ethernet packet. The whole Ethernet packet now consists of the Ethernet header, the IP header, the UDP header, and the UDP data.

UDP packets are very simple: the "header" is only 8 bytes, and the total length does not exceed 65,535 bytes, so a UDP packet fits in a single IP packet.
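To make the 8-byte header concrete, the sketch below packs and unpacks one with Python's `struct` module. Besides the two ports named in the text, the UDP header carries a 16-bit length and a 16-bit checksum; the checksum is left at 0 here for simplicity. The function names and the port numbers in the example are hypothetical.

```python
import struct

UDP_HEADER_FORMAT = "!HHHH"  # source port, destination port, length, checksum
UDP_HEADER_BYTES = struct.calcsize(UDP_HEADER_FORMAT)  # 8

def build_udp_packet(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Prefix `data` with a UDP header (checksum left as 0 in this sketch)."""
    length = UDP_HEADER_BYTES + len(data)
    if length > 65535:
        raise ValueError("UDP packet would exceed 65,535 bytes")
    header = struct.pack(UDP_HEADER_FORMAT, src_port, dst_port, length, 0)
    return header + data

def parse_udp_header(packet: bytes) -> tuple:
    """Return (source port, destination port, total length) from a packet."""
    src, dst, length, _checksum = struct.unpack(
        UDP_HEADER_FORMAT, packet[:UDP_HEADER_BYTES]
    )
    return src, dst, length

packet = build_udp_packet(50000, 53, b"hello")
print(len(packet))               # 13
print(parse_udp_header(packet))  # (50000, 53, 13)
```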

TCP protocol

The advantage of the UDP protocol is that it is simple and easy to implement. Its disadvantage is poor reliability: once a packet is sent, there is no way to know whether the other party received it.

To solve this problem and improve network reliability, the TCP protocol was created. This protocol is very complex, but it can be roughly thought of as UDP with an acknowledgement mechanism: every packet sent must be acknowledged. If a packet is lost, no acknowledgement arrives, and the sender knows that it must send the packet again.

The TCP protocol therefore ensures that data is not lost. Its disadvantages are that the process is complex, it is hard to implement, and it consumes more resources.

TCP packets, like UDP packets, are embedded in the "data" part of an IP packet. TCP packets have no length limit and can in theory be arbitrarily long, but to keep the network efficient, a TCP packet usually does not exceed the length of an IP packet, so that a single TCP packet does not have to be split again.
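The acknowledge-and-resend idea can be shown with a toy model. This is not how real TCP is implemented (real TCP uses sequence numbers, timers, and sliding windows); it is a minimal stop-and-wait sketch in which the "network" is a set of attempts that are deliberately dropped. The function `send_reliably` and its parameters are hypothetical.

```python
def send_reliably(packets: list, lost_attempts: set, max_tries: int = 5) -> tuple:
    """Send packets in order, resending any packet that is not acknowledged.

    `lost_attempts` holds (packet_index, attempt_number) pairs that the
    simulated network drops. Returns (delivered packets, transmissions used).
    """
    delivered = []
    transmissions = 0
    for index, packet in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            transmissions += 1
            if (index, attempt) in lost_attempts:
                continue  # no acknowledgement came back: send again
            delivered.append(packet)  # receiver got it and acknowledged
            break
        else:
            raise RuntimeError(f"packet {index} was lost {max_tries} times")
    return delivered, transmissions

# The first attempt at sending packet 1 is lost, so it is sent twice.
print(send_reliably(["a", "b", "c"], {(1, 1)}))  # (['a', 'b', 'c'], 4)
```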

Application Layer

The application receives data from the "transport layer" and then has to interpret it. Because the Internet is an open architecture and data comes from many different sources, the format must be agreed on in advance; otherwise the data cannot be interpreted at all.
The role of the "application layer" is to specify the data format used by applications.
For example, the TCP protocol can carry data for many kinds of programs, such as email, WWW, and FTP. There must therefore be different protocols that specify the format of email, web pages, and FTP data, and these application protocols make up the "application layer".
This is the highest layer and faces the user directly. Its data is placed in the "data" part of the TCP packet. As a result, the Ethernet packet now consists of the Ethernet header, the IP header, the TCP header, and the application-layer data.
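The nesting of layers can be illustrated with placeholder headers. In the sketch below each header is just a labeled byte string rather than a real binary header, and the function `encapsulate` is hypothetical; the point is only the order in which each layer wraps the one above it.

```python
def encapsulate(app_data: bytes) -> bytes:
    """Wrap application data in placeholder TCP, IP, and Ethernet headers."""
    tcp_packet = b"[TCP header]" + app_data          # transport layer
    ip_packet = b"[IP header]" + tcp_packet          # network layer
    frame = b"[Ethernet header]" + ip_packet         # link layer
    return frame

print(encapsulate(b"GET / HTTP/1.1"))
# b'[Ethernet header][IP header][TCP header]GET / HTTP/1.1'
```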

Summary

We already know that network communication is the exchange of data packets. Computer A sends a packet to computer B, which receives it and replies with a packet of its own, and in this way the two computers communicate. The structure of the packet is basically as described above.
To send such a packet, you need to know two addresses:

* The other party's MAC address
* The other party's IP address

With these two addresses, the packet can be delivered accurately to the receiver. However, as mentioned earlier, the MAC address has a limitation: if the two computers are not on the same subnet, the sender cannot learn the other party's MAC address, and the packet must be forwarded through a gateway.

Suppose computer 1 wants to send a packet to computer 4. It first determines whether computer 4 is on the same subnet and finds that it is not (the method for determining this is introduced later), so it sends the packet to gateway A. Gateway A, using a routing protocol, finds that computer 4 is on subnet B and sends the packet to gateway B, which then forwards it to computer 4.
For computer 1 to send the packet to gateway A, it must know gateway A's MAC address. The destination address of a packet therefore falls into two cases:

Destination addresses of a packet

Scenario	Destination addresses
Same subnet	The other party's MAC address and the other party's IP address
Different subnets	The gateway's MAC address and the other party's IP address

Before sending a packet, the computer must determine whether the other party is on the same subnet and then choose the appropriate MAC address. Next, let's see how this process works in practice.

User's network settings: static IP address

You have bought a new computer, plugged in the network cable, and switched it on. Can it get online right away?
Usually some setup is needed first. Sometimes the administrator (or ISP) gives you the following four parameters, and once you enter them into the operating system, the computer can connect to the Internet:

* The IP address of the local machine
* The subnet mask
* The IP address of the gateway
* The IP address of the DNS server

The network settings window in Windows is one place where these values are entered.

All four parameters are indispensable, and the following sections explain why you need to know them to get online. Because they are fixed, the computer is assigned the same IP address every time it starts, so this arrangement is called "static IP address" access.

However, this kind of setup is rather specialized and daunting for ordinary users. Moreover, if a computer's IP address stays fixed, other computers cannot use that address, which is inflexible. For these two reasons, most users use "dynamic IP address" access.

Dynamic IP

A "dynamic IP address" means that the computer is automatically assigned an IP address when it starts, with no manual configuration. The protocol used for this is called DHCP.

This protocol stipulates that in each subnet, one computer is responsible for managing all the IP addresses of that network; it is called the "DHCP server". When a new computer joins the network, it must send a "DHCP request" packet to the DHCP server to ask for an IP address and the associated network parameters.

As mentioned earlier, if two computers are on the same subnet, the sender must know the other party's MAC address and IP address to send a packet. But the newly joined computer knows neither of these addresses, so how does it send the packet?

The DHCP protocol solves this with some clever rules.

DHCP protocol

First, DHCP is an application-layer protocol built on top of UDP, so the whole packet consists of an Ethernet header, an IP header, a UDP header, and the DHCP data:

(1) In the "Ethernet header" at the front, set the MAC address of the sender (the local machine) and the MAC address of the receiver (the DHCP server). The former is the MAC address of the local NIC; the latter is not known, so the broadcast address FF-FF-FF-FF-FF-FF is filled in.

(2) In the "IP header" that follows, set the IP address of the sender and the IP address of the receiver. At this point the local machine knows neither. So the sender's IP address is set to 0.0.0.0 and the receiver's IP address is set to 255.255.255.255.

(3) In the "UDP header" at the end, set the port of the sender and the port of the receiver. These are fixed by the DHCP protocol: the sender uses port 68 and the receiver uses port 67.

Once this packet is constructed, it can be sent. Ethernet sends it by broadcast, so every computer on the same subnet receives it. Because the receiver's MAC address is FF-FF-FF-FF-FF-FF, the header does not show who the packet is for, so every computer that receives it must also examine the IP addresses in the packet to decide whether it is the intended recipient. On seeing that the sender's IP address is 0.0.0.0 and the receiver's is 255.255.255.255, the DHCP server knows "this packet is for me", and the other computers can discard it.

Next, the DHCP server reads the contents of the packet, assigns an IP address, and sends back a "DHCP response" packet. The structure of the response is similar: the MAC addresses in the Ethernet header are the NIC addresses of the two parties, the IP addresses in the IP header are the DHCP server's IP address (sender) and 255.255.255.255 (receiver), the ports in the UDP header are 67 (sender) and 68 (receiver), and the IP address assigned to the requester, together with the network's specific parameters, is carried in the data part.

The newly joined computer receives the response packet and thereby learns its own IP address, subnet mask, gateway address, DNS server, and other parameters.
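The addressing fields of the request described in steps (1) to (3) can be summarized in a small data structure. This sketch models only the header fields named in the text, not the actual DHCP message format; the function names and the dictionary layout are hypothetical.

```python
def dhcp_request_addressing(client_mac: str) -> dict:
    """Header fields of a DHCP request, following steps (1) to (3)."""
    return {
        "ethernet": {"src_mac": client_mac, "dst_mac": "FF-FF-FF-FF-FF-FF"},
        "ip": {"src_ip": "0.0.0.0", "dst_ip": "255.255.255.255"},
        "udp": {"src_port": 68, "dst_port": 67},
    }

def looks_like_dhcp_request(packet: dict) -> bool:
    """The check a DHCP server applies to a broadcast packet in this model."""
    return (
        packet["ip"]["src_ip"] == "0.0.0.0"
        and packet["ip"]["dst_ip"] == "255.255.255.255"
        and packet["udp"]["dst_port"] == 67
    )

request = dhcp_request_addressing("00:90:f5:f1:68:65")
print(looks_like_dhcp_request(request))  # True
```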

Internet Settings: summary

The one thing to remember from this section is that, whether a "static IP address" or a "dynamic IP address" is used, the first step in getting a computer online is to determine four parameters. These four values are important and worth repeating:

* The IP address of the local machine
* The subnet mask
* The IP address of the gateway
* The IP address of the DNS server

An example: visiting a web page. Local parameters

We assume that, following the steps in the previous section, the user has set the local network parameters:

* IP address of the local machine: 192.168.1.100
* Subnet mask: 255.255.255.0
* IP address of the gateway: 192.168.1.1
* IP address of the DNS server: 8.8.8.8

The user then opens the browser to visit Google and enters the URL www.google.com in the address bar. This means that the browser is going to send a web request packet to Google.

DNS protocol

We know that to send a packet, the other party's IP address must be known. At the moment, however, we only know the URL www.google.com, not its IP address.
The DNS protocol can convert this URL into an IP address. The DNS server is known to be 8.8.8.8, so we send a DNS packet (port 53) to this address. The DNS server replies with Google's IP address, which in this example is 172.194.72.105.
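As a sketch of what such a DNS packet contains, the function below builds the bytes of a standard query for an A record (an IPv4 address) using the DNS wire format: a 12-byte header followed by the question. The function name `build_dns_query` and the query ID are hypothetical, and nothing is actually sent over the network.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query asking for the A record of `hostname`."""
    # Header: ID, flags (0x0100 = standard query, recursion desired),
    # 1 question, 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question

query = build_dns_query("www.google.com")
print(len(query))    # 32
print(query[12:16])  # b'\x03www'
```

To actually send it, one would open a UDP socket with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and call `sendto(query, ("8.8.8.8", 53))`; in practice, `socket.getaddrinfo` performs the whole lookup for you.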

Subnet mask

Next, we need to determine whether this IP address is on the same subnet, which requires the subnet mask.
The subnet mask is known to be 255.255.255.0. The local machine performs a binary AND (the result bit is 1 only if both bits are 1, otherwise 0) between the mask and its own IP address 192.168.1.100, which gives 192.168.1.0. It then performs the same AND operation on Google's IP address 172.194.72.105, which gives 172.194.72.0. The two results are not equal, so the conclusion is that Google is not on the same subnet as the local machine.
Therefore, to send a packet to Google, we must forward it through the gateway 192.168.1.1; that is, the receiver's MAC address will be the gateway's MAC address.
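The decision in this step, whether to address the frame to the destination itself or to the gateway, can be sketched with Python's standard `ipaddress` module. The function name `next_hop_ip` is hypothetical; the addresses are the ones used in this example.

```python
import ipaddress

def next_hop_ip(local_ip: str, netmask: str, gateway_ip: str, dest_ip: str) -> str:
    """Return the IP address whose MAC address the frame should be sent to.

    If the destination is on the local subnet, that is the destination
    itself; otherwise it is the gateway.
    """
    # strict=False lets us pass a host address and derive its network.
    network = ipaddress.ip_network(f"{local_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dest_ip) in network:
        return dest_ip
    return gateway_ip

print(next_hop_ip("192.168.1.100", "255.255.255.0", "192.168.1.1", "172.194.72.105"))
# 192.168.1.1
```

The MAC address of the returned IP address would then be obtained with ARP, as described in the network layer section.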

Application Layer Protocol

Web browsing uses the HTTP protocol, and the whole packet is built from an Ethernet header, an IP header, a TCP header, and the HTTP data.

The content of the HTTP part looks similar to the following:

GET / HTTP/1.1
Host: www.google.com
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 6.1) ……
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.3
Cookie: … …

TCP protocol

The HTTP request is carried in a TCP packet, and the TCP packet is in turn embedded in an IP packet. The IP packet needs the IP addresses of both parties, which are now known: the sender is 192.168.1.100 (the local machine) and the receiver is 172.194.72.105 (Google). The header of the IP packet is 20 bytes long; together with the embedded TCP packet, the total length in this example comes to 5000 bytes.

Ethernet Protocol

Finally, the IP packet is embedded in an Ethernet packet. The Ethernet packet needs the MAC addresses of both parties: the sender is the MAC address of the local NIC, and the receiver is the MAC address of the gateway 192.168.1.1 (obtained through the ARP protocol).

The data part of an Ethernet packet is at most 1500 bytes long, and the IP packet is now 5000 bytes long. The IP packet must therefore be split into four packets. Because each of them has its own IP header (20 bytes), the four IP packets are 1500, 1500, 1500, and 560 bytes long respectively.
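The split into 1500, 1500, 1500, and 560 bytes can be checked with a short calculation. The sketch below assumes a fixed 20-byte IP header on every fragment, as in the text; the function name `fragment_sizes` is hypothetical.

```python
def fragment_sizes(total_len: int, mtu: int = 1500, header_len: int = 20) -> list:
    """Lengths of the IP packets produced when one packet is fragmented.

    `total_len` is the original packet length including its IP header;
    `mtu` is the largest IP packet that fits in one Ethernet frame.
    """
    payload = total_len - header_len
    if payload <= 0:
        raise ValueError("packet carries no data")
    per_fragment = mtu - header_len
    # In real IP, every fragment except the last carries a multiple of 8 bytes.
    per_fragment -= per_fragment % 8
    sizes = []
    while payload > 0:
        chunk = min(payload, per_fragment)
        sizes.append(chunk + header_len)  # each fragment repeats the header
        payload -= chunk
    return sizes

print(fragment_sizes(5000))  # [1500, 1500, 1500, 560]
```

The 4980 bytes of data are carried as 1480 + 1480 + 1480 + 540, and each piece gains its own 20-byte header.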

Server-side response

After being forwarded by a series of gateways, the four Ethernet packets arrive at Google's server, 172.194.72.105.

Using the sequence information in the IP headers, Google's server puts the four packets back together, extracts the complete TCP packet, reads the "HTTP request" inside it, produces an "HTTP response", and sends that back using the TCP protocol.

Once the local machine receives the HTTP response, it can display the web page, and one round of network communication is complete.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
