From theory to practice, all-round understanding of DNS (theory)

Source: Internet
Author: User

Everyone is surely familiar with DNS (Domain Name System): isn't it just the thing that converts a site's domain name into the corresponding IP address? When we find that we can log in to QQ but cannot browse web pages, we suspect that the domain name server is down. When we use a Hosts file provided by someone else to reach a "nonexistent" web page, we come to appreciate how fragile the domain name resolution system is.

But there is much more to the DNS story that is worth hearing and thinking about.

Where DNS comes from

To access a computer on the network, we have to know its IP address. But these addresses (such as 243.185.187.39) are just strings of numbers with no obvious pattern, so they are hard to remember. And if a computer changes its IP address, everyone who uses it has to be notified.

Obviously, using IP addresses directly is a poor solution. So people came up with an alternative: give each computer a name, and maintain a mapping between computer names and addresses. We refer to a computer by its name, and the name-to-address conversion is done automatically by the computer.

Hosts mapping

In the early days, name-to-address conversion was straightforward. Each computer held a Hosts file listing all computer names and their IP addresses, and periodically refreshed it from a site that maintained the file. To access a computer by name, we first looked up the corresponding IP address in the Hosts file and then established the connection.
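The Hosts lookup described above can be sketched in a few lines of Python. This is only an illustration: the sample file contents, the helper names `parse_hosts` and `lookup`, and the choice to keep the first entry when a name appears twice are all assumptions of this sketch, and the addresses come from the documentation range 192.0.2.0/24.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {name: ip} dictionary."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption of this sketch: the first entry for a name wins.
            table.setdefault(name.lower(), ip)
    return table


def lookup(table, name):
    """Return the IP mapped to name, or None if the file has no entry."""
    return table.get(name.lower())


# Hypothetical file contents, for illustration only.
SAMPLE_HOSTS = """
# address      names
127.0.0.1      localhost
192.0.2.10     alpha alpha.example   # one address, two names
192.0.2.20     beta
"""

hosts_table = parse_hosts(SAMPLE_HOSTS)
print(lookup(hosts_table, "alpha"))   # a name listed in the file
print(lookup(hosts_table, "gamma"))   # a name the file does not know
```

Every machine needs the whole table for this to work, which is exactly what stops scaling as the network grows.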

This is how the early ARPANET worked, but as the network grew, the approach became less and less workable, for three main reasons:

    1. The Hosts file becomes very large;
    2. Host names conflict;
    3. The centralized maintenance site is overwhelmed (it would have to serve the Hosts file to millions of machines, which is scary just to think about).

Domain Name System

To solve these problems, Paul Mockapetris introduced the Domain Name System (DNS) in 1983. It is a hierarchical, domain-based naming scheme implemented as a distributed database system. When we need to access a domain name (which is really the computer name mentioned earlier), the application sends a DNS request to a DNS server, and the DNS server returns the IP address for that domain name. The three problems above are addressed as follows:

    1. The user's computer no longer stores every name-to-IP mapping, so the Hosts file never grows too large (today the Hosts file is nearly empty by default on most operating systems).
    2. Naming rules for domain names guarantee that host names are not duplicated.
    3. The DNS server is no longer a single machine but a hierarchical, sensibly organized cluster of servers.

The process of accessing a domain name can be simplified to:

DNS protocol

So how is this Domain Name System actually implemented? Managing a huge, constantly changing set of domain-name-to-IP mappings is no simple matter, and on top of that the system has to handle many thousands of DNS queries. In the end, people came up with a neat set of protocols for building the system, so let's take a look.

Domain Name space

First we need a set of naming rules that prevents duplicate domain names. The DNS rules for domain names resemble the courier systems of everyday life: both use hierarchical addresses. To send someone a parcel through a courier, the address might be: China, Guangdong Province, Guangzhou City, Panyu District, Zhongshan West Road, No. 12, XXX. A domain name, on the other hand, looks like groups.google.com. (Why not com.google.groups? I guess it has something to do with the Western habit of writing an address from the most specific part to the most general.)

On the Internet, the top level of the domain name hierarchy (the equivalent of the country in an international courier address) is administered by ICANN (the Internet Corporation for Assigned Names and Numbers). There are currently more than 250 top-level domains. Each top-level domain can be divided into subdomains (second-level domains), which can in turn be divided further (third-level domains), and so on. All of these domains can be organized into a tree, as shown in the figure (image from Computer Networks, Figure 7-1):
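To make the hierarchy concrete, here is a small sketch that walks a domain name from the top-level domain down to the full name, the same order in which a courier address is read from country to street. The function name `hierarchy` is made up for this illustration.

```python
def hierarchy(domain):
    """Return the chain of domains from the top level down to `domain`."""
    labels = domain.rstrip(".").lower().split(".")
    # A domain name is written most-specific label first,
    # so the tree is walked from the rightmost label to the left.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


print(hierarchy("groups.google.com"))
# ['com', 'google.com', 'groups.google.com']
```

Each element of the list is one node on the path from the top of the tree to the leaf.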

Domain Name resource record

DNS was originally designed to map domain names to IP addresses, so in theory we only need to store one record per domain name on a name server. Such a record is called a domain name resource record. It is a five-tuple that can be written in the following format:

domain_name time_to_live Class Type Value

where:

    1. domain_name: the domain name this record applies to;
    2. time_to_live: the lifetime of the record, that is, the maximum time the record may be cached (the caching mechanism is described later);
    3. Class: almost always IN (Internet);
    4. Type: the type of the record;
    5. Value: the value of the record; for an A record, the value is an IPv4 address.
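The five-tuple maps naturally onto a small data class. The sketch below parses one line in the format given above; the class and function names are hypothetical, the example record uses a made-up name and a documentation-range address, and the TTL is taken to be in seconds, as is usual for DNS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    domain_name: str
    time_to_live: int  # how long the record may be cached, in seconds
    rr_class: str      # almost always "IN"
    rr_type: str       # "A", "NS", "MX", ...
    value: str


def parse_record(line):
    """Parse 'domain_name time_to_live Class Type Value' into a record."""
    # Split on the first four gaps only, because the value itself
    # may contain spaces (for example an MX preference and host name).
    name, ttl, rr_class, rr_type, value = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rr_class.upper(), rr_type.upper(), value.strip())


record = parse_record("www.example.com 3600 IN A 192.0.2.1")
print(record.domain_name, record.rr_type, record.value)
```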

We saw that a resource record has a Type field indicating the kind of record. What is it for? A domain name usually needs more than just an IP address record; it may need other kinds of records as well. Some common record types are:

Record type    Meaning
A              IPv4 address of the host
AAAA           IPv6 address of the host
NS             Authoritative name server for the domain the name belongs to
MX             Domain name of the server that accepts email for the domain
CNAME          Canonical name that the current domain name is an alias of
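As a rough illustration of why the Type field matters, the sketch below keeps a handful of records for one made-up domain and selects them by type, following a CNAME alias when the requested type is not present for the name itself. The record set, the function name `find`, and the alias limit are all assumptions of this example.

```python
# (owner name, type, value); names and addresses are invented for illustration.
RECORDS = [
    ("example.com", "NS", "ns1.example.com"),
    ("example.com", "MX", "mail.example.com"),
    ("mail.example.com", "A", "192.0.2.25"),
    ("www.example.com", "CNAME", "web.example.com"),
    ("web.example.com", "A", "192.0.2.80"),
    ("web.example.com", "AAAA", "2001:db8::80"),
]


def find(name, rr_type, max_aliases=8):
    """Return the values of the given type for name, following CNAME aliases."""
    for _ in range(max_aliases):
        values = [v for n, t, v in RECORDS if n == name and t == rr_type]
        if values:
            return values
        aliases = [v for n, t, v in RECORDS if n == name and t == "CNAME"]
        if not aliases:
            return []
        name = aliases[0]  # restart the lookup at the name the alias points to
    return []


print(find("www.example.com", "A"))   # reached through the CNAME
print(find("example.com", "MX"))      # mail server for the domain
```

The same name can therefore answer different questions (where is the web host, who takes the mail, who is authoritative) depending on the type asked for.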

We will see examples of these resource records in the next article (practice).

Domain Name server

We know that a single name server cannot answer every DNS query: no one machine has the computing power, storage, or bandwidth to serve users all over the world. The only option is to organize a cluster of name servers sensibly so that they work together to provide name resolution. The first problem to face is how to distribute all the domain name resource records sensibly across different name servers.

The domain name space can be organized as a tree, and we can further divide that tree into non-overlapping zones (DNS zones). One possible division looks like this:

Each zone is then associated with multiple name servers: one master, plus slave servers that provide data backup, speed up resolution, and keep the service available. These are called the authoritative name servers for the zone, and they hold two kinds of domain name resource records:

    1. The resource records for all domain names in the zone.
    2. The resource records (primarily NS records) for the name servers of the parent and child zones.
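One way to picture the split into zones: given a domain name, the zone responsible for it is the one whose name is the longest matching suffix. The sketch below assumes three made-up zones and invented server names; it is a toy model, not how any real name server stores its zone data.

```python
# Hypothetical zones and their authoritative name servers.
ZONES = {
    "cn": ["ns-cn-1", "ns-cn-2"],
    "edu.cn": ["ns-edu-1"],
    "sysu.edu.cn": ["ns-sysu-1", "ns-sysu-2"],
}


def zone_for(name, zones=ZONES):
    """Return the most specific zone whose name is a suffix of `name`."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then drop one leading label at a time,
    # so the longest (most specific) matching zone is found first.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return None


print(zone_for("math.sysu.edu.cn"))   # sysu.edu.cn
print(zone_for("pku.edu.cn"))         # edu.cn
print(zone_for("example.org"))        # None: no zone in this toy table
```

Because zones do not overlap, every name lands in exactly one zone, and that zone's servers are the ones allowed to answer for it.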

In this way, all the resource records are stored across many name servers, and the name servers themselves form a hierarchical index structure that we can follow when resolving a name. Below, a simplified domain name space is used to illustrate how resource records are stored on name servers, as in Figure A:

The domain name space in the figure is divided into seven DNS zones, A, B, C, D, E, F, and G. Each zone has several authoritative name servers, and each of these servers stores many resource records. For DNS zone E, the records stored on its authoritative name server are listed in the table below.

Looking closely, you may notice that zones A and B have no parent zone and no path connecting them. This leads to a troublesome problem: the authoritative name server for zone A may have no idea that zone B exists. A natural fix would be to record the address of B's name server in A and the address of A's in B, so that the two are connected. But with more than 250 top-level domains, this pairwise approach is not very practical.

The Domain Name System we actually use takes a smarter approach: it introduces root name servers, which hold the authoritative name server records for all the top-level zones. Through a root name server we can find the authoritative name servers for every top-level zone, and then work down one level at a time. A map of the global root name servers can be found here.

At this point, the authoritative name servers and the root name servers effectively form a tree: the root of the tree is the root name server, and each node below it is an authoritative name server. For the DNS zones in Figure A, the authoritative name servers form the tree below. (In practice, one authoritative name server may hold records for several DNS zones, so the links between authoritative name servers do not strictly form a tree. For details, see RFC 1034, section 4, NAME SERVERS. For ease of understanding, we simplify it to a tree here.)

Domain Name resolution

We now have a cluster of name servers that sensibly stores the mapping between the domain name space and the resource records. All that remains is to send a DNS request to a name server and wait for it to return the right resource record. This process is called domain name resolution.

Strictly speaking, domain name resolution starts when the network connection is established. Each time a computer connects to a network, it automatically obtains a default DNS server. You can of course use a DNS server you trust instead, such as 8.8.8.8 (yes, DNS servers can be trustworthy or untrustworthy; the practice article covers this). This name server is also called the local name server. From then on, whenever we need a resource record for a domain name, we send the request to the local name server. If the domain name happens to fall within the local name server's own DNS zone, the record can be returned directly.

If the local name server has no resource record for the domain name, it has to search the whole domain name space. Because the resource records of the entire name space are stored hierarchically, in a tree-like structure, across a series of name servers, the local name server starts at a root name server and searches downward. That raises a question: how does the local name server find the root name servers? When a name server starts, it loads a configuration file that holds the NS records of the root name servers (the addresses of the root name servers are very stable and rarely change, and there are only a few of them, so this file is very small). Once a root name server has been found, the search can proceed downward one level at a time.

Taking Figure A as the example again, suppose a user in zone E wants to access math.sysu.edu.cn. The request proceeds as follows:

In words, it goes roughly like this:

    1. User: Hello, local name server, tell me the address of math.sysu.edu.cn;
    2. Local name server: Hmm, I don't know, it's not in my jurisdiction. Let me ask the boss. Root, can you tell me the address of math.sysu.edu.cn?
    3. Root name server: I'm busy, go ask B (.cn);
    4. Local name server: Hello, B, tell me the address of math.sysu.edu.cn;
    5. B: Go ask D (.edu.cn);
    6. Local name server: Hello, D, tell me the address of math.sysu.edu.cn;
    7. D: Go ask F (sysu.edu.cn);
    8. Local name server: Hello, F, tell me the address of math.sysu.edu.cn;
    9. F: Let me look it up... ah, found it, it's x.x.x.x;
    10. Local name server: Phew, finally found it. Hey, user, come out, I've got it: it's x.x.x.x.
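The conversation above can be simulated with a few dictionaries. This is a toy model only: the servers are plain Python dicts named after the zones in the dialogue, a referral is represented as ("NS", next server), and since the source leaves the final address as x.x.x.x, the address 192.0.2.53 used here is an invented placeholder from the documentation range.

```python
# Each simulated server maps a domain suffix either to a referral
# ("NS", name of the next server) or to a final answer ("A", address).
SERVERS = {
    "root": {"cn": ("NS", "B")},
    "B": {"edu.cn": ("NS", "D")},
    "D": {"sysu.edu.cn": ("NS", "F")},
    "F": {"math.sysu.edu.cn": ("A", "192.0.2.53")},  # placeholder address
}


def query(server, name):
    """Ask one server; return its record for the longest matching suffix."""
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SERVERS[server]:
            return SERVERS[server][suffix]
    return None


def resolve_iteratively(name, max_steps=10):
    """Play the local name server: start at the root and follow referrals."""
    server, path = "root", []
    for _ in range(max_steps):
        path.append(server)
        answer = query(server, name)
        if answer is None:
            return None, path      # this server knows nothing useful
        kind, value = answer
        if kind == "A":
            return value, path     # final answer found
        server = value             # referral: go and ask the next server
    return None, path


address, path = resolve_iteratively("math.sysu.edu.cn")
print(address, path)   # 192.0.2.53 ['root', 'B', 'D', 'F']
```

Note that it is always the local name server that sends the next question; each server it asks only points one level further down.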

Think about it: this is just like sending a parcel. Suppose you mail something from the United States to Panyu District in Guangzhou. The parcel first goes to China (although there is nothing like a root name server here), then down to Guangdong Province, then to Guangzhou, and finally to Panyu.

The above is how the local name server resolves a name iteratively. Recursive queries are also possible; I won't go into them here, since the idea is similar.

Caching mechanism

The Domain Name System can now provide us with name resolution: when we type a domain name, the computer sends a DNS request and the DNS server returns the result. Everything looks perfect. But can it be made even better?

Thinking about how we usually browse the web, two interesting observations emerge:

    1. 80% of the time we visit the same 20% of websites, which is the famous 80/20 rule;
    2. We jump between different pages of one website, that is, we keep accessing the same domain name, much like the principle of locality in program memory access.

These two observations naturally suggest caching. If we cache the resolution results for domain names we have already visited on our own computer, the next visit can read the result directly without repeating the DNS query, which saves trouble for both us and the name servers.

Of course, this assumes that cached results do not change often, that is, the result I got for a domain name 10 minutes ago is still the same now. For most domain names this holds. But inevitably some "fickle" domain names change their resolution results frequently. To make caching work for both kinds, we add a time_to_live field to the resource record that says how long the record may be cached. "Steady" domain names get a relatively large value, and "fickle" ones get a small value.
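A minimal sketch of such a cache, assuming the TTL is given in seconds: each entry remembers when it expires, and an expired entry is treated as a miss. The class name `DnsCache` and the sample names are hypothetical, and the clock is injectable so the example can jump forward in time without actually waiting.

```python
import time


class DnsCache:
    """Cache of resolution results that honours each record's time_to_live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # too old: forget it and report a miss
            return None
        return value


# A fake clock stands in for real time in this demonstration.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("steady.example", "192.0.2.1", ttl=86400)  # large TTL
cache.put("fickle.example", "192.0.2.2", ttl=60)     # small TTL

now[0] = 120.0                      # two minutes later
print(cache.get("steady.example"))  # 192.0.2.1
print(cache.get("fickle.example"))  # None, the entry has expired
```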

Since caching works on our own machine, can we also cache on the name server? Of course we can, because the two observations above hold just as well for a name server. A name server can cache the resource records it has looked up, and when a user asks for one of them, it returns the cached result directly without another iterative or recursive resolution.
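On the server side, the payoff is fewer upstream lookups. The sketch below wraps an arbitrary lookup function and counts how often it really has to be called; `CachingResolver`, `fake_upstream`, the 300-second TTL, and the address are all invented for this illustration.

```python
class CachingResolver:
    """Name server stand-in that only resolves upstream on a cache miss."""

    def __init__(self, upstream, clock):
        self.upstream = upstream    # function: name -> (address, ttl in seconds)
        self.clock = clock
        self.cache = {}             # name -> (address, expiry time)
        self.upstream_queries = 0

    def resolve(self, name):
        hit = self.cache.get(name)
        if hit is not None and self.clock() < hit[1]:
            return hit[0]           # still fresh: answer from the cache
        self.upstream_queries += 1  # miss or expired: do the full resolution
        address, ttl = self.upstream(name)
        self.cache[name] = (address, self.clock() + ttl)
        return address


def fake_upstream(name):
    # Stands in for the whole iterative or recursive resolution.
    return "192.0.2.7", 300


now = [0]
resolver = CachingResolver(fake_upstream, clock=lambda: now[0])
for t in (0, 10, 20):               # three users ask within 20 seconds
    now[0] = t
    resolver.resolve("www.example.com")
queries_while_fresh = resolver.upstream_queries

now[0] = 400                        # the TTL of 300 seconds has passed
resolver.resolve("www.example.com")
queries_after_expiry = resolver.upstream_queries
print(queries_while_fresh, queries_after_expiry)   # 1 2
```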

For more information about the DNS theory section, you can also refer to these two texts:

    • RFC 1034: Domain Names - Concepts and Facilities

And it's not over.

If all this theory still feels a bit hazy, that's fine. Next we will turn to practice to get a clearer picture of DNS, one of the most fundamental systems.

In fact, it is not just DNS. HTTPS, TCP, UDP, and other very basic protocols all deserve the time it takes to understand them properly. Before I wrote about DNS, I thought I had it completely figured out, but while writing I found there were many places I did not understand at all; my earlier understanding was stuck at a very superficial level. So it is time to go through these protocols again and, in my own words and from the point of view of the problems they solve, record the stories behind these classic protocols.
