Issue 1161: The DNS Resolution Process, as Seen from the Chrome Source Code

Source: Internet
Author: User

Preface

Product managers have started talking about AI, and about what an AI product manager should learn. Today's morning-read article is shared with permission from @ Li Bancheng.

The article starts here ~

DNS resolution maps a domain name to its IP address, because routers on the WAN deliver a message based on the IP address, not the name. DNS is short for Domain Name System; the protocol is specified in RFC 1035. The process is shown in the following figure:



This process may look simple, but it raises several questions:

How does the browser know which DNS server to use, such as 8.8.8.8 in the figure above?

A domain name can resolve to multiple IP addresses; with only one IP address, the server may be overwhelmed under heavy concurrency.

Once a domain name is bound in the hosts file, is resolution skipped and the locally specified IP address used directly?

How long is a resolution result valid, that is, after how long must the same domain name be resolved again?

What are the A record, AAAA record, and CNAME record in domain name resolution?

In fact, domain name resolution is not tied to Chrome: even the simplest curl command needs it. But we can use the Chrome source code to see how the process works and to answer the questions above.

First, how does the browser know which DNS server to use? The current DNS server IPs can be seen in the local network settings, such as on my computer:

These two DNS servers are provided by my home broadband connection:

Broadband providers generally supply a DNS server. Google also offers two free public DNS services, 8.8.8.8 and 8.8.4.4; these addresses were chosen to be easy to remember. If your DNS service performs poorly, you can try switching to these two.

How does a device joining the network learn these DNS server addresses? Through the Dynamic Host Configuration Protocol (DHCP): when a device connects to the router, the router assigns it an IP address via DHCP and tells it the DNS servers. The router's DHCP settings are as follows:

This process can be observed by capturing packets with Wireshark:

When my computer connects to WiFi, it broadcasts a DHCP request; when the router receives this broadcast, it assigns my computer an IP address and tells it the DNS servers.

At this point the system has its DNS servers. Chrome calls the system function res_ninit (on Linux) to get them; this function reads the file /etc/resolv.conf to obtain the DNS servers:

#
# Mac OS X Notice
#
# This file is not used by the host name and address resolution
# or the DNS query routing mechanisms used by most processes on
# this Mac OS X system.
#
# This file is automatically generated.
#
search DHCP HOST
nameserver 59.108.61.61
nameserver 219.232.48.61

The search option means that when a name cannot be resolved, the resolver retries with each listed suffix appended. For example, ping hello will also try hello.DHCP and hello.HOST in turn; in this case none of them resolves, so the lookup fails.
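The suffix behaviour described above can be sketched as follows. This is a minimal illustration, not the system resolver's code: the function name CandidateNames is hypothetical, the bare name is tried first because that is the order the article describes, and options such as ndots that real resolvers consult are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: lists the names a stub resolver might try for `name`,
// given the suffixes from a "search" line. Assumption: the bare name is tried
// first, then each suffix in order.
std::vector<std::string> CandidateNames(
    const std::string& name, const std::vector<std::string>& search) {
  assert(!name.empty());
  std::vector<std::string> candidates;
  candidates.push_back(name);
  // A trailing dot marks a fully qualified name, so no suffix is appended.
  if (name.back() == '.') return candidates;
  for (const std::string& suffix : search) {
    candidates.push_back(name + "." + suffix);
  }
  return candidates;
}
```

Calling CandidateNames("hello", {"DHCP", "HOST"}) yields hello, hello.DHCP and hello.HOST, matching the ping example.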

At startup, Chrome obtains the DNS server configuration in a way that depends on the operating system, then stores it in the nameservers member of DnsConfig:

// List of name server addresses.
std::vector<IPEndPoint> nameservers;

Chrome also listens for network changes and keeps this configuration in sync.

This nameservers list is then used to initialize a socket pool, whose sockets are used to send requests. When a domain name needs to be resolved, a socket is taken from the pool for the desired server_index. server_index starts at 0, i.e. the first DNS server's IP address; once requests to that server have failed twice, server_index + 1 is used, i.e. the next DNS server.

unsigned server_index =
    (first_server_index_ + attempt_number) % config.nameservers.size();
// Skip over known failed servers.
// The maximum number of attempts is 2, which is set in the DnsConfig constructor.
server_index = session_->NextGoodServerIndex(server_index);

If all nameservers have failed, the one that failed earliest is used.
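The selection rule described above can be sketched like this. It is an illustration under stated assumptions, not Chrome's implementation: ServerStats and PickServerIndex are hypothetical names, and failure times are plain integers rather than real timestamps.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-server bookkeeping for this sketch.
struct ServerStats {
  int failures = 0;       // failed attempts recorded for this server
  long last_failure = 0;  // logical time of the most recent failure
};

// Starting at `start`, returns the first server with fewer than
// `max_failures` failures. If every server has reached the limit, returns
// the one whose last failure is the oldest.
std::size_t PickServerIndex(const std::vector<ServerStats>& servers,
                            std::size_t start, int max_failures) {
  assert(!servers.empty());
  std::size_t oldest = start % servers.size();
  for (std::size_t i = 0; i < servers.size(); ++i) {
    std::size_t index = (start + i) % servers.size();
    if (servers[index].failures < max_failures) return index;
    if (servers[index].last_failure < servers[oldest].last_failure) {
      oldest = index;
    }
  }
  return oldest;
}
```

With max_failures set to 2, a server is skipped after two failed requests, which corresponds to the "failed twice, move to the next server" behaviour in the text.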

Besides reading the DNS servers, Chrome also reads and parses the hosts file and stores the result in the hosts member of DnsConfig, which is a hash map:

// Parsed results of a Hosts file.
//
// Although Hosts files map IP address to a list of domain names, for name
// resolution the desired mapping direction is: domain name to IP address.
// When parsing Hosts, we apply the "first hit" rule as Windows and glibc do.
// With a Hosts file of:
// 300.300.300.300 localhost # bad ip
// 127.0.0.1 localhost
// 10.0.0.1 localhost
// The expected resolution of localhost is 127.0.0.1.
using DnsHosts = std::unordered_map<DnsHostsKey, IPAddress, DnsHostsKeyHash>;

On Linux the hosts file is /etc/hosts:

const base::FilePath::CharType kFilePathHosts[] =
    FILE_PATH_LITERAL("/etc/hosts");

There is no trick to reading this file: it is processed line by line, with checks for invalid entries such as the bad IP mentioned in the code comment above.
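A line-by-line parse with the "first hit" rule can be sketched as below. The sketch makes several simplifying assumptions that differ from Chrome: it handles IPv4 only, stores addresses as text, keys the map by name alone, and uses the hypothetical names LooksLikeIPv4 and ParseHosts.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>

// Loose IPv4 check for this sketch: four dot-separated numbers in 0..255.
bool LooksLikeIPv4(const std::string& text) {
  if (text.empty() || text.back() == '.') return false;
  std::istringstream in(text);
  std::string part;
  int parts = 0;
  while (std::getline(in, part, '.')) {
    if (part.empty() || part.size() > 3) return false;
    for (char c : part) {
      if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    }
    if (std::stoi(part) > 255) return false;
    ++parts;
  }
  return parts == 4;
}

// Maps host name -> IP text, keeping the first valid entry for each name.
std::unordered_map<std::string, std::string> ParseHosts(
    const std::string& contents) {
  std::unordered_map<std::string, std::string> hosts;
  std::istringstream lines(contents);
  std::string line;
  while (std::getline(lines, line)) {
    line = line.substr(0, line.find('#'));  // drop trailing comments
    std::istringstream fields(line);
    std::string ip, name;
    if (!(fields >> ip) || !LooksLikeIPv4(ip)) continue;  // skip bad lines
    while (fields >> name) {
      hosts.insert(std::make_pair(name, ip));  // insert() keeps existing keys
    }
  }
  return hosts;
}
```

Fed the three-line example from the comment above, ParseHosts skips the 300.300.300.300 line and keeps 127.0.0.1 for localhost, because insert() does not overwrite an existing key.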

So DnsConfig holds two pieces of configuration: hosts and nameservers. DnsConfig is composed into DnsSession; the composition relationships are shown in the following figure:


Resolver is the class that drives resolution. It holds a client, and the client creates a session. The session layer plays a major role: it manages server_index and the socket pool, for example by allocating sockets. The session is initialized with a config, and the config holds the two pieces of locally read configuration, the bound hosts and the nameservers. Each of these layers has its own responsibility.

Resolver has another important role: it uses jobs to build a task queue. Resolver also holds a HostCache, which caches resolution results; on a cache hit, no resolution is needed. The flow is as follows: external callers call HostResolverImpl::Resolve, the interface the resolver provides, which first checks whether the request can be served locally:

int net_error = ERR_UNEXPECTED;
if (ServeFromCache(*key, info, &net_error, addresses, allow_stale,
                   stale_info)) {
  source_net_log.AddEvent(NetLogEventType::HOST_RESOLVER_IMPL_CACHE_HIT,
                          addresses->CreateNetLogCallback());
  // |ServeFromCache()| will set |*stale_info| as needed.
  return net_error;
}
// TODO(szym): Do not do this if nsswitch.conf instructs not to.
// http://crbug.com/117655
if (ServeFromHosts(*key, info, addresses)) {
  source_net_log.AddEvent(NetLogEventType::HOST_RESOLVER_IMPL_HOSTS_HIT,
                          addresses->CreateNetLogCallback());
  MakeNotStale(stale_info);
  return OK;
}
return ERR_DNS_CACHE_MISS;

The code above first calls ServeFromCache to look in the cache and returns if there is a hit. Otherwise it checks whether hosts has a match; if not, it returns the ERR_DNS_CACHE_MISS code. If the return value is not ERR_DNS_CACHE_MISS, the caller returns it directly:

if (rv != ERR_DNS_CACHE_MISS) {
  LogFinishRequest(source_net_log, info, rv);
  RecordTotalTime(info.is_speculative(), true, base::TimeDelta());
  return rv;
}

Otherwise, it creates a job and checks whether the job can run immediately; if too many jobs are already running, the job is added to the job queue. A completion callback is passed along with it.
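The lookup order (cache first, then hosts, then a miss that triggers a real query) can be sketched with plain maps. The names LookupResult, AddressTable and ResolveLocally are hypothetical, and both tables are simplified to name-to-address strings.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class LookupResult { kCacheHit, kHostsHit, kMiss };

typedef std::map<std::string, std::string> AddressTable;

// Checks the cache first, then the hosts table, and reports a miss otherwise.
LookupResult ResolveLocally(const AddressTable& cache,
                            const AddressTable& hosts,
                            const std::string& name, std::string* address) {
  assert(address != nullptr);
  AddressTable::const_iterator it = cache.find(name);
  if (it != cache.end()) {
    *address = it->second;
    return LookupResult::kCacheHit;
  }
  it = hosts.find(name);
  if (it != hosts.end()) {
    *address = it->second;
    return LookupResult::kHostsHit;
  }
  return LookupResult::kMiss;  // the caller would now start a network query
}
```

In this sketch, when a name appears in both tables the cached address wins, mirroring the order of the two checks in the excerpt.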

So this matches what we would expect: check the cache first, then hosts, and only then send a query. When the cache is queried, a stale entry is treated as a miss and null is returned. The criterion for staleness is as follows:

bool is_stale() const {
  return network_changes > 0 || expired_by >= base::TimeDelta();
}

That is, a cache entry is considered stale if the network has changed or expired_by is greater than or equal to zero. expired_by is the current time minus the entry's expiration time:

stale.expired_by = now - expires_;
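The staleness rule can be tried out with std::chrono. This is a sketch, not Chrome's HostCache: CacheEntry, MakeEntry and IsStale are hypothetical names, and the expiration time is assumed to be the creation time plus the TTL.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical cache entry for this sketch.
struct CacheEntry {
  Clock::time_point expires;  // creation time + TTL
  int network_changes;        // network changes seen since the entry was stored
};

CacheEntry MakeEntry(Clock::time_point now, std::chrono::seconds ttl) {
  assert(ttl.count() >= 0);
  return CacheEntry{now + ttl, 0};
}

// Stale if the network changed or the expiration time has been reached.
bool IsStale(const CacheEntry& entry, Clock::time_point now) {
  Clock::duration expired_by = now - entry.expires;
  return entry.network_changes > 0 || expired_by >= Clock::duration::zero();
}
```

With a TTL of 600 seconds the entry is still fresh at 599 seconds and becomes stale at exactly 600 seconds, because the comparison is greater than or equal to zero.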

The expiration time is now + TTL at the moment the entry is created, where the TTL is the one returned by the most recent resolution:

uint32_t ttl_sec = std::numeric_limits<uint32_t>::max();
ttl_sec = std::min(ttl_sec, record.ttl);
*ttl = base::TimeDelta::FromSeconds(ttl_sec);

The code above caps the TTL to guard against overflow. The TTL is directly visible in a DNS response captured with Wireshark:


The current domain name has a TTL of 600 s, i.e. 10 minutes. This can be configured with the provider from which the domain name was purchased:


You can also see that the record type is A. What A means is shown in the following figure:

When adding a record, you can see that an A record resolves the domain name to an IPv4 address, an AAAA record resolves it to an IPv6 address, and a CNAME record resolves it to another domain name. The advantage of a CNAME is that when many domain names point to the same CNAME target and the IP address needs to change, only the target's address has to be updated and all the others follow; the cost is that resolution takes two lookups.
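The extra lookup caused by a CNAME can be seen in a small in-memory sketch. Everything here is illustrative: Record, Zone and ResolveInZone are hypothetical names, each name holds a single record, and the addresses come from the documentation range 192.0.2.0/24.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class RecordType { kA, kAAAA, kCNAME };

struct Record {
  RecordType type;
  std::string value;  // an address for A/AAAA, a domain name for CNAME
};

typedef std::map<std::string, Record> Zone;

// Follows CNAME records until an address record is found. Returns false if a
// name is unknown or the chain exceeds `max_hops` (a guard against loops).
// `lookups` receives the number of table lookups that were performed.
bool ResolveInZone(const Zone& zone, std::string name, int max_hops,
                   std::string* address, int* lookups) {
  assert(address != nullptr && lookups != nullptr);
  *lookups = 0;
  for (int hop = 0; hop <= max_hops; ++hop) {
    Zone::const_iterator it = zone.find(name);
    ++*lookups;
    if (it == zone.end()) return false;
    if (it->second.type != RecordType::kCNAME) {
      *address = it->second.value;
      return true;
    }
    name = it->second.value;  // continue with the alias target
  }
  return false;
}
```

If www.example.com is a CNAME for cdn.example.net and cdn.example.net has an A record, resolving www.example.com takes two lookups; changing the A record of cdn.example.net updates every alias that points at it.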

If the domain name cannot be resolved locally, Chrome sends a request. The operating system provides a function called getaddrinfo for domain name resolution, but Chrome does not use it; instead it implements its own DNS client, including building DNS request messages and parsing DNS response messages. This is probably for flexibility: for example, Chrome can decide for itself how to use the nameservers, in what order, and how many failed attempts to allow.

Resolution starts in the resolver's startjob. It takes the next query ID, builds a query, builds a DnsUDPAttempt, and then calls its start, because DNS client queries use UDP packets (TCP is used when a secondary name server queries the primary name server).
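To make "building a query" concrete, here is a minimal sketch of a DNS query message laid out as RFC 1035 describes: a 12-byte header followed by one question. It is not Chrome's code; AppendName, AppendU16 and BuildQuery are hypothetical helpers, and sending the bytes over a UDP socket is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encodes a host name as DNS labels: each label is prefixed by its length,
// and the name ends with a zero byte.
void AppendName(const std::string& name, std::vector<std::uint8_t>* out) {
  std::size_t start = 0;
  while (start < name.size()) {
    std::size_t dot = name.find('.', start);
    if (dot == std::string::npos) dot = name.size();
    std::size_t len = dot - start;
    assert(len > 0 && len <= 63);  // a label is 1..63 bytes long
    out->push_back(static_cast<std::uint8_t>(len));
    out->insert(out->end(), name.begin() + start, name.begin() + dot);
    start = dot + 1;
  }
  out->push_back(0);
}

// Appends a 16-bit value in network (big-endian) byte order.
void AppendU16(std::uint16_t value, std::vector<std::uint8_t>* out) {
  out->push_back(static_cast<std::uint8_t>(value >> 8));
  out->push_back(static_cast<std::uint8_t>(value & 0xFF));
}

// Builds a query with one question. qtype 1 is A, 28 is AAAA; class is IN.
std::vector<std::uint8_t> BuildQuery(std::uint16_t id, const std::string& name,
                                     std::uint16_t qtype) {
  std::vector<std::uint8_t> msg;
  AppendU16(id, &msg);      // ID, echoed back in the response
  AppendU16(0x0100, &msg);  // flags: standard query, recursion desired
  AppendU16(1, &msg);       // QDCOUNT: one question
  AppendU16(0, &msg);       // ANCOUNT
  AppendU16(0, &msg);       // NSCOUNT
  AppendU16(0, &msg);       // ARCOUNT
  AppendName(name, &msg);   // QNAME
  AppendU16(qtype, &msg);   // QTYPE
  AppendU16(1, &msg);       // QCLASS = IN
  return msg;
}
```

For example.com with qtype 1 the message is 29 bytes: 12 for the header, 13 for the encoded name, and 4 for the type and class.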
