Reprinted from: Http://www.tuicool.com/articles/V7JN32Z
How a page is requested from a URL
To be honest, there are already plenty of articles like this online, but I still want to write this post: partly to sort the process out carefully for myself, and partly to try explaining it with clear language and structure, which is also a small challenge to myself.
Process overview
1. The browser looks up the IP address that corresponds to the domain name;
2. The browser establishes a socket connection to the server at that IP address;
3. The browser and the server communicate: the browser sends requests and the server processes them;
4. The browser disconnects from the server.
And that's it? That was too easy... Don't rush off, dear readers: that was only the overview. Read on for the details.
Finding the IP address from the domain name
Concepts
IP address: the logical address that the IP protocol assigns to every network and every host on the Internet. An IP address is like a house number: it determines the location of a host. A server is itself a host, so to access a server you must first know its IP address;
Domain name (DN): an IPv4 address consists of four numbers (each from 0 to 255) joined by dots, which is hard to remember and easy to mistype, so we use a familiar combination of letters and digits in place of the purely numeric IP address. For example, we only need to remember www.baidu.com (Baidu's domain name) rather than 220.181.112.244 (one of Baidu's IP addresses);
DNS: each domain name corresponds to the IP addresses of one or more servers that provide the same service. A connection can only be established once the server's IP address is known, so the domain name has to be resolved to an IP address through DNS.
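To make the "four numbers joined by dots" idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `describe_ipv4` is made up for this illustration; the address is the Baidu example from above.

```python
import ipaddress

def describe_ipv4(text):
    """Return the four octets and the single 32-bit integer behind a dotted address."""
    addr = ipaddress.IPv4Address(text)  # raises ValueError if the text is not valid IPv4
    octets = list(addr.packed)          # the address as four bytes, each 0-255
    return octets, int(addr)

octets, as_int = describe_ipv4("220.181.112.244")
print(octets)  # [220, 181, 112, 244]
print(as_int == (220 << 24) | (181 << 16) | (112 << 8) | 244)  # True
```

The dotted form is only a human-friendly way of writing one 32-bit number, which is why each part cannot exceed 255.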
With these concepts in mind, you can see that getting the server's house number means first converting the domain name to an IP address. The conversion goes as follows, taking a query for the IP address of www.baidu.com as an example; steps 2, 3, and 4 are performed only if the previous step did not find the address:
Lookup process
1. The browser searches its own DNS cache (it maintains a table of domain names and IP addresses);
2. Search the operating system's DNS cache (it also maintains a table of domain names and IP addresses);
3. Search the operating system's hosts file (on Windows, for example, a file that maps domain names to IP addresses);
4. The operating system sends the domain name to the LDNS (local domain name server: if you connect to the Internet at a school, the LDNS is usually at the school; if you connect through a telecom provider, the LDNS is at your local telecom office). The LDNS searches its own DNS cache (lookups succeed here roughly 80% of the time); if it finds the record it returns the result, otherwise it starts an iterative DNS resolution;
5. The LDNS queries a root name server. The root server does not hold records for individual domain names, but it stores the addresses of the top-level domain servers responsible for com, net, org, and so on; here it returns the address of the top-level domain server for com;
6. The LDNS queries the com top-level domain server, which returns the address of the name server for baidu.com;
7. The LDNS queries the baidu.com name server and obtains the IP address of www.baidu.com;
8. The LDNS returns the IP address to the operating system and caches it;
9. The operating system returns the IP address to the browser and caches it as well.
At this point, the browser has the IP address that corresponds to the domain name.
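The lookup order above can be sketched as a small simulation. Nothing here touches the network: the dictionaries, the server labels such as `tld-com` and `ns-baidu`, and the function names are all hypothetical stand-ins for real caches and name servers.

```python
# Hypothetical data for illustration only; these are not real DNS records.
ROOT_SERVER = {"com": "tld-com"}                      # root knows the TLD servers
TLD_SERVERS = {"tld-com": {"baidu.com": "ns-baidu"}}  # TLD knows each domain's name server
AUTH_SERVERS = {"ns-baidu": {"www.baidu.com": "220.181.112.244"}}

def iterative_resolve(name):
    """The LDNS asks root, then the TLD server, then the domain's own name server."""
    labels = name.split(".")
    tld = labels[-1]                  # "com"
    domain = ".".join(labels[-2:])    # "baidu.com"
    tld_server = ROOT_SERVER[tld]
    auth_server = TLD_SERVERS[tld_server][domain]
    return AUTH_SERVERS[auth_server][name]

def lookup(name, browser_cache, os_cache, hosts, ldns_cache):
    """Try each source in order and report which one answered."""
    sources = (("browser cache", browser_cache),
               ("OS cache", os_cache),
               ("hosts file", hosts),
               ("LDNS cache", ldns_cache))
    for label, table in sources:
        if name in table:
            return table[name], label
    ip = iterative_resolve(name)
    # The answer is cached at each layer on the way back to the browser.
    ldns_cache[name] = ip
    os_cache[name] = ip
    browser_cache[name] = ip
    return ip, "iterative query"

browser_cache, os_cache, hosts, ldns_cache = {}, {}, {}, {}
print(lookup("www.baidu.com", browser_cache, os_cache, hosts, ldns_cache))
print(lookup("www.baidu.com", browser_cache, os_cache, hosts, ldns_cache))
```

The first call falls through every cache and ends in the iterative query; the second call is answered straight from the browser cache, which is the point of caching at each layer.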
Additional Information
A domain name and a URL are two different concepts: a domain name is the name of one server or a group of servers and is used to locate the server on the Internet; a URL is a Uniform Resource Locator and is used to locate a specific file. For example, segmentfault.com is the domain name of SF, and it leads you to the SF server; segmentfault.com/a/1190000003829539 is a URL, and it locates the first blog post I wrote;
IP addresses and domain names do not correspond one to one: several servers that provide the same service can have their IP addresses bound to the same domain name, although any single connection uses only one of those addresses; conversely, one IP address can have any number of domain names bound to it;
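The difference between a domain name and a URL can be seen by splitting a URL with Python's standard `urllib.parse`. The `https://` scheme below is an assumption added for the example, since the address above is given without one.

```python
from urllib.parse import urlsplit

# The scheme is assumed; the article writes the address without it.
url = "https://segmentfault.com/a/1190000003829539"
parts = urlsplit(url)

print(parts.hostname)  # segmentfault.com  (the domain name: which server)
print(parts.path)      # /a/1190000003829539  (which resource on that server)
```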
Establishing a connection: the three-way handshake
Now that the IP address of the server is known, the next step is to establish a connection to it.
In layman's terms, establishing a connection takes the following three steps:
1. The host sends the server a request to establish a connection (Hello, I'd like to get to know you);
2. After receiving the request, the server sends back a signal agreeing to connect (OK, nice to meet you);
3. After receiving the signal that the connection is accepted, the host sends a confirmation back to the server (nice to meet you too). From then on, the host and the server have an established connection.
Additional Information
TCP protocol: the three-way handshake is part of TCP, which guarantees reliable delivery of information. During the handshake, if one side does not receive the confirmation it expects, the protocol requires the signal to be sent again.
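In TCP terms the three signals are usually called SYN, SYN-ACK, and ACK, and each acknowledgement number is the other side's sequence number plus one. The following is a rough simulation of that bookkeeping, not a real network exchange; the `Segment` class, the function name, and the initial sequence numbers 1000 and 5000 are invented for the example, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    flags: str    # "SYN", "SYN-ACK" or "ACK"
    seq: int      # sender's sequence number
    ack: int = 0  # 0 here simply means "no acknowledgement carried"

def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, in order."""
    syn = Segment("SYN", seq=client_isn)                            # host -> server
    syn_ack = Segment("SYN-ACK", seq=server_isn, ack=syn.seq + 1)   # server -> host
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)      # host -> server
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

In a real program the operating system performs this exchange for you when a socket's connect call is made; application code never builds these segments by hand.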
Web page request and display
Once the connection between the host and the server is established, the two begin to communicate. A web request is a request-response exchange started by the host: the host requests data from the server, and the server returns the corresponding data.
1. The browser generates an HTTP request from the content of the URL, including the location of the requested file, the request method, and so on;
2. After receiving the request, the server decides how to obtain the corresponding HTML file based on the contents of the HTTP request;
3. The server sends the resulting HTML file to the browser;
4. The browser starts rendering and displaying the page before it has received the whole HTML file;
5. While processing the HTML, the browser requests images, CSS, JavaScript, and other files as needed, following the same process used to request the HTML.
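As a minimal sketch of how a URL turns into an HTTP request, the function below builds the text of a simple GET request. The function name `build_get_request` and the path `/index.html` are hypothetical, and a real browser sends many more headers than these.

```python
from urllib.parse import urlsplit

def build_get_request(url):
    """Build the raw text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",       # request method and location of the file
        f"Host: {parts.netloc}",      # which site on the server is wanted
        "Connection: close",          # ask the server to close after responding
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)

# /index.html is an assumed path used only for illustration.
print(build_get_request("http://www.baidu.com/index.html"))
```

The request line carries the method and the file location taken from the URL, while the domain name travels in the Host header because the TCP connection itself only knows the IP address.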
Disconnecting: four waves
1. The host sends a disconnect request to the server (it's getting late, I should go);
2. After receiving the request, the server sends a signal confirming that it was received (got it);
3. The server sends its own disconnect notification to the host (I should go too);
4. After receiving the notification, the host sends back a confirmation (well, OK); the server disconnects once it receives this acknowledgement.
Additional Information
Why doesn't the server agree to disconnect as soon as it receives the disconnect request? When the request arrives, the server may still have data that has not been sent, so it first sends a confirmation and only agrees to disconnect after all the data has been sent.
After sending the confirmation in the fourth wave, the host does not disconnect immediately but waits for 2 maximum segment lifetimes (2 MSL). If the confirmation of the fourth wave is lost, the server resends the disconnect notification of the third wave, and the time for the lost confirmation to expire plus the time for the resent notification to reach the host is at most 2 MSL.
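The four waves and the final wait can be traced with the standard TCP state names. This is a simplified simulation rather than real socket code: the function name and the MSL value of 30 seconds are assumptions for the example, and real systems choose their own MSL.

```python
def four_waves():
    """Return (sender, segment, host_state, server_state) after each wave."""
    host, server = "ESTABLISHED", "ESTABLISHED"
    trace = []

    host = "FIN_WAIT_1"      # wave 1: host sends FIN
    trace.append(("host", "FIN", host, server))

    server = "CLOSE_WAIT"    # wave 2: server acknowledges, may still send data
    host = "FIN_WAIT_2"
    trace.append(("server", "ACK", host, server))

    server = "LAST_ACK"      # wave 3: server has finished sending and sends its FIN
    trace.append(("server", "FIN", host, server))

    host = "TIME_WAIT"       # wave 4: host acknowledges, then waits before closing
    server = "CLOSED"        # server closes once the acknowledgement arrives
    trace.append(("host", "ACK", host, server))
    return trace

MSL_SECONDS = 30                       # assumed value for illustration only
time_wait_seconds = 2 * MSL_SECONDS    # how long the host stays in TIME_WAIT

for step in four_waves():
    print(step)
print(time_wait_seconds)  # 60
```

Note that the server reaches CLOSED before the host does: the host lingers in TIME_WAIT so that it can answer a resent disconnect notification if its last acknowledgement was lost.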