Weekly Share No. 2: The HTTP Protocol (3)

Last Update:2017-10-04 Source: Internet

Author: User

Tags domain server nginx reverse proxy

This sharing on the HTTP protocol is divided into three parts. This is the third part, which covers the steps of a complete HTTP request: what happens between entering a URL in the address bar and the page being displayed.

1. Enter the URL

When we start typing a URL, the browser can only guess at what we want. It searches the history, bookmarks, and other sources for URLs that match the characters typed so far and offers suggestions so that the address can be completed quickly. Google Chrome may even show the page straight from its cache, so the page appears before you press Enter.

2. The browser finds the IP address of the domain name

1. Once the request is initiated, the first thing the browser does is resolve the domain name. In general, the browser first checks whether its own DNS cache holds a record for the name. If it does, that record is used directly. Otherwise the local hosts file is checked for a rule matching the domain name, and if one exists, the IP address in the hosts file is used.

2. If the hosts file has no matching entry, the browser issues a DNS request to the local DNS server. The local DNS server is generally provided by your Internet service provider, such as China Telecom or China Mobile.

3. When the local DNS server receives the query for the domain name you entered, it first checks its own cache. If the record is there, it returns the result directly. This exchange between your computer and the local DNS server is a recursive query. If the record is not cached, the local DNS server queries a root DNS server.

4. The root DNS server does not store the mapping between specific domain names and IP addresses. Instead, it tells the local DNS server which top-level domain server to ask next and gives that server's address. This process is iterative.

5. The local DNS server then sends the request to the top-level domain server, in this case the .com domain server. The .com server does not return the domain-to-IP mapping either. Instead, it tells the local DNS server the address of the name server that is responsible for resolving your domain.

6. Finally, the local DNS server queries that name server and receives the mapping between the domain name and its IP address. The local DNS server returns the IP address to the user's computer and also stores the mapping in its cache, so that the next query for the same name can be answered directly, which speeds up network access.
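The lookup order described in the six steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: it performs no real DNS traffic, and every name in it (`ROOT`, `TLD`, `AUTHORITATIVE`, `LocalResolver`, `browser_lookup`, the server labels, and the `192.0.2.10` address) is hypothetical.

```python
# Hypothetical data: all names, server labels, and addresses are made up
# for illustration. No real DNS queries are sent.
ROOT = {"com": "tld-com"}                            # root knows only TLD servers
TLD = {"tld-com": {"example.com": "ns-example"}}     # TLD knows the domain's name server
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}


class LocalResolver:
    """Plays the role of the ISP's local DNS server."""

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name):
        if name in self.cache:                       # step 3: answer from cache
            return self.cache[name]
        labels = name.split(".")
        tld_server = self._ask(ROOT, labels[-1])                          # step 4
        name_server = self._ask(TLD[tld_server], ".".join(labels[-2:]))   # step 5
        ip = self._ask(AUTHORITATIVE[name_server], name)                  # step 6
        self.cache[name] = ip                        # remember for the next query
        return ip

    def _ask(self, table, key):
        self.upstream_queries += 1                   # one iterative query
        return table[key]


def browser_lookup(name, browser_cache, hosts, resolver):
    """Steps 1-2: browser cache, then hosts file, then the local DNS server."""
    if name in browser_cache:
        return browser_cache[name]
    if name in hosts:
        return hosts[name]
    ip = resolver.resolve(name)
    browser_cache[name] = ip
    return ip


resolver = LocalResolver()
print(browser_lookup("www.example.com", {}, {}, resolver))
print("upstream queries:", resolver.upstream_queries)
```

The first lookup costs the local resolver three upstream queries (root, .com, name server). A repeated lookup is answered from its cache, and an entry in `hosts` bypasses the resolver entirely.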

3. The browser sends an HTTP request to the Web server

Once the IP address is obtained, the browser opens a TCP connection from a random local port (1024 < port < 65535) to port 80 of the server's web program (commonly httpd, nginx, and so on). The connection request travels to the server, passing through various routing devices along the way unless both ends are on the same LAN. It enters the server's network card, then the kernel's TCP/IP protocol stack, which identifies the connection request and unpacks it layer by layer, and it may also be filtered by the Netfilter firewall (a kernel module). It finally reaches the web program, and the TCP/IP connection is established.
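A minimal sketch of this step using Python's standard `socket` module. It is an illustration only: a loopback listener on an OS-chosen port stands in for the web server on port 80 (an assumption, so the example needs neither network access nor privileges), and the variable names are hypothetical.

```python
import socket

# Stand-in for the web server: listen on the loopback interface.
# Port 0 asks the OS for any free port instead of the real port 80.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_port = server.getsockname()[1]

# The "browser": create_connection() performs the TCP handshake, and the
# OS picks the client's source port automatically.
client = socket.create_connection(("127.0.0.1", server_port))
conn, peer = server.accept()

client_port = client.getsockname()[1]
print("client port:", client_port, "-> server port:", server_port)

conn.close()
client.close()
server.close()
```

The client never chooses its own port. The server sees the OS-assigned source port as the peer address of the accepted connection.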

After the TCP connection is established, the browser sends the HTTP request. A typical HTTP request begins with the request method, usually GET or POST. The less common methods are PUT, DELETE, HEAD, OPTIONS, and TRACE. Plain links and HTML forms can only issue GET or POST requests.

An HTTP request sent from the client to the server consists of three parts:

Request line: method, URI, and protocol/version
Request headers
Request body

The following is an example of a complete HTTP request:

POST /sample.jsp HTTP/1.1
Accept: image/gif, image/jpeg, */*
Accept-Language: zh-cn
Connection: keep-alive
Host: localhost
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 30

username=jinqiao&password=1234

Note: The last request header is followed by a blank line, that is, a carriage return and a newline character. It tells the server that no more request headers follow.
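To make the three parts concrete, here is a small sketch that splits a raw request at the blank line. It uses a shortened request modelled on the example above, written here as a POST with a Content-Length header so that the body is well defined (an assumption of this sketch). `parse_request` is a hypothetical helper, not a library function.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")    # the blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    method, uri, version = lines[0].split(" ", 2)  # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


raw = (
    b"POST /sample.jsp HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Connection: keep-alive\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 30\r\n"
    b"\r\n"
    b"username=jinqiao&password=1234"
)
method, uri, version, headers, body = parse_request(raw)
print(method, uri, version)
print(headers["host"], len(body))
```

Header names are lowercased on the way in because HTTP header names are case-insensitive.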

4. The server responds with a permanent redirect

The server returns a 301 permanent redirect response to the browser so that the browser accesses "http://www.google.com/" instead of "http://google.com/".

Why does the server redirect instead of directly sending the content the user wants to see? One reason concerns search rankings. If a page has two URLs, such as http://www.yy.com and http://yy.com, a search engine may treat them as two different sites, each with fewer inbound links and therefore a lower ranking. Search engines understand what a 301 permanent redirect means, so they credit the addresses with and without www to the same site. Another reason is caching: when a page has several names, it may appear in the cache several times.

5. The browser follows the redirect

Now the browser knows that "http://www.google.com/" is the correct address to access, so it sends another HTTP request to that address.
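Steps 4 and 5 can be sketched together as a pure simulation with no network traffic. `server_respond` and `browser_get` are hypothetical names, and the rule "redirect any host without a www prefix" is an assumption chosen to match the example above.

```python
def server_respond(url: str):
    """Hypothetical server: permanently redirect the bare domain to the www host."""
    scheme, _, rest = url.partition("://")
    host, _, path = rest.partition("/")
    if not host.startswith("www."):
        return 301, {"Location": f"{scheme}://www.{host}/{path}"}, ""
    return 200, {"Content-Type": "text/html"}, "<html>...</html>"


def browser_get(url: str, max_redirects: int = 5):
    """Follow 301 responses until a non-redirect response arrives."""
    visited = [url]
    for _ in range(max_redirects + 1):
        status, headers, body = server_respond(url)
        if status != 301:
            return status, visited
        url = headers["Location"]        # the address the server told us to use
        visited.append(url)
    raise RuntimeError("too many redirects")


status, visited = browser_get("http://google.com/")
print(status, visited)
```

The redirect limit guards against two addresses that redirect to each other.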

6. The server processes the request

After all the previous steps, our HTTP request has finally reached the server. In fact, the earlier redirect response already came from the server. So how does the server handle our request?

The back end starts by listening for TCP packets on a fixed port. It handles the TCP connection, parses the HTTP protocol, and wraps the message in an HTTP Request object for the upper layers to use.

Some larger sites first send your request to a reverse proxy server. When a site grows very large and becomes slow, one server is no longer enough, so the same application is deployed on multiple servers and the user requests are distributed among them. In this setup the client does not access the application server directly over HTTP. Instead, it sends the request to Nginx, Nginx forwards the request to an application server, and then returns the result to the client. Here Nginx acts as a reverse proxy server. This brings another benefit: if one server goes down, users are not affected as long as the other servers keep running normally.
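The idea of spreading requests over several application servers and skipping a failed one can be sketched as follows. This is not how Nginx is implemented. It is a toy model that assumes a simple round-robin policy, and `ReverseProxy`, `make_app`, and the backend names are all hypothetical.

```python
import itertools


class ReverseProxy:
    """Toy round-robin proxy that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = backends                  # name -> handler function
        self.down = set()                         # names of failed backends
        self._order = itertools.cycle(list(backends))

    def handle(self, request):
        # Try each backend at most once per request.
        for _ in range(len(self.backends)):
            name = next(self._order)
            if name not in self.down:
                return name, self.backends[name](request)
        raise RuntimeError("no backend available")


def make_app(name):
    """Hypothetical application server that just reports who handled the request."""
    return lambda request: f"{name} handled {request}"


proxy = ReverseProxy({name: make_app(name) for name in ("app1", "app2", "app3")})
for _ in range(3):
    print(proxy.handle("GET /"))
```

Adding a name to `proxy.down` simulates a crashed server. Later requests keep succeeding as long as at least one backend remains up.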

Behind the Nginx reverse proxy, the request reaches the web server, where a server-side script processes it, for example by querying the database to obtain the requested content.

7. The server returns an HTTP response

After the previous six steps, the server has received and processed our request. In this step it returns the result of that processing as an HTTP response.

An HTTP response is similar to an HTTP request and also consists of three parts:

Status line
Response headers
Response body

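The following is an example of an HTTP response:

HTTP/1.1 200 OK
Date: Sat, .. Dec 2005 ..:..:.. GMT
Content-Type: text/html; charset=iso-8859-1
Content-Length: 122

<html>
<head><title>http</title></head>
<body>
<!-- body goes here -->
</body>
</html>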

Status line:

The status line consists of the protocol version, a numeric status code, and a textual description of the status, separated by spaces.

Response header:

The response headers consist of name/value pairs, one pair per line, with the name and value separated by a colon (":").

Response Body:

The response body contains the actual content we asked for, such as HTML, images, or data returned by the back end. (Cookies, by contrast, are carried in the response headers.) Note that the response body is separated from the response headers by a blank line, which marks the end of the header section.
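The same structure can be checked from the client side. The sketch below splits a raw response into its status line, headers, and body. `parse_response` is a hypothetical helper, and the response bytes are made up for illustration, with Content-Length computed from the body so that the two always agree.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")     # blank line: headers end here
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)  # e.g. HTTP/1.1 200 OK
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")        # name/value separated by ":"
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


page = b"<html><body><!-- body goes here --></body></html>"
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=iso-8859-1\r\n"
    b"Content-Length: " + str(len(page)).encode("ascii") + b"\r\n"
    b"\r\n" + page
)
version, code, reason, headers, parsed_body = parse_response(raw)
print(version, code, reason)
print(headers["content-type"], len(parsed_body))
```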

8. The browser displays the HTML

The browser starts displaying the page before it has received the complete HTML document. How does the browser render the page on the screen? The process may differ between browsers, so here I only describe the WebKit rendering process, which includes:

Parse HTML to build the DOM tree -> build the render tree -> lay out the render tree -> paint the render tree

The browser parses the HTML file from top to bottom, parsing and rendering as it loads. When it encounters an external resource during parsing, such as an image, an external CSS file, or an icon font, the request for that resource is asynchronous and does not block the loading of the HTML document.

During parsing, the browser first parses the HTML to build the DOM tree, then parses the CSS to build the render tree. Once the render tree is built, the browser lays it out and paints it to the screen. This is a complex process that involves two concepts: reflow and repaint.
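The first stage, building a DOM tree while reading the HTML from top to bottom, can be sketched with Python's standard `html.parser` module. This is a toy model and not WebKit's algorithm: `Node`, `DomBuilder`, and `outline` are hypothetical names, there is no error recovery, and the render tree, layout, and painting stages are not modelled.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children, self.text = tag, parent, [], ""


class DomBuilder(HTMLParser):
    """Builds a toy DOM tree as tags are encountered, top-down."""

    VOID = {"img", "br", "link", "meta", "input", "hr"}   # tags with no end tag

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node            # descend into the new element

    def handle_endtag(self, tag):
        if self.current.parent is not None and self.current.tag == tag:
            self.current = self.current.parent   # climb back up

    def handle_data(self, data):
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = DomBuilder()
builder.feed("<html><head><title>demo</title></head>"
             "<body><p>hello</p><img src='a.gif'></body></html>")
tree = outline(builder.root)
print("\n".join(tree))
```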

Every node in the DOM is a box model, and the browser must calculate its position and size. This process is called reflow. Once the position, size, and other properties of the box model, such as color and font, have been determined, the browser draws the content. This process is called repaint.

A page inevitably goes through reflow and repaint during its first load. Both are expensive, especially on mobile devices, where they can degrade the user experience and sometimes cause lag, so we should trigger reflow and repaint as rarely as possible.

When the browser encounters a JS file while loading the document, it suspends the thread that renders the HTML document (loading, parsing, and rendering are synchronous). The rendering thread resumes only after the JS file has been both loaded and executed. This is because JS may modify the DOM, the classic example being document.write, which means that resources referenced later in the document may turn out to be unnecessary once the script has run. That is the root cause of JS blocking the download of subsequent resources. For this reason, I normally place JS at the end of the HTML document.

JS is parsed and executed by the browser's JS engine, such as Google's V8. JS runs on a single thread, which means it can only do one thing at a time. All tasks must queue, and a task can start only after the previous one has finished. Some tasks, such as I/O reads and writes, take a long time, so a mechanism is needed that lets later tasks run without waiting for them. That mechanism is the distinction between synchronous tasks and asynchronous tasks.

The execution mechanism of JS can be seen as a main thread plus a task queue. Synchronous tasks run on the main thread, where they form an execution stack. Asynchronous tasks do not run on the main thread immediately. Instead, when an asynchronous task produces a result, an event is placed in the task queue. The script first runs everything on the execution stack in order, then takes events from the task queue and runs their associated tasks. This cycle repeats continuously, which is why it is called the event loop.
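The main-thread-plus-task-queue model can be imitated in a few lines. The sketch is written in Python rather than JS and is a simplified model, not a real JS engine: `set_timeout`, `main_script`, and `event_loop` are hypothetical names, and timers, I/O, and microtasks are left out.

```python
from collections import deque

task_queue = deque()    # events waiting for the main thread
log = []


def set_timeout(callback):
    """Hypothetical async API: the callback is queued instead of run now."""
    task_queue.append(callback)


def main_script():
    log.append("sync 1")
    set_timeout(lambda: log.append("async callback A"))
    log.append("sync 2")
    set_timeout(lambda: log.append("async callback B"))
    log.append("sync 3")


def event_loop(script):
    script()                         # run the synchronous code to completion
    while task_queue:                # then take queued events one at a time
        callback = task_queue.popleft()
        callback()


event_loop(main_script)
print(log)
```

All three synchronous entries are logged before either callback, even though the callbacks were registered in between. Queued work only runs once the execution stack is empty.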

9. The browser requests the resources embedded in the HTML (images, audio, video, CSS, JS, etc.)

In fact, this step runs in parallel with step 8. While displaying the HTML, the browser notices tags that reference content at other addresses and sends requests to fetch those files, such as external images, CSS, and JS files, with addresses similar to the following:

Image: http://static.ak.fbcdn.net/rsrc.php/z12E0/hash/8q2anwu7.gif

CSS style sheet: http://static.ak.fbcdn.net/rsrc.php/z448Z/hash/2plh8s4n.css

JavaScript file: http://static.ak.fbcdn.net/rsrc.php/zEMOA/hash/c8yzb6ub.js

These addresses go through a process similar to the one for the HTML page: the browser looks up the domain names in DNS, sends requests, follows redirects, and so on.
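How a browser might discover which extra files to request can be sketched with Python's standard `html.parser` and `urllib.parse` modules. This is a simplified illustration: `ResourceFinder` is a hypothetical class, the page and the example.com URLs are made up, and only img, script, and stylesheet link tags are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceFinder(HTMLParser):
    """Collects URLs of sub-resources referenced by img, script, and link tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))


page = """
<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="js/app.js"></script>
</head><body><img src="http://static.example.com/logo.gif"></body></html>
"""
finder = ResourceFinder("http://www.example.com/index.html")
finder.feed(page)
for url in finder.resources:
    print(url)
```

Relative addresses are resolved against the page's own URL, so each entry in `resources` is an absolute URL that can go through the DNS lookup and request steps described earlier.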

Unlike dynamic pages, static files can be cached by the browser. Some files can be read directly from the cache without contacting the server at all, and static files can also be served from a CDN.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
