HTTP protocol analysis (with the Chinese version of the HTTP protocol PDF)

Last Update:2018-12-07 Source: Internet

Author: User

Tags microsoft iis


What is HTTP?

Simply put, HTTP is an application-layer communication specification: for two parties to communicate, both must follow the same rules, and those rules are the HTTP protocol.

What can the HTTP protocol do?

Most people first think of browsing web pages. Web browsing is indeed the main application of HTTP, but HTTP is not limited to it. HTTP is a protocol, and any two parties that comply with it can use it. For example, commonly used software such as QQ and Xunlei (Thunder) uses HTTP alongside other protocols.

How does the HTTP protocol work?

The general communication process is familiar: the client sends a request to the server; the server receives the request, generates a response, and returns it to the client.

Within this process, the HTTP protocol defines the following four aspects:

1. Request and response format

Request format:

HTTP request line
(Request) headers
Empty line
Optional message body

Note: The request line and each header must end with <CR><LF> (a carriage return followed by a line feed). The empty line must consist of <CR><LF> only, with no other whitespace. In HTTP/1.1, all request headers except Host are optional.

Example:

GET / HTTP/1.1
Host: gpcuster.cnblogs.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Mon, 25 May 2009 03:19:18 GMT
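
The request layout described above (request line, headers, empty line, optional body, each line terminated by CR LF) can be sketched in a few lines of Python. This is only an illustration: build_request is a hypothetical helper, not part of any library, and the request path "/" is an assumption.

```python
CRLF = "\r\n"

def build_request(method, path, headers, body=b""):
    """Serialize an HTTP/1.1 request: request line, headers, empty line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; one extra CRLF forms the empty line.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body

# Host is the only header HTTP/1.1 requires; the path "/" is assumed here.
raw = build_request(
    "GET",
    "/",
    {"Host": "gpcuster.cnblogs.com", "Connection": "keep-alive"},
)
print(raw)
```

The empty line is what tells the receiver where the headers stop, which is why it must not contain any other characters.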

Response format:

HTTP status line
Response headers
Empty line
Optional message body

Example:

HTTP/1.1 200 OK
Cache-Control: private, max-age=30
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Expires: Mon, 25 May 2009 03:20:33 GMT
Last-Modified: Mon, 25 May 2009 03:20:03 GMT
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Mon, 25 May 2009 03:20:02 GMT
Content-Length: 12173

(message body omitted)
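
Reading a response follows the same layout in reverse. The sketch below splits a raw response into status line, headers, and body. parse_response is a hypothetical helper, and the sample bytes are made up for illustration (a shortened response with a five-byte body); real clients also have to handle chunked bodies, compressed content, and repeated headers, which are left out here.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body

# Made-up sample response for illustration.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
version, code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-length"], body)
```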

For more information, see RFC 2616.

For a brief introduction to HTTP headers, see "Quick reference to HTTP headers".

2. Connection establishment method

HTTP supports two ways of establishing connections: non-persistent connections and persistent connections (HTTP/1.1 uses persistent connections by default).

1) Non-persistent connection

Let's look at the steps for transferring a web page from the server to the client over non-persistent connections. Assume that the page consists of a base HTML file and 10 JPEG images, and that all these objects are stored on the same server host. Assume that the URL of the base HTML file is gpcuster.cnblogs.com/index.html.

The steps are as follows:

1. The HTTP client initiates a TCP connection to the HTTP server at gpcuster.cnblogs.com. The HTTP server listens on the default port 80 for connection requests from HTTP clients.

2. The HTTP client sends an HTTP request message through the local socket associated with the TCP connection. This message contains the path name /index.html.

3. The HTTP server receives the request message through the local socket associated with the TCP connection, retrieves the object /index.html from the memory or hard disk of the server host, and sends a response message containing the object through the same socket.

4. The HTTP server tells TCP to close the TCP connection (however, TCP actually terminates the connection only after the client has received the response message).

5. The HTTP client receives the response message through the same socket. The TCP connection is then terminated. The message indicates that the encapsulated object is an HTML file. The client extracts the file, parses it, and finds that it references 10 JPEG objects.

6. Repeat steps 1-4 for each referenced JPEG object.

The steps above describe non-persistent connections because the TCP connection is closed each time the server has sent an object. In other words, no connection lasts long enough to carry another object: each TCP connection transmits exactly one request message and one response message. In the preceding example, each time a user requests the web page, 11 TCP connections are created.

2) Persistent connection

Non-persistent connections have some disadvantages. First, the client must establish and maintain a new connection for each requested object. For each such connection, TCP must allocate buffers and maintain TCP variables on both the client and the server. This places a heavy burden on a web server that may be serving requests from hundreds of different clients at the same time. Second, each object incurs a response delay of two RTTs (round-trip times): one RTT to establish the TCP connection and another to request and receive the object. Finally, each object suffers from TCP slow start, because every TCP connection begins in the slow-start phase. Using parallel TCP connections can partly offset the RTT and slow-start delays.

With persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and responses between the same client and server can be sent over this connection. The entire web page (in the preceding example, a base HTML file and 10 images) can be sent over a single persistent TCP connection; even multiple web pages stored on the same server can be sent over it. Typically, the HTTP server closes a connection after it has gone unused for a certain period, and this period is usually configurable.

Persistent connections come in two versions: without pipelining and with pipelining. Without pipelining, the client sends a new request only after receiving the response to the previous one. In this case, each object referenced by the web page (the 10 images in the preceding example) incurs one RTT of delay to request and receive it. This improves on the two RTTs per object of non-persistent connections, but pipelining can reduce the response delay further. Another disadvantage of the non-pipelined version is that after sending an object the server waits for the next request, which cannot arrive immediately. During this time, the server resources held by the connection sit idle.
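
The difference in connection count between the two modes can be observed on a local machine. The following is a minimal sketch, assuming Python's standard http.server and http.client modules. CountingHandler, PATHS, and the fetch_* and count_connections helpers are hypothetical names, and the server returns placeholder bodies rather than real HTML or JPEG data.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open
    connections = 0                # TCP connections accepted so far

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b"object " + self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# One base HTML file plus 10 images, as in the example above.
PATHS = ["/index.html"] + [f"/img{i}.jpg" for i in range(1, 11)]

def fetch_non_persistent(host, port):
    for path in PATHS:
        conn = http.client.HTTPConnection(host, port)  # new TCP connection
        conn.request("GET", path, headers={"Connection": "close"})
        conn.getresponse().read()
        conn.close()

def fetch_persistent(host, port):
    conn = http.client.HTTPConnection(host, port)      # one TCP connection
    for path in PATHS:
        conn.request("GET", path)
        conn.getresponse().read()  # drain the response before the next request
    conn.close()

def count_connections(fetch):
    """Run fetch against a throwaway local server and return connections used."""
    CountingHandler.connections = 0
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        fetch("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections

if __name__ == "__main__":
    print("non-persistent:", count_connections(fetch_non_persistent))
    print("persistent:", count_connections(fetch_persistent))
```

Note that http.client does not pipeline: each request is sent only after the previous response has been read, so fetch_persistent corresponds to the version without pipelining.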

HTTP/1.1 uses persistent connections by default and allows requests to be pipelined on them. With pipelining, the HTTP client sends a request as soon as it encounters a reference, so it can send the requests for the referenced objects back to back. After receiving these requests, the server sends the objects one after another. If all requests and all responses are sent back to back, all referenced objects together incur only one RTT of delay (rather than one RTT per referenced object, as in the version without pipelining). In addition, a pipelined persistent connection spends less time idle on the server waiting for requests. Compared with non-persistent connections, persistent connections (with or without pipelining) save one RTT of response delay per object and also suffer less from slow start. The reason is that every object uses the same TCP connection, so after sending the first object the server does not have to send subsequent objects at the initial slow rate. Instead, it can start sending the next object at the rate reached while sending the first one.
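
The RTT figures discussed above can be put side by side with a small back-of-the-envelope calculation. This is a deliberately simplified model: it assumes objects are fetched serially, and it ignores transmission time, slow start, and parallel connections. page_load_rtts and its mode names are hypothetical.

```python
def page_load_rtts(num_referenced, mode):
    """Estimated RTTs to fetch one base HTML file plus its referenced objects."""
    objects = 1 + num_referenced
    if mode == "non-persistent":
        # Every object pays 1 RTT for the TCP handshake + 1 RTT for request/response.
        return 2 * objects
    if mode == "persistent":
        # One handshake, then 1 RTT per object, one after another.
        return 1 + objects
    if mode == "pipelined":
        # One handshake, 1 RTT for the base file, 1 RTT for all referenced objects.
        return 1 + 1 + (1 if num_referenced > 0 else 0)
    raise ValueError(f"unknown mode: {mode!r}")

# The example page: a base HTML file that references 10 images.
for mode in ("non-persistent", "persistent", "pipelined"):
    print(mode, page_load_rtts(10, mode))
```

For the example page this gives 22 RTTs for non-persistent connections, 12 for a persistent connection without pipelining, and 3 with pipelining.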

3. Cache mechanism

The purpose of caching in HTTP/1.1 is to eliminate the need to send requests in many cases, and to eliminate the need to send full responses in many other cases. The former reduces the number of network round trips, and HTTP uses an "expiration" mechanism for this purpose. The latter reduces network bandwidth requirements, and HTTP uses a "validation" mechanism for this purpose.

HTTP defines three caching mechanisms:

Freshness allows a response to be used without re-checking it on the origin server, and can be controlled by both the server and the client. For example, the Expires response header gives a date when the document becomes stale, and the Cache-Control: max-age directive tells the cache how many seconds the response is fresh.

Validation can be used to check whether a cached response is still good after it becomes stale. For example, if the response has a Last-Modified header, a cache can make a conditional request using the If-Modified-Since header to see if it has changed.

Invalidation is usually a side effect of another request that passes through the cache. For example, if the URL associated with a cached response subsequently gets a POST, PUT or DELETE request, the cached response will be invalidated.
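
The freshness and validation checks can be sketched as follows. This is an illustration under simplifying assumptions, not a complete cache: freshness_lifetime, is_fresh, and validation_headers are hypothetical helpers, header names are assumed to be stored in their canonical spelling, the age is computed from the Date header only, and heuristic freshness is left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers):
    """Seconds a response stays fresh: max-age first, else Expires minus Date."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return 0  # no explicit lifetime: treat as stale

def is_fresh(headers, now):
    """True while the response may be reused without contacting the server."""
    age = (now - parsedate_to_datetime(headers["Date"])).total_seconds()
    return age < freshness_lifetime(headers)

def validation_headers(headers):
    """Extra request headers for revalidating a stale cached response."""
    if "Last-Modified" in headers:
        return {"If-Modified-Since": headers["Last-Modified"]}
    return {}

# Headers taken from the response example earlier in this article.
cached = {
    "Cache-Control": "private, max-age=30",
    "Date": "Mon, 25 May 2009 03:20:02 GMT",
    "Expires": "Mon, 25 May 2009 03:20:33 GMT",
    "Last-Modified": "Mon, 25 May 2009 03:20:03 GMT",
}
ten_seconds_later = datetime(2009, 5, 25, 3, 20, 12, tzinfo=timezone.utc)
one_minute_later = datetime(2009, 5, 25, 3, 21, 2, tzinfo=timezone.utc)
print(is_fresh(cached, ten_seconds_later), is_fresh(cached, one_minute_later))
print(validation_headers(cached))
```

Once the response is stale, the cache sends the conditional request; a 304 Not Modified reply means the cached copy can still be used without transferring the body again.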

For more information about web caching, see "Caching Tutorial for Web Authors and Webmasters" (in English).

4. Challenge-response authentication mechanism

The server can use these mechanisms to challenge a client request, and the client can use them to provide authentication information.

For more information, see RFC 2617: HTTP Authentication: Basic and Digest Access Authentication.

A more detailed article is also available: http://www.blogjava.net/redhatlinux/archive/2009/02/17/255109.html
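
As an illustration of the Basic scheme from RFC 2617: the server challenges with a 401 response carrying a WWW-Authenticate: Basic header, and the client repeats the request with an Authorization header that holds the Base64 encoding of "user:password". The sketch below shows both sides. basic_authorization and check_basic are hypothetical helpers; the credentials are the ones used as an example in the RFC.

```python
import base64

def basic_authorization(username, password):
    """Client side: value of the Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def check_basic(header_value, expected_user, expected_password):
    """Server side: decode an Authorization header and compare credentials."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:  # covers bad Base64 padding and invalid UTF-8
        return False
    user, _, password = decoded.partition(":")
    return (user, password) == (expected_user, expected_password)

header = basic_authorization("Aladdin", "open sesame")
print("Authorization:", header)
print(check_basic(header, "Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so Basic credentials are readable by anyone who can see the request. A production check would also compare credentials in constant time, which this sketch does not do.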

HTTP Chinese version

