An analysis of the HTTP protocol


Source: http://www.cnblogs.com/gpcuster/archive/2009/05/25/1488749.html

What is the HTTP protocol?

Simply put, HTTP is a communication specification at the application layer: for two parties to communicate, both must abide by a common set of rules, and that set of rules is the HTTP protocol.

What can the HTTP protocol do?

Most people will first think of browsing the web. Browsing the web is indeed the main application of HTTP, but that does not mean HTTP can only be used for web browsing. HTTP is a protocol, and it can be used wherever both parties to a communication adhere to it. For example, commonly used software such as QQ and Xunlei (Thunder) uses the HTTP protocol, alongside other protocols.

How does the HTTP protocol work?

We all know the general communication flow: first the client sends a request to the server, and after receiving the request the server generates a response and sends it back to the client.

For this communication process, the HTTP protocol defines the following 4 areas:

1. Format of requests and responses

Request Format:

Request line
Request headers
Blank line
Optional message body

Note: The request line and each header must end with <CR><LF> (a carriage return followed by a line feed). The blank line must contain only <CR><LF>, with no spaces or other characters. In HTTP/1.1, all request headers except Host are optional.

Example:

GET / HTTP/1.1
Host: gpcuster.cnblogs.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Mon, 03:19:18 GMT
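The request format above can be sketched in a few lines of Python. This is only an illustration of the layout (request line, headers, blank line, optional body, all separated by CRLF); the function name `build_request` is made up for this example and is not part of any library.

```python
CRLF = "\r\n"

def build_request(method, target, headers, body=b"", version="HTTP/1.1"):
    """Serialize an HTTP/1.x request: request line, headers, blank line, body."""
    # HTTP/1.1 requires the Host header; every other request header is optional.
    if version == "HTTP/1.1" and not any(name.lower() == "host" for name, _ in headers):
        raise ValueError("HTTP/1.1 requests must carry a Host header")
    lines = [f"{method} {target} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends with CRLF; the blank line is a bare CRLF with nothing else in it.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body

request = build_request(
    "GET", "/",
    [("Host", "gpcuster.cnblogs.com"), ("Connection", "keep-alive")],
)
print(request)
```

The header list is kept as ordered pairs rather than a dictionary so that the headers are written out in the order given.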

Response Format:

Status line
Response headers
Blank line
Optional message body

Example:

HTTP/1.1 200 OK
Cache-Control: private, max-age=30
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Expires: Mon, 03:20:33 GMT
Last-Modified: Mon, 03:20:03 GMT
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Mon, 03:20:02 GMT
Content-Length: 12173

(message body omitted)
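Going the other way, a response can be taken apart along the same boundaries. The following is a minimal sketch, assuming the whole response is already in memory and each header appears only once; `parse_response` and the sample bytes are illustrative and not taken from the original article.

```python
def parse_response(raw):
    """Split a raw HTTP/1.x response into status line parts, headers, and body."""
    # The first blank line (CRLF CRLF) separates the headers from the message body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lower case.
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body

sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
version, status, reason, headers, body = parse_response(sample)
print(status, reason, headers["content-length"], body)
```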

For detailed information, please refer to: RFC 2616.

For a brief introduction to HTTP headers, please view: Quick reference to HTTP headers

2. How to establish a connection

HTTP supports 2 ways of establishing connections: non-persistent connections and persistent connections (persistent connections are the default in HTTP/1.1).

1) Non-persistent connection

Let's look at the steps for transferring a web page from the server to the client over non-persistent connections. Assume that the page consists of 1 base HTML file and 10 JPEG images, and that all of these objects reside on the same server host. Also assume that the URL of the base HTML file is: gpcuster.cnblogs.com/somepath/index.html.

The concrete steps are:

1. The HTTP client initiates a TCP connection to the HTTP server on the server host gpcuster.cnblogs.com. The HTTP server listens on the default port number 80 for connection requests from HTTP clients.

2. The HTTP client sends an HTTP request message through the local socket associated with the TCP connection. This message contains the path name /somepath/index.html.

3. The HTTP server receives the request message through the local socket associated with the TCP connection, retrieves the object /somepath/index.html from the server host's memory or hard disk, and sends a response message containing the object through the same socket.

4. The HTTP server tells TCP to close the TCP connection (although TCP will not actually terminate the connection until the client has received the response message).

5. The HTTP client receives the response message through the same socket, and the TCP connection is then terminated. The message indicates that the encapsulated object is an HTML file. The client extracts the file, parses it, and finds references to the 10 JPEG objects.

6. Steps 1-4 are repeated for each referenced JPEG object.

These steps are said to use non-persistent connections because the TCP connection is closed each time the server has sent an object; that is, no connection persists long enough to be used for transferring other objects. Each TCP connection carries exactly one request message and one response message. In the example above, 11 TCP connections are created each time a user requests the web page.
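The count of 11 connections can be checked on a local machine. The sketch below is an illustration using Python's standard `http.server` and `http.client` modules; the handler class, the helper names, and the object paths are invented for this example. It counts how many TCP connections a small local server accepts when 11 objects are fetched with one connection per object, and again when a single connection is reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_accepted = 0
counter_lock = threading.Lock()

class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        global connections_accepted
        with counter_lock:
            connections_accepted += 1
        super().setup()

    def do_GET(self):
        body = b"object at " + self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch_non_persistent(port, paths):
    for path in paths:  # a new TCP connection for every object
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", path, headers={"Connection": "close"})
        conn.getresponse().read()
        conn.close()

def fetch_persistent(port, paths):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    for path in paths:  # the same TCP connection is reused
        conn.request("GET", path)
        conn.getresponse().read()
    conn.close()

def count_connections(fetch, paths):
    global connections_accepted
    connections_accepted = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch(server.server_address[1], paths)
    finally:
        server.shutdown()
        server.server_close()
    return connections_accepted

paths = ["/somepath/index.html"] + [f"/somepath/image{i}.jpg" for i in range(1, 11)]
print(count_connections(fetch_non_persistent, paths))
print(count_connections(fetch_persistent, paths))
```

The reused connection in the second run is the persistent mode described in the next part.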

2) Persistent connection

Non-persistent connections have some drawbacks. First, the client has to establish and maintain a new connection for each requested object. For each such connection, TCP must allocate buffers and maintain TCP variables on both the client and the server. This can significantly burden a web server that may be serving requests from hundreds of different clients at the same time. Second, each object incurs a response delay of 2 RTTs: one RTT to establish the TCP connection and another RTT to request and receive the object. Finally, each object suffers from TCP slow start, because every TCP connection begins in the slow-start phase. The use of parallel TCP connections can, however, partially mitigate the RTT delay and the slow-start delay.

With persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and responses between the same client and server can be sent over this connection. An entire web page (in the previous example, a base HTML file and 10 images) can be sent over a single persistent TCP connection; even multiple web pages hosted on the same server can be sent over a single persistent TCP connection. Typically, an HTTP server closes a connection after it has been idle for a certain period of time, which is usually configurable.

Persistent connections come in two versions: without pipelining and with pipelining. In the version without pipelining, the client issues a new request only after receiving the response to the previous one. In this case, each object referenced by the web page (the 10 images in the previous example) incurs a delay of 1 RTT to request and receive the object. This is an improvement over the 2 RTTs per object of non-persistent connections, but persistent connections with pipelining can reduce the response delay further. Another disadvantage of the version without pipelining is that after the server sends an object it waits for the next request, which does not arrive immediately; server resources sit idle during this time.

HTTP/1.1 uses persistent connections by default and allows requests to be pipelined on them. With pipelining, the HTTP client issues a request as soon as it encounters a reference, so it can send the requests for the referenced objects back to back. After the server receives these requests, it can also send the objects back to back. If all the requests are sent back to back and all the responses are sent back to back, then all the referenced objects together incur only about 1 RTT of delay (rather than 1 RTT for each referenced object, as in the version without pipelining). In addition, with pipelining the server spends less time idle waiting for requests. Compared with non-persistent connections, persistent connections (with or without pipelining) not only save 1 RTT of response delay per object but also reduce the slow-start delay. The reason is that since all objects use the same TCP connection, after sending the first object the server does not have to send subsequent objects at the initial slow rate. Instead, it can start sending the next object at the rate reached while sending the first.
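The RTT arithmetic behind this comparison can be written down as a rough model. The sketch below rests on simplifying assumptions that are not in the original text: connections are used one after another (no parallel connections), transmission time and slow start are ignored, and the function names are invented for illustration.

```python
def non_persistent_rtts(num_objects):
    # Each object costs 1 RTT for the TCP handshake plus 1 RTT for request/response.
    return 2 * num_objects

def persistent_rtts(num_objects):
    # One handshake, then 1 RTT per object, requested one at a time.
    return 1 + num_objects

def pipelined_rtts(num_referenced):
    # One handshake, 1 RTT for the base HTML file, then about 1 RTT for
    # all referenced objects, whose requests are sent back to back.
    return 1 + 1 + (1 if num_referenced > 0 else 0)

# The example page: 1 base HTML file plus 10 JPEG images.
print(non_persistent_rtts(11))  # 22
print(persistent_rtts(11))      # 12
print(pipelined_rtts(10))       # 3
```

Under these assumptions the page costs 22 RTTs over non-persistent connections, 12 RTTs over one persistent connection without pipelining, and 3 RTTs with pipelining.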

3. Caching Mechanisms

The goal of caching in HTTP/1.1 is to eliminate the need to send requests in many cases, and to eliminate the need to send full responses in many other cases. The former reduces the number of network round trips; HTTP uses an "expiration" mechanism for this purpose. The latter reduces the network bandwidth required; HTTP uses a "validation" mechanism for this purpose.

HTTP defines 3 kinds of caching mechanisms:

  1. Freshness allows a response to be used without re-checking it on the origin server, and can be controlled by both the server and the client. For example, the Expires response header gives a date when the document becomes stale, and the Cache-Control: max-age directive tells the cache how many seconds the response is fresh for.

  2. Validation can be used to check whether a cached response is still good after it becomes stale. For example, if the response has a Last-Modified header, a cache can make a conditional request using the If-Modified-Since header to see if it has changed.

  3. Invalidation is usually a side effect of another request that passes through the cache. For example, if a URL associated with a cached response subsequently gets a POST, PUT or DELETE request, the cached response will be invalidated.
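The three mechanisms can be sketched as the decision logic of a very small cache. This is a simplified illustration, not how any particular cache is implemented: freshness is judged only by a max-age value against the time of storage, and the class names, method names, and sample date are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class CachedResponse:
    body: bytes
    stored_at: float     # seconds on the cache's own clock
    max_age: float       # taken from Cache-Control: max-age
    last_modified: str   # taken from the Last-Modified header

class TinyCache:
    UNSAFE_METHODS = {"POST", "PUT", "DELETE"}

    def __init__(self):
        self.entries = {}

    def store(self, url, entry):
        self.entries[url] = entry

    def plan_request(self, method, url, now):
        """Decide what to do with a request that passes through the cache."""
        if method in self.UNSAFE_METHODS:
            self.entries.pop(url, None)        # invalidation
            return "forward", {}
        entry = self.entries.get(url)
        if method != "GET" or entry is None:
            return "forward", {}
        if now - entry.stored_at < entry.max_age:
            return "serve-from-cache", {}      # freshness: no request is sent
        # Stale copy: ask the origin server whether it is still good (validation).
        return "revalidate", {"If-Modified-Since": entry.last_modified}

    def handle_validation(self, url, status, now, new_entry=None):
        if status == 304:                      # Not Modified: reuse the cached body
            self.entries[url].stored_at = now
        elif new_entry is not None:            # full response: replace the copy
            self.entries[url] = new_entry
        return self.entries[url]

cache = TinyCache()
cache.store("/index.html", CachedResponse(b"v1", 0.0, 30.0, "Thu, 01 Jan 2015 00:00:00 GMT"))
print(cache.plan_request("GET", "/index.html", now=10.0))
print(cache.plan_request("GET", "/index.html", now=40.0))
```

A 304 answer to the conditional request saves the bandwidth of a full response, which is the purpose of validation.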

For information on Web caching, refer to: Caching Tutorial for Web Authors and Webmasters (English) (Chinese version)

4. Challenge-response authentication mechanism

The server can use these mechanisms to challenge a client request, and the client can use them to provide authentication information.

For more information, please refer to: RFC 2617: HTTP Authentication: Basic and Digest Access Authentication
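For the Basic scheme, the challenge and the answer can be sketched as follows. The server answers 401 with a WWW-Authenticate header, and the client retries with an Authorization header carrying the base64 encoding of "user:password". The credential store, realm name, and function names below are hypothetical, and a real server would not keep plaintext passwords or compare them this way.

```python
import base64

REALM = "example"                    # hypothetical realm name
USERS = {"Aladdin": "open sesame"}   # hypothetical credential store

def basic_credentials(user_id, password):
    """Build the value of the Authorization header for the Basic scheme."""
    raw = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def server_check(request_headers):
    """Return (status, response headers): 200 if authorized, else a 401 challenge."""
    scheme, _, token = request_headers.get("Authorization", "").partition(" ")
    if scheme == "Basic":
        try:
            decoded = base64.b64decode(token).decode("utf-8")
        except ValueError:  # malformed base64 or non-UTF-8 bytes
            decoded = ""
        user_id, _, password = decoded.partition(":")
        if user_id in USERS and USERS[user_id] == password:
            return 200, {}
    return 401, {"WWW-Authenticate": f'Basic realm="{REALM}"'}

# First attempt without credentials: the server challenges the request.
print(server_check({}))
# Second attempt with credentials: the request is accepted.
print(server_check({"Authorization": basic_credentials("Aladdin", "open sesame")}))
```

Basic authentication only encodes the password and does not encrypt it, so it is normally combined with HTTPS.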

5. HTTP-based applications

1 HTTP Proxy

Principle

Classification

  1. Transparent proxy
  2. Non-transparent proxy
  3. Reverse Proxy

2 Multi-threaded downloads

      1. The download tool starts multiple threads, each of which makes HTTP requests
      2. Each HTTP request asks for only one part of the resource file, and the response identifies that part: Content-Range: bytes 20000-40000/47000
      3. The parts downloaded by the threads are merged into one file
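The three steps above can be sketched without any network access. In this illustration the function `fake_fetch` stands in for a real HTTP request: it takes the byte offsets that a `Range: bytes=first-last` request header would carry and returns what a partial response would contain. All names are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def split_ranges(total_size, parts):
    """Return inclusive (first, last) byte offsets covering 0..total_size-1."""
    chunk = -(-total_size // parts)  # ceiling division
    return [(start, min(start + chunk, total_size) - 1)
            for start in range(0, total_size, chunk)]

def range_header(first, last):
    """The request header that asks for one part of the resource."""
    return {"Range": f"bytes={first}-{last}"}

def fake_fetch(resource, first, last):
    """Stand-in for an HTTP request; returns (Content-Range value, partial body)."""
    body = resource[first:last + 1]
    return f"bytes {first}-{last}/{len(resource)}", body

def download(resource, parts):
    ranges = split_ranges(len(resource), parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # map() returns results in input order, so the pieces are ready to merge.
        results = list(pool.map(lambda r: fake_fetch(resource, *r), ranges))
    return b"".join(body for _, body in results)

resource = bytes(range(256)) * 100   # pretend this is the file on the server
print(split_ranges(47000, 3))
print(download(resource, 4) == resource)
```

Ranges are inclusive at both ends, which is why the last offset of each part is one less than the start of the next.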

3 How the HTTPS transport protocol works

Two basic kinds of encryption and decryption algorithms

Symmetric encryption: there is only one key, the same key is used for encryption and decryption, and encryption and decryption are fast. Typical symmetric encryption algorithms include DES and AES.

Asymmetric encryption: keys come in pairs (and the private key cannot feasibly be inferred from the public key), and encryption and decryption use different keys (data encrypted with the public key must be decrypted with the private key, and data encrypted with the private key must be decrypted with the public key). It is slower than symmetric encryption. Typical asymmetric algorithms include RSA and DSA.

HTTPS Communication process

Advantages

      1. The key generated by the client is known only to the client and the server
      2. Encrypted data can be decrypted to plaintext only by the client and the server
      3. Communication between the client and the server is secure
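The idea behind these advantages can be shown with a toy model: the client picks a symmetric key, wraps it with the server's public key, and both sides then use the symmetric key for the data. The sketch below is a deliberately insecure illustration and not the real TLS handshake. It uses textbook RSA with tiny primes and an XOR cipher in place of a real symmetric algorithm, and all names are invented for this example.

```python
import secrets

# Server's toy RSA key pair (tiny primes: for illustration only, NOT secure).
P, Q = 61, 53
N = P * Q                           # modulus, part of both keys
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)

def rsa_apply(value, exponent):
    """Textbook RSA: the same operation encrypts (with E) and decrypts (with D)."""
    return pow(value, exponent, N)

def xor_cipher(data, key):
    """Toy symmetric cipher: applying it twice with the same key restores the data."""
    key_bytes = key.to_bytes(2, "big")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(data))

def client_hello():
    """Client side: choose a symmetric key and wrap it with the public key."""
    session_key = secrets.randbelow(N - 2) + 2
    return session_key, rsa_apply(session_key, E)

def server_unwrap(wrapped_key):
    """Server side: recover the symmetric key with the private key."""
    return rsa_apply(wrapped_key, D)

session_key, wrapped = client_hello()   # only `wrapped` crosses the network
server_key = server_unwrap(wrapped)
ciphertext = xor_cipher(b"GET / HTTP/1.1", session_key)
print(xor_cipher(ciphertext, server_key))
```

Using the slow asymmetric step only to exchange the key, and the fast symmetric cipher for the data, reflects the speed difference between the two kinds of algorithms described above.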

4 Request methods commonly used when developing a web application

HEAD

The HEAD method asks for a response identical to the response to the corresponding GET request, but without the response body. This is useful for retrieving the meta-information in the response headers, because the content itself does not have to be transferred.

TRACE

The TRACE method asks the server to echo back the request it received. This lets the client see what intermediate servers added or changed in the request along the way.

OPTIONS

Returns the HTTP methods that the server supports for the specified URL. It can also be used to check the functionality of a web server by requesting "*" instead of a specific resource.

CONNECT

Converts the request connection into a transparent TCP/IP tunnel, usually to allow SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.
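The behavior of HEAD, TRACE and OPTIONS can be sketched with a small in-memory dispatcher. This is an illustration of the semantics only, with invented names and content; CONNECT is left out because it turns the connection into a tunnel rather than returning a resource.

```python
RESOURCES = {"/index.html": b"<html>hello</html>"}   # hypothetical content
SUPPORTED = ["GET", "HEAD", "OPTIONS", "TRACE"]

def handle(method, target, raw_request=b""):
    """Return (status, headers, body) for a request, in memory only."""
    if method == "OPTIONS":
        # Works for a specific URL and for "*" (the server as a whole).
        return 200, {"Allow": ", ".join(SUPPORTED), "Content-Length": "0"}, b""
    if method == "TRACE":
        # Echo the request exactly as it was received.
        headers = {"Content-Type": "message/http",
                   "Content-Length": str(len(raw_request))}
        return 200, headers, raw_request
    if method in ("GET", "HEAD"):
        body = RESOURCES.get(target)
        if body is None:
            return 404, {"Content-Length": "0"}, b""
        headers = {"Content-Type": "text/html", "Content-Length": str(len(body))}
        # HEAD sends the same headers as GET but no body.
        return 200, headers, (body if method == "GET" else b"")
    return 405, {"Allow": ", ".join(SUPPORTED), "Content-Length": "0"}, b""

print(handle("GET", "/index.html"))
print(handle("HEAD", "/index.html"))
print(handle("OPTIONS", "*"))
```

Note that the HEAD response still reports the Content-Length of the resource, which is how a client can learn the size of a resource without downloading it.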

5 User-server interaction

      1. Authentication
      2. Cookies
      3. Conditional GET

6 Developing HTTP-compliant programs with socket programming

Postscript:

This article is only an introduction to the HTTP protocol and leaves out many details; interested readers should read RFC 2616.

Good books for learning the HTTP protocol:

1. O'Reilly - HTTP Pocket Reference: a short introductory book on the HTTP protocol that works well as a primer

2. O'Reilly - HTTP: The Definitive Guide: a hefty book that covers a great deal of material, and a good first choice for studying the HTTP protocol comprehensively

3. Sams - HTTP Developer's Handbook: this book is shorter than HTTP: The Definitive Guide. In my opinion it is the better of the two, because it presents the essence of HTTP in less space; I think it should be the first choice for web programmers
