The historical evolution and design ideas of the HTTP protocol

Source: Internet
Author: User

HTTP is the foundational protocol of the Internet and essential knowledge for web development. The release of its latest version, HTTP/2, has made it a technical hot topic.

This article introduces the historical evolution and design ideas of the HTTP protocol.

I. HTTP/0.9

HTTP is an application-layer protocol built on TCP/IP. It is not concerned with how packets are transmitted; it mainly specifies the communication format between the client and the server, and it uses port 80 by default.

The earliest version, 0.9, was released in 1991. It is extremely simple and has only one command: GET.

    GET /index.html

The command above means that, once the TCP connection is established, the client requests the web page index.html from the server.

The protocol stipulates that the server can only respond with an HTML-formatted string, not with any other format.

    <html>
      <body>Hello World</body>
    </html>

Once the server has finished sending, it closes the TCP connection.

II. HTTP/1.0

2.1 Introduction

HTTP/1.0 was released in May 1996, with greatly expanded content.

First, content in any format can be sent. This allows the Internet to transfer not only text but also images, video, and binary files, which laid the foundation for the Internet's great expansion.

Second, in addition to GET, the POST and HEAD commands were introduced, enriching the ways in which browsers and servers can interact.

Third, the format of HTTP requests and responses changed. In addition to the data section, every message must include header information (HTTP headers) that describes some metadata.

Other new features include status codes, support for multiple character sets, multi-part types, authorization, caching, content encoding, and more.

2.2 Request Format

The following is an example of a version 1.0 HTTP request.

    GET / HTTP/1.0
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
    Accept: */*

As you can see, the format has changed a lot from version 0.9.

The first line is the request command, and the protocol version (HTTP/1.0) must be appended at the end. The lines that follow are header fields describing the client.
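
As a rough illustration of the request format just described, the following Python sketch assembles the same request from its parts. The function name build_request is made up for this example, and it ignores details a real client would need (a request body, header validation).

```python
def build_request(method: str, path: str, headers: dict) -> bytes:
    """Serialize a request line plus header fields, HTTP/1.0 style."""
    lines = [f"{method} {path} HTTP/1.0"]  # command, path, protocol version
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("GET", "/", {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)",
    "Accept": "*/*",
})
print(request.decode("ascii"))
```

The header information is encoded as ASCII here, matching the rule that headers must be ASCII even when the data that follows is not.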

2.3 Response Format

The server responds as follows.

    HTTP/1.0 200 OK
    Content-Type: text/plain
    Content-Length: 137582
    Expires: Thu, 05 Dec 1997 16:00:00 GMT
    Last-Modified: Wed, 5 August 1996 15:55:28 GMT
    Server: Apache 0.84

    <html>
      <body>Hello World</body>
    </html>

The format of the response is "header information + a blank line (\r\n) + data". The first line is "protocol version + status code + status description".
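
To make the "header information + blank line + data" layout concrete, here is a minimal Python sketch that takes a raw response apart. The name parse_response is hypothetical, and the sketch assumes well-formed input with no repeated or folded header fields.

```python
def parse_response(raw: bytes):
    """Split a raw response into (version, status, reason, headers, body)."""
    # The first blank line separates the header information from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    # Status line: protocol version, status code, status description.
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Server: Apache 0.84\r\n"
    b"\r\n"
    b"<html>\n  <body>Hello World</body>\n</html>"
)
version, status, reason, headers, body = parse_response(raw)
print(version, status, reason)
print(headers["content-type"], len(body))
```

Field names are lowercased before being stored because HTTP header names are case-insensitive.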

2.4 Content-Type Field

As for character encoding, version 1.0 stipulates that header information must be ASCII, while the data that follows can be in any format. Therefore, when the server responds, it must tell the client what format the data is in. This is the role of the Content-Type field.

The following are the values of some common Content-Type fields.

  • text/plain

  • text/html

  • text/css

  • image/jpeg

  • image/png

  • image/svg+xml

  • audio/mp4

  • video/mp4

  • application/javascript

  • application/pdf

  • application/zip

  • application/atom+xml

These data types are collectively called MIME types. Each value consists of a top-level type and a subtype, separated by a slash.

In addition to the predefined types, vendors can define their own types.

    application/vnd.debian.binary-package

The type above indicates that a binary package for the Debian system is being sent.

A MIME type can also carry parameters, added after a semicolon at the end.

    Content-Type: text/html; charset=utf-8

The above type indicates that a Web page is being sent, and that the encoding is UTF-8.
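
The structure of such a value (top-level type, subtype, optional parameters) can be sketched in a few lines of Python. The helper parse_content_type is an illustrative name, and the naive split on semicolons is an assumption that would break on quoted parameter values that themselves contain a semicolon.

```python
def parse_content_type(value: str):
    """Return (top_level_type, subtype, params) for a Content-Type value."""
    mime_type, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for item in raw_params:
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, val = item.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')
    # The two levels of the type are separated by a slash.
    top_level, _, subtype = mime_type.lower().partition("/")
    return top_level, subtype, params


print(parse_content_type("text/html; charset=utf-8"))
print(parse_content_type("application/vnd.debian.binary-package"))
```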

When making a request, the client can use the Accept field to declare which data formats it can accept.

    Accept: */*

In the code above, the client declares that it can accept data in any format.

MIME types are used not only in the HTTP protocol but also elsewhere, such as in HTML pages.

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <!-- equivalent to -->
    <meta charset="utf-8" />

2.5 Content-Encoding Field

Since the data sent can be in any format, it can be compressed before it is sent. The Content-Encoding field describes the compression method used for the data.

    Content-Encoding: gzip
    Content-Encoding: compress
    Content-Encoding: deflate
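
As a small sketch of how this field is used, the Python code below compresses a body on the sending side and lets the receiving side pick the matching decompressor from the Content-Encoding value. The function names are hypothetical. Only gzip and deflate are handled, because the Python standard library has no module for the compress (LZW) format; deflate is treated as a zlib-wrapped stream, which is how that content coding is defined.

```python
import gzip
import zlib


def encode_body(body: bytes, encoding: str) -> bytes:
    """Compress a body with the method named by Content-Encoding."""
    if encoding == "gzip":
        return gzip.compress(body)
    if encoding == "deflate":
        return zlib.compress(body)  # zlib-wrapped deflate stream
    raise ValueError(f"unsupported encoding: {encoding}")


def decode_body(body: bytes, headers: dict) -> bytes:
    """Undo the compression announced in the Content-Encoding field."""
    encoding = headers.get("Content-Encoding", "identity")
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)
    if encoding == "identity":
        return body  # no compression was applied
    raise ValueError(f"unsupported encoding: {encoding}")


page = b"<html><body>Hello World</body></html>\n" * 50
wire = encode_body(page, "gzip")
restored = decode_body(wire, {"Content-Encoding": "gzip"})
print(len(page), len(wire), restored == page)
```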

When making a request, the client uses the Accept-Encoding field to state which compression methods it can accept.

    Accept-Encoding: gzip, deflate

2.6 Disadvantages

The main disadvantage of HTTP/1.0 is that only one request can be sent per TCP connection. Once the data has been sent, the connection is closed, and a new connection must be created to request any further resource.

Creating a new TCP connection is expensive, because it requires a three-way handshake between client and server and begins at a low sending rate (slow start). As a result, the performance of HTTP/1.0 is relatively poor, and the problem becomes more prominent as web pages load more and more external resources.

To solve this problem, some browsers added a non-standard Connection field to their requests.

    Connection: keep-alive

This field asks the server not to close the TCP connection, so that it can be reused for other requests. The server responds with the same field.

    Connection: keep-alive

This establishes a reusable TCP connection that stays open until the client or the server actively closes it. However, this is not a standard field, and different implementations may behave inconsistently, so it is not a fundamental solution.

III. HTTP/1.1

In January 1997, HTTP/1.1 was released, only half a year after version 1.0. It further refined the HTTP protocol and, still in use 20 years later, remains the most popular version.

3.1 Persistent Connections

The biggest change in version 1.1 is the introduction of persistent connections: TCP connections are not closed by default and can be reused by multiple requests, without having to declare Connection: keep-alive.

The client and the server can each close the connection on their own initiative if they find that the other side has been inactive for some time. However, the canonical practice is for the client to send Connection: close with its last request, explicitly asking the server to close the TCP connection.

    Connection: close

Currently, most browsers allow up to 6 persistent connections to the same domain name at the same time.
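
Connection reuse of this kind can be observed locally with Python's standard library. The sketch below is only an illustration: the handler class and the function client_ports_seen are names invented here, the server is a throwaway one bound to a free port on 127.0.0.1, and each response body is simply the client-side TCP port the server saw. If both requests travel over the same TCP connection, both bodies show the same port.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = str(self.client_address[1]).encode("ascii")  # client's TCP port
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports_seen(paths):
    """Request each path over one HTTPConnection; return the ports the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        ports = []
        for path in paths:
            conn.request("GET", path)
            ports.append(conn.getresponse().read().decode("ascii"))
        return ports
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


ports = client_ports_seen(["/a", "/b"])
print(ports, "same connection:", ports[0] == ports[1])
```

Note that each response declares its Content-Length; without it the client could not tell where one response ends on a connection that stays open.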

3.2 Pipelining

Version 1.1 also introduces pipelining, in which the client can send multiple requests over the same TCP connection without waiting for each response. This further improves the efficiency of the HTTP protocol.

For example, suppose a client needs to request two resources. Previously, on the same TCP connection, it would send request A, wait for the server's response, and only then send request B. Pipelining allows the browser to send requests A and B at the same time, but the server still responds in order: it responds to request A first and, once that is complete, responds to request B.

3.3 Content-Length Field

A TCP connection can now carry multiple responses, so there must be a mechanism for telling which response a given piece of data belongs to. This is the role of the Content-Length field, which declares the length of the data in this response.

    Content-Length: 3495

The code above tells the browser that the body of this response is 3,495 bytes long and that the bytes after it belong to the next response.

In version 1.0, the Content-Length field is not required, because when the browser sees that the server has closed the TCP connection, it knows that the data it has received is complete.
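
The following Python sketch shows how a receiver might use Content-Length to cut a byte stream carrying several back-to-back responses into individual responses. The function name split_responses is made up, and the sketch assumes every response in the stream carries a Content-Length field (a missing field is treated as length 0).

```python
def split_responses(stream: bytes):
    """Split back-to-back responses into (status_line, body) pairs."""
    responses = []
    pos = 0
    while pos < len(stream):
        head_end = stream.index(b"\r\n\r\n", pos)  # end of this response's headers
        lines = stream[pos:head_end].decode("ascii").split("\r\n")
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        body_start = head_end + 4
        responses.append((lines[0], stream[body_start:body_start + length]))
        pos = body_start + length  # the next response starts right after the body
    return responses


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 404 Not Found\r\nContent-Length: 9\r\n\r\nnot found"
)
for status_line, body in split_responses(stream):
    print(status_line, body)
```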

3.4 Chunked Transfer Encoding

The precondition for using the Content-Length field is that the server must know the length of the response data before it starts sending the response.

For some time-consuming dynamic operations, this means that the server can only send data after all operations have finished, which is obviously inefficient. A better approach is to send each piece of data as soon as it is produced, replacing buffering with streaming.

Therefore, version 1.1 allows chunked transfer encoding to be used in place of the Content-Length field. As long as the headers of a request or response contain a Transfer-Encoding field, the body consists of an undetermined number of data chunks.

    Transfer-Encoding: chunked

Each non-empty chunk is preceded by a hexadecimal value giving the length of the chunk. Finally, a chunk of size 0 indicates that all the data for this response has been sent. Here is an example.

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked

    25
    This is the data in the first chunk

    1C
    and this is the second one

    3
    con

    8
    sequence

    0
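
A chunked body like the one above can be produced and read back with a few lines of Python. This is a minimal sketch with invented function names; it ignores trailers and assumes the input is complete and well formed. In the sample data, the first two chunks end with their own \r\n, which is why their sizes are 0x25 (37) and 0x1C (28) and why blank lines appear in the listing.

```python
def encode_chunked(pieces) -> bytes:
    """Frame each non-empty piece as '<hex length>CRLF<data>CRLF', then a 0 chunk."""
    out = bytearray()
    for piece in pieces:
        if piece:
            out += format(len(piece), "X").encode("ascii") + b"\r\n" + piece + b"\r\n"
    return bytes(out + b"0\r\n\r\n")


def decode_chunked(body: bytes) -> bytes:
    """Reassemble the data from a chunked body."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ';' is an extension we skip.
        size = int(body[pos:line_end].split(b";", 1)[0].decode("ascii").strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-length chunk ends the body
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)


pieces = [
    b"This is the data in the first chunk\r\n",
    b"and this is the second one\r\n",
    b"con",
    b"sequence",
]
wire = encode_chunked(pieces)
print(wire[:4], decode_chunked(wire) == b"".join(pieces))
```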

3.5 Other Features

Version 1.1 also added a number of methods: PUT, PATCH, OPTIONS, and DELETE, alongside the HEAD method carried over from version 1.0.

In addition, the client's request headers gain a Host field, which specifies the domain name of the server.

    Host: www.example.com

With the Host field, requests can be sent to different websites on the same server, which laid the groundwork for the rise of virtual hosting.
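
A server hosting several sites might use the Host field roughly as in the Python sketch below. The SITES table, its entries, and the function select_site are all hypothetical and only illustrate the lookup.

```python
# Hypothetical mapping from host name to the site served for it.
SITES = {
    "www.example.com": "main site",
    "blog.example.com": "blog",
}


def select_site(headers: dict, default: str = "default site") -> str:
    """Pick the site to serve based on the request's Host field."""
    host = headers.get("Host", "")
    hostname = host.split(":", 1)[0].lower()  # drop an optional port, ignore case
    return SITES.get(hostname, default)


print(select_site({"Host": "www.example.com"}))
print(select_site({"Host": "blog.example.com:8080"}))
```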

3.6 Disadvantages

Although version 1.1 allows TCP connections to be reused, all data communication on a given TCP connection happens in sequence. The server only moves on to the next response after it has finished the current one. If an earlier response is particularly slow, many requests queue up behind it. This is called head-of-line blocking.

There are only two ways to avoid this problem: reduce the number of requests, or open several persistent connections at the same time. This has given rise to many web optimization techniques, such as merging scripts and stylesheets, embedding images in CSS code, domain sharding, and so on. All of this extra work could be avoided if the HTTP protocol were better designed.

IV. SPDY Protocol

In 2009, Google disclosed SPDY, a protocol it had developed in-house, mainly to address the inefficiency of HTTP/1.1.

After it was proven feasible in Chrome, this protocol was used as the basis for HTTP/2, which inherited its main features.

V. HTTP/2

HTTP/2 was released in 2015. It is not called HTTP/2.0 because the standards committee does not intend to release minor versions any more; the next new version will be HTTP/3.

5.1 Binary Protocol

In HTTP/1.1, header information is always text (ASCII-encoded), while the data body can be either text or binary. HTTP/2 is a fully binary protocol: both header information and data bodies are binary and are collectively referred to as frames, namely header frames and data frames.

One benefit of a binary protocol is that additional frame types can be defined. HTTP/2 defines about 10 frame types, laying the foundation for future advanced applications. Doing the same with text would make parsing the data cumbersome, whereas binary parsing is much more convenient.
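
To show why binary framing is easy to parse, here is a small Python sketch that packs and unpacks a frame. The layout is not described in this article; it is taken from the HTTP/2 specification (RFC 7540): a 9-byte header made of a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The function names are invented for the example.

```python
# Frame type codes defined by the HTTP/2 specification (RFC 7540).
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the fixed 9-byte frame header."""
    header = (
        len(payload).to_bytes(3, "big")               # 24-bit payload length
        + bytes([frame_type, flags])                  # 8-bit type, 8-bit flags
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")  # reserved bit + 31-bit stream ID
    )
    return header + payload


def unpack_frame(data: bytes):
    """Read one frame from the start of data: (type name, flags, stream ID, payload)."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return FRAME_TYPES.get(frame_type, "UNKNOWN"), flags, stream_id, data[9:9 + length]


frame = pack_frame(0x0, 0x1, 1, b"hello")  # DATA frame on stream 1, END_STREAM flag set
print(frame[:9].hex(), unpack_frame(frame))
```

Every field sits at a fixed offset, so no scanning for delimiters is needed, which is the convenience the paragraph above refers to.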

5.2 Multiplexing

HTTP/2 reuses TCP connections: within one connection, both the client and the server can send multiple requests or responses at the same time, and they do not have to correspond one-to-one in order, which avoids head-of-line blocking.

For example, on one TCP connection, the server receives request A and request B. It starts responding to A, finds that processing is very time-consuming, sends the part of A's response that is already done, responds to B, and, once B is complete, sends the remainder of A's response.

This two-way, real-time communication is called multiplexing.

5.3 Data Streams

Because HTTP/2 packets are not sent in sequence, consecutive packets on the same connection may belong to different responses. Therefore, each packet must be marked to indicate which response it belongs to.

HTTP/2 calls all the packets of a given request or response a stream. Each stream has a unique number. When a packet is sent, it must be marked with the stream ID to show which stream it belongs to. In addition, streams initiated by the client have odd IDs, and streams initiated by the server have even IDs.
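
The idea of tagging every packet with a stream ID can be modeled with a toy example in Python. This is not the real HTTP/2 wire format: frames here are plain (stream_id, chunk) tuples, and the round-robin scheduling and the function names are assumptions made for illustration.

```python
from itertools import zip_longest


def interleave(streams: dict, frame_size: int):
    """Cut each stream's data into frames and emit them round-robin."""
    per_stream = [
        [(sid, data[i:i + frame_size]) for i in range(0, len(data), frame_size)]
        for sid, data in streams.items()
    ]
    for batch in zip_longest(*per_stream):
        for frame in batch:
            if frame is not None:  # shorter streams run out of frames first
                yield frame


def reassemble(frames) -> dict:
    """Group frames by stream ID to rebuild each stream's data."""
    out = {}
    for sid, chunk in frames:
        out[sid] = out.get(sid, b"") + chunk
    return out


# Client-initiated streams use odd IDs.
streams = {1: b"AAAAAA", 3: b"BB"}
frames = list(interleave(streams, frame_size=2))
print(frames)
print(reassemble(frames) == streams)
```

Stream 3 finishes after its single frame even though stream 1 is still in progress, which is the behavior that removes head-of-line blocking at the HTTP level.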

While a stream is partway through being sent, either the client or the server can send a signal (an RST_STREAM frame) to cancel it. In version 1.1, the only way to cancel a data stream is to close the TCP connection. In other words, HTTP/2 can cancel a request while keeping the TCP connection open for use by other requests.

The client can also specify the priority of a stream. The higher the priority, the sooner the server responds.

5.4 Header Compression

The HTTP protocol is stateless, so every request must carry all of its information. As a result, many request fields are repeated, such as Cookie and User-Agent: exactly the same content must accompany every request, which wastes a lot of bandwidth and also hurts speed.

HTTP/2 optimizes this by introducing header compression. On the one hand, header information is compressed before it is sent (HTTP/2's HPACK format uses Huffman coding for this, whereas its predecessor SPDY used zlib compression). On the other hand, the client and the server both maintain a header table: every field is stored in the table and given an index number, so a field that has already been sent does not need to be sent again, only its index number, which improves speed.
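
The header table idea can be sketched with a toy Python class. This is a simplified model, not the actual HPACK algorithm (which also has a predefined static table, size limits with eviction, and Huffman-coded strings); the class name and the use of list positions as index numbers are assumptions for illustration.

```python
class HeaderTable:
    """Toy header table; sender and receiver each keep one and grow it in step."""

    def __init__(self):
        self.entries = []  # position in the list serves as the index number

    def encode(self, headers):
        """Replace fields already in the table with their index number."""
        out = []
        for field in headers:
            if field in self.entries:
                out.append(self.entries.index(field))  # known field: send the index only
            else:
                self.entries.append(field)
                out.append(field)  # new field: send it in full once
        return out

    def decode(self, items):
        """Turn index numbers back into fields, learning new fields on the way."""
        headers = []
        for item in items:
            if isinstance(item, int):
                headers.append(self.entries[item])
            else:
                self.entries.append(item)
                headers.append(item)
        return headers


sender, receiver = HeaderTable(), HeaderTable()
request = [("user-agent", "Mozilla/5.0"), ("cookie", "session=abc"), ("accept", "*/*")]

first = sender.encode(request)   # all fields sent in full
second = sender.encode(request)  # the same fields again: only index numbers
print(first)
print(second)
print(receiver.decode(first) == request, receiver.decode(second) == request)
```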

5.5 Server Push

HTTP/2 allows the server to proactively send resources to the client without being asked, which is called server push.

A common scenario is a client requesting a web page that contains many static resources. Normally, the client has to receive the page, parse the HTML source, discover the static resources, and then request them. In fact, the server can anticipate that a client requesting the page will probably request its static resources too, so it proactively sends those resources to the client along with the page.


Source: Nanyi's Blog

Http://www.oschina.net/news/76365/http-introduce
