The HTTP protocol: past and present

Source: Internet
Author: User

Personal summary after reading through the material

HTTP is an application-layer protocol built on top of TCP/IP. It is not concerned with how packets are transmitted; it mainly specifies the format of communication between the client and the server, and it uses port 80 by default. In other words, HTTP packages the data and TCP transports it. Note that "TCP/IP" refers not just to the TCP protocol but to the entire TCP/IP protocol family.

The earliest version of HTTP was 0.9, released in 1991. That version had only one command, GET.

    GET /index.html

The command above means that once a TCP connection has been established, the client requests the web page index.html from the server.

The protocol stipulates that the server can only respond with a string in HTML format and cannot respond in any other format. When the server has finished sending, it closes the TCP connection.
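The HTTP/0.9 exchange described above is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function names build_http09_request and read_until_closed are made up for this example, and the line ending used is an assumption rather than something the text specifies.

```python
import socket


def build_http09_request(path: str) -> bytes:
    """Build an HTTP/0.9-style request: the GET command, a path, and a line break.

    Assumption: the request line ends with CRLF.
    """
    return f"GET {path}\r\n".encode("ascii")


def read_until_closed(sock: socket.socket) -> bytes:
    """Collect the whole response.

    HTTP/0.9 has no length field, so the only end-of-response signal
    is the server closing the TCP connection.
    """
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # an empty read means the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)
```

A client would send the bytes from build_http09_request over a connected socket and then call read_until_closed on the same socket to obtain the HTML string.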
The second version was HTTP/1.0, released in 1996, and it added a great deal. First, content in any format could be sent, no longer just HTML: text, images, video, audio, and so on. Second, in addition to GET, the POST and HEAD commands were introduced, enriching the interaction between the browser and the server. Other new features included status codes, multi-character-set support, multi-part types, authorization, caching, and content encoding.

1.0 example

    GET / HTTP/1.0
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
    Accept: */*

The first line is the request command, and the protocol version (HTTP/1.0) must be added at the end. The lines that follow are header information describing the client.

Response format

    HTTP/1.0 200 OK
    Content-Type: text/plain
    Content-Length: 137582
    Expires: Thu, 05 Dec 1997 16:55:55 GMT
    Last-Modified: Wed, 5 August 1996 15:55:55 GMT
    Server: Apache 0.84

Content-Type field

As for character encoding, 1.0 specifies that the header information must be ASCII, while the data that follows can be in any format. So when the server responds, it must tell the client what format the data is in; that is the job of the Content-Type field. Common values of the Content-Type field include:
    • text/plain
    • text/html
    • text/css
    • image/jpeg
    • image/png
    • image/svg+xml
    • audio/mp4
    • video/mp4
    • application/javascript
    • application/pdf
    • application/zip
    • application/atom+xml
Each value consists of a top-level type and a subtype, separated by a slash. These data types are collectively called MIME types. In addition to the predefined types, custom types can be defined, and a MIME type can carry parameters at the end, after a semicolon.

    Content-Type: text/html; charset=utf-8

This says that what is being sent is a web page, encoded in UTF-8.

    application/vnd.debian.binary-package

This is a custom type, indicating that what is being sent is a binary package for the Debian system.

When making a request, the client can use the Accept field to declare which data formats it accepts.

    Accept: */*

Here the client declares that it accepts data in any format.

MIME types are used not only in the HTTP protocol but also in HTML.

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <!-- equivalent to -->
    <meta charset="utf-8" />

Content-Encoding field

Because the data sent can be in any format, it can also be compressed before it is sent. The Content-Encoding field describes the compression method used.

    Content-Encoding: gzip
    Content-Encoding: compress
    Content-Encoding: deflate

When making a request, the client uses the Accept-Encoding field to state which compression methods it can accept.

    Accept-Encoding: gzip, deflate
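To make the roles of the Content-Type and Content-Encoding fields concrete, here is a minimal Python sketch of how a client might interpret them. The names parse_content_type and decode_body are hypothetical helpers invented for this example, and the parsing is deliberately simplified: it assumes header names are spelled exactly as shown, falls back to ASCII when no charset is given, and does not handle quoted parameter values that contain semicolons.

```python
import gzip


def parse_content_type(value: str):
    """Split a Content-Type value into (mime_type, params)."""
    parts = [p.strip() for p in value.split(";")]
    mime_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return mime_type, params


def decode_body(headers: dict, body: bytes) -> str:
    """Undo gzip compression if declared, then decode using the charset parameter."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    _, params = parse_content_type(headers.get("Content-Type", "text/plain"))
    # Assumption for this sketch: default to ASCII when no charset is given.
    return body.decode(params.get("charset", "ascii"))
```

For the header value text/html; charset=utf-8, parse_content_type returns the MIME type text/html together with a parameter dictionary holding the charset, which decode_body then uses to turn the bytes into text.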

The main disadvantage of HTTP/1.0 is that only one request can be sent per TCP connection. Once the data has been sent the connection is closed, and if further resources are needed a new connection must be created.

Establishing a new TCP connection is expensive, because the client and the server must complete a three-way handshake and the sending rate is low at first (slow start). As a result, HTTP/1.0 performs relatively poorly, and the problem became more prominent as web pages loaded more and more external resources.

To solve this problem, some browsers added a non-standard Connection field to their requests.

    Connection: keep-alive

This field asks the server not to close the TCP connection so that other requests can reuse it. The server responds with the same field.

    Connection: keep-alive

At this point a reusable TCP connection is established, and it stays open until the client or the server actively closes it. However, this is not a standard field, and different implementations may behave inconsistently, so it is not a fundamental solution.

HTTP/1.1

In 1997, only half a year after version 1.0, version 1.1 was released. As of this writing in 2017 it is still the most popular version.

Persistent connections

The biggest change in version 1.1 is the introduction of persistent connections, which means that TCP connections are not closed by default and can be reused by multiple requests, with no need to declare Connection: keep-alive.

The client or the server can close the connection on its own initiative if it finds the other side has been inactive for some time. The well-behaved approach, however, is for the client to send Connection: close with its last request, explicitly asking the server to close the TCP connection.

Currently, most browsers allow six persistent connections to be established to the same domain name at once.

Pipelining

Version 1.1 also introduces pipelining: the client can send multiple requests on the same TCP connection without waiting for each response, which further improves the efficiency of HTTP.

For example, suppose the client needs to request two resources. Previously, on a single TCP connection, it would send request A, wait for the server's response, and only then send request B. Pipelining allows the browser to send request A and request B together, but the server still responds in order: first to A, and only when that is finished, to B.

Content-Length field

A single TCP connection can now carry multiple responses, so there has to be a way to tell which response a given piece of data belongs to. That is the job of the Content-Length field, which declares the length of the data in this response.

    Content-Length: 1024

The line above tells the browser that this response is 1024 bytes long and that the bytes after it belong to the next response.

Note: in version 1.0, Content-Length is not required, because when the browser sees that the server has closed the TCP connection, it knows all the data has been received.

Chunked transfer encoding

The precondition for using Content-Length is that the server must know the length of the response before it starts sending. For time-consuming dynamic operations, this means the server can send data only after the whole operation has completed, which is clearly inefficient. A better approach is to send each piece of data as soon as it is produced, using a stream in place of a buffer.

Version 1.1 therefore allows "chunked transfer encoding" to be used in place of Content-Length. If the headers of a request or response contain the Transfer-Encoding field, the body consists of an undetermined number of data chunks.

    Transfer-Encoding: chunked

Each non-empty chunk is preceded by a hexadecimal number giving the length of the chunk. The final chunk has size 0, which signals that all the data of this response has been sent. In the example below, each chunk's data ends with a line break that is counted in its length.

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked

    25
    This is the data in the first chunk

    1C
    and this is the second one

    0

Other features

Version 1.1 also added a number of request methods: PUT, PATCH, OPTIONS, and DELETE.

In addition, the client request headers gained the Host field, which specifies the server's domain name.

    Host: www.baidu.com

With the Host field, requests can be sent to different websites on the same server, which laid the groundwork for virtual hosting.

Disadvantages

Although version 1.1 allows TCP connections to be reused, all data communication within a single connection is still carried out in order.
The server only moves on to the next response after it has finished the current one. If an earlier response is particularly slow, many requests queue up behind it. This is called head-of-line blocking.

There are only two ways to avoid the problem: reduce the number of requests, or open several persistent connections at once. This gave rise to many web optimization techniques, such as merging scripts and stylesheets, embedding images in CSS code, and domain sharding. All this extra work could have been avoided if the HTTP protocol had been designed a little better.

SPDY protocol

In 2009, Google made public its independently developed SPDY protocol, aimed mainly at the inefficiency of HTTP/1.1. The protocol, as implemented in the Chrome browser, can be regarded as the basis of HTTP/2, which inherited its main features.

HTTP/2

HTTP/2 was released in 2015. It is not called HTTP/2.0 because no minor versions are planned; the next version will be HTTP/3.

Binary protocol

In HTTP/1.1 the header information must be text (ASCII encoded), while the body can be either text or binary. HTTP/2 is binary through and through: the header information and the body are both binary, and both are called "frames": header frames and data frames.

One benefit of a binary protocol is that additional frame types can be defined. HTTP/2 defines about ten frame types, laying a foundation for future advanced applications. Implementing the same thing in text would make parsing cumbersome, whereas binary parsing is much simpler, especially since many character-set and encoding-conversion steps are avoided.

Multiplexing

HTTP/2 reuses the TCP connection. Within one connection, both the client and the server can send multiple requests or responses at the same time, and they do not have to correspond one by one in order, which avoids head-of-line blocking.

For example, on one TCP connection the server receives request A and request B and starts responding to A. It turns out that A takes a long time to process, so the server sends the part of A's response that is already finished, then responds to B, and once that is done sends the remainder of A's response.

This kind of bidirectional, real-time communication is called multiplexing.

Data streams

Because HTTP/2 packets are not sent in sequence, consecutive packets on the same connection may belong to different responses. Each packet must therefore be marked to indicate which response it belongs to.

HTTP/2 calls all the packets of a single request or response a data stream. Each stream has a unique number, and every packet sent must carry the stream ID so that the receiver can tell which stream it belongs to. In addition, streams initiated by the client have odd IDs and streams initiated by the server have even IDs.

While a stream is only partly sent, either the client or the server can send a signal (an RST_STREAM frame) to cancel it.
In version 1.1, the only way to cancel a data stream is to close the TCP connection. HTTP/2, by contrast, can cancel a single request while keeping the TCP connection open for use by other requests.

The client can also specify the priority of a data stream. The higher the priority, the sooner the server responds.

Header compression

HTTP is stateless, so every request must carry all of its information. As a result many request fields are repeated, such as Cookie and User-Agent: exactly the same content has to accompany every request, which wastes a lot of bandwidth and also hurts speed.

HTTP/2 optimizes this by introducing header compression. On the one hand, header information is compressed before it is sent, using the HPACK format defined for HTTP/2. On the other hand, the client and the server both maintain a table of header fields: every field is stored in the table and assigned an index number, and from then on the same field is not sent again, only its index number, which improves speed.

Server push

HTTP/2 allows the server to send resources to the client proactively, without being asked. This is called server push.

A common scenario is a client requesting a web page that contains many static resources. Normally the client has to receive the page, parse the HTML source, discover the static resources, and then issue requests for them. In fact the server can anticipate that a client requesting the page will very likely request those static resources as well, so it proactively sends them to the client along with the page.
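The header table idea from the header compression section can be illustrated with a toy Python sketch. This is not the actual HTTP/2 wire format: the HeaderTable class and its encode and decode methods are hypothetical names made up for this example, and real implementations add predefined entries, size limits, and binary encoding. The sketch only shows the core trick of sending an index number in place of a header field that both sides have already seen.

```python
class HeaderTable:
    """Toy index table: each (name, value) pair seen once gets an index number."""

    def __init__(self):
        self._index_of = {}  # (name, value) -> index number
        self._entries = []   # index number -> (name, value)

    def _remember(self, pair):
        self._index_of[pair] = len(self._entries)
        self._entries.append(pair)

    def encode(self, headers):
        """Replace already-seen pairs with their index; send new pairs in full."""
        out = []
        for pair in headers:
            if pair in self._index_of:
                out.append(self._index_of[pair])
            else:
                self._remember(pair)
                out.append(pair)
        return out

    def decode(self, items):
        """Rebuild the full header list, learning new pairs along the way."""
        headers = []
        for item in items:
            if isinstance(item, int):
                headers.append(self._entries[item])
            else:
                self._remember(item)
                headers.append(item)
        return headers


# The sender and the receiver each keep their own table, updated in lockstep.
sender, receiver = HeaderTable(), HeaderTable()
first = [("user-agent", "demo/1.0"), ("cookie", "id=42")]
second = [("user-agent", "demo/1.0"), ("cookie", "id=42"), ("accept", "*/*")]
wire_first = sender.encode(first)    # everything sent in full the first time
wire_second = sender.encode(second)  # repeated fields shrink to index numbers
```

On the second request only the new accept field travels in full; the repeated user-agent and cookie fields are each replaced by a small integer, which the receiver maps back using its own copy of the table.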
