I previously wrote an article, [HTTP protocol details]. This time I continue with HTTP compression.
This article uses Fiddler to inspect HTTP requests and responses. If you are not familiar with the tool, see [Fiddler tutorial].
HTTP compression means that text content is compressed before it is transmitted between the web server and the browser. It uses common compression algorithms, such as gzip, to compress HTML, JavaScript, and CSS files. This greatly reduces the amount of data sent over the network and speeds up page display, at the cost of some extra processing on the server. This article describes HTTP compression from the perspective of the HTTP protocol.
Contents
- Differences between HTTP Content Encoding and HTTP Compression
- HTTP compression process
- Example: Use Fiddler to observe HTTP Compression
- Content Encoding type
- Benefits of Compression
- Disadvantages of gzip
- How gzip is compressed
- HTTP Response can be compressed, and HTTP Request can also be compressed
Differences between HTTP Content Encoding and HTTP Compression
HTTP compression is one kind of content encoding in the HTTP protocol.
In HTTP, the content (that is, the body) can be encoded, and choosing gzip as the encoding achieves compression. Other encodings can instead scramble or encrypt the content so that unauthorized third parties cannot read it.
In other words, content encoding is the general mechanism and compression is one use of it, so the two terms should not be treated as interchangeable.
HTTP compression process
1. The browser sends an HTTP request to the web server. The request contains the header Accept-Encoding: gzip, deflate, which tells the server that the browser supports gzip compression.
2. After receiving the request, the web server generates the original response, with the original Content-Type and Content-Length.
3. The web server encodes the response body with gzip. The encoded response keeps Content-Type, sets Content-Length to the size after compression, and adds Content-Encoding: gzip. The server then sends the response to the browser.
4. After receiving the response, the browser decodes the body according to Content-Encoding: gzip. Once it has recovered the original response, it displays the page.
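The four steps above can be sketched in Python with plain functions standing in for the server and the browser. This is only an illustration under simplifying assumptions: the function names build_response and read_response are made up for this sketch, headers are ordinary dictionaries, and quality values such as q=0 in Accept-Encoding are ignored.

```python
import gzip


def build_response(request_headers, html):
    """Server side (steps 2 and 3): build the response and gzip it if the client allows."""
    body = html.encode("utf-8")
    headers = {"Content-Type": "text/html; charset=utf-8"}

    # Step 1 happened on the client: it listed the encodings it accepts.
    # Simplification: any ";q=..." parameter is dropped rather than honored.
    accepted = [
        token.split(";")[0].strip().lower()
        for token in request_headers.get("Accept-Encoding", "").split(",")
    ]

    if "gzip" in accepted:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"

    # Content-Length always describes the bytes actually sent.
    headers["Content-Length"] = str(len(body))
    return headers, body


def read_response(headers, body):
    """Browser side (step 4): undo the content encoding, then decode the text."""
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")


if __name__ == "__main__":
    page = "<p>Hello, HTTP compression!</p>" * 100
    headers, body = build_response({"Accept-Encoding": "gzip, deflate"}, page)
    print(headers)
    print(read_response(headers, body) == page)
```

A client that sends no Accept-Encoding header gets the body unchanged and no Content-Encoding header, which matches the uncompressed case.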
For example:
Example: Use Fiddler to observe HTTP Compression
Let's look at a real example. I found that the Blog Park site uses gzip compression.
You can see it clearly using Fiddler.
Decoding each response by hand in Fiddler is tedious. Click the "Decode" button on the toolbar and Fiddler will decode responses automatically.
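If Fiddler is not at hand, the same headers can be observed with a short Python script. The sketch below is not the site from the example; it assumes a throwaway local server (the GzipHandler class and the observe function are names invented here) and relies on the fact that urllib does not decompress responses on its own, so the raw compressed bytes stay visible.

```python
import gzip
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content, repeated so that compression has something to work with.
PAGE = ("<html><body>" + "<p>Hello, HTTP compression!</p>" * 200 + "</body></html>").encode("utf-8")


class GzipHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGE
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        # Compress only when the client advertised gzip support.
        if "gzip" in self.headers.get("Accept-Encoding", ""):
            body = gzip.compress(body)
            self.send_header("Content-Encoding", "gzip")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def observe():
    """Start the local server, fetch the page once, return (headers, raw body bytes)."""
    server = HTTPServer(("127.0.0.1", 0), GzipHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip, deflate"})
        # An empty ProxyHandler bypasses any system proxy for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            raw = response.read()  # still gzip-encoded
            headers = dict(response.headers.items())
        return headers, raw
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    headers, raw = observe()
    print("Content-Encoding:", headers.get("Content-Encoding"))
    print("bytes on the wire:", len(raw))
    print("bytes after decoding:", len(gzip.decompress(raw)))
```

Comparing the two byte counts printed at the end gives the same information that Fiddler shows before and after clicking "Decode".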
Content Encoding type
HTTP defines several standard content encodings and allows more to be added as extensions.
The Content-Encoding header uses these standardized tokens to name the algorithm used for encoding.
Content-Encoding values:
- gzip: the entity is encoded with GNU zip.
- compress: the entity is encoded with the Unix file compression program "compress".
- deflate: the entity is compressed in the zlib format.
- identity: the entity is not encoded. This is the default when no Content-Encoding header is present.
gzip, compress, and deflate are lossless compression algorithms: they reduce the size of the transmitted message without losing any information. Of these, gzip is generally the most efficient and the most widely used.
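A receiver can pick the right decoder from the Content-Encoding value. The following is a minimal Python sketch using only the standard library; decode_body and the DECODERS table are names made up for this illustration. The "compress" encoding is left out because the Python standard library has no decoder for it, and a response that lists several encodings at once is not handled.

```python
import gzip
import zlib

# Map each Content-Encoding token to a function that undoes it.
DECODERS = {
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,   # zlib format, as described in the list above
    "identity": lambda data: data,
}


def decode_body(content_encoding, data):
    """Return the original entity bytes for a single content encoding."""
    # A missing header means the entity was not encoded.
    name = (content_encoding or "identity").strip().lower()
    if name not in DECODERS:
        raise ValueError("unsupported Content-Encoding: %s" % name)
    return DECODERS[name](data)


if __name__ == "__main__":
    original = b"<h1>HTTP compression</h1>" * 50
    print(decode_body("gzip", gzip.compress(original)) == original)
    print(decode_body("deflate", zlib.compress(original)) == original)
    print(decode_body(None, original) == original)
```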
Benefits of Compression
HTTP compression can typically shrink plain text to about 40% of its original size, saving about 60% of the data transferred.
Example: before compression, the blog homepage is 46124 bytes. After compression it is roughly 16,000 bytes, only about 35% of the original size. This saves about 65% of the data transferred and greatly improves performance.
See the figure below.
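Separately from the figure, the arithmetic behind these percentages can be checked with a few lines of Python. This is a sketch only: the savings helper and the sample HTML are invented for the illustration, and the ratio measured on the sample will differ from the homepage figures quoted above.

```python
import gzip


def savings(original_size, compressed_size):
    """Return (fraction of the original that remains, fraction saved)."""
    remaining = compressed_size / original_size
    return remaining, 1 - remaining


# The homepage example: 35% of 46124 bytes is roughly this many bytes.
print(round(46124 * 0.35))

# Measure a made-up, repetitive HTML fragment the same way.
sample = ('<div class="post"><a href="/p/1">title</a></div>\n' * 500).encode("utf-8")
compressed = gzip.compress(sample)
remaining, saved = savings(len(sample), len(compressed))
print("original:   %d bytes" % len(sample))
print("compressed: %d bytes" % len(compressed))
print("remaining:  %.1f%%, saved: %.1f%%" % (remaining * 100, saved * 100))
```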
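Disadvantages of gzip
gzip works poorly on JPEG files. JPEG data is already compressed, so running gzip over it costs processing time without making the file noticeably smaller.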
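The effect is easy to reproduce in Python. The sketch below does not use a real JPEG: as an assumption, random bytes stand in for already-compressed image data, since both offer gzip almost no repetition to exploit. The ratio helper is a name made up for this example.

```python
import gzip
import os


def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


# Repetitive markup, typical of HTML.
text = b'<li class="item">HTTP compression</li>\n' * 2000

# Stand-in for already-compressed data such as a JPEG (assumption: not a real image).
noise = os.urandom(len(text))

print("markup ratio: %.3f" % ratio(text))
print("random ratio: %.3f" % ratio(noise))
```

The markup shrinks to a small fraction of its size, while the random data comes out about as large as it went in, plus the gzip header and trailer.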
How gzip is compressed
In short, gzip finds repeated strings in a file and replaces the later occurrences with short references to the earlier ones, which makes the whole file smaller. This form of compression suits the Web well, because HTML and CSS files usually contain many repeated strings, such as whitespace and tags.
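The idea of replacing repeated strings with references can be shown with a toy encoder in Python. This is not gzip's actual implementation: real gzip uses the DEFLATE format, which combines this kind of back-reference search (LZ77) with Huffman coding. The functions lz_compress and lz_decompress are invented for this sketch and favor clarity over speed.

```python
def lz_compress(data, window=255, min_match=3):
    """Turn bytes into tokens: an int is a literal byte,
    a (distance, length) tuple means "copy length bytes from distance back"."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the recent window for the longest match with the upcoming bytes.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz_decompress(tokens):
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            distance, length = token
            # Copy one byte at a time so a match may overlap its own output.
            for _ in range(length):
                out.append(out[-distance])
        else:
            out.append(token)
    return bytes(out)


if __name__ == "__main__":
    html = b"<li>item</li>" * 20
    tokens = lz_compress(html)
    print("input bytes:", len(html))
    print("tokens:", len(tokens))
    print("round trip ok:", lz_decompress(tokens) == html)
```

Because the sample repeats the same tag pattern, most of the input collapses into a handful of back-references, which is why markup compresses so well.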
HTTP Response can be compressed, and HTTP Request can also be compressed
Browsers do not compress requests. However, some HTTP client programs encode the request body before sending it, and mark it with a Content-Encoding header in the same way a server marks a compressed response.
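A client program might prepare such a request roughly as follows. This Python sketch only builds the request and does not send it; the URL, the JSON payload, and the function name build_gzip_request are hypothetical, and whether a given server accepts gzip-encoded request bodies depends on how that server is configured.

```python
import gzip
import json
import urllib.request


def build_gzip_request(url, payload):
    """Build a POST request whose JSON body is gzip-encoded."""
    raw = json.dumps(payload).encode("utf-8")
    body = gzip.compress(raw)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Tells the server how to recover the original body.
            "Content-Encoding": "gzip",
        },
    )


if __name__ == "__main__":
    # Hypothetical endpoint and payload, used only for illustration.
    request = build_gzip_request("http://example.com/upload", {"log": "line\n" * 1000})
    print(request.get_method())
    print(request.header_items())
    print("compressed body bytes:", len(request.data))
```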