14 Rules for Building High-Performance Websites: Gzip Components

Source: Internet
Author: User

The skill of front-end engineers directly affects how fast pages load (JOHN: a page laid out with ugly nested tables costs far more than a simple div-based page, both in kilobytes and in load speed). However, factors outside our control, such as the user's bandwidth, also affect how quickly users can access your web application. Rules 1 and 3 show how to reduce page load time by eliminating unnecessary HTTP requests. Rule 2 shows how to use a CDN to bring content closer to users. But we cannot eliminate every HTTP request, and that is where rule 4 comes in.

Rule 4 speeds up responses by reducing the size of the HTTP response. A smaller response is transmitted faster because fewer packets have to travel from the server to the browser. This is especially effective for users on slow connections. This section shows how to compress HTTP responses with gzip. It is the easiest way to reduce the amount of data, but it comes at a cost: gzip has some drawbacks, which are covered later.

How compression works

File compression has long been used for email and FTP sites. On the web, starting with HTTP/1.1, clients indicate that they support compression through the Accept-Encoding header in the HTTP request.
If the web server sees this header in the request, it may compress the response with one of the methods the client listed, and it names the method it used in the Content-Encoding header of the response.
Gzip is currently the most popular and effective compression method. It is a free format, unencumbered by patents, developed by the GNU project and standardized in RFC 1952. The other compression method is deflate, which is less common and slightly less effective than gzip. In fact, I have seen only one website that uses deflate: msn.com. Browsers that support deflate also support gzip, so gzip is a good choice.
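
The header exchange described above can be sketched in a few lines of Python. This is only an illustration of the negotiation, not how Apache implements it: the function name negotiate_and_encode is hypothetical, and the parsing deliberately ignores quality values such as "gzip;q=0".

```python
import gzip


def negotiate_and_encode(accept_encoding, body):
    """Return (response_headers, payload) for a body, honoring Accept-Encoding.

    Simplified sketch: ";q=..." quality parameters are dropped, not evaluated.
    """
    # Collect the coding names the client listed, e.g. "gzip, deflate".
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
        if token.strip()
    }
    if "gzip" in accepted:
        payload = gzip.compress(body)
        headers = {"Content-Encoding": "gzip", "Content-Length": str(len(payload))}
        return headers, payload
    # The client did not announce gzip support: send the body unchanged.
    return {"Content-Length": str(len(body))}, body


if __name__ == "__main__":
    page = b"<p>hello</p>" * 200
    headers, payload = negotiate_and_encode("gzip, deflate", page)
    print(headers, "original size:", len(page))
```

A client that sends no Accept-Encoding header gets the uncompressed body and no Content-Encoding header.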

What to compress

What the server should compress depends on the file type. Many websites compress only their HTML files. In fact, scripts and style sheets are also worth compressing (indeed, any text data can be compressed, including XML and JSON). Images and PDF files should not be gzipped because they are already compressed formats. Compressing them again only wastes CPU resources and saves very little.

Gzip compression costs CPU time on the server, and the client also has to decompress the gzipped data. As for when it is worth it, I think any file larger than 1 KB or 2 KB is worth compressing. The mod_gzip_minimum_file_size parameter sets the minimum size of files to compress. The default value is 500 bytes.
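
As a rough illustration of the rule of thumb above, the decision can be written as a small predicate. The names should_compress, COMPRESSIBLE_TYPES, and MIN_SIZE_BYTES are hypothetical, and the 1 KB threshold is just one of the values suggested in the text.

```python
# MIME type prefixes treated as compressible text (an assumed, non-exhaustive list).
COMPRESSIBLE_TYPES = (
    "text/",
    "application/x-javascript",
    "application/json",
    "application/xml",
)
MIN_SIZE_BYTES = 1024  # assumed threshold; the text suggests 1 KB to 2 KB


def should_compress(content_type, size_bytes):
    """Return True if a response of this type and size is worth gzipping."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";")[0].strip().lower()
    if size_bytes < MIN_SIZE_BYTES:
        return False
    return mime.startswith(COMPRESSIBLE_TYPES)


if __name__ == "__main__":
    for ctype, size in [("text/html; charset=utf-8", 5000),
                        ("image/png", 50000),
                        ("text/css", 200)]:
        print(ctype, size, should_compress(ctype, size))
```

Images and PDF files fall through because their MIME types are not in the list, which mirrors the advice not to recompress formats that are already compressed.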

I surveyed gzip usage on the top 10 websites in the United States. Nine of them gzip their HTML files, seven gzip at least some of their scripts and style sheets, and only five gzip all of their scripts and style sheets. A website that compresses all of its HTML files, style sheets, and scripts can reduce its data volume by as much as 70%. This is described in the following section.

The Savings

After gzip compression, the returned data volume is generally reduced by about 70%. The original comparison shows the size of a script and of a style sheet before and after compression, and also the sizes obtained with deflate.
It is clear why gzip is usually chosen: it reduces the data volume by about 66% overall, while deflate reduces it by about 60%.
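
A minimal sketch for measuring savings on your own files with Python's standard library follows. The sample style sheet and the helper name savings_percent are made up for illustration. Note one caveat: in Python both gzip and zlib use the same DEFLATE algorithm, so the two results here differ only by a few bytes of framing. The 66% and 60% figures in the text come from measurements of real servers, not from this script.

```python
import gzip
import zlib


def savings_percent(original, compressed):
    """Percentage by which the compressed form is smaller than the original."""
    return 100.0 * (1 - len(compressed) / len(original))


# Made-up sample: a small, repetitive style sheet.
sample = (
    "body { margin: 0; padding: 0; font-family: Arial, sans-serif; }\n"
    ".header { margin: 0 auto; padding: 10px; color: #333333; }\n" * 50
).encode("ascii")

gzip_bytes = gzip.compress(sample, compresslevel=6)  # gzip framing (RFC 1952)
deflate_bytes = zlib.compress(sample, 6)             # zlib framing (RFC 1950)

if __name__ == "__main__":
    print(f"original: {len(sample)} bytes")
    print(f"gzip:     {len(gzip_bytes)} bytes, {savings_percent(sample, gzip_bytes):.1f}% saved")
    print(f"deflate:  {len(deflate_bytes)} bytes, {savings_percent(sample, deflate_bytes):.1f}% saved")
```

Highly repetitive input like this sample compresses far better than typical pages, so run the measurement on real responses before drawing conclusions.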

Configuration

The configuration of gzip depends on your Apache version: Apache 1.3 uses mod_gzip, while Apache 2.x uses mod_deflate. This section explains how to configure each module.

Apache 1.3: mod_gzip

In Apache 1.3, gzip compression is provided by the mod_gzip module. mod_gzip has many configuration parameters, which are described on the mod_gzip website. Here are the most common ones.

mod_gzip_on
Enables mod_gzip.
mod_gzip_item_include
mod_gzip_item_exclude
Define which requests to gzip or not to gzip, based on file type, MIME type, user agent, and so on.

Most web servers that have mod_gzip turned on compress only text/html by default. The most important change to make is to enable gzip for scripts and style sheets as well. You can use the following configuration for Apache 1.3:

mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$

The gzip command-line program has an option that controls the compression level, trading CPU usage against size reduction, but mod_gzip has no setting for it. If compressing responses on the fly causes too much CPU load, consider caching the compressed responses on disk or in memory. Doing this by hand is tedious. Fortunately, mod_gzip can automatically save the compressed data to disk and refresh it when the source changes. To enable this, configure the mod_gzip_can_negotiate and mod_gzip_update_static parameters.
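
The idea of caching compressed responses and refreshing them when the source file changes can be sketched as follows. This is an in-memory illustration only and says nothing about how mod_gzip stores its files. The function name gzipped_file and the cache layout are hypothetical.

```python
import gzip
import os

# path -> (modification time of the source, gzipped bytes)
_compressed_cache = {}


def gzipped_file(path):
    """Return the gzipped contents of a file, recompressing only when it changes."""
    mtime = os.path.getmtime(path)
    cached = _compressed_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # source unchanged: reuse the earlier result
    with open(path, "rb") as source:
        compressed = gzip.compress(source.read())
    _compressed_cache[path] = (mtime, compressed)
    return compressed
```

The CPU cost of compression is then paid once per file version instead of once per request.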

Apache 2.x: mod_deflate

Apache 2.x compresses responses with the mod_deflate module. Despite its name, the module actually uses gzip compression (JOHN: Weird!). To compress scripts and style sheets, you only need the following line:

AddOutputFilterByType DEFLATE text/html text/css application/x-javascript

Unlike mod_gzip, mod_deflate lets you set the compression level, through the DeflateCompressionLevel directive. For more information, refer to the Apache 2.0 mod_deflate documentation.
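
To get a feel for what the compression level trades off, the sketch below compresses the same data at several levels with Python's gzip module. The helper name compare_levels and the sample script are made up. Higher levels spend more CPU time in exchange for output that is usually somewhat smaller.

```python
import gzip


def compare_levels(data, levels=(1, 6, 9)):
    """Return {level: gzipped size in bytes} for each compression level."""
    return {level: len(gzip.compress(data, compresslevel=level)) for level in levels}


# Made-up sample: a repetitive script file.
sample = b"function add(a, b) { return a + b; }\n" * 500

if __name__ == "__main__":
    for level, size in compare_levels(sample).items():
        print(f"level {level}: {size} bytes (from {len(sample)})")
```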

Proxy Caching

The simple configuration described above works well when browsers talk to the web server directly. The web server uses the Accept-Encoding header in the request to determine whether the client supports compressed data, and returns compressed or uncompressed data accordingly. The whole exchange is driven by HTTP headers.

However, when users reach the Internet through a proxy, their requests pass through the proxy server, and this causes trouble. Suppose a browser that does not support gzip sends a request through the proxy, and the proxy answers with a compressed copy that it cached from an earlier request. The user then sees a page of garbled characters. The reverse also happens: a browser that supports gzip may receive a cached uncompressed copy.
To solve this problem, have the server add a Vary header to its responses. By default, mod_gzip adds Vary: Accept-Encoding to all responses, so that the proxy server caches the compressed and the uncompressed version separately and serves each one only to requests with a matching Accept-Encoding header.
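
The effect of Vary on a proxy cache can be illustrated with a toy cache that includes the listed request headers in its cache key. This is a simplified sketch, not a description of any real proxy. The class name VaryAwareCache is hypothetical, header names are assumed to be lowercase dictionary keys, and header values are compared as exact strings.

```python
class VaryAwareCache:
    """Toy proxy cache that keys entries on the request headers named by Vary."""

    def __init__(self):
        self._vary = {}     # url -> header names listed in the Vary response header
        self._entries = {}  # (url, values of those request headers) -> body

    def _key(self, url, request_headers):
        names = self._vary.get(url, [])
        return (url, tuple(request_headers.get(name, "") for name in names))

    def store(self, url, request_headers, response_headers, body):
        vary = response_headers.get("vary", "")
        self._vary[url] = [n.strip().lower() for n in vary.split(",") if n.strip()]
        self._entries[self._key(url, request_headers)] = body

    def lookup(self, url, request_headers):
        """Return the cached body for this request, or None on a cache miss."""
        return self._entries.get(self._key(url, request_headers))


if __name__ == "__main__":
    cache = VaryAwareCache()
    cache.store("/site.css", {"accept-encoding": "gzip"},
                {"vary": "Accept-Encoding"}, b"<gzipped bytes>")
    # A client without gzip support misses instead of receiving gzipped bytes.
    print(cache.lookup("/site.css", {}))
```

Without the Vary header the key would be the URL alone, and the second client would receive the gzipped copy.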

Edge Cases

Currently, about 90% of browsers support gzip. However, some special cases still need attention, such as bugs in earlier versions of Internet Explorer, especially IE 5.5 and IE 6.0 SP1. A safer approach is to use a browser whitelist, so that the server sends gzip data only to browsers known to handle it. The following settings send gzip data only to Internet Explorer 6 through 9 and Mozilla 5 through 9.
 
Apache 1.3 matches on the User-Agent request header:

mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"

Apache 2.x uses BrowserMatch:

BrowserMatch ^MSIE [6-9] gzip
BrowserMatch ^Mozilla/[5-9] gzip
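
The two User-Agent patterns from the mod_gzip whitelist can be tried out in Python before they go into a server configuration. This is only a sketch: the function name may_send_gzip is hypothetical, the patterns are applied as unanchored searches, and the User-Agent strings are sample values.

```python
import re

# The two whitelist patterns from the mod_gzip configuration.
WHITELIST = [re.compile(r"MSIE [6-9]"), re.compile(r"Mozilla/[5-9]")]


def may_send_gzip(user_agent):
    """Return True if the User-Agent string matches a whitelisted pattern."""
    return any(pattern.search(user_agent) for pattern in WHITELIST)


if __name__ == "__main__":
    for agent in [
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
        "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20070219 Firefox/2.0.0.2",
    ]:
        print(may_send_gzip(agent), agent)
```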

For proxy caching, you can add User-Agent to the Vary header to tell the proxy server that a whitelist is in use. mod_gzip does this automatically when a whitelist is configured. But do not expect a proxy to cache a separate copy of every URL for every whitelisted browser. The alternative is to add a Vary: * or Cache-Control: private header to the response, which forbids proxy caching entirely: the proxy may not cache any component, so every user behind the proxy fetches the page data from the origin server. In fact, both Google and Yahoo! use this policy, although it consumes more traffic.

Another special case concerns ETags (covered separately in chapter 13): by default they do not reflect whether the content is compressed. The simplest fix is to disable ETags, which chapter 13 discusses in more detail.

Example

The following three example pages show three degrees of compression: nothing gzipped, only the HTML gzipped, and all components gzipped:
Nothing gzipped
http://stevesouders.com/hpws/nogzip.html
HTML gzipped
http://stevesouders.com/hpws/gzip-html.html
Everything gzipped
http://stevesouders.com/hpws/gzip-all.html
The original article follows these links with a comparison of the data volume and response time of the three pages.

This article is from the CSDN blog. Please credit the source when reproducing it: http://blog.csdn.net/FrankTaylor/archive/2008/12/30/3657367.aspx
