Principle of image compression

Last Update:2018-07-26 Source: Internet

Author: User

Original URL: http://blog.csdn.net/newchenxf/article/details/51693753
Please credit the source when reprinting.

1 Why images can be compressed

Take an uncompressed image of 1920x1080 pixels. If each pixel is 32 bits (RGBA), the image needs
1920 x 1080 x 4 = 8,294,400 bytes, roughly 8 MB. That is unacceptable: a 1 GB hard disk would hold only about 120 such pictures. Video has the same problem. One hour of uncompressed 1920x1080 video at 30 fps needs roughly:
8 MB x 30 x 60 x 60 = 864,000 MB, which is more than 800 GB. That is clearly impractical.
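The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the variable names are made up for illustration, and 1 MB and 1 GB are taken as 10^6 and 10^9 bytes.

```python
# Raw (uncompressed) storage for a 1920x1080 RGBA image
# and for one hour of 30 fps video at the same resolution.
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 4        # 32-bit RGBA
FPS = 30
SECONDS = 60 * 60          # one hour

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
video_bytes = frame_bytes * FPS * SECONDS

print(frame_bytes)             # bytes per frame
print(frame_bytes / 1e6)       # megabytes per frame
print(video_bytes / 1e9)       # gigabytes for the whole hour
print(10**9 // frame_bytes)    # whole frames that fit in 1 GB
```

Using the exact frame size instead of the rounded 8 MB gives a slightly larger total for the video, but the conclusion is the same.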

So, we need image compression.

Why can images be compressed? Because they contain a lot of redundant information.
The following types of redundancy are common in image, video, and audio data:
1. Spatial redundancy
2. Temporal redundancy
3. Visual redundancy
Each is described below.

1.1 Spatial Redundancy

The colors of neighboring sample points in an image are often spatially coherent. In the example picture (two mice in front of a wall, on a gray floor), the mice, the wall, and the floor each form large blocks of nearly the same color. Such same-color blocks can be compressed.
For example, suppose the pixels in the first row are all essentially the same, and their luminance values Y are stored as
[105 105 105 ... 105]. With 100 pixels in total, that takes 1 byte x 100 = 100 bytes.
The simplest compression stores [105, 100], meaning "the next 100 pixels have a luminance of 105". The whole row then needs only 2 bytes, so the data has been compressed.

Spatial redundancy occurs mainly within a single picture, such as a photo.

1.2 Temporal Redundancy

This kind of redundancy applies mainly to video.
A moving image (video) is a sequence of consecutive pictures along a time axis. Adjacent frames usually contain the same background and the same moving objects, with the moving objects at slightly different positions, so a frame shares much of its data with the previous frame. This shared content is called temporal redundancy, because adjacent frames record the same scene at adjacent moments.
As the figure below shows, at 30 frames per second each frame lasts only about 33 ms. In so short a time the change between consecutive frames is very small: perhaps only the mouth moves while the background stays still.
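To make the idea concrete, here is a minimal sketch that stores only the pixels that changed between two frames. The frames are hypothetical one-dimensional lists of gray values and the function names are invented for this example; real video codecs use motion-compensated prediction, which is considerably more elaborate.

```python
def frame_delta(prev, curr):
    """Return (index, new_value) for every pixel that differs from the previous frame."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(prev)
    for i, value in delta:
        frame[i] = value
    return frame

# Two hypothetical "frames" of 16 grayscale pixels; only two pixels change.
frame1 = [50] * 16
frame2 = list(frame1)
frame2[5], frame2[6] = 200, 210

delta = frame_delta(frame1, frame2)
print(delta)                                  # [(5, 200), (6, 210)]
print(apply_delta(frame1, delta) == frame2)   # True
```

Only two (index, value) pairs are stored for the second frame instead of all 16 pixels.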
1.3 Visual Redundancy

Because of physiological limits, the human visual system does not attend to an image uniformly, and people barely notice subtle color differences.
For example, human vision can generally distinguish about 2^6 (64) gray levels, whereas images are typically quantized to 2^8 (256) gray levels; the difference is visual redundancy.
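As a rough illustration, assuming the two figures above are read as 2^6 = 64 and 2^8 = 256 gray levels, the sketch below drops the two least significant bits of each 8-bit gray value. The function names and sample pixel values are made up for this example.

```python
def quantize_to_6_bits(value):
    """Drop the two least significant bits of an 8-bit gray value (256 -> 64 levels)."""
    return value >> 2

def restore_to_8_bits(level):
    """Map a 6-bit level back to the 8-bit range; the dropped bits are lost."""
    return level << 2

pixels = [0, 1, 2, 3, 104, 105, 106, 107, 255]
restored = [restore_to_8_bits(quantize_to_6_bits(p)) for p in pixels]
max_error = max(abs(p - r) for p, r in zip(pixels, restored))

print(restored)    # [0, 0, 0, 0, 104, 104, 104, 104, 252]
print(max_error)   # 3
```

Each pixel now needs 6 bits instead of 8, and no value moves by more than 3 gray levels out of 256.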
Similarly, human hearing is insensitive to certain signals, so some of the changes introduced by compression go unnoticed.

2 Classification of data compression methods

2.1 Classification by whether compression introduces distortion

2.1.1 Lossless compression

Lossless (distortion-free) compression requires that the decompressed data be exactly the same as the original data. The decompressed data is a copy of the original, so the compression is reversible.
Lossless compression removes or reduces the redundancy in the data and puts that redundancy back when the data is restored, which is why the process can be reversed.
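Reversibility is easy to check in practice. The sketch below uses Python's standard zlib module purely as a convenient stand-in: zlib implements DEFLATE (LZ77 combined with Huffman coding), not any specific algorithm discussed in this article, and the sample data is invented.

```python
import zlib

# Highly redundant data: 100 identical luminance bytes followed by a short varied tail.
original = bytes([105] * 100) + bytes(range(20))

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))   # the compressed size is smaller for this input
print(restored == original)             # True: every byte is recovered
```

The round trip returns the input byte for byte, which is exactly what "lossless" means.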
With current techniques, lossless compression algorithms can typically reduce ordinary files to between 1/2 and 1/4 of their original size. Common lossless algorithms include Huffman coding and the LZW (Lempel-Ziv-Welch) algorithm.

2.1.2 Lossy compression

With lossy (distortion) compression, the decompressed data is not exactly the same as the original data, so the compression is irreversible. The distortion that remains after restoring does not affect how the information is understood.
For example, image, video, and audio data can be compressed lossily because they contain more information than our visual and auditory systems can take in. Some data can be discarded without causing the sound or the image to be misunderstood, and the compression ratio improves greatly. Compression ratios for image, video, and audio data can reach 100:1 while the perceived content is still not misunderstood.

2.2 Classification by compression principle

2.2.1 Predictive coding

The basic idea is to use the values of pixels that have already been coded to predict the value of a neighboring pixel, and then to encode only the prediction error.

2.2.2 Transform coding

The basic idea is to transform the light-intensity matrix of the image into a coefficient space, and then compress the image by encoding the coefficients.

2.2.3 Statistical coding

Compression coding based on the probability distribution of the symbols that occur in the information, for example Huffman coding.

3 Elements of image compression

Compression ratio
The ratio of the file size before compression to the file size after compression. Higher is better, but the achievable ratio is limited by speed, resource consumption, and so on.
Image quality
How the decompressed image compares with the original. It can be evaluated objectively or subjectively.
Compression and decompression speed
These depend on the compression method and the coding algorithm. Compression generally requires more computation than decompression, so compression is usually slower than decompression.

4 Image compression coding examples

4.1 Run-length encoding (RLE)

This is one of the easiest encodings to understand.
Many real images contain large blocks of the same color. Within such a block, many rows have the same color, or many consecutive pixels on a row have the same color value. In that case there is no need to store the color value of every pixel. It is enough to store the color value of one pixel together with the number of pixels that share it, or the color value together with the number of rows that share it.
This compression scheme is called run-length encoding (RLE), and the number of contiguous pixels that have the same color is called the run length.
For example, the string aaabcddddddddbbbbb
can be compressed with RLE into 3abc8d5b.
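A small sketch of the string form of RLE used in this example follows. The function names are made up, and the decoder assumes the original text contains no digit characters, since digits are used for the counts.

```python
from itertools import groupby

def rle_encode(text):
    """Write each run as <count><char>, omitting the count when the run length is 1."""
    parts = []
    for char, group in groupby(text):
        run = len(list(group))
        parts.append(char if run == 1 else f"{run}{char}")
    return "".join(parts)

def rle_decode(encoded):
    """Invert rle_encode; assumes the original text contains no digit characters."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count or "1"))
            count = ""
    return "".join(out)

# The example string: three a, one b, one c, eight d, five b.
text = "aaa" + "bc" + "d" * 8 + "b" * 5
encoded = rle_encode(text)
print(encoded)                       # 3abc8d5b
print(rle_decode(encoded) == text)   # True
```

The 18-character input becomes 8 characters, and decoding restores it exactly.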
RLE is simple and intuitive, and encoding and decoding are fast,
so many graphics and video file formats, such as BMP, TIFF, and AVI, use it for compression.
Since an image contains many blocks of the same color, the color value of one pixel and the number of pixels (the length) that share that color are stored as an integer pair. For example:
(G, L)   // G is the color value, L is the run length
The encoder scans from left to right and from top to bottom. Whenever it meets a string of identical values, it replaces the string with the value and its repetition count.

For example, take the 18x7 pixels below (grayscale values only, 1 byte each):
000000003333333333
222222222226666666
111111111111111111
111111555555555555
888888888888888888
555555555555553333
222222222222222222
Only 11 pairs are needed to represent them:
(0,8) (3,10) (2,11) (6,7)
(1,18) (1,6) (5,12) (8,18)
(5,14) (3,4) (2,18)
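The pairs listed above can be reproduced with the sketch below. It encodes each row separately, which matches the listing (the 1s at the end of row 3 and the start of row 4 are kept as two runs). The names are illustrative, and the byte counts assume one byte for each value and one byte for each length.

```python
from itertools import groupby

ROWS = [
    "000000003333333333",
    "222222222226666666",
    "111111111111111111",
    "111111555555555555",
    "888888888888888888",
    "555555555555553333",
    "222222222222222222",
]

def encode_row(row):
    """Return (gray value, run length) pairs for one row of pixels."""
    return [(int(value), len(list(run))) for value, run in groupby(row)]

pairs = [pair for row in ROWS for pair in encode_row(row)]
decoded = "".join(str(value) * length for value, length in pairs)

raw_bytes = sum(len(row) for row in ROWS)   # one byte per pixel
rle_bytes = 2 * len(pairs)                  # one byte for the value, one for the length

print(pairs)
print(len(pairs), raw_bytes, rle_bytes)     # 11 126 22
print(decoded == "".join(ROWS))             # True
```

Under these assumptions the 126-byte block shrinks to 22 bytes, a compression ratio of a little under 6:1.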

Features of run-length encoding:
It is intuitive and economical.
It is a lossless compression.
The compression ratio depends on the image itself: the larger the blocks of the same color and the fewer the blocks, the higher the compression ratio.
It suits computer-generated images, such as BMP and TIFF files, and is not well suited to color-rich natural images.

This does not mean that RLE has no place in natural image compression. On the contrary, natural image compression cannot do without RLE. It simply cannot be used on its own and has to be combined with other compression coding techniques.

4.2 Huffman coding

Different color values in an image occur with different probabilities. Values that occur frequently are assigned shorter codes, and values that occur rarely are assigned longer codes. This reduces the total code length without reducing the total amount of information.
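Here is a minimal sketch of that idea, built by repeatedly merging the two least frequent nodes into a tree. The symbols A to E and their counts are hypothetical values chosen for illustration, and huffman_codes is an invented helper, not a library function.

```python
import heapq
from itertools import count

def huffman_codes(frequencies):
    """Build a prefix code by repeatedly merging the two least frequent nodes."""
    tiebreak = count()   # keeps heap entries comparable when frequencies are equal
    heap = [(freq, next(tiebreak), symbol) for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):       # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                             # leaf: a symbol
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

# Hypothetical symbol counts.
freqs = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = huffman_codes(freqs)
lengths = {symbol: len(code) for symbol, code in codes.items()}
total_bits = sum(freqs[s] * lengths[s] for s in freqs)

print(lengths)      # {'A': 1, 'E': 3, 'C': 3, 'D': 3, 'B': 3}
print(total_bits)   # 87
```

A fixed-length code for five symbols needs 3 bits per symbol, or 117 bits for these 39 symbols, so the variable-length code saves 30 bits. The exact bit patterns depend on how ties are broken, but the total length does not.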

Encoding steps:
(1) Initialize: sort the symbols by probability, from largest to smallest.
(2) Combine the two symbols with the smallest probabilities into a node. In Figure 4-02, D and E form node P1.
(3) Repeat step 2 to get nodes P2, P3, and P4, forming a "tree" in which P4 is the root node.
(4) Starting from the root node P4 and moving toward the "leaf" for each symbol, label each branch from top to bottom with "0" (upper branch) or "1" (lower branch). Which branch gets "1" and which gets "0" does not matter: the resulting codes differ, but the average code length is the same.
(5) Starting from the root node P4, follow the branches to each leaf and write down the code for each symbol.

4.3 DCT encoding

4.3.1 Basic Concepts

An image described in the spatial domain is re-described in a transform domain by applying a transformation (commonly the cosine transform, Fourier transform, Walsh transform, and so on).
In the transform domain, the correlation in the image is reduced first. The bit rate of the encoded image can then be compressed further with additional processing (such as two-dimensional filtering in the frequency domain) and entropy coding.
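The effect can be seen on a single row of pixels. The sketch below is a plain-Python, one-dimensional orthonormal DCT-II and its inverse, written only for illustration (image codecs work on two-dimensional blocks and use fast algorithms). The sample row and the choice to keep three coefficients are assumptions made for this example.

```python
import math

def dct(signal):
    """Orthonormal 1-D DCT-II."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out

def idct(coeffs):
    """Inverse of dct (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

# A hypothetical row of 8 smoothly varying gray values.
row = [100, 102, 104, 107, 110, 112, 113, 114]
coeffs = dct(row)

# Keep only the 3 lowest-frequency coefficients and zero the rest.
kept = coeffs[:3] + [0.0] * (len(coeffs) - 3)
approx = idct(kept)
max_error = max(abs(a - b) for a, b in zip(approx, row))

print([round(c, 2) for c in coeffs])   # most of the energy sits in the first few coefficients
print([round(a, 1) for a in approx])   # close to the original row
print(round(max_error, 2))             # small compared with the 0-255 range
```

Because neighboring pixels are correlated, the transform concentrates the information in a few low-frequency coefficients, and those are the ones worth encoding.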
This kind of transform, specifically the DCT, is commonly used in JPEG image compression.

4.3.2 Block diagram of transform compression

G: input source image
G': decoded image
U: two-dimensional orthogonal transform
U': two-dimensional inverse orthogonal transform

The DCT itself is a longer topic, so it is covered in a separate blog post:
[JPEG compression principle and DCT discrete cosine transform]
http://blog.csdn.net/newchenxf/article/details/51719597

Besides these common compression algorithms there are many others, which are not covered here. An intuitive understanding is enough for now.

If you found this post helpful, please give it a like ^ ^

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
