The Basic Principles of Video Compression and Some Common Compression Algorithms (Multimedia)


For algorithm research, you first need to know where the work starts and where it is headed; the principles point the direction.

A. The feasibility of video compression

1. Space redundancy

Consider a static image, such as a human face against a background. Across the face, the hair, and so on, brightness and color change gently, so adjacent pixels have luminance and chroma values that are close to each other and therefore strongly correlated. If each pixel's luminance and chroma are stored as independent samples, the data contains a great deal of spatial redundancy. Removing this redundant data before coding lowers the average number of bits per pixel. This is what the usual intra-frame image coding does: it compresses by reducing spatial redundancy.
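The idea above can be sketched with simple differential (delta) coding: because neighboring pixels are close in value, storing only the differences yields mostly tiny numbers, which a variable-length coder can then represent in very few bits. The sample row below is hypothetical luminance data from a smooth region, chosen only for illustration.

```python
# A minimal sketch of exploiting spatial redundancy with delta coding.

def delta_encode(samples):
    """Store the first sample, then only the difference to the previous one."""
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    samples = [deltas[0]]
    for d in deltas[1:]:
        samples.append(samples[-1] + d)
    return samples

row = [200, 201, 201, 203, 202, 204, 205, 205]  # slowly varying luminance
deltas = delta_encode(row)
# deltas = [200, 1, 0, 2, -1, 2, 1, 0] -- mostly tiny values
assert delta_decode(deltas) == row
```

The transform itself saves nothing; the gain comes when the small residuals are handed to a variable-length coder, as described in the symbol-redundancy section below.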

2. Time Redundancy

A video is a sequence of frame images along the time axis, and neighboring frames are very strongly correlated. Inter-frame coding is commonly used to reduce this temporal redundancy: motion estimation and motion compensation predict each frame from its neighbors while still satisfying the quality requirements of the decoded, reconstructed image.
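The simplest form of inter-frame prediction is plain frame differencing (effectively motion compensation with a zero motion vector): only the residual between consecutive frames is stored, and on nearly static content that residual is almost all zeros. The tiny 4x4 "frames" below are made up purely for illustration.

```python
# A minimal sketch of temporal redundancy via frame differencing.

def frame_diff(prev, cur):
    """Residual between two frames of identical size."""
    return [[c - p for p, c in zip(pr, cr)] for pr, cr in zip(prev, cur)]

def reconstruct(prev, residual):
    """Decoder side: previous frame + residual = current frame."""
    return [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(prev, residual)]

frame1 = [[10, 10, 10, 10] for _ in range(4)]
frame2 = [[10, 10, 11, 10],
          [10, 10, 10, 10],
          [10, 12, 10, 10],
          [10, 10, 10, 10]]
residual = frame_diff(frame1, frame2)
# The residual is almost all zeros, which entropy coding compresses well.
assert reconstruct(frame1, residual) == frame2
```

Real codecs go further: motion estimation searches the previous frame for the best-matching block, so that even moving content produces a near-zero residual.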

3. Symbolic redundancy

Using codes of the same length to represent symbols that occur with different probabilities wastes bits. For example, take the three numbers 10, 11, and 13. If we represent each with 1 byte, we need 3 bytes (3 x 8 = 24 bits). But if we let the 2-bit code 00 represent 10, 01 represent 11, and 10 represent 13, the three numbers together take only 6 bits, saving 18 bits compared with before.

This is the principle of variable-length coding: symbols with high probability get shorter codewords, and symbols with low probability get longer ones.
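Huffman coding (discussed again in the coding section below) is the classic way to build such a variable-length code. The sketch below constructs a code table from symbol frequencies; the frequencies themselves are made up for illustration.

```python
# A sketch of variable-length coding using Huffman's algorithm: frequent
# symbols get short codewords, rare symbols get long ones.
import heapq

def huffman_codes(freqs):
    """Build a Huffman code table {symbol: bitstring} from {symbol: count}."""
    # Each heap entry: (total_count, tiebreak, {symbol: code_so_far})
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        # Prefix '0' onto one subtree's codes and '1' onto the other's.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

freqs = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freqs)
# The most frequent symbol gets the shortest code, the rarest the longest.
assert len(codes["a"]) < len(codes["f"])
```

Because the code is a prefix code, no codeword is a prefix of another, so the decoder can split the bitstream unambiguously without separators.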

4. Structural redundancy

Within an image, the parts are related to one another, and this relationship can be exploited to express the information with fewer codewords. Fractal image coding is an example.

5. Visual redundancy

1) The human eye resolves luminance more finely than chrominance; this is the rationale behind converting RGB to YUV.

2) The eye's spatial resolution for still images is higher than for moving images.

3) The eye is insensitive to small changes in brightness.

4) The eye is sensitive at the center of the visual field and insensitive at the periphery.
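Point 1 is exploited directly in practice: after converting RGB to Y'CbCr, the chroma planes can be stored at reduced resolution (4:2:0 subsampling) with little visible loss. The sketch below uses the BT.601 full-range conversion coefficients and averages each 2x2 chroma block.

```python
# A sketch of exploiting the eye's lower chroma resolution: RGB -> Y'CbCr
# (BT.601 full-range coefficients), then average chroma over 2x2 blocks.

def rgb_to_ycbcr(r, g, b):
    """BT.601 full-range RGB -> Y'CbCr conversion."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    out = []
    for i in range(0, len(plane), 2):
        row = []
        for j in range(0, len(plane[0]), 2):
            total = (plane[i][j] + plane[i][j + 1] +
                     plane[i + 1][j] + plane[i + 1][j + 1])
            row.append(total / 4)
        out.append(row)
    return out

y, cb, cr = rgb_to_ycbcr(128, 128, 128)   # a gray pixel: chroma near 128
# Four chroma samples collapse into one; with both Cb and Cr subsampled,
# a 4:2:0 image needs half the data of the original 4:4:4 representation.
assert subsample_420([[cb, cb], [cb, cb]]) == [[cb]]
```

Luma stays at full resolution throughout; only the chroma planes, which the eye resolves poorly, are decimated.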

In fact, even though we know all this, and we know the redundancy is there, actually finding it is a very complex process. That search is what our algorithms pursue unceasingly.

The section above is the cornerstone of all video compression standards. As for MPEG-2, MPEG-4, H.264, and H.265: rather than calling them "standards," it is more accurate to say that each provides a particular combination of algorithms, some simple and some complex. Naturally, the simple algorithms remove less redundancy and the complex ones remove more. The algorithms locate the redundant information and then compress it, reducing the data volume. This is our roadmap.

Next: how do we actually find the correlation in the data?

B. Explanation of terms for common algorithms

There are two large categories: transforms and coding.

Transforms first.

We need to find the correlation in the signal. When it is hard to find in the time domain, we transform the signal into another domain. That is the conclusion we reach in Signals and Systems, Digital Signal Processing, and advanced mathematics.

Transform

Fourier transform

Walsh-Hadamard transform

Sine transform

Cosine transform -- the most widely used

Slant transform

Haar transform

Karhunen-Loeve (K-L) transform

Wavelet transform

Many of these transforms are meaningful only mathematically; for engineering they lack a fast algorithm, or they decorrelate the data poorly, or they fall short for other reasons. Only the cosine transform is in truly widespread use, so to reduce the learning burden (unless, of course, you want to compare the transforms against one another), we need only master the cosine transform.
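The reason the cosine transform earns this status can be seen in a few lines: on smooth data it packs most of the signal energy into a handful of low-frequency coefficients, which can then be quantized coarsely or dropped. Below is a plain (non-fast) orthonormal 1-D DCT-II; the 8-sample row is a hypothetical smooth luminance row.

```python
# A sketch of the DCT's energy-compaction property.
import math

def dct_ii(x):
    """Orthonormal 1-D DCT-II of a sequence x (as applied to 8-sample rows)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

row = [52, 55, 61, 66, 70, 61, 64, 73]   # a smooth 8-sample luminance row
coeffs = dct_ii(row)
# Over 99% of the energy sits in the first two (DC + lowest-frequency)
# coefficients; the remaining six are nearly free to quantize away.
energy = sum(c * c for c in coeffs)
low = sum(c * c for c in coeffs[:2])
assert low / energy > 0.99
```

Because the transform is orthonormal, the total energy is preserved (Parseval's relation); compression comes from the fact that the energy is no longer spread evenly across the samples.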

Coding

There is lossless (distortion-free) coding and lossy (limited-distortion) coding; the difference is clear from the names, so not much explanation is needed.

Types of lossless coding:

Huffman coding, Arithmetic coding, run-length coding

Types of lossy coding:

Predictive coding, Transform coding, vector quantization, model-based coding.

For the coding part, you basically need to master all of the algorithms above.

JPEG and MPEG-2 first use run-length coding to reduce the bits occupied by runs of zeros, then compress the result with Huffman coding.
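Quantized DCT coefficients are exactly the kind of data run-length coding likes: a few nonzero values separated by long runs of zeros. The sketch below is loosely modeled on JPEG's AC-coefficient coding, emitting (zero-run, value) pairs plus an end-of-block marker; the coefficient list is made up for illustration.

```python
# A sketch of run-length coding over zero-heavy coefficient data.

def run_length_encode(coeffs):
    """Encode a coefficient list as (zeros_before, value) pairs + end marker."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append((0, 0))          # end-of-block marker, as in JPEG
    return pairs

coeffs = [57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 1, 0, 0, 0, 0]
pairs = run_length_encode(coeffs)
# 17 coefficients shrink to 7 pairs; the trailing zeros cost nothing.
assert pairs == [(0, 57), (0, 45), (4, 23), (1, -30), (0, -16), (2, 1), (0, 0)]
```

In the real codecs, these pairs are then fed to the Huffman stage, so frequent pairs (like short zero runs before small values) get the shortest codewords.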

H.264 uses arithmetic coding (CABAC) for the final compression step.

Motion compensation and motion estimation belong to predictive coding.

MPEG-4 includes model-based coding.

After the transform is completed, vector quantization can be carried out.
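The core of vector quantization can be sketched in a few lines: group samples into small vectors and replace each vector by the index of its nearest codebook entry. The codebook below is hand-picked purely for illustration; real codebooks are trained from data (e.g., with the LBG / k-means algorithm).

```python
# A minimal sketch of vector quantization with a fixed 4-entry codebook.

def nearest(codebook, vec):
    """Index of the codebook vector with the smallest squared distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i], vec))

codebook = [(0, 0), (10, 10), (20, 20), (30, 30)]   # 4 entries -> 2 bits/vector
vectors = [(1, 2), (11, 9), (29, 31), (19, 18)]
indices = [nearest(codebook, v) for v in vectors]
# Each 2-D vector is now a 2-bit index instead of two full samples.
assert indices == [0, 1, 3, 2]
```

Decoding is a table lookup: the decoder holds the same codebook and substitutes each index with its codeword, accepting the small distortion between the original vector and its nearest codebook entry.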
