Discrete Cosine Transform (DCT)-reprint

Source: Internet
Author: User
Discrete Cosine Transform (DCT)
The discrete cosine transform (DCT) is a transform coding method commonly used for bit-rate compression. The Fourier transform of a real, even (symmetric) function contains only cosine terms, so the cosine transform has the same physical meaning as the Fourier transform. A DCT-based coder first divides the whole image into blocks of N×N pixels and then applies the DCT to each N×N block in turn. Because the high-frequency components of most images are small, the coefficients corresponding to those components are often zero or nearly zero. In addition, the human eye is not very sensitive to distortion in the high-frequency components, so these coefficients can be quantized more coarsely. As a result, the bit rate needed to transmit the transform coefficients is much lower than the bit rate needed to transmit the image pixels directly. At the receiving end, the sample values are recovered with the inverse discrete cosine transform (IDCT). Some distortion remains, but it is acceptable to the human eye.

The two-dimensional forward and inverse discrete cosine transforms are given by:
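One commonly used form of this transform pair is written out here as a sketch. It assumes the normalization that matches the coefficient ranges quoted later in this article (a DC coefficient between 0 and 2047 for 8-bit samples when N = 8); the symbols f(x, y) for the pixel values, F(u, v) for the coefficients, and C(·) for the scale factor are the conventional ones.

\[
F(u,v) = \frac{2}{N}\, C(u)\, C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}
\]

\[
f(x,y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, F(u,v)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}
\]

\[
C(w) = \begin{cases} 1/\sqrt{2}, & w = 0 \\ 1, & w \neq 0 \end{cases}
\]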

Here N is the number of pixels along the horizontal and vertical sides of the image block; N = 8 is typical. Values of N larger than 8 compress more effectively, but the computational complexity increases greatly. An 8×8 block of data is converted by the DCT into 8×8 transform coefficients, each with a clear physical meaning. For example, when u = 0 and v = 0, F(0, 0) is proportional to the average of the original 64 sample values (with the normalization that gives the coefficient ranges quoted later in this article, it is 8 times the average) and corresponds to the DC component. As u and v increase, the coefficients represent progressively higher horizontal and vertical spatial frequencies. Figure 1 shows the case in which only a single horizontal row of data (8 pixels) is considered.


It can be seen that the image signal is decomposed into a DC component and a series of cosine components running from low frequency to high frequency. Each DCT coefficient simply indicates how much of the corresponding component is present in the original image signal. The reconstructed image signal can therefore be written in matrix form as F(n) = C(n) * E(n)

In this formula, E(n) is the set of basis functions, C(n) is the set of DCT coefficients, and F(n) is the image signal.
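The relation F(n) = C(n) * E(n) can be checked numerically with a short sketch. It assumes NumPy and the orthonormal 1-D DCT scaling sqrt(2/N) with an extra factor 1/sqrt(2) on the DC row; the function name dct_basis_1d and the sample row f are made up for illustration.

```python
import numpy as np

N = 8  # row length used in the article


def dct_basis_1d(n=N):
    """Return an n x n matrix E whose row u is the u-th cosine basis vector."""
    x = np.arange(n)
    u = np.arange(n).reshape(-1, 1)
    E = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    E[0, :] /= np.sqrt(2.0)  # the DC row carries the extra factor 1/sqrt(2)
    return E


E = dct_basis_1d()

# Hypothetical row of 8 pixel values (not taken from the article's figure).
f = np.array([52, 55, 61, 66, 70, 61, 64, 73], dtype=float)

C = E @ f      # forward transform: project the row onto each basis vector
f_rec = C @ E  # inverse transform: coefficients times basis, F(n) = C(n) * E(n)

print(np.round(C, 2))
print(np.allclose(f_rec, f))
```

Because the rows of E are orthonormal under this scaling, the inverse transform is just the coefficient vector multiplied by the basis matrix, which is the matrix form stated above.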

If changes in the vertical direction are also considered, a two-dimensional basis is needed: one that reflects both the horizontal and the vertical spatial frequency. For an 8×8 pixel block, the spatial basis is shown in Figure 2. Each basis element is itself an image made up of 64 pixel values and is usually called a basis image. They are called basis images because, in the inverse discrete cosine transform, any image block can be expressed as a combination of the 64 basis images weighted by the 64 coefficients. Since each basis image corresponds to a single coefficient in the transform domain, any pixel block can be regarded as a combination of the 64 basis images with different amplitudes. This has the same physical meaning as decomposing an arbitrary signal into a fundamental and harmonics of different amplitudes.
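The basis-image view can be illustrated with a minimal sketch, assuming NumPy and the 2/N normalization with C(0) = 1/sqrt(2). The names basis_1d, basis, forward_dct and inverse_dct are hypothetical helpers, and the test block is random data rather than a real image.

```python
import numpy as np

N = 8


def basis_1d(n=N):
    """Orthonormal 1-D cosine basis; row u is the u-th basis vector."""
    x = np.arange(n)
    u = np.arange(n).reshape(-1, 1)
    E = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    E[0, :] /= np.sqrt(2.0)
    return E


E = basis_1d()

# basis[u, v] is one 8x8 basis image: the outer product of two 1-D vectors.
# u indexes the frequency along x, v the frequency along y.
basis = np.einsum("ux,vy->uvxy", E, E)


def forward_dct(block):
    """Coefficient F(u, v) = inner product of the block with basis[u, v]."""
    return np.einsum("uvxy,xy->uv", basis, block)


def inverse_dct(coeffs):
    """Rebuild the block as a weighted sum of the 64 basis images."""
    return np.einsum("uv,uvxy->xy", coeffs, basis)


# Hypothetical 8x8 block of 8-bit samples.
block = np.random.default_rng(0).integers(0, 256, size=(N, N)).astype(float)
F = forward_dct(block)

print(np.allclose(inverse_dct(F), block))     # the 64 basis images suffice
print(np.isclose(F[0, 0], 8 * block.mean()))  # DC term is 8 times the mean

# A flat block at the maximum 8-bit value has only a DC term: 8 * 255 = 2040.
F_flat = forward_dct(np.full((N, N), 255.0))
print(round(F_flat[0, 0]))
```

The flat-block case shows why the DC coefficient needs more bits than the 8-bit input samples.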

Figure 3 shows an example of the DCT applied to an 8×8 image block:

In the example shown in Figure 3, the DCT still produces 64 coefficients from 64 sample values, so the transform by itself does not compress the bit rate. In fact, the number of bits per value increases after the DCT. The original samples are 8 bits, with values from 0 to 255. The DC component F(0, 0) can be as large as 64/8 = 8 times the original range of 256 levels, that is, 0 to 2047, and the AC components range from -1024 to 1023. After the second step, quantization (Δ = 4 in the figure), however, the coefficients of most high-frequency components become 0. The human eye is generally sensitive to low-frequency components and insensitive to high-frequency ones, so quantization removes the less important high-frequency components and reduces the bit rate.

The bit rate can then be reduced further by reading out the coefficients in zig-zag order. After the DCT, the significant coefficients are concentrated in the upper-left corner, which is the low-frequency region, so the zig-zag scan in effect reads the coefficients in order of increasing two-dimensional frequency. This makes run-length encoding convenient. In run-length encoding, a single code represents both the value of a nonzero coefficient and the number of zeros preceding it. This takes full advantage of the zig-zag scan, which tends to produce long runs of consecutive zeros. In particular, if all remaining coefficients are zero, the encoder only needs to emit an end-of-block (EOB) code after the last nonzero value, which saves a large number of bits.
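The quantization, zig-zag scan and run-length steps described above can be sketched as follows. This is an illustration under stated assumptions, not the article's exact scheme: it uses one uniform step size of 4 for every coefficient (practical codecs such as JPEG use a per-frequency quantization table), a plain (zeros_before, value) pair format with the string "EOB" as the end marker, and a made-up coefficient block. All function names are hypothetical.

```python
import numpy as np


def quantize(coeffs, step=4):
    """Uniform quantization: divide by the step size and round to integers."""
    return np.round(coeffs / step).astype(int)


def zigzag_order(n=8):
    """List the (row, col) positions of an n x n block in zig-zag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals; alternate the direction on each one.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def run_length_encode(values):
    """Encode as (zeros_before, value) pairs, closed by an 'EOB' marker."""
    pairs, run = [], 0
    for value in values:
        if value == 0:
            run += 1
        else:
            pairs.append((run, int(value)))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block code
    return pairs


def run_length_decode(pairs, length=64):
    """Invert run_length_encode, padding with zeros after 'EOB'."""
    out = []
    for item in pairs:
        if item == "EOB":
            break
        run, value = item
        out.extend([0] * run + [value])
    return out + [0] * (length - len(out))


# Hypothetical DCT coefficients: energy concentrated in the upper-left corner.
F = np.zeros((8, 8))
F[0, 0], F[0, 1], F[1, 0] = 260.0, -12.3, 8.9
F[1, 1], F[0, 3], F[2, 0] = 4.2, -3.1, 1.4

q = quantize(F, step=4)                      # the 1.4 coefficient becomes 0
scan = [q[r, c] for r, c in zigzag_order()]  # low frequencies come first
encoded = run_length_encode(scan)
print(encoded)
# → [(0, 65), (0, -3), (0, 2), (1, 1), (1, -1), 'EOB']
```

In this example the 64 quantized values shrink to five pairs plus the end-of-block marker, because everything after the last nonzero coefficient in scan order is zero.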

 
