About DCT --- reprint

Source: Internet
Author: User
MPEG adopts the Discrete Cosine Transform (DCT), proposed by Ahmed (a giant in the field) in the 1970s, to reduce the spatial redundancy of video signals.
The DCT converts motion-compensation errors or blocks of original image information into coefficients that represent different frequency components. This has two advantages. First, a signal usually concentrates most of its energy in a small range of the frequency domain, so the unimportant components can be described with only a few bits. Second, frequency-domain decomposition mirrors the way the human visual system processes images, which lets the subsequent quantization step match its sensitivity.
I describe this point in detail in my tutorial, so let me quote it directly:

The spectrum of a video signal lies in the range of 0 to 6 MHz, and most of a video image consists of low-frequency spectral lines. Only the edges in an image, which occupy a very small proportion of the image area, contain high-frequency spectral lines. Therefore, when a video signal is processed digitally, bits can be allocated according to the spectrum: a large number of bits go to the low-frequency regions, which carry a large amount of information, and a small number of bits go to the high-frequency regions, which carry little. The bit rate is thus compressed without a perceptible loss of image quality. However, encoding is effective only when the entropy is low. Whether a string of data can be encoded efficiently depends on the probability with which each value occurs. If the probabilities differ widely, the entropy is low and the string can be encoded efficiently. If the probabilities differ little, the entropy is high and efficient coding is not possible.
The video signal is digitized by an A/D converter, which converts the video level at a specified sampling frequency. The video signal amplitude of each pixel changes periodically over time. The average amount of information per pixel is the entropy. Because each video level occurs with almost the same probability, the entropy of the video signal is very high. The entropy is the parameter that determines the bit-rate compression ratio: the compression ratio of a video image depends on the entropy of the video signal. In most cases the video signal has high entropy, so to encode it efficiently the high entropy must first be turned into low entropy. How can that be done? This requires analyzing the characteristics of the video spectrum.
In most cases, the amplitude of the video spectrum decreases as the frequency increases. The low-frequency components take on levels from 0 to the maximum with almost equal probability. In contrast, the high-frequency components usually take low levels and only rarely high levels. Clearly, the low-frequency spectrum has higher entropy and the high-frequency spectrum has lower entropy. On this basis, the low-frequency and high-frequency components of the video can be processed separately to obtain a high compression ratio.
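
To make the link between probability and entropy concrete, here is a minimal sketch that computes the Shannon entropy of two made-up distributions. The four-level alphabets and their probabilities are illustrative assumptions, not measurements of real video.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical case 1: four levels, all equally likely.
# This resembles the low-frequency levels described above.
uniform = [0.25, 0.25, 0.25, 0.25]

# Hypothetical case 2: one (low) level dominates and the others are rare.
# This resembles the high-frequency levels described above.
skewed = [0.85, 0.05, 0.05, 0.05]

h_uniform = entropy_bits(uniform)
h_skewed = entropy_bits(skewed)

print(f"uniform: {h_uniform:.3f} bits/symbol")
print(f"skewed:  {h_skewed:.3f} bits/symbol")
```

The uniform case needs the full 2 bits per symbol, while the skewed case needs less than 1 bit per symbol on average, which is why an entropy coder gains more from data whose values are very unevenly distributed.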

As the quotation above shows, bit-rate compression rests on transform coding and entropy coding. The former lowers the entropy, and the latter converts the data into an efficient code that reduces the number of bits. In the MPEG standard, the transform coding uses the DCT. Although the transform itself does not compress the bit rate, the resulting frequency coefficients are very helpful for bit-rate compression. In fact, the whole process of compressing a digital video signal is divided into four main steps: block sampling, DCT, quantization, and encoding. First, the original image is divided in the spatial domain into sampling blocks of N (horizontal) × N (vertical) pixels. Blocks of 4x4, 4x8, 8x8, 8x16, or 16x16 can be chosen as needed. These blocks hold the gray value of each pixel in the original image frame (in the example here they range from 139 to 163) and are sent to the DCT encoder in sequence, which converts each sampling block from the spatial domain into a block of DCT coefficients in the frequency domain. The DCT is carried out within each sampling block. Every sample in a block is a digital value giving the video signal amplitude of the corresponding pixel in a field.
The formulas for the DCT and for its inverse, which is used in decompression, are as follows.
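
The block sampling step can be sketched as follows. This is only an illustration: the function name `split_into_blocks` and the 16 x 16 test frame are hypothetical, and the sketch assumes the frame dimensions are exact multiples of the block size.

```python
def split_into_blocks(image, n=8):
    """Split a 2-D list of gray values into n x n blocks, scanning
    left to right and top to bottom.

    Assumes the image height and width are multiples of n.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, n):
            block = [row[left:left + n] for row in image[top:top + n]]
            blocks.append(block)
    return blocks

# Hypothetical 16 x 16 grayscale frame holding a simple diagonal gradient.
frame = [[x + y for x in range(16)] for y in range(16)]

blocks = split_into_blocks(frame, n=8)
print(len(blocks), "blocks of", len(blocks[0]), "x", len(blocks[0][0]))
```

Each block in the resulting list would then be passed to the DCT on its own, in the order in which it was produced.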
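
The formulas are not reproduced in this reprint. As a reference, and assuming the standard 8x8 definition used in JPEG and MPEG, the forward and inverse transforms can be written as below. This form agrees with the value f(x, y) = 1/8 for F(0, 0) = 1 mentioned in the text.

```latex
% Forward DCT of an 8x8 block f(x, y)
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7}
         f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

% Inverse DCT (IDCT), used in decompression
f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
         C(u)\, C(v)\, F(u,v)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

% Normalization factor
C(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\ 1, & k \neq 0 \end{cases}
```

Here x, y index the pixels of the block and u, v index the horizontal and vertical frequencies, each running from 0 to 7.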

When u = v = 0 and the only nonzero coefficient after the forward DCT is F(0, 0) = 1, the inverse transform (IDCT) gives f(x, y) = 1/8, a constant. F(0, 0) is therefore called the direct-current (DC) coefficient. When u and v are not both 0 and the coefficient after the forward transform is F(u, v) ≠ 0, f(x, y) is no longer a constant, and such F(u, v) are called the alternating-current (AC) coefficients.
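
The DC and AC behaviour can be checked numerically. The sketch below assumes the standard 8x8 IDCT with the 1/4 scale factor and C(0) = 1/√2; the helper names `c` and `idct_8x8` are made up for this illustration.

```python
import math

N = 8

def c(k):
    """Normalization factor: 1/sqrt(2) for the zero frequency, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def idct_8x8(F):
    """Inverse DCT of an 8x8 coefficient block F[u][v], returning f[x][y]."""
    f = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (c(u) * c(v) * F[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            f[x][y] = total / 4
    return f

# Only the DC coefficient is set: every output sample is the same.
dc_only = [[0.0] * N for _ in range(N)]
dc_only[0][0] = 1.0
flat = idct_8x8(dc_only)

# Only one AC coefficient is set: the output varies across the block.
ac_only = [[0.0] * N for _ in range(N)]
ac_only[0][1] = 1.0
wavy = idct_8x8(ac_only)

print("DC only:", flat[0][0], flat[7][7])
print("AC only:", wavy[0][0], wavy[0][7])
```

With only F(0, 0) = 1, all 64 samples come out as 1/8, up to floating-point rounding. With a single AC coefficient the samples swing between positive and negative values, so the block is no longer flat.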

For a concrete example of the DCT, see the figure below (I happened to have made this slide for employee training):
Http://pic.zingking.com/rzhy/kean/DCTpro.jpg

The conversion principle shows two things. First, the 64 DCT frequency coefficients correspond to the 64 pixels of the block before the DCT. Both sides of the transform have 64 points, so it is only a lossless transformation and involves no compression. Second, the spectrum of a single image is concentrated almost entirely in the coefficients near the upper left corner of the block, and a compressed image can be formed from that part of the spectrum alone. The largest value in the coefficient matrix output by the DCT is the DC coefficient in the upper left corner, 315 in the figure. Because it is the DC component along both the X and Y axes, it is proportional to the average value of the whole input block. The other coefficients lie below and to the right of the DC coefficient. The farther a coefficient is from the DC coefficient, the higher its frequency and the smaller its amplitude; the value in the lower right corner of the figure is -0.11. In other words, most of the image information is concentrated in the DC coefficient and the nearby low-frequency coefficients, while the high-frequency coefficients far from the DC coefficient contain almost no image information, or even only noise. Clearly, although the DCT does not compress anything itself, it lays the essential foundation for deciding what to "keep" and what to "discard" in the subsequent compression.
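
Both observations can be reproduced with a small experiment. The sketch below assumes the standard 8x8 DCT normalization (scale factor 1/4, C(0) = 1/√2) and uses a made-up smooth block, not the block from the figure; all function and variable names are hypothetical.

```python
import math

N = 8

def c(k):
    """Normalization factor: 1/sqrt(2) for the zero frequency, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(a, k):
    """Cosine basis value for sample position a and frequency index k."""
    return math.cos((2 * a + 1) * k * math.pi / (2 * N))

def dct_8x8(f):
    """Forward DCT: pixel block f[x][y] -> coefficient block F[u][v]."""
    return [[c(u) * c(v) / 4 * sum(f[x][y] * basis(x, u) * basis(y, v)
                                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct_8x8(F):
    """Inverse DCT: coefficient block F[u][v] -> pixel block f[x][y]."""
    return [[sum(c(u) * c(v) * F[u][v] * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N)) / 4
             for y in range(N)] for x in range(N)]

# Hypothetical smooth block: gray values rising gently from 139 to 153.
block = [[139 + x + y for y in range(N)] for x in range(N)]

coeffs = dct_8x8(block)
restored = idct_8x8(coeffs)

# 1) 64 values in, 64 values out, and the round trip restores the block.
max_error = max(abs(restored[x][y] - block[x][y])
                for x in range(N) for y in range(N))

# 2) The energy sits in the upper left corner of the coefficient block.
mean = sum(map(sum, block)) / (N * N)
total_energy = sum(value * value for row in coeffs for value in row)
low_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))

print("round-trip error:", max_error)
print("DC coefficient:", coeffs[0][0], "block mean:", mean)
print("share of energy in the 2x2 corner:", low_energy / total_energy)
print("lower right coefficient:", coeffs[7][7])
```

Under this normalization the DC coefficient equals 8 times the block mean, and for a smooth block like this one nearly all of the energy falls in the few coefficients next to it, while the lower right coefficient is essentially zero.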

 
