[Graphic] MPEG-2 compression coding technology principle application (6)

Last Update:2018-12-04 Source: Internet

Author: User


Guo Bin, Professor of the Department of television engineering, Beijing Broadcasting Institute

2) Two-Dimensional DCT

MPEG adopts the discrete cosine transform (DCT) compression algorithm, proposed by N. Ahmed et al. in 1974, to reduce the spatial redundancy of video signals. Because still images and prediction error signals have very high spatial redundancy, the DCT is the most widely used frequency-domain decomposition technique for reducing it. The DCT converts a block of motion-compensated prediction errors, or of original image samples, into a set of coefficients that represent different frequency components. This has two advantages. First, a signal usually concentrates most of its energy in a small region of the frequency domain, so the unimportant components can be described with only a few bits. Second, frequency-domain decomposition mirrors the way the human visual system processes images, which allows the subsequent quantization to be matched to the eye's sensitivity. The video signal spectrum lies in the range 0-6 MHz. Most of a video image consists of low-frequency spectral lines; only the edges in the image, which account for a very small share of the image area, contain high-frequency lines. Therefore, in digital video processing, bits can be allocated according to the spectrum: more bits to the low-frequency region, which carries much information, and fewer bits to the high-frequency region, which carries little. The loss in image quality is not perceptible, and the goal of bit rate compression is achieved. However, efficient coding is possible only when the entropy is low. Whether a string of data can be coded efficiently depends on the probability with which each data value occurs.
If the probabilities of the individual values differ widely, the entropy is low and the string can be coded efficiently. If the probabilities differ little, the entropy is high and efficient coding is not possible. Digitizing the video signal means that an A/D converter converts the video level at the specified sampling frequency. The amplitude of the input video signal is represented by 256 or 1024 levels, and the amplitude of each pixel moves among these levels over time. The average amount of information per pixel is the entropy. Because every video level occurs with almost equal probability, the entropy of the video signal is very high, as shown in Figure 21.
Entropy is the parameter that determines the achievable bit rate compression ratio: how far a video image can be compressed depends on the entropy of the video signal. In most cases the video signal has high entropy, so to code it efficiently the high entropy must first be turned into low entropy. How can low entropy be obtained? This requires analyzing the characteristics of the video spectrum. The video spectrum analysis in Figure 22 shows that, in most cases, the amplitude of the video spectrum decreases as the frequency increases. The low-frequency components take every level, from low up to the highest, with almost equal probability. In contrast, the high-frequency components usually take low levels and only rarely high levels. Clearly, the low-frequency spectrum has higher entropy and the high-frequency spectrum has lower entropy. Accordingly, the low-frequency and high-frequency components of the video can be processed separately to obtain a high compression ratio.

As shown above, bit rate compression rests on two algorithms, transform coding and entropy coding, as shown in Figure 23. The former lowers the entropy; the latter converts the data into an efficient code that reduces the number of bits. In the MPEG standard, transform coding uses the DCT. Although the transform itself does not compress the bit rate, the resulting frequency coefficients lend themselves well to bit rate compression. In practice, the whole process of compressing a digital video signal consists of block sampling, DCT, quantization, and coding, as shown in Figure 24. First, the original image is divided in the spatial domain into sample blocks of N (horizontal) × N (vertical) pixels. Possible block sizes include 4 × 4, 4 × 8, 8 × 8, 8 × 16, and 16 × 16. As a reasonable compromise between data correlation and computational complexity, 8 × 8 pixel blocks are chosen. In the example, the 8 × 8 block holds the gray values of the original image pixels, which lie in the range 139 to 163, and is sent to the DCT encoder, which converts the sample block from the spatial domain into a block of DCT coefficients in the frequency domain. The DCT is applied within each sample block. Every entry of a block is a number giving the video signal amplitude of the corresponding pixel in a field. Formulas (2) and (3) are the two-dimensional forward DCT and inverse DCT:
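For reference, one commonly used orthonormal form of the N × N two-dimensional DCT pair is sketched below. The notation (f for pixel values, F for coefficients, C for the normalization factor) is an assumption about how formulas (2) and (3) are written, chosen because it agrees with the F(0, 0) = 1 example that follows. Other texts scale the transform differently.

```latex
F(u,v) = \frac{2}{N}\, C(u)\, C(v)
         \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\,
         \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}
\qquad (2)

f(x,y) = \frac{2}{N}
         \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, F(u,v)\,
         \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}
\qquad (3)

C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k \neq 0 \end{cases}
```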

For example, when u, v = 0, if the coefficient after the forward discrete cosine transform (DCT) is F(0, 0) = 1, then the function reconstructed by the inverse discrete cosine transform (IDCT) is f(x, y) = 1/8, a constant, so F(0, 0) is called the direct current (DC) coefficient. When u, v ≠ 0 and the coefficient after the forward transform is F(u, v) ≠ 0, the function f(x, y) reconstructed by the inverse transform is not constant, and F(u, v) is called an alternating current (AC) coefficient.

The forward DCT formula (2) and the inverse DCT formula (3) look expensive to compute. In practice, however, the transform is implemented in code so that the two cosine terms are computed only once, at the start of the program; the results are stored and afterwards obtained by table lookup. The program uses doubly nested loops. Figure 25 shows the kernel function G_{u,v}(x, y), the product of the two cosine terms. With n = 8, u = 2, v = 3, x = 4, and y = 5, G_{2,3}(4, 5) = cos(9π/8) × cos(33π/16) = (-0.924) × (+0.981) = -0.906. The values at all other points are obtained in the same way and stored for later lookup. Code can then look up each term and produce the DCT coefficients output by the DCT encoder in Figure 24. Based on formulas (2) and (3) and the lookup table, the C code that computes the elements of the n × n matrix with doubly nested loops is as follows:
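    /* Forward DCT of one n x n block. cosines[][] and coefficients[][]
       are lookup tables filled in once at program start; sqrt() needs <math.h>. */
    for (u = 0; u < n; u++)
        for (v = 0; v < n; v++) {
            temp = 0.0;
            /* accumulate the double sum over all pixels of the block */
            for (x = 0; x < n; x++)
                for (y = 0; y < n; y++) {
                    temp += cosines[x][u] * cosines[y][v] * pixel[x][y];
                }
            /* apply the scaling factors and round to an integer coefficient */
            temp *= sqrt(2 * n) * coefficients[u][v];
            DCT[u][v] = int_round(temp);
        }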

In the code, pixel[x][y] represents f(x, y) in the formula, and DCT[u][v] represents F(u, v).
Besides the definition by doubly nested loops, the DCT can also be defined as a matrix computation using the cosine transform matrix. The two approaches are equivalent.

Figure 24 and the transform principles above reveal two points. First, the 64 DCT frequency coefficients after the DCT correspond to the 64 pixels of the block before the DCT. There are 64 points on both sides, so the DCT is merely a lossless transform that performs no compression. Second, the spectrum of a single image is concentrated almost entirely in the coefficients in the upper left corner of the block, and a compressed image can be formed from that part of the spectrum alone. The largest value in the output coefficient matrix is the DC coefficient in the upper left corner, which is 315 in Figure 24. Because it represents the DC component along both the X and Y axes, it reflects the average value of the whole input matrix. Starting from the DC coefficient, the other DCT coefficients extend to the right and downward; the farther a coefficient is from the DC component, the higher its frequency and the smaller its amplitude. In Figure 24 the value in the lower right corner is -0.11. In other words, most of the image information is concentrated in the DC coefficient and the coefficients near it, while the high-frequency spectrum farther and farther from the DC coefficient contains almost no image information, or even only noise. Clearly, although the DCT itself does not compress, it lays the essential foundation for deciding what to "keep" and what to "discard" in the subsequent compression. (To be continued)
