HEVC algorithm and Architecture: intra-frame prediction of predictive coding

Last Update:2015-11-24 Source: Internet

Author: User

Tags coding standards

Intra-frame prediction of predictive coding (intra-picture prediction)

Predictive coding is one of the core techniques of video coding. It uses one or more previously coded sample values to predict the current sample value according to a model or method, and encodes the difference between the actual and predicted values. The video encoder transforms, quantizes, and entropy codes the prediction residual rather than the original pixel values, which greatly improves coding efficiency.

In a video signal there is strong spatial correlation between neighboring pixels within a frame (spatial redundancy) and strong correlation between adjacent frames (temporal redundancy). Intra-frame prediction removes spatial redundancy, and inter-frame prediction removes temporal redundancy.

This blog post first introduces the principle of predictive coding, and then focuses on the key points of intra-frame prediction.

First, the principle of predictive coding

We can simply regard video as a source with memory. Predictive coding uses a prediction model to remove the correlation between pixels, so the resulting difference signal can be treated as uncorrelated and encoded as a memoryless source.

In predictive coding, the image sample value itself is not transmitted directly; instead, the difference between the actual sample value and its predicted value is encoded and transmitted. If this difference (the prediction error) is quantized before being encoded, the method is called differential pulse code modulation (DPCM). Statistically, the prediction error to be transmitted is concentrated in a small range around 0. In addition, because of the "masking effect" of the human eye, larger errors in regions with complex texture or motion are hard to notice. As a result, quantizing the prediction error requires far fewer quantization levels than quantizing the image samples directly. DPCM thus compresses the bit rate by removing the correlation between neighboring pixels and by reducing the number of quantization levels of the difference signal.

As the schematic framework of predictive differential coding shows, the predictor and the quantizer are the two most important parts of a predictive coding system.

The basic process of predictive coding is as follows. For the input pixel value x(n), the predicted value p(n) of the current pixel is first derived from the reconstructed values of already coded pixels. The difference e(n) = x(n) - p(n) is then quantized and entropy coded. At the same time, the quantized residual e'(n) is added to the predicted value p(n) to obtain the reconstructed value x'(n) of the current pixel, which is used to predict the pixels coded afterwards. The corresponding decoding process is: entropy decoding yields the reconstructed prediction error e'(n), which is added to the predicted value p(n) to obtain the reconstructed value x'(n) of the current pixel.

To ensure that the encoder and decoder predict from exactly the same reference data, spatial or temporal prediction must use the decoded pixel x'(n) as the reference pixel. This avoids error accumulation caused by the encoder and decoder using different prediction references. In other words, a decoder has to be embedded inside the encoder.
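The closed loop described above can be illustrated with a minimal sketch. It assumes a one-tap predictor (the previous reconstructed sample), a uniform quantizer with step size `step`, and an initial predictor value of 0; the function names are hypothetical and entropy coding is left out.

```python
def quantize(e, step):
    """Uniform quantizer: map a prediction error to an integer index."""
    sign = -1 if e < 0 else 1
    return sign * ((abs(e) + step // 2) // step)


def dpcm_encode(samples, step):
    """Closed-loop DPCM encoder (illustrative sketch).

    Returns the quantizer indices (what would be entropy coded) and the
    reconstruction x'(n) that the embedded decoder produces.
    """
    indices, recon = [], []
    prev = 0  # assumed initial predictor state
    for x in samples:
        p = prev               # p(n): prediction from reconstructed data
        e = x - p              # e(n): prediction error
        q = quantize(e, step)  # quantizer index
        x_rec = p + q * step   # x'(n) = p(n) + e'(n)
        indices.append(q)
        recon.append(x_rec)
        prev = x_rec           # predict from x'(n), never from x(n)
    return indices, recon


def dpcm_decode(indices, step):
    """Decoder: repeats the encoder's reconstruction loop."""
    recon, prev = [], 0
    for q in indices:
        x_rec = prev + q * step
        recon.append(x_rec)
        prev = x_rec
    return recon
```

Because the encoder predicts from x'(n) rather than x(n), the decoder reproduces the encoder's reconstruction exactly, and the error of each sample stays within half a quantization step instead of accumulating.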

Second, intra-frame prediction technology

The temporal correlation of a video sequence is usually stronger than its spatial correlation, so inter-frame prediction usually contributes more than intra-frame prediction. This does not mean that every video frame can be coded using inter-frame prediction, mainly for the following reasons:

(1), almost all video coding standards support I-frames. An I-frame can be decoded independently, without relying on adjacent reference frames. This enables video applications to support fast-forward and rewind playback, and it also stops the accumulation of coding distortion, which would otherwise gradually degrade the image and the motion prediction of subsequent images.

(2), the motion model based on rigid-body translation does not suit all scenes, because the actual motion in a video sequence is very complex. Variable-block-size motion prediction mitigates this shortcoming to some extent, but some macroblocks or blocks still cannot be predicted well by motion compensation. In these regions the spatial correlation may be stronger than the temporal correlation, so intra-frame prediction outperforms inter-frame prediction. Research shows that a small proportion (1%~3%) of the macroblocks in P-frames and B-frames use intra-frame prediction modes.

1. Differences in intra-frame prediction block sizes and prediction modes between HEVC and H.264

HEVC intra-frame prediction is similar to that of H.264: both predict from the reconstructed values of neighboring blocks, so the selection and coding of the prediction mode are the key problems to be solved in intra-frame coding. The biggest difference between HEVC and H.264 intra-frame prediction is that HEVC uses larger and more varied block sizes to suit the content characteristics of HD video, and supports more intra-frame prediction modes to handle richer textures.

H.264 specifies three sizes of luma intra-frame prediction blocks: 4×4, 8×8, and 16×16. Intra-frame prediction of the chroma components is based on 8×8 blocks. Luma blocks of size 4×4 and 8×8 have 9 prediction modes (vertical, horizontal, DC, diagonal down-left, diagonal down-right, vertical-right, horizontal-down, vertical-left, and horizontal-up), while 16×16 luma blocks and 8×8 chroma blocks have only 4 prediction modes (DC, horizontal, vertical, and plane). Note that the mode numbering of the 9 modes for 4×4 and 8×8 luma blocks differs from the mode numbering of the 4 modes for 16×16 luma blocks and 8×8 chroma blocks.

HEVC luma intra-frame prediction supports 5 PU (prediction unit) sizes: 4×4, 8×8, 16×16, 32×32, and 64×64. Each PU size has 35 prediction modes: planar mode, DC mode, and 33 angular modes, as shown in the figure. For the chroma components, the supported PU sizes are 4×4, 8×8, 16×16, and 32×32, and there are 5 modes in total: planar mode, vertical mode, horizontal mode, DC mode, and the prediction mode of the corresponding luma component. If the corresponding luma prediction mode is already one of the first 4 modes, that duplicated mode is replaced with angular mode 34.
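The chroma candidate rule can be sketched as follows. The sketch assumes the usual HEVC mode numbering (planar = 0, DC = 1, horizontal = 10, vertical = 26, last angular mode = 34); the function name `chroma_candidates` is hypothetical.

```python
# Assumed HEVC intra mode numbers: 0 = planar, 1 = DC, 2..34 = angular.
PLANAR, DC, HORIZONTAL, VERTICAL, ANGULAR_34 = 0, 1, 10, 26, 34


def chroma_candidates(luma_mode):
    """Return the 5 chroma intra mode candidates for a given luma mode."""
    fixed = [PLANAR, VERTICAL, HORIZONTAL, DC]
    # A fixed candidate that duplicates the luma mode is replaced by mode 34,
    # so the list always contains 5 distinct modes.
    candidates = [ANGULAR_34 if m == luma_mode else m for m in fixed]
    candidates.append(luma_mode)  # mode derived from the luma component
    return candidates
```

For example, with a vertical luma mode (26) the vertical entry in the fixed list becomes mode 34, while a luma mode such as 18 leaves the four fixed candidates unchanged.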

In summary, when the prediction mode is intra, the PB size equals the CB size for all block sizes except the smallest CB size. For the smallest CB size, a flag indicates whether the CB is split into 4 PBs, each with its own intra-frame prediction mode. This split exists so that an intra-frame prediction mode can be selected for blocks as small as 4×4. When luma intra-frame prediction operates on 4×4 blocks, chroma intra-frame prediction also uses 4×4 blocks.

All prediction modes use the same reference template, as shown in the figure. Compared with H.264, HEVC additionally uses the boundary pixels of the lower-left block as a reference for the current block. H.264 codes in units of fixed-size macroblocks, so when the current block is being intra-predicted, the block to its lower left has most likely not been coded yet and cannot serve as a reference. The quadtree coding structure of HEVC makes the pixels of this region available.

2. Intra-frame prediction process

In HEVC, the 35 prediction modes are defined at the PU level, while the intra-frame prediction process itself is carried out in units of TUs. HEVC specifies that a PU can be split into TUs in the form of a quadtree, and all TUs within a PU share the same prediction mode.

The in-frame prediction process of HEVC can be divided into the following three steps:

(1), the acquisition of neighboring reference pixels

As shown in the figure, for a current TU of size N×N, the reference pixels fall into 5 regions: lower left, left, upper left, upper, and upper right, for a total of 4N+1 points. If the current TU lies on a picture, slice, or tile boundary, some neighboring reference pixels may not exist or may be unavailable. In some cases the lower-left or upper-right block has not been coded yet, so its reference pixels are also unavailable. When a pixel does not exist or is unavailable, HEVC specifies that it is filled from the nearest available pixel. For example, if the lower-left reference pixels do not exist, all of them are filled with the bottommost pixel of the left region; if the upper-right reference pixels do not exist, they are filled with the rightmost pixel of the upper region. If none of the reference pixels is available, they are all filled with a fixed value: 128 for 8-bit pixels and 512 for 10-bit pixels.
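The filling rule can be sketched as below. The sketch assumes the 4N+1 reference pixels are stored in one list ordered from the bottom of the lower-left region, up the left side to the upper-left corner, then rightwards to the end of the upper-right region, with `None` marking an unavailable pixel. The function name and this data layout are illustrative assumptions, not taken from any reference software.

```python
def pad_reference_samples(ref, bit_depth=8):
    """Fill unavailable (None) reference samples from the nearest available one.

    ref: list of 4N+1 samples ordered lower-left (bottom first) -> left ->
         upper-left corner -> upper -> upper-right.
    """
    if all(v is None for v in ref):
        # Nothing is available: use the mid-grey value, 1 << (bit_depth - 1).
        return [1 << (bit_depth - 1)] * len(ref)

    out = list(ref)
    if out[0] is None:
        # Seed the start of the scan with the first available sample.
        out[0] = next(v for v in out if v is not None)
    for i in range(1, len(out)):
        if out[i] is None:
            out[i] = out[i - 1]  # copy the preceding (already valid) sample
    return out
```

With N = 2 and both the lower-left and upper-right regions missing, `[None, None, 10, 20, 30, 40, 50, None, None]` becomes `[10, 10, 10, 20, 30, 40, 50, 50, 50]`: the lower-left pixels take the bottommost left pixel, and the upper-right pixels take the rightmost upper pixel.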

(2), the filter of the reference pixel

H.264 filters the reference pixels in some intra-frame prediction modes to make better use of the correlation between neighboring pixels and improve prediction accuracy. HEVC follows this approach and extends it in two ways: first, the set of modes whose reference pixels are filtered depends on the TU size; second, a strong filtering method is added.
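As an illustration of reference pixel filtering, the following sketch applies the 3-tap [1, 2, 1] / 4 smoothing filter commonly described for HEVC to a line of reference pixels. It assumes the same linear ordering of the 4N+1 reference pixels (lower left to upper right) and leaves the two end pixels unfiltered. The mode-dependent and size-dependent decision on whether to filter, and the strong filter, are not modelled; the function name is hypothetical.

```python
def smooth_reference(ref):
    """Apply a [1, 2, 1] / 4 low-pass filter to a line of reference samples.

    The first and last samples are copied unchanged; every other sample is
    replaced by a rounded weighted average of itself and its two neighbours.
    """
    out = list(ref)
    for i in range(1, len(ref) - 1):
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out
```

A flat line of pixels is left unchanged, while a step edge such as `[10, 10, 20, 30, 30]` is softened to `[10, 13, 20, 28, 30]`.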

(3), the calculation of the predicted pixel

Each prediction mode uses its own calculation to derive the predicted pixel values from the reference pixels.
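To make this step concrete, here is a minimal sketch of two of the 35 modes, DC and planar, for an N×N block where N is a power of two. It follows the commonly cited HEVC formulas, but omits the boundary smoothing that HEVC applies to some luma blocks in DC mode, as well as all angular modes. The function names and argument layout are assumptions made for illustration.

```python
def predict_dc(top, left):
    """DC mode: every predicted sample is the rounded mean of the N upper
    and N left reference samples."""
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


def predict_planar(top, left, top_right, bottom_left):
    """Planar mode: average of a horizontal and a vertical linear
    interpolation.

    top[x], left[y]  -- the N upper and N left reference samples
    top_right        -- first sample of the upper-right region
    bottom_left      -- first sample of the lower-left region
    """
    n = len(top)
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            hor = (n - 1 - x) * left[y] + (x + 1) * top_right
            ver = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y][x] = (hor + ver + n) >> shift
    return pred
```

DC mode suits flat regions, while planar mode produces a smooth gradient between the reference samples on the block boundary.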

This article is an English version of an article originally published in Chinese on aliyun.com.