Summary
Introduction to the first chapter
Body
1. Generally speaking, a video signal carries a large amount of information, so transmitting it requires a relatively wide network bandwidth. For example, a videophone or videoconference signal contains relatively little motion, so it needs less bandwidth; even so, achieving good quality requires on the order of a few Mbps uncompressed, while compression brings it down to about 384 Kbps. A high-definition television (HDTV) signal, with its much larger amount of information, requires roughly 1 Gbps uncompressed; even with MPEG-2 compression it still needs about 20 Mbps.
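The numbers above can be sanity-checked with simple arithmetic. A minimal sketch (the function name and the exact resolution, frame rate, and bits-per-pixel figures are illustrative assumptions, not values from the book):

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel):
    """Uncompressed video bitrate in bits per second."""
    return width * height * bits_per_pixel * fps

# Illustrative HDTV parameters: 1920x1080, 30 frames/s,
# 4:2:0 chroma sampling (~12 bits per pixel on average).
hdtv = raw_bitrate_bps(1920, 1080, 30, 12)
print(f"raw HDTV: {hdtv / 1e9:.2f} Gbps")               # roughly 0.75 Gbps
print(f"ratio vs 20 Mbps MPEG-2: {hdtv / 20e6:.0f}:1")  # roughly 37:1
```

Depending on frame rate and sampling format the raw rate lands anywhere from several hundred Mbps to beyond 1 Gbps, which is why a compression ratio of tens to one is needed to reach 20 Mbps.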
2. The goal of video compression coding
Because a video signal carries so much information, it places heavy demands on network bandwidth, like a huge lorry that can only travel on a broad road. This raises a question: can the video signal be compression-encoded before transmission, that is, compress the video source first and then transfer it over the network, in order to save transmission bandwidth and storage space? This imposes two requirements:
1) The compressed signal must fit within a given bandwidth; that is, the video encoder must provide a sufficient compression ratio.
2) After compression, the video must retain a certain level of quality. Quality is judged by two standards: subjective quality, evaluated visually by human viewers, and objective quality, usually expressed as a signal-to-noise ratio (S/N). If quality is ignored and the signal is compressed blindly, a very high compression ratio can be reached, but the serious distortion after compression clearly fails the requirements; conversely, if only quality is pursued, the compression ratio becomes too small, which also fails the requirements.
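The objective quality measure mentioned above is commonly computed as peak signal-to-noise ratio (PSNR) between the original and the reconstructed pixels. A minimal sketch (the function name and sample pixel values are illustrative, not from the book):

```python
import math

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB for 8-bit samples:
    10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals: no distortion
    return 10 * math.log10(peak ** 2 / mse)

orig  = [52, 55, 61, 66, 70, 61, 64, 73]   # original pixel row
recon = [52, 54, 61, 67, 70, 60, 64, 72]   # after lossy compression
print(f"PSNR: {psnr(orig, recon):.1f} dB")
```

Higher PSNR means less distortion; heavy compression lowers the PSNR, which is the quality/compression-ratio trade-off described above.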
3. Intra-frame predictive coding
As is well known, an image consists of many so-called pixels; in Figure 1.2 each "O" represents one pixel. Extensive statistics show a strong correlation between pixels in the same image: the shorter the distance between two pixels, the stronger the correlation or, put plainly, the closer their values. In other words, the probability of an abrupt change between two adjacent pixel values is minimal, while the probability that they are "equal, similar, or slowly varying" is great.
Thus the correlation between pixels can be exploited for compression coding. For example, the current pixel X (the pixel about to be transmitted) can be predicted from the preceding pixel A, or from a linear weighting of A, B, and C; A, B, and C are called reference pixels. In the actual transfer, the difference between the actual pixel X (the current value) and the reference pixel (the predicted value) is transmitted: for instance, only X - A is sent, and the receiver computes (X - A) + A = X, since A was transmitted earlier (and is stored at the receiving side), recovering the current value. Because X is similar to A, the value of (X - A) is small, so the video signal is compressed. This is called intra-frame predictive coding.
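The "transmit X - A, then add A back" scheme above can be sketched as simple differential (DPCM-style) coding along a row of pixels. This is a minimal illustration with the previous pixel as the sole predictor; the function names and sample values are assumptions for the example, not the book's notation:

```python
def encode_residuals(pixels):
    """Transmit the first pixel as-is, then only the differences X - A,
    where A is the previously sent pixel (the prediction)."""
    residuals = [pixels[0]]
    for i in range(1, len(pixels)):
        residuals.append(pixels[i] - pixels[i - 1])
    return residuals

def decode_residuals(residuals):
    """Receiver reconstructs each pixel as (X - A) + A using the stored A."""
    pixels = [residuals[0]]
    for r in residuals[1:]:
        pixels.append(pixels[-1] + r)
    return pixels

row = [100, 102, 101, 105, 104, 104]   # neighbouring pixels are similar
res = encode_residuals(row)
print(res)                             # [100, 2, -1, 4, -1, 0]
assert decode_residuals(res) == row    # lossless reconstruction
```

The residuals cluster near zero, so they can be represented with far fewer bits than the raw pixel values.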
Furthermore, the inter-frame correlation shown in Figure 1.3 can also be used for compression. Because the correlation between neighbouring frames is generally stronger than that between pixels within a frame, the achievable compression ratio is greater.
Thus, by exploiting the correlation between pixels within a frame and the correlation between frames, that is, by finding an appropriate reference pixel or reference frame to serve as the predicted value, video compression coding can be achieved.
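Inter-frame prediction in its simplest form uses the previous frame directly as the predicted value, so only the per-pixel differences need to be coded. A minimal sketch (the function name and the tiny 2x3 frames are illustrative assumptions):

```python
def frame_difference(prev_frame, curr_frame):
    """Per-pixel residual between consecutive frames. In a mostly static
    scene almost all residuals are zero, which compresses very well."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(curr_frame, prev_frame)]

prev = [[100, 100, 100], [100, 100, 100]]
curr = [[100, 100, 100], [100, 112, 100]]  # one pixel changed between frames
print(frame_difference(prev, curr))        # [[0, 0, 0], [0, 12, 0]]
```

Real codecs refine this with motion estimation, predicting each block from a displaced region of the reference frame rather than the same position.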
4. Transform coding
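Transform coding, the other half of the hybrid method discussed below, maps a block of pixels into frequency-domain coefficients so that the energy concentrates in a few low-frequency values. A minimal 1-D DCT-II sketch (the function name is an assumption; the standards use a 2-D integer variant of this transform):

```python
import math

def dct_1d(block):
    """Type-II discrete cosine transform of a 1-D signal."""
    n = len(block)
    coeffs = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, x in enumerate(block))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * s)
    return coeffs

# A smooth pixel row: nearly all energy lands in the first (DC) coefficient.
row = [100, 101, 102, 103, 104, 105, 106, 107]
print([round(c, 1) for c in dct_1d(row)])
```

Because most coefficients are near zero for correlated pixels, they can be quantized and coded with few bits, which is the source of the compression gain.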
5. The basic structure of a video coding system
As can be seen from Figure 1.5, the video coding method depends on the source model adopted. If the source model "an image is made up of many pixels" is used, the parameters of the source model are the luminance and chrominance amplitude values of each pixel, and the technique of compression-coding these parameters is called waveform-based coding. If a source model consisting of several objects is used, the parameters of the source model are the shape, texture, and motion of each object, and the technique of compressing these parameters is called content-based coding.
Thus, according to the source model used, video coding falls into two categories: waveform-based coding and content-based coding. Each uses its own compression method to obtain the corresponding pre-quantization parameters, then quantizes those parameters, represents the quantized values with binary codes, and finally applies lossless entropy coding to further reduce the bit rate. Decoding is the inverse process of encoding.
6. The block-based hybrid coding method
Waveform-based coding employs a block-based hybrid coding method that combines predictive coding and transform coding.
To reduce the complexity of coding and make the video encoding operations easier to execute, the hybrid coding method first divides each image into fixed-size blocks, such as 8x8 blocks (8 rows per block, 8 pixels per row) or 16x16 blocks (16 rows per block, 16 pixels per row), and then compression-codes the blocks one by one.
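The block partitioning described above can be sketched in a few lines. A minimal illustration (the function name and the synthetic 16x16 frame are assumptions for the example):

```python
def split_into_blocks(frame, n=8):
    """Partition a frame (a list of pixel rows) into n x n blocks,
    scanning block rows top to bottom, blocks left to right."""
    height, width = len(frame), len(frame[0])
    blocks = []
    for by in range(0, height, n):
        for bx in range(0, width, n):
            blocks.append([row[bx:bx + n] for row in frame[by:by + n]])
    return blocks

# A 16x16 frame splits into four 8x8 blocks.
frame = [[y * 16 + x for x in range(16)] for y in range(16)]
blocks = split_into_blocks(frame, 8)
print(len(blocks), len(blocks[0]), len(blocks[0][0]))  # 4 8 8
```

Each block is then predicted, transformed, and quantized independently, which keeps the per-block arithmetic small and regular.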
H.261, H.263, MPEG-1/2/4, and H.264 are all based on this hybrid coding method and thus belong to waveform-based coding.
"New generation video compression code standard-H.264_AVC" Reading notes 1