Guo Bin, Professor, Department of Television Engineering, Beijing Broadcasting Institute
2. Hierarchical Coding in MPEG-2
Because hierarchical coding (scalable coding) in MPEG-2 exceeds the range supported by the Main Profile's coding algorithm, two profile subsets with hierarchical coding were added: the SNR Scalable Profile and the Spatially Scalable Profile. So-called hierarchical coding divides the entire video data stream into several layers that can be nested step by step. Decoders of different complexity can, according to their own capabilities, extract different layers from the same data stream for decoding, obtaining video signals of different quality, temporal resolution, and spatial resolution. Figure 34 shows a schematic diagram of hierarchical video coding. As the figure shows, the video is encoded in a multi-level scheme; two layers are shown, a base layer and an enhancement layer, each supporting a different video grade. The process is as follows. To achieve multi-definition display, the input video signal is first degraded to a lower definition by reducing the sampling rate in space or time. The degraded video is then encoded at a reduced bit rate to form the base-layer data stream. Next, the base layer is upgraded again by raising the sampling rate in space or time, and the result is used to predict the original input video signal; the prediction error is encoded as the enhancement-layer data stream. If the receiver needs to display the full quality of the video signal, it decodes the base-layer and enhancement-layer data streams together. If the receiver cannot, or need not, display the full quality, it decodes only the base-layer data stream. To meet the bandwidth requirements that transmission channels and storage media impose on different network video services, each layer of the hierarchical coding should be allocated a suitable video bit rate.
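The two-layer process described above (downsample, encode the base layer, upsample, predict, encode the residual) can be sketched in a few lines. This is an illustrative toy on a one-dimensional "video line" with simple 2:1 decimation and linear interpolation standing in for the real filters and for MPEG-2's actual entropy coding; the function names are my own, not from the standard.

```python
# Minimal sketch of two-layer scalable coding (illustrative, not MPEG-2's
# real algorithm): base layer = downsampled signal, enhancement layer =
# prediction error against the upsampled base layer.

def downsample(signal):
    """Halve the sampling rate (base-layer preprocessing)."""
    return signal[::2]

def upsample(signal):
    """Restore the original rate by linear interpolation."""
    out = []
    for i, s in enumerate(signal):
        out.append(s)
        nxt = signal[i + 1] if i + 1 < len(signal) else s
        out.append((s + nxt) / 2)
    return out

def encode_scalable(video):
    base = downsample(video)                 # base-layer "stream"
    prediction = upsample(base)              # base layer upsampled back
    enhancement = [v - p for v, p in zip(video, prediction)]  # residual
    return base, enhancement

def decode_full(base, enhancement):
    """Full quality requires decoding both layers together."""
    prediction = upsample(base)
    return [p + e for p, e in zip(prediction, enhancement)]

video = [10.0, 12.0, 14.0, 13.0, 11.0, 9.0, 8.0, 8.0]
base, enh = encode_scalable(video)
full = decode_full(base, enh)
```

A base-layer-only receiver would simply display `base` (or its interpolation), while a full-quality receiver recovers `video` exactly from the two layers.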
The purposes of hierarchical coding are to provide interoperability between different services and to flexibly support TV receivers with different display capabilities. First, a receiver that cannot, or need not, reproduce the full definition of a video can decode only a subset of the layered data stream and display a lower-quality image with reduced spatial or temporal definition. This is achieved through the hierarchical coding of the SNR Scalable Profile subset: as receiving conditions worsen, image quality degrades gracefully, avoiding the "cliff effect" inherent in digital broadcasting. Second, hierarchically encoding an HDTV source flexibly supports multiple definitions, achieving compatibility between HDTV and SDTV products. This avoids transmitting two separate data streams specially and separately to HDTV and SDTV receivers, i.e., the simulcast method, in which each video program is encoded with different spatial resolutions, frame rates, bit rates, and other parameters and transmitted to the corresponding users, an unnecessary economic burden. This is achieved through the hierarchical coding of the Spatially Scalable Profile subset. In addition, hierarchical coding is also applied to database browsing in media asset management and to multi-definition video playback in multimedia environments.
Hierarchical coding has both advantages and disadvantages. Two advantages: it enables the same data stream to adapt to decoders of different capabilities, improving flexibility and efficiency; and it provides a technical path for video broadcast and communication systems to migrate to higher temporal and spatial resolutions. Two disadvantages: the technique complicates both encoder and decoder, increasing cost; and because the data stream carries multiple coding layers, coding efficiency is reduced.
Advantages and disadvantages notwithstanding, during the standardization of MPEG-2 people still hoped to develop a single general hierarchical coding scheme to satisfy every conceivable application. Some applications demand the lowest device complexity, while others demand the highest possible coding efficiency. This conflict between universality and particularity burst the bubble of a general hierarchical coding scheme; yet it was precisely this bubble that reminded people to face the realities of particular problems and meet the needs of various special applications. As a result, MPEG-2 provides four tools: spatial scalability, temporal scalability, SNR scalability, and data partitioning.
1) Spatial scalability
The starting point of spatial scalability is compatibility between services using images of different sizes. The main method used is spatial compensation. Spatial compensation divides an image into two layers, high and low. The high layer transmits only the difference data between the two layers; the decoded and re-sampled image data of the low-layer data stream serves as the reference image for spatial compensation. The difference data decoded at the high layer is added to the corresponding low-layer image block to obtain the high-layer image. Such an encoded data stream can provide video signals of at least two spatial resolutions: a standard-definition signal (SDTV) and a high-definition signal (HDTV). The hierarchical data stream can be nested in up to four layers (the base layer conforms to the MPEG standard; the others are enhancement layers). MPEG-2 defines two variables in the sequence-layer header, layer_id and scalable_mode, which specify a layer's number and the scalability method used. Currently, spatial scalability provides SDTV from the base layer and HDTV from the enhancement layer. Table 6 shows applications of spatial scalability. To obtain SDTV, each frame of the original video sequence undergoes low-pass filtering and subsampling to form a low-resolution base-layer image sequence, which is independently encoded by MPEG-2 into the base-layer data stream; the base layer thus provides standard-definition SDTV. To obtain HDTV, temporal and spatial prediction is performed on the original video sequence (the reference can be a full-resolution image, or the prediction image formed by interpolating the base-layer image, or a weighted average of the two predictions), and the prediction error is encoded to form the full-resolution enhancement-layer data stream; the enhancement layer thus realizes the high-definition HDTV signal.
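The prediction choice just described, temporal prediction, interpolated base-layer picture, or a weighted average of the two, can be sketched as follows. This is a hedged toy on one image row: the nearest-neighbour 2x interpolation, the weight `w`, and the sample values are all illustrative assumptions, not values from the standard.

```python
# Illustrative sketch of the enhancement-layer prediction in spatial
# scalability: weighted average of a temporal prediction and the
# upsampled (interpolated) base-layer picture. Weight w is hypothetical.

def interpolate2x(row):
    """2x spatial interpolation of a base-layer row (nearest-neighbour)."""
    out = []
    for s in row:
        out.extend([s, s])
    return out

def spatial_prediction(temporal_pred, base_row, w=0.5):
    """Blend spatial (upsampled base) and temporal predictions."""
    spatial = interpolate2x(base_row)
    return [w * s + (1 - w) * t for s, t in zip(spatial, temporal_pred)]

temporal_pred = [10.0, 10.0, 12.0, 12.0]   # full-resolution temporal prediction
base_row = [11.0, 13.0]                    # decoded base-layer row (half res)
pred = spatial_prediction(temporal_pred, base_row)
# the enhancement layer would then encode (original - pred), the residual
```

Setting `w=1.0` reduces this to pure spatial prediction from the base layer, and `w=0.0` to pure temporal prediction, matching the three cases listed in the text.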
2) Temporal scalability
The starting point of temporal scalability is compatibility between video and image services at different frame rates. This method provides video signals with different frame rates but the same spatial resolution. Temporal scalability is implemented in two steps. Step 1 regularly skips some frames or fields of the original video and combines the remaining frames/fields into the base-layer image sequence, which is encoded by MPEG-2 to form the base-layer data stream; because the base layer carries the essential, lower temporal definition, it must be transmitted over a channel with good performance. Step 2 encodes the skipped frames/fields, using the already-encoded base-layer images with motion compensation plus the DCT, to form the full-frame-rate enhancement-layer data stream. With temporal scalability, the base layer can provide interlaced-scan HDTV while the enhancement layer provides progressive-scan HDTV. Because the enhancement layer only adds temporal definition, it can be transmitted over channels with lower performance. Here, base-layer images can be used directly as part of the enhancement-layer images; the enhancement layer may contain no I-frames, being predicted from recently decoded enhancement-layer or base-layer images, and even B-frames in the base layer can serve as reference frames. Table 7 shows applications of temporal scalability.
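The two steps above amount to partitioning the frame sequence and later re-interleaving it. The sketch below shows only that partitioning; in MPEG-2 the skipped frames would of course be coded predictively against the base layer rather than stored verbatim, and the 2:1 skip ratio is an illustrative assumption.

```python
# Illustrative sketch of temporal scalability: regularly skip frames to
# form the base layer; the skipped frames form the enhancement layer.

def split_temporal(frames, keep_every=2):
    """Base layer keeps every keep_every-th frame; the rest are skipped."""
    base = frames[::keep_every]                # low-frame-rate base layer
    enhancement = [f for i, f in enumerate(frames) if i % keep_every != 0]
    return base, enhancement

def merge_temporal(base, enhancement, keep_every=2):
    """Re-interleave both layers to recover the full frame rate."""
    frames, b, e = [], iter(base), iter(enhancement)
    for i in range(len(base) + len(enhancement)):
        frames.append(next(b) if i % keep_every == 0 else next(e))
    return frames

frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
base, enh = split_temporal(frames)
restored = merge_temporal(base, enh)
```

A base-layer-only decoder plays `base` at half the frame rate; a full decoder re-interleaves both layers to restore the original rate.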
3) SNR scalability
The starting point of SNR scalability is compatibility between video and image services of different quality. This method generates, from one image source, two video data streams of different coding quality but the same spatial resolution. SNR scalability is implemented in two steps. Step 1 quantizes the DCT coefficients coarsely; this first quantization forms the base-layer data stream. Step 2 subtracts the coarsely quantized-and-reconstructed coefficients from the original DCT coefficients and quantizes the difference a second time, now finely, to form the enhancement-layer data stream. It follows that the fine quantization of the error DCT coefficients in the enhancement layer is closely tied to the coarse quantization of the DCT coefficients in the base layer; therefore the enhancement layer and the base layer must be decoded together. Table 8 shows applications of SNR scalability.
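The two quantization steps can be worked through numerically. In this hedged sketch the step sizes (16 and 2) and the coefficient values are illustrative, not taken from the MPEG-2 quantization matrices; the point is only that the enhancement layer refines the base layer's coarse reconstruction.

```python
# Illustrative sketch of SNR scalability: coarse quantization of DCT
# coefficients for the base layer, fine quantization of the residual
# for the enhancement layer. Step sizes are hypothetical.

def quantize(coeffs, step):
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [l * step for l in levels]

dct = [103.0, -47.0, 12.0, -3.0]           # original DCT coefficients

base_levels = quantize(dct, step=16)        # Step 1: coarse -> base layer
base_recon = dequantize(base_levels, step=16)

residual = [c - r for c, r in zip(dct, base_recon)]
enh_levels = quantize(residual, step=2)     # Step 2: fine -> enhancement layer

# A decoder holding both layers reconstructs to within the fine step:
full = [b + e for b, e in zip(base_recon, dequantize(enh_levels, step=2))]
```

This also makes the dependence explicit: `enh_levels` is meaningless without `base_recon`, which is why the two layers must be decoded together.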
4) Data Division
The purpose of Data Division is to receive images with slightly less quality when the signal transmission channel conditions and transmission power are limited, so that no image can be received. Therefore, the MPEG-2 uses the data Division
Technology, information that plays an important role in decoding, such as packet header, motion vector, DCT coefficient (especially low frequency DCT coefficient of video), is transmitted in a channel with good error code performance. Decoding is not very important,
For example, the DCT coefficient of the audio is transmitted in a channel with poor error code performance. Of course, this solution can be implemented only when there are two channels for transmitting and storing video signals. In fact, using the overview of priority
You can also divide data. The encoded data stream is divided into two parts with different priorities. For example, the header information, motion vector, quantization parameter, and low frequency DCT coefficient in the encoded data stream are classified as high priority.
(High Priority partition), which divides the high frequency DCT coefficient and the audio DCT coefficient in the encoded data stream into low priority
Partition. This method of dividing data by priority can minimize the image damage caused by channel noise and cell loss.
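The priority split described above can be sketched as follows. The breakpoint index of 8 and the dictionary layout are illustrative assumptions; in MPEG-2 the priority breakpoint is signalled in the bitstream, not fixed.

```python
# Illustrative sketch of priority-based data partitioning: headers,
# motion vectors and low-frequency DCT coefficients go to the
# high-priority partition; high-frequency coefficients go to the
# low-priority partition. The breakpoint value is hypothetical.

def partition(zigzag_coeffs, headers, motion_vectors, breakpoint=8):
    high_priority = {
        "headers": headers,
        "motion_vectors": motion_vectors,
        "low_freq": zigzag_coeffs[:breakpoint],   # robust channel
    }
    low_priority = {
        "high_freq": zigzag_coeffs[breakpoint:],  # tolerates a poorer channel
    }
    return high_priority, low_priority

coeffs = list(range(64, 0, -1))        # 64 coefficients in zig-zag order
hp, lp = partition(coeffs, headers={"pic_type": "P"}, motion_vectors=[(1, -2)])
```

If the low-priority channel is lost, the decoder still has everything needed to reconstruct a blurrier but recognizable picture from `hp` alone.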
As can be seen from the above, to resolve the contradiction between universality and particularity, MPEG-2 took two measures. One is the adoption of the profile and level concepts, used to describe different sets of coding parameters. The other is the four video coding tools, temporal scalability, spatial scalability, SNR scalability, and data partitioning, which allow part of the data stream to be decoded for a lower image resolution or all of it for full quality. These measures make MPEG-2 a truly "generic standard".
In short, MPEG-2 can efficiently compress and encode image signals across a wide range of resolutions and output bit rates, and has become a true international standard. It will certainly be widely used in the field of broadcasting and television. Table 9 shows the digital video bandwidth of various applications, and Table 10 shows some digital systems and their parameters, for reference. (End of full text)