Introduction to the H.264 Standard


JVT (Joint Video Team) was established in Pattaya, Thailand, in December 2001. It consists of video coding experts from two international standardization organizations, ITU-T and ISO. The goal of JVT is to develop a new video coding standard that achieves a high compression ratio, high image quality, and good network adaptability. The work of JVT has been accepted by ITU-T, where the new video compression coding standard is called H.264; it has also been accepted by ISO as the AVC (Advanced Video Coding) standard, which is Part 10 of MPEG-4.

The H.264 standard defines three profiles:

  Baseline profile (a simpler version with a wide range of applications);

  Main profile (adds a number of technical measures to improve image quality and increase the compression ratio; suitable for SDTV, HDTV, DVD, etc.);

  Extended profile (suitable for all kinds of streaming video transmission over networks).

Compared with H.263 and MPEG-4, H.264 not only saves a significant percentage of the bit rate, but also provides better support for network transmission. It introduces an IP-packet-oriented encoding mechanism that facilitates packet transmission in networks and supports streaming video. With its strong error-resilience features, it can cope with video transmission over wireless channels with high packet loss rates and severe interference. It supports layered coding and transmission under different network resource conditions, yielding smooth image quality. It can adapt to video transmission over different networks and has good network affinity.

The main goal of the H.264 standard is to provide better image quality at the same bandwidth compared with other existing video coding standards. Compared with previous international standards such as H.263 and MPEG-4, its most significant technical advantages fall into the following four areas:

1. Each video frame is divided into blocks of pixels, so the encoding of the video frame can proceed at the block level.

2. Spatial redundancy is exploited: some of the original blocks of a video frame are predicted, transformed, quantized, and entropy coded (variable-length coding).

3. Temporal redundancy between blocks of successive frames is exploited, so that only the parts that have changed between consecutive frames are encoded. Motion estimation and motion compensation are used to accomplish this. For a given block, a search is performed over one or more already-encoded frames to determine the motion vector of the block, which is then used to predict the block in subsequent encoding and decoding (a minimal block-matching sketch follows this list).

4. Residual spatial redundancy is removed by encoding the remaining difference blocks in the video frame: the difference between each source block and its prediction block is again transformed, quantized, and entropy coded.
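
To make item 3 concrete, here is a minimal block-matching sketch in Python (an illustration, not code taken from the standard): it searches a window of a previously encoded reference frame for the displacement that minimizes the sum of absolute differences (SAD) of a 16x16 block. The block size, search range, and test frames are assumptions chosen for the example.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(block_a.astype(int) - block_b.astype(int)).sum())

def full_search_mv(cur, ref, top, left, block=16, search=8):
    """Exhaustive integer-pixel search for the motion vector of one block.

    cur, ref : 2-D numpy arrays (luma planes of current and reference frame)
    top, left: position of the block in the current frame
    Returns (dy, dx, best_sad)."""
    cur_blk = cur[top:top + block, left:left + block]
    best = (0, 0, sad(cur_blk, ref[top:top + block, left:left + block]))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= ref.shape[0] - block and 0 <= x <= ref.shape[1] - block:
                cost = sad(cur_blk, ref[y:y + block, x:x + block])
                if cost < best[2]:
                    best = (dy, dx, cost)
    return best

# toy usage: the current frame is the reference shifted by (2, 3) pixels
ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))
print(full_search_mv(cur, ref, 16, 16))   # expected motion vector (-2, -3)
```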

Building on the essence of previous compression techniques while offering many advantages that other compression technologies cannot match, H.264 has the following characteristics and advantages:

1. Low bit rate: compared with MPEG-2 and MPEG-4 ASP compression, at the same image quality the amount of data produced by H.264 compression is only 1/8 that of MPEG-2 and 1/3 that of MPEG-4. Obviously, adopting this compression technology greatly reduces users' download time and data traffic charges.

2. High-quality images: H.264 can deliver high-quality (DVD-quality) images continuously and smoothly.

3. Strong fault tolerance: H.264 provides the tools needed to handle errors such as packet loss that occur in unstable network environments.

4. Network adaptability: H.264 provides a Network Abstraction Layer (NAL) that allows its files to be transmitted easily over different networks (e.g. the Internet, CDMA, GPRS, WCDMA, CDMA2000, etc.).

I. The H.264 video compression system

The H.264 compression system consists of two parts: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). The VCL includes the VCL encoder and VCL decoder; its main function is the compression encoding and decoding of video data, and it contains compression units such as motion compensation, transform coding, and entropy coding. The NAL provides a unified, network-independent interface for the VCL and is responsible for packaging the video data to be transported over the network. It adopts a unified data format that includes single-byte header information and multiple bytes of video data, as well as framing, logical-channel signalling, timing information, end-of-sequence signals, and so on. The header contains a storage flag and a type flag: the storage flag indicates whether the current data belongs to a reference frame, and the type flag indicates the type of the image data.
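
To illustrate the single-byte header just described, the sketch below splits the first byte of an H.264 NAL unit into its fields as defined by the standard (forbidden_zero_bit, nal_ref_idc as the "storage flag", and nal_unit_type as the "type flag"); the helper function itself is only an illustrative assumption.

```python
def parse_nal_header(first_byte: int) -> dict:
    """Split the one-byte NAL unit header into its three fields.

    forbidden_zero_bit (1 bit)  - must be 0 in a conforming stream
    nal_ref_idc        (2 bits) - the 'storage flag': non-zero means the
                                  payload is used as a reference
    nal_unit_type      (5 bits) - the 'type flag': kind of payload
                                  (e.g. 1 = non-IDR slice, 5 = IDR slice,
                                   7 = SPS, 8 = PPS)
    """
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x1,
        "nal_ref_idc":        (first_byte >> 5) & 0x3,
        "nal_unit_type":      first_byte & 0x1F,
    }

# example: 0x67 -> ref_idc 3, type 7 (a sequence parameter set)
print(parse_nal_header(0x67))
```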

The VCL can transmit coding parameters that are adjusted to the current network conditions.



II. Characteristics of H.264

Like H.261 and H.263, H.264 uses a hybrid coding structure that combines DCT-like transform coding with DPCM differential coding. At the same time, new coding methods are introduced within this hybrid framework, which improves coding efficiency and brings it closer to practical application. There are no cumbersome options; instead it takes a simple "back to basics" approach, yet it offers better compression performance than H.263++ as well as the ability to adapt to a variety of channels.

It has a wide range of applications and can meet the needs of video applications at a variety of rates and in a variety of settings, with good resilience to errors and packet loss.

The baseline part of H.264 requires no royalty payments and is open in nature. It adapts well to IP and wireless networks, which is of great significance for today's transmission of multimedia information over the Internet and of broadband information over mobile networks.

At the system level, H.264 proposes a new concept: a conceptual separation between the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). The former is the core compressed representation of the video content, while the latter is the representation used for delivery over a particular type of network. This structure facilitates the encapsulation of information and better priority control over it. The system block diagram of H.264 is shown below:



Although the basic coding structure is similar to that of H.261 and H.263, it has been improved in many respects, as listed below.

1. A variety of better motion estimation methods
High-precision estimation
Half-pixel motion estimation is used in H.263, while H.264 goes further and adopts motion estimation with 1/4-pixel or even 1/8-pixel precision. That is, the displacement of the true motion vector can be expressed in units of 1/4 or even 1/8 of a pixel. Obviously, the higher the precision of the motion vector displacement, the smaller the inter-frame residual, the lower the transmission bit rate, and thus the higher the compression ratio.

In H.264, a 6-tap FIR filter is used for interpolation to obtain the half-pixel position values. Once the half-pixel values are available, the quarter-pixel values can be obtained by linear interpolation.

For the 4:2:0 video format, 1/4-pixel precision for the luminance signal corresponds to 1/8-pixel motion vectors for the chroma component, so 1/8-pixel interpolation of the chroma signal is required.

Theoretically, each doubling of the motion compensation precision (for example, from integer-pixel to 1/2-pixel precision) can yield a coding gain of about 0.5 bit/sample. In practice, however, it was found that beyond 1/8-pixel motion vector precision the system shows essentially no further gain, so H.264 adopted the motion vector mode with 1/4-pixel precision rather than 1/8-pixel precision.
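
The one-dimensional sketch below illustrates the interpolation idea described above, assuming the (1, -5, 20, 20, -5, 1)/32 six-tap filter used by H.264 for half-pixel positions and simple averaging for quarter-pixel positions; rounding and boundary handling are simplified compared with the standard.

```python
import numpy as np

# six-tap filter used for half-pixel positions (coefficients sum to 32)
HALF_PEL_TAPS = np.array([1, -5, 20, 20, -5, 1])

def half_pel(samples, i):
    """Half-pixel value between integer positions i and i+1 (1-D sketch).
    samples must provide positions i-2 .. i+3; boundary handling is omitted."""
    window = samples[i - 2:i + 4].astype(int)
    value = int(round((window * HALF_PEL_TAPS).sum() / 32.0))
    return max(0, min(255, value))            # clip to the 8-bit range

def quarter_pel(samples, i):
    """Quarter-pixel value between position i and the half-pel to its right,
    obtained by linear interpolation (averaging), as described above."""
    return (int(samples[i]) + half_pel(samples, i) + 1) // 2

row = np.array([10, 12, 20, 40, 80, 120, 130, 128], dtype=np.uint8)
print(half_pel(row, 3), quarter_pel(row, 3))
```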

Multiple macroblock partition modes
In prediction, a macroblock (MB) can be divided into blocks of 7 different sizes. This flexible, fine-grained macroblock partitioning better matches the shapes of actual moving objects in the image, so each macroblock can contain 1, 2, 4, 8, or 16 motion vectors.

Multiple reference frame estimation
H.264 allows motion estimation over multiple reference frames: several recently coded frames are kept in the encoder's buffer, and the encoder selects from them the reference frame that gives the better coding result and signals which frame is used for the prediction. This gives a better coding result than using only the most recently coded frame as the prediction reference.

2. Small-size 4x4 integer transform
In earlier video compression coding, the usual transform unit was an 8x8 block. H.264 uses a smaller 4x4 block; as the transform block becomes smaller, moving objects are partitioned more accurately, the amount of computation in the image transform is lower, and the blocking errors at the edges of moving objects are greatly reduced.

When the image contains large smooth areas, in order to avoid gray-level discontinuities between blocks caused by the small transform size, H.264 applies a second-level 4x4 transform to the DC coefficients of the 16 4x4 luminance blocks of a macroblock, and a 2x2 transform to the DC coefficients of the 4 4x4 chroma blocks (one DC coefficient from each small block, 4 in total).

H.264 not only makes the image transform block smaller, but also makes the transform an integer operation rather than a real-valued one, so that the transform and inverse transform have exactly the same precision in the encoder and the decoder and there is no "inverse transform mismatch error".
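
As an illustration, the sketch below applies the well-known 4x4 core transform matrix of H.264 using integer arithmetic only; the scaling that the standard folds into quantization is replaced here by an explicit matrix inverse, used purely to verify that the block is reconstructed exactly.

```python
import numpy as np

# core matrix of the 4x4 forward integer transform used in H.264
CF = np.array([[1,  1,  1,  1],
               [2,  1, -1, -2],
               [1, -1, -1,  1],
               [1, -2,  2, -1]])

def forward_4x4(block):
    """Forward 4x4 integer transform: Y = Cf * X * Cf^T (integers only)."""
    return CF @ block @ CF.T

def inverse_4x4(coeff):
    """Mathematical inverse, used here only to check perfect reconstruction.
    In the real codec the required scaling is merged into (de)quantization,
    so encoder and decoder still need nothing but integer arithmetic."""
    ci = np.linalg.inv(CF)
    return ci @ coeff @ ci.T

x = np.random.randint(-128, 128, (4, 4))
y = forward_4x4(x)                        # integer coefficients
print(np.allclose(inverse_4x4(y), x))     # True: no inverse-transform mismatch
```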

3. More accurate intra-frame prediction

In H.264, each pixel of a 4x4 block can be intra-predicted using a different weighted combination of the nearest previously encoded pixels.

Intra-frame coding is used to reduce the spatial redundancy of an image. To improve the efficiency of intra-frame coding, the spatial correlation between adjacent macroblocks within a given frame is fully exploited, since adjacent macroblocks usually have similar properties. Therefore, when encoding a given macroblock, a prediction is first formed from the surrounding macroblocks (typically from those above and to the left, since they have already been encoded), and then only the difference between the predicted value and the actual value is coded. Compared with coding the frame directly, this greatly reduces the bit rate.

H.264 provides prediction for 4x4 pixel blocks, including 1 DC prediction mode and 5 directional prediction modes, as shown in the figure below. In the figure, the 9 pixels A to I in the adjacent blocks have already been encoded and can be used for prediction. If, for example, mode 4 is chosen, then the 4 pixels A, B, C, D are predicted to be equal to the value of E, and the 4 pixels E, F, G, H are predicted to be equal to the value of F. For flat areas of the image that contain little spatial detail, H.264 also supports 16x16 intra-frame coding.
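
The sketch below illustrates two representative 4x4 intra prediction modes (vertical and DC), assuming that the row above and the column to the left of the block have already been reconstructed; the helper names and sample values are illustrative only.

```python
import numpy as np

def intra4x4_vertical(top):
    """Vertical mode: each column copies the reconstructed pixel above it."""
    return np.tile(np.asarray(top, dtype=np.uint8), (4, 1))

def intra4x4_dc(top, left):
    """DC mode: every pixel is the rounded mean of the 8 neighbouring pixels."""
    dc = (int(np.sum(top)) + int(np.sum(left)) + 4) >> 3
    return np.full((4, 4), dc, dtype=np.uint8)

top = [100, 102, 104, 106]        # reconstructed row above the block
left = [98, 99, 101, 103]         # reconstructed column to the left
pred = intra4x4_dc(top, left)
# the encoder then codes only the residual: actual block minus pred
print(pred)
```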

4. Inter-frame predictive coding

Inter-frame predictive coding uses the temporal redundancy in consecutive frames for motion estimation and compensation. The motion compensation of H.264 supports most of the key features of earlier video coding standards and flexibly adds further capabilities. Besides supporting P frames and B frames, it supports a new inter-stream switching frame, the SP frame. When a bit stream contains SP frames, the decoder can switch quickly between streams with similar content but different bit rates, and random access and fast playback modes are also supported.

The motion estimation of H.264 has the following characteristics.

(1) Macroblock partitions of different sizes and shapes

Motion compensation for each 16x16 pixel macroblock can use partitions of different sizes and shapes; H.264 supports the 7 partition modes shown in the accompanying figure. Motion compensation with the smaller block modes improves the handling of motion detail, reduces blocking artifacts, and improves image quality.

(2) High-precision sub-pixel motion compensation

H.263 uses motion estimation with half-pixel precision, while H.264 can use motion estimation with 1/4-pixel or even 1/8-pixel precision. For the same required accuracy, the residual after motion estimation with 1/4- or 1/8-pixel precision is smaller than that of H.263 with half-pixel precision, so the bit rate needed for inter-frame coding is lower at the same accuracy.

(3) Multi-frame prediction

With optional multi-frame prediction, 5 different reference frames can be selected for inter-frame prediction, which provides better error resilience and improves video image quality. This feature is mainly useful in the following situations: periodic motion, translational motion, and the camera switching back and forth between two different scenes.
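
As a sketch of the idea (block size, search range, and test frames are assumptions for the example), the helper below runs an integer-pixel search in every buffered reference frame and keeps the frame that gives the smallest SAD; a real encoder would also weigh the cost of signalling the reference index and motion vector.

```python
import numpy as np

def sad(a, b):
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def best_reference(cur, refs, top, left, block=16, search=8):
    """Search every buffered reference frame and return the combination
    (reference_index, dy, dx, sad) with the smallest matching error."""
    cur_blk = cur[top:top + block, left:left + block]
    best = None
    for idx, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                if 0 <= y <= ref.shape[0] - block and 0 <= x <= ref.shape[1] - block:
                    cost = sad(cur_blk, ref[y:y + block, x:x + block])
                    if best is None or cost < best[3]:
                        best = (idx, dy, dx, cost)
    return best

refs = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(5)]
cur = np.roll(refs[3], shift=(1, 1), axis=(0, 1))   # best match lives in frame 3
print(best_reference(cur, refs, 16, 16))            # -> (3, -1, -1, 0)
```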

(4) Deblocking filter

H.264 defines an adaptive deblocking filter that processes the horizontal and vertical block edges within the prediction loop, greatly reducing blocking artifacts.

5. Quantization

H.264 provides 32 different quantization step sizes, similar in number to the quantization step sizes of H.263, but in H.264 the step size progresses by about 12.5% from one level to the next rather than increasing by a fixed constant.
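
The few lines below only illustrate the arithmetic of such a progressive step size: growing by about 12.5% per level means the step size roughly doubles every 6 levels. The starting value is an arbitrary assumption for the example, not a value from the standard.

```python
# each quantization level increases the step size by roughly 12.5%,
# i.e. a geometric progression instead of a fixed increment
base_step = 0.625                 # illustrative starting value only
ratio = 1.125                     # +12.5% per level

steps = [base_step * ratio ** qp for qp in range(32)]    # 32 levels, as stated above
print(f"level 0 : {steps[0]:.3f}")
print(f"level 6 : {steps[6]:.3f}   (about 2x level 0, since 1.125**6 ~= {ratio**6:.2f})")
print(f"level 31: {steps[31]:.1f}")
```
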
There are two ways of reading out the transform coefficients in H.264: zigzag scanning and double scanning. In most cases the simple zigzag scan is used; double scanning is used only in blocks with smaller quantization step sizes, where it helps improve coding efficiency.
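
For illustration, here is a small helper (not taken from the standard text) that generates the zigzag scan order of an n x n coefficient block:

```python
def zigzag_order(n=4):
    """Zigzag scan order of an n x n coefficient block: walk the
    anti-diagonals, alternating direction from one diagonal to the next."""
    order = []
    for d in range(2 * n - 1):
        diag = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        order.extend(reversed(diag) if d % 2 == 0 else diag)
    return order

print(zigzag_order(4))
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2), ...]
```
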
6. Entropy coding

The last step of the video encoding process is entropy coding. There are two entropy coding methods in H.264:

1. Unified VLC (UVLC: Universal VLC). UVLC uses a single code table for encoding, and the decoder can easily recognize the prefix of each codeword, so UVLC can quickly regain synchronization after a bit error.

In H.263 and other standards, different VLC code tables are used depending on the type of data to be encoded (transform coefficients, motion vectors, and so on). The UVLC code table in H.264 provides a simple way to use a single, uniform variable-length code table regardless of the type of data a symbol represents. Its advantage is simplicity; its disadvantage is that the single code table is derived from one probabilistic model and ignores the correlations between coded symbols, so its performance is not very good at medium and high bit rates.
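
The UVLC of H.264 is built on exponential-Golomb codes. The sketch below (with illustrative function names) shows the construction: each codeword is a run of M zeros, a 1, and then M information bits, and it is this uniform prefix structure that makes resynchronization after a bit error easy.

```python
def uvlc_encode(code_num: int) -> str:
    """Exp-Golomb codeword for an unsigned value: M leading zeros, then the
    (M+1)-bit binary representation of code_num + 1."""
    info = code_num + 1
    m = info.bit_length() - 1
    return "0" * m + format(info, "b")

def uvlc_decode(bits: str) -> tuple[int, int]:
    """Decode one codeword from the front of a bit string.
    Returns (code_num, number_of_bits_consumed)."""
    m = 0
    while bits[m] == "0":
        m += 1
    info = int(bits[m:2 * m + 1], 2)
    return info - 1, 2 * m + 1

for k in range(6):
    print(k, uvlc_encode(k))
# 0 -> 1, 1 -> 010, 2 -> 011, 3 -> 00100, 4 -> 00101, 5 -> 00110
```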

2. Context-Adaptive Binary Arithmetic Coding (CABAC). Its coding performance is somewhat better than that of UVLC, but its complexity is higher.

Arithmetic coding uses probability models for all syntax elements (transform coefficients, motion vectors) on both the encoder and the decoder side. To improve the efficiency of the arithmetic coding, the underlying probability model is adapted to the statistical characteristics of the video frames through a process of context modelling. Context modelling provides conditional probability estimates for the coded symbols; by selecting the probability model according to already-coded symbols adjacent to the current one, the correlation between symbols can be removed. Different syntax elements usually maintain separate models, each with an appropriate context model.
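
The sketch below illustrates only the context-modelling idea, not the CABAC coding engine itself: each context keeps adaptive counts of the binary symbols coded so far, and the ideal code length -log2(p) is accumulated to show the effect of adaptation. The class, the context choice, and the input data are assumptions made for the example.

```python
import math
from collections import defaultdict

class ContextModel:
    """Toy context-adaptive probability model (the arithmetic coding
    engine of CABAC is omitted; only the ideal bit cost is accumulated)."""

    def __init__(self):
        self.counts = defaultdict(lambda: [1, 1])   # smoothed [zeros, ones] per context
        self.bits = 0.0

    def encode_bin(self, context, symbol):
        zeros, ones = self.counts[context]
        p = (ones if symbol else zeros) / (zeros + ones)
        self.bits += -math.log2(p)                  # ideal arithmetic-code cost
        self.counts[context][symbol] += 1           # adapt the model to the data

model = ContextModel()
# the context is chosen from an adjacent, already-coded symbol (here: the previous bit)
data, prev = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0], 0
for bit in data:
    model.encode_bin(("prev", prev), bit)
    prev = bit
print(f"estimated cost: {model.bits:.1f} bits for {len(data)} symbols")
```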

III. Performance advantages

The encoding performance of H.264, MPEG-4, and H.263++ was compared at the following 6 test points: 32 kbit/s at 10 f/s with QCIF; 64 kbit/s at 15 f/s with QCIF; 128 kbit/s at 15 f/s with CIF; 256 kbit/s at 15 f/s with QCIF; 512 kbit/s at 30 f/s with CIF; and 1024 kbit/s at 30 f/s with CIF. The test results indicate that the PSNR of H.264 is clearly superior to that of MPEG-4 and H.263++: on average, the PSNR of H.264 is 2 dB higher than that of MPEG-4 and 3 dB higher than that of H.263++.
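
For reference, the PSNR quoted above is computed as follows (a standard definition; the test images in this snippet are synthetic and purely illustrative):

```python
import numpy as np

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two 8-bit images."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

orig = np.random.randint(0, 256, (288, 352))             # CIF-sized luma plane
recon = np.clip(orig + np.random.randint(-3, 4, orig.shape), 0, 255)
print(f"{psnr(orig, recon):.2f} dB")
```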

IV. A new fast motion estimation algorithm
UMHexagonS (covered by a Chinese patent), short for "Unsymmetrical-cross Multi-Hexagon-grid Search", is a new integer-pixel motion estimation algorithm that can save more than 90% of the motion estimation computation. Because its computational cost is very low while it maintains good rate-distortion performance when coding high-bit-rate image sequences with large motion, it has been formally adopted by the H.264 standard.
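
The full UMHexagonS algorithm has several stages (starting-point prediction, an asymmetric cross search, a multi-level hexagon-grid search, and refinement). The sketch below implements only a simplified hexagon-grid search with a small diamond refinement, as an illustration of the general idea rather than the patented algorithm; the test image and parameters are assumptions.

```python
import numpy as np

HEX = [(-2, 0), (-1, 2), (1, 2), (2, 0), (1, -2), (-1, -2)]   # large hexagon pattern
DIAMOND = [(-1, 0), (0, 1), (1, 0), (0, -1)]                  # small final refinement

def sad(a, b):
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def hex_search(cur, ref, top, left, block=16, max_iter=16):
    """Hexagon-grid integer-pixel search started from the (0, 0) predictor."""
    cur_blk = cur[top:top + block, left:left + block]

    def cost(dy, dx):
        y, x = top + dy, left + dx
        if 0 <= y <= ref.shape[0] - block and 0 <= x <= ref.shape[1] - block:
            return sad(cur_blk, ref[y:y + block, x:x + block])
        return float("inf")

    best, best_cost = (0, 0), cost(0, 0)
    for _ in range(max_iter):                    # move the hexagon until no improvement
        cands = [((best[0] + dy, best[1] + dx), cost(best[0] + dy, best[1] + dx))
                 for dy, dx in HEX]
        mv, c = min(cands, key=lambda t: t[1])
        if c >= best_cost:
            break
        best, best_cost = mv, c
    for dy, dx in DIAMOND:                       # one small refinement pass
        c = cost(best[0] + dy, best[1] + dx)
        if c < best_cost:
            best, best_cost = (best[0] + dy, best[1] + dx), c
    return best, best_cost

# a smooth synthetic image, so the SAD surface actually guides the search
yy, xx = np.mgrid[0:64, 0:64]
ref = (128 + 60 * np.sin(xx / 5.0) * np.cos(yy / 7.0)).astype(np.uint8)
cur = np.roll(ref, shift=(3, -2), axis=(0, 1))
print(hex_search(cur, ref, 24, 24))              # lands at or very near (-3, 2)
```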

H.264/MPEG-4 Part 10, jointly developed by ITU-T and ISO, is likely to be accepted as a unified standard for broadcasting, communications, and storage media (CD, DVD), and is very likely to become the standard for broadband interactive new media. China's own source coding standard has not yet been established; the work of closely following the development of H.264 and establishing China's source coding standard is being stepped up.

The H.264 standard raises moving-image compression technology to a higher level, and providing high-quality image transmission at lower bandwidths is a highlight of its applications. The spread of these applications places high demands on video terminals, gatekeepers, gateways, MCUs, and other system components, and will strongly drive the continuous improvement of video conferencing software and hardware in every respect.


http://kb.cnblogs.com/page/168157/

