"H.264/AVC Video Codec Technology detailed" video tutorial has been in the "CSDN College" on-line, the video details of the background, standard protocol and implementation, and through a practical project in the form of the standard of the resolution and implementation of H. A, Welcome to watch! "The paper came to the end of shallow, I know this matter to preach", only by themselves in accordance with the standard document in the form of code to operate, in order to video compression coding standard ideas and methods have a profound understanding and experience! Link Address: H.264/AVC Video codec technology detailed GitHub code address: Click here context Adaptive variable length encoding (context-based Adaptive Variable Length Coding, CAVLC) 1. Introduction
In earlier posts and videos in this series, we learned that entropy coding is a lossless coding method that compresses data by exploiting the statistical redundancy of information. We also discussed the basic principles of entropy coding and the Exponential-Golomb codes used to parse syntax elements in H.264, in both theory and practice:
- "H.264/AVC Video Codec Technology Detailed" 7, Entropy Coding Algorithm (1): Basic Knowledge
- "H.264/AVC Video Codec Technology Detailed" 8, Entropy Coding Algorithm (2): Basic Methods of Entropy Coding, Exponential-Golomb Codes in H.264
Most of the structures we have parsed so far (NAL unit, slice header, etc.) are coded with fixed-length codes or Exponential-Golomb codes. Prediction residuals, however, occupy a large share of the bitstream and call for algorithms with higher compression ratios, such as CAVLC and CABAC. This article discusses the former; the following sections cover it in detail.
2. Fundamentals of CAVLC
CAVLC stands for Context-based Adaptive Variable Length Coding. "Context-adaptive" means that, unlike Exponential-Golomb coding with its fixed mapping between values and codewords, CAVLC is a dynamic coding algorithm. Its compression ratio is therefore substantially higher than that of the fixed-mapping variable-length code UVLC.
CAVLC is mainly used to code prediction residuals in H.264. The second post in this series showed the H.264 encoding flow chart, in which the input to entropy coding is the coefficient matrix obtained by transforming and quantizing the intra/inter prediction residual. Taking a 4×4 coefficient matrix as an example, after transform and quantization the matrix typically has the following characteristics:
- The transformed and quantized matrix is usually sparse, that is, most of its entries are 0. CAVLC compresses runs of consecutive zero coefficients efficiently through run-length coding;
- After zig-zag scanning, the highest-frequency non-zero coefficients usually form a string of ±1 values. CAVLC codes these high-frequency components efficiently by transmitting the number of consecutive +1 or -1 values;
- The magnitude of non-zero coefficients is usually larger near the DC component and smaller in the high-frequency part;
- The number of non-zero coefficients in a block is correlated with that of its neighboring blocks.
Because of characteristics 3 and 4 above, CAVLC codes with different code tables depending on the position of the coefficient being coded within the matrix and on information from neighboring blocks. This is what "context-adaptive" in the name refers to.
3. Coding process for CAVLC
In CAVLC, entropy coding does not code individual symbols, as Huffman coding does, but codes a coefficient matrix as a whole. Suppose we want to CAVLC-encode the following transform coefficient block:
{  3, 2, -1, 0,
   1, 0,  1, 0,
  -1, 0,  0, 0,
   0, 0,  0, 0 }
Before a 4×4 transform coefficient matrix can be CAVLC-encoded, it must be scanned to convert the two-dimensional matrix into a one-dimensional array. As mentioned in the previous section, the scan follows zig-zag order. Numbering the matrix positions 0 to 15 row by row, the positions are visited in the order 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15.
After the scan, the transform coefficients are therefore rearranged as:
[3, 2, 1, -1, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
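The scan step can be sketched in a few lines of Python. This is only an illustration: the names `ZIGZAG_4X4` and `zigzag_scan` are hypothetical, and the block is assumed to be given as 16 values in row-by-row order.

```python
# Row-by-row positions visited by the 4x4 zig-zag scan.
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def zigzag_scan(block):
    """Reorder 16 row-by-row coefficients into zig-zag order."""
    if len(block) != 16:
        raise ValueError("expected 16 coefficients")
    return [block[i] for i in ZIGZAG_4X4]


block = [ 3, 2, -1, 0,
          1, 0,  1, 0,
         -1, 0,  0, 0,
          0, 0,  0, 0]
scanned = zigzag_scan(block)
print(scanned)  # [3, 2, 1, -1, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```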
The following important syntax elements are involved in the coding process:
- Number of non-zero coefficients (TotalCoeffs): range [0, 16]; the number of elements in the current coefficient matrix that are not 0;
- Number of trailing coefficients (TrailingOnes): range [0, 3]; the number of highest-frequency coefficients whose value is ±1. There are at most 3 trailing coefficients; if more qualify, only the last 3 are treated as trailing coefficients and the others are treated as ordinary non-zero coefficients;
- Sign of each trailing coefficient: 1 bit, where 0 means + and 1 means -;
- Current block value (NumberCurrent): used to select the code table, computed from the numbers of non-zero coefficients in the adjacent blocks above and to the left. Writing the current block value as NC, the count for the upper adjacent block as NA, and the count for the left adjacent block as NB, the formula is NC = Round((NA + NB)/2); for chroma DC coefficients, NC = -1;
- Magnitude of the ordinary non-zero coefficients (level): each magnitude is coded as a prefix and a suffix. Levels are coded in reverse order, starting from the highest-frequency non-zero coefficient;
- Number of zeros before the last non-zero coefficient (TotalZeros);
- Number of zeros before each non-zero coefficient (RunBefore): coded in reverse order, starting from the highest-frequency non-zero coefficient. The zeros before the last coefficient in this order (that is, the lowest-frequency non-zero coefficient) are not coded, and once no zero coefficients remain to be coded, coding stops.
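As a rough sketch, the values of these syntax elements can be derived from the zig-zag-ordered array as follows. The function names `cavlc_stats` and `number_current` are hypothetical. `number_current` assumes both neighboring blocks are available and that Round rounds halves up, written here as `(NA + NB + 1) >> 1`.

```python
def cavlc_stats(coeffs):
    """Return (TotalCoeffs, TrailingOnes, TotalZeros, run_before list).

    `coeffs` is in zig-zag order; the run_before list is ordered from the
    highest-frequency non-zero coefficient to the lowest-frequency one.
    """
    nz = [i for i, c in enumerate(coeffs) if c != 0]  # non-zero positions
    total_coeffs = len(nz)

    # Count +/-1 values from the highest frequency downwards, at most 3.
    trailing_ones = 0
    for i in reversed(nz):
        if abs(coeffs[i]) != 1 or trailing_ones == 3:
            break
        trailing_ones += 1

    # Zeros located before the last non-zero coefficient.
    total_zeros = (nz[-1] + 1 - total_coeffs) if nz else 0

    # Consecutive zeros immediately before each non-zero coefficient.
    runs, prev = [], -1
    for i in nz:
        runs.append(i - prev - 1)
        prev = i
    runs.reverse()
    return total_coeffs, trailing_ones, total_zeros, runs


def number_current(na, nb):
    """NC = Round((NA + NB) / 2), assuming halves round up."""
    return (na + nb + 1) >> 1


scanned = [3, 2, 1, -1, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(cavlc_stats(scanned))  # (6, 3, 2, [1, 1, 0, 0, 0, 0])
```

Note that the fourth ±1 value in the example (the coefficient 1 at position 2) is not counted as a trailing coefficient because of the limit of 3.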
Of these types of data, the level of a non-zero coefficient is the most complex to encode. The main steps are:
- Determine the value of suffixLength:
  - Initialization: suffixLength is normally initialized to 0; when TotalCoeffs is greater than 10 and TrailingOnes is less than 3, it is initialized to 1;
  - Update: after the first level has been encoded, a suffixLength of 0 is raised to 1; after each level, if the magnitude of the level just encoded is greater than the threshold 3 << (suffixLength - 1), suffixLength is incremented by 1;
- Convert the signed level value to an unsigned levelCode:
  - If level > 0, levelCode = (level << 1) - 2;
  - If level < 0, levelCode = -(level << 1) - 1;
- Encode level_prefix: level_prefix = levelCode / (1 << suffixLength) using integer division; the mapping from level_prefix to its bit string is given in Table 9-6 of the standard;
- Determine the length of the suffix: the suffix length levelSuffixSize is normally equal to suffixLength, with the following exceptions:
  - When level_prefix = 14 and suffixLength = 0, levelSuffixSize = 4;
  - When level_prefix = 15, levelSuffixSize = 12;
- Compute the value of level_suffix: level_suffix = levelCode % (1 << suffixLength);
- Encode level_suffix using levelSuffixSize bits.
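The level coding steps can be sketched as below. This is a simplified illustration, not a complete encoder: the name `encode_levels` is hypothetical, the escape cases (level_prefix of 14 with suffixLength 0, or level_prefix of 15 and above) are rejected rather than implemented, and the prefix is written as level_prefix zeros followed by a 1, which is how Table 9-6 maps prefixes to bits. Two details come from the standard rather than from the steps listed above and should be read as assumptions of this sketch: the first level has 2 subtracted from its levelCode when TrailingOnes is less than 3, and suffixLength stops growing at 6.

```python
def encode_levels(levels, total_coeffs, trailing_ones):
    """Encode ordinary (non-trailing) levels, highest frequency first."""
    suffix_length = 1 if (total_coeffs > 10 and trailing_ones < 3) else 0
    bits = []
    for index, level in enumerate(levels):
        if level == 0:
            raise ValueError("levels must be non-zero")
        # Signed level -> unsigned levelCode.
        level_code = (level << 1) - 2 if level > 0 else -(level << 1) - 1
        # Assumption from the standard: with fewer than 3 trailing ones the
        # first level cannot be +/-1, so its code is shifted down by 2.
        if index == 0 and trailing_ones < 3:
            level_code -= 2

        level_prefix = level_code >> suffix_length
        if level_prefix >= 15 or (level_prefix == 14 and suffix_length == 0):
            raise NotImplementedError("escape codes are not covered here")

        bits.append("0" * level_prefix + "1")  # prefix bits
        if suffix_length > 0:
            level_suffix = level_code % (1 << suffix_length)
            bits.append(f"{level_suffix:0{suffix_length}b}")

        # Update suffixLength for the next level.
        if suffix_length == 0:
            suffix_length = 1
        if abs(level) > (3 << (suffix_length - 1)) and suffix_length < 6:
            suffix_length += 1
    return "".join(bits)


print(encode_levels([1, 2, 3], total_coeffs=6, trailing_ones=3))  # 10100010
```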
For the coefficient matrix above, the number of non-zero coefficients is TotalCoeffs=6, the number of trailing coefficients is TrailingOnes=3, and the number of zeros before the last non-zero coefficient is TotalZeros=2; assume NC=0. The encoding then proceeds as follows:
- Encode coeff_token: looking up Table 9-5 of the standard document, the bit string for coeff_token is 00000100;
- Encode the signs of the trailing coefficients: from high frequency to low frequency the signs are +, -, -, so the sign bits are 011;
- Encode the magnitudes of the ordinary non-zero coefficients; in coding order the three ordinary non-zero coefficients are 1, 2, 3:
  - Encoding 1: suffixLength is initialized to 0; levelCode=0; level_prefix=0, for which the table gives the bit string 1; suffixLength=0, so no suffix is coded;
  - Encoding 2: suffixLength is incremented to 1; levelCode=2; level_prefix=1, for which the table gives the bit string 01; suffixLength=1 and level_suffix=0, so the suffix is 0;
  - Encoding 3: the increment condition is not met, so suffixLength remains 1; levelCode=4; level_prefix=2, for which the table gives the bit string 001; suffixLength=1 and level_suffix=0, so the suffix is 0;
  - In total, the magnitude portion of the bitstream is 10100010;
- Encode the number of zeros before the last non-zero coefficient (TotalZeros): with TotalCoeffs=6 and TotalZeros=2, Table 9-7 gives the bit string 111;
- Encode the number of zeros before each non-zero coefficient: from high frequency to low frequency, the total number of zeros remaining before each non-zero coefficient (zerosLeft) is 2, 1, 0, 0, 0, 0, and the number of consecutive zeros immediately before each non-zero coefficient (run_before) is 1, 1, 0, 0, 0, 0. From Table 9-10 of the standard document:
  - run_before=1, zerosLeft=2: the bit string is 01;
  - run_before=1, zerosLeft=1: the bit string is 0;
  - All zero coefficients have now been coded, so no further coding is needed.
In summary, after CAVLC encoding the output bitstream for the entire 4×4 coefficient matrix is: 0000010001110100010111010.
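To check the result, the pieces of the worked example can be put together in a short sketch. The helper names are hypothetical, and `RUN_BEFORE_BITS` is only a partial copy of Table 9-10 covering zerosLeft values of 1 and 2, which is all this example needs. The coeff_token, level, and TotalZeros bit strings are copied from the steps above rather than looked up in full tables.

```python
# Partial copy of Table 9-10: RUN_BEFORE_BITS[zerosLeft][run_before].
RUN_BEFORE_BITS = {
    1: {0: "1", 1: "0"},
    2: {0: "1", 1: "01", 2: "00"},
}


def sign_bits(trailing_values):
    """One bit per trailing coefficient: 0 for +1, 1 for -1."""
    return "".join("0" if v > 0 else "1" for v in trailing_values)


def encode_run_before(runs, total_zeros):
    """Encode run_before values (highest frequency first)."""
    zeros_left = total_zeros
    bits = []
    # The run before the lowest-frequency coefficient is never coded.
    for run in runs[:-1]:
        if zeros_left == 0:
            break  # no zeros remain, so nothing more to code
        bits.append(RUN_BEFORE_BITS[zeros_left][run])
        zeros_left -= run
    return "".join(bits)


parts = [
    "00000100",                               # coeff_token (from the text)
    sign_bits([1, -1, -1]),                   # trailing coefficient signs
    "10100010",                               # levels 1, 2, 3 (from the text)
    "111",                                    # TotalZeros (from the text)
    encode_run_before([1, 1, 0, 0, 0, 0], 2),  # run_before
]
stream = "".join(parts)
print(stream)  # 0000010001110100010111010
```

A complete encoder would replace the copied bit strings with lookups into the full Tables 9-5 and 9-7 and extend `RUN_BEFORE_BITS` to all zerosLeft values.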
"H.264/AVC Video Codec technology detailed" 13, Entropy coding Algorithm (3): CAVLC principle