I-frame prediction process in the JM model
An I-frame uses only intra coding, with no inter-frame motion estimation, so it acts as a synchronization point. The price is somewhat lower coding efficiency, but I-frames are still necessary.
Intra coding is divided into luma coding and chroma coding. Prediction and RD cost calculation are required to determine the block mode of a macroblock.
For I-frames, luma has the 16x16, 8x8 and 4x4 block modes. Chroma has only one 8x8 block mode. Each block mode has its own set of prediction methods. In the JM model, the RD cost is calculated for each of these modes, and the one with the minimum cost is selected as the optimal mode.
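The "compute the RD cost of every mode and keep the minimum" rule can be sketched as follows. This is only an illustration and not JM code: the names ModeCost, rd_cost and pick_best_mode are hypothetical, and the cost is assumed to take the usual Lagrangian form J = D + lambda * R.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: what was measured for one candidate block mode. */
typedef struct {
    int    mode;        /* candidate mode id (illustrative only) */
    double distortion;  /* D: error between original and reconstruction */
    double rate;        /* R: bits needed to code the macroblock this way */
} ModeCost;

/* Assumed Lagrangian cost J = D + lambda * R. */
double rd_cost(const ModeCost *c, double lambda)
{
    return c->distortion + lambda * c->rate;
}

/* Return the index of the candidate with the smallest RD cost.
   On ties the earlier candidate wins. */
size_t pick_best_mode(const ModeCost *cands, size_t n, double lambda)
{
    assert(cands != NULL && n > 0);
    size_t best = 0;
    double best_cost = rd_cost(&cands[0], lambda);
    for (size_t i = 1; i < n; i++) {
        double cost = rd_cost(&cands[i], lambda);
        if (cost < best_cost) {
            best_cost = cost;
            best = i;
        }
    }
    return best;
}
```

A larger lambda puts more weight on the bit cost, so the same candidates can yield a different winner at a different lambda.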
The data involved in the block modes is as follows:
const int mb_mode_table[9] = {0, 1, 2, 3, p8x8, i16mb, i4mb, i8mb, ipcm}; // do not change order !!!
0: 16x16 direct mode, valid in B frames
1: inter16x16, valid for inter coding
2: inter16x8, valid for inter coding
3: inter8x16, valid for inter coding
p8x8: valid for inter coding
i16mb: intra 16x16, valid for intra coding
i4mb: valid for intra coding
i8mb: valid for intra coding
ipcm: valid for intra coding. No prediction is done; the raw data is encoded directly.
The p8x8 mode is related to the following data:
const int b8_mode_table[6] = {0, 4, 5, 6, 7}; // do not change order !!!
These five modes all belong to the p8x8 mode and form the sub-macroblock level. In the bitstream trace file, the corresponding syntax element is called b8mode.
0: 8x8 direct mode, valid in B frames
4: inter8x8, valid for inter coding
5: inter8x4, valid for inter coding
6: inter4x8, valid for inter coding
7: inter4x4, valid for inter coding
The following is an example in the trace file:
@ 45800 mb_type (B _slice) (7, 4) = 8 0000 (22)
@ 45804 8x8 mode/pdir (0) = 4/0 000 (1)
@ 45807 8x8 mode/pdir (1) = 6/1 0000 (7)
@ 45811 8x8 mode/pdir (2) = 4/0 00 (1)
@ 45813 8x8 mode/pdir (3) = 5/1 0000000 (6)
That is, the macroblock at position (7, 4) is coded in p8x8 mode.
The first 8x8 block is a single 8x8 partition predicted from list0. The second 8x8 block is split into two 4x8 partitions predicted from list1. The third is a single 8x8 partition predicted from list0. The fourth is split into two 8x4 partitions predicted from list1.
This is a bit off topic, because b8mode is not an I-frame block mode; it is a partition mode for P and B frames.
The above covers essentially all the intra and inter block modes used in H.264.
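To make the b8mode values easier to read in a trace file, here is a small sketch that maps a b8mode value to the size and number of partitions inside one 8x8 block. The names PartSize, b8mode_part_size and b8mode_part_count are hypothetical helpers for illustration and do not exist in JM.

```c
#include <assert.h>

/* Width and height, in pixels, of one partition inside an 8x8 block. */
typedef struct { int width; int height; } PartSize;

/* Map a b8mode value (4..7, as listed above) to its partition size.
   Mode 0 (direct) carries no explicit partition, so it is rejected here. */
PartSize b8mode_part_size(int b8mode)
{
    assert(b8mode >= 4 && b8mode <= 7);
    switch (b8mode) {
    case 4:  return (PartSize){8, 8};   /* inter8x8 */
    case 5:  return (PartSize){8, 4};   /* inter8x4 */
    case 6:  return (PartSize){4, 8};   /* inter4x8 */
    default: return (PartSize){4, 4};   /* inter4x4 */
    }
}

/* Number of partitions of that size needed to tile one 8x8 block. */
int b8mode_part_count(int b8mode)
{
    PartSize s = b8mode_part_size(b8mode);
    return (8 / s.width) * (8 / s.height);
}
```

For the trace line "8x8 mode/pdir (1) = 6/1", b8mode_part_size(6) gives 4x8 and b8mode_part_count(6) gives 2, which matches the "two 4x8" reading.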
The following describes the process of intra-frame prediction:
Chroma prediction (compute the predicted values for all possible prediction modes) ----> luma prediction (for all three block modes) ----> compute the RD cost of each mode (including its various prediction methods) ----> select the optimum.
The following is a step-by-step description:
1: Chroma prediction
Chroma prediction works on 8x8 blocks. The 8x8 prediction has four modes: horizontal, vertical, DC and plane.
Several tables have to be mentioned for intra chroma prediction.
The first table is:
static int block_pos[3][4][4] = // [YUV] [B8] [B4]
{
  {{0, 1, 2, 3}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}},
  {{0, 1, 2, 3}, {2, 3, 2, 3}, {0, 0, 0, 0}, {0, 0, 0, 0}},
  {{0, 1, 2, 3}, {1, 1, 3, 3}, {2, 3, 2, 3}, {3, 3, 3, 3}}
};
Here YUV is the index of the chroma sampling format, with YUV = YUV format - 1, so for yuv420 the index is 0. B8 is the index of the chroma 8x8 block within the macroblock; for yuv420 there is only one 8x8 block, so B8 is 0. B4 takes the values 0, 1, 2 and 3. With that in mind the table is easy to read.
The second table is:
const unsigned char subblk_offset_x[3][8][4] = // [YUV] [B8] [B4]
{
  { {0, 4, 0, 4},
    {0, 4, 0, 4},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0} },
  { {0, 4, 0, 4},
    {0, 4, 0, 4},
    {0, 4, 0, 4},
    {0, 4, 0, 4},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0} },
  { {0, 4, 0, 4},
    {8, 12, 8, 12},
    {0, 4, 0, 4},
    {8, 12, 8, 12},
    {0, 4, 0, 4},
    {8, 12, 8, 12},
    {0, 4, 0, 4},
    {8, 12, 8, 12} }
};
const unsigned char subblk_offset_y[3][8][4] = // [YUV] [B8] [B4]
{
  { {0, 0, 4, 4},
    {0, 0, 4, 4},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0} },
  { {0, 0, 4, 4},
    {8, 8, 12, 12},
    {0, 0, 4, 4},
    {8, 8, 12, 12},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0},
    {0, 0, 0, 0} },
  { {0, 0, 4, 4},
    {0, 0, 4, 4},
    {8, 8, 12, 12},
    {8, 8, 12, 12},
    {0, 0, 4, 4},
    {0, 0, 4, 4},
    {8, 8, 12, 12},
    {8, 8, 12, 12} }
};
These two tables are similar in purpose to block_pos above, but they cover both the U and V components, and their entries are pixel coordinate offsets rather than block indices.
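As a small illustration of how such an offset table is read, the sketch below copies only the yuv420 entries for the first chroma 8x8 block and returns the pixel offset of 4x4 sub-block B4. The names off_x_420, off_y_420, PixelOffset and chroma_subblk_offset_420 are hypothetical and are not part of JM.

```c
#include <assert.h>

/* Excerpt of the two offset tables for yuv420 (first index 0), B8 = 0. */
static const unsigned char off_x_420[4] = {0, 4, 0, 4};
static const unsigned char off_y_420[4] = {0, 0, 4, 4};

typedef struct { int x; int y; } PixelOffset;

/* Pixel offset of 4x4 sub-block b4 inside the 8x8 chroma block. */
PixelOffset chroma_subblk_offset_420(int b4)
{
    assert(b4 >= 0 && b4 < 4);
    PixelOffset o = { off_x_420[b4], off_y_420[b4] };
    return o;
}
```

The four sub-blocks are visited in raster order: top-left, top-right, bottom-left, bottom-right, so for this excerpt x equals 4 * (b4 & 1) and y equals 4 * (b4 >> 1).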
Intra chroma prediction uses the neighbouring macroblocks A, B and C both to check availability and to predict the chroma block.
Because this article was written while reading through the code, the order of presentation is a little jumbled; it reads somewhat like a flashback.
The imageparameters structure contains a member mprr_c[2][4][16][16]; // [UV] [dc_pred_8/hor_pred_8/vert_pred_8/plane_8] [16] [16]
It holds the predicted values for each mode. Why is it 16x16 rather than 8x8? Because 16x16 is needed when yuv444 is used.
1: DC prediction. DC prediction works on 4x4 blocks as the basic unit and averages the neighbouring pixels of blocks A, B and C. If none of A, B and C is available, 128 is used.
2: Vertical prediction. If macroblock B exists, its bottom row of pixels is copied directly.
3: Horizontal prediction. If macroblock A exists, its rightmost column of pixels is copied directly.
4: Plane mode. It requires A, B and C to all be available, and the way it computes the predicted pixel values is more complicated.
Once the chroma values have been predicted, they are stored temporarily in the mprr_c array for later use.
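The vertical, horizontal and DC rules described above can be sketched roughly as follows. This is a simplified illustration and not the JM implementation: the function names are hypothetical, the neighbour pixels are passed in as plain arrays, and the DC helper only shows the "average what is available, otherwise 128" idea for one 4x4 unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLK 8  /* chroma block size for yuv420 */

/* Vertical: every row repeats the pixel row directly above the block. */
void pred_vertical(const uint8_t top[BLK], uint8_t pred[BLK][BLK])
{
    assert(top != NULL && pred != NULL);
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            pred[y][x] = top[x];
}

/* Horizontal: every column repeats the pixel column directly to the left. */
void pred_horizontal(const uint8_t left[BLK], uint8_t pred[BLK][BLK])
{
    assert(left != NULL && pred != NULL);
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            pred[y][x] = left[y];
}

/* DC value for one 4x4 unit: rounded mean of the available neighbour
   pixels; pass NULL for a neighbour that is not available. */
uint8_t dc_value_4x4(const uint8_t *top4, const uint8_t *left4)
{
    int sum = 0, n = 0;
    if (top4 != NULL)
        for (int i = 0; i < 4; i++) { sum += top4[i]; n++; }
    if (left4 != NULL)
        for (int i = 0; i < 4; i++) { sum += left4[i]; n++; }
    if (n == 0)
        return 128;               /* no neighbours: mid-grey default */
    return (uint8_t)((sum + n / 2) / n);
}
```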
2: Luma prediction and the cost calculation both take place in the compute_mode_rd_cost function, which relies on rdcost_for_macroblocks for the main computation.
This function is declared as follows:
int rdcost_for_macroblocks (double lambda, // <-- Lagrange multiplier
int mode, // <-- modus (0-copy/direct, 1-16x16, 2-16x8, 3-8x16, 4-8x8 (+), 5-intra4x4, 6-intra16x16)
// mode: one of the 9 block modes.
double *min_rdcost, // <-> minimum rate-distortion cost
double *min_rate, // --> bitrate of mode which has minimum rate-distortion cost.
int i16mode) // 16x16 prediction mode, only valid for intra 16x16.
Depending on the value of mode, this function calls a different mode decision routine:
i4mb ------------- mode_decision_for_intra4x4macroblock
i16mb ------------ intra16x16_mode_decision
i8mb ------------- mode_decision_for_new_intra8x8macroblock
Analysis of i4mb
For i4mb, mode_decision_for_intra4x4macroblock in turn calls mode_decision_for_8x8intrablocks. As the name suggests, this function performs mode selection for one 8x8 block.
That function in turn calls mode_decision_for_4x4intrablocks, which carries out the 4x4 intra prediction, residual computation, DCT of the residual, zigzag scan, quantization, run-length coding, inverse quantization and IDCT, and then stores the reconstructed pixels for later use.
II. How much trouble is it to encode an I-frame in H.264? An analysis of I-frame encoding in JM 9.7
To encode a frame, you first have to know where to start. Because of error-resilience features, FMO (flexible macroblock ordering) becomes the first step in choosing which block to encode. Of course, in the most common configuration one frame is one slice, and a slice starts at the first macroblock of the frame, so the encoder obtains the address of the macroblock at the top left corner of the frame and starts encoding the I-frame from there.
When the encoding parameters are initialized, most of them come from the settings in the configuration file: for example, whether a particular 4x4 intra prediction direction is disabled, whether 8x8 is allowed, and so on. All of these end up in the odd little item enc_mb->valid[], so if you want to force a single block mode you must be very careful with the data in it.
For an I-frame, the encoder always performs intra prediction on the U and V components first, and only considers the optimal luma solution after the chroma optimum has been found. It computes the RD cost of each mode, and the intra mode with the smallest cost determines how the current luma component is encoded.
Core function: rdopt.c: int rdcost_for_macroblocks (.....)
This function handles the different block modes according to the current encoding state. Intra coding involves i4mb, i8mb, i16mb and ipcm (ipcm is a strange one; it will be discussed later).
For i16mb:
First, intra prediction is performed on the 16x16 block. This is a prediction of the whole block, so ABT is not involved. The four modes DC, horizontal, vertical and plane are applied directly, giving four prediction results.
Once the predictions are available, the predicted data is subtracted from the original data to obtain the residual. The residual differs from mode to mode, so the energy it contains differs, the number of code words spent on it differs, and the final result differs. For years, people working on codecs have laboured over these few bits. To save bits, the results have to be compared, and what is compared is the SAD (sum of absolute differences) of each mode. The mode with the smallest SAD is taken as the best prediction mode within the i16mb block mode.
After prediction, the residual of this best mode is transformed with the DCT and quantized, and the result is passed straight to the entropy coder. The i16mb encoding is then finished.
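The SAD comparison described for i16mb can be sketched as below. This is a minimal illustration under simple assumptions, not the JM code: blocks are stored as flat arrays of 256 pixels, and the names sad_16x16 and best_mode_by_sad are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MB_PIXELS 256  /* 16x16 luma block stored row by row */

/* Sum of absolute differences between the original and one prediction. */
int sad_16x16(const uint8_t *orig, const uint8_t *pred)
{
    assert(orig != NULL && pred != NULL);
    int sad = 0;
    for (int i = 0; i < MB_PIXELS; i++)
        sad += abs((int)orig[i] - (int)pred[i]);
    return sad;
}

/* Return the index of the prediction with the smallest SAD.
   preds[k] points to the 256 predicted pixels of candidate mode k. */
int best_mode_by_sad(const uint8_t *orig, const uint8_t *const *preds, int n_modes)
{
    assert(preds != NULL && n_modes > 0);
    int best = 0;
    int best_sad = sad_16x16(orig, preds[0]);
    for (int k = 1; k < n_modes; k++) {
        int sad = sad_16x16(orig, preds[k]);
        if (sad < best_sad) {
            best_sad = sad;
            best = k;
        }
    }
    return best;
}
```

With the four i16mb predictions (DC, horizontal, vertical, plane) in preds, the returned index identifies the mode whose residual would then go through the DCT and quantization.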
For i4mb:
The 16x16 macroblock is divided into four 8x8 sub-blocks, each of which is divided into four 4x4 sub-blocks, and intra prediction is performed on each 4x4 block, using the 4x4 prediction of course.
This prediction has nine modes, each predicting along a possible texture direction. The mode that best matches the texture direction yields the smaller residual, so it is selected as the best encoding mode. The best mode gives the best intra prediction and the corresponding residual. The RD cost of each candidate is computed, the DCT is applied, the coefficients are quantized, and the result is sent to the entropy coder, which completes the encoding of that block.
This is repeated until all 16 4x4 blocks have been coded.
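The "four 8x8 blocks, each split into four 4x4 blocks" traversal implies a mapping from the pair (8x8 index, 4x4 index) to a pixel position inside the macroblock. A minimal sketch of that mapping, assuming raster order at both levels, is shown below; the names BlockPos and luma_4x4_pos are hypothetical.

```c
#include <assert.h>

typedef struct { int x; int y; } BlockPos;

/* Top-left pixel of 4x4 block b4 (0..3) inside 8x8 block b8 (0..3),
   both numbered in raster order within a 16x16 macroblock. */
BlockPos luma_4x4_pos(int b8, int b4)
{
    assert(b8 >= 0 && b8 < 4 && b4 >= 0 && b4 < 4);
    BlockPos p;
    p.x = 8 * (b8 & 1) + 4 * (b4 & 1);    /* column: right half adds 8, then 4 */
    p.y = 8 * (b8 >> 1) + 4 * (b4 >> 1);  /* row: bottom half adds 8, then 4 */
    return p;
}
```

Looping b8 over 0..3 and b4 over 0..3 visits all 16 blocks, one 8x8 quadrant at a time.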
For i8mb:
The 16x16 macroblock is divided into four 8x8 sub-blocks. For each sub-block the 8x8 predictions are computed and the residual of each is obtained. Each residual is passed through the Hadamard transform to obtain its SATD, the SATD values are compared, and the mode with the smallest one is the best prediction mode. The best residual is kept, the 8x8 DCT is applied to it, the transform coefficients are retained, and the encoding of the block ends.
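To show what "Hadamard transform the residual and sum the absolute values" means, here is a sketch of SATD on a 4x4 residual block. A 4x4 block is used only to keep the example short; the function name satd_4x4 is hypothetical, and any scaling or normalization that JM applies to the result is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* SATD of a 4x4 residual stored row by row: apply a 4-point Hadamard
   transform to the rows, then to the columns, then sum absolute values. */
int satd_4x4(const int diff[16])
{
    int t[16], m[16];
    assert(diff != NULL);

    /* Horizontal pass: butterflies on each row. */
    for (int r = 0; r < 4; r++) {
        const int *d = diff + 4 * r;
        int s01 = d[0] + d[1], d01 = d[0] - d[1];
        int s23 = d[2] + d[3], d23 = d[2] - d[3];
        t[4 * r + 0] = s01 + s23;
        t[4 * r + 1] = d01 + d23;
        t[4 * r + 2] = s01 - s23;
        t[4 * r + 3] = d01 - d23;
    }

    /* Vertical pass: the same butterflies on each column. */
    for (int c = 0; c < 4; c++) {
        int s01 = t[c] + t[4 + c],      d01 = t[c] - t[4 + c];
        int s23 = t[8 + c] + t[12 + c], d23 = t[8 + c] - t[12 + c];
        m[c]      = s01 + s23;
        m[4 + c]  = d01 + d23;
        m[8 + c]  = s01 - s23;
        m[12 + c] = d01 - d23;
    }

    int satd = 0;
    for (int i = 0; i < 16; i++)
        satd += abs(m[i]);
    return satd;
}
```

Compared with plain SAD, the transform concentrates a smooth residual into a few coefficients, so SATD tends to track the cost after the real transform more closely.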
The whole frame is encoded in this way. The MB layer data and the macroblock coding data are written out, and the I-frame encoding ends.
Looking at I-frame encoding, it is easy to think of more complex algorithms, such as using several kinds of block partition within one macroblock. However, such an approach may cost so many bits that the trade-off between bitrate and SNR is no longer worthwhile. So the idea behind the encoding matters a great deal: more complex is not necessarily better, and there is still a gap between mathematical theory and experimental science. This is also a hint for further research: think more about simple, new ideas and do not hand everything over to mathematics. I write this as a reminder to myself.