This article is a brief translation of "x264 codec strong and weak points", the MSU lab's analysis report on x264 options (encoding options; the English term "option" is used from here on). After reading it, I found the analysis very thorough and the method well worth borrowing, so I am recording its key points here as a memo. As is well known, x264 has a great many options, and combining them so that the encoded video is small, of good quality, and fast to encode is a real headache. The experiments in this report are designed to address exactly that problem.
1. Introduction. The report analyzes the options of the x264 encoder using an objective quality evaluation algorithm. The following table lists the terms used in this document.
Simple translation:
Name | Definition | Example |
Option | An encoding option. | B-frames, motion estimation algorithm |
Option value | A value of an encoding option. | --me (motion estimation algorithm) accepts the values "dia", "hex", "umh", "esa", and "tesa" |
Preset | A set of options with fixed values. | |
Pareto-optimal point (preset) | A preset for which no other preset gives both better video quality and faster encoding. | |
Envelope line point (preset on the envelope line) | A preset located on the convex hull. These are the best presets. | |
λ parameter | The preferred proportion between encoding time and bit rate. It appears in the formula M = λT + Q, where T is the relative encoding time and Q is the relative video quality. | |
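The last three terms can be made concrete with a small sketch. It is only an illustration: the sample presets are made up, the names `Preset`, `pareto_optimal`, and `best_for_lambda` are hypothetical, and Q is assumed here to be the relative bit rate at equal quality (lower is better), which matches the axes of the report's charts but is an interpretation rather than something the table states.

```python
from typing import NamedTuple

class Preset(NamedTuple):
    name: str
    time: float  # relative encoding time T (default preset = 1.0)
    rate: float  # relative bit rate at equal quality, used as Q (assumption)

def pareto_optimal(presets):
    """Keep the presets that no other preset beats on both time and bit rate."""
    front = []
    for p in presets:
        dominated = any(
            q.time <= p.time and q.rate <= p.rate
            and (q.time < p.time or q.rate < p.rate)
            for q in presets
        )
        if not dominated:
            front.append(p)
    return sorted(front, key=lambda p: p.time)

def best_for_lambda(presets, lam):
    """Pick the preset that minimizes M = lam * T + Q."""
    return min(presets, key=lambda p: lam * p.time + p.rate)

# Made-up sample data, not taken from the report.
presets = [
    Preset("a", 0.5, 1.30),
    Preset("b", 1.0, 1.00),
    Preset("c", 1.2, 1.05),  # slower and larger than "b", so it is dominated
    Preset("d", 2.0, 0.90),
    Preset("e", 4.0, 0.88),
]
front = pareto_optimal(presets)
print([p.name for p in front])
print(best_for_lambda(presets, 1.0).name, best_for_lambda(presets, 0.001).name)
```

A large λ makes encoding time expensive and favors fast presets; a λ close to zero ignores time and favors the smallest output.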
2. Options and values. The options and option values analyzed in this report are listed below.
Simple translation:
Option | Values | Remarks |
Partitions: --partitions x | "none", "p8x8,b8x8,i8x8,i4x4", "all" | The partition types a macroblock may use. Default: "p8x8,b8x8,i8x8,i4x4" |
B-frames (number of B-frames): --bframes n | 0, 2, 4 | The number of consecutive B-frames between I- and P-frames. Default: 0 |
Reference frames: --ref n | 1, 4, 8 | The number of reference frames. Default: 1 |
Motion estimation method: --me x | "dia", "hex", "umh", "tesa" | The motion estimation method. See the notes for details. Default: "hex" |
Subpixel motion estimation: --subme n | 1, 4, 5, 6 | The complexity of subpixel estimation. Default: 5 |
Mixed references: --mixed-refs | off, on | Default: off |
Weighted prediction: --weightb | off, on | The weight of each reference frame in B-frame prediction depends on its distance from the B-frame. Default: off |
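To get a feel for the size of this search space, the option values in the table can be enumerated. The sketch below is an illustration only: `OPTIONS` and `enumerate_presets` are hypothetical names, the flag spellings are lowercased from the table, and it assumes that "off" means leaving a switch such as --mixed-refs out of the command line.

```python
from itertools import product

# Option values taken from the table above; True/False marks an on/off switch.
OPTIONS = {
    "--partitions": ["none", "p8x8,b8x8,i8x8,i4x4", "all"],
    "--bframes": ["0", "2", "4"],
    "--ref": ["1", "4", "8"],
    "--me": ["dia", "hex", "umh", "tesa"],
    "--subme": ["1", "4", "5", "6"],
    "--mixed-refs": [False, True],
    "--weightb": [False, True],
}

def enumerate_presets(options):
    """Yield one argument list per combination of option values."""
    names = list(options)
    for combo in product(*(options[n] for n in names)):
        args = []
        for name, value in zip(names, combo):
            if isinstance(value, bool):
                if value:  # assumption: "off" means the switch is omitted
                    args.append(name)
            else:
                args += [name, value]
        yield args

all_presets = list(enumerate_presets(OPTIONS))
print(len(all_presets))  # 3 * 3 * 3 * 4 * 4 * 2 * 2 = 1728
print(" ".join(all_presets[0]))
```

Each argument list still needs an input file, an output file, and rate-control settings before it can be passed to an encoder; those are left out because the article does not give them.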
Notes:

1. Motion estimation methods

dia (diamond search) is the simplest search method. Starting from the best predicted motion vector, it checks the motion vectors one pixel above, to the left, below, and to the right, picks the best one, and repeats this step until no better motion vector is found.

hex (hexagon search) follows a similar strategy, but it searches a range-2 pattern of six surrounding points, hence the name. It is considerably more efficient than dia and still fast, which makes it the usual choice for encoding.

umh (uneven multi-hexagon search) is much slower than hex, but it searches a complex multi-hexagon pattern to avoid missing motion vectors that are hard to find. Unlike hex and dia, the M.E. range parameter (--merange) directly controls the umh search radius, so the size of the search space can be increased or decreased.

esa (exhaustive search) is a highly optimized, intelligent search of the entire space within the M.E. range around the best predicted value. It is mathematically equivalent to brute force, testing every motion vector in the search area, but it is faster. Even so, it is far slower than umh and brings little benefit, so it is not very useful for everyday encoding.

tesa (transformed exhaustive search) attempts to approximate the effect of running a Hadamard transform comparison at each motion vector. It is similar to exhaustive search, but slightly better and slightly slower.

2. Subpixel motion estimation

This option sets the complexity of subpixel estimation. Higher values are better. Values 1 to 5 control the strength of subpixel refinement. Value 6 enables RDO for mode decision, and value 8 enables RDO for motion vectors and intra prediction modes. The RDO modes are much slower than the lower levels.

With a value below 2, a fast but low-quality lookahead mode is used and the --scenecut decision is affected, so such values are not recommended.

Optional values:
0. Fullpel only
1. QPel SAD, 1 iteration
2. QPel SATD, 2 iterations
3. HPel on MB, then QPel
4. Always QPel
5. Multi QPel + bi-directional motion estimation
6. RD on I/P frames
7. RD on all frames
8. RD refinement on I/P frames
9. RD refinement on all frames
10. QP-RD (requires --trellis=2, --aq-mode>0)
11. Full RD
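The diamond search described in the notes can be sketched in a few lines. This is a toy illustration, not x264's implementation: `diamond_search` and `toy_cost` are hypothetical names, and the cost function is a stand-in for the block-matching cost (such as SAD plus motion vector cost) that a real encoder would evaluate.

```python
def diamond_search(cost, start, max_steps=64):
    """Move to the best of the four neighbours until none of them improves."""
    best = start
    best_cost = cost(best)
    for _ in range(max_steps):
        x, y = best
        # Up, left, down, right: the four points of the small diamond.
        candidates = [(x, y - 1), (x - 1, y), (x, y + 1), (x + 1, y)]
        cand = min(candidates, key=cost)
        cand_cost = cost(cand)
        if cand_cost >= best_cost:
            break  # no neighbour is better, so the search stops here
        best, best_cost = cand, cand_cost
    return best

def toy_cost(mv):
    """Toy cost: city-block distance to a pretend true motion vector (3, -2)."""
    return abs(mv[0] - 3) + abs(mv[1] + 2)

print(diamond_search(toy_cost, (0, 0)))  # (3, -2)
```

On a real cost surface with several local minima this greedy walk can stop early, which is the weakness the wider hex and umh patterns are meant to reduce.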
3. The best presets. The figure shows all the results of the experiment. The results were obtained by enumerating every combination of the option values listed in the preceding table and encoding with each combination. Each green point represents the compression result of one preset. As can be seen, the experiment produced a large amount of data.
The meaning of the figure is briefly as follows. The vertical axis is the video bit rate: the lower the value, the lower the bit rate at the same video quality. The horizontal axis is the encoding time: the lower the value, the shorter the encoding time. Each green point represents a preset, so the closer a preset lies to the lower-left corner, the faster it encodes and the lower its bit rate. The optimal presets are the points on the convex hull (that is, the points on the red line). Note that both coordinates are relative values rather than absolute bit rates and times: the X and Y values are measured relative to the default preset of x264, which is located at the () point of the figure.
If the encoding time is fixed, the optimal preset lies on the envelope line. The preset marked with the pink five-pointed star has the lowest bit rate for its encoding time.
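The envelope line can be computed from the (relative time, relative bit rate) points as the lower-left part of their convex hull. The sketch below uses the standard monotone-chain construction on made-up points; `cross` and `lower_left_hull` are hypothetical names, and the report does not say which algorithm its authors used.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def lower_left_hull(points):
    """Lower convex hull of (time, rate) points, cut off where rate stops falling."""
    pts = sorted(set(points))
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or above the new edge.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    # Keep only the part where more encoding time still lowers the bit rate.
    trimmed = [hull[0]]
    for p in hull[1:]:
        if p[1] < trimmed[-1][1]:
            trimmed.append(p)
    return trimmed

# Made-up sample points, not taken from the report.
points = [(0.5, 1.30), (1.0, 1.00), (1.2, 1.05),
          (1.5, 0.99), (2.0, 0.90), (4.0, 0.88)]
envelope = lower_left_hull(points)
print(envelope)
```

In this sample the point (1.5, 0.99) is not beaten on both axes by any other point, yet it lies above the line from (1.0, 1.00) to (2.0, 0.90), so it is left off the envelope. The convex hull is therefore a stricter filter than plain Pareto optimality.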
The default values of x264 are shown in the following table.
Option | Default value |
Partitions | "p8x8,b8x8,i8x8,i4x4" |
B-frames | 0 |
Reference frames | 1 |
Motion estimation method | "hex" |
Subpixel motion estimation | 5 |
Mixed references | off |
Weighted prediction | off |
The figure shows the Pareto-optimal presets (those for which no other preset gives both better video quality and faster encoding) and the convex hull.
Statistics for the presets on the convex hull are as follows.
The analysis of the presets on the convex hull is as follows. The table lists which options are used more and which are used less by the presets on the convex hull. It also lists separately the presets that take longer but give higher quality and the presets that are faster but give lower quality.
4. Preset analysis with "colored cloud" plots. In this chapter, each plot corresponds to one option of interest, and presets with different values of that option are drawn in different colors. The experiment results are as follows.
The conclusions for each option that can be drawn from the plots are shown in the following table.
Simple translation:
Option | Values | Conclusion |
Partitions: --partitions x | "none", "p8x8,b8x8,i8x8,i4x4", "all" | Use "none" for high speed and "all" for high video quality. Use "p8x8,b8x8,i8x8,i4x4" to balance speed and quality. |
B-frames (number of B-frames): --bframes n | 0, 2, 4 | Use 0 when very high speed is required. Otherwise use 2 or 4; the difference between them is small. |
Reference frames: --ref n | 1, 4, 8 | Use 1 for high speed and 8 for very high video quality. Use 4 to balance speed and quality. |
Motion estimation method: --me x | "dia", "hex", "umh", "tesa" | Use "dia" or "hex" for high speed and "tesa" for high video quality. Use "umh" to balance speed and quality. |
Subpixel motion estimation: --subme n | 1, 4, 5, 6 | Use 1 for high speed and 6 for high video quality. Use 4 to balance speed and quality. |
Mixed references: --mixed-refs | off, on | |
Weighted prediction: --weightb | off, on | |
Note 1: "Weighted prediction" has little effect.
Note 2: For "bframes", the values 2 and 4 differ very little.
The following table summarizes the conclusions from the colored cloud plots. It lists the option values to use in three situations: speed matters most, speed and video quality matter equally, and video quality matters most.
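Based on the per-option conclusions above, the three situations could be written down as option sets like the ones below. This is a sketch rather than the report's own table: `RECOMMENDED` and `x264_args` are hypothetical names, the flag spellings are lowercased from the option tables, and where the conclusions allow two values ("dia" or "hex", 2 or 4 B-frames) the pick here is arbitrary. Mixed references and weighted prediction are left out because no conclusion is given for them.

```python
# Option values per use case, read off the per-option conclusions.
RECOMMENDED = {
    "speed": {
        "--partitions": "none", "--bframes": "0", "--ref": "1",
        "--me": "dia", "--subme": "1",
    },
    "balanced": {
        "--partitions": "p8x8,b8x8,i8x8,i4x4", "--bframes": "2", "--ref": "4",
        "--me": "umh", "--subme": "4",
    },
    "quality": {
        "--partitions": "all", "--bframes": "4", "--ref": "8",
        "--me": "tesa", "--subme": "6",
    },
}

def x264_args(use_case):
    """Flatten one use case into a list of command-line arguments."""
    args = []
    for flag, value in RECOMMENDED[use_case].items():
        args += [flag, value]
    return args

print(" ".join(x264_args("balanced")))
```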
Original article address:
Http://compression.ru/video/codec_comparison/pdf/x264_options_analysis_08.pdf