1. Predictive Coding
The essence of a compression algorithm is to remove redundancy between signals. What is signal redundancy? Correlation between signals is redundancy, and components that the human auditory or visual system cannot perceive, or that it masks, can also be treated as redundant. Today we discuss predictive coding, a very intuitive and simple method. Taking an image as an example, there is similarity and correlation between two adjacent frames, and between adjacent pixels within the same frame. From the current frame and a set of prediction coefficients we can infer the next frame, and likewise a pixel can be predicted from its neighbors. By transmitting only the difference between the actual value and the predicted value, redundancy is removed and the dynamic range of the signal shrinks, which means fewer bits are needed to represent it, achieving compression.
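To make this concrete, here is a minimal sketch of 1-D differential prediction, assuming the simplest possible predictor, the previous sample; the function names are illustrative and not taken from any particular codec:

```python
import numpy as np

def dpcm_encode(samples: np.ndarray) -> np.ndarray:
    """Replace each sample by its difference from the predicted (previous) value."""
    residuals = np.empty_like(samples)
    residuals[0] = samples[0]                  # first sample has no predictor
    residuals[1:] = samples[1:] - samples[:-1]
    return residuals

def dpcm_decode(residuals: np.ndarray) -> np.ndarray:
    """Reconstruct the original samples by accumulating the residuals."""
    return np.cumsum(residuals)

signal = np.array([100, 102, 103, 103, 101, 99], dtype=np.int32)
res = dpcm_encode(signal)
print(res)                                     # small residuals -> smaller dynamic range
assert np.array_equal(dpcm_decode(res), signal)
```

The residuals cluster around zero, so they can be represented with fewer bits than the original samples while the decoder reconstructs the signal exactly.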
Predictive coding for video signals comes in two kinds: inter-frame prediction and intra-frame prediction. Intra-frame prediction removes, in the spatial domain, the redundancy between macroblocks within the same frame. H.264 provides 4x4 luma prediction modes, 16x16 luma prediction modes, 8x8 chroma prediction modes, and an I_PCM coding mode, and choosing the optimal mode among them is not easy.
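As a simplified illustration of that mode decision (not H.264-conformant), the sketch below predicts a 4x4 block with two of the simplest modes, vertical and DC, and keeps whichever yields the smaller SAD; all names are illustrative:

```python
import numpy as np

def predict_vertical(top: np.ndarray) -> np.ndarray:
    """Vertical mode: copy the row of pixels above the block downwards."""
    return np.tile(top, (4, 1))

def predict_dc(top: np.ndarray, left: np.ndarray) -> np.ndarray:
    """DC mode: fill the block with the mean of the neighbouring pixels."""
    return np.full((4, 4), round((top.sum() + left.sum()) / 8))

def best_intra_mode(block, top, left):
    """Try each candidate mode and keep the one with the smallest SAD."""
    candidates = {
        "vertical": predict_vertical(top),
        "dc": predict_dc(top, left),
    }
    return min(candidates.items(),
               key=lambda kv: np.abs(block - kv[1]).sum())

block = np.array([[10, 11, 12, 13]] * 4)
top   = np.array([10, 11, 12, 13])
left  = np.array([10, 10, 10, 10])
mode, prediction = best_intra_mode(block, top, left)
print(mode, np.abs(block - prediction).sum())   # vertical wins here: SAD == 0
```

A real encoder evaluates many more modes and usually weighs the rate cost of signalling the mode as well, which is what makes the decision hard.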
Inter-frame prediction achieves higher coding efficiency than intra-frame coding. It removes the redundancy between successive frames in the temporal domain and can be classified into unidirectional and bidirectional prediction. Bidirectional prediction generally increases coding latency, so it is rarely used in real-time communication. Inter-frame prediction relies on motion estimation: objects in adjacent frames of a moving picture are spatially displaced, and the process of finding this displacement is motion estimation. It involves various search algorithms, and this part accounts for much of the complexity of H.264.
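As an illustration of motion estimation, here is a minimal full-search block-matching sketch: for one block of the current frame it tests every offset in a small window of the reference frame and returns the motion vector with the lowest SAD. Real encoders use much faster search strategies; the names and search range here are illustrative:

```python
import numpy as np

def full_search(cur_block, ref_frame, top, left, search_range=4):
    """Exhaustively test every offset in a +/- search_range window (SAD cost)."""
    h, w = cur_block.shape
    best_mv, best_sad = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref_frame.shape[0] or x + w > ref_frame.shape[1]:
                continue                       # candidate falls outside the frame
            sad = np.abs(cur_block - ref_frame[y:y + h, x:x + w]).sum()
            if sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad

ref = np.zeros((32, 32), dtype=np.int32)
ref[10:18, 12:20] = 255                        # bright patch in the reference frame
cur_block = ref[10:18, 12:20].copy()           # same patch; assume it now sits at (12, 14)
print(full_search(cur_block, ref, top=12, left=14))   # -> ((-2, -2), 0)
```

The returned motion vector and the (usually small) residual are what actually get coded, which is why inter prediction is so efficient.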
2. Transform Coding
Transform coding converts the image from the spatial domain to the frequency domain, producing transform coefficients with little correlation between them, which makes the image easier to compress. The DCT is usually used because its performance is close to that of the Karhunen-Loeve (K-L) transform and it has fast algorithms, making it well suited to image transform coding. Transform coding is more complex than predictive coding, but errors of all kinds (quantization errors and channel errors) do not propagate to subsequent data and have little visual impact.
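A minimal sketch of the energy-compaction idea, assuming NumPy and SciPy are available: apply a 2-D DCT to a smooth 8x8 block and observe that a handful of low-frequency coefficients carry almost all of the energy:

```python
import numpy as np
from scipy.fft import dctn, idctn

# An 8x8 block with a smooth gradient: neighbouring pixels are highly correlated.
block = np.tile(np.arange(8, dtype=float) * 16, (8, 1))

coeffs = dctn(block, norm="ortho")               # forward 2-D DCT

# Sort coefficient energies; a few low-frequency terms carry almost all of it.
energy = np.sort(coeffs.ravel() ** 2)[::-1]
print(energy[:4].sum() / energy.sum())           # close to 1.0

# Keep only the 4 largest coefficients and invert the transform: the
# reconstruction error stays small, which is why coarse quantization of the
# remaining coefficients has little visual impact.
kept = np.where(np.abs(coeffs) >= np.sort(np.abs(coeffs).ravel())[-4], coeffs, 0.0)
recon = idctn(kept, norm="ortho")
print(np.max(np.abs(recon - block)))             # small reconstruction error
```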
3. Entropy Coding
Entropy coding compresses the bit rate based on the statistical characteristics of the source. It is lossless, but its compression ratio is relatively low, so it is generally used for further compression after transform coding. Common methods are variable-length coding (e.g., Huffman coding) and arithmetic coding.
1) Variable-Length Coding
Assign short binary code words to symbols that occur with high probability and long code words to symbols with low probability, so that the average code word length is minimized. For this reason it is also called optimal coding.
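A minimal sketch of the idea, building a Huffman code with Python's heapq; the symbol frequencies and names are illustrative:

```python
import heapq
from collections import Counter

def huffman_code(frequencies: dict) -> dict:
    """Build a Huffman code: frequent symbols get short code words."""
    # Each heap entry: (total_frequency, tie_breaker, {symbol: partial_code}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f0, _, left = heapq.heappop(heap)     # the two least probable subtrees ...
        f1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f0 + f1, tie, merged))   # ... are merged
        tie += 1
    return heap[0][2]

text = "aaaabbbccd"
code = huffman_code(Counter(text))
print(code)                                   # 'a' gets the shortest code word
print("".join(code[ch] for ch in text))       # encoded bit string
```

For this toy source the average code length is 1.9 bits per symbol, below the 2 bits a fixed-length code would need for four symbols.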
2) Arithmetic Coding
Unlike Huffman coding, arithmetic coding does not map each input symbol to its own code word; instead, a single fractional number represents an entire string of input symbols. Arithmetic encoding outputs a number that is greater than or equal to 0 and smaller than 1, and the decoder uniquely decodes it to recover the original symbol sequence.
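A minimal floating-point sketch of the idea (illustrative only; practical codecs use integer arithmetic precisely because of the precision issue noted below):

```python
def arithmetic_encode(symbols, probs):
    """Shrink the interval [low, high) once per symbol; any number inside
    the final interval identifies the whole sequence."""
    cum, acc = {}, 0.0
    for s, p in probs.items():                # cumulative probability ranges
        cum[s] = (acc, acc + p)
        acc += p
    low, high = 0.0, 1.0
    for s in symbols:
        span = high - low
        lo, hi = cum[s]
        low, high = low + span * lo, low + span * hi
    return (low + high) / 2                   # one number encodes the sequence

def arithmetic_decode(value, probs, length):
    cum, acc = {}, 0.0
    for s, p in probs.items():
        cum[s] = (acc, acc + p)
        acc += p
    out, low, high = [], 0.0, 1.0
    for _ in range(length):
        span = high - low
        for s, (lo, hi) in cum.items():
            if low + span * lo <= value < low + span * hi:
                out.append(s)
                low, high = low + span * lo, low + span * hi
                break
    return "".join(out)

probs = {"a": 0.6, "b": 0.3, "c": 0.1}
code = arithmetic_encode("aabac", probs)
print(code, arithmetic_decode(code, probs, 5))   # one number in [0, 1) -> "aabac"
```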
In practical applications, both of these coding methods face a hardware precision problem, namely how to represent the fractional values, that is, where to place the binary point.