2.4 Quantization
Quantization discretizes the amplitude of a signal. After quantization, the discrete signal becomes a digital signal.
The human visual system (HVS) is more sensitive to low-frequency signals, so the low-frequency part of the signal uses a relatively small quantization step and the high-frequency part uses a relatively large one. In this way, a reasonably clear image and a higher compression ratio can both be obtained.
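A minimal sketch of this step, assuming quantization is done by dividing each DCT coefficient by its step and rounding to the nearest integer. The function name and the sample values are hypothetical and chosen only for illustration:

```python
def quantize(block, qtable):
    """Divide each coefficient by its quantization step and round to an integer."""
    return [[int(round(c / q)) for c, q in zip(brow, qrow)]
            for brow, qrow in zip(block, qtable)]

# Hypothetical 2x2 corner of a block: smaller steps for low frequencies
# (top-left), a larger step for the higher frequency (bottom-right).
coeffs = [[-415.0, -30.0], [5.0, 62.0]]
steps = [[16, 11], [12, 40]]
print(quantize(coeffs, steps))  # [[-26, -3], [0, 2]]
```

The larger the step, the more coefficients collapse to zero, which is what the later run-length stage exploits.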
2.5 Zigzag scan
Read the quantized data in zigzag (Z-shaped) order, for example:
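One way to generate the zigzag order is sketched here. It assumes the usual JPEG convention in which the scan starts at the top-left, moves right first, and alternates direction on each anti-diagonal. The function names are illustrative only:

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    # Cells on the same anti-diagonal share row + col. Odd diagonals are
    # walked with the row increasing, even ones with the column increasing.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# A 4x4 block numbered row by row, to make the reading order visible.
block = [[4 * r + c for c in range(4)] for r in range(4)]
print(zigzag_scan(block))
# [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```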
2.6 Use run-length encoding (RLE) to encode the AC coefficients
Run-length encoding means that a single code represents both a coefficient value and the number of zeros that precede it. This makes full use of the zigzag scan, because reading in zigzag order tends to produce long runs of consecutive zeros, especially at the end of the block. If everything after the last non-zero value is zero, an end-of-block code (EOB) is emitted right after that value and the output for the block ends, which saves a lot of bits.
For example, applying the zigzag scan and run-length encoding to the block in the figure gives the following codes:
(, 0) (, 0) (, 0) (, 0) (, 1) EOB
In this way, a 4*4 matrix can be represented with just a few codes!
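The idea can be sketched as follows. This is a simplified illustration, not the exact JPEG procedure: it always emits EOB and does not cap the zero run at 15 as real JPEG streams do. Pairs are written as (value, zeros before it), following the pair layout used in this section, and the function name is hypothetical:

```python
def run_length_encode(coeffs):
    """Encode zigzag-ordered coefficients as (value, zeros_before) pairs plus 'EOB'."""
    # Find the last non-zero entry; everything after it is covered by EOB.
    last = max((i for i, v in enumerate(coeffs) if v != 0), default=-1)
    pairs, run = [], 0
    for v in coeffs[:last + 1]:
        if v == 0:
            run += 1              # count zeros until the next non-zero value
        else:
            pairs.append((v, run))
            run = 0
    pairs.append("EOB")
    return pairs

print(run_length_encode([5, 0, 0, -2, 1, 0, 0, 0]))
# [(5, 0), (-2, 2), (1, 0), 'EOB']
```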
2.7 Entropy encoding
The commonly used entropy encoding is a variable-length code, namely Huffman coding.
Huffman coding method: assign a short binary codeword to a symbol with a high probability and a long binary codeword to a symbol with a low probability, so that the resulting code has the shortest average codeword length.
Steps:
(1) Sort the source symbols by probability. Codeword lengths are assigned in the reverse order, so the more probable a symbol is, the shorter its codeword.
(2) When assigning codeword lengths, first merge the two symbols with the smallest probabilities and add their probabilities together.
(3) Treat this sum as the probability of a new combined symbol. Repeat the above until only two symbol probabilities remain.
(4) Once the probabilities have been arranged in this way, assign the codewords by working backwards through the merges. At each step the two branches are each given one binary digit: 0 for the branch with the larger probability and 1 for the branch with the smaller.
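The merging procedure above can be sketched with a priority queue. This is a minimal illustration under stated assumptions rather than the JPEG table-building procedure: the function name and the sample probabilities are hypothetical, and ties are broken by insertion order:

```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build a prefix code from a {symbol: probability} mapping."""
    tiebreak = count()  # unique counter so equal probabilities never compare dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p_small, _, small = heapq.heappop(heap)  # least probable subtree
        p_large, _, large = heapq.heappop(heap)  # second least probable subtree
        # Prefix 0 to the more probable branch and 1 to the less probable one.
        merged = {s: "0" + c for s, c in large.items()}
        merged.update({s: "1" + c for s, c in small.items()})
        heapq.heappush(heap, (p_small + p_large, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print({s: len(c) for s, c in sorted(codes.items())})
# {'a': 1, 'b': 2, 'c': 3, 'd': 3}
```

The most probable symbol ends up with a 1-bit codeword and the two least probable symbols with 3-bit codewords.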
Encoding of the AC and DC coefficients
1. Huffman encoding of the AC coefficients
After the zigzag scan and run-length encoding, each non-zero AC coefficient is expressed as two symbols, A and B. Symbol A is (runlength, size) and symbol B is (amplitude).
Runlength is the number of consecutive zero AC coefficients that precede the non-zero AC coefficient;
Size is the number of bits required to encode amplitude;
Amplitude is the amplitude of the AC coefficient.
In practice, JPEG uses an 8-bit value RS to represent symbol A, RS = RRRRSSSS. For a non-zero AC coefficient, the high four bits give runlength and the low four bits give size. (00000000) indicates EOB.
Symbol A is encoded with the Huffman table, and symbol B is encoded as a variable-length integer (VLI). The VLI code of symbol B is placed after the Huffman code of symbol A to form the final encoding of A and B.
2. Huffman encoding of the DC coefficient
The DC coefficient is handled much like a non-zero AC coefficient, except that what is encoded is the difference (diff) between two adjacent DC coefficients: symbol A is (size) and symbol B is (amplitude).
Size is the number of bits required to encode amplitude;
Amplitude is the value of diff.
In the JPEG standard, symbol A is encoded with the corresponding Huffman table and symbol B is converted to a VLI. The VLI code of symbol B is then placed after the Huffman code of symbol A, which completes the encoding of diff.
The JPEG standard does not define a default Huffman table. Depending on your needs, you can use a general-purpose Huffman table, or, for a specific image, compute a Huffman table from statistics collected before compression.
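The construction of symbol A and symbol B described above for AC and DC coefficients can be sketched as follows. The function names are hypothetical, and the Huffman coding of symbol A itself is left out; the sketch only shows how size, the RS byte, and the VLI bits are derived, assuming the usual JPEG rule that a negative value is stored as value + 2^size - 1:

```python
def vli(value):
    """Return (size, bits) for an integer using the variable-length integer code."""
    size = abs(value).bit_length()       # number of bits needed for the amplitude
    if size == 0:
        return 0, ""                     # zero needs no amplitude bits
    # Positive values are stored as-is, negative ones as value + 2**size - 1.
    raw = value if value > 0 else value + (1 << size) - 1
    return size, format(raw, "0{}b".format(size))

def ac_symbols(runlength, value):
    """Return (RS byte, amplitude bits) for one non-zero AC coefficient."""
    size, bits = vli(value)
    return (runlength << 4) | size, bits  # RS = RRRRSSSS

def dc_symbols(diff):
    """Return (size, amplitude bits) for a DC difference."""
    return vli(diff)

print(ac_symbols(2, -5))  # (35, '010'), i.e. RS = 0x23
print(dc_symbols(5))      # (3, '101')
```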
III. Main process of JPEG decoding
3.1 Read file information
Following the way data is stored in a JPEG file, read the information about the file to be decoded segment by segment to prepare for the decoding work that follows. One reference approach is to design a set of structs, one per marker, to hold the information each marker carries. Of this information, the image height and width, the quantization tables, the Huffman tables, and the horizontal/vertical sampling factors are the most important. Some points about reading follow.
1. Read the overall structure of the file
The general segment order of a JFIF-format JPEG file (*.jpg) is:
SOI (0xFFD8), APP0 (0xFFE0), [APPn (0xFFEn)] (optional),
DQT (0xFFDB), SOF0 (0xFFC0), DHT (0xFFC4), SOS (0xFFDA),
compressed data, EOI (0xFFD9).
2. Read the data of each table.
3. Build the Huffman tree.
After preparing all the image information, you can decode the image data.
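A minimal sketch of walking the marker segments listed in step 1, assuming every segment before the compressed data has the form marker (2 bytes) plus a big-endian length (2 bytes, counting itself) plus payload. The function name is hypothetical, and fill bytes and error recovery are ignored:

```python
import struct

def list_segments(data):
    """Return (marker, payload_length) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments, pos = [], 2
    while pos + 4 <= len(data):
        marker, length = struct.unpack(">HH", data[pos:pos + 4])
        segments.append((marker, length - 2))  # length field includes its own 2 bytes
        if marker == 0xFFDA:                   # SOS: compressed data follows
            break
        pos += 2 + length
    return segments

# Tiny hand-made byte string (not a real image), just to exercise the loop.
sample = b"\xff\xd8\xff\xe0\x00\x04JF\xff\xdb\x00\x03\x00\xff\xda\x00\x02\x12\x34\xff\xd9"
print([(hex(m), n) for m, n in list_segments(sample)])
# [('0xffe0', 2), ('0xffdb', 1), ('0xffda', 0)]
```

In a real decoder each payload would be parsed into the struct for its marker instead of only being measured.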
Decoding of the AC and DC coefficients
1. Decode the AC coefficients
Look up the Huffman table to decode RS, and extract runlength and size from it. Because symbol B is encoded with the VLI table, amplitude can be obtained by reading size bits and looking them up. This gives the values of both symbol A and symbol B.
2. Decode the DC coefficient
Similarly, first look up the Huffman table to get size, use size to get diff, and add diff to the DC coefficient of the previous 8*8 block to obtain the DC coefficient of this block.
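Once the Huffman lookup has produced RS (or size, for DC), the remaining steps can be sketched as below. The function names are hypothetical, and the sketch assumes the size amplitude bits have already been read from the stream into an integer:

```python
def split_rs(rs):
    """Split an RS byte (RRRRSSSS) into (runlength, size)."""
    return rs >> 4, rs & 0x0F

def vli_decode(size, bits):
    """Recover a signed value from `size` raw bits read after the Huffman code."""
    if size == 0:
        return 0
    # A leading 0 bit marks a negative number stored as value + 2**size - 1.
    if bits < (1 << (size - 1)):
        return bits - (1 << size) + 1
    return bits

print(split_rs(0x23))         # (2, 3)
print(vli_decode(3, 0b010))   # -5
print(vli_decode(3, 0b101))   # 5
```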
3.2 Decoding of the color components (Y, U, V) in an MCU
The image data stream is made up of MCUs, and each MCU is made up of data units for the color components. The stream stores information bit by bit. In addition, the data was produced during encoding by a forward discrete cosine transform (FDCT) from the spatial domain to the frequency domain, so each color component unit consists of two parts: 1 DC component and 63 AC components.
The color component units are compressed with run-length encoding (RLE) and Huffman coding. The data for each coefficient consists of two parts, a Huffman code and a value, and the two alternate in the stream (unless the size given by the code is zero, in which case no value bits follow). Decoding a code is in effect a search of the Huffman tree.
3.3 Differential coding of the DC coefficient
All color component units are grouped by color component (Y, Cr, and Cb). Within each color component, the DC values of adjacent units are differentially encoded. That is, the DC value just decoded is only the actual DC value of the current unit minus the actual DC value of the previous unit. The current DC value must therefore be corrected with the actual (not the directly decoded) DC value of the previous unit:
DC_n = DC_{n-1} + diff
Here diff is the difference correction value, that is, the DC coefficient as directly decoded. If the current color component unit is the first one, however, the decoded DC value is the real DC value.
The DC values of the three color components are differentially encoded separately. That is, decoding an image requires three independent DC correction variables.
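A minimal sketch of this correction, assuming the decoded differences have already been grouped per component (in a real stream the units of different components are interleaved inside each MCU). The function name and the sample numbers are hypothetical:

```python
def restore_dc(diffs_by_component):
    """Turn decoded DC differences into actual DC values, one predictor per component."""
    restored = {}
    for component, diffs in diffs_by_component.items():
        previous = 0              # predictor starts at 0, so the first diff is the real DC
        values = []
        for diff in diffs:
            previous += diff      # DC_n = DC_{n-1} + diff
            values.append(previous)
        restored[component] = values
    return restored

print(restore_dc({"Y": [50, -3, 4], "Cb": [10, 0], "Cr": [-2, 5]}))
# {'Y': [50, 47, 51], 'Cb': [10, 10], 'Cr': [-2, 3]}
```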
3.4 Dequantization
Dequantization is relatively simple. Multiply each of the 64 values of an 8*8 color component unit by the value at the same position in the corresponding quantization table. Every color component unit in the image must be dequantized.
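As a sketch, the multiplication looks like this. The function name and sample values are hypothetical. One point to watch: the DQT segment stores the table in zigzag order, so the block and the table need to be in the same order before they are multiplied:

```python
def dequantize(block, qtable):
    """Multiply each quantized value by the step at the same position."""
    return [[c * q for c, q in zip(brow, qrow)]
            for brow, qrow in zip(block, qtable)]

# Hypothetical 2x2 corner of a block and its quantization steps.
print(dequantize([[-26, -3], [0, 2]], [[16, 11], [12, 40]]))
# [[-416, -33], [0, 80]]
```

The result only approximates the original coefficients, because the rounding done during quantization cannot be undone.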
3.5 Inverse zigzag scan
3.6 Inverse discrete cosine transform
As mentioned above, the data in the file was obtained during encoding by a forward discrete cosine transform (FDCT) from the spatial domain to the frequency domain. The inverse discrete cosine transform (IDCT) used in decoding therefore converts the frequency-domain values in each color component matrix back to the spatial domain. The size is unchanged: an 8x8 frequency-domain matrix is still 8x8 in the spatial domain after the IDCT.
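A direct, unoptimized sketch of the 2-D IDCT is shown here for illustration. It applies the standard orthonormal DCT definition, which for 8x8 blocks matches the JPEG formula; real decoders normally use a fast factorized version instead. The function name is hypothetical:

```python
import math

def idct_2d(coeffs):
    """Naive 2-D inverse DCT of an N x N block (N = 8 in JPEG)."""
    n = len(coeffs)

    def scale(k):
        # Normalization factor: the zero-frequency term is weighted differently.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (scale(u) * scale(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[x][y] = total
    return out

# A block with only a DC coefficient decodes to a flat 8x8 block.
block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 80.0
flat = idct_2d(block)
print(round(flat[0][0], 6), round(flat[7][7], 6))  # 10.0 10.0
```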
3.7 Convert YCrCb to RGB
To display an image on the screen, its colors must be expressed in RGB. The YCrCb representation must therefore be converted to RGB during decoding.
In addition, the discrete cosine transform requires a value range that is symmetric about zero, so during encoding the sample value range is shifted from [0, 255] to [-128, 127]. Therefore, 128 must be added to each component during decoding. The formulas are:
R = Y + 1.402 * Cr + 128;
G = Y - 0.34414 * Cb - 0.71414 * Cr + 128;
B = Y + 1.772 * Cb + 128;
Another problem is that the R, G, and B values obtained from the conversion may fall outside the valid range, so each one has to be checked: values greater than 255 are clamped to 255, and values less than 0 are clamped to 0.
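The conversion and the range check can be sketched together. This assumes the standard JFIF coefficients (Cr drives red, Cb drives blue) and zero-centred inputs as they come out of the IDCT; the function names are hypothetical:

```python
def clamp(value):
    """Round to an integer and force the result into [0, 255]."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert zero-centred Y, Cb, Cr values to 8-bit R, G, B."""
    r = y + 1.402 * cr + 128
    g = y - 0.34414 * cb - 0.71414 * cr + 128
    b = y + 1.772 * cb + 128
    return clamp(r), clamp(g), clamp(b)

print(ycbcr_to_rgb(0, 0, 0))       # (128, 128, 128)
print(ycbcr_to_rgb(127, 0, 127))   # (255, 164, 255)
```

In the second call the red and blue results exceed 255 before clamping, which is the out-of-range case described above.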
So far, the decoding of each MCU has been completed.
Source: http://blog.sina.com.cn/s/blog_61d40cc30100f5e6.html
Author: quennel