Introduction to image compression coding and JPEG compression coding standards


Before introducing image compression coding, consider a question: why compress? The answer hardly needs stating, because the amount of data in image information is staggering. For example, how large is an A4 (210mm * 297mm) page scanned in true color at a medium resolution of 300 dpi? Let us calculate: there are (300*210/25.4) * (300*297/25.4) pixels in total, each occupying 3 bytes, giving about 26 MB of data. Today on the Internet, traditional character-based applications are gradually being replaced by the WWW (World Wide Web), which can display image information. The WWW is attractive, but it brings a problem: image data is so large that the already tight network bandwidth becomes even more overloaded, turning the World Wide Web into a World Wide Wait.
In short, the huge volume of image data strains storage capacity, the bandwidth of communication trunk channels, and the processing speed of computers. It is unrealistic to solve this problem simply by adding memory, widening channels, and speeding up processors; compression must be considered. The theoretical basis of compression is information theory. From the information-theoretic perspective, compression removes the redundancy in information: it retains the uncertain part and removes the certain part (which can be inferred), replacing the original redundant description with one closer to the essence of the information, that essence being the amount of information (that is, uncertainty).
Compression falls into two categories. In the first, the compression process is reversible: the original image can be completely recovered from the compressed image with no loss of information; this is lossless compression. In the second, the process is irreversible: the original image cannot be fully recovered and information is lost; this is lossy compression. Choosing between them is a trade-off. Lossless compression is of course desirable, but the compression ratio of lossy compression (that is, the ratio of the number of bytes in the source image to the number of bytes in the compressed image; the larger the ratio, the higher the compression efficiency) is usually higher than that of lossless compression.
Image compression is generally achieved by changing the representation of the image, so compression and encoding are inseparable. The main application of image compression is the transmission and storage of image information, and it is widely used in broadcast television, video conferencing, computer communication, fax, multimedia systems, medical imaging, satellite imagery, and other fields.
There are many compression coding methods, falling mainly into four categories: 1. pixel coding; 2. predictive coding; 3. transform coding; 4. other methods.
Pixel coding processes each pixel independently during encoding, without considering the correlation between pixels. Commonly used pixel coding methods include: 1. pulse code modulation (PCM); 2. entropy coding; 3. run-length coding; 4. bit-plane coding. Below we introduce Huffman coding from the entropy coding family, and run-length coding (taking the reading of .PCX files as an example).
So-called predictive coding removes the correlation and redundancy between adjacent pixels and encodes only the new information. For example, because the gray scale of an image is usually continuous, the difference in gray value between adjacent pixels within a region tends to be very small. If we record only the gray value of the first pixel and then represent each subsequent pixel by its difference from the previous one, compression is achieved. For example, the 6 pixels with gray values 248, 250, 251, 251, 252, 255 can be coded as 248, 2, 1, 0, 1, 3: representing 250 directly needs 8 bits, while the difference 2 needs only 2 bits. Commonly used predictive codes include Delta Modulation (DM) and Differential Pulse Code Modulation (DPCM); we will not go into the details.
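A minimal sketch of this idea in C (the function name is illustrative): the first gray value is stored as-is and every later pixel as its difference from the previous one. The variable-length packing of the small differences, which is where the actual saving comes from, is omitted.

/* Sketch of predictive (differential) coding of one row of 8-bit gray values. */
void delta_encode(const unsigned char *gray, int n, int *out)
{
    int i;
    out[0] = gray[0];                       /* first pixel kept verbatim */
    for (i = 1; i < n; i++)
        out[i] = gray[i] - gray[i - 1];     /* small values in smooth regions */
}

Applied to the six gray values above, this yields 248, 2, 1, 0, 1, 3.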
So-called transform coding converts the image into another data domain (such as the frequency domain), so that a large amount of information can be represented with less data, achieving compression. There are many transform codes, such as: 1. the Discrete Fourier Transform (DFT); 2. the Discrete Cosine Transform (DCT); 3. the Discrete Hadamard Transform (DHT).
There are also many other coding methods, such as hybrid coding, vector quantization (VQ), and the LZW algorithm; here we will only introduce the general idea of LZW. It is worth noting that many new compression coding methods have emerged in recent years, such as compression algorithms using artificial neural networks (ANN), fractal coding, wavelet coding, object-based coding, and model-based coding (expected to be used in MPEG-4 and future video compression coding standards). These are beyond the scope of this article.
At the end of this article, we will take the JPEG compression encoding standard as an example to see how the above encoding methods are applied in actual compression encoding.

1. Huffman Encoding
Huffman coding is a common compression coding method, established by Huffman in 1952 originally for compressing text files. Its basic principle is to replace frequently used data with shorter codes and rarely used data with longer ones; each datum gets a distinct binary code, and the code lengths vary. For example, suppose a source contains eight symbols S0, S1, S2, S3, S4, S5, S6, S7. A fixed-length code needs at least 3 bits per symbol, say the codewords 000, 001, 010, 011, 100, 101, 110, 111. The symbol sequence S0 S1 S7 S0 S1 S6 S2 S2 S3 S4 S5 S0 S0 S1 is then encoded as 000 001 111 000 001 110 010 010 011 100 101 000 000 001, 42 bits in total. We notice that S0, S1, and S2 appear frequently while the other symbols appear rarely; if we adopt a coding scheme in which the codes of S0, S1, and S2 are short and the codes of the other symbols are long, the number of bits occupied is reduced.
For example, suppose we adopt this scheme: S0 = 00, S1 = 01, S2 = 100, S7 = 101, S3 = 1100, S4 = 1101, S5 = 1110, S6 = 1111 (one valid set of codewords for the frequencies computed below). The symbol sequence above then becomes 00 01 101 00 01 1111 100 100 1100 1101 1110 00 00 01, 39 bits in total. Although some codewords such as those of S3, S4, S5, and S6 become longer (from 3 bits to 4), the frequently used codewords of S0 and S1 become shorter, so compression is achieved.
How are such codes obtained? They cannot be written down arbitrarily. The encoding must guarantee that no codeword is a prefix of another. For example, if the codeword of S0 were 01 and the codeword of S2 were 011, then on encountering 011 in the stream we could not tell whether it is the S0 codeword followed by a 1 beginning the next codeword, or the complete S2 codeword. The codes given above guarantee this prefix property.
The following describes the specific Huffman encoding algorithm.
1. Count the frequency of occurrence of each symbol. In the example above, S0 to S7 occur with frequencies 4/14, 3/14, 2/14, 1/14, 1/14, 1/14, 1/14, and 1/14, respectively.
2. Sort the above frequencies from left to right in ascending order.
3. Each time, select the two smallest remaining values and make them the two leaf (child) nodes of a new binary-tree node whose weight is their sum. The two children take no further part in the comparison; the new node does.
4. Repeat step 3 until a single root node with weight 1 is obtained.
5. Label each left branch of the binary tree 0 and each right branch 1. Reading the 0s and 1s from the root down to each leaf gives the code of each symbol.
The Huffman coding process for the preceding example is shown in Figure 1; the numbers in circles are the weights of the newly generated nodes. The codes given above can be read off the tree.

Figure 1. Huffman encoding process
Generating a Huffman code requires scanning the original data twice: the first scan precisely counts the frequency of each value, and the second builds the Huffman tree and performs the encoding. Because a binary tree must be built and traversed to generate the codes, compression and restoration are relatively slow, but the method is simple and effective, and is therefore widely used.
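Below is a sketch of steps 1 to 5 in C, built around the S0..S7 example (the names and layout are mine, not from any particular library). Note that a Huffman code is not unique: the exact codewords printed depend on how ties between equal weights are broken, but the code lengths and the 39-bit total are the same.

#include <stdio.h>

#define NSYM 8

/* All nodes live in one array: the first NSYM entries are the leaves, and
   each pass merges the two smallest unattached nodes into a new node,
   exactly as in steps 3 and 4 above. */
typedef struct {
    int weight;    /* occurrence count of the symbol or subtree */
    int parent;    /* index of parent node, -1 while unattached */
    int is_right;  /* 1 if right child of its parent (code bit 1) */
} Node;

void build_codes(const int count[NSYM], char codes[NSYM][NSYM])
{
    Node t[2 * NSYM - 1];
    int n, i, j;

    for (i = 0; i < NSYM; i++) {
        t[i].weight = count[i];
        t[i].parent = -1;
    }
    for (n = NSYM; n < 2 * NSYM - 1; n++) {   /* step 4: repeat the merge */
        int a = -1, b = -1;                   /* indices of two smallest */
        for (i = 0; i < n; i++) {
            if (t[i].parent != -1)
                continue;
            if (a < 0 || t[i].weight < t[a].weight) { b = a; a = i; }
            else if (b < 0 || t[i].weight < t[b].weight) { b = i; }
        }
        t[n].weight = t[a].weight + t[b].weight;  /* step 3: new node */
        t[n].parent = -1;
        t[a].parent = n; t[a].is_right = 0;       /* left branch: bit 0 */
        t[b].parent = n; t[b].is_right = 1;       /* right branch: bit 1 */
    }
    for (i = 0; i < NSYM; i++) {              /* step 5: read off the bits */
        char buf[NSYM];
        int len = 0;
        for (j = i; t[j].parent != -1; j = t[j].parent)
            buf[len++] = t[j].is_right ? '1' : '0';
        for (j = 0; j < len; j++)             /* reverse: root to leaf */
            codes[i][j] = buf[len - 1 - j];
        codes[i][len] = '\0';
    }
}

int main(void)
{
    int count[NSYM] = { 4, 3, 2, 1, 1, 1, 1, 1 };  /* S0..S7 frequencies */
    char codes[NSYM][NSYM];
    int i;
    build_codes(count, codes);
    for (i = 0; i < NSYM; i++)
        printf("S%d -> %s\n", i, codes[i]);
    return 0;
}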

2. Run-Length Coding
The principle of run-length coding is also very simple: adjacent pixels in a row with the same color value are replaced by a count and that color value. For example, aaabccccccddeee can be expressed as 3a1b6c2d3e. If an image consists of many areas of identical color, the compression efficiency of run-length coding is striking. However, the algorithm has a fatal weakness: if every two adjacent points in an image differ in color, it not only fails to compress but doubles the data size. Therefore not many compression formats use run-length coding alone; the PCX file is one of them. PCX was one of the earliest file formats, used by the PC Paintbrush software; because of its low compression ratio it is not used much now. A PCX file consists of three parts: header information, palette, and the actual image data. The header is structured as follows:
typedef struct {
    char manufacturer;     /* PCX identifier, must be 0x0A */
    char version;
    char encoding;
    char bits_per_pixel;
    WORD xmin, ymin;       /* image window: width  = xmax - xmin + 1 */
    WORD xmax, ymax;       /*               height = ymax - ymin + 1 */
    WORD hres;
    WORD vres;
    char palette[48];
    char reserved;
    char colour_planes;
    WORD bytes_per_line;   /* bytes per encoded scanline */
    WORD palette_type;
    char filler[58];
} PCXHEAD;
Note the following fields: manufacturer is the PCX file identifier and must be 0x0A; xmin and xmax are the smallest and largest x coordinates, so the image width is xmax - xmin + 1 and the image height is ymax - ymin + 1; bytes_per_line is the number of bytes occupied by each encoded row.
The PCX palette is at the end of the file. Taking a 256-color PCX file as an example: the 769th byte from the end of the file marks the palette and must be 12 (0x0C) for a 256-color palette, and the final 768 (256*3) bytes are the RGB values of the palette.
For convenience, we describe the decoding process of a 256-color PCX file. Encoding is the inverse of decoding; if you are interested, try to implement it yourself.
Decoding proceeds row by row. The number of bytes occupied by a row is given by bytes_per_line, so we allocate a decoding buffer of bytes_per_line bytes and clear it at the start. Read a byte C from the file. If its two high bits are both set (C >= 0xC0), it carries run-length information: the low 6 bits of C give the number of consecutive identical bytes (so one run can cover at most 63 pixels of the same color), and the next byte in the file is the actual image data (the index of the color in the palette). Otherwise C itself is actual image data. This repeats until bytes_per_line bytes have been produced, completing the row; a PCX image is composed of a succession of such decoded rows.

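The full decoder source is not reproduced here; below is a minimal sketch of the per-row decoding loop just described, assuming the PCXHEAD structure above and a file positioned at the start of the image data (names are illustrative):

#include <stdio.h>
#include <string.h>

/* Sketch of decoding one PCX row: a byte whose two high bits are both set
   carries a run length in its low 6 bits and is followed by the data byte;
   any other byte is literal data (a palette index). */
int decode_pcx_row(FILE *fp, unsigned char *row, int bytes_per_line)
{
    int pos = 0;
    memset(row, 0, bytes_per_line);          /* clear the row buffer first */
    while (pos < bytes_per_line) {
        int c = fgetc(fp);
        if (c == EOF)
            return -1;
        if ((c & 0xC0) == 0xC0) {            /* run-length pair */
            int count = c & 0x3F;            /* low 6 bits: at most 63 */
            int value = fgetc(fp);
            if (value == EOF)
                return -1;
            while (count-- > 0 && pos < bytes_per_line)
                row[pos++] = (unsigned char)value;
        } else {
            row[pos++] = (unsigned char)c;   /* literal palette index */
        }
    }
    return 0;
}

Calling this once per image row, with bytes_per_line taken from the header, rebuilds the palette-index data; looking each index up in the palette at the end of the file then yields the RGB pixels.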

3. General idea of LZW algorithm
LZW is a relatively complex compression algorithm with high compression efficiency. Here we introduce only its basic principle: LZW assigns a numeric code to each string the first time it appears, and the decompressor maps the code back to the original string. For example, if the code 0x100 is used to stand for the string "abccddeee", then every later occurrence of that string is replaced by 0x100. The mapping between codes and strings is generated dynamically during compression, and it is implicit in the compressed data: as decompression proceeds, the code table is gradually rebuilt from the data already decoded, and later compressed data yields further mappings based on the earlier ones, until the file ends. LZW is lossless; GIF files use this compression algorithm.
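A toy sketch of the compressor side may help fix the idea. This shows only the dictionary logic; a real implementation such as the one in GIF uses variable-width codes and a hash table rather than the linear search and small fixed limits used here.

#include <string.h>

#define MAX_CODES 4096   /* typical 12-bit code space */
#define MAX_STR   64     /* cap on dictionary string length for this sketch */

static unsigned char dict_data[MAX_CODES][MAX_STR];
static int dict_len[MAX_CODES];
static int dict_size;

static int dict_find(const unsigned char *s, int len)
{
    int i;
    for (i = 0; i < dict_size; i++)
        if (dict_len[i] == len && memcmp(dict_data[i], s, len) == 0)
            return i;
    return -1;
}

int lzw_compress(const unsigned char *in, int n, int *codes)
{
    unsigned char cur[MAX_STR + 1];
    int cur_len = 0, ncodes = 0, i;

    dict_size = 256;                  /* codes 0..255 stand for single bytes */
    for (i = 0; i < 256; i++) {
        dict_data[i][0] = (unsigned char)i;
        dict_len[i] = 1;
    }
    for (i = 0; i < n; i++) {
        cur[cur_len++] = in[i];
        if (cur_len <= MAX_STR && dict_find(cur, cur_len) >= 0)
            continue;                               /* string still known */
        codes[ncodes++] = dict_find(cur, cur_len - 1);  /* longest known prefix */
        if (dict_size < MAX_CODES && cur_len <= MAX_STR) {
            memcpy(dict_data[dict_size], cur, cur_len); /* learn new string */
            dict_len[dict_size++] = cur_len;
        }
        cur[0] = in[i];                             /* restart from last byte */
        cur_len = 1;
    }
    if (cur_len > 0)
        codes[ncodes++] = dict_find(cur, cur_len);
    return ncodes;   /* number of codes; bit-packing of the codes is omitted */
}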
Note that the LZW algorithm is covered by a United States patent held by Unisys.

4. JPEG compression encoding standard
JPEG is the abbreviation of the Joint Photographic Experts Group. It is the still-image compression coding standard developed jointly by the International Organization for Standardization (ISO) and the CCITT. At comparable image quality, JPEG achieves the highest compression ratio for still images among the common file formats (such as GIF, TIFF, and PCX). A concrete comparison: take clouds.bmp from the Windows 95 directory, a 640*480, 256-color image. Using the tool SEA (version 1.3) to convert it to 24-bit BMP, 24-bit JPEG, GIF (256 colors only), 24-bit compressed TIFF, and 24-bit compressed TGA gives file sizes (in bytes) of 921,654, 17,707, 177,152, 923,044, and 768,136 respectively. JPEG's compression ratio is clearly much higher than the others, at similar image quality (note that JPEG handles only true-color and grayscale images). Because of its high compression ratio, JPEG is widely used in multimedia and network applications; for example, it is one of the image formats supported in HTML (the other being GIF), an obvious choice since network bandwidth is precious and a format with a high compression ratio is essential.


JPEG has several modes, of which the most common is the sequential DCT-based mode, also called the baseline system. Everything below refers to this mode.

JPEG compression principle
JPEG in fact applies almost all of the coding methods described above, which is precisely why it achieves such a high compression ratio. The flow of its encoder is shown in Figure 3.


Figure 3. Encoder flow
The decoder basically performs the inverse of this process:


Figure 4. Decoder flow

After an 8*8 image block is transformed by the DCT, its low-frequency components are concentrated in the upper left corner and its high-frequency components in the lower right corner. Since the low-frequency components contain the main information of the image (such as brightness) while the high-frequency components matter less, we can afford to discard high-frequency components to compress the image. Removing high-frequency components is done by quantization, which is the root cause of the information loss: each coefficient is divided by the corresponding value in a quantization table and rounded. Because the values in the upper left of the quantization table are small and those toward the lower right are large, the low-frequency components are preserved while the high-frequency components are suppressed.
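As a sketch (in floating point for clarity; real codecs use scaled integer arithmetic), quantization is an element-wise divide-and-round against the table:

#include <math.h>

/* Sketch: quantize an 8*8 block of DCT coefficients. Table entries are small
   at the upper left (low frequencies kept precise) and large toward the lower
   right (high frequencies coarsened, so most of them round to zero). */
void quantize(const double dct[8][8], const int qtable[8][8], int out[8][8])
{
    int u, v;
    for (u = 0; u < 8; u++)
        for (v = 0; v < 8; v++)
            out[u][v] = (int)floor(dct[u][v] / qtable[u][v] + 0.5); /* round */
}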
JPEG works on colors in YUV format. As mentioned, the Y component carries brightness information and the U and V components carry color-difference information; by comparison, the Y component is more important. We can therefore quantize Y finely and UV coarsely to further raise the compression ratio. Hence there are usually two quantization tables: one for Y and one for UV.
As mentioned above, after the DCT the low-frequency components are concentrated in the upper left corner, and F(0,0) (the element in the first row and first column) is the DC coefficient, i.e., the average of the 8*8 sub-block; it is coded separately. Because the DC coefficients of two adjacent 8*8 sub-blocks differ very little, differential coding (DPCM) is used for them: the difference between the DC coefficients of adjacent sub-blocks is encoded, which improves compression. The other 63 elements of the 8*8 block are the AC coefficients, which are run-length coded. Here a question arises: in what order should those 63 coefficients be arranged? To make the low-frequency components appear first and the high-frequency components later, thereby lengthening the runs of consecutive zeros, the 63 elements are scanned in the "Zig-Zag" order shown in Figure 5:


Figure 5. Zig-Zag scan order
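In code the scan is normally driven by a precomputed table of 64 indices. The table below is the standard zig-zag order (the same order used by common JPEG implementations), giving for each scan position the row-major index inside the 8*8 block:

/* zigzag[k] is the row-major index (row*8 + column) of the k-th coefficient
   visited by the zig-zag scan, starting at the DC coefficient. */
static const int zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Reorder a quantized block into zig-zag sequence before run-length coding. */
void zigzag_scan(const int block[64], int out[64])
{
    int k;
    for (k = 0; k < 64; k++)
        out[k] = block[zigzag[k]];
}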
The run-length code of the 63 AC coefficients is expressed in two bytes, as shown in Figure 6:


Figure 6. Run-length code format
From the above we obtain the DC code and the AC run-length code. To raise the compression ratio further, entropy coding is applied; JPEG uses Huffman coding here, in two steps:
(1) representation of the intermediate entropy encoding format:
An AC coefficient is represented by two symbols. Symbol 1 is the (RunLength, Size) pair described above. (0, 0) and (15, 0) are special cases: (0, 0) is the end-of-block mark (EOB), and (15, 0) is ZRL; when a zero run exceeds 15, ZRL symbols are inserted, so there can be at most three ZRLs (3*16 + 15 = 63). Symbol 2 is the amplitude value (Amplitude).
A DC coefficient is likewise represented by two symbols: symbol 1 is the Size, and symbol 2 is the amplitude value (Amplitude).
(2) entropy encoding:
For the AC coefficients, symbols 1 and 2 are encoded separately. When a zero run exceeds 15, a (15, 0) symbol is emitted, and at the end of the block there is a single (0, 0) symbol. Symbol 1 is Huffman coded (the luminance and chrominance Huffman tables differ). Symbol 2 is coded as a variable-length integer (VLI). For example, when Size is 6 the Amplitude range is -63 ~ -32 and 32 ~ 63; the codes of two values with the same absolute value and opposite signs are one's complements of each other. Thus if the AC coefficient is 32 its codeword is 100000, the codeword of 33 is 100001, the codeword of -32 is 011111, and the codeword of -33 is 011110. The codeword of symbol 2 is appended after the codeword of symbol 1.
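A sketch of VLI coding in C (the small struct is illustrative): Size is the bit length of |value|; a positive value is coded in natural binary, a negative one as the one's complement of the same magnitude within Size bits.

/* Sketch: variable-length-integer (VLI) coding of an amplitude. */
typedef struct {
    int size;        /* number of bits (the Size symbol) */
    unsigned bits;   /* the codeword itself, in the low 'size' bits */
} VLI;

VLI vli_encode(int value)
{
    VLI v;
    int mag = value < 0 ? -value : value;
    v.size = 0;
    while (mag >> v.size)                        /* bit length of |value| */
        v.size++;
    if (value >= 0)
        v.bits = (unsigned)value;                /* natural binary */
    else                                         /* one's complement */
        v.bits = (~(unsigned)mag) & ((1u << v.size) - 1);
    return v;
}

For instance, vli_encode(3) returns size 2 and bits 11, and vli_encode(-2) returns size 2 and bits 01, matching the examples used below.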
For the DC coefficients, the Huffman tables for Y and UV likewise differ. After so much description you may well be lost by now; an example will make the above process easy to understand.
Below are the quantized coefficients of an 8*8 luminance (Y) image sub-block:
15   0  -1   0   0   0   0   0
-2  -1   0   0   0   0   0   0
-1  -1   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
As can be seen, after quantization only a few points in the upper left corner (low-frequency components) are nonzero, which makes run-length coding very effective.
Step 1: produce the intermediate format for entropy coding. Look at the DC coefficient first. Suppose the quantized DC coefficient of the previous 8*8 sub-block was 12; the DC coefficient of this block is 15, so the difference is 3. From the following table
Size  Amplitude
0     0
1     -1, 1
2     -3, -2, 2, 3
3     -7 ~ -4, 4 ~ 7
4     -15 ~ -8, 8 ~ 15
5     -31 ~ -16, 16 ~ 31
6     -63 ~ -32, 32 ~ 63
7     -127 ~ -64, 64 ~ 127
8     -255 ~ -128, 128 ~ 255
9     -511 ~ -256, 256 ~ 511
10    -1023 ~ -512, 512 ~ 1023
11    -2047 ~ -1024, 1024 ~ 2047
we find Size = 2 and Amplitude = 3, so the intermediate format of the DC coefficient is (2)(3). Next the AC coefficients are coded. Scanning in Zig-Zag order, the first nonzero coefficient is -2, preceded by 1 zero (RunLength = 1). From the following AC coefficient table
Size  Amplitude
1     -1, 1
2     -3, -2, 2, 3
3     -7 ~ -4, 4 ~ 7
4     -15 ~ -8, 8 ~ 15
5     -31 ~ -16, 16 ~ 31
6     -63 ~ -32, 32 ~ 63
7     -127 ~ -64, 64 ~ 127
8     -255 ~ -128, 128 ~ 255
9     -511 ~ -256, 256 ~ 511
10    -1023 ~ -512, 512 ~ 1023
the Size of -2 is 2. So RunLength = 1, Size = 2, Amplitude = -2, and the intermediate format of this AC coefficient is (1, 2)(-2).
The remaining coefficients are handled similarly. The complete intermediate format of this 8*8 sub-block is:
(DC)(2)(3), (1,2)(-2), (0,1)(-1), (0,1)(-1), (0,1)(-1), (2,1)(-1), (0,0)(EOB)
Step 2: entropy coding.
For (2)(3): looking up 2 in the DC luminance Huffman table gives 011; 3 coded as a VLI is 11.
For (1,2)(-2): looking up (1,2) in the AC luminance Huffman table gives 11011; -2 is the one's complement of 2 (10), i.e., 01.
For (0,1)(-1): looking up (0,1) in the AC luminance Huffman table gives 00; -1 is the one's complement of 1, i.e., 0.
......
With (2,1) looked up as 11100 and (0,0) (EOB) as 1010, the luminance information of this 8*8 sub-block compresses to the bit stream 01111 1101101 000 000 000 111000 1010, 31 bits in all. The compression ratio is 64*8/31 ≈ 16.5, about half a bit per pixel.
The compression ratio trades off against image quality. The following table shows the approximate relationship between compression efficiency and image quality; you can select a suitable compression ratio according to your needs.
Compression efficiency (bits/pixel)   Image quality
0.25 ~ 0.50    moderate to good; sufficient for some applications
0.50 ~ 0.75    good to very good; sufficient for many applications
0.75 ~ 1.5     excellent; sufficient for most applications
1.5 ~ 2.0      usually indistinguishable from the original image

The above describes the JPEG compression principle: the DC coefficients use predictive coding (DPCM), the AC coefficients use transform coding (DCT), and both then use entropy coding (Huffman). Almost all the traditional compression methods appear here, and it is precisely their combination that gives JPEG its high compression ratio. Incidentally, the JPEG committee arrived at this standard through comparative testing of many different schemes; it was not conjured out of thin air.
The above describes the basic principles of JPEG compression. The following describes the JPEG file format.

JPEG file format
A JPEG file can be divided into two parts: markers and compressed data. We first introduce the marker section.
The marker section provides all the parameters of the JPEG image (somewhat like the header information of a BMP file, but more complex), such as width, height, Huffman tables, and quantization tables. There are many markers, but the vast majority of JPEG files contain only a few. The marker structure is:
SOI
DQT
DRI
SOF0
DHT
SOS
...
EOI
A marker consists of two bytes, the first of which is 0xFF. Any number of 0xFF fill bytes may be inserted before a marker.
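As a sketch of how a reader walks these markers in practice (assuming a well-formed baseline file; the entropy-coded data that follows SOS, with its byte stuffing, is not handled here):

#include <stdio.h>

/* Sketch: list the marker segments of a JPEG file. A marker is 0xFF followed
   by a non-0xFF byte; most segments then carry a 2-byte big-endian length
   that counts the length field itself but not the two marker bytes. */
void list_markers(FILE *fp)
{
    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (c != 0xFF)
            continue;
        while ((c = fgetc(fp)) == 0xFF)  /* skip any 0xFF fill bytes */
            ;
        if (c == EOF)
            break;
        printf("marker 0xFF%02X\n", c);
        if (c == 0xD8 || c == 0xD9)      /* SOI, EOI carry no payload */
            continue;
        if (c == 0xDA)                   /* SOS: compressed data follows */
            break;
        {
            int hi = fgetc(fp), lo = fgetc(fp);
            long len = ((long)hi << 8) | lo;
            fseek(fp, len - 2, SEEK_CUR);    /* skip the segment body */
        }
    }
}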
The following describes the structure and meaning of some common tag codes.
SOI (Start of Image)
Field  Bytes
0xFF   1
0xD8   1
SOI can be used to identify the JPEG format (JFIF additionally requires APP0).
APP0 (Application)
Field        Bytes  Meaning
0xFF         1
0xE0         1
Lp           2      APP0 segment length, excluding the two marker bytes 0xFF, 0xE0
Identifier   5      JFIF identifier: 0x4A, 0x46, 0x49, 0x46, 0x00 ("JFIF\0")
Version      2      JFIF version, 0x0101 or 0x0102
Units        1      0: unspecified; 1: inches; 2: centimeters
Xdensity     2      horizontal resolution
Ydensity     2      vertical resolution
Xthumbnail   1      thumbnail width in pixels
Ythumbnail   1      thumbnail height in pixels
RGB0         3      RGB value
RGB1         3      RGB value
...
RGBn         3      RGB value, n = Xthumbnail * Ythumbnail
APP0 is the marker JPEG reserves for applications; JFIF defines its file information in this marker.

DQT (Define Quantization Table)
Field     Bytes   Meaning
0xFF      1
0xDB      1
Lq        2       DQT segment length, excluding the two marker bytes 0xFF, 0xDB
(Pq, Tq)  1       the high 4 bits Pq give the precision of the quantization values: Pq = 0 means the Q values are 8 bits, Pq = 1 means 16 bits; the low 4 bits Tq give the table number, 0 ~ 3. In the baseline system Pq = 0 and Tq = 0 ~ 1, i.e., there are at most two quantization tables
Q0        1 or 2  quantization table value: one byte when Pq = 0, two bytes when Pq = 1
Q1        1 or 2  as above
...
Qn        1 or 2  as above; n = 0 ~ 63, i.e., the 64 values of the quantization table
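A sketch of parsing one DQT segment body (p and len are assumed to cover the bytes after the length field Lq; per the standard, the 64 values are stored in zig-zag scan order):

/* Sketch: read the quantization table(s) in a DQT segment. Each table starts
   with a (Pq, Tq) byte: Pq = 0 means 8-bit values, Pq = 1 means 16-bit
   big-endian values; Tq is the table number. */
int parse_dqt(const unsigned char *p, int len, unsigned short qtab[4][64])
{
    while (len > 0) {
        int pq = p[0] >> 4;      /* precision: 0 -> 1 byte, 1 -> 2 bytes */
        int tq = p[0] & 0x0F;    /* table number, 0..1 in the baseline */
        int i;
        p++; len--;
        for (i = 0; i < 64; i++) {
            if (pq == 0) {
                qtab[tq][i] = p[0];
                p += 1; len -= 1;
            } else {
                qtab[tq][i] = (unsigned short)((p[0] << 8) | p[1]);
                p += 2; len -= 2;
            }
        }
    }
    return 0;
}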

DRI (Define Restart Interval)
Explaining this marker requires first introducing the Minimum Coding Unit (MCU). As mentioned above, the Y component data is important while the UV component data is relatively unimportant, so only part of the UV data need be kept, raising the compression ratio. Software supporting the JPEG format generally provides two sampling schemes, YUV411 and YUV422, the digits giving the sampling ratio of the Y, U, and V components. For example, if Y takes four data units, i.e., its horizontal sampling factor Hy times its vertical sampling factor Vy equals 4, while U and V take one data unit each (Hu * Vu = 1, Hv * Vv = 1), the sampling is called YUV411, as shown in Figure 7:

Figure 7. YUV411
It is easy to see that YUV411 cuts the data by 50% (12 data units originally, 6 now) and YUV422 by 33% (12 data units originally, 8 now). You may then wonder: would not YUV911 or YUV1611 compress even more? Image quality, however, must also be considered, so the JPEG standard constrains the minimum coding unit by requiring Hy*Vy + Hu*Vu + Hv*Vv ≤ 10.
How the blocks are arranged within an MCU is closely related to the values of H and V; see the following figures:


Figure 8. YUV111 block order


Figure 9. YUV211 block order


Figure 10. YUV411 block order
Field   Bytes   Meaning
0xFF    1
0xDD    1
Lr      2       DRI segment length, excluding the two marker bytes 0xFF, 0xDD
Ri      2       restart interval, in MCUs. Ri must be an integer multiple of the number of MCUs in one MCU row; the last restart interval need not contain exactly Ri MCUs. Each restart interval is encoded independently

SOF0 (Start of Frame; the baseline system uses only SOF0)
Field     Bytes  Meaning
0xFF      1
0xC0      1
Lf        2      SOF0 segment length, excluding the two marker bytes 0xFF, 0xC0
P         1      sample precision; 0x08 in the baseline system
Y         2      image height
X         2      image width
Nf        1      number of components in the frame, usually 1 or 3: 1 means a grayscale image, 3 a true-color image
C1        1      identifier of component 1
(H1, V1)  1      horizontal and vertical sampling factors of component 1
Tq1       1      quantization table number used by component 1
C2        1      identifier of component 2
(H2, V2)  1      horizontal and vertical sampling factors of component 2
Tq2       1      quantization table number used by component 2
...
Cn        1      identifier of component n
(Hn, Vn)  1      horizontal and vertical sampling factors of component n
Tqn       1      quantization table number used by component n

DHT (Define Huffman Table)
Field     Bytes  Meaning
0xFF      1
0xC4      1
Lh        2      DHT segment length, excluding the two marker bytes 0xFF, 0xC4
(Tc, Th)  1
L1        1
L2        1
...
L16       1
V1        1
V2        1
...
Vt        1
Tc is the high 4 bits and Th the low 4 bits. Tc = 0 means the table is a DC Huffman table; Tc = 1 means an AC Huffman table. Th is the Huffman table number, which in the baseline system is 0 or 1.
Thus in the baseline system there are at most four Huffman tables:
Tc  Th  Huffman table number (2*Tc + Th)
0   0   0
0   1   1
1   0   2
1   1   3
Ln is the number of Huffman codewords of length n bits, n = 1 ~ 16.
Vt is the value associated with each Huffman codeword, i.e., the symbol 1 mentioned earlier: for DC it is (Size), for AC it is (RunLength, Size). t = L1 + L2 + ... + L16.
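The codewords themselves are not stored in the file; a decoder reconstructs them from the counts L1 ~ L16 by the canonical procedure of the standard, sketched below: codewords of each length are assigned consecutively, in the order the values Vt are listed, and the running code is shifted left by one bit when moving to the next length.

/* Sketch: generate the Huffman codewords from the DHT length counts.
   L[1..16] gives the number of codewords of each bit length; the k-th value
   listed in the segment receives codes[k], which is sizes[k] bits long. */
void build_huffman(const int L[17], unsigned codes[], int sizes[], int *count)
{
    unsigned code = 0;
    int len, i, k = 0;
    for (len = 1; len <= 16; len++) {
        for (i = 0; i < L[len]; i++) {
            codes[k] = code++;   /* consecutive codes within one length */
            sizes[k] = len;
            k++;
        }
        code <<= 1;              /* one more bit for the next length */
    }
    *count = k;
}

For instance, with one codeword of length 2 and five of length 3, this yields 00, 010, 011, 100, 101, 110, ..., consistent with the DC luminance codes used in the worked example above.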

SOS (Start of Scan)
Field         Bytes  Meaning
0xFF          1
0xDA          1
Ls            2      SOS segment length, excluding the two marker bytes 0xFF, 0xDA
Ns            1
Cs1           1
(Td1, Ta1)    1
Cs2           1
(Td2, Ta2)    1
...
CsNs          1
(TdNs, TaNs)  1
Ss            1
Se            1
(Ah, Al)      1
Ns is the number of components in the scan; in the baseline system Ns = Nf (the number of components in the frame). Csk identifies the k-th component of the scan. In each (Tdk, Tak) byte, the high 4 bits Tdk and the low 4 bits Tak give the numbers of the DC and AC Huffman tables used by that component. In the baseline system Ss = 0, Se = 63, Ah = 0, Al = 0.

EOI (End of Image) end marker
Field  Bytes
0xFF   1
0xD9   1

Figure 11 shows the program flowchart of the JPEG baseline decoder.


Figure 11. Program flowchart of the JPEG baseline decoder
Because no optimized algorithms were used, the decoder is not fast. Profiling the program with the VC performance tool Profile showed that the most time-consuming part is the Inverse Discrete Cosine Transform (IDCT), which is no surprise, since floating-point instructions cost far more than integer ones. Using a fast IDCT algorithm therefore improves performance greatly. The fast IDCT algorithm I use, generally considered one of the better ones, rests on a simple main idea: decompose the two-dimensional IDCT into two passes of one-dimensional IDCTs, first over the rows and then over the columns.
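The sketch below shows only that row-column decomposition, with a direct 1-D IDCT for clarity; a real fast IDCT (such as the AAN algorithm) further factors the 1-D transform into a handful of multiplications.

#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Direct 1-D IDCT of 8 points:
   f(x) = 1/2 * sum over u of C(u) F(u) cos((2x+1)u*pi/16),
   with C(0) = 1/sqrt(2) and C(u) = 1 otherwise. */
static void idct_1d(const double in[8], double out[8])
{
    int x, u;
    for (x = 0; x < 8; x++) {
        double s = 0.0;
        for (u = 0; u < 8; u++) {
            double cu = (u == 0) ? 1.0 / sqrt(2.0) : 1.0;
            s += cu * in[u] * cos((2 * x + 1) * u * M_PI / 16.0);
        }
        out[x] = s / 2.0;
    }
}

/* 2-D IDCT of an 8*8 block by row-column decomposition: a 1-D IDCT over each
   row, then a 1-D IDCT over each column of the intermediate result. */
void idct_2d(const double coef[8][8], double block[8][8])
{
    double tmp[8][8], vec[8], res[8];
    int i, j;
    for (i = 0; i < 8; i++)
        idct_1d(coef[i], tmp[i]);            /* rows */
    for (j = 0; j < 8; j++) {                /* columns */
        for (i = 0; i < 8; i++)
            vec[i] = tmp[i][j];
        idct_1d(vec, res);
        for (i = 0; i < 8; i++)
            block[i][j] = res[i];
    }
}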
