JPEG image decoding

Source: Internet
Author: User

Introduction

JPEG is a widely used standard method for compressing images. The name is an abbreviation of Joint Photographic Experts Group. A file in this compression format is generally also called a JPEG. Common extensions for such files include .jpeg, .jfif, .jpg, and .jpe; of these, .jpg is the most popular.

JPEG/JFIF is the most common format for storing and transferring images on the Internet. However, it is not well suited to line drawings, text, or icons, because its lossy compression visibly damages such images; PNG and GIF are better choices for them. GIF, however, supports a color depth of only 8 bits per pixel, which makes it unsuitable for color-rich photographs, whereas PNG can preserve as much image detail as JPEG or more.

When saving as JPEG, Photoshop offers 11 compression levels, numbered 0 to 10. Level 0 gives the highest compression ratio and the worst image quality. Even at level 10, which preserves detail almost losslessly, the file is still substantially compressed. For example, an image that occupies megabytes when saved as BMP takes only 178 KB when saved as JPG. After many comparisons, level 8 turned out to be the best trade-off between storage space and image quality.

JPEG itself only describes how to convert an image into a byte stream; it does not specify how those bytes are stored on any particular storage medium. JPEG compression is usually lossy, meaning that some image quality is lost during compression. One JPEG-based standard, Lossless JPEG, uses lossless compression, but it is not widely supported.

JPEG compression and encoding process:

  • Color Model (color space conversion)

    JPEG images use the YCbCr color model (a form of YUV color encoding, in which "Y" is the brightness (luminance, luma) and "U" and "V" are the color (chrominance, chroma)) rather than RGB, the model most commonly used on computers. An 8-bit brightness value is stored for every pixel, while one Cb value and one Cr value are stored for every 2x2 block of pixels; to the naked eye the image looks almost unchanged. Four pixels that need 4 x 3 = 12 bytes in RGB therefore need only 4 + 2 = 6 bytes, an average of 12 bits per pixel. JPEG can also store the chroma values of every pixel. In MPEG, each pixel is stored in 12 bits, which is abbreviated as YUV12.

    In the PAL system, the YUV components are computed from the normalized (gamma-corrected) R', G', and B' values:

    Y =  0.299R' + 0.587G' + 0.114B'
    U = -0.147R' - 0.289G' + 0.436B'
    V =  0.615R' - 0.515G' - 0.100B'

    ITU-R version of the formula:

    Y  =  0.299  * R + 0.587  * G + 0.114  * B          (brightness)
    Cb = -0.1687 * R - 0.3313 * G + 0.5    * B + 128
    Cr =  0.5    * R - 0.4187 * G - 0.0813 * B + 128

    R = Y + 1.402 * (Cr - 128)
    G = Y - 0.34414 * (Cb - 128) - 0.71414 * (Cr - 128)
    B = Y + 1.772 * (Cb - 128)
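
The conversion formulas above can be sketched in C++ as follows. This is only an illustration under stated assumptions: the type and function names (`RGB`, `YCbCr`, `rgbToYCbCr`, `yCbCrToRgb`, `clampToByte`) are hypothetical and not taken from any library, and rounding to the nearest integer followed by clamping to 0..255 is a choice made here that the formulas themselves do not prescribe.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper types for this sketch.
struct RGB   { std::uint8_t r, g, b; };
struct YCbCr { std::uint8_t y, cb, cr; };

// Round to the nearest integer and clamp to the 8-bit range [0, 255].
static std::uint8_t clampToByte(double v) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, std::round(v))));
}

// RGB -> YCbCr using the coefficients quoted in the text.
YCbCr rgbToYCbCr(RGB p) {
    double y  =  0.299  * p.r + 0.587  * p.g + 0.114  * p.b;
    double cb = -0.1687 * p.r - 0.3313 * p.g + 0.5    * p.b + 128.0;
    double cr =  0.5    * p.r - 0.4187 * p.g - 0.0813 * p.b + 128.0;
    return { clampToByte(y), clampToByte(cb), clampToByte(cr) };
}

// YCbCr -> RGB, the inverse direction used by a decoder.
RGB yCbCrToRgb(YCbCr c) {
    double r = c.y + 1.402   * (c.cr - 128.0);
    double g = c.y - 0.34414 * (c.cb - 128.0) - 0.71414 * (c.cr - 128.0);
    double b = c.y + 1.772   * (c.cb - 128.0);
    return { clampToByte(r), clampToByte(g), clampToByte(b) };
}
```

Because both directions round to 8 bits, a round trip is exact for grey values but can be off by a small amount for saturated colors.
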
  • Downsampling (chroma subsampling)

    The conversion above makes the next step possible: reducing the resolution of the U and V components, which is called "downsampling" or "chroma subsampling". In JPEG the downsampling ratio can be 4:4:4 (no downsampling), 4:2:2 (halved in the horizontal direction), or, most commonly, 4:2:0 (halved in both the horizontal and vertical directions). For the rest of the compression process, Y, U, and V are each processed separately in a very similar way.
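
As a rough illustration of downsampling in both directions, the sketch below halves one chroma plane horizontally and vertically. It assumes that each 2x2 group is replaced by its rounded average, which is one common choice and not something the text specifies, and that width and height are even. The names `subsample420` and `bytesFor420` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Halve one chroma plane in both directions by averaging each 2x2 group.
// Assumption: width and height are even; averaging is a choice made for this sketch.
std::vector<std::uint8_t> subsample420(const std::vector<std::uint8_t>& plane,
                                       std::size_t width, std::size_t height) {
    const std::size_t outWidth = width / 2;
    const std::size_t outHeight = height / 2;
    std::vector<std::uint8_t> out(outWidth * outHeight);
    for (std::size_t y = 0; y < outHeight; ++y) {
        for (std::size_t x = 0; x < outWidth; ++x) {
            const std::size_t i = (2 * y) * width + 2 * x;  // top-left sample of the group
            const unsigned sum = plane[i] + plane[i + 1] +
                                 plane[i + width] + plane[i + width + 1];
            out[y * outWidth + x] = static_cast<std::uint8_t>((sum + 2) / 4);  // rounded mean
        }
    }
    return out;
}

// Bytes needed for a full-resolution Y plane plus two half-resolution chroma planes.
std::size_t bytesFor420(std::size_t width, std::size_t height) {
    return width * height + 2 * (width / 2) * (height / 2);
}
```

For a 2x2 group this gives 4 + 2 = 6 bytes, which matches the 12 bits per pixel mentioned in the color model step.
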

  • Discrete cosine transform (DCT)

    Each component (Y, U, V) of the image forms its own plane. Each plane is divided into 8x8 blocks arranged like tiles, and each block is converted to the frequency domain with a two-dimensional discrete cosine transform (DCT). The transform exposes the regularities between neighboring pixels, which makes the data easier to compress. Because JPEG processes every 8x8 block as one unit, an image whose width or height is not a multiple of 8 must first be padded up to a multiple of 8. In addition, recall that Cb and Cr are recorded once per 2x2 pixels, so in most cases the image is padded to whole 16x16 blocks. Blocks are processed from left to right and from top to bottom, the same order in which we write. In JPEG, all three components (Y, Cb, Cr) are transformed by the DCT.
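
The padding rule can be expressed as rounding each dimension up to the next multiple of the block size. The helper below is a small sketch; the name `roundUpToMultiple` is hypothetical.

```cpp
#include <cstddef>

// Round value up to the next multiple of unit (unit must be greater than zero).
std::size_t roundUpToMultiple(std::size_t value, std::size_t unit) {
    return (value + unit - 1) / unit * unit;
}
```

For example, an image 100 pixels wide would be padded to 104 pixels when working in 8x8 blocks, or to 112 pixels when 2x2 chroma subsampling requires 16x16 blocks.
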

    JPEG encoding uses the forward DCT (FDCT), and decoding uses the inverse DCT (IDCT). This step is time-consuming. An optimized algorithm known as AA&N exists, and MMX-optimized AA&N IDCT code can be found on Intel's website.

    The large value in the upper-left corner of each transformed block is called the DC coefficient, and the other 63 values are called AC coefficients. In the later stages, the DC coefficients of all 8x8 blocks are coded differentially, and the AC coefficients are run-length encoded.
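
To make the transform concrete, here is a direct, unoptimized sketch of the 8x8 forward DCT using the textbook DCT-II definition with the usual JPEG scaling. It is not the AA&N algorithm and is far slower than production code. The names `Block` and `forwardDct8x8` are hypothetical, and the input is assumed to be already centered around zero (JPEG encoders subtract 128 from 8-bit samples before the transform).

```cpp
#include <array>
#include <cmath>

using Block = std::array<double, 64>;  // 8x8 block in row-major order

// Naive 2-D forward DCT:
// F(u,v) = 1/4 * C(u) * C(v) * sum over x,y of f(x,y) * cos((2x+1)u*pi/16) * cos((2y+1)v*pi/16)
// with C(0) = 1/sqrt(2) and C(k) = 1 otherwise.
Block forwardDct8x8(const Block& in) {
    const double pi = std::acos(-1.0);
    Block out{};
    for (int v = 0; v < 8; ++v) {
        for (int u = 0; u < 8; ++u) {
            double sum = 0.0;
            for (int y = 0; y < 8; ++y) {
                for (int x = 0; x < 8; ++x) {
                    sum += in[y * 8 + x] *
                           std::cos((2 * x + 1) * u * pi / 16.0) *
                           std::cos((2 * y + 1) * v * pi / 16.0);
                }
            }
            const double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            const double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            out[v * 8 + u] = 0.25 * cu * cv * sum;  // out[0] is the DC coefficient
        }
    }
    return out;
}
```

A completely flat block produces a single nonzero DC coefficient and AC coefficients that are zero up to floating-point error, which is why smooth image regions compress so well.
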

  • Re-arrange DCT results

    The DCT converts an 8x8 array into another 8x8 array. Data in memory, however, is stored linearly. If the 64 numbers were stored row by row, the end of each row would have no relationship to the start of the next, so JPEG stores the 64 numbers in the following order instead.

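    /* requant[i] is the position in the output sequence of the coefficient at row-major index i */
    UINT16 requant[DCTSIZE2] = {
         0,  1,  5,  6, 14, 15, 27, 28,
         2,  4,  7, 13, 16, 26, 29, 42,
         3,  8, 12, 17, 25, 30, 41, 43,
         9, 11, 18, 24, 31, 40, 44, 53,
        10, 19, 23, 32, 39, 45, 52, 54,
        20, 22, 33, 38, 46, 51, 55, 60,
        21, 34, 37, 47, 50, 56, 59, 61,
        35, 36, 48, 49, 57, 58, 62, 63
    };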

    In this way, values that are adjacent in the sequence are also adjacent in the block.
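
A sketch of how such an order table might be applied is shown below. The table values are the ones listed above; the names `kZigzagPos`, `toZigzag`, and `fromZigzag` are hypothetical. The encoder scatters each coefficient to its zig-zag position, and the decoder gathers it back.

```cpp
#include <array>
#include <cstdint>

// kZigzagPos[i] is the zig-zag position of the coefficient at row-major index i.
static const std::uint8_t kZigzagPos[64] = {
     0,  1,  5,  6, 14, 15, 27, 28,
     2,  4,  7, 13, 16, 26, 29, 42,
     3,  8, 12, 17, 25, 30, 41, 43,
     9, 11, 18, 24, 31, 40, 44, 53,
    10, 19, 23, 32, 39, 45, 52, 54,
    20, 22, 33, 38, 46, 51, 55, 60,
    21, 34, 37, 47, 50, 56, 59, 61,
    35, 36, 48, 49, 57, 58, 62, 63
};

// Encoder side: row-major block -> zig-zag sequence.
std::array<int, 64> toZigzag(const std::array<int, 64>& block) {
    std::array<int, 64> out{};
    for (int i = 0; i < 64; ++i) {
        out[kZigzagPos[i]] = block[i];
    }
    return out;
}

// Decoder side: zig-zag sequence -> row-major block.
std::array<int, 64> fromZigzag(const std::array<int, 64>& sequence) {
    std::array<int, 64> out{};
    for (int i = 0; i < 64; ++i) {
        out[i] = sequence[kZigzagPos[i]];
    }
    return out;
}
```

With a block whose entries equal their own row-major index, the sequence starts 0, 1, 8, 16, 9, 2, which traces the diagonal path from the upper-left corner.
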

  • Quantization

    The 64 spatial-frequency amplitudes obtained above are then quantized in steps: each value is divided by the corresponding entry of a quantization table and rounded to the nearest integer.

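    UINT16 quantval[DCTSIZE2] = {
        16, 11, 10, 16,  24,  40,  51,  61,
        12, 12, 14, 19,  26,  58,  60,  55,
        14, 13, 16, 24,  40,  57,  69,  56,
        14, 17, 22, 29,  51,  87,  80,  62,
        18, 22, 37, 56,  68, 109, 103,  77,
        24, 35, 55, 64,  81, 104, 113,  92,
        49, 64, 78, 87, 103, 121, 120, 101,
        72, 92, 95, 98, 112, 100, 103,  99
    };

    /* divide each coefficient by its table entry and round to the nearest integer;
       floor() (from <math.h>) keeps the rounding correct for negative coefficients */
    for (i = 0; i <= 63; i++)
        vector[i] = (int) floor((double) vector[i] / quantval[i] + 0.5);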

    This table is based on psychovisual thresholds and works well for 8-bit luminance and chrominance images, although any quantization table may be used. Quantization tables are stored after the DQT marker of a JPEG file; generally one is defined for Y and one for the chroma components. The quantization tables are the key to controlling the JPEG compression ratio. This step discards some high-frequency values and therefore loses detail, but the human eye is far less sensitive to high spatial frequencies than to low ones, so the visible loss is small. A second reason is that colors change gradually between neighboring pixels in most images, so most of the image information is contained in the low spatial frequencies. After quantization, long runs of consecutive zeros appear among the high-frequency coefficients.
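
Since this article is about decoding, it may help to see both directions side by side. The sketch below quantizes with the table shown above and then dequantizes as a decoder would, by multiplying back. The names `kLumaQuant`, `quantize`, and `dequantize` are hypothetical, and `std::lround` is used here so that negative coefficients are rounded symmetrically.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// The quantization table listed in the text, in row-major order.
static const std::array<int, 64> kLumaQuant = {{
    16, 11, 10, 16,  24,  40,  51,  61,
    12, 12, 14, 19,  26,  58,  60,  55,
    14, 13, 16, 24,  40,  57,  69,  56,
    14, 17, 22, 29,  51,  87,  80,  62,
    18, 22, 37, 56,  68, 109, 103,  77,
    24, 35, 55, 64,  81, 104, 113,  92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103,  99
}};

// Encoder side: divide by the table entry and round to the nearest integer.
std::array<int, 64> quantize(const std::array<double, 64>& coeffs,
                             const std::array<int, 64>& table) {
    std::array<int, 64> out{};
    for (std::size_t i = 0; i < 64; ++i) {
        out[i] = static_cast<int>(std::lround(coeffs[i] / table[i]));
    }
    return out;
}

// Decoder side: multiply back; the rounding error is the information that was lost.
std::array<int, 64> dequantize(const std::array<int, 64>& levels,
                               const std::array<int, 64>& table) {
    std::array<int, 64> out{};
    for (std::size_t i = 0; i < 64; ++i) {
        out[i] = levels[i] * table[i];
    }
    return out;
}
```

A DC value of 100 becomes level 6 and is restored as 96, while a high-frequency value of 40 in the last position is divided by 99 and disappears entirely.
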

  • Encoding (Coding)

    From color conversion up to this point, the image data has not actually been compressed any further; the DCT and quantization can be regarded as preparation for the encoding stage.

    Two coding mechanisms are used: run-length coding of the zero values, and entropy coding.

    • ① Zig-zag ordering
    • ② Run-length encoding (RLE) of the AC coefficients
    • ③ DPCM encoding of the DC coefficients
    • ④ Entropy coding

    Entropy coding is a coding method based on the statistical properties of the data. JPEG allows either Huffman coding or arithmetic coding; the baseline sequential mode uses Huffman coding.
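
Steps ② and ③ can be sketched as follows, before any Huffman coding is applied. The function names `dcDifferences` and `runLengthEncodeAc` are hypothetical. The sketch follows the baseline JPEG conventions that a zero run is capped at 15, that the pair (15, 0) stands for sixteen zeros, and that the pair (0, 0) marks the end of a block whose remaining coefficients are all zero.

```cpp
#include <utility>
#include <vector>

// DPCM: each DC coefficient is replaced by its difference from the previous block's DC.
std::vector<int> dcDifferences(const std::vector<int>& dc) {
    std::vector<int> diffs;
    int previous = 0;  // the predictor starts at zero
    for (int value : dc) {
        diffs.push_back(value - previous);
        previous = value;
    }
    return diffs;
}

// RLE: the AC coefficients of one block (zig-zag order) become (zero run, value) pairs.
std::vector<std::pair<int, int>> runLengthEncodeAc(const std::vector<int>& ac) {
    std::vector<std::pair<int, int>> out;
    int run = 0;
    for (int value : ac) {
        if (value == 0) {
            ++run;
            continue;
        }
        while (run > 15) {          // a run longer than 15 is split with (15, 0) pairs
            out.push_back({15, 0});
            run -= 16;
        }
        out.push_back({run, value});
        run = 0;
    }
    if (run > 0) {
        out.push_back({0, 0});      // end of block: only zeros remain
    }
    return out;
}
```

The DC values 12, 15, 11 of three consecutive blocks become the differences 12, 3, -4, and a block whose AC coefficients are 5, 0, 0, -3 followed by zeros becomes (0, 5), (2, -3), (0, 0).
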

    JPEG compression modes include the following:

    • Sequential encoding (DCT-based): the image is processed in a single pass, from left to right and from top to bottom;
    • Progressive encoding (DCT-based): when transmission takes a long time, the image is processed in several passes and shown first blurry and then progressively sharper (similar to an interlaced GIF loading over the network);
    • Lossless encoding (DPCM-based): the original sample values are restored exactly after decoding;
    • Hierarchical encoding: the image is compressed at several resolutions so that a high-resolution image can also be displayed on a lower-resolution device.

File structure

A JPEG file can be divided into two parts: markers and compressed data. In the JPEG file format, a 16-bit word is stored in Motorola (big-endian) byte order rather than Intel (little-endian) order: the high byte of the word comes first in the data stream and the low byte follows.
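
In code, reading such a word means combining the bytes by hand instead of casting a pointer, so that the result is the same on any host. A minimal sketch, with the hypothetical name `readBigEndian16`:

```cpp
#include <cstdint>

// Read a 16-bit word stored high byte first, independent of the host byte order.
std::uint16_t readBigEndian16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}
```

The two bytes 0xFF 0xD8 at the start of a file therefore read as 0xFFD8.
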

A marker code consists of two bytes. The first byte is the fixed value 0xFF, and the second byte varies with the marker's meaning. Any number of meaningless 0xFF fill bytes may precede a marker code, so several consecutive 0xFF bytes are treated as a single 0xFF that begins a marker. The data following a complete two-byte marker code records the information that belongs to that marker. Common markers include SOI, APP0, DQT, SOF0, DHT, DRI, SOS, and EOI.

The complete JPEG marker table contains many more marker codes, which can be looked up in the relevant references when needed. Only a few commonly used markers are introduced here.

The general sequence of markers in a JFIF-format JPEG file (*.jpg) is:

Marker  Code    Function
SOI     0xFFD8  Start of image
APP0    0xFFE0  JFIF application data block
APPn    0xFFEn  Other application data blocks (n = 1 to 15)
DQT     0xFFDB  Quantization table
SOF0    0xFFC0  Start of frame
DHT     0xFFC4  Huffman table
SOS     0xFFDA  Start of scan
(compressed image data)
EOI     0xFFD9  End of image
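
A rough sketch of how a reader might walk through these markers is given below. It relies on the fact that every segment in the table except SOI and EOI begins with a two-byte big-endian length that counts the length bytes themselves. The names `Segment` and `listSegments` are hypothetical, and the walk deliberately stops at SOS because the compressed image data that follows has to be handled by the entropy decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One marker found in the stream: second marker byte, offset of its 0xFF, segment length.
struct Segment {
    std::uint8_t marker;
    std::size_t offset;
    std::size_t length;  // 0 for markers without a payload (SOI, EOI)
};

// List the marker segments from the start of the buffer up to and including SOS.
std::vector<Segment> listSegments(const unsigned char* data, std::size_t size) {
    std::vector<Segment> segments;
    std::size_t pos = 0;
    while (pos + 1 < size && data[pos] == 0xFF) {
        while (pos + 1 < size && data[pos + 1] == 0xFF) {
            ++pos;  // skip 0xFF fill bytes in front of the marker
        }
        if (pos + 1 >= size) break;
        const std::uint8_t marker = data[pos + 1];
        const std::size_t markerOffset = pos;
        pos += 2;
        if (marker == 0xD8) {  // SOI carries no length field
            segments.push_back({marker, markerOffset, 0});
            continue;
        }
        if (marker == 0xD9) {  // EOI carries no length field and ends the file
            segments.push_back({marker, markerOffset, 0});
            break;
        }
        if (pos + 2 > size) break;
        const std::size_t length =
            (static_cast<std::size_t>(data[pos]) << 8) | data[pos + 1];
        if (length < 2 || pos + length > size) break;  // malformed or truncated segment
        segments.push_back({marker, markerOffset, length});
        pos += length;
        if (marker == 0xDA) break;  // SOS: compressed image data follows
    }
    return segments;
}
```

Running it over a real file should list the markers in roughly the order of the table, for example SOI, APP0, DQT, SOF0, DHT, SOS.
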

This explains the JPEG format check in cocos2d-x:

    bool Image::isJpg(const unsigned char * data, ssize_t dataLen)
    {
        if (dataLen <= 4)
        {
            return false;
        }

        // A JPEG file always begins with the SOI marker 0xFFD8.
        static const unsigned char JPG_SOI[] = {0xFF, 0xD8};

        return memcmp(data, JPG_SOI, 2) == 0;
    }

The JPEG marker codes are not described in further detail here; they will be analyzed in detail when we compress a JPEG image by hand.

    // cocos2d-x decoding with libjpeg
    bool Image::initWithJpgData(const unsigned char * data, ssize_t dataLen)
    {
    #if CC_USE_JPEG
        /* these are standard libjpeg structures for reading (decompression) */
        struct jpeg_decompress_struct cinfo;
        /* We use our private extension JPEG error handler.
         * Note that this struct must live as long as the main JPEG parameter
         * struct, to avoid dangling-pointer problems.
         */
        struct MyErrorMgr jerr;
        /* libjpeg data structure for storing one row, that is, scanline of an image */
        JSAMPROW row_pointer[1] = {0};
        unsigned long location = 0;

        bool ret = false;
        do
        {
            /* We set up the normal JPEG error routines, then override error_exit. */
            // set up error handling for cinfo
            cinfo.err = jpeg_std_error(&jerr.pub);
            jerr.pub.error_exit = myErrorExit;
            /* Establish the setjmp return context for MyErrorExit to use. */
            if (setjmp(jerr.setjmp_buffer))
            {
                /* If we get here, the JPEG code has signaled an error.
                 * We need to clean up the JPEG object, close the input file, and return.
                 */
                jpeg_destroy_decompress(&cinfo);
                break;
            }

            /* setup decompression process and source, then read JPEG header */
            // initialize the jpeg_decompress_struct, which libjpeg uses internally
            jpeg_create_decompress(&cinfo);

    #ifndef CC_TARGET_QT5
            jpeg_mem_src(&cinfo, const_cast<unsigned char*>(data), dataLen);
    #endif /* CC_TARGET_QT5 */

            /* reading the image header which contains image information */
    #if (JPEG_LIB_VERSION >= 90)
            // libjpeg 0.9 adds stricter types.
            jpeg_read_header(&cinfo, TRUE);
    #else
            jpeg_read_header(&cinfo, TRUE);
    #endif

            // we only support RGB or grayscale
            // set the output color format; cocos2d-x 3.2 currently supports RGB and I8
            if (cinfo.jpeg_color_space == JCS_GRAYSCALE)
            {
                _renderFormat = Texture2D::PixelFormat::I8;
            }
            else
            {
                cinfo.out_color_space = JCS_RGB;
                _renderFormat = Texture2D::PixelFormat::RGB888;
            }

            /* Start decompression jpeg here */
            // start decoding
            jpeg_start_decompress(&cinfo);

            /* init image info */
            // initialize the member variables
            _width = cinfo.output_width;
            _height = cinfo.output_height;
            _hasPremultipliedAlpha = false;

            // allocate memory for the decoded pixels
            _dataLen = cinfo.output_width * cinfo.output_height * cinfo.output_components;
            _data = static_cast<unsigned char*>(malloc(_dataLen * sizeof(unsigned char)));
            CC_BREAK_IF(! _data);

            /* now actually read the jpeg into the raw buffer */
            /* read one scan line at a time */
            // read the pixel data row by row into the allocated memory
            while (cinfo.output_scanline < cinfo.output_height)
            {
                row_pointer[0] = _data + location;
                location += cinfo.output_width * cinfo.output_components;
                jpeg_read_scanlines(&cinfo, row_pointer, 1);
            }

            /* When read image file with broken data, jpeg_finish_decompress() may cause error.
             * Besides, jpeg_destroy_decompress() shall deallocate and release all memory associated
             * with the decompression object.
             * So it doesn't need to call jpeg_finish_decompress().
             */
            //jpeg_finish_decompress(&cinfo);
            // release the memory
            jpeg_destroy_decompress(&cinfo);
            /* wrap up decompression, destroy objects, free pointers and close open files */
            ret = true;
        } while (0);

        return ret;
    #else
        return false;
    #endif // CC_USE_JPEG
    }

libjpeg-turbo

libjpeg-turbo is a JPEG codec, derived from libjpeg, that uses SIMD instructions (MMX, SSE2, NEON) to accelerate baseline JPEG compression and decompression on x86, x86-64, and ARM systems. On those systems, all else being equal, libjpeg-turbo is 2 to 4 times as fast as libjpeg. On other types of systems it can still outperform libjpeg thanks to its highly optimized Huffman coding routines. In many cases, libjpeg-turbo is a very useful JPEG codec.

For details, go to the libjpeg-turbo homepage.

You can compile the source code or download prebuilt libraries. Importing it into cocos2d-x is simple: add the two libraries libjpeg.a and libjpeg-turbo.a, and include the corresponding include folder.

BPG

BPG (Better Portable Graphics) was designed and proposed by the well-known French programmer Fabrice Bellard (author of FFmpeg and QEMU). Its advantage is a higher compression ratio: at the same image quality, a BPG file is only about half the size of a JPEG file. It also natively supports 8 to 14 bits per channel.

For details, go to the Bellard homepage.

I downloaded and compiled the source code. BPG performs very well in both file size and image quality, but the current version, 0.9.5, does not support the x86-64 architecture. I look forward to future updates from the author.

Extended Link

1. YUV color encoding

Y'UV was invented during the transition from black-and-white to color television [1]. Black-and-white video carries only the Y (luma, luminance) signal, that is, the grayscale value. When the color television standards were drawn up, color images were handled in YUV/YIQ form, with UV treated as C (chrominance, or chroma), the color information. If the C signal is ignored, the remaining Y (luma) signal is the same as the earlier black-and-white signal, which solved the compatibility problem between color and black-and-white television sets. The biggest advantage of Y'UV is that it takes up very little bandwidth.

Formats for recording color images include RGB, YUV, and CMYK. The earliest idea for color television was to transmit the three RGB primaries simultaneously, but that design needs three times the bandwidth of black-and-white television, which was not considered a good design at the time. RGB is oriented toward how the human eye senses color, while YUV emphasizes the eye's sensitivity to brightness: Y represents brightness and UV represents color, expressed as Cb and Cr respectively. YUV data is therefore usually described in the form Y:UV.

Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.
