How the JGP works

Source: Internet
Author: User

Original link: HTTPS://MEDIUM.FREECODECAMP.COM/HOW-JPG-WORKS-A4DBD2316F35#.IOG01X8KP

How JPG works

The JPG file format is one of the most impressive technological advancements for picture compression and was mounted on the stage in 1992. Since then, it has become the technical benchmark for image quality on the Internet. That's a good reason. However, many of the techniques behind JPG are extremely complex and require a complete understanding of how the human eye adjusts to the perception of color and boundaries.

Because I have something (you too, if you read this article), I want to explain how JPG coding works so that we can better understand how to survive smaller JPG files.

Points

The JPG compression mechanism is divided into several stages. These stages are expressed from the top and I will explain them one after the other.

Color space Conversion

One of the key principles of lossy data compression is that human sensors are not as accurate as computational systems. Scientifically speaking, the human eye only distinguishes about 10 million different colors of ability. However, there are many factors that affect the perceived color of the human eye, the perfect color illusion, or the dress that messes with the Internet. The point is that the human eye can be manipulated very well relative to the color to be recognized.

Quality is a form of lossy compression, whereas JPG uses another method: the color model . A color space represents a set of colors, and its color model represents how the mathematical formula indicates the color (RGB, four separations CMYK).

The great thing about this process is that you can convert from one color model to another , meaning that you can get a completely different set of values with a mathematical expression of a given color.

For example, the colors specified below represent the RGB and CMYK color models, and the human eye sees the same color, but can be marked by a different set of values.

JPG is converted from RGB to Y,CB,CR color model, and the model consists of luminance (Y), chroma Blue (Cb), chroma Red (Cr). The reason is that the visual test of the mind (how the brain handles the information seen by the eye) indicates that the human eye is more sensitive to brightness than chroma, which means that we can ignore the larger changes in chroma without affecting our image recognition. Therefore, before the human eye receives the information, we can actively change the CBCR channel information.

Down sampling

An interesting result of the YCbCr color space is that the resulting CB/CR channels have less granular detail; they contain less information than the Y channel.

As a result, the JPG algorithm adjusts the CB and CR channel information, compressing it to the original size? (Note that there are some details on how to do this, I don't introduce ...), which is called under sampling.

It is important to note that the next sample is lossy compression (which cannot restore the exact original color but is very close), but the overall effect on the visual component of the human visual cortex is minimal. Brightness (Y) is one of the interesting parts, because we only sample CBCR channels, the visual system has less impact.

Image is divided into 8x8 pixel blocks

From now on, all of the JPG operations are based on the 8x8 pixel block. This is done because we usually expect that there will be no changes on the 8x8 block, even in very complex images, and there are some self-similarity in the local area. We will take advantage of this self-similarity after compression processing.

It is worth noting that we are going to introduce one of the most common "artifacts" of JPG encoding. "Color permeation" is the color along the sharp edge that can "penetrate" to the other side. This is because the Chroma channel, which represents the color of the pixel, averages 4 pixels to 1 blocks in a single color, and some blocks span the sharp edges.

Discrete cosine transform

By now, things have been quite simple. Color space, down sampling, and chunking are all simple parts in the field of image compression. But now ... Now the real math is starting.

The key part of the DCT change is that it assumes that any digital signal can be reconstructed using a combination of cosine functions.

For example, we have:

You can see that it is actually cos (x) +cos (2x) +cos (4x) and

It may be better to show that the real picture is decoded, given some column cosine functions in 2D space. To prove this, I've shown one of the most amazing gifs of the Food Network: in 2D space, use the cosine function to encode n8x8 pixel blocks:

What you see here is a reconstruction of a picture (the leftmost panel). Each frame, we use a new Datum value (right panel) and multiply a weight value (right panel text) to create the contribution of the picture (the middle panel).

As you can see, we reconstructed the original image (quite perfect ...) by adding different cosine values with weights.

This is the underlying background for how discrete cosine changes work. The core is that any 8x8 block can be changed by a set of weighted cosine and represented at different frequencies. The trick of the whole thing is to figure out what cosine to use and how to weigh them.

The question of "using those cosine" is quite simple; after a large number of calculations, a set of cosine values is chosen to generate the closest value, which is the underlying function and is displayed in.

As for the question of "How to weigh up", simple (ha!) Apply this formula.

I'm not going to describe what these values mean, you can view them on a wiki.

This matrix, G, represents the base weight used to reconstruct the image (the small decimal number above the right side of the animation). Basically, each of the underlying cosine values, we multiply the weights in this matrix and add the entire value, and then we get the final image.

Here we are not dealing with the color space, but directly manipulating the G-matrix (datum weights), and all the compression is directly based on the matrix.

The problem here is that we are now converting byte-aligned integers to real numbers. This actually expands our information (from 1 bytes to a floating point number (4 bytes)). To solve this problem and start generating more significant compression, we enter the quantization phase.

Quantization

Therefore, we do not want to compress floating-point data. This will inflate our data stream and be inefficient. To do this, we need to find a way to convert the weight matrix into a value that ranges from [0,255]. To put it bluntly, we can do this, find the maximum/minimum value in the matrix (415.38, and 77.13, respectively), divide each value by the span to get the value of the [0,1] interval, and then multiply by 255 to get the final result.

Example: [34.12- -415.38]/[77.13?—?-415.38] *255= 232

This method works, but the cost is significant reduction in precision. This scaling will result in a non-uniform distribution of values, resulting in a significant visual loss of the image.

Instead, JPG takes a different route. Instead of using the range of values in the matrix as its scaling value, a preprocessing matrix of quantization factors is used. These QF do not need to be part of a stream, but rather as part of the encoder itself.

This example shows the common usage of the quantization factor matrix,

Now we use the Q and G matrices to calculate the quantization DCT Synergy Matrix:

For example, use the g[0,0]=?415.37 and q[0,0]=16 values:

Final result Matrix:

It's easier to look at the matrix-it now contains a large number of 0 or small integers, making it easier to compress.

Simply, we apply each and every process to the Y,CBCR channel, so we need two different matrices: one for Y and the other for the C channel:

Image quantization compression has two important ways: 1, limiting the effective range of weights, reducing the number of digits required to represent them. 2, many weights become the same number or 0, thus increasing the third part of the compression, entropy code.

This quantification is a major source of JPEG products. Because the lower-right corner of the picture tends to be the largest quantization factor, JPEG products tend to be reconstructed in the image. By modifying the "quality level" of the JPEG, or by raising or lowering it, you can directly control the matrix of the quantization factor (which we will explain in a minute).

Compression

Now, we go back to the world of integers, and we can further apply lossy compression to blocks of data. When you see our conversion data, you'll find something interesting:

When you move from top to bottom, the frequency of 0 values increases. This looks like the first suspicion is a run-length code. But here the main sequence and column main sequence is not ideal, because these 0 interleaving runs, rather than packing them together.

Instead, we'll traverse the matrix from the upper-left corner to the diagonal zigzag, moving back and forth until the bottom right corner.

The result of the luminance matrix, in the following order:

? 26,?3,0,?3,?2,?6,2,?4,1,?3,1,1,5,1,2,?1,1,?1,2,0,0,0,0,0,-1,- 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

Once the data becomes this format, the next steps are simple: perform RLE sequentially and then perform statistical coding on the results (Hoffman/Arithmetic/ans).

Boom! The data block is now JPG encoded.

Understanding Quality Parameters

Now that you understand how to actually create a JPG file, it's worth revisiting the concept of quality parameters that you'll encounter when exporting JPG images from Photoshop.

This parameter, called Q, is an integer from 1 to 100. It can be considered that Q is a measure of image quality: the higher the Q value, the higher the image quality and the larger the file size.

The mass value is used in the quantification phase to scale the appropriate quantization factor. So for each datum weight, the quantization phase is now similar to RouN< Span class= "Mi" id= "mathjax-span-35" style= "font-family:mathjax_math-italic;" >d \ ( GI,k/aLPha?QI,k\)

The Alpha symbol is the result of the mass parameter.

Alpha or Q[x,y] increases (remembering that the larger the alpha value corresponds to the smaller the mass parameter q), the more information is lost and the file size is reduced .

Therefore, if you want to get smaller files at a higher visual illusion cost, you can set a lower quality value in the export phase.

Notice how we look at the blocking phase and the clear traces of the quantification phase in the lowest quality picture.

Perhaps more importantly, the change in mass parameters depends on the image . Because each picture is unique and presents different types of visual products, the Q value is unique.

Conclusion

When you understand how the JPG algorithm works, a few things become apparent:

    1. Getting the correct quality values for each image is important for the tradeoff between visual quality and file size.
    2. Because the process is block-based, artifacts occur in block effects or "ringing effects"
    3. Because the processed blocks do not mix with each other, JPG usually ignores the chance of compressing the large acacia wormwood. Solving this problem is WEBP format is good at.

If you want to play it all by yourself, all this can be wildly integrated into a file of about 1000 lines.

Hey!

Want to know how your JPG file is smaller?
Want to know how PNG files work, or make them smaller?
Want to have more data compression? Buy my book!

How the JGP works

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.