The origins of RGB/YUV and their mutual conversion

Last Update:2018-12-06 Source: Internet

Author: User

Tags color representation

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

In video and other related applications, YUV is a frequently-seen format. This article describes the reason, relationship, and conversion method of YUV and RGB formats in the form of graphic data. It also introduces the YUV converted to RGB program in C language.
The color perception of human eyes has special characteristics. As early as the beginning of the last century, young (1809) and Helmholtz (1824) proposed the three primary colors of vision, namely: there are three kinds of cone cells in the retina, which are sensitive to red, green, and blue light. When a certain wavelength of light is applied to the retina, the three cone cells are excited to different degrees by a certain proportion. Such information is transmitted to the central node, which produces a certain color.
Since 1970s, due to advances in experimental technology, the hypothesis about three cone cells in the retina that are particularly sensitive to light at different wavelengths has been proven by many outstanding experiments. For example, ① someone uses a small monochrome beam that does not exceed the diameter of a single cone to inspect and draw it in the body one by one (the first experiment was performed in animals such as goldfish and togon, and later in humans) the spectral absorption curves of cone cells show that there are no more than three types of curves drawn, which represent three types of cone cells with different spectral absorption characteristics. The first type of absorption peak is at nm, one class is at 534nm, And the other class is at 564nm, which is almost equivalent to the wavelength of blue, green, and red three colors. It is consistent with the assumptions of the above three primary colors of vision. ② The method of recording the sensor potential of a single cone cell with micro-electrode also obtained similar results, that is to say, the size of the polarization type receptor potentials of different cone cells caused by different monochrome light is also different. The peak value is in line with the three primary colors.
As a result, when the color display has not been invented, humans have learned to use the three primary colors of light to allocate all colors of light. It doesn't mean that after the three primary colors are mixed, a new frequency of light is generated, but it gives the human eyes the feeling.
After the invention of the monitor, from the black and white display to the color display, people began to use the phosphor (CRT, plasma display) that emit different colors of light, or different colors of the color filter (LCD ), or a semiconductor light-emitting Device of different colors (large full-color display card of OLED and led) to form a color. Without exception, red, green, the three colors of blue are used as the basic luminous units. By controlling their luminous intensity, they combine most of the natural colors that human eyes can feel.
When a computer displays a color image, the color of the pixel must be determined by controlling the values of red, green, and blue in a pixel. The computer cannot simulate continuous storage of the smallest to brightest values, but can only be represented in numbers. Therefore, combined with the sensitivity of human eyes, the red, green, and blue light intensity values in a pixel are expressed in three bytes (3*8 bits, this is a common RGB format. In the custom color tool box, enter R, G, and B values to obtain different colors.
However, for video capture, codec, and other applications, this representation means a large amount of data. You need to find a way to change the representation of the original data without affecting your feelings, so as to reduce the data volume.
No matter what the intermediate processing process is, it is ultimately to display the changes to people. This change is also based on the characteristics of human eyes, and is the same as the starting point of the RGB three-color representation method.
So we use the Y, CB, Cr model to represent the color. The Iain book says: the human
Visual System (HVS) is less sensitive to color than to luminance
(Brightness). Human Visual systems (actually human eyes) are more sensitive to brightness than colors.
In the RGB color space, the importance of the three colors is the same, so the same resolution needs to be used for storage. A maximum of rgb565 can be used to reduce the quantization accuracy, however, the three colors need to be stored at the same resolution, and the data volume is large. Therefore, the human eye is more sensitive to brightness than color. The brightness information and color information of the image are separated and stored with different resolutions, in this way, image data can be stored more effectively with little influence on subjective perception.
YCbCr Color Space and Its deformation (sometimes referred to as YUV) are the most common effective methods to represent color images. Y is the luminance/Luma component of the image. The following formula is used to calculate the weighted average values of the R, G, and B components:
Y = KR r + kgg + kbb
K is the weight factor.
The formula above calculates the brightness and color information.
Difference/chrominance or chroma). The difference between each chromatic aberration component is R, G, B, and brightness Y:
CB = B-y
Cr = r-y
CG = G-y
Among them, Cb + Cr + CG is a constant (in fact, it is an expression about y). Therefore, you only need to combine the two values with the Y value to calculate the original RGB value. Therefore, we only save the difference between brightness and the colors of blue and red, which is (Y, CB, Cr ).
Compared with RGB color space, YCbCr color space has a significant advantage. Y storage can use the same resolution as the original image, but CB and Cr storage can use lower resolution. This reduces the amount of data and does not significantly reduce the image quality. Therefore, it is a simple and effective method to save the color information at a resolution lower than the measurement information.
In colour spaces. 17 ITU-R recommendation bt.601
When calculating y, we recommend that you set the weight to Kr = 0.299, KG = 0.587, kb = 0.114. The commonly used conversion formula is as follows:
Y = 0.299r + 0.587G + 0.114b
CB = 0.564 (B-y)
Cr = 0.713 (r-y)

R = Y + 1.402cr
G = Y-0.344cb-0.714cr
B = Y + 1.772cb
With this formula, we can convert an RGB image into a YUV image, which in turn can be used. The following describes how the image data is stored.
In the rgb24 format, for a screen with W width and H height, three bytes of W * H * are required to store the RGB information of each pixel, the pixel data of the image is arranged consecutively. According to R (), g (), B (); R (), g (), B );...; R (W-1, 0), g (W-1, 0), B (W-1, 0 );...; R (W-1, h-1), g (W-1, h-1), B (W-1, h-1) such a sequence of storage up.
In YUV format, yuv420 format is used as an example. If the width is W and the height is H, the brightness y data must be expressed in w x H bytes (one brightness for each pixel ). The CB and Cr data share a CB and Cr value in four pixels. In this way, CB uses w * H/4 bytes, AND Cr uses w * H/4 bytes.
Multiple frames are stored continuously in the YUV file. YUV
YUV ..... This continuous form, and each YUV is a picture.
In this single YUV, the first w x H bytes are y data, and the next w x H/4 bytes are CB data, the next 4 bytes (w * H/h) are Cr data.
When the RGB data is restored from the data that reduces the resolution, the corresponding values of Y, CB, and CR should be found based on the position of the pixel. The Y value is best found, if the pixel position is X and Y, the y * width + x value in Y data is its Y value. CB and Cr are one image block every 2x2 pixels, so CB and Cr data are equivalent to W/2 Resolution.
*
H/2, then the original position in the screen is X, Y pixels, in such a low-resolution screen position is X/2, Y/2, belongs to its CB, the CR value is in this place: (y/2) * (width/2) + (X/2 ).
For the sake of intuition, in the following figure, the y screen (CB, Cr = 0) and CB and Cr screen (y = 128) are displayed respectively, the CR image resolution is 1/4 of the Y image. However, after a picture is synthesized, our eyes will not feel that 4 pixels share a CB, Cr.

Y screen CB, Cr Screen
This is a program to verify the formula. It can be compiled under vs2005, in Windows Mobile 2003 and mobile
5.0.
This is because the common video codec, MPEG series and H.26x series are all processed using YUV as raw data. Therefore, it is very important to understand this basic knowledge.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More