YUV is the pixel format represented separately by the brightness parameter and the color parameter. The advantage of this separation is that it can not only avoid mutual interference, but also reduce the color sampling rate without significantly affecting the image quality. YUV is a general saying that its specific arrangement methods can be divided into many specific formats.
YUV format resolution 1 (player -- project2) enables video players in yuv420 format to open *. MP4; *. 264 files based on the card API design to play back the files. The video format is yuv420. The YUV format generally has two categories: packed format and planar format. The former stores the YUV component in the same array, usually several adjacent pixel groups.
A macro pixel (macro-pixel), which uses three arrays to separate and store three YUV components, just like a three-dimensional plane. Both yuy2 and y211 in table 2.3 are packaged.
Format, while if09 to yvu9 are flat format. (Note: When introducing various formats, each YUV component carries a subscript, for example, y0, U0, and V0 indicate the YUV score of the first pixel.
Amount. Y1, u1, and V1 indicate the YUV component of the second pixel, and so on .) Mediasubtype_yuy2
Yuy2
Format: mediasubtype_yuyv
Yuyv
Format (the actual format is the same as yuy2) mediasubtype_yvyu
Yvyu
Format: Package mediasubtype_uyvy
Uyvy
Format: Package mediasubtype_ayuv with alpha channel.
YUV
Mediasubtype_y41p
Y41p
Format: Package mediasubtype_y411
Y411
Format (the actual format is the same as y41p) mediasubtype_y211
Y211
Mediasubtype_if09 if09 mediasubtype_iyuv iyuv mediasubtype_yv12 yv12 format mediasubtype_yvu9 yvu9 format table 2.3yuv sampling
One of the advantages of YUV is that the sampling rate of the color channel is comparable to that of Y
Low channels without significantly reducing visual quality
. There is a notation used to describe the sampling frequency ratio of U and V to Y. This notation is called A: B: C Notation:
? |
Indicates that the color channel does not have subsampling. |
? |
: 2 horizontal subsample with no vertical subsample. For every two U samples or V samples, each scan row contains four y samples. |
? |
: 0 indicates a horizontal subsample of and a vertical subsample. |
? |
Indicates a horizontal subsample with no vertical subsample. For each U sample or V sample, each scan row contains four y samples. Compared with other formats Sampling is not commonly used. This article will not discuss it in detail. |
Figure 1 shows the sample grid used in the figure. The lighting sample is represented by a cross, and the color sample is represented by a circle.
The main form of sampling is defined in the ITU-R recommendation bt.601. Figure 2 shows the sample grid defined in this standard.
Sampling has two common variations. One form for MPEG-2 video, the other for MPEG-1 and ITU-T recommendations
H.261 and H.263. Figure 3 shows the sample Mesh Used in the MPEG-1 scheme, and figure 4 shows the sample Mesh Used in the MPEG-2 scheme.
Compared to the MPEG-1 scheme, it is easier to convert between the MPEG-2 scheme and the sample mesh defined for the and formats. Therefore, in Windows
The preferred MPEG-2 scheme in should be considered as the default conversion scheme in the format.
Surface Definition
This section describes the 8-bit YUV format recommended for video rendering. These formats can be divided into several categories:
? |
Format, 32 bits per pixel |
? |
, 16 bits per pixel |
? |
, 16 bits per pixel |
? |
, 12 bits per pixel |
First, you should understand the following concepts to understand the following content:
? |
Surface Origin . For the YUV format described in this article, the origin (0, 0) is always in the upper left corner of the surface. |
? |
Span . The cross-distance of the surface, also known as spacing, refers to the width of the surface, expressed in bytes. For a surface with the origin in the upper left corner, the cross distance is always positive. |
? |
Alignment . The surface alignment depends on the driver of the graphic display. The surface should always be aligned with DWORD, that is to say, all rows on the surface must be from The 32-bit (DWORD) boundary. Alignment can be greater than 32 bits, but it depends on the hardware requirements. |
? |
Packaging format and plane format . The YUV format can be divided into the packaging format and the plane format. In the packaging format, Y, U, and V The component is stored in an array. Pixels are organized into several giant pixel groups. The layout of the giant pixel group depends on the format. In the flat format, Y, U, and V The component is stored as three separate planes. |
Format, 32 bits per pixel
We recommend a 4: 4 format with the fourcc code ayuv. This is a packaging format. Each pixel is encoded into four consecutive bytes, and its organizational order is as follows.
The Byte marked with a contains the Alpha value.
, 16 bits per pixel
Two formats are supported. The fourcc code is as follows:
Both are packaged, and each giant pixel is encoded into two pixels of four consecutive bytes. This will multiply the color horizontal sampling by the coefficient 2.
Yuy2
In yuy2 format, data can be considered as a data without positive or negative signs.Char
Value array, where the first byte contains the first y
Example: The second byte contains the first U (CB) sample, the third byte contains the second y sample, and the fourth byte contains the first V (CR) sample, 6.
If the image is viewed as composed of two little-EndianWord
Value array, the first
Word
Include y0 in the lowest valid bits (LSB) and U in the highest valid bits (MSB. Second
Word
Include Y1 in LSB and V in MSB.
Is yuy2 used for Microsoft DirectX? Video acceleration (DirectX VA) preferred
Pixel format. It is expected to become an interim requirement for DirectX va accelerators that support videos.
Uyvy
The format is the same as yuy2, but the byte order is the opposite-that is, the color bytes and the light byte are flipped (figure 7 ). If the image is viewed as composed of two little-Endian
Word
Value array, the firstWord
Include U in LSB and include U in MSB
Y0, secondWord
Include V in LSB and Y1 in MSB.
, 16 bits per pixel
We recommend two 16-bit formats at each pixel. The fourcc code is as follows:
Both fourcc codes are in flat format. In both the horizontal and vertical directions, the color channel uses the coefficient 2 for re-sampling.
Imc1
All y samples are usedChar
The array composed of values is first displayed in the memory. Followed by all V (CR) examples, and then all
U (CB) example. The V and U planes have the same distance as the y Planes, thus generating unused areas of memory as shown in figure 8.
Imc3
This format is the same as imc1, but the u and v planes are exchanged:
, 12 bits per pixel
We recommend four bits per pixel 12 bits. The fourcc code is as follows:
? |
Imc2 |
? |
Imc4 |
? |
Yv12 |
? |
Nv12 |
In all these formats, the color channels must be re-sampled with a coefficient of 2 in both the horizontal and vertical directions.
Imc2
The format is the same as that of imc1, but the V (CR) and U (CB) lines are staggered at the semi-cross border. In other words, each complete cross-distance row in the color area uses a line V
The following is an example of a row starting at the next semi-span boundary (fig. 10 ). This layout works with imc1
In comparison, it can make more efficient use of the address space. Its color address space is reduced by half, so the overall address space is reduced by 25%. In each format, imc2 is the second preferred format.
After nv12.
Imc4
This format is the same as imc2, but u (CB) and V (CR) rows are exchanged:
Yv12
All y samples are usedChar
The array composed of values is first displayed in the memory. This array is followed by all V (CR) samples. V
The span of the plane is half of the span of the Y plane, and the behavior of the V plane is half of the row of the Y plane. The V plane is followed by all the U (CB) examples, and its cross distance and number of rows are the same as the V plane (figure
12 ).
Nv12
All y examples are composedChar
The array composed of values is first displayed in the memory, and the number of rows is an even number. Y
The plane is followed by a forward or backward sign.Char
Value array, which contains the packaged U (CB) and V (CR) samples, 13
. When the combined U-V array is consideredWord
When the value is an array, the LSB contains the U value, MSB
Contains the V value. Nv12 is the preferred 4: 2: 0 pixel format for DirectX va. It is expected to become DirectX va that supports videos.
Medium-term requirements for accelerators.
YUV format Resolution 2 confirm the h264 video format-h264
Supported
4
:
2
:
0
The encoding and decoding of consecutive or interlaced videos.
YUV (also known as ycrcb) is a color encoding method used by European television systems (PAL ). YUV is mainly used to optimize color video signal transmission, making it backward compatible with old-fashioned
Black/white TV. Compared with RGB video signal transmission, RGB requires three independent video signals to be transmitted at the same time ). "Y" indicates the brightness.
(Luminance or Luma), that is, the gray scale value; while "u" and "V" represent the color (chrominance or chroma), which is used to describe the image color and saturation.
Degree, used to specify the color of the pixel. The "brightness" is created through the RGB input signal by overlapping specific parts of the RGB signal. "Color" defines two aspects of color-tone and saturation
Degrees, expressed by Cr and CB respectively. Cr reflects the difference between the red part of the GB input signal and the brightness value of the RGB signal. CB indicates that the blue color of the RGB input signal is highlighted by the RGB signal.
The same degree value. Add the concept of field-the concept of field does not start with DV, and the TV system already exists (of course, we all know the relationship between DV and TV). In the end, it is still a scanning problem, specific to the PAL system:
25 frames per second, two scans per frame, one Scan Line (including the television's electron beam) first from top to bottom, and then the second scan from top to bottom
I understand the concept of the field mainly to make the picture more smooth and smooth at a limited bandwidth and cost.
The scanning lines of the two fields are separated from each other. For example, for a frame, the top line number is 0, followed by 1, followed by 2, 3, 4, and 5, 6 .... So the first field may be, or 6; maybe, -- this is the line scan.
In the row-by-row scan mode, scanning lines are scanned sequentially in the order of 0, 1, 2, 3, and 4. Obviously, there is no concept of field at this time. YUV and YCbCr
The YUV color model comes from the RGB model,
This model separates brightness and color, which is suitable for image processing.
Application: simulation field
Y' = 0.299 * R' + 0.587 * G' +
0.114 * B'
U' =-0.147 * R'-0.289 * G' + 0.436 * B '= 0.492 * (B '-
Y ')
V '= 0.615 * R'-0.515 * G'-0.100 * B' = 0.877 * (R '-
Y ')
R' = y' + 1.140 * V'
G' = y'-0.394 * U '-
0.581 * V'
B '= y' + 2.032 * U'
The YCbCr model comes from the YUV model. YCbCr is YUV
The offset version of the color space.
Application: digital video, ITU-R bt.601 recommendations
Y' = 0.257 * R' + 0.504 * G' + 0.098 * B '+
16
CB '=-0.148 * R'-0.291 * G' + 0.439 * B' +
128
Cr '= 0.439 * R'-0.368 * G'-0.071 * B' +
128
R' = 1.164 * (y'-16) +
1.596 * (CR '-128)
G' = 1.164*(y'-16)-0.813 * (CR '-128 )-
0.392 * (CB '-128)
B '= 1.164 * (y'-16) +
2.017 * (CB '-128)
PS:
All the above Symbols carry an apostrophes, indicating that the symbol performs Gamma Correction Based on the original value. Gamma Correction helps to compensate for the detailed loss caused by linear allocation of gamma values in the anti-aliasing process, enrich image details. Without Gamma Correction, the details of the dark area are not easily displayed. After this image enhancement technique is used, the layers of the images are clearer.
Therefore, YUV in h264 should belong to YCbCr.
Next, let's talk about the YUV format,
The YUV format generally has two categories: packed format and planar format. The former stores the YUV component in the same array, usually several adjacent pixels constitute a macro pixel (macro-pixel), while the latter uses three arrays to separate the three components of YUV, it is like a three-dimensional plane.
We often say that yuv420 belongs to the planar format YUV, The Color Ratio
As follows:
Y0u0v0 Y1
Y2u2v2 Y3
Y4 Y5
Y6 y7
Y8u8v8 y9
Y10u10v10 y11
Y12 Y13
Y14 Y15
For other formats of YUV, you can click here to view the details. In the YUV file, how is yuv420 stored?
In the YUV sequence of common h264 tests, for example, the YUV sequence of the CIF image size (352*288), there is no file header at the beginning of the file, directly YUV data, first save the y information of the first frame. The length is 352*288 bytes,
Then, the length of the first frame U information is 352*288/4 bytes, And the last frame V information. The length is 352*288/4 bytes,
Therefore, the total length of the first frame is 352*288*1.5, that is, 152064 bytes. If the sequence is 300 frames,
The total sequence length is 152064*300 = 44550kb, which is why the common 300-frame CIF sequence is always 44m.
Sampling means that the three elements y, CB, and CR have the same resolution. In this way, the three elements are sampled on each pixel. Number 4 refers to a variety
The sampling rate of each element. For example, there are four CB Cr sampling values for each of the four brightness sampling points. All the information values are fully retained at. during sampling (sometimes recorded as yuy2), the color
The level element has the same resolution as the brightness value in the vertical direction, while the horizontal side is half of the brightness resolution (represents two CB and Cr samples for each four brightness values .) video used to construct high-end products
Quality video color signal.
In the popular sampling formatYv12
)
CB
And CR have half of the Y Resolution in the horizontal and vertical directions. is somewhat different, because it does not mean to use in actual sampling, but to define this encoding method in the coding history.
The sampling method at and is widely used in consumer applications, such as video conferencing, digital TV, and DVD storage. Because each color difference element contains four points
The amount of Y sampling elements in one, so the 4: 2: 0ycbcr video needs exactly 4:
4: 4 or half of the RGB video sampling volume.
Sampling is sometimes described as a "12-bit per pixel
. The reason can be seen from the sampling of Four pixels.
Sampling at is performed for 12 times. For each y, CB, and Cr, 12*8 = 96 bits are required, with an average of 96/4 = 24 bits. 6*8 if is used
= 48 bits, 48/4 = 12 bits per pixel on average.
In a video sequence scanned at, the samples corresponding to the Y, CB, AND Cr of a complete video frame are allocated to two fields.
. We can see that the total number of samples for the interlace scan is the same as the number of samples used in the progressive scan.
Comparison:
The y41p (and y411) (packed format) format retains the Y component for each pixel, while the UV component samples every 4 pixels in the horizontal direction. A macro pixel is 12 bytes, which actually represents 8 pixels. The YUV component order in the image data is as follows:
U0 y0 V0 Y1 U4 Y2 V4 Y3 Y4 Y5 y6 Y8...
Iyuv format (planar) extracts Y components for each pixel. When UV components are extracted, the image is first divided into several 2 x
And then each Macro Block extracts one u component and one V component. Yv12
The format is similar to that of iyuv, but it is still flat mode.
The yuv411 and yuv420 formats are mostly used in DV data. The former is used in NTSC, and the latter is used in DV data.
Pals. Yuv411 extracts the Y component for each pixel, while the UV component samples every four pixels horizontally. Yuv420 is not a V component sampling of 0, but compared with yuv411,
Increase the color difference sampling frequency by one time in the horizontal direction, and reduce the color difference by half at the U/V interval in the vertical direction, as shown in.
(It seems that the image cannot be displayed)
Specific requirements for the use of BITs in various formats (sampling with is used, and 8 bits are used for each element ):
Format: Sub-qcif brightness resolution: 128*96 bits per frame: 147456
Format: qcif brightness resolution: 176*144
Bits used for each frame: 304128
Format: CIF brightness resolution: 352*288 bits per frame: 1216512
Format: 4cif
Brightness resolution: 704*576 bits per frame: 4866048