H.264 Related Knowledge

Source: Internet
Author: User

1. Basic Concepts

I frame: Intra-coded frame, also known as an intra picture. An I frame is usually the first frame of each GOP (group of pictures, a structure used in MPEG video compression). It is moderately compressed, serves as a reference point for random access, and can be decoded as a standalone image. An I frame can be regarded as a compressed still image.

P frame: Forward predictive coded frame, also known as a predictive frame. It reduces the amount of transmitted data by exploiting the temporal redundancy with respect to previously encoded frames in the image sequence.

B frame: Bidirectionally predicted, interpolated coded frame, also known as a bi-directional interpolated prediction frame. It compresses the transmitted data by exploiting temporal redundancy with respect to both the encoded frames before it and the encoded frames after it in the source image sequence.

PTS: Presentation Time Stamp. The PTS indicates when a decoded video frame should be displayed.

DTS: Decode Time Stamp. The DTS indicates when the bitstream data that has been read into memory should be fed into the decoder. When there are no B frames, the DTS order and the PTS order are the same.

VCL, NAL, NALU: In the H.264/AVC video coding standard, the overall system framework is divided into two layers: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). The former is responsible for efficiently representing the content of the video data, while the latter formats the data and provides header information so that the data is suitable for transmission over various channels and for storage on various media. The encoded data is therefore carried in NAL units (NALUs); one frame may be made up of one or more NALUs.

SPS, PPS, IDR: In an actual H.264 byte stream, each NALU is usually preceded by a 00 00 00 01 or 00 00 01 start code. The NALU type information in the byte after the start code identifies the unit as an SPS (Sequence Parameter Set), a PPS (Picture Parameter Set), an IDR (Instantaneous Decoding Refresh) slice, or another specific type.
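The start-code splitting described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the whole stream is already in memory; the function name split_nal_units is hypothetical, and the sketch does not remove emulation prevention bytes (00 00 03) inside a unit.

```python
from typing import List


def split_nal_units(stream: bytes) -> List[bytes]:
    """Split a byte stream into NAL units on 00 00 01 / 00 00 00 01 start codes."""
    start_code = b"\x00\x00\x01"
    units = []
    pos = stream.find(start_code)
    while pos != -1:
        payload_start = pos + len(start_code)
        nxt = stream.find(start_code, payload_start)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes right before the next start code belong to a 4-byte
        # start code (or are padding), not to the current unit.
        unit = stream[payload_start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


# Hypothetical sample: three tiny units with header bytes 0x67, 0x68, 0x65.
sample = bytes.fromhex("00000001 6742 00000001 68ce 000001 6588")
for unit in split_nal_units(sample):
    print(hex(unit[0]), len(unit))
```

The first byte of each returned unit is the header byte that carries the NALU type.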

Differences between I, P, and B frames:

I frame: Can be decompressed by the video decoding algorithm into a single complete picture on its own.

P frame: Needs to refer to the I frame or P frame before it to generate a complete picture.

B frame: Needs to refer to the I or P frame before it and a P frame after it to generate a complete picture.

The frames between two I frames form a GOP. In x264, the number of B frames (the "bf" size) can be set by a parameter, i.e. the number of B frames between an I frame and a P frame or between two P frames.

It follows from the above that if B frames exist, the last frame of a GOP must be a P frame.

Differences between DTS and PTS:

DTS is used in the decoding phase to determine when a frame is decoded. PTS is used in the display phase for video synchronization and output. When there are no B frames, the DTS order and the PTS order are the same.
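To see why B frames make the decoding order differ from the display order, here is a small Python sketch. It rests on a simplifying assumption that matches the description in this article: each B frame can only be decoded after the next I or P frame in display order has been decoded. The function name decode_order and the frame strings are hypothetical.

```python
from typing import List


def decode_order(display_types: str) -> List[int]:
    """Return display indices in the order they would be decoded.

    display_types is a string such as "IBBP", one letter per frame,
    listed in display (PTS) order.
    """
    order = []
    pending_b = []  # B frames waiting for their following reference frame
    for idx, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(idx)
        else:
            # An I or P frame is decoded first, then the B frames before it.
            order.append(idx)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B frames without a later reference
    return order


print(decode_order("IBBPBBP"))  # decode (DTS) order differs from display order
print(decode_order("IPPP"))     # without B frames the two orders match
```

With "IBBPBBP" the P frame at display index 3 is decoded before the two B frames that are shown ahead of it, which is exactly the case where DTS and PTS diverge.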

Differences between IDR frames and I frames:

Both I frames and IDR frames use intra-frame prediction. They are essentially the same kind of picture; to make the encoding and decoding process easier to control, the first I frame of a sequence is distinguished from the other I frames and is called an IDR frame. The role of an IDR frame is to refresh the decoder immediately so that errors do not propagate: a new sequence starts at the IDR frame and coding begins afresh. An ordinary I frame does not provide random access; that capability is provided by the IDR frame. An IDR frame causes the DPB (decoded picture buffer, which holds the reference frames; this is the key point) to be emptied, whereas an ordinary I frame does not. An IDR picture must be an I picture, but an I picture is not necessarily an IDR picture. A sequence can contain many I pictures, and pictures that follow an I picture may still use pictures located before that I picture as motion references.

For an IDR frame, no frame after it may reference the content of any frame before it. In contrast, for an ordinary I frame, the B and P frames after it may reference frames that precede that I frame. When randomly accessing a video stream, the player can always start playback from an IDR frame, since no frame after it references a frame before it. However, in a video without IDR frames you cannot start from an arbitrary point, because the frames that follow may refer to earlier frames.
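The random-access rule can be illustrated with a short Python sketch: to play from a given frame, a player steps back to the nearest IDR frame at or before it. The function name seek_start and the frame labels ("IDR", "I", "P", "B") are hypothetical and used only for illustration; the list is assumed to be in decoding order.

```python
from typing import List, Optional


def seek_start(frame_types: List[str], target: int) -> Optional[int]:
    """Return the index of the last IDR frame at or before target, or None."""
    for idx in range(target, -1, -1):
        if frame_types[idx] == "IDR":
            return idx
    return None  # no IDR frame available: no safe starting point


frames = ["IDR", "P", "B", "I", "P", "IDR", "P"]
print(seek_start(frames, 4))  # the ordinary I frame at index 3 is skipped
print(seek_start(frames, 6))
```

The ordinary I frame at index 3 is not treated as a safe starting point, because frames after it may still reference frames before it.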

Example:

Here is an example of a GOP of 15 frames, showing each frame's reference frames and the decoding order (figure not included):

As the example shows: decoding an I frame does not depend on any other frame. Decoding a P frame depends on the I or P frame before it. Decoding a B frame depends on the nearest I or P frame before it and the nearest P frame after it.

How do you determine the frame type (reference frame, I frame, P frame, etc.)?

The NALU type is a reliable way to determine the frame type. Its values are defined in the NAL unit type table of the official H.264 specification (table not included here).

Analyzing the bitstream data layer by layer: the byte that follows 00 00 00 01 is the NALU header. Converted to binary and read from left to right, it is interpreted as follows:
(1) Bit 1 is the forbidden bit (forbidden_zero_bit); a value of 1 indicates a syntax error
(2) Bits 2 to 3 are the reference level (nal_ref_idc)
(3) Bits 4 to 8 are the NAL unit type (nal_unit_type)
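The three fields listed above can be extracted with shifts and masks. The following Python sketch assumes the header is the single byte right after the start code; the function name parse_nal_header is hypothetical.

```python
from typing import Tuple


def parse_nal_header(header_byte: int) -> Tuple[int, int, int]:
    """Split a NALU header byte into (forbidden bit, reference level, type)."""
    forbidden_bit = (header_byte >> 7) & 0x01  # bit 1, leftmost
    ref_level = (header_byte >> 5) & 0x03      # bits 2 to 3
    nal_type = header_byte & 0x1F              # bits 4 to 8
    return forbidden_bit, ref_level, nal_type


# 0x41 is 0100 0001 in binary.
print(parse_nal_header(0x41))  # → (0, 2, 1)
```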

For example, suppose the bytes 0x67, 0x68, and 0x65 appear after 00 00 00 01.

The binary form of 0x67 is:
0110 0111
Bits 4 to 8 are 00111, decimal 7. In the NAL unit type table, 7 corresponds to the sequence parameter set (SPS).

The binary form of 0x68 is:
0110 1000
Bits 4 to 8 are 01000, decimal 8. In the NAL unit type table, 8 corresponds to the picture parameter set (PPS).

The binary form of 0x65 is:
0110 0101
Bits 4 to 8 are 00101, decimal 5. In the NAL unit type table, 5 corresponds to a slice of an IDR picture (I frame).

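So the check for whether a NALU belongs to an IDR picture (I frame) is: (NALU header byte & 0001 1111) == 5, i.e. (NALU header byte & 31) == 5. Note that this detects IDR slices only; slices of ordinary (non-IDR) I frames use NAL unit type 1, the same as P and B slices.

For example, 0x65 & 31 = 5.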
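Applied to a list of header bytes, the check might look like the Python sketch below. The names NAL_TYPE_NAMES, is_idr, and describe are hypothetical, and the lookup table only contains the three types discussed in this article.

```python
# Only the types mentioned in this article; anything else is reported as "other".
NAL_TYPE_NAMES = {5: "IDR slice", 7: "SPS", 8: "PPS"}


def is_idr(header_byte: int) -> bool:
    """True if the low five bits of the header byte equal 5."""
    return (header_byte & 31) == 5


def describe(header_byte: int) -> str:
    """Map a NALU header byte to a readable type name."""
    return NAL_TYPE_NAMES.get(header_byte & 31, "other")


headers = [0x67, 0x68, 0x65]
for h in headers:
    print(hex(h), describe(h), is_idr(h))
```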
