Video I Frames, P Frames, and B Frames

Source: Internet
Author: User

In video compression, each frame represents a still image. During actual compression, various algorithms are used to reduce the amount of data, and the I/P/B frame scheme is the most common.

Simply put, an I frame is a keyframe and uses intra-frame compression, comparable to the per-frame compression used in AVI. A P frame is forward-predicted. A B frame is bidirectionally predicted. Both compress data based on I frames.

An I frame is a key frame: the complete picture of this frame is retained, and decoding needs only this frame's data (because it contains the full picture).

A P frame represents the difference between this frame and a previous key frame (or P frame). When decoding, the difference carried by this frame is superimposed on the previously cached picture to produce the final picture. (In other words, it is a difference frame: a P frame has no complete picture data, only the data that differs from the previous frame's picture.)
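The superimposing step can be illustrated with a minimal sketch. It assumes the difference is a plain per-sample delta and leaves out motion compensation entirely; the function name and the sample values are hypothetical.

```python
def apply_p_frame(cached, diff):
    """Rebuild a picture by adding a P-frame difference to the cached picture."""
    return [[c + d for c, d in zip(c_row, d_row)]
            for c_row, d_row in zip(cached, diff)]

# Hypothetical 2x3 samples decoded from an I frame (the cached picture).
i_frame = [[10, 10, 10],
           [20, 20, 20]]
# Hypothetical P-frame payload: only one sample changed.
p_diff = [[0, 0, 5],
          [0, 0, 0]]

picture = apply_p_frame(i_frame, p_diff)
print(picture)  # [[10, 10, 15], [20, 20, 20]]
```

Because most entries of the difference are zero, it compresses far better than the full picture would.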

A B frame is a bidirectional difference frame: it records the differences between this frame and both the preceding and following frames (the details are more complex; there are four cases). In other words, to decode a B frame, the decoder needs not only the previously cached picture but also the decoded picture that follows. The final picture is obtained by superimposing this frame's data on the preceding and following pictures. B frames have a high compression rate, but decoding them puts more load on the CPU.

Compression method adopted: grouping. Several frames of images are placed into one group (GOP); to limit the effect of motion changes within a group, the number of frames should not be too large.

1. Define frames: each frame image in a group is defined as one of three types, namely I frame, B frame, or P frame;

2. Predict frames: with the I frame as the base frame, the P frame is predicted from the I frame, and the B frame is predicted from the I frame and the P frame;

3. Data transmission: finally, the I frame data and the predicted difference information are stored and transmitted.
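As a rough illustration of the "define frames" step, the sketch below labels the frames of one GOP in display order. The two parameters (`gop_size` and `p_distance`) and the resulting pattern are assumptions chosen for illustration; real encoders pick frame types adaptively.

```python
def gop_frame_types(gop_size, p_distance):
    """Return display-order frame types for one GOP.

    gop_size: number of frames in the group (hypothetical parameter).
    p_distance: spacing between I/P frames (hypothetical parameter).
    """
    types = []
    for index in range(gop_size):
        if index == 0:
            types.append("I")          # the base frame of the group
        elif index % p_distance == 0:
            types.append("P")          # predicted from the previous I or P
        else:
            types.append("B")          # predicted from both neighbours
    return "".join(types)

print(gop_frame_types(12, 3))  # IBBPBBPBBPBB
```

Setting `p_distance` to 1 gives a group with no B frames, for example `gop_frame_types(4, 1)` returns `IPPP`.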

One, I frame

An I image (frame) is an intra-coded image that reduces the amount of data transmitted by removing the spatial redundancy of the image as far as possible.

I frames are also called intra pictures. An I frame is usually the first frame of each GOP (the group-of-pictures structure used by MPEG). It is moderately compressed, serves as a random-access reference point, and can be treated as a still image. In MPEG coding, some frames of the video sequence are compressed into I frames, some into P frames, and some into B frames. The I-frame method is an intra-frame compression method (P and B are inter-frame), also known as the "keyframe" compression method. It is a compression technique based on the discrete cosine transform (DCT), similar to the JPEG compression algorithm. I-frame compression can reduce the data to 1/6 of its size with no obvious compression artifacts.

I frame features:

1. It is a full-frame intra-coded frame. It encodes and transmits the whole frame's image information using JPEG-style compression;

2. The complete image can be reconstructed using only I-frame data during decoding;

3. I frames describe the image background and the details of moving objects;

4. I frames are generated without reference to other images;

5. An I frame is the reference frame for P frames and B frames (its quality directly affects the quality of the subsequent frames in the same group);

6. An I frame is the base frame (first frame) of a GOP, and there is only one I frame in a group;

7. I frames do not need motion vectors;

8. I frames take up a relatively large amount of data.

I frame coding process:

(1) Perform intra prediction and decide which intra prediction mode to use.

(2) Subtract the predicted value from the pixel value to obtain the residual.

(3) Transform and quantize the residual.

(4) Apply variable-length coding and arithmetic coding.

(5) Reconstruct the image and filter it so that it can serve as a reference frame for other frames.
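Steps (2), (3), and (5) can be sketched on a single row of eight samples. This is only an illustration under several assumptions: a flat predicted value stands in for step (1), a 1-D orthonormal DCT with one uniform quantization step stands in for the real 2-D transform, and the entropy coding of step (4) and the filtering in step (5) are left out. All function names and sample values are hypothetical.

```python
import math

def dct(samples):
    """Orthonormal 1-D DCT-II."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(samples)))
    return out

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def encode_intra_row(pixels, predicted, step):
    residual = [p - q for p, q in zip(pixels, predicted)]   # step (2)
    return [round(c / step) for c in dct(residual)]         # step (3)

def reconstruct_row(levels, predicted, step):
    residual = idct([level * step for level in levels])     # part of step (5)
    return [round(q + r) for q, r in zip(predicted, residual)]

pixels = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical sample values
predicted = [60] * 8                         # hypothetical flat prediction
levels = encode_intra_row(pixels, predicted, step=4)
rebuilt = reconstruct_row(levels, predicted, step=4)
print(levels)
print(rebuilt)
```

The rebuilt row is close to, but not identical to, the input: quantization is where information is discarded, and a larger `step` trades quality for fewer bits.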

Two, P frame

A P image (frame) is a forward-predicted coded image, also known as a predictive frame. It reduces the amount of data transmitted by removing the temporal redundancy with respect to previously coded frames in the image sequence.

When encoding continuous moving images, several consecutive images are classified into three types: P, B, and I. A P frame is predicted from the P frame or I frame before it. It is compared with that earlier P or I frame for identical information, which means the inter-frame compression takes motion characteristics into account. The P-frame method compresses the current frame's data according to its difference from the adjacent previous frame (I or P frame). Combining P-frame and I-frame compression achieves higher compression with no obvious compression artifacts.

Prediction and reconstruction of P frames:

A P frame uses the I frame as its reference frame. For "a certain point" of the P frame, the encoder finds the predicted value and the motion vector in the I frame, and transmits the prediction difference together with the motion vector. At the receiver, the predicted value of that point is located in the I frame according to the motion vector and added to the difference to obtain the sample value of that point. Doing this for every point yields the complete P frame.
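The search for a predicted value and motion vector can be sketched as an exhaustive block match. This is a simplified illustration, not how any particular codec does it: it assumes whole-sample motion, a small search window, and the sum of absolute differences (SAD) as the matching cost, and all function names and frame contents are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def motion_search(reference, current, top, left, size, search_range):
    """Return the motion vector (dy, dx) and residual for one block."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(target, get_block(reference, y, x, size))
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx))
    _, vector = best
    predicted = get_block(reference, top + vector[0], left + vector[1], size)
    residual = [[t - p for t, p in zip(t_row, p_row)]
                for t_row, p_row in zip(target, predicted)]
    return vector, residual

def reconstruct_block(reference, top, left, size, vector, residual):
    """Receiver side: fetch the prediction via the vector, add the difference."""
    predicted = get_block(reference, top + vector[0], left + vector[1], size)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]

# A 2x2 bright object moves from (row 1, col 1) to (row 2, col 3).
reference = [[0, 0, 0, 0, 0, 0],
             [0, 9, 9, 0, 0, 0],
             [0, 9, 9, 0, 0, 0],
             [0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0]]
current = [[0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 9, 9, 0],
           [0, 0, 0, 9, 9, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0]]

vector, residual = motion_search(reference, current, 2, 3, 2, 2)
print(vector)    # (-1, -2)
print(residual)  # [[0, 0], [0, 0]]
```

Here the block is described by one vector and an all-zero difference, which is why motion-compensated frames need so little data.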

P frame features:

① P frames are coded frames located 1 to 2 frames after an I frame.

② A P frame uses motion compensation to transmit the difference (prediction error) between it and the previous I or P frame, together with the motion vector.

③ During decoding, the complete P-frame image can be reconstructed only by summing the predicted value from the I frame and the prediction error.

④ P frames use forward-predicted inter-frame coding. A P frame refers only to the nearest preceding I frame or P frame.

⑤ A P frame can be the reference frame for the P frames after it, and for the B frames before and after it.

⑥ Because P frames are reference frames, they can cause decoding errors to spread.

⑦ Because only the difference is transmitted, the compression ratio of P frames is relatively high.

Three, B frame

A B image (frame) is a coded image that reduces the amount of data transmitted by exploiting temporal redundancy with both the coded frames before it and the coded frames after it in the source image sequence. It is also known as a bidirectional predictive frame.

The B-frame method is an inter-frame compression algorithm with bidirectional prediction. When a frame is compressed into a B frame, it is compressed according to the differences among the adjacent previous frame, the current frame, and the next frame; that is, only the differences between the current frame and its preceding and following frames are recorded. Only with B-frame compression can compression ratios as high as 200:1 be reached. Generally, I frames have the lowest compression efficiency, P frames are higher, and B frames are the highest.

Prediction and reconstruction of B frames:

A B frame uses the preceding I or P frame and the following P frame as reference frames. The encoder "finds" the predicted value and two motion vectors for "a certain point" of the B frame, and transmits the prediction difference and the motion vectors. The receiver "finds (calculates)" the predicted value in the two reference frames according to the motion vectors and adds it to the difference to obtain the sample value of that point, from which the complete B frame can be obtained.
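A minimal sketch of the two-reference prediction follows. It assumes both reference blocks have already been fetched through their motion vectors, and it combines them by simple averaging, which is one common choice rather than something the text specifies. The function names and sample values are hypothetical.

```python
def predict_bidirectional(previous_block, next_block):
    """Average the forward and backward predictions sample by sample."""
    return [(a + b) // 2 for a, b in zip(previous_block, next_block)]

def residual(actual, predicted):
    return [x - p for x, p in zip(actual, predicted)]

previous_block = [100, 100, 100, 100]  # from the preceding I or P frame
next_block = [120, 120, 120, 120]      # from the following P frame
b_block = [110, 111, 109, 110]         # B-frame samples being coded

prediction = predict_bidirectional(previous_block, next_block)
diff = residual(b_block, prediction)
print(diff)  # [0, 1, -1, 0]

# Receiver side: prediction plus transmitted difference gives the samples back.
rebuilt = [p + d for p, d in zip(prediction, diff)]
print(rebuilt == b_block)  # True
```

Predicting from either reference alone would leave differences of about 10 per sample in this example, so the two-sided prediction leaves much less to transmit.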

B frame features:

1. B frames are predicted from the preceding I or P frame and the following P frame;

2. A B frame transmits the prediction error and the motion vectors between it and the preceding I or P frame and the following P frame;

3. B frames are bidirectionally predicted coded frames;

4. B frames have the highest compression ratio, because they only reflect the changes of moving objects between the two reference frames, and the prediction is more accurate;

5. B frames are not reference frames and do not cause decoding errors to spread.
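The difference in error propagation between P frames and B frames can be made concrete with a small dependency sketch. It assumes the simplified reference rules described in this article (a P frame refers to the nearest preceding I or P frame, a B frame refers to the nearest I or P frame on each side, and nothing refers to a B frame); the function name is hypothetical.

```python
def damaged_frames(display_types, corrupted_index):
    """Return display indices of frames affected by one corrupted frame."""
    anchor_bad = {}
    previous_anchor_bad = False
    for i, t in enumerate(display_types):
        if t in ("I", "P"):
            # An I frame is only bad if it is the corrupted frame itself;
            # a P frame is also bad if the frame it refers to is bad.
            is_bad = i == corrupted_index or (t == "P" and previous_anchor_bad)
            anchor_bad[i] = is_bad
            previous_anchor_bad = is_bad
    anchors = sorted(anchor_bad)
    bad = []
    for i, t in enumerate(display_types):
        if t == "B":
            before = [a for a in anchors if a < i]
            after = [a for a in anchors if a > i]
            refs = before[-1:] + after[:1]
            if i == corrupted_index or any(anchor_bad[a] for a in refs):
                bad.append(i)
        elif anchor_bad[i]:
            bad.append(i)
    return bad

print(damaged_frames("IBBPBBPBBI", 3))  # [1, 2, 3, 4, 5, 6, 7, 8]
print(damaged_frames("IBBPBBPBBI", 4))  # [4]
```

A corrupted P frame damages everything up to the next I frame, while a corrupted B frame damages only itself.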

The basic flow of P-frame and B-frame coding is:

(1) Perform motion estimation and compute the rate-distortion function value of the inter-frame coding mode. P frames refer only to preceding frames, while B frames can also refer to following frames.

(2) Perform intra prediction, select the intra mode with the smallest rate-distortion function value, and compare it with the inter-frame mode to decide which coding mode to use.

(3) Calculate the difference between the actual value and the predicted value.

(4) Transform and quantize the residual.

(5) Entropy coding; if the inter-frame coding mode is used, also code the motion vectors.
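The mode decision in steps (1) and (2) can be sketched as picking the candidate with the smallest rate-distortion cost. The Lagrangian form `distortion + multiplier * bits` used here is a common formulation assumed for illustration; the text does not say which function is used, and the candidate numbers are made up.

```python
def rd_cost(distortion, bits, lagrange_multiplier):
    """Rate-distortion cost: distortion plus weighted bit count."""
    return distortion + lagrange_multiplier * bits

def choose_mode(candidates, lagrange_multiplier):
    """candidates maps a mode name to a (distortion, bits) pair."""
    return min(candidates,
               key=lambda mode: rd_cost(*candidates[mode], lagrange_multiplier))

# Hypothetical measurements for one block.
candidates = {
    "intra": (40.0, 300),   # lower distortion, many bits
    "inter": (55.0, 120),   # higher distortion, few bits
}

print(choose_mode(candidates, 0.1))   # inter
print(choose_mode(candidates, 0.01))  # intra
```

A larger multiplier makes bits more expensive, so the cheaper inter-frame mode wins; a smaller one favors the mode with less distortion.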

Note: I, B, and P frames are defined artificially according to the needs of the compression algorithm, and all of them are real physical frames. Which image becomes an I frame is arbitrary, but once the I frame is determined, the subsequent frames strictly follow the prescribed order.

Four, practical application

From the explanation above, we know that decoding I and P frames is relatively simple and consumes relatively few resources. An I frame only needs to be decoded on its own. A P frame only requires the decoder to cache the previous picture and reuse that cached picture when the P frame arrives. If the video stream contains only I and P frames, the decoder can ignore the data that follows, decoding as it reads and moving forward linearly, which is comfortable for everyone.

However, many movies on the web use B frames, because a B frame records the differences from both the preceding and following frames and can save more space than a P frame. The file becomes smaller, but the decoder has more trouble: when decoding, it needs not only the previously cached picture but also the next I or P picture (that is, it must read ahead and pre-decode). Moreover, B frames cannot simply be discarded, because they also contain picture information. If they are discarded and the previous picture is simply repeated, the picture stutters (in effect, frames are dropped). And because movies on the web often use quite a lot of B frames to save space, the more B frames are used, the more trouble a player that does not support B frames has, and the more the picture stutters.
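The read-ahead requirement can be shown by reordering frames from display order into an order in which every B frame comes after both pictures it depends on. This is a simplified sketch with a hypothetical function name; it assumes each B frame depends on the nearest I or P frame on either side.

```python
def decode_order(display_types):
    """Reorder display-order frames so B frames follow both references.

    display_types: a string such as "IBBPBBP" in display order.
    Returns a list of labels like "P3" (type plus display index).
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(display_types):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)      # must wait for the next I or P
        else:
            order.append(label)          # decode the I or P frame first
            order.extend(pending_b)      # then the B frames that needed it
            pending_b = []
    order.extend(pending_b)  # simplification: trailing B frames go last
    return order

print(" ".join(decode_order("IBBPBBP")))  # I0 P3 B1 B2 P6 B4 B5
```

The decoder has to decode P3 before it can show B1 and B2, which is the extra buffering and work that an I/P-only stream avoids.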

Generally speaking, the compression ratio of I frames is 7 (similar to JPG), that of P frames is 20, and B frames can reach an even higher ratio. Using B frames therefore saves a lot of space, and the space saved can be used to store more I frames, so at the same bit rate the video can provide better picture quality.


In the above illustration, the GOP (Group of Pictures) length is 13, S0~S7 are the 8 viewpoints, and T0~T12 are the 13 moments of the GOP. The number of frames in each GOP is the product of the GOP length and the number of viewpoints. In the diagram, a GOP contains 94 B frames, which account for 90.38% of all frames in the GOP. The longer the GOP, the higher the proportion of B frames and the better the rate-distortion performance of the coding. The following figure compares the rate-distortion performance of the sequence Race1 under different GOP lengths.
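The figures quoted above can be checked with a few lines of arithmetic; the variable names are only for illustration.

```python
viewpoints = 8    # S0 to S7
gop_length = 13   # T0 to T12
b_frames = 94     # B-frame count stated in the text

total_frames = viewpoints * gop_length
b_share = b_frames / total_frames

print(total_frames)       # 104
print(f"{b_share:.2%}")   # 90.38%
```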


Original source: http://blog.csdn.net/liangxiaozhang/article/details/17628829
