Video Bit Rate, Frame Rate, Resolution, and an Introduction to H.264

Source: Internet
Author: User

Which of the following affects the clarity (definition) of a video: bit rate, frame rate, or resolution?

Bit rate: affects file size, and is directly proportional to it. The higher the bit rate, the larger the file; the lower the bit rate, the smaller the file.

Bit rate is the number of bits transmitted per unit time during data transmission. The unit is usually kbps, that is, thousands of bits per second. It can loosely be thought of as a sampling rate: the more bits spent per unit time, the higher the accuracy and the closer the processed file is to the original. File size, however, is proportional to bit rate, so almost all encoding formats focus on achieving the least distortion at the lowest bit rate. CBR (constant bit rate) and VBR (variable bit rate) both derive from this core goal. Bit rate is tied to distortion: the higher the bit rate, the clearer the picture; the lower the bit rate, the coarser the picture and the more mosaic (blocking) artifacts it shows.
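The proportionality between bit rate and file size can be checked with a small calculation. This is a minimal sketch: the function name and the 60-second, 512 kbps example are illustrative assumptions, and container overhead is ignored.

```python
def file_size_bytes(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate stream size: bits per second times seconds, divided by 8."""
    return bit_rate_kbps * 1000 * duration_s / 8


# Hypothetical example: a 512 kbps stream lasting 60 seconds.
size = file_size_bytes(512, 60)
print(f"{size / 1_000_000:.2f} MB")  # 3.84 MB
```

Doubling either the bit rate or the duration doubles the result, which is the proportionality described above.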

Frame rate: affects the smoothness of the picture, and is directly proportional to it. The higher the frame rate, the smoother the picture; the lower the frame rate, the choppier the motion. If the bit rate is allowed to vary, the frame rate also affects file size: the higher the frame rate, the more frames pass each second, the higher the bit rate, and the larger the file.

The frame rate is the number of frames transmitted per second. It can also be understood as the number of times per second that the graphics processor can refresh the picture.

Resolution: affects the image dimensions, and is directly proportional to them. The higher the resolution, the larger the image; the lower the resolution, the smaller the image.


When the bit rate is fixed, resolution is inversely related to clarity: the higher the resolution, the less clear the image; the lower the resolution, the clearer the image.
When the resolution is fixed, bit rate is directly related to clarity: the higher the bit rate, the clearer the image; the lower the bit rate, the less clear the image.
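One rough way to see both relationships is to compute how many encoded bits are available per pixel. This is only a sketch: bits per pixel is a crude proxy for clarity, and the 25 fps frame rate and the 704x576 comparison resolution are assumptions chosen for illustration.

```python
def bits_per_pixel(bit_rate_kbps, width, height, fps):
    """Average number of encoded bits available for each pixel of each frame."""
    return bit_rate_kbps * 1000 / (width * height * fps)


# Same 512 kbps budget at two resolutions (25 fps is an assumed frame rate).
cif = bits_per_pixel(512, 352, 288, 25)      # CIF
larger = bits_per_pixel(512, 704, 576, 25)   # four times as many pixels
print(round(cif, 3), round(larger, 3))
```

With the bit rate fixed, quadrupling the pixel count leaves a quarter of the bits for each pixel; with the resolution fixed, the figure scales directly with the bit rate.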


Bandwidth and Frame Rate

For example, suppose video is transmitted over an ADSL line whose uplink bandwidth is only 512 kbps, but four channels of CIF-resolution video need to be sent. As a general rule, the recommended bit rate for CIF resolution is 512 kbps, so by this calculation only one channel can be transmitted. Reducing the bit rate at the same frame rate will inevitably degrade image quality. To preserve image quality, the frame rate must be reduced instead: the bit rate then drops while the quality of each frame is maintained, but the continuity of motion suffers.
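The arithmetic behind this example can be sketched as follows. The 25 fps starting frame rate is an assumption (the text does not give one), and the model simply keeps the number of bits per frame constant, which ignores the fact that prediction between frames becomes less efficient as frames get further apart.

```python
def channels_supported(uplink_kbps, per_channel_kbps):
    """How many full-rate channels fit in the uplink."""
    return uplink_kbps // per_channel_kbps


def reduced_frame_rate(full_fps, full_kbps, available_kbps):
    """Frame rate that keeps bits-per-frame constant at a lower bit rate."""
    return full_fps * available_kbps / full_kbps


print(channels_supported(512, 512))            # 1 channel at the recommended rate
per_channel = 512 / 4                          # four channels share the uplink
print(reduced_frame_rate(25, 512, per_channel))  # 6.25 frames per second
```

Under these assumptions, fitting four channels into 512 kbps leaves each channel about a quarter of its original frame rate.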



H.264 builds on MPEG-4 technology. Its encoding and decoding process consists mainly of five parts: inter-frame and intra-frame prediction (estimation), transform and inverse transform, quantization and inverse quantization, loop filtering, and entropy coding.

The signal-to-noise ratio (SNR) of H.264 is significantly better than that of MPEG-4 (ASP) and H.263++ (HLP): on average it is 2 dB higher than MPEG-4 (ASP) and 3 dB higher than H.263 (HLP). The six test rates and their conditions are: 32 kbit/s, 10 f/s, QCIF format; 64 kbit/s, 15 f/s, QCIF format; 128 kbit/s, 15 f/s, CIF format; 256 kbit/s, 15 f/s, QCIF format; 512 kbit/s, 30 f/s, CIF format; and 1024 kbit/s, 30 f/s, CIF format.


The biggest advantage of H.264 is its high data compression ratio. At the same image quality, the compression ratio of H.264 is more than 2 times that of MPEG-2 and 1.5 to 2 times that of MPEG-4. For example, if the raw file is 88 GB, compressing it with the MPEG-2 standard yields 3.5 GB, a compression ratio of 25:1, while compressing it with the H.264 standard yields 879 MB. Going from 88 GB to 879 MB, H.264 reaches an astonishing compression ratio of 102:1. Why is the compression ratio of H.264 so high? Its low bit rate plays an important role: compared with compression technologies such as MPEG-2 and MPEG-4 ASP, H.264 greatly reduces the user's download time and data traffic charges.
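The two ratios quoted above can be reproduced with a short calculation. This sketch assumes 1 GB = 1024 MB, which is the convention under which the 102:1 figure comes out; the function name is illustrative.

```python
def compression_ratio(raw_mb, compressed_mb):
    """Ratio of original size to compressed size."""
    return raw_mb / compressed_mb


RAW_MB = 88 * 1024  # 88 GB, assuming 1 GB = 1024 MB

mpeg2 = compression_ratio(RAW_MB, 3.5 * 1024)  # 3.5 GB output
h264 = compression_ratio(RAW_MB, 879)          # 879 MB output
print(f"MPEG-2 {mpeg2:.1f}:1, H.264 {h264:.1f}:1")
```

The results are roughly 25.1:1 and 102.5:1, in line with the rounded figures in the text.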





I/P/B Frames

I frame: intra-coded frame, that is, a key frame or independent frame.

Predicted frames: the I frame serves as the base frame. The I frame is used to predict P frames, and then I and P frames are used to predict B frames.

B frame: bidirectionally predicted, interpolated frame.

P frame: forward-predicted frame.

In addition to P frames and B frames, H.264 also supports a new stream-switching frame, the SP frame.

Statistics show that between images 1 to 2 frames apart, fewer than 10% of the pixels change in luminance by more than 2%, while the chrominance changes by less than 1%.

In video compression, an I frame is compressed as a still image and is an independent frame. A P frame is compressed with reference to an I frame, exploiting the similarity between frames, and is not an independent frame. Most frames in a compressed video are B/P frames, so the quality of the video is mainly reflected in the B/P frames. Because B/P frames are not independent and store only their difference from adjacent reference frames such as I frames, they have no resolution of their own in the usual sense, and are better regarded as a binary difference sequence.
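The idea that a dependent frame stores only its difference from a reference frame can be illustrated with a toy example. This is a deliberately simplified sketch: frames are short lists of luminance values invented for illustration, and real codecs add motion compensation, a transform, and quantization on top of this.

```python
def encode_difference(reference, current):
    """Store only the per-pixel change relative to the reference frame."""
    return [c - r for r, c in zip(reference, current)]


def decode_difference(reference, diff):
    """Rebuild the frame by adding the stored change back to the reference."""
    return [r + d for r, d in zip(reference, diff)]


i_frame = [100, 100, 102, 104, 104, 103]     # independent frame (toy data)
next_frame = [100, 100, 102, 105, 104, 103]  # only one pixel changed

diff = encode_difference(i_frame, next_frame)
print(diff)  # [0, 0, 0, 1, 0, 0]
print(decode_difference(i_frame, diff) == next_frame)  # True
```

Because most entries of the difference are zero, it compresses far better than the frame itself, but it cannot be decoded without the reference.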

However, this binary sequence is compressed lossily using a quantization parameter before entropy coding is applied. Video quality is directly determined by the quantization parameter, which in turn directly affects the compression ratio and the bit rate.

Video quality can be expressed subjectively or objectively. The subjective measure is usually the perceived clarity of the video, while the objective measures are the quantization parameter, the compression ratio, or the bit rate. When the video source and the compression algorithm are the same, the three are directly linked: a larger quantization parameter gives a higher compression ratio and a lower bit rate.

Changing the resolution is also called resampling. Going from high resolution to low resolution is called downsampling. Because the source contains ample data, the task is only to retain as much information as possible, and the result is generally fairly good. Going from low resolution to high resolution is called upsampling. Because the missing pixels must be filled in (guessed) by interpolation or similar methods, distortion is inevitable. This too is a loss of video quality (clarity).
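A one-dimensional toy example shows why a round trip through a lower resolution loses information. The averaging and linear-interpolation filters here are assumptions chosen for simplicity, and the sample values are made up.

```python
def downsample_2x(samples):
    """Average each pair of neighbours (length must be even)."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]


def upsample_2x(samples):
    """Linear interpolation between neighbours; the last sample is repeated."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.extend([s, (s + nxt) / 2])
    return out


original = [10, 20, 80, 20, 10, 10, 50, 90]
low = downsample_2x(original)
restored = upsample_2x(low)
print(low)       # [15.0, 50.0, 10.0, 70.0]
print(restored)  # same length as the original, but different values
```

The restored row has the original length, yet the interpolated values are guesses, so the fine detail of the original is gone.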





Transmitted data volume

Because the packets received by the network adapter vary in length, the mix of packet lengths inevitably affects the interface speed. In addition to the data payload (net load), the bus also transmits the headers and trailers that each protocol layer adds. With this network overhead included, the amount of transmitted data is generally less than or equal to the payload data × 1.3.












H.264 Key Technology
1. Intra-Frame Prediction Coding
Intra-frame coding is used to reduce the spatial redundancy of an image. To improve intra-frame coding efficiency, H.264 makes full use of the spatial correlation between adjacent macroblocks within a frame, since adjacent macroblocks usually have similar properties. When coding a given macroblock, the encoder first predicts it from the surrounding macroblocks (typically the macroblock to the upper left, because it has already been encoded) and then encodes the difference between the predicted and actual values. Compared with coding the frame directly, this can greatly reduce the bit rate.
H.264 provides 6 modes for 4x4 pixel block prediction: 1 DC prediction mode and 5 directional prediction modes, as shown in Figure 2. In the figure, the 9 pixels A to I of the adjacent blocks have already been encoded and can be used for prediction. If mode 4 is selected, the 4 pixels a, b, c, and d are predicted to equal E, the 4 pixels e, f, g, and h are predicted to equal F, and so on. H.264 also supports 16x16 intra-frame coding.
Figure 2: Intra-frame coding modes
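The predict-then-encode-the-difference idea can be sketched for a single 4x4 block. Since the figure is not reproduced here, the pixel layout is an assumption: each row of the block is predicted from the already-coded pixel to its left, and a DC variant predicts every pixel from the mean of the neighbours. The function names and sample values are illustrative only.

```python
def predict_rows_from_left(left_column):
    """Directional prediction: every pixel in a row copies the pixel to its left."""
    return [[v] * 4 for v in left_column]


def predict_dc(neighbours):
    """DC prediction: every pixel is the rounded mean of the coded neighbours."""
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * 4 for _ in range(4)]


def residual(block, prediction):
    """Difference that is actually transformed and coded."""
    return [[b - p for b, p in zip(br, pr)] for br, pr in zip(block, prediction)]


block = [[52, 53, 53, 54],
         [60, 60, 61, 61],
         [70, 71, 71, 72],
         [80, 80, 80, 81]]
left = [52, 60, 70, 80]  # already-coded pixels to the left of each row (toy data)

res = residual(block, predict_rows_from_left(left))
print(res)  # small values instead of pixel values in the 50 to 81 range
```

The residual contains only values between 0 and 2, which take far fewer bits to code than the original pixel values.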

2. Inter-Frame Prediction Coding
Inter-frame predictive coding exploits the temporal redundancy between consecutive frames through motion estimation and compensation. Motion compensation in H.264 supports most of the key features of previous video coding standards and flexibly adds further functions. In addition to P frames and B frames, H.264 also supports a new stream-switching frame, the SP frame, as shown in Figure 3. When a bitstream contains SP frames, the decoder can switch quickly between bitstreams with similar content but different bit rates, and random access and fast playback modes are supported as well.
Figure 3: SP frames
Motion estimation in H.264 has the following four characteristics.

(1) Macroblock partitions of different sizes and shapes
Motion compensation for each 16x16 pixel macroblock can use partitions of different sizes and shapes. H.264 supports 7 modes, as shown in Figure 4. Motion compensation with small blocks handles motion detail better, reduces blocking artifacts, and improves image quality.
Figure 4: Macroblock partitioning methods

(2) High-precision sub-pixel motion compensation
H.263 uses motion estimation with half-pixel precision, whereas H.264 can use motion estimation with 1/4- or 1/8-pixel precision. When the same precision is required, the residual after 1/4- or 1/8-pixel motion estimation in H.264 is smaller than the residual after half-pixel motion estimation in H.263. Thus, at the same precision, H.264 needs a lower bit rate for inter-frame coding.

(3) Multi-frame prediction
H.264 provides an optional multi-frame prediction function. During inter-frame coding, up to 5 different reference frames can be selected, which provides better error-correction performance and so improves video image quality. This feature is mainly useful in the following scenarios: periodic motion, panning motion, and switching the camera back and forth between two different scenes.

(4) Deblocking filter
H.264 defines an adaptive deblocking filter that processes the horizontal and vertical block edges inside the prediction loop, greatly reducing blocking artifacts.
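The motion estimation that underlies the inter-frame prediction in this section can be sketched as a full-search block match at integer-pixel precision. This is a minimal illustration, not the search an actual H.264 encoder uses: the block size, the search range, the sum-of-absolute-differences cost, and the toy frames are all assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Full search over +/- `search` pixels; returns ((dx, dy), cost) of the best match."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= w and
                      0 <= by + dy and by + dy + size <= h)
            if not inside:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# Toy data: the current frame is the reference shifted right by one pixel.
ref = [[3 * (8 * y + x) for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

mv, cost = best_motion_vector(cur, ref, bx=2, by=2)
print(mv, cost)  # (-1, 0) 0
```

The search finds that the block at (2, 2) matches the reference block one pixel to the left with zero residual, so only the motion vector needs to be coded. Smaller partitions and sub-pixel precision refine this same idea.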
3. Integer Transform
H.264 uses a DCT-like transform based on 4x4 pixel blocks, but it is an integer-based spatial transform, so the inverse transform does not suffer from mismatch errors caused by rounding. The transform matrix is shown in Figure 5. Compared with floating-point operations, the integer DCT may introduce some extra error, but because quantization after the transform introduces its own quantization error, the extra error caused by the integer DCT is not significant by comparison. In addition, the integer DCT reduces the amount and complexity of computation and is well suited to porting to fixed-point DSPs.
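The following sketch applies a 4x4 integer transform and inverts it exactly. The matrix used is the core forward transform matrix from the published H.264 standard, which is assumed (not confirmed by the text) to be what the missing figure showed. The inverse here is a purely mathematical one using exact fractions; the standard instead folds the scaling into quantization, which this sketch does not model.

```python
from fractions import Fraction

# Core 4x4 forward transform matrix of the published H.264 standard (assumed
# to correspond to the transform matrix figure, which is not reproduced here).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Y = CF * X * CF^T, using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


def inverse_transform(coeffs):
    """Exact inverse: the rows of CF are orthogonal with squared norms 4, 10, 4, 10."""
    norms = [sum(v * v for v in row) for row in CF]
    scaled = [[Fraction(coeffs[i][j], norms[i] * norms[j]) for j in range(4)]
              for i in range(4)]
    out = matmul(matmul(transpose(CF), scaled), CF)
    return [[int(v) for v in row] for row in out]


block = [[5, 11, 8, 10],
         [9, 8, 4, 12],
         [1, 10, 11, 4],
         [19, 6, 15, 7]]
coeffs = forward_transform(block)
print(coeffs[0][0])                        # DC coefficient: sum of all 16 samples
print(inverse_transform(coeffs) == block)  # True
```

Because every forward operation is an integer addition, subtraction, or multiplication by 2, encoder and decoder get bit-identical results, which is the property the paragraph above describes.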

4. Quantization
H.264 offers 32 different quantization step sizes, similar to the 31 quantization step sizes in H.263, but in H.264 the step size increases at a compound rate of 12.5% rather than by a fixed constant.
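A table of step sizes that grows at a compound rate of 12.5% can be generated as follows. The base step of 1.0 is a placeholder assumption, since the text does not give the actual starting value.

```python
def quantization_steps(base_step=1.0, count=32, growth=0.125):
    """Step sizes where each entry is 12.5% larger than the previous one."""
    return [base_step * (1 + growth) ** i for i in range(count)]


steps = quantization_steps()  # base_step=1.0 is a hypothetical starting value
print(len(steps))
print(round(steps[6] / steps[0], 3))  # growth over six steps
```

At this rate the step size roughly doubles every six entries, so equal increments of the index give equal relative changes in coarseness.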
H.264 has two ways of reading out the transform coefficients: zigzag scan and double scan, as shown in Figure 6. In most cases the simple zigzag scan is used. Double scan is used only in blocks with smaller quantization levels, where it helps improve coding efficiency.
Figure 6: Readout methods for transform coefficients
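A zigzag readout of a 4x4 block can be sketched by walking the anti-diagonals in alternating directions. The function below is an illustrative construction and makes no claim about the double scan, whose order the text does not describe.

```python
def zigzag_order(n=4):
    """(row, column) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left to top-right
        for r in rows:
            order.append((r, s - r))
    return order


raster_indices = [r * 4 + c for r, c in zigzag_order(4)]
print(raster_indices)
# [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```

Low-frequency coefficients near the top-left corner come first, so the zeros produced by quantization tend to cluster at the end of the sequence.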

5. Entropy Encoding
The last step of video coding is entropy coding. H.264 adopts two different entropy coding methods: universal variable-length coding (UVLC) and context-based adaptive binary arithmetic coding (CABAC).




H.264 technical highlights

1. Hierarchical Design
Conceptually, the H.264 algorithm can be divided into two layers: the video coding layer (VCL: Video Coding Layer), responsible for efficient representation of the video content, and the network abstraction layer (NAL: Network Abstraction Layer), responsible for packaging and transmitting data in the manner the network requires. A packet-based interface is defined between the VCL and the NAL; packetization and the corresponding signaling are part of the NAL. In this way, the tasks of high coding efficiency and network friendliness are handled by the VCL and the NAL respectively.
The VCL includes block-based motion-compensated hybrid coding and some new features. Like previous video coding standards, H.264 does not include pre-processing, post-processing, and similar functions in the draft, which increases the flexibility of the standard.
The NAL is responsible for encapsulating data in the segment format of the underlying network. This includes framing, signaling of logical channels, use of timing information, and end-of-sequence signals. For example, the NAL supports video transmission formats on circuit-switched channels, and it supports RTP/UDP/IP transmission of video over the Internet. A NAL unit includes its own header information, segment structure information, and the actual payload, that is, the VCL data from the layer above (if data partitioning is used, the data may consist of several parts).

2. High-precision and multi-mode Motion Estimation
H.264 supports motion vectors with 1/4- or 1/8-pixel precision. At 1/4-pixel precision, a 6-tap filter can be used to reduce high-frequency noise. For motion vectors with 1/8-pixel precision, a more complex 8-tap filter can be used. When performing motion estimation, the encoder can also select an "enhanced" interpolation filter to improve the prediction.
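Sub-pixel motion vectors need sample values between the real pixels, which is where the interpolation filter comes in. The sketch below uses the 6-tap coefficients (1, -5, 20, 20, -5, 1) with division by 32 from the published H.264 standard to compute a half-pixel luma sample; the text does not list the coefficients, so treating them as the filter it refers to is an assumption. Quarter-pixel values and the 8-tap filter are not modelled.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # coefficients sum to 32


def half_pixel(samples):
    """Value midway between samples[2] and samples[3], from six neighbouring pixels."""
    acc = sum(t * s for t, s in zip(TAPS, samples))
    return min(255, max(0, (acc + 16) >> 5))  # round, divide by 32, clip to 8 bits


print(half_pixel([100] * 6))                # 100: a flat area stays flat
print(half_pixel([0, 10, 20, 30, 40, 50]))  # 25: midpoint of a linear ramp
```

The negative outer taps let the filter follow edges more sharply than a plain average of the two nearest pixels would.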

In H.264 motion prediction, a macroblock (MB) can be divided into different sub-blocks as shown in Figure 2, giving block sizes in 7 different modes. This flexible, fine-grained multi-mode partitioning fits the actual shape of moving objects in the image more closely, which greatly improves the accuracy of motion estimation. As a result, each macroblock can contain 1, 2, 4, 8, or 16 motion vectors.
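The motion vector counts quoted above follow directly from the partition sizes. The sketch below assumes the seven sizes of the published standard (16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4), which the text does not list, and assumes the whole macroblock uses one partition size.

```python
# Partition sizes (width, height) assumed from the published H.264 standard.
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def motion_vectors_per_macroblock(width, height):
    """One motion vector per partition when a 16x16 macroblock is split uniformly."""
    return (16 // width) * (16 // height)


counts = sorted({motion_vectors_per_macroblock(w, h) for w, h in PARTITIONS})
print(counts)  # [1, 2, 4, 8, 16]
```

Seven partition shapes give five distinct counts because 16x8 and 8x16 both yield 2 vectors, and 8x4 and 4x8 both yield 8.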

H.264 allows the encoder to use more than one previous frame for motion estimation. This is the so-called multi-frame reference technique. For example, if 2 or 3 previously encoded frames are available as reference frames, the encoder selects the one that gives the better prediction for each target macroblock and indicates, for each macroblock, which frame was used for prediction.

3. Integer transform of 4×4 blocks
Like previous standards, H.264 applies block-based transform coding to the residual, but the transform is an integer operation rather than a real-number operation, with a process similar to the DCT. The advantage of this method is that the encoder and decoder can perform the transform and inverse transform with identical precision, which makes simple fixed-point arithmetic easy to use. In other words, there is no "inverse transform error" here. The transform unit is a 4x4 block rather than the commonly used 8x8 block. Because the transform block is smaller, moving objects are partitioned more accurately: not only is the transform computation relatively small, but the stitching error at the edges of moving objects is also greatly reduced. To keep the small-block transform from producing gray-level differences between blocks in large smooth areas of the image, a second 4x4 transform can be applied to the DC coefficients of the 16 4x4 blocks of luma data in an intra macroblock (one per block, 16 in total).
To improve rate control, H.264 keeps the change between successive quantization step sizes at around 12.5% rather than increasing the step by a constant amount. Normalization of the transform coefficient amplitudes is handled in the inverse quantization process to reduce computational complexity. To preserve color fidelity, a smaller quantization step is used for the chroma coefficients.

4. Unified VLC
H.264 has two methods of entropy coding. One is to use a unified VLC (UVLC: Universal VLC) for all symbols to be coded; the other is context-adaptive binary arithmetic coding (CABAC: Context-Adaptive Binary Arithmetic Coding). CABAC is optional: its coding performance is slightly better than that of UVLC, but its computational complexity is also higher. UVLC uses a codeword set of unlimited length with a very regular structure, so different objects can be coded with the same code table. Codewords are easy to generate, and the decoder can easily identify a codeword's prefix. When bit errors occur, UVLC can resynchronize quickly.
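A code with these properties (unlimited length, regular structure, an easily recognized prefix) can be illustrated with the unsigned Exp-Golomb code used by the published H.264 standard. Treating it as a stand-in for the draft's UVLC is an assumption, since the text does not give the exact bit layout; bit strings are used instead of packed bytes for readability.

```python
def exp_golomb_encode(value: int) -> str:
    """Unsigned Exp-Golomb: N leading zeros, then the (N + 1)-bit binary of value + 1."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def exp_golomb_decode(bits: str, pos: int = 0):
    """Decode one codeword starting at `pos`; returns (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1  # the zero prefix tells the decoder how long the codeword is
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


print([exp_golomb_encode(v) for v in range(5)])
# ['1', '010', '011', '00100', '00101']

stream = "".join(exp_golomb_encode(v) for v in (0, 1, 3))
value, pos = exp_golomb_decode(stream)
print(value, pos)  # 0 1
```

No code table is needed: the same rule produces a codeword for any non-negative integer, and the count of leading zeros is enough for the decoder to find each codeword boundary.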

5. Intra-Frame Prediction
In the earlier H.26x and MPEG-x series standards, only inter-frame prediction was used. In H.264, intra-frame prediction is available when coding intra pictures. For each 4x4 block (except edge blocks, which are handled specially), each pixel can be predicted as a differently weighted sum (some weights may be 0) of the 17 closest previously coded pixels, that is, the 17 pixels above and to the left of the block containing the pixel. Clearly, this intra-frame prediction is not a temporal algorithm but a predictive coding algorithm in the spatial domain. It removes the spatial redundancy between adjacent blocks and achieves more effective compression.
As shown in Figure 4, a, b, ..., p in the 4x4 block are the 16 pixels to be predicted, while A, B, ..., P are pixels that have already been coded. For example, the value of pixel m can be predicted by the formula (J + 2K + L + 2)/4, or by (A + B + C + D + I + J + K + L)/8, and so on. Depending on which reference pixels are chosen, there are 9 different modes for luma, while intra-frame chroma prediction has only 1 mode.
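The two prediction formulas quoted for pixel m can be evaluated directly. This sketch assumes integer (floor) division, which the text does not specify, and uses made-up neighbour values.

```python
def predict_weighted(J, K, L):
    """(J + 2K + L + 2) / 4: a weighted average of three coded neighbours."""
    return (J + 2 * K + L + 2) // 4


def predict_average(A, B, C, D, I, J, K, L):
    """(A + B + C + D + I + J + K + L) / 8: the mean of eight coded neighbours."""
    return (A + B + C + D + I + J + K + L) // 8


# Hypothetical neighbour values.
print(predict_weighted(J=100, K=104, L=108))           # 104
print(predict_average(50, 50, 50, 50, 50, 50, 50, 50)) # 50
```

The +2 in the first formula rounds to the nearest integer before the division by 4; the second formula is shown exactly as quoted, without a rounding term.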
6. IP and Wireless Environments
The H.264 draft contains tools for error resilience, which facilitate the transmission of compressed video in environments prone to bit errors and packet loss, such as mobile channels or IP channels.
To withstand transmission errors, temporal synchronization in an H.264 video stream can be achieved by intra-picture refresh, while spatial synchronization is supported by slice-structured coding. To make resynchronization after a bit error easier, certain resynchronization points are also provided within the video data of a picture. In addition, intra macroblock refresh and multiple reference macroblocks allow the encoder to consider not only coding efficiency but also the characteristics of the transmission channel when deciding the macroblock mode.
Besides changing the quantization step size to adapt to the channel bit rate, H.264 often uses data partitioning to cope with changes in channel bit rate. In general, the idea of data partitioning is to generate video data with different priorities in the encoder to support quality of service (QoS) in the network. For example, with syntax-based data partitioning, the data of each frame is divided into several parts according to importance, which allows less important information to be discarded when a buffer overflows. A similar temporal data partitioning method can also be used, achieved by using multiple reference frames in P and B frames.

In wireless communication applications, the large bit rate variations of a wireless channel can be supported by changing the quantization precision or the spatial/temporal resolution of each frame. In the multicast case, however, the encoder cannot respond to each of the different bit rates required. Therefore, unlike the Fine Granular Scalability (FGS) layered coding method used in MPEG-4 (which has lower efficiency), H.264 uses stream-switching SP frames instead of layered coding.
