1. Data rate of TV images
1.1 ITU-R BT.601 standard data rate
According to the ITU-R BT.601 standard, using the 4:2:2 sampling format, the data rates are:
Luminance (Y):
858 samples/line × 525 lines/frame × 30 frames/s × 10 bits/sample ≈ 135 Mbit/s (NTSC)
864 samples/line × 625 lines/frame × 25 frames/s × 10 bits/sample ≈ 135 Mbit/s (PAL)
Cr (R−Y):
429 samples/line × 525 lines/frame × 30 frames/s × 10 bits/sample ≈ 67.5 Mbit/s (NTSC)
432 samples/line × 625 lines/frame × 25 frames/s × 10 bits/sample ≈ 67.5 Mbit/s (PAL)
Cb (B−Y):
429 samples/line × 525 lines/frame × 30 frames/s × 10 bits/sample ≈ 67.5 Mbit/s (NTSC)
432 samples/line × 625 lines/frame × 25 frames/s × 10 bits/sample ≈ 67.5 Mbit/s (PAL)
Total:
27 Msamples/s × 10 bits/sample = 270 Mbit/s
In fact, the data rate of the active picture actually shown on the screen is not that high:
Luminance (Y): 720 × 480 × 30 × 10 ≈ 104 Mbit/s (NTSC)
720 × 576 × 25 × 10 ≈ 104 Mbit/s (PAL)
Color difference (Cr, Cb): 2 × 360 × 480 × 30 × 10 ≈ 104 Mbit/s (NTSC)
2 × 360 × 576 × 25 × 10 ≈ 104 Mbit/s (PAL)
Total: ≈ 207 Mbit/s
If the precision of each sample is reduced from 10 bits to 8 bits, the data rate of the color digital TV signal drops to about 166 Mbit/s.
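The rate arithmetic above can be reproduced in a few lines of Python (a sketch; the helper name `rate_mbps` is ours, and Mbit/s here means 10^6 bits per second):

```python
# Sketch of the BT.601 rate arithmetic above, using the NTSC values from the text.

def rate_mbps(samples_per_line, lines, fps, bits=10):
    """Data rate of one component in Mbit/s."""
    return samples_per_line * lines * fps * bits / 1e6

# Full BT.601 raster: Y sampled at 13.5 MHz, Cr and Cb at 6.75 MHz each
y  = rate_mbps(858, 525, 30)   # ~135 Mbit/s
cr = rate_mbps(429, 525, 30)   # ~67.5 Mbit/s
cb = rate_mbps(429, 525, 30)   # ~67.5 Mbit/s
total = y + cr + cb            # ~270 Mbit/s

# Active picture only, at 10-bit and 8-bit precision
active_10 = (720*480*30*10 + 2*360*480*30*10) / 1e6   # ~207 Mbit/s
active_8  = active_10 * 8 / 10                        # ~166 Mbit/s
print(round(total), round(active_10), round(active_8))  # 270 207 166
```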
2. Data compression algorithm
A TV image contains a great deal of redundant information in both time and space, and the structure of the image itself is also redundant. In addition, as described earlier, human visual characteristics can be exploited to compress the image further; this is called visual redundancy.
The various kinds of redundancy in TV images and the methods used to exploit them:

| Kind | Content | Main methods currently used |
|------|---------|-----------------------------|
| Statistical characteristics: spatial redundancy | Correlation between pixels | Transform coding, predictive coding |
| Statistical characteristics: temporal redundancy | Correlation in the time direction | Inter-frame prediction, motion compensation |
| Image structure redundancy | The structure of the image itself | Contour coding, region segmentation |
| Knowledge redundancy | Knowledge shared by both ends | Knowledge-based coding |
| Visual redundancy | Human visual characteristics | Nonlinear quantization, bit allocation |
| Other | Uncertainty factors | |
The basic methods of MPEG video image compression technology can be summed up in two points:
① In the spatial direction, image data is compressed with the JPEG (Joint Photographic Experts Group) algorithm to remove spatial redundancy.
② In the time direction, image data is compressed with motion compensation algorithms to remove temporal redundancy.
To obtain a high compression ratio without substantially reducing image quality, the MPEG expert group defined three kinds of pictures: the intra-frame image I (intra), the predicted image P (predicted) and the bidirectionally predicted image B (bidirectionally interpolated). These three kinds of images are compressed with three different algorithms.
The three image types defined by the MPEG expert group
Note:
In video compression each frame represents a still image, and in practice a variety of algorithms are used to reduce the amount of data; the IPB scheme is the most common. Simply put, an I-frame is a keyframe and uses intra-frame compression only, much like compressing a single AVI still. P stands for forward prediction and B for bidirectional prediction; both compress their data relative to I-frames. An I-frame keeps the complete picture, so this one frame's data is enough to decode it. A P-frame records only the difference between this frame and a preceding keyframe (or P-frame); to decode it, the differences it defines are overlaid on the previously cached picture to produce the final image. In other words, a P-frame is a difference frame: it carries no complete picture, only the picture difference from the previous frame. A B-frame is a bidirectional difference frame: it records the differences between this frame and both the preceding and the following frame (this is more complex; there are four cases). To decode a B-frame, the decoder must not only obtain the previously cached picture but also decode the following picture, then combine both with this frame's data to obtain the final image. B-frames give the highest compression ratio, but they make decoding more CPU-intensive.
From the explanation above, the decoding of I- and P-frames is relatively simple and consumes relatively few resources: an I-frame decodes on its own, and a P-frame only requires the decoder to cache the previous picture and apply it when the P-frame arrives. If a video stream contains only I- and P-frames, the decoder can ignore the data that follows and decode linearly as it reads. Many movies on the network, however, use B-frames, because a B-frame records inter-frame differences and saves more space than a P-frame. The file is smaller, but the decoder's job is harder: when decoding it must not only use the cached previous picture but also know the next I- or P-picture, that is, read ahead. A B-frame cannot simply be thrown away, because it does contain picture information; discarding it and simply repeating the previous picture makes the playback stutter (in effect, dropped frames). Since network movies often use many B-frames to save space, a player that does not handle B-frames well will stutter more and more as their number grows. Roughly speaking, the compression ratio of I-frames is about 7 (similar to JPEG), of P-frames about 20, and of B-frames up to 50. Using B-frames saves a great deal of space, and the space saved can be spent on more I-frames, so that at the same bit rate the picture quality is better.
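The read-ahead requirement means B-frames are not transmitted in display order: the encoder emits the following anchor (I or P) before the B-frames that depend on it. A minimal sketch of that reordering (the function name and the `I0`/`B1`-style labels are illustrative, not from any standard):

```python
# Hedged sketch: convert a display-order frame sequence into decode
# (bitstream) order. Each B-frame needs BOTH surrounding anchors, so the
# next anchor must be emitted before the B-frames that refer to it.

def decode_order(display):
    out, pending_b = [], []
    for f in display:
        if f.startswith('B'):
            pending_b.append(f)    # hold Bs until their next anchor arrives
        else:
            out.append(f)          # emit the anchor (I or P) first
            out.extend(pending_b)  # then the Bs that depend on it
            pending_b = []
    return out + pending_b

disp = ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
print(decode_order(disp))  # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

This is why a B-frame-capable decoder must buffer at least two reference pictures.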
2.1. Compression coding algorithm for intra-frame image I
If the TV image is represented in RGB space, it is first converted to the YCrCb space. Each image plane is divided into 8×8 blocks, and the discrete cosine transform (DCT) is applied to each block. After the DCT, the coefficients are quantized and then encoded with lossless compression techniques: the quantized DC coefficient is coded with differential pulse code modulation (DPCM), while the quantized AC coefficients are scanned in zig-zag order and coded with run-length encoding (RLE), followed by Huffman or arithmetic coding.
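The zig-zag scan and run-length step can be sketched as follows (an illustrative fragment, not a full codec; the function names are ours, and real MPEG/JPEG entropy coding pairs these (run, level) symbols with Huffman codes):

```python
# Illustrative sketch: zig-zag scan of an 8x8 block of quantized DCT
# coefficients, then (run, level) coding of the AC terms.

def zigzag_indices(n=8):
    # Walk the anti-diagonals, alternating direction, as in the JPEG/MPEG scan.
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def rle_ac(block):
    scan = [block[i][j] for i, j in zigzag_indices(len(block))]
    dc, ac = scan[0], scan[1:]
    pairs, run = [], 0
    for c in ac:
        if c == 0:
            run += 1               # count zeros between nonzero coefficients
        else:
            pairs.append((run, c)) # (zero-run, level) symbol
            run = 0
    pairs.append('EOB')            # end-of-block replaces the trailing zeros
    return dc, pairs
```

Because quantization pushes most high-frequency AC coefficients to zero, the zig-zag order groups those zeros into long runs, which is what makes RLE effective here.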
2.2. Compression coding algorithm for predicted images P
The encoding of predicted images also uses the image macroblock (macroblock) as the basic coding unit; a macroblock is defined as a block of i×j pixels, generally 16×16. The predicted image P is represented by two kinds of parameters: one is the difference between the macroblock of the image currently being encoded and the matching macroblock of the reference image, the other is the motion vector of the macroblock. Assuming the encoded image macroblock MPi has the best matching block MRj in the reference image, their difference is the difference between the corresponding pixel values in the two macroblocks. The difference is converted to the color space, the Y, Cr and Cb component values are obtained by 4:1:1 subsampling, and the difference is then encoded with the JPEG compression algorithm; the computed motion vectors are Huffman-coded as well.
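Finding the best matching block is typically done by minimizing a distortion measure over a search window. A minimal sketch of exhaustive block matching with the sum of absolute differences (SAD) as that measure (the function names, the ±7-pixel search range, and the frame layout are our assumptions; the MPEG standard does not mandate a particular search method):

```python
# Hedged sketch of exhaustive block matching: find the motion vector that
# minimizes the sum of absolute differences (SAD) between a 16x16 macroblock
# of the current frame and candidate blocks in the reference frame.

def sad(cur, ref, x, y, rx, ry, n=16):
    return sum(abs(cur[y + i][x + j] - ref[ry + i][rx + j])
               for i in range(n) for j in range(n))

def best_match(cur, ref, x, y, search=7, n=16):
    """Return ((dx, dy), residual SAD) for the macroblock at (x, y)."""
    h, w = len(ref), len(ref[0])
    best = (None, float('inf'))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:   # stay inside the frame
                d = sad(cur, ref, x, y, rx, ry, n)
                if d < best[1]:
                    best = ((dx, dy), d)
    return best
```

The residual (the difference block at the winning offset) is what then goes through the DCT-based coding path, while the vector itself is entropy-coded.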
2.3. Compression coding algorithm for bidirectionally predicted images B
2.4. Image structure
The MPEG encoding algorithm allows the frequency and position of I-images to be chosen. The frequency of I-images is the number of I-images appearing per second, and the position is where the frame lies in the time direction. In general, the frequency of I-images is 2 per second. The MPEG encoder also allows the number of B-images between a pair of I-images or P-images to be chosen. The choice of the numbers of I-, P- and B-images is based mainly on the program content. For example, for fast-moving material a higher I-image frequency and fewer B-images can be chosen, while for slow-moving material the I-image frequency can be a little lower and the number of B-images a little higher. In practical applications, the bit rate of the medium must also be considered.
A typical I, P, B image arrangement uses the encoding parameters: the distance between I-images is N = 15, and the distance between predicted images (P) is M = 3.
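The frame-type pattern for these parameters can be generated with a small sketch (the function name is ours; we assume N counts frames per GOP and M the anchor spacing, as in the text):

```python
# Sketch: build the display-order frame-type pattern for a GOP with
# N = 15 (I-to-I distance) and M = 3 (anchor-to-anchor distance).

def gop_pattern(n=15, m=3):
    types = []
    for k in range(n):
        if k == 0:
            types.append('I')       # GOP starts with an intra frame
        elif k % m == 0:
            types.append('P')       # every M-th frame is an anchor
        else:
            types.append('B')       # the frames in between are bidirectional
    return ''.join(types)

print(gop_pattern())  # IBBPBBPBBPBBPBB
```

With N = 15 and M = 3 each GOP thus contains 1 I, 4 P and 10 B frames.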
MPEG TV Frame Arrangement
The typical compressed sizes of I, P and B images are shown in the table below, in bits. As the table shows, I-frame images carry the largest amount of data and B-frame images the smallest.
Typical compressed sizes of the three MPEG image types (bits)

| Image type | I | P | B | Average data/frame |
|------------|---|---|---|--------------------|
| MPEG-1 CIF format (1.15 Mbit/s) | 150 000 | 50 000 | 20 000 | 38 000 |
| MPEG-2 601 format (4.00 Mbit/s) | 400 000 | 200 000 | 80 000 | 130 000 |
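The table's figures can be cross-checked against the GOP structure described earlier (a sketch under our assumptions: an N = 15, M = 3 GOP with 1 I, 4 P and 10 B frames, and a 30 frames/s rate; the averages land close to, though not exactly on, the rounded table values):

```python
# Cross-check: weighted mean frame size for an N=15, M=3 GOP, and the
# resulting bit rate at 30 frames/s, compared with the table's figures.

def avg_per_frame(i_bits, p_bits, b_bits, n=15, m=3):
    n_p = n // m - 1          # 4 P-frames per GOP
    n_b = n - 1 - n_p         # 10 B-frames per GOP
    return (i_bits + n_p * p_bits + n_b * b_bits) / n

mpeg1 = avg_per_frame(150_000, 50_000, 20_000)    # ~36 667 bits/frame
mpeg2 = avg_per_frame(400_000, 200_000, 80_000)   # ~133 333 bits/frame
print(round(mpeg1 * 30 / 1e6, 2), round(mpeg2 * 30 / 1e6, 2))  # 1.1 4.0
```

The results (about 1.1 Mbit/s and 4.0 Mbit/s) agree with the rates quoted in the table's first column.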
Streaming Media (6): MPEG TV