Video Codec Study Notes, Part 1: Theoretical Basis

Last Update:2015-12-29 Source: Internet

Author: User

Ext.: http://www.cnblogs.com/xkfz007/archive/2012/08/12/2613690.html

Chapter 1 Introduction

1. Why video compression?

Uncompressed digital video contains a huge amount of data.
Storage difficulties
- A DVD can hold only a few seconds of uncompressed digital video.
Transmission difficulties
- Over a 1 Mbit/s link, transmitting one second of digital TV video takes about 4 minutes.
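
To get a feel for the numbers, the sketch below computes the raw bit rate of uncompressed video and the time needed to send one second of it over a link. The 720x576, 25 frames/s, 12 bits per pixel format and the 1 Mbit/s link rate are assumptions chosen for illustration; the article does not say which format it had in mind, so the result only indicates the order of magnitude.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel):
    """Bit rate of uncompressed video in bits per second."""
    return width * height * fps * bits_per_pixel


def transfer_seconds(total_bits, link_bps):
    """Time needed to push total_bits through a link running at link_bps."""
    return total_bits / link_bps


# Assumed format: 720x576 luma, 25 frames/s, 4:2:0 sampling with 8-bit
# samples, which averages 12 bits per pixel.
rate = raw_bitrate_bps(720, 576, 25, 12)
seconds = transfer_seconds(rate, 1_000_000)  # assumed 1 Mbit/s link

print(f"raw bit rate: {rate / 1e6:.1f} Mbit/s")
print(f"time to send one second of video: {seconds / 60:.1f} minutes")
```

Changing the assumed sampling format or bit depth shifts the figure, but one second of video still takes minutes rather than seconds to send at this link rate.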

2. Why can video be compressed?

By removing redundant information:
- Spatial redundancy: neighboring pixels within an image are strongly correlated
- Temporal redundancy: neighboring images in a video sequence have similar content
- Coding redundancy: different pixel values occur with different probabilities
- Visual redundancy: the human visual system is insensitive to certain details
- Knowledge redundancy: regular structure can be inferred from prior knowledge and background knowledge

3. Data Compression Classification

Lossless compression
- The decompressed image is identical to the original: x = x'
- Low compression ratio (2:1 to 3:1)
- Examples: WinZip, JPEG-LS
Lossy compression
- The decompressed image differs from the original: x ≠ x'
- High compression ratio (10:1 to 20:1)
- Exploits the characteristics of the human visual system
- Examples: MPEG-2, H.264/AVC, AVS
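
The difference between x = x' and x ≠ x' can be made concrete with a toy sketch. The lossless half uses Python's standard `zlib` module; the lossy half is a deliberately crude quantizer that drops the low bits of each sample. The quantizer is only an illustrative assumption and is not how MPEG-2, H.264/AVC, or AVS work.

```python
import zlib

# A small, repetitive "image" row repeated many times (hypothetical data).
data = bytes([10, 10, 11, 12, 12, 12, 13, 200] * 64)

# Lossless: the decompressed bytes equal the original exactly (x == x').
packed = zlib.compress(data)
restored = zlib.decompress(packed)


def quantize(samples, shift=4):
    """Toy lossy step: discard the `shift` least significant bits."""
    return bytes((s >> shift) << shift for s in samples)


# Lossy: the reconstruction only approximates the original (x != x').
approx = quantize(data)
max_error = max(abs(a - b) for a, b in zip(data, approx))

print(len(data), len(packed), restored == data)
print(approx == data, max_error)
```

The lossy version throws information away for good, which is why real lossy codecs choose what to discard according to the human visual system.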

4. Codec

Encoder
- A device or program that compresses a signal.
Decoder
- A device or program that decompresses a signal.
Codec
- An encoder/decoder pair.

5. Composition of the compression system

(1) Key technologies in the encoder

(2) Key technologies in coding and decoding

6. Codec Implementation

Implementation platforms for codecs:
- Very-large-scale integration (VLSI)
  - ASIC, FPGA
- Digital signal processor (DSP)
- Software
Codec products:
- Set-top box (STB)
- Digital TV
- Camera
- Surveillance equipment

7. Video Coding Standard

Purpose of coding standards:

Compatibility:
- Bitstreams produced by one manufacturer's encoder can be decoded by another manufacturer's decoder.
Efficiency:
- Standard codecs can be mass-produced, which lowers cost.

The mainstream video coding standard:

MPEG-2
MPEG-4 Simple Profile
H.264/AVC
AVS
VC-1

Standardization Organization:

ITU: International Telecommunication Union
- VCEG: Video Coding Experts Group
ISO: International Organization for Standardization
- MPEG: Moving Picture Experts Group

8. Video Transmission

Video transmission: the compressed video bitstream is carried from the encoder to the decoder over a transmission system.
Transmission systems: Internet, terrestrial wireless, satellite

9. Problems with video transmission

The transmission system is unreliable
- Bandwidth limit
- Signal attenuation
- Noise interference
- Transmission delay
Resulting problems for the video
- Unable to decode the correct video
- Video playback delay

10. Video Transmission error control

Error control addresses the problems caused by data loss or delay during video transmission.
Error control techniques:
- Error control through channel coding
- Error resilience at the encoder
- Error concealment at the decoder

11. QoS parameters for video transmission

End-to-end delay of a packet
Bandwidth: bits per second
Packet loss rate
Delay jitter: the variation in packet delay
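
As a rough sketch, the four parameters can be estimated from send and receive timestamps. The function name, the trace format, and the jitter definition (mean absolute difference between consecutive packet delays) are all assumptions made for this example; real measurement tools may define jitter differently.

```python
def qos_summary(sent, received, payload_bits):
    """Estimate QoS figures from a packet trace.

    sent / received: dicts mapping sequence number -> timestamp in seconds.
    Packets missing from `received` are counted as lost.
    """
    delays = [received[s] - sent[s] for s in sent if s in received]
    loss_rate = 1 - len(delays) / len(sent)
    mean_delay = sum(delays) / len(delays)
    # Assumed jitter definition: mean absolute change between successive delays.
    jitter = sum(abs(a - b) for a, b in zip(delays[1:], delays)) / (len(delays) - 1)
    duration = max(received.values()) - min(sent.values())
    throughput_bps = len(delays) * payload_bits / duration
    return mean_delay, throughput_bps, loss_rate, jitter


# Hypothetical trace: four packets sent 20 ms apart, packet 2 is lost.
sent = {0: 0.00, 1: 0.02, 2: 0.04, 3: 0.06}
received = {0: 0.05, 1: 0.08, 3: 0.10}
mean_delay, throughput_bps, loss_rate, jitter = qos_summary(sent, received, 8000)
print(mean_delay, throughput_bps, loss_rate, jitter)
```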

Chapter 2 Digital Video

1. Image and Video

Image: a physical reproduction of human visual perception.
A three-dimensional natural scene carries depth, texture, and luminance information.
A two-dimensional image carries texture and luminance information.

Video: a sequence of consecutive images.
A video consists of many images and captures the motion of objects; it is also called a moving picture.

2. Digital video

Digital video: a representation of a natural scene that is sampled digitally in space and time.
- Spatial sampling
  - Resolution
- Temporal sampling
  - Frame rate: frames per second

3. Spatial sampling

Spatial sampling of two-dimensional digital video image

4. Digital Video System

Acquisition
- Still camera, video camera
Processing
- Codecs, transmission equipment
Display
- Monitor

5. Human visual system (HVS)

Components of the HVS:
- Eyes
- Nerves
- Brain

HVS characteristics:
- Insensitive to high-frequency information
- More sensitive to high contrast
- More sensitive to luminance information than to chroma information
- More sensitive to motion information

6. Digital video system design should take the HVS characteristics into account:

Discard high-frequency information and encode only low-frequency information
Improve the subjective quality of edge information
Reduce the resolution of the chroma components
Give special treatment to regions of interest (ROI)

7. RGB color Space

Three primary Colors: Red (R), Green (G), Blue (B).
Any color can be produced by mixing three primary colors in a certain proportion.
RGB color space
- Composed of the three components R, G, and B
- Widely used in BMP, TIFF, PPM, and similar formats
- Each color component is usually represented with 8 bits, range [0, 255]

8. YUV Color Space

YUV Color space:
- Y: Luminance Component
- UV: two chroma components
- YUV better reflects the HVS characteristics

9. Convert RGB to YUV space

The luminance component Y is a weighted sum of the three primary colors:

$Y = k_r R + k_g G + k_b B$, with $k_r + k_g + k_b = 1$

Based on extensive experiments, ITU-R Recommendation BT.601 gives $k_r = 0.299$, $k_g = 0.587$, and $k_b = 0.114$.

Mainstream codec standards operate on YUV images.
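
A minimal sketch of the conversion, assuming the ITU-R BT.601 luma weights 0.299, 0.587, and 0.114 and chroma components scaled so that each spans half the luma range on either side of zero. The function names are made up for this example, and a real pipeline would also add offsets and round to integers.

```python
# Assumed BT.601 luma weights; they sum to 1.
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycbcr(r, g, b):
    """Convert one RGB triple to luma plus two colour-difference values."""
    y = KR * r + KG * g + KB * b
    cb = 0.5 * (b - y) / (1 - KB)  # scaled blue difference
    cr = 0.5 * (r - y) / (1 - KR)  # scaled red difference
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr."""
    r = y + 2 * (1 - KR) * cr
    b = y + 2 * (1 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


y, cb, cr = rgb_to_ycbcr(200, 30, 90)
print(y, cb, cr)
print(ycbcr_to_rgb(y, cb, cr))
```

A gray pixel (R = G = B) yields zero chroma, which reflects how the luminance and color information are separated.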

10. YUV image component sampling

Because the HVS is less sensitive to chroma, a YUV image can reduce the amount of video data by subsampling the chroma components.
YUV images typically come in the following formats, depending on the sampling ratio of the luminance and chroma components:
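
The list of formats is missing from this copy. The sketch below assumes the three ratios most commonly meant in this context, 4:4:4, 4:2:2, and 4:2:0, and shows the chroma plane size and average bits per pixel for each. The helper names and the 2x2 averaging filter are illustrative assumptions.

```python
# (horizontal divisor, vertical divisor) applied to each chroma plane.
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def plane_sizes(width, height, fmt):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one frame."""
    dx, dy = SUBSAMPLING[fmt]
    return (width, height), (width // dx, height // dy)


def bits_per_pixel(fmt, bit_depth=8):
    """Average bits per luma pixel, counting Y plus the two chroma planes."""
    dx, dy = SUBSAMPLING[fmt]
    return bit_depth * (1 + 2 / (dx * dy))


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (list of rows, even dimensions)."""
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]


for fmt in SUBSAMPLING:
    print(fmt, plane_sizes(352, 288, fmt), bits_per_pixel(fmt))
```

Under these assumptions, 4:2:0 needs half the data of 4:4:4 while leaving the luminance plane untouched.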

11. Common YUV Image format

Image formats are defined by the luminance resolution of the YUV image.

12. Frame and field images

A frame image consists of two fields: the top field and the bottom field.

13. Progressive and interlaced images

Progressive image: the two fields of a frame are captured at the same time, $T_{top} = T_{bot}$.
Interlaced image: the two fields of a frame are captured at different times, $T_{top} \ne T_{bot}$.
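
The relationship between a frame and its two fields can be sketched as row interleaving. This example assumes the common convention that the top field holds the even-numbered rows (starting at row 0) and the bottom field the odd-numbered rows, and that the frame has an even number of rows.

```python
def split_fields(frame):
    """Split a frame (list of rows) into (top_field, bottom_field)."""
    return frame[0::2], frame[1::2]


def weave(top, bottom):
    """Interleave two fields back into a full frame."""
    frame = []
    for top_row, bottom_row in zip(top, bottom):
        frame.extend([top_row, bottom_row])
    return frame


frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
top, bottom = split_fields(frame)
print(top, bottom)
print(weave(top, bottom) == frame)
```

For an interlaced source the two fields come from different instants, so weaving them back together shows comb artifacts wherever there is motion.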

14. Video Quality evaluation

Lossy video compression makes the decoded image differ from the original, so a way of evaluating the quality of the decoded image is needed.
Quality evaluation:
- Objective quality evaluation
- Subjective quality evaluation
- Vision-based objective quality evaluation
Objective quality evaluation: measures image quality with mathematical methods.
Advantages:
- Quantifiable
- Reproducible measurement results
- Simple to measure
Disadvantages:
- Does not fully match human subjective perception

15. Methods of objective evaluation

Common Objective evaluation methods:
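
The list of methods is missing from this copy. Two widely used objective measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR), so the sketch below assumes these are among the methods meant. Images are flattened to plain lists of sample values, and `peak` is the maximum sample value (255 for 8-bit video).

```python
import math


def mse(original, decoded):
    """Mean squared error between two equally long sample sequences."""
    diffs = [(a - b) ** 2 for a, b in zip(original, decoded)]
    return sum(diffs) / len(diffs)


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


original = [52, 55, 61, 66, 70, 61, 64, 73]
decoded = [53, 55, 60, 67, 70, 62, 64, 72]
print(mse(original, decoded), psnr(original, decoded))
```

A higher PSNR means the decoded image is numerically closer to the original, though it does not always track perceived quality.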

16. Subjective Evaluation method

Subjective quality evaluation: measures quality directly through the subjective perception of human viewers.
Advantages:
- Matches human subjective perception
Disadvantages:
- Hard to quantify
- Results are not reproducible because of uncertain factors
- High measurement costs

Common Subjective evaluation methods

17. Vision-based objective evaluation of video quality

Vision-based objective quality evaluation: describes human visual characteristics mathematically and applies the model to video quality evaluation.
It combines subjective and objective quality evaluation.
Common method: structural similarity (SSIM).
The characteristics of the HVS are expressed as a mathematical model.
This is an important direction for future research.
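
As a simplified sketch of the structural similarity idea, the function below evaluates the SSIM formula once over a whole signal. Practical SSIM implementations compute it over small local windows and average the results; the single-window form, the function name, and the constants 0.01 and 0.03 are assumptions based on the commonly published formulation.

```python
def ssim_global(x, y, peak=255):
    """Single-window SSIM between two equally long sample sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants that stabilise the division (assumed standard values).
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10, 20, 30, 40]
distorted = [12, 18, 33, 39]
print(ssim_global(reference, reference), ssim_global(reference, distorted))
```

The score is 1 for identical signals and falls below 1 as mean, contrast, or structure diverge.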

Chapter 3 Fundamentals of Information Theory

1. Composition of communication systems

Source: generates messages
Channel: transmits messages
Sink: receives messages

2. Basic Concepts

In communication, information is expressed at three levels: signal, message, and information.
- Signal: the physical-layer expression of information. It can be measured, described, and displayed. Examples: electrical signals, optical signals.
- Message: the carrier of information, expressed in forms that people can recognize, such as text, speech, and images.
- Information: the content of a message that is uncertain to the receiver beforehand.

3. Information

Characteristics of information

Measurement of information

Self-Information

Conditional information

4. Information entropy
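
The formulas for this part are missing from this copy. A minimal sketch, assuming the standard definitions: the self-information of an event with probability p is -log2(p) bits, and the entropy H(X) is the probability-weighted average of the self-information over all symbols.

```python
import math


def self_information(p):
    """Information in bits gained by observing an event of probability p."""
    return -math.log2(p)


def entropy(probs):
    """Entropy H(X) in bits; zero-probability symbols contribute nothing."""
    return sum(p * self_information(p) for p in probs if p > 0)


print(self_information(0.5))
print(entropy([0.5, 0.5]))
print(entropy([0.5, 0.25, 0.125, 0.125]))
```

Rare events carry more information than common ones, and a source whose symbols are equally likely has the highest entropy for its alphabet size.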

5. Conditional Entropy and joint entropy

6. The nature of entropy

Non-negativity: source entropy is non-negative, i.e. $H(X) \ge 0$.
Expansibility: if a source X has M symbols and one of them occurs with probability zero, its entropy equals that of the source formed by the remaining M-1 symbols.
Extremum (maximum entropy): for a source with M symbols, the entropy reaches its maximum only when all symbols are equally probable, i.e. $H(X) \le \log_2 M$.
Additivity: $H(XY) = H(X) + H(Y|X)$.
Conditioning does not increase entropy: conditional entropy is no greater than entropy, $H(X|Y) \le H(X)$.
Joint entropy is no greater than the sum of the individual entropies, i.e. $H(XY) \le H(X) + H(Y)$.

7. Mutual information

8. Mutual information

Physical meaning: $H(X)$ is the information contained in X, and $H(X|Y)$ is the information X still provides once Y is known. Their difference is the amount of information about X that is obtained from Y, i.e. how much knowing Y reduces the uncertainty of X.
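
The relationships between the entropies can be checked numerically. The sketch below assumes the standard identities H(X|Y) = H(XY) - H(Y) and I(X;Y) = H(X) - H(X|Y), and takes a joint probability table as input; the function names are made up for this example.

```python
import math


def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropies(joint):
    """joint[i][j] = P(X = x_i, Y = y_j).

    Returns (H(X), H(Y), H(XY), H(X|Y), I(X;Y)).
    """
    px = [sum(row) for row in joint]        # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]  # marginal distribution of Y
    hx, hy = H(px), H(py)
    hxy = H([p for row in joint for p in row])
    hx_given_y = hxy - hy                   # chain rule
    mutual = hx - hx_given_y
    return hx, hy, hxy, hx_given_y, mutual


dependent = [[0.5, 0.0], [0.0, 0.5]]        # Y determines X completely
independent = [[0.25, 0.25], [0.25, 0.25]]  # Y says nothing about X
print(entropies(dependent))
print(entropies(independent))
```

In the first table, knowing Y removes all uncertainty about X, so the mutual information equals H(X); in the second it is zero.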

9. Relationships among the entropies

11. Source coding

Source coding: converts message symbols into symbols that the channel can transmit.
Two basic problems:
- Transmit the source messages with as few channel symbols as possible, to improve transmission efficiency;
- Reduce the distortion caused by using fewer channel symbols.

12. Discrete source statistical characteristics

13. Discrete source types: simple memoryless sources and Markov sources

14. Coding classification

Fixed-length code: if all codewords $c_m$ (m = 1, 2, ..., M) in a codeword set C have the same length, C is called a fixed-length code.
Variable-length code: if the codewords $c_m$ (m = 1, 2, ..., M) in C do not all have the same length, C is called a variable-length code.

15. Average code length

16. Comparison of fixed-length and variable-length codes

A fixed-length code maps every source output symbol sequence, whatever its probability, to an output codeword of the same length, and so makes no use of the statistical characteristics of the source.
A variable-length code maps source output symbol sequences to codewords of different lengths according to their probabilities, and so exploits the statistical characteristics of the source. It is also called entropy coding.

17. Huffman coding

Huffman coding: a typical variable-length code.
Steps:
- Arrange the source symbols in order of decreasing probability, so that $p(x_1) \ge p(x_2) \ge \ldots \ge p(x_n)$.
- Assign the code bits "0" and "1" to the two least probable symbols, $x_{n-1}$ and $x_n$, and merge them into a new symbol whose probability is the sum of the two. The result is a new source with only n-1 symbols, called the first reduced source and denoted S1.
- Arrange the symbols of the reduced source S1 in order of decreasing probability again and repeat step 2 to obtain a reduced source S2 with only n-2 symbols.
- Repeat until the reduced source has only two symbols left, whose probabilities must sum to 1. Then, starting from the last reduced source, trace the coding path back through each level to obtain the codeword for every source symbol.
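
The steps above can be sketched with a priority queue in place of repeated sorting; the merging order is the same. Each heap entry carries the partial codewords of the symbols it contains, so prepending a bit at every merge has the same effect as tracing the path back at the end. Which of the two merged groups receives "0" is a free choice, so other valid Huffman codes exist for the same source. The function names and the example probabilities are assumptions for illustration.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """probabilities: dict symbol -> probability. Returns dict symbol -> bit string."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs one bit
        return {sym: "0" for sym in probabilities}
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)  # least probable group
        p1, _, codes1 = heapq.heappop(heap)  # second least probable group
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def average_length(probabilities, code):
    """Average codeword length in bits per symbol."""
    return sum(p * len(code[s]) for s, p in probabilities.items())


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
print(code)
print(average_length(probs, code))
```

For this source the average code length equals the source entropy of 1.75 bits, because every probability is a power of 1/2; in general the Huffman average length is at least the entropy.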

18. Channel coding

Channel coding is mainly concerned with making the signal more robust to interference, so that transmission is more reliable and more efficient.
In general, redundant coding gives the code itself the ability to detect and correct errors, so that the error probability of channel transmission stays within the allowed range.
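
One of the simplest redundant codes is a repetition code with majority-vote decoding. It is used here only as an illustration and is not a scheme the article names; practical systems use far more efficient codes.

```python
def encode_repetition(bits, n=3):
    """Repeat every bit n times."""
    return [b for b in bits for _ in range(n)]


def decode_repetition(received, n=3):
    """Majority vote over each group of n received bits."""
    return [
        1 if sum(received[i:i + n]) > n // 2 else 0
        for i in range(0, len(received), n)
    ]


message = [1, 0, 1]
sent = encode_repetition(message)
corrupted = list(sent)
corrupted[1] ^= 1  # one bit error in the first group
corrupted[4] ^= 1  # one bit error in the second group
print(sent, corrupted, decode_repetition(corrupted))
```

With n = 3 the decoder corrects any single error per group, at the cost of sending three times as many bits.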

19. Channel Type

By whether the channel is continuous:
- Discrete channel
- Continuous channel
- Semi-continuous channel
By whether the channel has interference:
- Interference-free channel
- Channel with interference
By the statistical characteristics of the channel:
- Memoryless channel
- Channel with memory
- Constant-parameter channel
- Variable-parameter channel
- Symmetric channel
- Asymmetric channel

20. Channel capacity

In information theory, the channel capacity is the maximum rate at which information can be transmitted without error.
Shannon channel capacity formula:
- For a continuous channel with additive white Gaussian noise of power N, bandwidth B, and signal power S, the channel capacity is $C = B \log_2\left(1 + \frac{S}{N}\right)$ bit/s.
- Because the noise power N depends on the channel bandwidth B, with $N = n_0 B$ where $n_0$ is the noise power spectral density, the Shannon formula can also be written as $C = B \log_2\left(1 + \frac{S}{n_0 B}\right)$.

21. The meaning of Shannon's channel capacity formula

Given B and S/N, the limiting transmission rate of the channel is C, and in theory transmission at this rate can be error-free. If the actual transmission rate exceeds C, error-free transmission is theoretically impossible. The actual rate therefore cannot exceed the channel capacity C unless some errors are tolerated.
Increasing the signal-to-noise ratio S/N (by reducing $n_0$ or increasing S) increases the channel capacity C. In particular, if $n_0 \to 0$ then $C \to \infty$: an interference-free channel has infinite capacity.
Increasing the channel bandwidth B also increases C, but not without bound. With S and $n_0$ fixed, $\lim_{B \to \infty} C = \lim_{B \to \infty} B \log_2\left(1 + \frac{S}{n_0 B}\right) = \frac{S}{n_0} \log_2 e \approx 1.44 \frac{S}{n_0}$.
A given channel capacity can be maintained by trading B against S/N, that is, bandwidth and signal-to-noise ratio can be exchanged for one another.
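
The behavior described above can be checked numerically with the Shannon formula C = B log2(1 + S/(n0 B)). The signal power and noise density used here are arbitrary illustrative values, and the function names are made up for this sketch.

```python
import math


def capacity_bps(bandwidth_hz, signal_w, noise_density_w_per_hz):
    """Shannon capacity of an AWGN channel in bit/s."""
    noise_w = noise_density_w_per_hz * bandwidth_hz  # N = n0 * B
    return bandwidth_hz * math.log2(1 + signal_w / noise_w)


def capacity_limit_bps(signal_w, noise_density_w_per_hz):
    """Capacity as the bandwidth grows without bound: (S / n0) * log2(e)."""
    return signal_w / noise_density_w_per_hz * math.log2(math.e)


S, n0 = 1.0, 0.001  # illustrative values, not taken from the article
for bandwidth in (1e3, 1e4, 1e5, 1e6):
    print(bandwidth, capacity_bps(bandwidth, S, n0))
print("limit", capacity_limit_bps(S, n0))
```

The capacity keeps rising with bandwidth but flattens out below the limit of about 1.44 S/n0.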

22. Distortion

Distortion: the source message cannot be recovered exactly after encoding and decoding.
In practical source and channel coding, messages are not always transmitted without distortion.
- Due to limitations of storage and transport resources
- Interference from noise and other factors

23. Rate Distortion Theory

Shannon defined the information rate-distortion function R(D)
- D is the message distortion
- R is the bit rate
Rate-distortion theorem: when a given amount of distortion D is allowed, the information rate of the source output can be compressed down to R(D).

24. Distortion function

Distortion function: let the source symbols be $X = \{x_1, x_2, \ldots, x_n\}$ and the symbols at the receiving end of the channel be $Y = \{y_1, y_2, \ldots, y_n\}$. For each pair $(x_i, y_j)$, specify a nonnegative function $d(x_i, y_j)$, called the single-symbol distortion or distortion function. For a continuous source and continuous channel, $d(x, y)$ is commonly used.
Common distortion functions:
Average degree of distortion:
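
The formulas for these two items are missing from this copy. The sketch below assumes three distortion functions that are commonly used (squared error, absolute error, and Hamming distortion) and takes the average distortion to be the expected value of the distortion function over the joint distribution of source and received symbols. All names and the example channel are illustrative.

```python
def squared_error(x, y):
    return (x - y) ** 2


def absolute_error(x, y):
    return abs(x - y)


def hamming(x, y):
    """0 if the symbol is reproduced correctly, 1 otherwise."""
    return 0 if x == y else 1


def average_distortion(p_x, p_y_given_x, d):
    """Expected distortion: sum over x, y of P(x) * P(y|x) * d(x, y).

    p_x: dict x -> P(x); p_y_given_x: dict x -> dict y -> P(y|x).
    """
    return sum(
        p_x[x] * p_yx * d(x, y)
        for x in p_x
        for y, p_yx in p_y_given_x[x].items()
    )


# Hypothetical example: equiprobable binary source, 10% of symbols flipped.
p_x = {0: 0.5, 1: 0.5}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
print(average_distortion(p_x, p_y_given_x, hamming))
```

With Hamming distortion the average distortion is simply the probability that a symbol is received incorrectly.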

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
