Codec study notes (3): MPEG-1 and MPEG-2 in the MPEG series

Source: Internet
Author: User

MPEG is short for Moving Picture Experts Group. The name originally referred to the group that studies video and audio coding standards; today "MPEG" usually refers to the family of video and audio coding standards the group has developed. The group was formed in 1988 and has so far developed MPEG-1, MPEG-2, MPEG-3, MPEG-4, MPEG-7, and other standards; MPEG-21 is under development.

MPEG has developed, or is developing, the following video-related standards:

  • MPEG-1: the first official video and audio compression standard, later adopted by Video CD. Its audio Layer 3 (MPEG-1 Layer 3), known as MP3, became a popular audio compression format.
  • MPEG-2: broadcast-quality video, audio, and transport protocols. It is used in digital TV (ATSC, DVB, ISDB), digital satellite TV (such as DirecTV), digital cable TV, and DVD-Video.
  • MPEG-3: originally designed for high-definition TV (HDTV). MPEG-2 turned out to be sufficient for HDTV, so development of MPEG-3 was halted.
  • MPEG-4: extends MPEG-1 and MPEG-2 with support for video/audio "objects", low-bitrate coding, and digital rights management. Part 10, released in 2003 jointly by ISO/IEC and ITU-T, is known as H.264/MPEG-4 Part 10. See H.264.
  • MPEG-7: not a video compression standard but a standard for describing multimedia content.
  • MPEG-21: a standard under development whose goal is to provide a complete platform for future multimedia applications.

The figure (not reproduced here) shows the media codecs used in MPEG-1, MPEG-2, and MPEG-4.

A note on the names in the figure: everyone knows what DVD is, but what is DVB?

DVB (Digital Video Broadcasting) is a family of internationally recognized open standards for digital TV, maintained by the DVB Project. The DVB system provides the following transmission modes:

  • Satellite Television (DVB-S and DVB-S2)
  • Cable TV (DVB-C)
  • Terrestrial TV (DVB-T)
  • Handheld terrestrial TV (DVB-H)

These standards define the physical and data link layers of the transmission system. Devices interact with the physical layer through a synchronous parallel interface (SPI), a synchronous serial interface (SSI), or an asynchronous serial interface (ASI). Data is transmitted as an MPEG-2 transport stream, subject to some stricter constraints (DVB-MPEG). The standard for real-time compressed data transmission to mobile terminals (DVB-H) is currently being tested.

The main difference between these transmission modes is the modulation scheme, because different applications have different frequency and bandwidth requirements. DVB-S, which uses high-frequency carriers, uses QPSK modulation; DVB-C, which uses low-frequency carriers, uses QAM (64-QAM); and DVB-T, which uses VHF and UHF carriers, uses COFDM.

In addition to audio and video transmission, DVB also defines data communication standards (DVB-DATA) with a return channel (DVB-RC).

DVB codecs: video: MPEG-2, MPEG-4 AVC; audio: MP3, AC-3, AAC, HE-AAC.

MPEG-1

MPEG-1 was officially published as ISO/IEC 11172.

MPEG-1 is an early video coding standard with relatively poor quality, used mainly for storing video on CD-ROM. The best-known application in China is VCD (Video CD), whose video is encoded with MPEG-1. It is a video and audio compression format designed for the CD medium: a 70-minute CD has a transfer rate of about 1.4 Mbps, and MPEG-1, which uses block-based motion compensation, the discrete cosine transform (DCT), quantization, and other techniques, is optimized for a rate of 1.2 Mbps. MPEG-1 was later adopted as the core technology of Video CD. Its output quality is roughly the same as that of a conventional video cassette recorder (VCR), which may be why Video CD was not successful in developed countries.

MPEG-1 audio is divided into three layers: MPEG-1 Layer I, II, and III. The third, MPEG-1 Layer 3, is known as MP3 and has become a widely used audio compression technology.

MPEG-1 has the following parts:

  • Part 1: Systems;
  • Part 2: Video;
  • Part 3: Audio; defines Layer I, Layer II, and Layer III, which are extended in MPEG-2;
  • Part 4: Conformance testing;
  • Part 5: Reference software.

Disadvantages of MPEG-1:

  • The audio compression system is limited to two channels (stereo).
  • There is no standardized support for interlaced video, which therefore compresses poorly.
  • There is only one standardized "profile" (the constrained parameters bitstream), which is unsuited to higher-resolution video. MPEG-1 can support 4K video, but there is no easy way to encode higher-resolution video or to identify hardware that supports it.
  • Only one color space is supported.

MPEG-2

Introduction to MPEG-2

MPEG-2, officially published as ISO/IEC 13818, is generally used for video and audio coding of broadcast signals, including satellite and cable TV. With minor modifications, MPEG-2 also became the core technology of DVD.

MPEG-2 has 11 parts, as follows:

Part 1: Systems - synchronization and multiplexing of video and audio

The official name is ISO/IEC 13818-1 or ITU-T H.222.0.

The systems part of MPEG-2 (Part 1) defines the transport stream, which carries digital video and audio over unreliable media and is used mainly in broadcast television.

Two different but related container formats are defined: the MPEG transport stream and the MPEG program stream (TS and PS in the figure). The transport stream (TS) carries digital video and audio over lossy media, where the start and end of a media stream may not be marked, as with broadcast or tape; examples include ATSC, DVB, SBTVD, and HDV. The MPEG-2 systems part also defines the program stream (PS), a container format for file-based media such as hard drives, optical discs, and flash memory.

The MPEG-2 PS (program stream) was developed for storing video on storage media, and the MPEG-2 TS (transport stream) for transmitting video over networks. At present, the TS is most widely used in DVB systems. The difference between the two is that TS packets have a fixed length, while PS packets have variable length. This difference in packet structure gives them different resilience to transmission errors, and therefore different application environments.

Because the TS uses fixed-length packets, when a transmission error corrupts the synchronization information of one TS packet, the receiver can look for the synchronization information of the following packets at fixed positions, resynchronize, and avoid further loss. PS packets vary in length, so once the synchronization information of a PS packet is lost, the receiver cannot determine where the next packet starts; synchronization is lost and a large amount of information may be lost with it.

Therefore, the TS is generally used when the channel is poor and the error rate is high, and the PS when the channel is good and the error rate is low. Because the TS is robust against transmission errors, MPEG-2 streams carried over transmission media almost always use the TS packet format.
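The fixed-position resynchronization described above can be sketched in a few lines. The text does not give the packet layout, so this sketch relies on two facts from the MPEG-2 systems specification: a TS packet is 188 bytes long and starts with the sync byte 0x47. The function name `find_sync_offset` and the `confirm` parameter are made up for this illustration; real demultiplexers use more elaborate lock and unlock logic.

```python
TS_PACKET_SIZE = 188  # fixed TS packet length in bytes
SYNC_BYTE = 0x47      # first byte of every TS packet


def find_sync_offset(data: bytes, confirm: int = 3) -> int:
    """Return the first offset at which `confirm` consecutive packet
    starts all carry the sync byte, or -1 if no such offset exists.

    Checking several packets in a row guards against a 0x47 byte that
    merely happens to occur inside a payload.
    """
    # Last offset from which `confirm` packet starts still fit in `data`.
    last_start = len(data) - TS_PACKET_SIZE * (confirm - 1)
    for offset in range(max(last_start, 0)):
        if all(data[offset + i * TS_PACKET_SIZE] == SYNC_BYTE
               for i in range(confirm)):
            return offset
    return -1


# Usage: five bytes of garbage followed by four well-formed (empty) packets.
packet = bytes([SYNC_BYTE]) + bytes(TS_PACKET_SIZE - 1)
stream = b"\x00\x01\x02\x03\x04" + packet * 4
print(find_sync_offset(stream))  # offset where packet alignment is regained
```

A variable-length format such as the PS offers no equivalent fixed stride to probe, which is the weakness the paragraph above points out.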

Part 2: Video - video compression

The official name is ISO/IEC 13818-2 or ITU-T H.262.

It provides a compression codec for both interlaced and progressive video signals.

MPEG-2 Part 2 video is similar to MPEG-1 video, but adds support for interlaced video (interlaced scanning is widely used in broadcast television). MPEG-2 video is not optimized for low bit rates (below 1 Mbit/s); at 3 Mbit/s and above, it is significantly better than MPEG-1. MPEG-2 is backward compatible: every standard-compliant MPEG-2 decoder can also play MPEG-1 video streams.

MPEG-2 is also used in HDTV transmission systems. MPEG-2 is used for DVD-Video, and most HDTV (high-definition TV) today also uses MPEG-2 encoding, at resolutions up to 1920x1080. Because of the popularity of MPEG-2, MPEG-3, originally intended for HDTV, was eventually abandoned.

An MPEG-2 video typically contains multiple GOPs (groups of pictures), and each GOP contains multiple frames. Frames are usually of three types: I-frames, P-frames, and B-frames. I-frames use intra-frame coding, P-frames use forward prediction, and B-frames use bidirectional prediction. The input video is generally 25 (CCIR standard) or 29.97 (FCC) frames per second.

MPEG-2 supports both progressive and interlaced scanning. In progressive mode, the basic coding unit is the frame; in interlaced mode, it can be either the frame or the field.

The original input image is first converted to the YCbCr color space, where Y is the luminance and Cb and Cr are the two chrominance channels (Cb the blue-difference component, Cr the red-difference component). Each channel is first divided into blocks that together form "macroblocks", and the macroblock is the basic unit of encoding. Each macroblock is in turn divided into 8x8 blocks. The number of blocks taken from the chrominance channels depends on the initial parameter settings. For example, in the commonly used 4:2:0 format, each chrominance channel contributes only one block per macroblock, so the three channels together give 4 + 1 + 1 = 6 blocks per macroblock.
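The 4 + 1 + 1 = 6 count can be reproduced with a small calculation. This is only an illustrative sketch: it assumes a 16x16 luma macroblock (four 8x8 blocks) and that the "commonly used format" in the text is 4:2:0; the 4:2:2 and 4:4:4 rows are added for comparison and the function name is hypothetical.

```python
LUMA_BLOCKS = 4  # a 16x16 luma area holds four 8x8 blocks (assumed macroblock size)

# 8x8 chroma blocks per macroblock, for EACH of the two chroma channels.
CHROMA_BLOCKS_PER_CHANNEL = {
    "4:2:0": 1,  # chroma halved horizontally and vertically: one 8x8 block
    "4:2:2": 2,  # chroma halved horizontally only: 8x16 samples
    "4:4:4": 4,  # chroma at full resolution: 16x16 samples
}


def blocks_per_macroblock(chroma_format: str) -> int:
    """Total number of 8x8 blocks (Y + Cb + Cr) in one macroblock."""
    return LUMA_BLOCKS + 2 * CHROMA_BLOCKS_PER_CHANNEL[chroma_format]


for fmt in CHROMA_BLOCKS_PER_CHANNEL:
    print(fmt, blocks_per_macroblock(fmt))
```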

An I-frame enters the encoding process directly as a whole image. For P-frames and B-frames, motion compensation is performed first. Because adjacent frames are usually strongly correlated, a macroblock can generally find a closely matching region at a nearby position in the previous or the following frame. The offset to that region is recorded as the motion vector, and the residual between the macroblock and the region found by motion estimation is passed on to the encoder.
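A minimal sketch of the motion search follows, assuming an exhaustive search over a small window and the sum of absolute differences (SAD) as the matching cost. The standard does not prescribe how an encoder finds its vectors, so both choices, along with the function names and the tiny 4x4 block size, are assumptions made for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def best_motion_vector(cur, ref, cx, cy, n=4, search=2):
    """Exhaustively test every offset within +/- `search` pixels and
    return ((dx, dy), cost) for the best-matching block in `ref`."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Usage: the current frame is the reference shifted by one pixel left and one down,
# so the block at (2, 2) is found at offset (+1, -1) in the reference.
ref = [[x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[y - 1][x + 1] if y >= 1 and x + 1 < 8 else 0 for x in range(8)]
       for y in range(8)]
vector, cost = best_motion_vector(cur, ref, 2, 2)
print(vector, cost)
```

The returned cost is the size of the residual; when it is zero, only the motion vector needs to be coded for that block.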

For each 8x8 block, the discrete cosine transform converts the image from the spatial domain to the frequency domain. The resulting transform coefficients are quantized and reordered to make long runs of zeros more likely, then run-length coded. Finally, the result is Huffman coded.
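The chain of steps above (DCT, quantization, reordering, run-length coding) can be sketched as follows. This is a simplified illustration rather than the MPEG-2 algorithm itself: it uses a naive DCT, a single uniform quantizer step instead of the standard's quantization matrices, the classic zigzag scan as the reordering, and it stops before the final Huffman stage. All function names are hypothetical.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Naive 2-D DCT-II of an 8x8 block (O(N^4), for illustration only)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization (an assumption; MPEG-2 uses weighting matrices)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


def zigzag_order():
    """(row, col) pairs in zigzag order: low frequencies first, so that the
    zeros left by quantization bunch up at the end of the scan."""
    def key(rc):
        diagonal = rc[0] + rc[1]
        return (diagonal, rc[0] if diagonal % 2 else rc[1])
    return sorted(((r, c) for r in range(N) for c in range(N)), key=key)


def run_length(values):
    """Encode as (zero_run, value) pairs, ending with an 'EOB' marker that
    stands for all remaining zeros."""
    pairs, run = [], 0
    for value in values:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


def encode_block(block, step=16):
    """DCT -> quantize -> zigzag scan -> run-length pairs."""
    quantized = quantize(dct_2d(block), step)
    return run_length([quantized[r][c] for r, c in zigzag_order()])


# Usage: a perfectly flat block has only a DC coefficient left after the DCT.
print(encode_block([[128] * N for _ in range(N)]))
```

In a real encoder, the (run, value) pairs would then be mapped to variable-length codes, which is the Huffman step the text mentions.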

I-frame coding reduces spatial redundancy, while P-frames and B-frames reduce temporal redundancy.

A GOP is a sequence of I-frames, P-frames, and B-frames arranged in a fixed pattern. A common structure has 15 frames in the form IBBPBBPBBPBBPBB. The ratio of frame types within a GOP is determined by bandwidth and image quality requirements. For example, because a B-frame may take three times as long to compress as an I-frame, some real-time systems with limited computing power may need to reduce the proportion of B-frames.

The bit stream output by an MPEG-2 encoder can have a constant or a variable bit rate. The maximum bit rate depends on the application; in a DVD application, for example, it is 10.4 Mbit/s. To produce a constant bit rate, the quantization scale must be adjusted continually so that the bit stream stays uniform. However, increasing the quantization scale may cause visible distortion, such as mosaic (blocking) artifacts.
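To make the frame ratio concrete, the short sketch below counts the frame types in the 15-frame pattern quoted above. The helper names are made up for this example.

```python
from collections import Counter


def gop_counts(pattern: str) -> dict:
    """Number of I-, P-, and B-frames in a GOP pattern string."""
    counts = Counter(pattern)
    return {frame_type: counts[frame_type] for frame_type in "IPB"}


def gop_fractions(pattern: str) -> dict:
    """Share of each frame type in the GOP, as a fraction of all frames."""
    return {frame_type: count / len(pattern)
            for frame_type, count in gop_counts(pattern).items()}


pattern = "IBBPBBPBBPBBPBB"
print(gop_counts(pattern))     # frame counts per type
print(gop_fractions(pattern))  # the same counts as fractions of 15
```

Two thirds of the frames in this pattern are B-frames, which is why a system with limited computing power might choose a pattern with fewer of them.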

Part 3: Audio - audio compression

MPEG-2 Part 3 defines the audio compression standard known as MPEG-2 BC (backward compatible), which is backward compatible with MPEG-1 audio. It improves on MPEG-1 audio compression by supporting more than two channels, up to 5.1 multichannel. Because it is backward compatible, an MPEG-1 audio decoder can still decode the two main stereo channels. It also defines additional bit rates and sampling frequencies for MPEG-1 Audio Layer I, II, and III.

For example, MP2 (MPEG-1 Audio Layer II) is covered by both ISO/IEC 11172-3 and ISO/IEC 13818-3: Layer II is defined in ISO/IEC 11172-3, that is, MPEG-1 Part 3, and its extensions are defined in ISO/IEC 13818-3, that is, MPEG-2 Part 3.

Part 4: Conformance testing

Describes the test procedures.

Part 5: Software simulation

Describes the software simulation system.

Part 6: Digital Storage Media Command and Control extensions

Describes DSM-CC (digital storage media commands and controls) extensions.

Part 7: Advanced Audio Coding (AAC)

MPEG-2 Part 7 defines audio compression that is not backward compatible with MPEG-1 audio, also known as MPEG-2 NBC (non-backward compatible). It provides more powerful audio capabilities, and it is what is usually meant by MPEG-2 AAC (Advanced Audio Coding). AAC is more efficient than the earlier MPEG audio standards and in some respects less complex than its predecessor MPEG-1 Layer 3 (MP3), since it does not have MP3's complex hybrid filter bank. It supports 1 to 48 channels, sampling rates from 8 to 96 kHz, and multichannel, multilingual, and multiprogram capabilities. AAC is also described in Part 3 of the MPEG-4 standard.

Part 8:

Withdrawn.

Part 9: Real-time interface extension

Real-time interface extension.

Part 10: DSM-CC conformance extension

DSM-CC conformance extension.

Part 11: IPMP

Intellectual property management and protection (IPMP). The XML IPMP messages are defined in ISO/IEC 23001-3.

The core MPEG-2 technology involves about 640 patents, held mainly by 20 companies and one university.

MPEG-2 audio

MPEG-2 provides new audio coding methods, in Part 3 and Part 7.

Part 3

MPEG-2 BC (backward compatible with MPEG-1 audio formats): low-bit-rate audio coding at half sampling rates (MPEG-1 Layer 1/2/3 LSF) and multichannel coding of up to 5.1 channels.

Part 7

MPEG-2 NBC (non-backward compatible): provides MPEG-2 AAC, which is not backward compatible, with multichannel coding of up to 48 channels.

MPEG-2 profiles and levels

MPEG-2 covers a wide range of applications. For most applications it is unrealistic and too expensive to support the entire standard, so implementations usually support only a subset. Profiles and levels are defined to describe these subsets. A profile defines features, such as the compression algorithm and the chroma format. A level defines performance limits, such as the maximum bit rate and the maximum frame size. An application states its capabilities as a profile and a level, and the combination of the two identifies the subset of the MPEG-2 video coding standard used in that application: for an image in a given input format, a specific set of compression tools produces a coded stream within a specified rate range. For example, a DVD player typically supports Main Profile at Main Level (usually written MP@ML).

MPEG-2 profiles:

Abbr.    Name                        Picture coding types  Chroma format (YCbCr)  Aspect ratio  Scalable modes
SP       Simple Profile              I, P                  4:2:0                  4:3 or 16:9
MP       Main Profile                I, P, B               4:2:0                  4:3 or 16:9
SNR      SNR Scalable Profile        I, P, B               4:2:0                  4:3 or 16:9   SNR scalable
Spatial  Spatially Scalable Profile  I, P, B               4:2:0                  4:3 or 16:9   SNR or spatial scalable
422P     4:2:2 Profile               I, P, B               4:2:2
HP       High Profile                I, P, B               4:2:2 or 4:2:0         4:3 or 16:9   SNR or spatial scalable

MPEG-2 levels:

Abbr.  Name             Frame rates (Hz)                          Max resolution (px)  Max luma samples/s  Max bit rate (Mbit/s)
LL     Low Level        23.976, 24, 25, 29.97, 30                 352 × 288            3,041,280           4
ML     Main Level       23.976, 24, 25, 29.97, 30                 720 × 576            10,368,000 (1)      15
H-14   High 1440 Level  23.976, 24, 25, 29.97, 30, 50, 59.94, 60  1440 × 1152          47,001,600 (2)      60
HL     High Level       23.976, 24, 25, 29.97, 30, 50, 59.94, 60  1920 × 1152          62,668,800 (3)      80

The maximum luma sample rate is approximately width × height × frame rate.
(1) Exception in HP: 14,475,600 for 4:2:0 and 11,059,200 for 4:2:2.
(2) Exception: 62,668,800 for 4:2:0 in HP.
(3) Exception: 83,558,400 for 4:2:0 in HP.
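The luma sample rate column is roughly width × height × frame rate, which is easy to verify. The sketch below recomputes it and checks a stream against the level limits from the table. It deliberately ignores the High Profile exceptions and the list of permitted frame rates, and the names `LEVELS`, `luma_sample_rate`, and `fits_level` are hypothetical.

```python
# Level limits taken from the table: (max width, max height,
# max luma samples per second, max bit rate in Mbit/s).
LEVELS = {
    "LL":   (352, 288, 3_041_280, 4),
    "ML":   (720, 576, 10_368_000, 15),
    "H-14": (1440, 1152, 47_001_600, 60),
    "HL":   (1920, 1152, 62_668_800, 80),
}


def luma_sample_rate(width, height, fps):
    """Luma samples per second for a given picture size and frame rate."""
    return width * height * fps


def fits_level(level, width, height, fps, mbps):
    """True if the stream parameters stay within every limit of `level`."""
    max_width, max_height, max_rate, max_mbps = LEVELS[level]
    return (width <= max_width
            and height <= max_height
            and luma_sample_rate(width, height, fps) <= max_rate
            and mbps <= max_mbps)


print(luma_sample_rate(352, 288, 30))           # matches the LL limit
print(luma_sample_rate(720, 576, 25))           # matches the ML limit
print(fits_level("ML", 720, 576, 25, 9.8))      # a DVD-like PAL stream
print(fits_level("ML", 1920, 1080, 30, 19.39))  # HD does not fit Main Level
```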

Combination examples

Profile @ Level  Resolution (px)  Max frame rate (Hz)  Sampling  Bit rate (Mbit/s)  Example application
SP@LL            176 × 144        15                   4:2:0     0.096              Wireless handsets
SP@ML            352 × 288        15                   4:2:0     0.384              PDAs
                 320 × 240        24
MP@LL            352 × 288        30                   4:2:0     4                  Set-top boxes (STB)
MP@ML            720 × 480        30                   4:2:0     15 (DVD: 9.8)      SD-DVB, DVD
                 720 × 576        25
MP@H-14          1440 × 1080      30                   4:2:0     60 (HDV: 25)       HDV
                 1280 × 720       30
MP@HL            1920 × 1080      30                   4:2:0     80 (1)             ATSC 1080i, 720p60, HD-DVB (HDTV)
                 1280 × 720       60
422P@LL
422P@ML          720 × 480        30                   4:2:2     50                 Sony IMX (I-frames only); broadcast "contribution" video (I and P only)
                 720 × 576        25
422P@H-14        1440 × 1080      30                   4:2:2     80                 Potential future MPEG-2-based HD products from Sony and Panasonic
                 1280 × 720       60
422P@HL          1920 × 1080      30                   4:2:2     300                Potential future MPEG-2-based HD products from Panasonic
                 1280 × 720       60

(1) The bit rate for terrestrial transmission is limited to 19.39 Mbit/s.

Application of MPEG-2 in DVD

DVD uses the MPEG-2 standard with the following technical restrictions:
* Resolution
    o 720x480, 704x480, 352x480, 352x240 pixels (NTSC)
    o 720x576, 704x576, 352x576, 352x288 pixels (PAL)
* Aspect ratio
    o 4:3
    o 16:9
* Frame rate
    o 59.94 fields/second, 23.976 frames/second, 29.97 frames/second (NTSC)
    o 50 fields/second, 25 frames/second (PAL)
* Video + audio bit rate
    o Buffered average maximum: 9.8 Mbit/s
    o Peak: 15 Mbit/s
    o Minimum: 300 kbit/s
* YUV 4:2:0
* Subtitle support
* Closed captioning support (NTSC only)
* Audio
    o LPCM: 48 kHz or 96 kHz; 16 or 24 bits; up to 6 channels
    o MPEG Layer 2 (MP2): 48 kHz, up to 5.1 channels
    o Dolby Digital (DD, also known as AC-3): 48 kHz, 32-448 kbit/s, up to 5.1 channels
    o Digital Theater Systems (DTS): 754 kbit/s or 1510 kbit/s
    o An NTSC DVD must contain at least one LPCM or Dolby Digital track
    o A PAL DVD must contain at least one MPEG Layer 2, LPCM, or Dolby Digital track
* GOP structure
    o A sequence header must be provided for each GOP
    o Maximum frames per GOP: 18 (NTSC) / 15 (PAL)
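Some of the restrictions listed above lend themselves to a simple pre-flight check. The sketch below covers only three of them (resolution, GOP length, and the 9.8 Mbit/s average), and the function `dvd_violations` and its parameters are invented for this illustration; it is not a complete DVD-Video compliance test.

```python
# Values copied from the list above.
DVD_RESOLUTIONS = {
    "NTSC": {(720, 480), (704, 480), (352, 480), (352, 240)},
    "PAL":  {(720, 576), (704, 576), (352, 576), (352, 288)},
}
DVD_MAX_GOP_FRAMES = {"NTSC": 18, "PAL": 15}
DVD_MAX_AVERAGE_MBPS = 9.8


def dvd_violations(system, width, height, gop_frames, average_mbps):
    """Return human-readable descriptions of the restrictions that the
    given stream parameters break (an empty list means none were found)."""
    problems = []
    if (width, height) not in DVD_RESOLUTIONS[system]:
        problems.append(f"{width}x{height} is not a {system} DVD resolution")
    if gop_frames > DVD_MAX_GOP_FRAMES[system]:
        problems.append(f"GOP of {gop_frames} frames exceeds "
                        f"{DVD_MAX_GOP_FRAMES[system]} for {system}")
    if average_mbps > DVD_MAX_AVERAGE_MBPS:
        problems.append(f"average bit rate {average_mbps} Mbit/s exceeds "
                        f"{DVD_MAX_AVERAGE_MBPS} Mbit/s")
    return problems


print(dvd_violations("PAL", 720, 576, 15, 6.0))    # compliant on these three points
print(dvd_violations("NTSC", 720, 576, 20, 12.0))  # breaks all three
```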

Application of MPEG-2 in DVB

Technical parameters for DVB-MPEG:
* One of the following resolutions must be used:
    o 720x480 pixels, 24/1.001, 24, 30/1.001 or 30 frames/second
    o 640x480 pixels, 24/1.001, 24, 30/1.001 or 30 frames/second
    o 544x480 pixels, 24/1.001, 24, 30/1.001 or 30 frames/second
    o 480x480 pixels, 24/1.001, 24, 30/1.001 or 30 frames/second
    o 352x480 pixels, 24/1.001, 24, 30/1.001 or 30 frames/second
    o 352x240 pixels, 24/1.001, 24, 30/1.001 or 30 frames/second
    o 720x576 pixels, 25 frames/second
    o 544x576 pixels, 25 frames/second
    o 480x576 pixels, 25 frames/second
    o 352x576 pixels, 25 frames/second
    o 352x288 pixels, 25 frames/second

Application of MPEG-2 in ATSC

One of the following resolutions must be used:
    o 1920x1080 pixels, up to 60 fields/second (1080i)
    o 1280x720 pixels, up to 60 frames/second (720p)
    o 720x576 pixels, up to 50 fields/second, 25 frames/second (576i, 576p)
    o 720x480 pixels, up to 60 fields/second, 30 frames/second (480i, 480p)
    o 640x480 pixels, up to 60 frames/second
Note: 1080i is encoded as 1920x1088 pixels, but the last eight lines are discarded before display.

Supplement on YCbCr

YCbCr is not an absolute color space but a scaled and offset version of YUV. (The figure of the U-V plane that accompanied the original is not reproduced here.)

Y (luma, luminance) is the brightness, that is, the grayscale value. U and V together make up C (chrominance, or chroma), the color. The main subsampling formats are YCbCr 4:2:0, YCbCr 4:2:2, and YCbCr 4:1:1. The YUV notation is called A:B:C notation:

* 4:4:4 indicates full sampling.
* 4:2:2 indicates 2:1 horizontal subsampling with no vertical subsampling.
* 4:2:0 indicates 2:1 horizontal subsampling and 2:1 vertical subsampling.
* 4:1:1 indicates 4:1 horizontal subsampling with no vertical subsampling.

The most commonly recorded Y:UV ratio is 1:1 or 2:1, and DVD-Video is recorded in YUV 4:2:0, also known as I420. YUV 4:2:0 does not mean that V (that is, Cr) is always 0 while only U (that is, Cb) is present. It means that U and V alternate from row to row, so each row carries only one of the two components: if one row is 4:2:0, the next row is 4:0:2, the one after that is 4:2:0 again, and so on.
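The effect of chroma subsampling on frame size can be worked out directly. The sketch below uses the standard meanings of the A:B:C schemes (4:4:4 full sampling, 4:2:2 halved horizontally, 4:2:0 halved in both directions, 4:1:1 quartered horizontally) and assumes 8 bits per sample and picture dimensions divisible by the subsampling factors. The names `SUBSAMPLING`, `plane_sizes`, and `frame_bytes` are hypothetical.

```python
# (horizontal divisor, vertical divisor) applied to each chroma plane.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:1": (4, 1),
}


def plane_sizes(width, height, scheme):
    """Return (luma samples, samples per chroma plane) for one frame."""
    h_div, v_div = SUBSAMPLING[scheme]
    return width * height, (width // h_div) * (height // v_div)


def frame_bytes(width, height, scheme):
    """Uncompressed frame size in bytes: one Y plane plus Cb and Cr planes,
    assuming 8 bits per sample."""
    luma, chroma = plane_sizes(width, height, scheme)
    return luma + 2 * chroma


for scheme in SUBSAMPLING:
    print(scheme, frame_bytes(720, 480, scheme))
```

For a 720x480 frame, 4:2:0 needs half the bytes of 4:4:4 before any compression is applied, which is one reason it is the format used on DVD-Video.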

The material above is compiled from wiki documents.
