MPEG2 parsing Summary

Last Update:2018-12-05 Source: Internet

Author: User


ISO/IEC-13818-1: system part;

ISO/IEC-13818-2: video;

ISO/IEC-13818-3: audio;

ISO/IEC-13818-4: conformance testing;

ISO/IEC-13818-5: software simulation;

ISO/IEC-13818-6: digital storage media command and control;

ISO/IEC-13818-7: Advanced Audio Coding (AAC);

ISO/IEC-13818-9: real-time interface for systems decoders.

MPEG2 system tasks include:

1. Defining the protocol for transmitting data in packets;

2. Defining the protocol that synchronizes the data streams between the sending and receiving ends;

3. Providing the multiplexing and demultiplexing protocols for multiple data streams;

4. Providing the protocols for data stream encryption (scrambling).

Storing and transmitting data streams in the form of packets is the key point of MPEG2 systems.

An ES (elementary stream) is the data stream that comes directly out of an encoder. The term covers encoded video streams, encoded audio streams, and other encoded data streams. The ES is converted into PES (packetized elementary stream) packets by the PES packetizer. A PES packet consists of a packet header and a payload. The specific format is as follows:

The PTS and DTS are carried in the PES packet header. These two timestamps are the key to presenting video and audio in sync and to preventing overflow or underflow of the decoder's input buffer. The PTS (presentation time stamp) indicates the time at which a presentation unit is to be presented in the system target decoder (STD). The DTS (decoding time stamp) indicates the time at which all bytes of an access unit are removed from the ES decoder buffer of the STD for decoding. I, P, and B frames can each carry a PTS and a DTS in the packet header, but for B frames the PTS and DTS are identical, so the DTS does not need to be transmitted. I and P frames, by contrast, must be held in the reorder buffer of the video decoder before display. Because of this reordering delay their PTS and DTS differ, and both must be marked.
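As an illustration of how these timestamps sit in the header, here is a minimal Python sketch that reads the PTS and DTS from a PES packet. It assumes an audio or video stream ID (so the optional PES header is present) and a buffer that starts at the start-code prefix. The function names and the sample values are hypothetical and chosen only for this example.

```python
PTS_CLOCK_HZ = 90_000  # PTS and DTS count ticks of a 90 kHz clock


def decode_timestamp(b: bytes) -> int:
    """Decode a 33-bit timestamp from its 5-byte PES encoding (marker bits are skipped)."""
    return (
        (((b[0] >> 1) & 0x07) << 30)  # bits 32..30
        | (b[1] << 22)                # bits 29..22
        | ((b[2] >> 1) << 15)         # bits 21..15
        | (b[3] << 7)                 # bits 14..7
        | (b[4] >> 1)                 # bits 6..0
    )


def encode_timestamp(prefix: int, ts: int) -> bytes:
    """Inverse of decode_timestamp; used here only to build sample data."""
    return bytes([
        (prefix << 4) | (((ts >> 30) & 0x07) << 1) | 1,
        (ts >> 22) & 0xFF,
        (((ts >> 15) & 0x7F) << 1) | 1,
        (ts >> 7) & 0xFF,
        ((ts & 0x7F) << 1) | 1,
    ])


def parse_pes_timestamps(pes: bytes):
    """Return (pts, dts); a missing timestamp is returned as None."""
    if pes[:3] != b"\x00\x00\x01":
        raise ValueError("missing PES start code prefix")
    pts_dts_flags = pes[7] >> 6  # top two bits of the second flags byte
    pts = decode_timestamp(pes[9:14]) if pts_dts_flags & 0b10 else None
    dts = decode_timestamp(pes[14:19]) if pts_dts_flags == 0b11 else None
    return pts, dts


# Hypothetical video PES header carrying both PTS and DTS.
sample = (
    b"\x00\x00\x01\xe0\x00\x00"          # start code, stream_id 0xE0, length 0
    + bytes([0x80, 0xC0, 10])            # flags, PTS_DTS_flags = '11', header data length
    + encode_timestamp(0b0011, 183_600)  # PTS
    + encode_timestamp(0b0001, 180_000)  # DTS
)
pts, dts = parse_pes_timestamps(sample)
print(pts, dts, (pts - dts) / PTS_CLOCK_HZ)
```

In this sample the PTS trails the DTS by 3600 ticks, which is one frame period at 25 frames per second. When only a PTS is present, a decoder treats the DTS as equal to it.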

As described in the previous section, the ES is first packetized into PES packets, and the PES packets are then packed into a PS (program stream) or TS (transport stream) for storage or transmission as required. Each ES carries the encoded data of only one source, so each PES likewise carries data from only its corresponding source.

For PS streams, each PES header contains the PTS, DTS, and a stream ID that distinguishes ESs of different kinds. The PS multiplexer then multiplexes the PES packets into PS packs. Each PS pack consists of a pack header followed by PES packets. During decoding, the demultiplexer splits the PS back into PES packets, which are in turn unpacked into video and audio ES data and fed to their respective decoders. One question remains: how are video and audio kept in sync while the ESs are decoded? Besides the PTS and DTS, an important parameter is the SCR (system clock reference). During encoding, the PTS, DTS, and SCR are all generated from the STC (system time clock). During decoding, the STC is regenerated: a PLL (phase-locked loop) compares the phase of the local SCR with that of the incoming instantaneous SCR to determine whether decoding is synchronized. If it is not, the instantaneous SCR is used to adjust the frequency of the local 27 MHz clock. In this way the PTS, DTS, and SCR work together to keep video and audio playback in sync. The PS format is excerpted as follows:

PS packs are relatively long and of variable length. PS is used mainly in error-free environments, because the longer the packet, the harder it is to resynchronize and to reassemble the stream after packet loss. PS is therefore suited to editing program material and to local-content applications.

A TS is likewise composed of one or more PES streams, which may share a time base or have different ones. The basic idea is two-stage multiplexing: PES streams that share a time base are first multiplexed into a program, and programs with independent time bases are then multiplexed together to form the TS. A TS packet consists of two parts, the packet header and the packet data, and the header may be followed by an optional adaptation field. The header occupies 4 bytes, and the adaptation field and payload together occupy 184 bytes, for a fixed packet length of 188 bytes. This is equivalent to the payload of 4 ATM cells (47 bytes each under AAL1). The TS packet header consists of the sync byte, the transport error indicator, the payload unit start indicator, the transport priority, the PID (packet identifier), the transport scrambling control, the adaptation field control, and the continuity counter.

The sync byte has a fixed value (0x47). The decoder uses its regular recurrence in the byte stream to detect packet boundaries and establish packet synchronization. The transport error indicator, when set, signals that the error-correction decoder found at least 1 bit error in the packet that it could not correct. The payload unit start indicator signals whether the payload of this packet begins a new unit, that is, the start of a PES packet or PSI section. The transport priority marks the priority assigned to the TS packet. The PID values of the ESs are assigned by the user, and the decoder uses the PID to tell apart TS packets belonging to different ESs so that it can reconstruct each original ES. The transport scrambling control indicates whether the packet payload is scrambled; the packet header and adaptation field are never scrambled. The adaptation field control uses 2 bits to indicate whether an adaptation field is present: (01) means payload only, with no adaptation field; (10) means adaptation field only, with no payload; (11) means an adaptation field followed by payload; (00) is reserved. The continuity counter counts the packets of each PID in transmission order, and from the counter value the receiver can detect lost or out-of-order packets. The packet header thus provides synchronization, identification, error detection, and scrambling signaling for TS packets.
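The header fields described above fit in 4 bytes, and unpacking them is a matter of masking and shifting. The following Python sketch shows one way to do it. The class and function names are made up for this example, and the continuity check is simplified: it ignores the standard's allowance for duplicate packets and for packets that carry no payload.

```python
from dataclasses import dataclass

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


@dataclass
class TsHeader:
    transport_error: bool
    payload_unit_start: bool
    transport_priority: bool
    pid: int                       # 13 bits
    scrambling_control: int        # 2 bits
    adaptation_field_control: int  # 2 bits: 01, 10, 11 (00 is reserved)
    continuity_counter: int        # 4 bits


def parse_ts_header(packet: bytes) -> TsHeader:
    """Unpack the 4-byte header of one 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("expected a 188-byte packet starting with 0x47")
    return TsHeader(
        transport_error=bool(packet[1] & 0x80),
        payload_unit_start=bool(packet[1] & 0x40),
        transport_priority=bool(packet[1] & 0x20),
        pid=((packet[1] & 0x1F) << 8) | packet[2],
        scrambling_control=(packet[3] >> 6) & 0x03,
        adaptation_field_control=(packet[3] >> 4) & 0x03,
        continuity_counter=packet[3] & 0x0F,
    )


def continuity_ok(previous: int, current: int) -> bool:
    """Simplified check: the 4-bit counter should advance by one, wrapping 15 -> 0."""
    return current == (previous + 1) & 0x0F


# Hypothetical packet: PID 0x100, payload unit start set, payload only, counter 7.
packet = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
header = parse_ts_header(packet)
print(hex(header.pid), header.adaptation_field_control, header.continuity_counter)
```

A demultiplexer would typically keep the last counter value per PID and call a check like `continuity_ok` on each new packet of that PID.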

The adaptation field of a TS packet consists of four parts: the adaptation field length, a set of flags, optional fields whose presence depends on the flags, and stuffing bytes. There are eight flags: the discontinuity indicator, random access indicator, elementary stream priority indicator, PCR flag, OPCR (original PCR) flag, splicing point flag, transport private data flag, and adaptation field extension flag.

The most important optional field is the PCR (program clock reference), which supplies the synchronization data for the decoder's 27 MHz clock. During decoding, a PLL compares the phase of the local PCR with that of the incoming instantaneous PCR to determine whether decoding is synchronized. If it is not, the instantaneous PCR is used to adjust the clock frequency. Because digital images are compressed with complex, varying coding algorithms, every picture yields a different amount of data, so clock information cannot be derived from the start of the compressed picture data. Instead, the adaptation fields of some (not all) TS packets are chosen to carry the timing information. The adaptation field therefore carries the control bits and important control information for the stream, and it does not need to be sent with every packet. How often it is sent is determined by the timing parameters of the selected TS packets.

The random access indicator and the splicing point flag matter when the program changes: they provide a random entry point into the stream at a compressed I frame, and they also make it easier to insert local programs.

Stuffing bytes are needed in the adaptation field because the length of a PES packet is generally not an exact multiple of the TS payload size. The last TS packet of a PES packet is left with some unused capacity, which is filled with stuffing bytes. This prevents buffer overflow and keeps the total bit rate constant.
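To make the PCR field concrete, the sketch below pulls it out of the adaptation field of a single TS packet. The PCR is coded as a 33-bit base counted at 90 kHz plus a 9-bit extension counted at 27 MHz, and the full value in 27 MHz ticks is base x 300 + extension. The function name `parse_pcr` is an assumption for this example, and the sketch does no validation beyond checking the flags.

```python
SYSTEM_CLOCK_HZ = 27_000_000


def parse_pcr(packet: bytes):
    """Return the PCR in 27 MHz ticks, or None if this packet carries no PCR."""
    adaptation_field_control = (packet[3] >> 4) & 0x03
    if not adaptation_field_control & 0x02:
        return None                      # no adaptation field in this packet
    adaptation_field_length = packet[4]
    if adaptation_field_length < 7:
        return None                      # too short to hold flags plus a PCR
    flags = packet[5]
    if not flags & 0x10:
        return None                      # PCR flag not set
    b = packet[6:12]
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    extension = ((b[4] & 0x01) << 8) | b[5]
    return base * 300 + extension


def pcr_seconds(pcr: int) -> float:
    """Convert a PCR value to seconds on the 27 MHz system clock."""
    return pcr / SYSTEM_CLOCK_HZ
```

A receiver would compare successive values returned by `parse_pcr` with its local counter and feed the difference to its clock-recovery loop.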

The preceding sections summarized the basic MPEG2 stream formats (PES, PS, and TS) and their key fields. As a transport stream, TS packetizes and multiplexes the content, carries the media in TS form, and the content is finally decoded at the receiving end. Put simply, TS is a transport-layer format that can carry many kinds of content, such as MPEG, WMV, H.264, and even IP data. How, then, is the structure of what is being transmitted described? That is the job of PSI (program-specific information).

PSI consists of four tables: PAT (program association table), PMT (program map table), CAT (conditional access table), and NIT (network information table). Together they describe the transmission structure of all ESs contained in a TS. The first thing to understand is that a TS is transmitted as packets, so the encoder and decoder must use a packet ID (PID) to identify what each TS packet carries. For example, the PAT is carried in one or more TS packets, so it must be identified by a special PID. Likewise, each ES needs its own PID. With the two tables PAT and PMT, the decoder can identify by PID the TS packets that belong to each ES and decode them.

Decoding a TS takes two steps. First, the decoder parses the PAT from the TS packets whose PID is 0 and reads from it the PMT PID of each program. A program generally consists of several ESs, which are described in its PMT, so the PMT PID leads to the PMT, and the PMT gives the PID of every ES in the program. Second, the decoder separates the packets in the TS according to the ES PIDs listed in the PMT and decodes each ES accordingly. A TS is thus built in two layers: program multiplexing and transport multiplexing. The PMT is added at program multiplexing, and the PAT is added at transport multiplexing. Correspondingly, on the receiving side the PAT is obtained at transport demultiplexing and the PMT at program demultiplexing.
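The two-step lookup can be sketched in Python as follows. Both functions take one complete PSI section starting at the table_id byte (that is, after the pointer field has been skipped), assume the table fits in a single section, and do not verify the CRC. The function names and the sample sections are hypothetical.

```python
def section_end(section: bytes) -> int:
    """Index of the first CRC_32 byte, derived from the 12-bit section_length."""
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    return 3 + section_length - 4


def parse_pat(section: bytes) -> dict:
    """Map program_number -> PMT PID (program 0 maps to the network PID)."""
    if section[0] != 0x00:
        raise ValueError("not a PAT section")
    programs = {}
    for i in range(8, section_end(section), 4):
        program_number = (section[i] << 8) | section[i + 1]
        programs[program_number] = ((section[i + 2] & 0x1F) << 8) | section[i + 3]
    return programs


def parse_pmt(section: bytes):
    """Return (pcr_pid, [(stream_type, elementary_pid), ...])."""
    if section[0] != 0x02:
        raise ValueError("not a PMT section")
    pcr_pid = ((section[8] & 0x1F) << 8) | section[9]
    program_info_length = ((section[10] & 0x0F) << 8) | section[11]
    i = 12 + program_info_length          # skip program-level descriptors
    streams = []
    while i < section_end(section):
        stream_type = section[i]
        elementary_pid = ((section[i + 1] & 0x1F) << 8) | section[i + 2]
        es_info_length = ((section[i + 3] & 0x0F) << 8) | section[i + 4]
        streams.append((stream_type, elementary_pid))
        i += 5 + es_info_length           # skip ES-level descriptors
    return pcr_pid, streams


# Hypothetical sections; the last four bytes stand in for the CRC_32.
pat = bytes([0x00, 0xB0, 0x11, 0x00, 0x01, 0xC1, 0x00, 0x00,
             0x00, 0x00, 0xE0, 0x10,      # program 0 -> network PID 0x0010
             0x00, 0x01, 0xF0, 0x00,      # program 1 -> PMT PID 0x1000
             0, 0, 0, 0])
pmt = bytes([0x02, 0xB0, 0x17, 0x00, 0x01, 0xC1, 0x00, 0x00,
             0xE1, 0x00, 0xF0, 0x00,       # PCR PID 0x0100, no descriptors
             0x02, 0xE1, 0x00, 0xF0, 0x00, # MPEG-2 video on PID 0x0100
             0x03, 0xE1, 0x01, 0xF0, 0x00, # MPEG-1 audio on PID 0x0101
             0, 0, 0, 0])

pmt_pid = parse_pat(pat)[1]               # step 1: PAT gives the PMT PID
pcr_pid, streams = parse_pmt(pmt)         # step 2: PMT gives the ES PIDs
print(hex(pmt_pid), hex(pcr_pid), [(t, hex(p)) for t, p in streams])
```

In a real demultiplexer the PMT section would be collected from the TS packets whose PID equals `pmt_pid`, and the ES PIDs found in it would then drive the packet filter.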

Because TS supports multiplexing, it can carry multiple programs, and streams that are already multiplexed can be multiplexed again in several layers. When multiplexing, keep in mind the time-reference and synchronization problems that the decoder will face. Demultiplexing requires the various streams to be synchronized, so the relevant timing information (PTS, DTS, and PCR) must be inserted during multiplexing.

When a TS is formed, the PTS and DTS are stamped into the PES packets from the STC reference as the ES is packetized into PES. When the PES is then cut into TS packets, the PID and PCR are added. When existing streams are multiplexed again, the PID and PCR of each input are extracted and analyzed, and a new PCR is generated from the unified STC reference and inserted into the new TS. Finally, because the original PAT no longer applies, a new PAT must be generated and attached to the new TS. After this multi-layer multiplexing, the new TS can enter the scheduling and transmission phase. For the process, see:

Decoding has to solve three problems: demultiplexing, video/audio synchronization, and keeping the decoder buffers from overflowing. Demultiplexing separates the programs that the TS carries time-multiplexed in the same channel. Video/audio synchronization is achieved with the DTS, PTS, and PCR: the PCR is the absolute time reference used to rebuild the system time base, while the DTS and PTS are the relative time points for decoding and presentation. Keeping the decoder buffers within bounds is achieved through the system target decoder (STD) model. The basic idea is as follows:

After the TS enters the decoder, the demultiplexer first separates it into the individual ES streams (including the PSI stream) in time sequence.
After demultiplexing, each ES enters its own transport buffer, and its PES data then passes into the main buffer. The PSI stream enters the system buffer and likewise reaches its main buffer.
Finally, the decoder takes the media or system data out of each main buffer for decoding according to the DTS, and presents the media content according to the PTS.
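The last step, decoding by DTS and presenting by PTS, can be illustrated with a small Python sketch. The access units and their timestamps are invented for this example (a short I, B, B, P group at 25 frames per second, 3600 ticks per frame), and the model ignores buffer sizes entirely.

```python
from typing import NamedTuple


class AccessUnit(NamedTuple):
    name: str
    dts: int  # decoding time stamp, 90 kHz ticks
    pts: int  # presentation time stamp, 90 kHz ticks


def decode_order(units):
    """Order in which the decoder removes units from the buffer (by DTS)."""
    return [u.name for u in sorted(units, key=lambda u: u.dts)]


def display_order(units):
    """Order in which decoded pictures are presented (by PTS)."""
    return [u.name for u in sorted(units, key=lambda u: u.pts)]


# Hypothetical group of pictures, listed in arbitrary order.
units = [
    AccessUnit("B1", dts=7200, pts=7200),    # B frame: PTS equals DTS
    AccessUnit("I0", dts=0, pts=3600),       # I frame waits in the reorder buffer
    AccessUnit("B2", dts=10800, pts=10800),  # B frame: PTS equals DTS
    AccessUnit("P3", dts=3600, pts=14400),   # P frame decoded early, shown last
]

print(decode_order(units))
print(display_order(units))
```

The P frame is decoded before the two B frames that depend on it but presented after them, which is why only the I and P frames need distinct PTS and DTS values.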

For the process, see:
