Code stream identification and transmission

Source: Internet
Author: User

Introduction to the basic syntax of H.264

Compared with earlier video compression coding standards, H.264 changes the syntax structure considerably, in two main respects:

1. Removal of the frame-level syntax unit

H.264 has no syntax unit such as frame_header; all frame information is carried in the slice_header, SPS, and PPS. Because one frame can correspond to multiple slices, the decoder cannot identify a frame in the code stream by parsing something like a frame_header.

2. Introduction of parameter sets such as SPS and PPS

The features common to all images of a video sequence are extracted and placed in the SPS syntax unit (the data from one IDR frame up to the next IDR frame forms a video sequence).

The typical features of each image are extracted and placed in the PPS syntax unit.

The SPS can only be switched between video sequences; that is, only the first slice of an IDR frame can switch the SPS.

The PPS can only be switched between images; only the first slice of each frame can switch the PPS.

At the macro level, a typical H.264 stream consists of an SPS, a PPS, an IDR frame (containing one or more I slices), P frames (containing one or more P slices), and B frames (containing one or more B slices). In addition, the SEI syntax structure is defined; it is generally not parsed unless the encoder and decoder have agreed on a specific syntax.

NALU introduction

The NALU is the highest level of abstraction in H.264: all H.264 syntax structures are ultimately encapsulated in NALUs, and the NALUs in a stream must be separated by a suitable delimiter. For example, when the prefix code "00 00 01" is used as the NALU delimiter, you can search for "00 00 01" to locate a NALU.
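The prefix-code search described above can be sketched as follows. This is a minimal illustration rather than a reference parser: the function name `find_nalus` and the sample bytes are made up for the example, and it assumes the whole stream is already in memory. It also accepts the four-byte form "00 00 00 01" by trimming the zero bytes that such a prefix leaves at the end of the preceding NALU.

```python
def find_nalus(stream: bytes):
    """Return (start, end) offsets of each NALU, excluding the 00 00 01 prefix."""
    prefix = b"\x00\x00\x01"
    starts = []
    pos = stream.find(prefix)
    while pos != -1:
        starts.append(pos + len(prefix))
        pos = stream.find(prefix, pos + len(prefix))
    spans = []
    for i, start in enumerate(starts):
        # A NALU ends where the next prefix begins, or at the end of the buffer.
        end = starts[i + 1] - len(prefix) if i + 1 < len(starts) else len(stream)
        # Zero bytes in front of the next prefix belong to the delimiter, not the NALU.
        while end > start and stream[end - 1] == 0:
            end -= 1
        spans.append((start, end))
    return spans


# Hypothetical sample: three very short NALUs, the first with a 4-byte prefix.
stream = b"\x00\x00\x00\x01\x67\x42\x00\x00\x01\x68\xce\x00\x00\x01\x65\x88\x84"
for start, end in find_nalus(stream):
    print(stream[start:end].hex())
```

Returning offsets instead of slices keeps the search free of copies; the caller decides when to slice.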

The NALU has its own syntax structure, but it occupies only one byte: apart from the first byte that follows the prefix code "00 00 01", the rest of the NALU is H.264 payload.

nalu_type is the most important syntax element of the NALU; it indicates the type of H.264 syntax structure encapsulated in the NALU.

nalu_type can be extracted as follows:

nalu_type = first_byte_in_nal & 0x1F

Here, first_byte_in_nal denotes the first byte of the NALU.
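As a small illustration of the formula above, the sketch below splits the one-byte NALU header into its fields. The article only mentions nalu_type; the names `forbidden_zero_bit` and `nal_ref_idc` for the upper three bits come from the H.264 standard, and `parse_nalu_header` is a hypothetical helper name.

```python
def parse_nalu_header(first_byte_in_nal: int):
    """Split the one-byte NALU header into its three fields."""
    forbidden_zero_bit = (first_byte_in_nal >> 7) & 0x01  # expected to be 0
    nal_ref_idc = (first_byte_in_nal >> 5) & 0x03         # 0 means not used for reference
    nalu_type = first_byte_in_nal & 0x1F                  # the formula from the text
    return forbidden_zero_bit, nal_ref_idc, nalu_type


print(parse_nalu_header(0x67))  # (0, 3, 7)
print(parse_nalu_header(0x65))  # (0, 3, 5)
```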

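The correspondence between nalu_type and the H.264 syntax structures is as follows (commonly used values):

    nalu_type = 1: slice of a non-IDR image
    nalu_type = 5: slice of an IDR image
    nalu_type = 6: SEI
    nalu_type = 7: SPS
    nalu_type = 8: PPS
    nalu_type = 9: access unit delimiter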


How to identify IDR frames

The decoder can only start decoding from an IDR frame, so the player needs to recognize IDR frames in order to implement fast forward, rewind, and channel switching. An IDR frame can be identified by its nalu_type. For example, you can search the stream and extract several consecutive NALUs whose nalu_type equals 5 to obtain a complete IDR frame (see the correspondence between nalu_type and the H.264 syntax structures above). One IDR frame may be split into multiple NALUs.

SPS and PPS are searched for in the same way as IDR frames, but each SPS and each PPS is carried in a single NALU.
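The search described above might look like the sketch below. It assumes the stream has already been split into NALUs (each a `bytes` object starting with the header byte, without the prefix code); `first_run_of_type` is a hypothetical helper name. Two IDR frames sent back to back would be merged into one run, which this sketch does not try to detect.

```python
def first_run_of_type(nalus, wanted_type=5):
    """Return the first run of consecutive NALUs whose nalu_type equals wanted_type."""
    run = []
    for nalu in nalus:
        if nalu and (nalu[0] & 0x1F) == wanted_type:
            run.append(nalu)
        elif run:
            break  # the run has ended
    return run


# Hypothetical sample: SPS, PPS, an IDR frame split into two NALUs, then a P slice.
nalus = [b"\x67\x42", b"\x68\xce", b"\x65\x88", b"\x65\x05", b"\x41\x9a"]
idr_frame = first_run_of_type(nalus, 5)
sps = first_run_of_type(nalus, 7)
print(len(idr_frame), len(sps))
```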

Frame boundary recognition methods

Introduction to frame boundary recognition

In H.264, the set of all NALUs that make up one frame is called an AU (access unit), and frame boundary recognition is really AU recognition. Because the frame-level syntax was removed, an AU cannot be obtained easily from the stream. The decoder can only determine during decoding, by combining certain syntax elements, whether a frame has ended. It must therefore finish parsing the first slice_header of the new image before it knows that the previous frame has ended.

The AU identification steps are as follows:

    1. Remove the 03 bytes (emulation prevention bytes) from the code stream
    2. Parse the NALU syntax
    3. Parse the slice_header syntax
    4. Compare a number of syntax elements in the two NALUs and their slice_headers to see whether they have changed. If they have, the two NALUs belong to different frames; otherwise they belong to the same frame.
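Step 1 of the list above can be sketched as follows. Inside a NALU the encoder inserts a 03 byte after every two consecutive 00 bytes so that the payload never imitates a prefix code, and the decoder has to remove those bytes before parsing. The helper name `strip_emulation_prevention` is hypothetical, and the input is assumed to be one NALU without its prefix code. Steps 2 to 4 need a full slice_header parser and are not shown.

```python
def strip_emulation_prevention(nalu: bytes) -> bytes:
    """Remove each 03 byte that directly follows two 00 bytes."""
    out = bytearray()
    zeros = 0  # number of consecutive 00 bytes seen so far
    for byte in nalu:
        if zeros >= 2 and byte == 0x03:
            zeros = 0
            continue  # drop the inserted 03 byte
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


print(strip_emulation_prevention(b"\x65\x00\x00\x03\x01").hex())  # 65000001
```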


Obviously, recognizing AUs before decoding consumes a lot of CPU resources, so decoding in AU mode this way is not recommended.

Frame boundary recognition with encoder and decoder cooperation

To provide a simple AU identification scheme, H.264 specifies a NALU of type 09 (the access unit delimiter): each time the encoding of an AU is completed, the encoder inserts a NALU of type 09 into the stream, so the decoder only needs to search the stream for NALUs of type 09 to obtain an AU. (This requires cooperation from the encoder, so it only works in a specific environment.)
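A minimal sketch of this scheme, assuming the stream has already been split into NALUs and that the encoder really does emit a type 09 NALU between AUs. `split_access_units` is a hypothetical helper name; the delimiter NALUs themselves are dropped from the result.

```python
AUD = 9  # nalu_type of the delimiter NALU


def split_access_units(nalus):
    """Group NALUs into AUs, cutting at every NALU of type 09."""
    units, current = [], []
    for nalu in nalus:
        if nalu and (nalu[0] & 0x1F) == AUD:
            if current:
                units.append(current)
            current = []
        else:
            current.append(nalu)
    if current:
        units.append(current)
    return units


# Hypothetical sample: delimiter, SPS, PPS, IDR slice, delimiter, P slice, delimiter.
nalus = [b"\x09\xf0", b"\x67\x42", b"\x68\xce", b"\x65\x88",
         b"\x09\xf0", b"\x41\x9a", b"\x09\xf0"]
for au in split_access_units(nalus):
    print([n.hex() for n in au])
```

Because the sketch cuts at every delimiter, it behaves the same whether the encoder places the type 09 NALU at the end of an AU or in front of the next one.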

Efficient frame boundary recognition method

The main idea is to exploit the fact that the syntax element first_mb_in_slice in the first slice_header of a frame is generally equal to 0. This efficient frame boundary recognition method is mainly intended for searching within a contiguous stream buffer: once a frame boundary is found, its position is returned, and no copy operation is needed.
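The idea can be sketched as below. first_mb_in_slice is the first element of the slice_header and is Exp-Golomb coded, so the value 0 is the single bit "1"; checking the top bit of the byte after the NALU header is therefore enough. The sketch assumes slices are carried in NALUs of type 1 or 5 only and that every frame starts at macroblock 0; `frame_start_offsets` is a hypothetical name.

```python
def frame_start_offsets(buf: bytes):
    """Offsets of the prefix codes whose slice NALU has first_mb_in_slice == 0."""
    offsets = []
    pos = buf.find(b"\x00\x00\x01")
    while pos != -1:
        header = pos + 3  # index of the NALU header byte
        if header + 1 < len(buf):
            nalu_type = buf[header] & 0x1F
            # A leading "1" bit in the slice_header means first_mb_in_slice == 0.
            if nalu_type in (1, 5) and buf[header + 1] & 0x80:
                offsets.append(pos)
        pos = buf.find(b"\x00\x00\x01", header)
    return offsets


# Hypothetical sample: an IDR frame in two slices, followed by a P frame.
buf = (b"\x00\x00\x01\x65\x88\x80"
       b"\x00\x00\x01\x65\x05\x80"
       b"\x00\x00\x01\x41\x9a\x20")
print(frame_start_offsets(buf))  # [0, 12]
```

Only offsets into the original buffer are returned, which matches the no-copy goal stated in the text.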

Channel switching when decoding video

In general, for video-on-demand or channel switching, searching for an IDR frame alone is not enough. The standard stipulates that the SPS and PPS must be delivered to the decoder before the IDR frame. The most rigorous approach is therefore to search from the beginning of the video stream and send every SPS and PPS to the decoder in turn until an IDR frame is found.

In real-world scenarios, the encoder typically outputs the SPS, PPS, and IDR frame back to back, i.e. the SPS and PPS are sent immediately before the IDR frame. The Hi3507 encoder follows this convention, so four smaller NALUs can be found before the first NALU of each IDR frame; they carry all the parameters needed to decode the current video sequence, and the first of them is an SPS. Therefore, for Hi3507 streams, a complete image can be obtained by finding the SPS and starting decoding from that position.
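For streams laid out the way the paragraph above describes (parameter sets immediately in front of each IDR frame), locating a switching point could look like this sketch. It assumes the stream is already split into NALUs; `find_entry_point` is a hypothetical helper, not part of any HiSilicon API.

```python
SPS, IDR, NON_IDR = 7, 5, 1


def find_entry_point(nalus):
    """Index of the first SPS that is followed by an IDR slice before any non-IDR slice."""
    candidate = None
    for i, nalu in enumerate(nalus):
        nalu_type = nalu[0] & 0x1F
        if nalu_type == SPS and candidate is None:
            candidate = i
        elif nalu_type == IDR and candidate is not None:
            return candidate
        elif nalu_type == NON_IDR:
            candidate = None  # this SPS was not directly in front of an IDR frame
    return None


# Hypothetical sample: two P slices, then SPS, PPS, SEI, IDR slice, P slice.
nalus = [b"\x41\x9a", b"\x41\x9b", b"\x67\x42", b"\x68\xce",
         b"\x06\x05", b"\x65\x88", b"\x41\x9c"]
start = find_entry_point(nalus)
print(start)  # 2
```

Everything from `nalus[start]` onward can then be handed to the decoder.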

How streaming media is transmitted over the network

MPEG-4-based solutions typically use one of the following two approaches when transmitting over a network:

    1. Split the elementary stream into fixed-length data segments that are smaller than the MTU of the network layer
    2. Prepend a private-format header to a whole frame of data and send it directly; the network layer fragments the data, and the application layer ignores packets dropped because of network congestion
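The first approach in the list above amounts to simple fixed-length chunking, sketched below. The segment size of 1400 bytes is an assumption chosen to stay under a 1.5 KB MTU with room for headers; `segment_stream` is a hypothetical name.

```python
def segment_stream(data: bytes, segment_size: int = 1400):
    """Cut a byte stream into fixed-length segments; the last one may be shorter."""
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


segments = segment_stream(bytes(3000))
print([len(s) for s in segments])  # [1400, 1400, 200]
```

Note that the cut points ignore NALU boundaries, so losing one segment can damage more than one NALU.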

H.264 offers a better error-resilience scheme for network transmission than MPEG-4. H.264 consists of the VCL (video coding layer) and the NAL (network abstraction layer).

IP networks can be divided into three types:

    1. Uncontrolled IP networks (such as the Internet)
    2. Controlled IP networks (e.g. a WAN)
    3. Wireless IP networks (e.g. a 3G network)

These three kinds of IP network differ in MTU, bit error probability, and use of TCP. The MTU between two IP nodes varies dynamically; it is usually assumed that the MTU of a wired IP network is 1.5 KB and that the MTU of a wireless IP network ranges from 100 bytes to 500 bytes.

It is recommended to use the NALU as the basic unit when sending UDP packets, which makes marking at the application layer easy. RTP is recommended for packetization: one NALU is placed in one RTP packet, with the NALU (including the sync header) as the RTP payload, and the RTP header fields are set accordingly. Because packets may travel along different paths, the receiver needs to reorder them, and the sequence number carried in the RTP header can be used for this.
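The packetization and reordering described above can be sketched as follows. The 12-byte header layout is the fixed RTP header (version 2, no padding, no extension, no CSRC entries). The payload type 96, the SSRC value, and the helper names are assumptions for illustration, and the payload keeps the prefix code because the text says the sync header is included. Sequence number wrap-around is not handled.

```python
import struct


def build_rtp_packet(nalu, seq, timestamp, ssrc, payload_type=96, marker=0):
    """Prepend a fixed 12-byte RTP header to one NALU."""
    byte0 = 2 << 6  # version 2, no padding, no extension, CSRC count 0
    byte1 = ((marker & 1) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + nalu


def reorder(packets):
    """Sort received packets by the 16-bit sequence number in bytes 2 and 3."""
    return sorted(packets, key=lambda p: struct.unpack("!H", p[2:4])[0])


# Hypothetical sample: three NALUs that arrive out of order.
nalus = [b"\x00\x00\x01\x67\x42", b"\x00\x00\x01\x68\xce", b"\x00\x00\x01\x65\x88"]
sent = [build_rtp_packet(n, seq=i, timestamp=90000, ssrc=0x1234) for i, n in enumerate(nalus)]
received = [sent[2], sent[0], sent[1]]
payloads = [p[12:] for p in reorder(received)]
print(payloads == nalus)  # True
```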

Do not arbitrarily discard UDP packets that are larger than the MTU. They can be sent directly: the network driver layer fragments and reassembles them, and the application layer is not affected. Even if the network drops UDP data, what is lost is a complete packet.

Packet dropping policy before the decoder

The player needs to drop packets in the following circumstances:

    1. CPU overload or network jitter causes the player's front-end stream buffer to overflow
    2. The application layer detects serious packet loss on the current network
    3. Player scheduling requires packets to be dropped

Current general method of active packet dropping

    1. Whenever possible, send all received stream data to the decoder; do not lightly discard stream data that has already been received.
    2. If the player must actively drop packets in the cases above, it should discard data starting from the current frame and continue discarding until the next IDR frame.
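Rule 2 above can be sketched as a filter over a list of pending NALUs. Keeping SPS and PPS while dropping is an assumption added here so that the decoder still receives the parameter sets in front of the IDR frame; `drop_until_idr` is a hypothetical name.

```python
def drop_until_idr(nalus):
    """Discard slices until the next IDR slice; keep SPS/PPS seen on the way."""
    kept, dropping = [], True
    for nalu in nalus:
        nalu_type = nalu[0] & 0x1F
        if dropping and nalu_type == 5:
            dropping = False  # decoding can restart cleanly from here
        if not dropping or nalu_type in (7, 8):
            kept.append(nalu)
    return kept


# Hypothetical sample: two P slices, then SPS, PPS, IDR slice, P slice.
pending = [b"\x41\x9a", b"\x41\x9b", b"\x67\x42", b"\x68\xce", b"\x65\x88", b"\x41\x9c"]
print([n.hex() for n in drop_until_idr(pending)])
```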

This material is drawn from the PDF documentation of the HiSilicon H.264 encoding library.
