H.264 bitstream and frame structure

Source: Internet
Author: User

Hierarchical structure of H264 elements

In the bitstream produced by the encoder, every bit belongs to some syntax element. Syntax elements are organized into a hierarchy in which each level describes one level of information.

In H.264, the syntax elements are organized into five levels: sequences, pictures, slices, macroblocks, and sub-macroblocks. In a conventional layered structure, the header of each layer and its data part are strongly dependent on each other, as manager and managed: the syntax elements in the header are the core of that layer's data, and once the header is lost the data part is almost impossible to decode correctly, especially at the sequence layer and the picture layer.

The biggest difference in the H.264 hierarchy is that the sequence layer and the picture layer are eliminated. Most of the syntax elements that used to belong to the sequence header and the picture header are separated out to form two levels of parameter sets, the sequence parameter set and the picture parameter set, and the remainder is placed in the slice layer.
A parameter set is an independent data unit that does not depend on any syntax element outside the parameter set. A parameter set does not correspond to one particular picture or sequence: the same sequence parameter set can be referenced by several picture parameter sets, and likewise the same picture parameter set can be referenced by several pictures. The encoder emits a new parameter set only when it considers it necessary to update the contents of a parameter set.
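The referencing scheme described above can be sketched as a pair of lookup tables. This is only a minimal illustration, not a decoder implementation: the class name `ParameterSetStore`, its methods, and the parameter dictionaries (`"width"`, `"init_qp"`) are hypothetical; only the identifiers `seq_parameter_set_id` and `pic_parameter_set_id` come from the standard.

```python
class ParameterSetStore:
    """Hypothetical store illustrating SPS/PPS referencing by identifier."""

    def __init__(self):
        self.sps = {}  # seq_parameter_set_id -> SPS contents
        self.pps = {}  # pic_parameter_set_id -> (seq_parameter_set_id, PPS contents)

    def put_sps(self, sps_id, params):
        # A newly received SPS replaces any earlier one with the same id.
        self.sps[sps_id] = params

    def put_pps(self, pps_id, sps_id, params):
        # A PPS names the SPS it refers to; it does not copy it.
        self.pps[pps_id] = (sps_id, params)

    def resolve(self, pps_id):
        """Return (SPS contents, PPS contents) for a picture that names pps_id."""
        sps_id, pps_params = self.pps[pps_id]
        return self.sps[sps_id], pps_params


store = ParameterSetStore()
store.put_sps(0, {"width": 176})          # illustrative values only
store.put_pps(0, 0, {"init_qp": 26})      # two PPSs share one SPS
store.put_pps(1, 0, {"init_qp": 30})

sps_a, _ = store.resolve(0)
sps_b, _ = store.resolve(1)
print(sps_a is sps_b)                     # both pictures see the same SPS

store.put_sps(0, {"width": 720})          # the encoder updates the SPS
print(store.resolve(1)[0]["width"])       # later pictures see the new contents
```

Because pictures carry only an identifier, losing one slice does not lose the sequence-level or picture-level parameters, which is the robustness argument made above.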

Data units that may appear in the bitstream in a complex communication scenario:

IDR: In H.264, pictures are organized into sequences. The first picture of a sequence is called an IDR picture (Instantaneous Decoding Refresh), and every IDR picture is an I picture. IDR pictures exist so that the decoder can resynchronize: when the decoder reaches an IDR picture, it immediately empties the reference frame queue, outputs or discards all decoded data, looks up the parameter sets again, and starts a new sequence. If a serious error occurred in the previous sequence, this gives the decoder a chance to resynchronize. Pictures after an IDR picture are never decoded using data from pictures before it. An IDR picture must be an I picture, but an I picture is not necessarily an IDR picture: pictures after an ordinary I picture may still use pictures before that I picture as motion references.
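The difference between an IDR picture and an ordinary I picture can be illustrated with a toy model of the reference frame queue. This is a simplified sketch, not decoder logic: the function name `reference_queue_history` is hypothetical, every picture is assumed to be kept as a reference, and the limit on the queue length is ignored.

```python
def reference_queue_history(picture_types):
    """For each picture, list the indices of earlier pictures still in the
    reference queue at the moment that picture is decoded."""
    queue = []      # indices of pictures currently available as references
    history = []
    for index, kind in enumerate(picture_types):
        if kind == "IDR":
            queue = []              # IDR: empty the reference frame queue
        history.append(list(queue))
        queue.append(index)         # simplification: every picture becomes a reference
    return history


history = reference_queue_history(["IDR", "P", "I", "P", "IDR", "P"])
for index, refs in enumerate(history):
    print(index, refs)
```

In this model the P picture at index 3 can still see pictures 0 and 1 from before the ordinary I picture at index 2, while the P picture at index 5 sees only the IDR picture at index 4.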

H.264 bitstream structure

1. H.264 layered structure

The bitstream structure defined by H.263 is a hierarchy with four layers. From top to bottom they are: the picture layer, the group-of-blocks layer (GOB layer), the macroblock layer, and the block layer. Compared with H.263, the bitstream structure of H.264 is very different and is no longer a strict hierarchy. Functionally, H.264 is divided into two layers: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). VCL data is the compressed, encoded video data sequence. VCL data can be transmitted or stored only after it has been encapsulated in NAL units. The NAL unit format [2] is shown in Table 1:
Table 1. NAL unit format
NAL header | RBSP | NAL header | RBSP
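As a small illustration of the NAL header in Table 1, the sketch below splits the one-byte header into its three fields. The field names `forbidden_zero_bit`, `nal_ref_idc`, and `nal_unit_type` follow the standard's syntax; the function name `parse_nal_header` is only an illustrative choice.

```python
def parse_nal_header(first_byte):
    """Split the one-byte NAL header into its three fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,  # 1 bit, must be 0
        "nal_ref_idc": (first_byte >> 5) & 0x03,         # 2 bits, 0 means not used for reference
        "nal_unit_type": first_byte & 0x1F,              # 5 bits, type of the RBSP that follows
    }


# 0x67 is the first byte after the start code in the captured data in section 3.1.
print(parse_nal_header(0x67))
```

For 0x67 the low five bits are 7, which identifies a sequence parameter set.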
RBSP: The data encapsulated in a NAL unit is called the raw byte sequence payload (RBSP); it is the basic transmission unit of the NAL. RBSP data is divided into coded video data and control data. Its basic structure is the raw encoded data followed by trailing bits: one "1" bit and then as many "0" bits as are needed for byte alignment. RBSP data comes in several types.
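The trailing-bit rule can be sketched as follows. This is a minimal illustration that represents the payload as a string of "0" and "1" characters for readability; the function names are hypothetical, and a real encoder would work on a bit buffer instead.

```python
def add_rbsp_trailing_bits(payload_bits):
    """Append one '1' bit, then '0' bits up to the next byte boundary."""
    bits = payload_bits + "1"              # the stop bit
    bits += "0" * (-len(bits) % 8)         # zero bits for byte alignment
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def strip_rbsp_trailing_bits(rbsp):
    """Recover the payload bits: drop the last '1' bit and everything after it."""
    bits = "".join(f"{byte:08b}" for byte in rbsp)
    return bits[: bits.rindex("1")]


rbsp = add_rbsp_trailing_bits("10110")
print(rbsp.hex())                          # 5 payload bits + stop bit + 2 zero bits
print(strip_rbsp_trailing_bits(rbsp))
```

Because the stop bit is always present, a payload that already ends on a byte boundary still gains one extra byte (0x80).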
Parameter sets (PS) are one RBSP type. They include the sequence parameter set (SPS) and the picture parameter set (PPS).
The SPS contains parameters that apply to a whole coded video sequence, such as the identifier seq_parameter_set_id, constraints on the frame number and POC, the number of reference frames, the decoded picture size, and the flag that selects frame or field coding mode.
The PPS applies to one picture or several pictures in a sequence. Its parameters include the identifier pic_parameter_set_id, the seq_parameter_set_id of the SPS it refers to, the entropy coding mode flag, the number of slice groups, the initial quantization parameters, and the flag for adjusting the deblocking filter coefficients.

NALU type

nal_unit_type identifies the type of RBSP data carried in the NAL unit. NAL units whose nal_unit_type is 1, 2, 3, 4, or 5 are called VCL NAL units; all other types, including filler data (12), are non-VCL NAL units.
0: Unspecified
1: Coded slice of a non-IDR picture, without data partitioning
2: Coded slice data partition A of a non-IDR picture
3: Coded slice data partition B of a non-IDR picture
4: Coded slice data partition C of a non-IDR picture
5: Coded slice of an IDR picture
6: Supplemental enhancement information (SEI)
7: Sequence parameter set
8: Picture parameter set
9: Access unit delimiter
10: End of sequence
11: End of stream
12: Filler data
13–23: Reserved
24–31: Unspecified

2. Bitstream structure diagram

Summarizing the material reviewed above, the bitstream structure of H.264 [2] is shown in Figure 1:

Figure 1. The bitstream structure of H.264
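To make the structure concrete, the sketch below walks a byte stream, splits it into NAL units, and classifies each one by nal_unit_type. It assumes the stream uses the byte-stream format in which every NAL unit is preceded by a 00 00 01 or 00 00 00 01 start code, as the captured data in section 3.1 suggests. The function names are hypothetical, and the PPS and IDR slice bytes in the example stream are made-up placeholders; only the SPS bytes come from this article.

```python
import re

NAL_TYPE_NAMES = {
    1: "coded slice of a non-IDR picture",
    5: "coded slice of an IDR picture",
    6: "SEI",
    7: "sequence parameter set",
    8: "picture parameter set",
    9: "access unit delimiter",
}


def split_nal_units(stream):
    """Split a start-code-delimited byte stream into NAL units."""
    parts = re.split(b"\x00\x00\x01", stream)
    # The extra zero byte of a 4-byte start code ends up at the tail of the
    # preceding part, so trailing zero bytes are stripped.
    units = [part.rstrip(b"\x00") for part in parts[1:]]
    return [unit for unit in units if unit]


def describe(stream):
    """Return (nal_unit_type, name, is_vcl) for every NAL unit in the stream."""
    result = []
    for unit in split_nal_units(stream):
        nal_type = unit[0] & 0x1F
        name = NAL_TYPE_NAMES.get(nal_type, "other")
        result.append((nal_type, name, 1 <= nal_type <= 5))
    return result


stream = (
    b"\x00\x00\x00\x01\x67\x42\x00\x1e\x99\xa0\xb1\x31"  # SPS from section 3.1
    b"\x00\x00\x00\x01\x68\xce\x38\x80"                  # placeholder PPS bytes
    b"\x00\x00\x01\x65\x88\x84"                          # placeholder IDR slice bytes
)
for entry in describe(stream):
    print(entry)
```

The sketch does not remove emulation prevention bytes, which is enough for reading the header byte but not for decoding the payload of an arbitrary NAL unit.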


3. Application of H.264 bitstream analysis

In some cases, information such as the picture width and height can be obtained directly from an H.264 stream. The following describes how to obtain it. Picture information is stored in the RBSP structure of the Network Abstraction Layer (NAL), so to obtain it we need to read the relevant bits. From the RBSP structure, read the two values pic_width_in_mbs_minus1 and pic_height_in_map_units_minus1; the width is then (pic_width_in_mbs_minus1 + 1) * 16 and the height is (pic_height_in_map_units_minus1 + 1) * 16. In some cases the value of num_ref_frames, which is typically 1, also has to be considered.

3.1 Obtaining test data

Device: Sunnic (IP cam). Name: St100. Factory firmware version: P8B8. Video format: H.264.

(1) With the device resolution set to 176*144, a set of data is captured with Ethereal or a similar packet-capture tool. After the corresponding RTP header is removed, the data is 0x00,0x00,0x00,0x01,0x67,0x42,0x00,0x1e,0x99,0xa0,0xb1,0x31.

(2) With the device resolution set to 720*240, a set of data is captured in the same way. After the corresponding RTP header is removed, the data is 0x00,0x00,0x00,0x01,0x67,0x42,0xe0,0x1e,0xda,0x82,0xd1,0xf1.

(3) With the device resolution set to 720*480, a set of data is captured in the same way. After the corresponding RTP header is removed, the data is 0x00,0x00,0x00,0x01,0x67,0x42,0xe0,0x1e,0xdb,0x82,0xd1,0xf1.
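The procedure can be tried on the captured bytes with a short sketch. It is written under several assumptions rather than as a complete SPS parser: the packet starts with a 4-byte start code, the profile is Baseline, Main, or Extended (profile_idc 66, 77, or 88, so the SPS has no chroma-related fields), pic_order_cnt_type is 0 or 2, and no emulation prevention bytes occur before the size fields. The names `BitReader` and `parse_sps` are illustrative; the syntax element names in the comments follow the standard.

```python
class BitReader:
    """Reads bits most-significant-bit first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def bit(self):
        value = (self.data[self.pos // 8] >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return value

    def bits(self, count):
        value = 0
        for _ in range(count):
            value = (value << 1) | self.bit()
        return value

    def ue(self):
        """Unsigned Exp-Golomb code: N zero bits, a one bit, then N more bits."""
        zeros = 0
        while self.bit() == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.bits(zeros)


def parse_sps(packet):
    nal = packet[4:]                      # drop the start code 00 00 00 01
    if nal[0] & 0x1F != 7:
        raise ValueError("not a sequence parameter set")
    if nal[1] not in (66, 77, 88):
        raise NotImplementedError("profile with extra SPS fields")
    reader = BitReader(nal[4:])           # skip header, profile_idc, constraint flags, level_idc
    reader.ue()                           # seq_parameter_set_id
    reader.ue()                           # log2_max_frame_num_minus4
    poc_type = reader.ue()                # pic_order_cnt_type
    if poc_type == 0:
        reader.ue()                       # log2_max_pic_order_cnt_lsb_minus4
    elif poc_type == 1:
        raise NotImplementedError("pic_order_cnt_type 1 is not handled in this sketch")
    num_ref_frames = reader.ue()
    reader.bit()                          # gaps_in_frame_num_value_allowed_flag
    width = (reader.ue() + 1) * 16        # pic_width_in_mbs_minus1
    height = (reader.ue() + 1) * 16       # pic_height_in_map_units_minus1
    return {"num_ref_frames": num_ref_frames, "width": width, "height": height}


SAMPLE_176x144 = bytes([0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00, 0x1E, 0x99, 0xA0, 0xB1, 0x31])
SAMPLE_720x240 = bytes([0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xE0, 0x1E, 0xDA, 0x82, 0xD1, 0xF1])
SAMPLE_720x480 = bytes([0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xE0, 0x1E, 0xDB, 0x82, 0xD1, 0xF1])

for sample in (SAMPLE_176x144, SAMPLE_720x240, SAMPLE_720x480):
    print(parse_sps(sample))
```

For the first two captures the two formulas give 176 by 144 and 720 by 240 with num_ref_frames equal to 1. For the third capture the same two fields decode to 720 by 240 while num_ref_frames is 2, which appears to be the case the caveat about num_ref_frames refers to; the sketch reports the value but does not apply any correction.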
