Streaming media basics: how to obtain the PTS from H264 data?
Preface:
I only outline the key points here. The more specific methods are not described in detail.
My open-source project and many other open-source projects contain detailed and complete implementation code.
I summarized these points myself and do not guarantee their correctness. For reference only.
If you find any problem, please point it out; corrections from experts are very welcome. Orz
Body:
Raw H264 ES data usually exists in NAL (Network Abstraction Layer) format, which can be used directly for file storage and network transmission. Each NALU (Network Abstraction Layer Unit) consists of a header followed by RBSP data.
First, you need to split the data stream into individual NALUs.
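A minimal sketch of the splitting step, assuming the stream uses Annex B framing, where NALUs are separated by 00 00 01 or 00 00 00 01 start codes. The helper names find_start_code and next_nalu are hypothetical and not taken from any particular project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the next 00 00 01 start code
 * at or after `pos`, or `len` if there is none. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    assert(buf != NULL);
    for (; pos + 3 <= len; pos++) {
        if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
    }
    return len;
}

/* Hypothetical helper: starting the search at `pos`, find the next NALU.
 * On success it stores the offset of the first byte after the start code
 * and the NALU length, and returns 1; it returns 0 when no NALU is left. */
static int next_nalu(const uint8_t *buf, size_t len, size_t pos,
                     size_t *nal_off, size_t *nal_len)
{
    size_t sc = find_start_code(buf, len, pos);
    if (sc == len)
        return 0;
    size_t begin = sc + 3;                        /* skip 00 00 01 */
    size_t end = find_start_code(buf, len, begin);
    /* A 4-byte start code (00 00 00 01) leaves one zero byte in front of
     * the next 00 00 01; drop such trailing zeros from this NALU. */
    while (end > begin && end < len && buf[end - 1] == 0)
        end--;
    *nal_off = begin;
    *nal_len = end - begin;
    return 1;
}
```

To walk a whole buffer, call next_nalu repeatedly, passing nal_off + nal_len of the previous result as the next pos, until it returns 0.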
Then obtain the nal_type of each NALU. An i_nal_type value of 0x7 indicates that the NALU is an SPS. Find and parse the SPS, which contains very important frame rate information:
time_scale / num_units_in_tick = fps
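As a small illustration, the NALU type sits in the low 5 bits of the one-byte NALU header, and the ratio above is a plain division once time_scale and num_units_in_tick have been parsed from the SPS (they live in the optional VUI timing information). The function names below are hypothetical, and the SPS parsing itself is assumed to have been done elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* The NALU header is one byte: forbidden_zero_bit (1 bit),
 * nal_ref_idc (2 bits), nal_unit_type (5 bits). */
static int nal_unit_type(uint8_t nal_header)
{
    return nal_header & 0x1F;
}

/* Hypothetical helper: the ratio described in the text, computed from
 * the already-parsed SPS timing fields. */
static double sps_rate(uint32_t time_scale, uint32_t num_units_in_tick)
{
    assert(num_units_in_tick != 0);
    return (double)time_scale / (double)num_units_in_tick;
}
```

For example, a header byte of 0x67 yields type 7 (SPS) and 0x65 yields type 5 (IDR slice).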
Then use nal_type to identify slices (a slice in H264 is similar to the concept of a video frame). A nal_type value from 0x1 to 0x5 indicates that the NALU carries slice data; a value smaller than 0x1 or greater than 0x5 means it is not a slice.
- // Check whether it is a slice
- if (i_nal_type >= 1 /* NAL_SLICE */ && i_nal_type <= 5 /* NAL_SLICE_IDR */)
- {
-     // Found a slice
- }
After finding a slice NALU, you can AND the bytes of the NALU data with 0x80. If the result is non-zero, it marks a frame boundary: the end of one video frame and the start of the next.
- // Determine whether the frame ends
- for (uint32_t i = 3; i < nal_length; i++)
- {
-     if (p_nal[i] & 0x80)
-     {
-         // Found frame_begin: end of the previous frame, start of the next frame
-     }
- }
The code above is excerpted from FFmpeg. What it actually does is test first_mb_in_slice in the slice header, that is, the position of the first macroblock in the slice. If the slice begins a frame, this field must be 0, meaning the slice starts at the first macroblock. Alternatively, you can parse the slice header completely and read first_mb_in_slice; if it is 0 (note: this is an Exp-Golomb coded value), the NALU is the beginning of a frame.
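The link between first_mb_in_slice and 0x80 can be sketched as follows. first_mb_in_slice is the first field of the slice header and is coded as unsigned Exp-Golomb ue(v), where the value 0 is the single bit 1, so the top bit of the first byte after the NALU header is set exactly when first_mb_in_slice is 0. The bit_reader type and function names are hypothetical, and this sketch does not strip emulation-prevention bytes (00 00 03), which is an assumption that is only safe for the first few header bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit reader over a slice payload (the bytes after
 * the one-byte NALU header). Emulation-prevention bytes are not removed. */
typedef struct {
    const uint8_t *data;
    size_t size;      /* payload size in bytes */
    size_t bit_pos;   /* next bit to read, counted from the MSB of data[0] */
} bit_reader;

static unsigned read_bit(bit_reader *br)
{
    assert(br->bit_pos < br->size * 8);
    unsigned bit = (br->data[br->bit_pos / 8] >> (7 - br->bit_pos % 8)) & 1u;
    br->bit_pos++;
    return bit;
}

/* Decode one unsigned Exp-Golomb value ue(v): count leading zero bits,
 * consume the terminating 1, then read that many suffix bits. */
static uint32_t read_ue(bit_reader *br)
{
    unsigned zeros = 0;
    while (read_bit(br) == 0) {
        zeros++;
        assert(zeros < 32);
    }
    uint32_t suffix = 0;
    for (unsigned i = 0; i < zeros; i++)
        suffix = (suffix << 1) | read_bit(br);
    return (1u << zeros) - 1u + suffix;
}

/* Returns 1 when first_mb_in_slice == 0, i.e. the slice begins a frame. */
static int slice_starts_frame(const uint8_t *payload, size_t size)
{
    bit_reader br = { payload, size, 0 };
    return read_ue(&br) == 0;
}
```

Calling read_ue a second time on the same reader would return the next header field, slice_type, which is also ue(v) coded.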
Why does the code test against 0x80 byte by byte? To quote a saying of mine: a programmer is not a book of a hundred thousand whys and not Wikipedia; a programmer is driven by requirements. Anyone who has started studying how to parse the slice header format will naturally not have this question.
In addition, nal_type and slice_type can also be used to determine where a frame ends. The code in VLC does this.
Once you have found the NALU at the frame boundary, you know where each frame (slice) starts and ends. Next, parse the slice_type of the slice. From slice_type you can determine whether the slice is of type I, P, or B.
- // Determine the frame type based on the slice type
- switch (slice.i_slice_type)
- {
-     case 2: case 7:
-     case 4: case 9:
-         *p_flags = 0x0002 /* BLOCK_FLAG_TYPE_I */;
-         break;
-     case 0: case 5:
-     case 3: case 8:
-         *p_flags = 0x0004 /* BLOCK_FLAG_TYPE_P */;
-         break;
-     case 1:
-     case 6:
-         *p_flags = 0x0008 /* BLOCK_FLAG_TYPE_B */;
-         break;
-     default:
-         *p_flags = 0;
-         break;
- }
From now on, there are two ways to calculate PTS.
Method 1: From the I/P/B types of the preceding and following frames, you can work out the actual display order of the frames, and then calculate the PTS from the frame rate obtained from the SPS and the frame count (frame_count). This method needs to buffer several frames (generally the length of one group).
I  P  B  B  I  P  B  B  I  P   B   ...  frame type
1  2  3  4  5  6  7  8  9  10  11  ...  frame number
1  4  2  3  5  8  6  7  9  12  10  ...  frame display order
A group runs from one I frame up to the next I frame.
As you can see, a P frame is displayed after the last B frame that follows it.
Therefore, to obtain the PTS of the 7th frame, you must at least know the type of the next frame before you can determine its display order.
PTS of the 8th frame = 1000 (milliseconds) * 7 (display order of the frame) / frame rate
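Method 1 could be sketched roughly like this, assuming the simple pattern in the table above, where every I or P frame is held back until the B frames that follow it have been displayed. The function names are hypothetical, and streams with more complex reference structures would need more than this. The PTS follows the text's convention of using the 1-based display position directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assign 1-based display positions to frames given in stream order.
 * types[i] is 'I', 'P' or 'B'. Assumes the simple pattern from the text:
 * a reference frame (I or P) is shown after the B frames that follow it. */
static void display_order(const char *types, size_t count, uint32_t *order)
{
    uint32_t next = 1;          /* next display position to hand out */
    size_t pending = count;     /* index of the held-back I/P frame, if any */

    for (size_t i = 0; i < count; i++) {
        assert(types[i] == 'I' || types[i] == 'P' || types[i] == 'B');
        if (types[i] == 'B') {
            order[i] = next++;              /* B frames are shown at once */
        } else {
            if (pending != count)
                order[pending] = next++;    /* release the previous I/P */
            pending = i;                    /* hold this one back */
        }
    }
    if (pending != count)
        order[pending] = next;              /* flush the last I/P frame */
}

/* PTS in milliseconds: display position times the frame duration
 * (1000 / fps), following the convention used in the text. */
static double pts_ms_from_order(uint32_t display_pos, double fps)
{
    assert(fps > 0.0);
    return 1000.0 * (double)display_pos / fps;
}
```

Because a held-back frame only gets its position when the next I or P frame arrives, the caller has to buffer frames, which is the caching cost mentioned for this method.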
Method 2: Each slice header records pic_order_cnt_lsb, which gives the display order of the current frame within its group. With pic_order_cnt_lsb you can calculate the PTS of the current frame directly. This method does not require frame buffering.
Calculation formula:
pts = 1000 * (i_frame_counter + pic_order_cnt_lsb) / (time_scale / num_units_in_tick)
i_frame_counter is the frame number of the most recent I frame. The I frame count plus the frame's order within the current group gives the frame's display position; dividing by the frame rate and multiplying by 1000 (milliseconds, the base_clock, that is, the base clock frequency) gives the PTS.
I  P  B  B  I  P  B  B  I  P   B   ...  frame type
1  2  3  4  5  6  7  8  9  10  11  ...  frame number
1  4  2  3  5  8  6  7  9  12  10  ...  frame display order
0  6  2  4  0  6  2  4  0  6   2   ...  pic_order_cnt_lsb
Note that pic_order_cnt_lsb in the slice increases in steps of 2.
Generally, the frame rate recorded in the SPS in H264 is twice the actual frame rate: time_scale / num_units_in_tick = fps * 2.
Therefore, the actual formula is as follows:
pts = 1000 * (i_frame_counter * 2 + pic_order_cnt_lsb) / (time_scale / num_units_in_tick)
Or
pts = 1000 * (i_frame_counter + pic_order_cnt_lsb / 2) / (time_scale / num_units_in_tick / 2)
Therefore, the PTS of the 11th frame should be calculated like this:
1000 * (9 * 2 + 2) / (time_scale / num_units_in_tick)
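A minimal sketch of Method 2 in integer arithmetic, assuming (as described above) that pic_order_cnt_lsb advances by 2 per frame and that time_scale / num_units_in_tick is twice the frame rate. The function name is hypothetical, and wrap-around of pic_order_cnt_lsb is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* PTS in milliseconds from the frame number of the most recent I frame
 * and the slice's pic_order_cnt_lsb. Assumes pic_order_cnt_lsb advances
 * by 2 per frame and time_scale / num_units_in_tick == 2 * fps.
 * Wrap-around of pic_order_cnt_lsb is not handled in this sketch. */
static uint64_t pts_ms_from_poc(uint32_t i_frame_counter,
                                uint32_t pic_order_cnt_lsb,
                                uint32_t time_scale,
                                uint32_t num_units_in_tick)
{
    assert(time_scale != 0);
    uint64_t ticks = (uint64_t)i_frame_counter * 2 + pic_order_cnt_lsb;
    /* ticks / (time_scale / num_units_in_tick) seconds, expressed in
     * milliseconds; multiply first to keep integer precision. */
    return 1000u * ticks * num_units_in_tick / time_scale;
}
```

With time_scale = 50 and num_units_in_tick = 1 (that is, 25 fps), the 11th frame in the table (i_frame_counter = 9, pic_order_cnt_lsb = 2) comes out at 400 ms, which is display position 10 times 40 ms.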
Conclusion:
Here the base_clock of the PTS is 1000 (milliseconds). When muxing into TS, the base_clock is 90 kHz (90000), so the result should be multiplied by 90.
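The clock conversion can be illustrated as below. The first helper just scales a millisecond PTS by 90; the second takes the base clock as a parameter, so the same tick count can be expressed in either unit. Both function names are hypothetical, and the tick count is assumed to be i_frame_counter * 2 + pic_order_cnt_lsb as in the formula above.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a PTS in milliseconds to 90 kHz TS clock units. */
static uint64_t pts_ms_to_90khz(uint64_t pts_ms)
{
    return pts_ms * 90u;
}

/* Hypothetical generalisation: compute the PTS directly in units of
 * base_clock (1000 for milliseconds, 90000 for TS) from a tick count. */
static uint64_t pts_from_ticks(uint64_t base_clock, uint64_t ticks,
                               uint32_t time_scale,
                               uint32_t num_units_in_tick)
{
    assert(time_scale != 0);
    return base_clock * ticks * num_units_in_tick / time_scale;
}
```

Computing directly in 90 kHz units avoids the rounding that can occur when going through whole milliseconds first.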
Digression: In H264, the frame rate recorded in the SPS is twice the actual frame rate, and pic_order_cnt_lsb in the slice is doubled as well. My guess is that this is because coding may be done per field (top field and bottom field). I also noticed the offset_for_top_to_bottom_field field in the SPS; it seems it could be used to indicate whether coding is frame by frame or by odd/even fields. All of the above is guesswork. Please help clear up my confusion. Orz
This article is from the "C + Detective de Pants" blog, please be sure to keep this source http://70565912.blog.51cto.com/1358202/533736