H264 ES raw data is generally stored in NAL (Network Abstraction Layer) format, which can be used directly for file storage and network transport. Each NALU (Network Abstraction Layer Unit) consists of a header followed by RBSP data.
The first step is to split the data stream into individual NALUs.
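The splitting step can be sketched roughly as follows. This assumes the ES uses Annex B start codes (00 00 01 or 00 00 00 01) between NALUs, which the text above does not state explicitly; next_nalu_start and split_nalus are hypothetical helper names, not taken from FFmpeg or VLC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: offset of the first byte after the next 00 00 01
 * start code found at or after pos, or -1 if there is none. */
long next_nalu_start(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return (long)(i + 3);
    }
    return -1;
}

/* Hypothetical helper: records the offset and length of up to max_nalus
 * NALUs (start codes excluded) and returns how many were found. */
size_t split_nalus(const uint8_t *buf, size_t len,
                   size_t *offsets, size_t *lengths, size_t max_nalus)
{
    size_t count = 0;
    long start = next_nalu_start(buf, len, 0);

    while (start >= 0 && count < max_nalus) {
        long next = next_nalu_start(buf, len, (size_t)start);
        size_t end = (next < 0) ? len : (size_t)next - 3;

        /* A 4-byte start code leaves an extra zero byte in front of it.
         * A NALU never ends in 0x00, so trailing zeros can be trimmed. */
        while (end > (size_t)start && buf[end - 1] == 0)
            end--;

        offsets[count] = (size_t)start;
        lengths[count] = end - (size_t)start;
        count++;
        start = next;
    }
    return count;
}
```

Each offsets[i] then points at the NALU header byte, which is what the nal_type checks in the next steps look at.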
Read the nal_type of each NALU. If i_nal_type equals 0x7, the NALU is an SPS. Find and parse this SPS, because it contains the very important frame rate information:
time_scale / num_units_in_tick = fps
Then use nal_type to identify slices (a slice in H264 is roughly similar to a video frame). A nal_type value from 0x1 to 0x5 inclusive indicates that the NALU belongs to a slice; a value less than 0x1 or greater than 0x5 means it does not.
// Check whether this NALU is a slice
if (i_nal_type >= 1 /* NAL_SLICE */ && i_nal_type <= 5 /* NAL_SLICE_IDR */) {
    // Found a slice
}
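As a small sketch of the two nal_type checks described so far: nal_unit_type occupies the low 5 bits of the first NALU byte. The function names below are made up for illustration.

```c
#include <stdint.h>

/* nal_unit_type is the low 5 bits of the NALU header byte. */
int nal_unit_type(uint8_t header)
{
    return header & 0x1F;
}

/* Type 7 is a sequence parameter set (SPS). */
int nal_is_sps(uint8_t header)
{
    return nal_unit_type(header) == 7;
}

/* Types 1..5 carry slice data (5 is an IDR slice). */
int nal_is_slice(uint8_t header)
{
    int t = nal_unit_type(header);
    return t >= 1 && t <= 5;
}
```

For example, the common header byte 0x67 is an SPS and 0x65 is an IDR slice, while 0x68 (type 8, PPS) is neither.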
After finding a slice NALU, its data can be tested byte by byte with a bitwise AND; a true result marks the end position of the previous slice (video frame).
// Determine whether the frame ends
for (uint32_t i = 3; i < nal_length; i++) {
    if (p_nal[i] & 0x80) {
        // Found the beginning of a frame:
        // the previous frame ends here and the next frame starts
    }
}
The code above is excerpted from FFmpeg. What it actually does is check first_mb_in_slice in the slice header, that is, the position of the first macroblock of the slice. If the slice starts a frame, this field must point at the first macroblock of the frame. Therefore you can also parse the slice header fully and read first_mb_in_slice: if it is 0 (note: this is an Exp-Golomb coded value), this NALU is the beginning of a frame.
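A minimal sketch of reading first_mb_in_slice directly is given below. It relies on the fact that an unsigned Exp-Golomb value ue(v) is coded as n zero bits, a 1 bit, then n suffix bits, so the value 0 is a single 1 bit, and that is the top bit (0x80) of the first slice header byte. The BitReader type and function names are hypothetical, and the sketch ignores emulation prevention bytes, which is an assumption that is only reasonable for this very first field.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t size;     /* in bytes */
    size_t bit_pos;  /* next bit to read, MSB first */
} BitReader;

/* Reads one bit; returns 0 once the buffer is exhausted. */
unsigned read_bit(BitReader *br)
{
    if (br->bit_pos >= br->size * 8)
        return 0;
    unsigned bit = (br->data[br->bit_pos >> 3] >> (7 - (br->bit_pos & 7))) & 1u;
    br->bit_pos++;
    return bit;
}

/* ue(v): count leading zero bits n, then value = 2^n - 1 + next n bits. */
uint32_t read_ue(BitReader *br)
{
    int zeros = 0;
    while (zeros < 31 && read_bit(br) == 0)
        zeros++;

    uint32_t suffix = 0;
    for (int i = 0; i < zeros; i++)
        suffix = (suffix << 1) | read_bit(br);

    return ((1u << zeros) - 1u) + suffix;
}

/* nalu points at the NALU header byte; the slice header follows it. */
uint32_t first_mb_in_slice(const uint8_t *nalu, size_t size)
{
    if (size < 2)
        return UINT32_MAX;  /* too short to contain a slice header */
    BitReader br = { nalu + 1, size - 1, 0 };
    return read_ue(&br);
}
```

With this, first_mb_in_slice(nalu, size) == 0 gives the same answer as testing nalu[1] & 0x80.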
Why does the code here test 0x80 byte by byte? To borrow a few words from an expert: a programmer is not an encyclopedia of whys and not a walking wiki; a programmer is driven by requirements. A programmer who has started studying how to parse the slice header format will naturally not have this question.
In addition, the frame end position can also be determined from nal_type and slice_type; this is how the VLC code does it.
Once the NALU at the end of a frame has been found, you know where each frame (slice) starts and ends. Then parse the slice's slice_type, which tells you whether the slice is of type I, P, or B.
// Determine the frame type from the slice type
switch (slice.i_slice_type) {
case 2: case 7:
case 4: case 9:
    *p_flags = 0x0002; /* BLOCK_FLAG_TYPE_I */
    break;
case 0: case 5:
case 3: case 8:
    *p_flags = 0x0004; /* BLOCK_FLAG_TYPE_P */
    break;
case 1: case 6:
    *p_flags = 0x0008; /* BLOCK_FLAG_TYPE_B */
    break;
default:
    *p_flags = 0;
    break;
}
From here, there are two ways to calculate the PTS.
Method one: from the I/P/B types of the neighboring frames, we can work out the actual display order of each frame, then use the frame rate obtained earlier from the SPS together with the frame count frame_count to calculate the PTS. This method requires buffering a few frames (typically one group's worth).
Frame type:     I  P  B  B  I  P  B  B  I  P  B ...
Frame number:   1  2  3  4  5  6  7  8  9 10 11 ...
Display order:  1  4  2  3  5  8  6  7  9 12 10 ...
The frames from one I-frame up to the next I-frame form a group.
As the table above shows, a P-frame is displayed after the last of the B-frames that follow it.
So to get the PTS of the 7th frame, you need to know at least the type of the frame after it in order to work out its display position.
PTS of the 8th frame = 1000 (ms) * 7 (frame display order) / frame rate
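Method one can be sketched as below, under the simplified pattern shown in the table: B-frames only ever follow a P-frame, and that P-frame is displayed after them. Real streams can be more complicated, so treat this as an illustration only; display_order and pts_ms_from_order are hypothetical names. The PTS is the display position times the frame duration, which is why the position is divided by the frame rate.

```c
#include <stddef.h>
#include <stdint.h>

/* types holds the frame types in stream order ('I', 'P' or 'B').
 * order_out[i] receives the 1-based display position of frame i. */
void display_order(const char *types, size_t n, size_t *order_out)
{
    size_t i = 0;
    while (i < n) {
        if (types[i] == 'P') {
            /* Look ahead: count the B-frames that follow this P-frame. */
            size_t k = 0;
            while (i + 1 + k < n && types[i + 1 + k] == 'B')
                k++;
            order_out[i] = (i + 1) + k;      /* P is shown after its B-frames */
            for (size_t j = 1; j <= k; j++)
                order_out[i + j] = i + j;    /* each B moves one slot earlier */
            i += k + 1;
        } else {
            order_out[i] = i + 1;            /* shown in place */
            i++;
        }
    }
}

/* PTS in milliseconds for a given display position and frame rate. */
uint64_t pts_ms_from_order(size_t order, uint32_t fps)
{
    if (fps == 0)
        return 0;
    return 1000ULL * order / fps;
}
```

The look-ahead inside the loop is the reason this method needs a few frames of buffering before a PTS can be assigned.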
Method two: every slice header records pic_order_cnt_lsb, the display order of the current frame within its group. With pic_order_cnt_lsb you can calculate the PTS of the current frame directly. This method does not require frame buffering.
Calculation formula:
pts = 1000 * (i_frame_counter + pic_order_cnt_lsb) / (time_scale / num_units_in_tick)
i_frame_counter is the frame number at which the most recent I-frame occurred. Adding the I-frame count to the frame's order within the current group gives the frame's actual display position; dividing by the frame rate and multiplying by the base_clock (basic clock frequency) of 1000 (milliseconds) gives the PTS.
Frame type:         I  P  B  B  I  P  B  B  I  P  B ...
Frame number:       1  2  3  4  5  6  7  8  9 10 11 ...
Display order:      1  4  2  3  5  8  6  7  9 12 10 ...
pic_order_cnt_lsb:  0  6  2  4  0  6  2  4  0  6  2 ...
Note that in the table above, the pic_order_cnt_lsb in the slices increases in steps of 2.
The frame rate recorded in the SPS in H264 is usually twice the actual frame rate: time_scale / num_units_in_tick = fps * 2
Therefore, the actual formula should be
pts = 1000 * (i_frame_counter * 2 + pic_order_cnt_lsb) / (time_scale / num_units_in_tick)
or
pts = 1000 * (i_frame_counter + pic_order_cnt_lsb / 2) / (time_scale / num_units_in_tick / 2)
So the PTS of the 11th frame is calculated as
1000 * (9 * 2 + 2) / (time_scale / num_units_in_tick)
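Here is a rough sketch of method two. Because the result is a time, the tick count (i_frame_counter * 2 + pic_order_cnt_lsb) is divided by the tick rate time_scale / num_units_in_tick, which is the same as multiplying by num_units_in_tick / time_scale. It assumes, as described above, that the SPS rate is twice the frame rate and that pic_order_cnt_lsb steps by 2 per frame. pts_in_clock is a hypothetical name, and clock_hz lets the same function produce milliseconds (1000) or a 90 kHz clock (90000).

```c
#include <stdint.h>

/* Returns the PTS in units of 1/clock_hz seconds.
 * i_frame_counter: frame number of the most recent I-frame.
 * pic_order_cnt_lsb: value read from the slice header. */
uint64_t pts_in_clock(uint32_t clock_hz,
                      uint32_t i_frame_counter,
                      uint32_t pic_order_cnt_lsb,
                      uint32_t num_units_in_tick,
                      uint32_t time_scale)
{
    if (time_scale == 0)
        return 0;  /* no usable timing information in the SPS */

    /* Display position counted in ticks (two ticks per frame). */
    uint64_t ticks = (uint64_t)i_frame_counter * 2 + pic_order_cnt_lsb;

    /* Multiply before dividing to keep integer precision. */
    return (uint64_t)clock_hz * ticks * num_units_in_tick / time_scale;
}
```

For the 11th frame in the table (i_frame_counter = 9, pic_order_cnt_lsb = 2), with the illustrative values time_scale = 50 and num_units_in_tick = 1 (25 fps), the tick count is 20, so the PTS is 400 ms, or 36000 on a 90 kHz clock.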
Closing:
The base_clock of the PTS here is 1000 (milliseconds). If the PTS is reused in a TS, the base_clock is 90k, so the value should be multiplied by 90.
Digression: regarding the frame rate recorded in the SPS in H264 being twice the actual frame rate, and the pic_order_cnt_lsb in the slices also increasing in steps of 2, my guess is that this is because of field-based (top field, bottom field) coding. I also noticed the offset_for_top_to_bottom_field field in the SPS, which judging by its name seems to relate to frame versus field (odd/even field) coding. All of this is guesswork; I would welcome clarification from an expert.