Analysis of Timestamps in FFmpeg Stream Transcoding (FFmpeg 0.11.1)

[This article deals with a complex corner of FFmpeg; it was quite an undertaking to write up.]

[Different transcoding setups may take different code paths.]

First, some background:

Timestamps: DTS (decoding time stamp), PTS (presentation time stamp), CTS (composition time stamp).

FFmpeg's internal timestamps are in microseconds (AV_TIME_BASE units). Each stream additionally carries a time_base variable that sets the granularity of its DTS and PTS values, so the raw timestamp numbers can get large.

Among the helpers, av_rescale_q() shows up everywhere. AV_ROUND_NEAR_INF rounds to the nearest value, with ties away from zero. av_rescale_rnd() computes a * b / c on 8-byte (64-bit) integer arguments; to avoid overflow, it compares the operands against INT_MAX and splits the computation into pieces when they are too large.
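As a minimal usage sketch (a standalone example, not a quote from ffmpeg.c): converting a DTS from a 90 kHz stream time_base into AV_TIME_BASE (microsecond) units:

#include <libavutil/avutil.h>        /* AV_TIME_BASE_Q */
#include <libavutil/mathematics.h>   /* av_rescale_q() */

/* 180000 ticks in a 1/90000 time_base (typical for MPEG-TS) is 2 seconds,
 * i.e. 2000000 in AV_TIME_BASE units. */
int64_t example_rescale(void)
{
    AVRational tb_90khz = { 1, 90000 };
    int64_t    dts      = 180000;
    return av_rescale_q(dts, tb_90khz, AV_TIME_BASE_Q);   /* -> 2000000 */
}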

First, look at front-end packet parsing, i.e. the av_read_frame() function:

const int genpts = s->flags & AVFMT_FLAG_GENPTS;

avformat.h documents each of these flags.

/* ffmpeg.c, opt_input_file() */
ic->flags |= AVFMT_FLAG_NONBLOCK;

Next we enter the read_frame_internal() function [discussed at http://www.chinavideo.org/viewthread.php?action=printable&tid=13846]:

while (!got_packet && !s->parse_queue) { ... } // loop until got_packet is set, then return

In the ff_read_packet() function,

ret= s->iformat->read_packet(s, pkt);

the demuxer produces the packet. When

 if(!pktl && st->request_probe <= 0)

holds, the packet is returned directly. Back in read_frame_internal(), st->need_parsing is not set in this case, so execution reaches:

compute_pkt_fields(s, st, NULL, pkt);
got_packet = 1;

At this point, pkt is output as-is.
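Pulling the fragments together, the overall control flow of read_frame_internal() looks roughly like this (a condensed paraphrase for orientation, not the verbatim 0.11.1 source):

/* condensed paraphrase of read_frame_internal(), 0.11.1 era (sketch) */
while (!got_packet && !s->parse_queue) {
    ret = ff_read_packet(s, &cur_pkt);      /* demuxer produces a raw packet */
    st  = s->streams[cur_pkt.stream_index];
    if (!st->need_parsing || !st->parser) {
        /* no parsing needed: fill missing timestamp fields and hand it out */
        compute_pkt_fields(s, st, NULL, &cur_pkt);
        got_packet = 1;
    } else {
        /* otherwise the packet goes through av_parser_parse2() first,
         * and the parsed sub-packets land on s->parse_queue */
    }
}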

Now let's see what transcode() does before parsing and decoding:

if (pkt.dts != AV_NOPTS_VALUE && ist->next_dts != AV_NOPTS_VALUE && !copy_ts) {
    int64_t pkt_dts = av_rescale_q(pkt.dts, ist->st->time_base, AV_TIME_BASE_Q);
    int64_t delta   = pkt_dts - ist->next_dts;
    if (is->iformat->flags & AVFMT_TS_DISCONT) {
        if (delta < -1LL * dts_delta_threshold * AV_TIME_BASE ||
            (delta > 1LL * dts_delta_threshold * AV_TIME_BASE &&
             ist->st->codec->codec_type != AVMEDIA_TYPE_SUBTITLE) ||
            pkt_dts + 1 < ist->pts) {
            input_files[ist->file_index]->ts_offset -= delta;
            pkt.dts -= av_rescale_q(delta, AV_TIME_BASE_Q, ist->st->time_base);
            if (pkt.pts != AV_NOPTS_VALUE)
                pkt.pts -= av_rescale_q(delta, AV_TIME_BASE_Q, ist->st->time_base);
        }
    } else {
        if (delta < -1LL * dts_error_threshold * AV_TIME_BASE ||
            (delta > 1LL * dts_error_threshold * AV_TIME_BASE &&
             ist->st->codec->codec_type != AVMEDIA_TYPE_SUBTITLE) ||
            pkt_dts + 1 < ist->pts) {
            pkt.dts = AV_NOPTS_VALUE;
        }
        if (pkt.pts != AV_NOPTS_VALUE) {
            int64_t pkt_pts = av_rescale_q(pkt.pts, ist->st->time_base, AV_TIME_BASE_Q);
            delta = pkt_pts - ist->next_dts;
            if (delta < -1LL * dts_error_threshold * AV_TIME_BASE ||
                (delta > 1LL * dts_error_threshold * AV_TIME_BASE &&
                 ist->st->codec->codec_type != AVMEDIA_TYPE_SUBTITLE) ||
                pkt_pts + 1 < ist->pts) {
                pkt.pts = AV_NOPTS_VALUE; // fallenink: a pts that lags too far is invalidated here and later refilled from ist->dts
            }
        }
    }
}
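A worked example of the AVFMT_TS_DISCONT branch, with invented numbers (dts_delta_threshold defaults to 10 seconds):

/* Illustration only; all numbers invented.
 * Suppose ist->next_dts = 17000000 (17 s in AV_TIME_BASE units) and the
 * incoming packet rescales to pkt_dts = 2000000 (2 s), e.g. after an
 * MPEG-TS timestamp wrap:
 *     delta = 2000000 - 17000000 = -15000000   (-15 s)
 * Since delta < -1LL * 10 * AV_TIME_BASE, the jump is absorbed:
 *     input_files[...]->ts_offset -= delta;    // later packets shift +15 s
 *     pkt.dts -= av_rescale_q(delta, AV_TIME_BASE_Q, ist->st->time_base);
 * which puts pkt.dts right back at the expected 17 s mark. */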

Next, decoding. [Focus on the dts, next_dts, pts, and next_pts fields of the InputStream structure.]

In output_packet(), saw_first_ts guards a one-time initialization of dts and pts:

if (!ist->saw_first_ts) {
    ist->dts = ist->st->avg_frame_rate.num ? - ist->st->codec->has_b_frames * AV_TIME_BASE / av_q2d(ist->st->avg_frame_rate) : 0;
    ist->pts = 0;
    if (pkt != NULL && pkt->pts != AV_NOPTS_VALUE && !ist->decoding_needed) {
        ist->dts += av_rescale_q(pkt->pts, ist->st->time_base, AV_TIME_BASE_Q);
        ist->pts = ist->dts; //unused but better to set it to a value thats not totally wrong
    }
    ist->saw_first_ts = 1;
}
if (ist->next_dts == AV_NOPTS_VALUE)
    ist->next_dts = ist->dts;
if (ist->next_pts == AV_NOPTS_VALUE)
    ist->next_pts = ist->pts;
if (pkt->dts != AV_NOPTS_VALUE) { // if pkt's dts is invalid, ist->next_dts and ist->dts are not updated here
    ist->next_dts = ist->dts = av_rescale_q(pkt->dts, ist->st->time_base, AV_TIME_BASE_Q);
    if (ist->st->codec->codec_type != AVMEDIA_TYPE_VIDEO || !ist->decoding_needed) // fallenink: "not video" or "stream copy"
        ist->next_pts = ist->pts = av_rescale_q(pkt->dts, ist->st->time_base, AV_TIME_BASE_Q);
}

Next, the audio is stream-copied directly (in my case):

if (!ist->decoding_needed) {
    rate_emu_sleep(ist);
    ist->dts = ist->next_dts;
    switch (ist->st->codec->codec_type) {
    case AVMEDIA_TYPE_AUDIO:
        ist->next_dts += ((int64_t)AV_TIME_BASE * ist->st->codec->frame_size) /
                         ist->st->codec->sample_rate;
        break;
    case AVMEDIA_TYPE_VIDEO:
        //...
    }
    ist->pts = ist->dts;
    ist->next_pts = ist->next_dts;
}
for (i = 0; pkt && i < nb_output_streams; i++) {
    OutputStream *ost = output_streams[i];
    if (!check_output_constraints(ist, ost) || ost->encoding_needed)
        continue;
    do_streamcopy(ist, ost, pkt);
}
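For the AVMEDIA_TYPE_AUDIO branch above, the next_dts increment is simply the wall-clock duration of one audio frame. A quick worked example (frame_size and sample_rate values invented for illustration):

/* Example values, not from the article's stream:
 * frame_size = 1024 samples, sample_rate = 44100 Hz
 *   next_dts += ((int64_t)AV_TIME_BASE * 1024) / 44100 = 23219 µs  (~23.2 ms per packet) */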

Before decoding,

ist->pts = ist->next_pts;
ist->dts = ist->next_dts;

In the decode_video function,

pkt->dts = av_rescale_q(ist->dts, AV_TIME_BASE_Q, ist->st->time_base); // write the input stream's timestamp back into pkt

In avcodec_decode_video2,

ret = avctx->codec->decode(avctx, picture, got_picture_ptr, &tmp);
picture->pkt_dts = avpkt->dts;

If decoding produced a frame:

if (*got_picture_ptr) {
    avctx->frame_number++;
    picture->best_effort_timestamp = guess_correct_pts(avctx,
                                                       picture->pkt_pts,
                                                       picture->pkt_dts);
}

The result comes from guess_correct_pts(); in the common case it ends up returning picture->pkt_dts.
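A paraphrased sketch of its heuristic (reconstructed from memory of the 0.11-era libavcodec/utils.c, not verbatim): it counts how often the pts and dts series each fail to increase monotonically, then trusts whichever series has misbehaved less.

static int64_t guess_correct_pts_sketch(AVCodecContext *ctx,
                                        int64_t reordered_pts, int64_t dts)
{
    int64_t pts = AV_NOPTS_VALUE;

    if (dts != AV_NOPTS_VALUE) {
        ctx->pts_correction_num_faulty_dts += dts <= ctx->pts_correction_last_dts;
        ctx->pts_correction_last_dts = dts;
    }
    if (reordered_pts != AV_NOPTS_VALUE) {
        ctx->pts_correction_num_faulty_pts += reordered_pts <= ctx->pts_correction_last_pts;
        ctx->pts_correction_last_pts = reordered_pts;
    }
    if ((ctx->pts_correction_num_faulty_pts <= ctx->pts_correction_num_faulty_dts ||
         dts == AV_NOPTS_VALUE) && reordered_pts != AV_NOPTS_VALUE)
        pts = reordered_pts;     /* the pts series looks trustworthy */
    else
        pts = dts;               /* fall back to the dts series */

    return pts;
}

Back in decode_video():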

best_effort_timestamp = av_frame_get_best_effort_timestamp(decoded_frame);
if (best_effort_timestamp != AV_NOPTS_VALUE)
    ist->next_pts = ist->pts = av_rescale_q(decoded_frame->pts = best_effort_timestamp,
                                            ist->st->time_base, AV_TIME_BASE_Q);

Where is av_frame_get_best_effort_timestamp() defined? Here:

/* ./libavcodec/utils.c */
#define MAKE_ACCESSORS(str, name, type, field) \
    type av_##name##_get_##field(const str *s) { return s->field; } \
    void av_##name##_set_##field(str *s, type v) { s->field = v; }
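For illustration, expanding MAKE_ACCESSORS(AVFrame, frame, int64_t, best_effort_timestamp) by hand gives:

int64_t av_frame_get_best_effort_timestamp(const AVFrame *s)
{
    return s->best_effort_timestamp;
}

void av_frame_set_best_effort_timestamp(AVFrame *s, int64_t v)
{
    s->best_effort_timestamp = v;
}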

Continuing on, preprocessing is done before the frame moves downstream (some codecs require edge padding around the picture for encoding/decoding):

pre_process_video_frame(ist, (AVPicture *)decoded_frame, &buffer_to_free);

In version 0.11.1, the filter graph has already replaced the original direct swscale path:

if (ist->dr1 && decoded_frame->type == FF_BUFFER_TYPE_USER && !changed) { // fallenink: "type" comes from codec_get_buffer()
    FrameBuffer       *buf = decoded_frame->opaque;
    AVFilterBufferRef *fb  = avfilter_get_video_buffer_ref_from_arrays(
                                 decoded_frame->data, decoded_frame->linesize,
                                 AV_PERM_READ | AV_PERM_PRESERVE,
                                 ist->st->codec->width, ist->st->codec->height,
                                 ist->st->codec->pix_fmt);

    avfilter_copy_frame_props(fb, decoded_frame);
    fb->buf->priv = buf;
    fb->buf->free = filter_release_buffer;

    av_assert0(buf->refcount > 0);
    buf->refcount++;

    av_buffersrc_add_ref(ist->filters[i]->filter, fb,
                         AV_BUFFERSRC_FLAG_NO_CHECK_FORMAT |
                         AV_BUFFERSRC_FLAG_NO_COPY);
} else if (av_buffersrc_add_frame(ist->filters[i]->filter, decoded_frame, 0) < 0) { // fallenink: copy the codec's buffer into the filter
    av_log(NULL, AV_LOG_FATAL, "Failed to inject frame into filter network\n");
    exit_program(1);
}

dr1 and decoded_frame->type are initialized in init_input_stream(), where you can see:

ist->dr1 = (codec->capabilities & CODEC_CAP_DR1) && !do_deinterlace;
if (codec->type == AVMEDIA_TYPE_VIDEO && ist->dr1) {
    ist->st->codec->get_buffer     = codec_get_buffer;
    ist->st->codec->release_buffer = codec_release_buffer;
    ist->st->codec->opaque         = ist;
}

capabilities is set in each AVCodec struct definition, e.g. in h264.c:

AVCodec ff_h264_decoder = {
    .name         = "h264",
    // ... omitted
    .capabilities = /* CODEC_CAP_DRAW_HORIZ_BAND | */ CODEC_CAP_DR1 | CODEC_CAP_DELAY |
                    CODEC_CAP_SLICE_THREADS | CODEC_CAP_FRAME_THREADS,
    // ... omitted
};

The decoded frame and its properties (including timestamps) are thus handed to the filter graph via avfilter_copy_frame_props() and av_buffersrc_add_ref(), as shown above; before encoding, av_buffersink_read() pulls them back out.


That is how the encoding side obtains the timestamps and other frame information. After decode_video() returns:

if (avpkt.duration) {
    duration = av_rescale_q(avpkt.duration, ist->st->time_base, AV_TIME_BASE_Q);
} else if (ist->st->codec->time_base.num != 0 && ist->st->codec->time_base.den != 0) {
    int ticks = ist->st->parser ? ist->st->parser->repeat_pict + 1
                                : ist->st->codec->ticks_per_frame;
    duration  = ((int64_t)AV_TIME_BASE *
                 ist->st->codec->time_base.num * ticks) /
                 ist->st->codec->time_base.den;
} else
    duration = 0;

if (ist->dts != AV_NOPTS_VALUE && duration) {
    ist->next_dts += duration;
} else
    ist->next_dts = AV_NOPTS_VALUE;

if (got_output)
    ist->next_pts += duration; //FIXME the duration is not correct in some cases
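A quick sanity check on the fallback branch, with typical numbers (invented for illustration): for 25 fps H.264 with codec time_base = 1/50 and ticks_per_frame = 2,

/* duration = ((int64_t)AV_TIME_BASE * time_base.num * ticks) / time_base.den
 *          = (1000000 * 1 * 2) / 50
 *          = 40000 µs, i.e. exactly one 25 fps frame interval */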

Now let's look at the encoding side, in poll_filters(): after a frame is pulled from the filter graph,

frame_pts = AV_NOPTS_VALUE;
if (picref->pts != AV_NOPTS_VALUE) {
    filtered_frame->pts = frame_pts =
        av_rescale_q(picref->pts,
                     ost->filter->filter->inputs[0]->time_base,
                     ost->st->codec->time_base) -
        av_rescale_q(of->start_time, AV_TIME_BASE_Q,
                     ost->st->codec->time_base);
    if (of->start_time && filtered_frame->pts < 0) {
        avfilter_unref_buffer(picref);
        continue;
    }
}
// ...
avfilter_fill_frame_from_video_buffer_ref(filtered_frame, picref);
filtered_frame->pts = frame_pts;

Then we go into the do_video_out() function:

if (ist && ist->st->start_time != AV_NOPTS_VALUE &&
    ist->st->first_dts != AV_NOPTS_VALUE && ost->frame_rate.num)
    duration = 1 / (av_q2d(ost->frame_rate) * av_q2d(enc->time_base));

sync_ipts = in_picture->pts;
delta     = sync_ipts - ost->sync_opts + duration;

switch (format_video_sync) {
    /* ost->sync_opts = lrint(sync_ipts); */
}

in_picture->pts = ost->sync_opts;

if (pkt.pts == AV_NOPTS_VALUE && !(enc->codec->capabilities & CODEC_CAP_DELAY))
    pkt.pts = ost->sync_opts;
if (pkt.pts != AV_NOPTS_VALUE)
    pkt.pts = av_rescale_q(pkt.pts, enc->time_base, ost->st->time_base);
if (pkt.dts != AV_NOPTS_VALUE)
    pkt.dts = av_rescale_q(pkt.dts, enc->time_base, ost->st->time_base);
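The two rescales at the end just move pkt.pts/pkt.dts from the encoder time base into the output stream time base. With example values (invented for illustration), enc->time_base = 1/25 and ost->st->time_base = 1/90000 as in MPEG-TS:

/* pkt.pts = 100 encoder ticks = 4 s of media time
 * av_rescale_q(100, (AVRational){1, 25}, (AVRational){1, 90000}) = 360000 */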

I won't walk through the rest: from here the packet is handed to the muxer and written out to the remote end.

_______________________________________________________ Boring cut-off line _________________________________________________
Now for the problems I actually ran into. The walkthrough above has no deep technical content of its own; it is just an ordering of the code paths, but without that ordering it is hard to locate and handle the details below.

Symptom 1: in streaming-media transcoding (with multi-threaded decoding at the front end), if the output limit in the ff_interleave_packet_per_dts() function is removed so that every stream of interest is sent straight to the muxer, the outgoing video lags behind the audio (its timestamp values are too small). Cause: the front end decodes in multiple threads; if the thread count is not controlled properly and the CPU is overloaded, the packet-buffering thread cannot be serviced in time.

Symptom 2: building on symptom 1, with the video handled in a single thread, most of the time the outgoing video runs ahead of the audio, sometimes by several seconds. Cause (this is only the situation I hit, not a universal one): when the server starts some videos, packets arrive with a timestamp of 0; being smaller than ist->next_dts, they get replaced by it, while ist->next_dts itself keeps growing by one frame duration at a time, hence the drift. I handle this case as if decoding produced no output (got_output == 0). The frame cannot simply be released there, because the codec actually allocated several buffers that are later copied into the filter module. In my case this is what broke synchronization. The key point: the timestamp problem is only the surface; what we all really care about is audio/video synchronization.

In addition, still on the basis of symptom 1: I push streams to FMS here. When the video timestamps consistently trail the audio (or run far ahead of it), after a while the RTMP writer blocks in write and its select() keeps failing; the exact cause is unknown. This problem does not occur under symptom 2. Why? Of course, transcoding has no strict synchronization mechanism of its own; it relies on the packet timestamps coming from the front end, but the audio and video packets must be interleaved at the back end of transcoding. That is precisely the job of ff_interleave_packet_per_dts(); if the muxer provides its own interleave_packet, that one is preferred.
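As a rough sketch of what DTS-based interleaving means (simplified pseudo-logic in the spirit of ff_interleave_packet_per_dts(), not the real source; PacketQueue, queue_empty(), queue_head(), queue_pop() and compare_dts() are hypothetical helpers):

/* Buffer packets per stream; only emit the globally smallest-DTS packet
 * once every stream has at least one packet queued (or we are flushing). */
AVPacket *pick_next_packet(PacketQueue *queues, int nb_streams, int flush)
{
    int i, ready = 1, best = -1;

    for (i = 0; i < nb_streams; i++)
        if (queue_empty(&queues[i]))
            ready = 0;                    /* some stream has nothing buffered yet */
    if (!ready && !flush)
        return NULL;                      /* wait for more input */

    for (i = 0; i < nb_streams; i++)
        if (!queue_empty(&queues[i]) &&
            (best < 0 || compare_dts(queue_head(&queues[i]),
                                     queue_head(&queues[best])) < 0))
            best = i;

    return best >= 0 ? queue_pop(&queues[best]) : NULL;
}

That wait-for-every-stream condition is presumably the "output limit" whose removal changes the behavior in symptom 1.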

Another problem: when the decoder receives a video frame split across packets, FFmpeg sees several consecutive packets with the same PTS. Focus on this code for that situation:

if (ist->dts != AV_NOPTS_VALUE && duration) {
    ist->next_dts += duration;
} else
    ist->next_dts = AV_NOPTS_VALUE;
if (got_output)
    ist->next_pts += duration; //FIXME the duration is not correct in some cases

Arguably, when decoding produces no output, next_pts should not accumulate duration. In my case the video had to be transcoded while the audio was stream-copied, which resulted in loss of synchronization; on top of that, the PTS of the video frames in the audio/video interleaving window grew too fast, and playback stuttered on the client.
