[This article covers some of ffmpeg's finer details, so it is rather long-winded.]
[The code path differs depending on the transcoding setup.]
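First, some background:
Timestamps: DTS (decoding time stamp), PTS (presentation time stamp), CTS (composition time stamp).
Internally ffmpeg keeps timestamps in microseconds (AV_TIME_BASE units). The time_base variable is the granularity in which a stream's dts and pts are expressed, so the raw values can get very large.
av_rescale_q() is called in many places. AV_ROUND_NEAR_INF rounds to nearest, with halfway cases rounded away from zero. av_rescale_rnd() computes a*b/c; its arguments are 8-byte integers (int64_t), and to avoid overflow it compares them against INT_MAX and splits the computation when necessary.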
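To make the rescaling arithmetic concrete, here is a minimal sketch that does not use the ffmpeg headers. `Rational`, `rescale_near` and `rescale_q` are hypothetical stand-ins for AVRational, av_rescale_rnd (with AV_ROUND_NEAR_INF) and av_rescale_q. The sketch assumes a positive divisor and that a*b fits in 64 bits, so it skips the overflow handling described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational, so the sketch needs no ffmpeg headers. */
typedef struct { int num, den; } Rational;

/* a*b/c rounded to nearest, halfway cases away from zero.
 * Assumes c > 0 and that a*b fits in int64_t; the real function
 * treats the overflow case separately. */
static int64_t rescale_near(int64_t a, int64_t b, int64_t c)
{
    assert(c > 0);
    int64_t prod = a * b;
    if (prod >= 0)
        return (prod + c / 2) / c;
    return -((-prod + c / 2) / c);
}

/* Convert timestamp a from time base bq to time base cq. */
static int64_t rescale_q(int64_t a, Rational bq, Rational cq)
{
    int64_t b = (int64_t)bq.num * cq.den;
    int64_t c = (int64_t)cq.num * bq.den;
    return rescale_near(a, b, c);
}
```

For example, a pts of 90000 in a 1/90000 time base maps to 1000000 in a 1/1000000 (microsecond) time base, i.e. one second.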
Start with packet parsing at the front end, i.e. the av_read_frame() function:
const int genpts = s->flags & AVFMT_FLAG_GENPTS;
The individual flag bits are documented in avformat.h.
//ffmpeg.c, opt_input_file()
ic->flags |= AVFMT_FLAG_NONBLOCK;
Next we enter read_frame_internal(). [About this function: http://www.chinavideo.org/viewthread.php?action=printable&tid=13846]
while (!got_packet && !s->parse_queue) {...} //returns once got_packet is set
In ff_read_packet(),
ret= s->iformat->read_packet(s, pkt);
demuxes one packet, and at
if(!pktl && st->request_probe <= 0)
it returns. AVStream's need_parsing is not set at this point, so
compute_pkt_fields(s, st, NULL, pkt);
got_packet = 1;
At this point, we output pkt as it is.
Now let's look at what transcode() does before parsing and decoding:
if (pkt.dts != AV_NOPTS_VALUE && ist->next_dts != AV_NOPTS_VALUE && !copy_ts) {
    int64_t pkt_dts = av_rescale_q(pkt.dts, ist->st->time_base, AV_TIME_BASE_Q);
    int64_t delta   = pkt_dts - ist->next_dts;
    if (is->iformat->flags & AVFMT_TS_DISCONT) {
        if (delta < -1LL*dts_delta_threshold*AV_TIME_BASE ||
            (delta > 1LL*dts_delta_threshold*AV_TIME_BASE &&
             ist->st->codec->codec_type != AVMEDIA_TYPE_SUBTITLE) ||
            pkt_dts+1 < ist->pts) {
            input_files[ist->file_index]->ts_offset -= delta;
            pkt.dts -= av_rescale_q(delta, AV_TIME_BASE_Q, ist->st->time_base);
            if (pkt.pts != AV_NOPTS_VALUE)
                pkt.pts -= av_rescale_q(delta, AV_TIME_BASE_Q, ist->st->time_base);
        }
    } else {
        if (delta < -1LL*dts_error_threshold*AV_TIME_BASE ||
            (delta > 1LL*dts_error_threshold*AV_TIME_BASE &&
             ist->st->codec->codec_type != AVMEDIA_TYPE_SUBTITLE) ||
            pkt_dts+1 < ist->pts) {
            pkt.dts = AV_NOPTS_VALUE;
        }
        if (pkt.pts != AV_NOPTS_VALUE) {
            int64_t pkt_pts = av_rescale_q(pkt.pts, ist->st->time_base, AV_TIME_BASE_Q);
            delta = pkt_pts - ist->next_dts;
            if (delta < -1LL*dts_error_threshold*AV_TIME_BASE ||
                (delta > 1LL*dts_error_threshold*AV_TIME_BASE &&
                 ist->st->codec->codec_type != AVMEDIA_TYPE_SUBTITLE) ||
                pkt_pts+1 < ist->pts) {
                pkt.pts = AV_NOPTS_VALUE; //fallenink: if pts is too small it is invalidated here; output_packet then sets it from ist->dts
            }
        }
    }
}
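The AVFMT_TS_DISCONT branch above can be boiled down to the following sketch. It is a simplification under stated assumptions: `absorb_discontinuity` and `TIME_BASE` are hypothetical names, all values are already in microseconds, and the subtitle exception and the `pkt_dts+1 < ist->pts` test are left out.

```c
#include <assert.h>
#include <stdint.h>

#define TIME_BASE 1000000  /* microseconds per second, playing the role of AV_TIME_BASE */

/* If the packet dts is further than threshold_sec away from the expected
 * next_dts, shift it back onto next_dts and accumulate the shift in
 * *ts_offset. Returns 1 when a jump was absorbed, 0 otherwise. */
static int absorb_discontinuity(int64_t *pkt_dts, int64_t next_dts,
                                int64_t threshold_sec, int64_t *ts_offset)
{
    assert(threshold_sec >= 0);
    int64_t delta = *pkt_dts - next_dts;
    if (delta < -threshold_sec * TIME_BASE || delta > threshold_sec * TIME_BASE) {
        *ts_offset -= delta;  /* later packets of this input get the same shift */
        *pkt_dts   -= delta;  /* this packet now lands exactly on next_dts */
        return 1;
    }
    return 0;
}
```

With a 10 second threshold, a packet arriving at 100 s when 1 s was expected is pulled back to 1 s and the offset becomes -99 s; a packet that is only 0.5 s off is left alone.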
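Now the decoding side, InputStream: [pay attention to the dts, next_dts, pts and next_pts fields of the InputStream struct]
In output_packet(), the saw_first_ts flag is used to initialize dts and pts exactly once: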
if (!ist->saw_first_ts) {
    ist->dts = ist->st->avg_frame_rate.num ? - ist->st->codec->has_b_frames * AV_TIME_BASE / av_q2d(ist->st->avg_frame_rate) : 0;
    ist->pts = 0;
    if (pkt != NULL && pkt->pts != AV_NOPTS_VALUE && !ist->decoding_needed) {
        ist->dts += av_rescale_q(pkt->pts, ist->st->time_base, AV_TIME_BASE_Q);
        ist->pts = ist->dts; //unused but better to set it to a value thats not totally wrong
    }
    ist->saw_first_ts = 1;
}
if (ist->next_dts == AV_NOPTS_VALUE)
    ist->next_dts = ist->dts;
if (ist->next_pts == AV_NOPTS_VALUE)
    ist->next_pts = ist->pts;
if (pkt->dts != AV_NOPTS_VALUE) { //if the pkt timestamp was invalidated earlier, ist is not updated here
    ist->next_dts = ist->dts = av_rescale_q(pkt->dts, ist->st->time_base, AV_TIME_BASE_Q);
    if (ist->st->codec->codec_type != AVMEDIA_TYPE_VIDEO || !ist->decoding_needed) //fallenink: "not video" or "copy"
        ist->next_pts = ist->pts = av_rescale_q(pkt->dts, ist->st->time_base, AV_TIME_BASE_Q);
}
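The one-time initialization of ist->dts starts the clock below zero by has_b_frames frame durations. A small sketch of that arithmetic, with `initial_dts` and `TIME_BASE` as hypothetical names and the frame rate passed as a plain numerator/denominator pair:

```c
#include <assert.h>
#include <stdint.h>

#define TIME_BASE 1000000  /* microseconds per second, playing the role of AV_TIME_BASE */

/* Initial dts in microseconds: minus has_b_frames frame durations,
 * or 0 when the average frame rate is unknown (numerator 0). */
static int64_t initial_dts(int has_b_frames, int fr_num, int fr_den)
{
    if (!fr_num)
        return 0;
    assert(fr_den > 0);
    double fps = (double)fr_num / fr_den;  /* same role as av_q2d() */
    return (int64_t)(-has_b_frames * (double)TIME_BASE / fps);
}
```

At 25 fps with has_b_frames equal to 1 this gives -40000, i.e. the clock starts one frame (40 ms) before zero.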
Next, the audio is stream-copied directly (in my case):
if (!ist->decoding_needed) {
    rate_emu_sleep(ist);
    ist->dts = ist->next_dts;
    switch (ist->st->codec->codec_type) {
    case AVMEDIA_TYPE_AUDIO:
        ist->next_dts += ((int64_t)AV_TIME_BASE * ist->st->codec->frame_size) /
                         ist->st->codec->sample_rate;
        break;
    case AVMEDIA_TYPE_VIDEO:
        //...
    }
    ist->pts = ist->dts;
    ist->next_pts = ist->next_dts;
}
for (i = 0; pkt && i < nb_output_streams; i++) {
    OutputStream *ost = output_streams[i];
    if (!check_output_constraints(ist, ost) || ost->encoding_needed)
        continue;
    do_streamcopy(ist, ost, pkt);
}
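For copied audio, next_dts advances by one frame's worth of samples per packet. The sketch below isolates that step; `audio_frame_duration` and `TIME_BASE` are hypothetical names and the frame sizes are just illustrative values.

```c
#include <assert.h>
#include <stdint.h>

#define TIME_BASE 1000000  /* microseconds per second, playing the role of AV_TIME_BASE */

/* Duration of one audio frame in microseconds, using the same integer
 * division as the stream-copy branch (the fraction is truncated). */
static int64_t audio_frame_duration(int frame_size, int sample_rate)
{
    assert(sample_rate > 0);
    return ((int64_t)TIME_BASE * frame_size) / sample_rate;
}
```

A 1024-sample frame at 44100 Hz comes out as 23219 microseconds. Because the division truncates, a fraction of a microsecond is dropped on every packet, which is one reason the value is only used when the packet itself carries no usable dts.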
Before decoding,
ist->pts = ist->next_pts; ist->dts = ist->next_dts;
In decode_video(),
pkt->dts = av_rescale_q(ist->dts, AV_TIME_BASE_Q, ist->st->time_base); //here the input stream's timestamp is written back into pkt
In avcodec_decode_video2(), in the single-threaded case,
ret = avctx->codec->decode(avctx, picture, got_picture_ptr, &tmp);
picture->pkt_dts= avpkt->dts;
If the decoder produced output, then
if (*got_picture_ptr){
    avctx->frame_number++;
    picture->best_effort_timestamp = guess_correct_pts(avctx, picture->pkt_pts, picture->pkt_dts);
}
guess_correct_pts() is called; in general (at least in my case) it simply returns the value of picture->pkt_dts. Back in decode_video(),
best_effort_timestamp = av_frame_get_best_effort_timestamp(decoded_frame);
if(best_effort_timestamp != AV_NOPTS_VALUE)
    ist->next_pts = ist->pts = av_rescale_q(decoded_frame->pts = best_effort_timestamp, ist->st->time_base, AV_TIME_BASE_Q);
Where is av_frame_get_best_effort_timestamp() defined? Like this:
/* ./libavcodec/utils.c */
#define MAKE_ACCESSORS(str, name, type, field) \
    type av_##name##_get_##field(const str *s) { return s->field; } \
    void av_##name##_set_##field(str *s, type v) { s->field = v; }
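To see how the macro produces a function name such as av_frame_get_best_effort_timestamp, here is a self-contained sketch that applies the same macro to a toy struct. `ToyFrame`, the name `toy` and the helper `toy_roundtrip` are hypothetical; in libavcodec the macro is presumably instantiated for AVFrame with the name `frame`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy struct; the real macro is applied to AVFrame. */
typedef struct ToyFrame { int64_t best_effort_timestamp; } ToyFrame;

#define MAKE_ACCESSORS(str, name, type, field) \
    type av_##name##_get_##field(const str *s) { return s->field; } \
    void av_##name##_set_##field(str *s, type v) { s->field = v; }

/* Expands to av_toy_get_best_effort_timestamp() and
 * av_toy_set_best_effort_timestamp(). */
MAKE_ACCESSORS(ToyFrame, toy, int64_t, best_effort_timestamp)

/* Store a value through the setter and read it back through the getter. */
static int64_t toy_roundtrip(int64_t v)
{
    ToyFrame f = {0};
    av_toy_set_best_effort_timestamp(&f, v);
    assert(f.best_effort_timestamp == v);
    return av_toy_get_best_effort_timestamp(&f);
}
```

The token pasting (`##`) is why grepping for the full function name finds no definition: the name only exists after preprocessing.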
Moving on: before post-processing, a pre-processing step is run first (some codecs often need edge padding):
pre_process_video_frame(ist, (AVPicture *)decoded_frame, &buffer_to_free);
As of version 0.11.1, filters are already used in place of the old swscale module:
if (ist->dr1 && decoded_frame->type==FF_BUFFER_TYPE_USER && !changed) { //fallenink: "type" come from "codec_get_buffer"
    FrameBuffer *buf = decoded_frame->opaque;
    AVFilterBufferRef *fb = avfilter_get_video_buffer_ref_from_arrays(
                                decoded_frame->data, decoded_frame->linesize,
                                AV_PERM_READ | AV_PERM_PRESERVE,
                                ist->st->codec->width, ist->st->codec->height,
                                ist->st->codec->pix_fmt);
    avfilter_copy_frame_props(fb, decoded_frame);
    fb->buf->priv = buf;
    fb->buf->free = filter_release_buffer;
    av_assert0(buf->refcount>0);
    buf->refcount++;
    av_buffersrc_add_ref(ist->filters[i]->filter, fb,
                         AV_BUFFERSRC_FLAG_NO_CHECK_FORMAT |
                         AV_BUFFERSRC_FLAG_NO_COPY);
} else if(av_buffersrc_add_frame(ist->filters[i]->filter, decoded_frame, 0)<0) { //fallenink: codec buffer copy to filter
    av_log(NULL, AV_LOG_FATAL, "Failed to inject frame into filter network\n");
    exit_program(1);
}
Here dr1 and the type in decoded_frame->type are set up in init_input_stream(), where we can see:
ist->dr1 = (codec->capabilities & CODEC_CAP_DR1) && !do_deinterlace;
if (codec->type == AVMEDIA_TYPE_VIDEO && ist->dr1) {
    ist->st->codec->get_buffer     = codec_get_buffer;
    ist->st->codec->release_buffer = codec_release_buffer;
    ist->st->codec->opaque         = ist;
}
capabilities corresponds to the value given in the codec's AVCodec definition, for example in h264.c:
AVCodec ff_h264_decoder = {
    .name = "h264",
    //... omitted
    .capabilities = /*CODEC_CAP_DRAW_HORIZ_BAND |*/ CODEC_CAP_DR1 | CODEC_CAP_DELAY | CODEC_CAP_SLICE_THREADS | CODEC_CAP_FRAME_THREADS,
    //... omitted
};
The decoded data and its properties are then pushed into the filter graph; later, just before encoding, av_buffersink_read() fetches them back:
avfilter_copy_frame_props(fb, decoded_frame);
fb->buf->priv = buf;
fb->buf->free = filter_release_buffer;
av_assert0(buf->refcount>0);
buf->refcount++;
av_buffersrc_add_ref(ist->filters[i]->filter, fb,
                     AV_BUFFERSRC_FLAG_NO_CHECK_FORMAT |
                     AV_BUFFERSRC_FLAG_NO_COPY);
This way the timestamps and related information are available on the encoding side. After decode_video() returns,
if (avpkt.duration) {
    duration = av_rescale_q(avpkt.duration, ist->st->time_base, AV_TIME_BASE_Q);
} else if(ist->st->codec->time_base.num != 0 && ist->st->codec->time_base.den != 0) {
    int ticks= ist->st->parser ? ist->st->parser->repeat_pict+1 : ist->st->codec->ticks_per_frame;
    duration = ((int64_t)AV_TIME_BASE *
                ist->st->codec->time_base.num * ticks) /
                ist->st->codec->time_base.den;
} else
    duration = 0;
if(ist->dts != AV_NOPTS_VALUE && duration) {
    ist->next_dts += duration;
}else
    ist->next_dts = AV_NOPTS_VALUE;
if (got_output)
    ist->next_pts += duration; //FIXME the duration is not correct in some cases
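When the packet has no duration of its own, the frame duration is derived from the codec time base and the tick count. A sketch of just that fallback, with `frame_duration_us` and `TIME_BASE` as hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define TIME_BASE 1000000  /* microseconds per second, playing the role of AV_TIME_BASE */

/* Frame duration in microseconds from a codec time base tb_num/tb_den and
 * the number of ticks per frame; 0 when the time base is not set. */
static int64_t frame_duration_us(int tb_num, int tb_den, int ticks)
{
    if (tb_num == 0 || tb_den == 0)
        return 0;
    assert(tb_den > 0);
    return ((int64_t)TIME_BASE * tb_num * ticks) / tb_den;
}
```

For a codec time base of 1/50 with 2 ticks per frame (a typical H.264 field-based setup at 25 fps) the result is 40000 microseconds. A result of 0 is what makes the excerpt above reset next_dts to AV_NOPTS_VALUE.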
Now the encoding side, poll_filters(): after pulling a frame out of the graph,
frame_pts = AV_NOPTS_VALUE;
if (picref->pts != AV_NOPTS_VALUE) {
    filtered_frame->pts = frame_pts = av_rescale_q(picref->pts,
                                        ost->filter->filter->inputs[0]->time_base,
                                        ost->st->codec->time_base) -
                                      av_rescale_q(of->start_time,
                                        AV_TIME_BASE_Q,
                                        ost->st->codec->time_base);
    if (of->start_time && filtered_frame->pts < 0) {
        avfilter_unref_buffer(picref);
        continue;
    }
}
//...
avfilter_fill_frame_from_video_buffer_ref(filtered_frame, picref);
filtered_frame->pts = frame_pts;
Then we enter do_video_out():
if(ist && ist->st->start_time != AV_NOPTS_VALUE && ist->st->first_dts != AV_NOPTS_VALUE && ost->frame_rate.num)
    duration = 1/(av_q2d(ost->frame_rate) * av_q2d(enc->time_base));
sync_ipts = in_picture->pts;
delta = sync_ipts - ost->sync_opts + duration;
switch (format_video_sync) {
/*ost->sync_opts = lrint(sync_ipts);*/
}
in_picture->pts = ost->sync_opts;
if (pkt.pts == AV_NOPTS_VALUE && !(enc->codec->capabilities & CODEC_CAP_DELAY))
    pkt.pts = ost->sync_opts;
if (pkt.pts != AV_NOPTS_VALUE)
    pkt.pts = av_rescale_q(pkt.pts, enc->time_base, ost->st->time_base);
if (pkt.dts != AV_NOPTS_VALUE)
    pkt.dts = av_rescale_q(pkt.dts, enc->time_base, ost->st->time_base);
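There is not much more to say from here: the packet goes straight to the muxer and is written out to the remote end.
_______________________________________________boring divider_________________________________________________
Now for the problems I ran into. The walkthrough above has no great technical depth; it is only a way of sorting things out, because without it the problems in the details are hard to spot and handle properly.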
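Symptom 1: In streaming transcoding (with multithreaded decoding at the front end), if the output restriction in ff_interleave_packet_per_dts() is removed so that a packet goes to the muxer as soon as any stream has one, the video in the outgoing stream lags behind the audio (its timestamps are smaller). Cause: precisely because decoding is multithreaded, if the number of threads is not kept under control the CPU is overloaded and packets sit buffered in the threads without being processed in time, which produces this symptom.
Symptom 2: Starting from Symptom 1, switch to single-threaded processing. Now, most of the time, the video in the outgoing stream is ahead of the audio, sometimes by several seconds. Cause: (to be clear, this is only what happened in my case and is not general.) At the start the server sent quite a few video packets with timestamp 0. Being smaller than ist->next_dts, these timestamps were replaced, and ist->next_dts increases in order, by one frame duration at a time, hence the symptom. My workaround was to mark such cases as "got no output" after decoding. The frame cannot be freed at this point, because its memory planes were allocated inside the codec and then handed over to the filter module. In my setup this led to loss of sync, and that is the key point: the timestamp problem is only the surface, and what we all really care about is audio/video sync.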
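In addition, on top of Symptom 1, I am pushing the stream to FMS. The video timestamps always lag behind the audio (or run far ahead of it), and after a while write/select on the RTMP writer side keeps failing; I do not yet understand the exact cause. With Symptom 2, however, this problem does not occur. Why? The transcoding process itself naturally cannot enforce sync strictly; it relies on the timestamps of the packets coming from the front end. The back end of the transcoder, however, has to interleave audio and video packets properly, which is what ff_interleave_packet_per_dts() does. Of course, if the muxer itself provides interleave_packet, that is used in preference.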
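The output restriction mentioned above can be pictured with the following sketch of dts-based interleaving. It is not the libavformat implementation: `pick_next_stream`, `NB_STREAMS` and `NO_PACKET` are hypothetical, each stream is reduced to the dts of its oldest queued packet, and all dts values are assumed to be in one common time base.

```c
#include <assert.h>
#include <stdint.h>

#define NB_STREAMS 2
#define NO_PACKET  INT64_MIN  /* hypothetical marker for an empty queue */

/* head_dts[i] is the dts of the oldest queued packet of stream i, or
 * NO_PACKET. Returns the index of the stream whose packet should be
 * written next, or -1 when the muxer should keep buffering. */
static int pick_next_stream(const int64_t head_dts[NB_STREAMS], int flush)
{
    int best = -1;
    for (int i = 0; i < NB_STREAMS; i++) {
        if (head_dts[i] == NO_PACKET) {
            if (!flush)
                return -1;  /* the restriction: wait until every stream has a packet */
            continue;
        }
        if (best < 0 || head_dts[i] < head_dts[best])
            best = i;       /* smallest dts goes out first */
    }
    assert(best < NB_STREAMS);
    return best;
}
```

Removing the restriction corresponds to always passing flush as 1: packets then leave as soon as any stream has one, so a stream that is momentarily starved (for example video stuck in decoder threads) falls behind the other in the output.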
There is one more issue: when the decoder receives video frames that have been split up, ffmpeg gets several consecutive packets with the same pts. Note this piece of code:
if(ist->dts != AV_NOPTS_VALUE && duration) {
    ist->next_dts += duration;
}else
    ist->next_dts = AV_NOPTS_VALUE;
if (got_output)
    ist->next_pts += duration; //FIXME the duration is not correct in some cases
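In principle, duration should not be accumulated into next_pts when decoding produces no output. In my case the video has to be decoded and re-encoded while the audio is copied directly, which led to loss of sync; on the audio/video interleaving side the video frame pts grew too fast, so frames piled up there and the video froze on the client.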
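One possible way to picture the split-frame case is the sketch below. It is only an illustration under assumptions: `StreamClock`, `after_decode` and `decode_split_frame` are hypothetical, the duration is taken as always valid, and only the last packet of a split frame is assumed to produce output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reduction of InputStream to the two counters of interest. */
typedef struct { int64_t next_dts, next_pts; } StreamClock;

/* Bookkeeping after one decode call, mirroring the excerpt above. */
static void after_decode(StreamClock *c, int64_t duration, int got_output)
{
    c->next_dts += duration;      /* advanced for every packet */
    if (got_output)
        c->next_pts += duration;  /* advanced only when a frame came out */
}

/* Feed one frame that arrives split over nb_packets packets. */
static void decode_split_frame(StreamClock *c, int64_t duration, int nb_packets)
{
    assert(nb_packets > 0);
    for (int i = 0; i < nb_packets; i++)
        after_decode(c, duration, i == nb_packets - 1);
}
```

With a 40000 microsecond duration and a frame split over three packets, next_pts moves by one frame but next_dts moves by three, so any timestamp later derived from next_dts runs ahead of the audio that was stream-copied.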