Document directory
- 1. Register all container formats and codecs: av_register_all()
- 2. Open the file: av_open_input_file()
- 3. Extract stream information from the file: av_find_stream_info()
- 4. Search all streams for one whose type is CODEC_TYPE_VIDEO.
- 5. Find the corresponding decoder: avcodec_find_decoder()
- 6. Open the codec: avcodec_open()
- 7. Allocate memory for decoded frames: avcodec_alloc_frame()
- 8. Continuously read frame data from the stream: av_read_frame()
- 9. Determine the frame type; for video frames, call avcodec_decode_video()
- 10. After decoding, release the decoder: avcodec_close()
- 11. Close the input file: av_close_input_file()
FFmpeg decoding process:
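As an overview, here is a compact sketch of those eleven steps using the legacy API named in this article (newer FFmpeg releases renamed these calls to avformat_open_input, avcodec_open2, and so on); error handling is largely omitted:

/* Sketch only: decode every video frame of a file with the legacy FFmpeg API. */
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

int decode_file(const char *filename) {
    av_register_all();                                    /* 1. register formats and codecs */

    AVFormatContext *pFormatCtx;
    if (av_open_input_file(&pFormatCtx, filename, NULL, 0, NULL) != 0)  /* 2. open the file */
        return -1;
    if (av_find_stream_info(pFormatCtx) < 0)              /* 3. read stream information */
        return -1;

    int videoStream = -1;
    for (unsigned int i = 0; i < pFormatCtx->nb_streams; i++)  /* 4. find the video stream */
        if (pFormatCtx->streams[i]->codec->codec_type == CODEC_TYPE_VIDEO) {
            videoStream = i;
            break;
        }
    if (videoStream == -1)
        return -1;

    AVCodecContext *pCodecCtx = pFormatCtx->streams[videoStream]->codec;
    AVCodec *pCodec = avcodec_find_decoder(pCodecCtx->codec_id);  /* 5. find the decoder */
    avcodec_open(pCodecCtx, pCodec);                      /* 6. open the decoder */

    AVFrame *pFrame = avcodec_alloc_frame();              /* 7. allocate the frame */

    AVPacket packet;
    int frameFinished;
    while (av_read_frame(pFormatCtx, &packet) >= 0) {     /* 8. read packets */
        if (packet.stream_index == videoStream) {         /* 9. decode video packets */
            avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                                 packet.data, packet.size);
            if (frameFinished) {
                /* use pFrame here */
            }
        }
        av_free_packet(&packet);
    }

    avcodec_close(pCodecCtx);                             /* 10. release the decoder */
    av_close_input_file(pFormatCtx);                      /* 11. close the input file */
    return 0;
}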
The first thing is to open a video file and get the streams out of it. Before anything else, we call av_register_all() to initialize libavformat/libavcodec:
This step registers all available file formats and codecs in the library, so that when a file is opened the matching format and codec can be picked automatically. av_register_all() only needs to be called once, so it belongs in your initialization code. You can also register just individual file formats and codecs.
Next, open the file:
AVFormatContext *pFormatCtx;
const char *filename = "myvideo.mpg";
av_open_input_file(&pFormatCtx, filename, NULL, 0, NULL); // open the video file
The last three parameters describe the file format, the buffer size, and format parameters. By passing NULL or 0 we tell libavformat to auto-detect the file format and use a default buffer size. The format parameters here mean video parameters such as the width and height.
Next, we need to retrieve the stream information contained in the file:
av_find_stream_info(pFormatCtx); // retrieve the stream information
This fills in the streams field of the AVFormatContext struct.
dump_format(pFormatCtx, 0, filename, false); // this function prints all of the parameters we obtained
for (i = 0; i < pFormatCtx->nb_streams; i++)  // walk the streams to tell video from audio
    if (pFormatCtx->streams[i]->codec->codec_type == CODEC_TYPE_VIDEO)  // find the video stream (change to AUDIO for an audio stream)
    {
        videoStream = i;
        break;
    }
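The codec context pCodecCtx used in the next step is not declared in the excerpt above; with the legacy API it is simply taken from the stream we just found (an assumed but typical line):

AVCodecContext *pCodecCtx = pFormatCtx->streams[videoStream]->codec;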
Next we need to find the decoder:
AVCodec *pCodec;
pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
avcodec_open(pCodecCtx, pCodec); // open the decoder
Allocate space for a video frame to hold the decoded image:
AVFrame *pFrame;
pFrame = avcodec_alloc_frame();
//////////////////////////////////////// Start decoding ////////////////////////////////////////
The first step is to read data:
What we will do is read the whole video stream packet by packet, decode each packet into frames, convert the format, and save the result.
while (av_read_frame(pFormatCtx, &packet) >= 0) {   // read a packet
    if (packet.stream_index == videoStream) {       // is this a packet from the video stream?
        avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                             packet.data, packet.size); // decode
        if (frameFinished) {
            img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                        (AVPicture *)pFrame, pCodecCtx->pix_fmt,
                        pCodecCtx->width, pCodecCtx->height); // convert to RGB
            SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height, i); // save the frame
        }
    }
    av_free_packet(&packet); // release the packet
}
av_read_frame() reads a packet and stores it in the AVPacket struct; the data can later be released with av_free_packet(). avcodec_decode_video() turns the packet into a frame. However, one packet does not necessarily give us a complete frame, so avcodec_decode_video() sets the frameFinished flag for us once the frame is complete. Finally we use img_convert() to convert the frame from its native format (pCodecCtx->pix_fmt) to RGB. Remember that you can cast an AVFrame struct pointer to an AVPicture struct pointer. At the end we pass the frame and its width and height to our SaveFrame function.
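SaveFrame itself is not shown here; a minimal sketch that writes the RGB24 buffer out as a PPM image, as this tutorial series usually does (the name and signature are assumptions), could look like this:

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
    char szFilename[32];
    sprintf(szFilename, "frame%d.ppm", iFrame);         // one file per frame
    FILE *pFile = fopen(szFilename, "wb");
    if (!pFile)
        return;
    fprintf(pFile, "P6\n%d %d\n255\n", width, height);  // PPM header
    for (int y = 0; y < height; y++)                    // write the pixel data row by row
        fwrite(pFrame->data[0] + y * pFrame->linesize[0], 1, width * 3, pFile);
    fclose(pFile);
}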
After decoding, display is normally done with SDL. Since we will later display through firmware, SDL is skipped here.
Audio/Video Synchronization
DTS (decoding timestamp) and PTS (presentation timestamp)
When we call av_read_frame() to get a packet, the PTS and DTS information is stored in that packet. But what we really want is the PTS of the raw frame we just decoded, so that we know when to display it. However, the frame we get from avcodec_decode_video() is just an AVFrame and does not carry a useful PTS value (note: AVFrame itself does not contain usable timestamp information here). So we save the PTS of the first packet of a frame: that becomes the PTS of the whole frame. How do we know which packet is the first packet of a frame? Whenever a packet starts a new frame,
avcodec_decode_video() calls a function to allocate a buffer for that frame, and FFmpeg allows us to replace that allocation function with our own (see the sketch below). From the timestamps of the previous frame and the current frame we can also predict the timestamp of the next one. At the same time we need to synchronize the video to the audio: we keep an audio clock (audioClock), an internal value that records the position of the audio currently being played, just like the time readout on any MP3 player. Since we synchronize the video to the audio, the video thread uses this value to decide whether it is running too fast or too slow.
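A minimal sketch of that buffer-allocation override, in the spirit of the old tutorial code (avcodec_default_get_buffer/avcodec_default_release_buffer belong to the legacy API, and the global_video_pkt_pts variable is an assumption of this sketch):

uint64_t global_video_pkt_pts = AV_NOPTS_VALUE; // PTS of the packet currently being fed to the decoder

/* Called by the decoder whenever it needs a buffer for a new frame:
 * stash the PTS of the packet that started this frame in frame->opaque. */
int our_get_buffer(struct AVCodecContext *c, AVFrame *pic) {
    int ret = avcodec_default_get_buffer(c, pic);
    uint64_t *pts = av_malloc(sizeof(uint64_t));
    *pts = global_video_pkt_pts;
    pic->opaque = pts;
    return ret;
}

void our_release_buffer(struct AVCodecContext *c, AVFrame *pic) {
    if (pic)
        av_freep(&pic->opaque);
    avcodec_default_release_buffer(c, pic);
}

/* Installed once after avcodec_open():
 *     pCodecCtx->get_buffer = our_get_buffer;
 *     pCodecCtx->release_buffer = our_release_buffer;
 * and before each avcodec_decode_video() call:
 *     global_video_pkt_pts = packet.pts;
 */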
How to solve audio/video synchronization problems when using the FFmpeg SDK for video transcoding and compression:
When using the FFmpeg SDK to transcode and compress video, you may look at the output after a successful transcode and find that the audio and video are out of sync. This is indeed annoying; I ran into it while using the FFmpeg SDK to re-encode FLV files to H.264.
After some investigation it turns out that the FFmpeg SDK controls the written timestamps in two places: AVPacket and AVFrame. When avcodec_encode_video is called you pass in an AVFrame pointer, that is, one uncompressed video frame to be compressed. AVFrame contains a pts field, which is the timestamp at which this frame will be presented during later playback. AVPacket also carries pts and dts. To explain them we have to describe the three kinds of compressed video frames: I, P and B. An I-frame is a key frame and does not depend on any other frame. A P-frame is a forward-predicted frame that depends only on the previous frame. A B-frame is a bidirectionally predicted frame that depends on both the previous and the following frame; because it is bidirectional, the surrounding frames must be known before the final picture of a B-frame can be produced. The pts and dts fields are what control the display order and the decoding order of the frames.
PTS is the order in which frames are displayed.
DTS is the order in which frames are read and decoded.
If there are no B-frames, DTS and PTS are identical; otherwise they differ. For the details, refer to how MPEG works.
So what values should pts and dts in the AVPacket take?
PTS and DTS only have to establish the decoding and display order of the frames, incremented by one for each frame, rather than being the playback timestamps of the video.
However, it turns out that video decoded from RMVB does not have a fixed frame rate; the frame rate varies. So if every frame is compressed and pts/dts are simply incremented by one, the audio and video end up out of sync.
How can we solve the problem of audio/video synchronization?
See the following code snippet.
lTimeStamp is the timestamp of the current video frame obtained from DirectShow.
m_llframe_index is the number of frames compressed so far.
First, av_rescale computes the output timestamp that the frame currently being compressed would get. If that timestamp is already ahead of the timestamp of the frame DirectShow just delivered, the frame is dropped.
Otherwise the frame is compressed and pts and dts are set on the AVPacket, assuming there are no B-frames.
Because the video will later be played back at the fixed frame rate we configured, we compare the frame timestamp computed from that target frame rate with the timestamp of the current frame provided by DirectShow and decide whether a playback delay is needed. If a delay is needed, pkt.duration is set to more than one step so the frame is held longer; otherwise it is set to 1. dts is set the same as for normal-speed playback.
__int64 x = av_rescale(m_llframe_index, AV_TIME_BASE * (int64_t)c->time_base.num, c->time_base.den);
if (x > lTimeStamp)
{
    return true; // drop this frame
}
m_pvideoframe2->pts = lTimeStamp;
m_pvideoframe2->pict_type = 0;
int out_size = avcodec_encode_video(c, m_pvideo_outbuf, video_outbuf_size, m_pvideoframe2);
/* if zero size, it means the image was buffered */
if (out_size > 0)
{
    AVPacket pkt;
    av_init_packet(&pkt);
    if (x > lTimeStamp)
    {
        pkt.pts = pkt.dts = m_llframe_index;
        pkt.duration = 0;
    }
    else
    {
        pkt.duration = (lTimeStamp - x) * c->time_base.den / 1000000 + 1;
        pkt.pts = m_llframe_index;
        pkt.dts = pkt.pts;
        m_llframe_index += pkt.duration;
    }
    // pkt.pts = lTimeStamp * (__int64)frame_rate.den / 1000;
    if (c->coded_frame && c->coded_frame->key_frame)
    {
        pkt.flags |= PKT_FLAG_KEY;
    }
    pkt.stream_index = m_pvideostream->index;
    pkt.data = m_pvideo_outbuf;
    pkt.size = out_size;
    /* write the compressed frame into the media file */
    ret = av_interleaved_write_frame(m_pavformatcontext, &pkt);
}
else
{
    ret = 0;
}
Why do frames decoded by avcodec_decode_video sometimes get a smaller PTS than earlier frames?
The following code is provided:
while (av_read_frame(pFormatCtxSource, &packet) >= 0)
{
    if (packet.stream_index == videoStream)
    {
        int out_size = avcodec_decode_video(pCodecCtxSource, pFrameSource, &bFrameFinished, packet.data, packet.size); // decode from source
        if (bFrameFinished)
        {
            pFrameSource->pts = av_rescale_q(packet.pts, pCodecCtxSource->time_base, pStCodec->time_base);
            int out_size = avcodec_encode_video(pStCodec, video_buffer, 200000, pFrameSource); // encode to output
            if (out_size > 0)
            {
                // ...
            }
        }
    }
    av_free_packet(&packet);
}
When decoding, the pFrameSource->pts computed for the first frame is 96. When the second frame is decoded, pFrameSource->pts comes out as 80, and the next few frames are also smaller than 96. After a while a frame above 100 arrives, and the frames after it are again smaller than it. Why is that? On the encode side, the frame with pts = 96 is encoded first, and encoding the following frames with smaller pts returns -1 until a frame with pts greater than 96 shows up.
Also, is this way of computing PTS correct?
Reply:
Because you have B-frames.
For example, the input sequence for the video encoder is:
1 2 3 4 5 6 7
I B B P B B I
Let's take 1, 2, 3, ... as the PTS for simplification.
The output sequence of the video encoder (which equals the decoder input sequence) is:
1 4 2 3 7 5 6
I P B B I B B
So you will get a PTS sequence like this:
1 4 2 3 7 5 6
The 7 5 6 part is the same situation as in your question.
Q:
Oh, so PTS should not be computed the way I did it, simply adding 1 each time? Then where should the pts and dts in the packet be used? If I decode the data in storage order, do I need to buffer the frames myself? Thanks!
One more question: since the decoded pictures are not necessarily obtained in ascending PTS order, when I re-encode, should I encode in the order the frames come out of the decoder, or should I first buffer the frames and then encode strictly in the display order of the pictures? Expressed in code:
Method 1:
while (av_read_frame)
{
    decode;
    pts += 1;
    encode;
    output;
}
Method 2:
while (av_read_frame)
{
    decode;
    if (pts < previous)
    {
        cache the frame;
    }
    else
    {
        encode the cached frames and write them to the file;
    }
}
Which of the two methods is correct? The code I have seen online uses method 1, but I think method 2 is the right one?
A:
The output of the decoder is already in the right order for display, because I/P frames are cached until the next I/P frame arrives.
Understanding:
After the decoder, frames come out in normal PTS order, that is, display order; if there are B-frames, the decoder buffers frames internally.
After the encoder, however, packets come out in DTS order.
PTS and DTS here are not timestamps; they should be understood as frame sequence numbers. The duration of each frame is not necessarily the same and may change, so converting a PTS into a real timestamp means multiplying by the frame duration (the stream time base), not by the frame rate.
To deepen the understanding: think of the PTS as "this is the n-th frame to be displayed". If the frame duration never changes, the display timestamp is simply PTS × frame duration; if the frame duration changes, the per-frame durations have to be accumulated instead.
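In FFmpeg this conversion looks like the following small illustration (video_st stands for the AVStream in question; the variable names are assumptions):

/* Convert a PTS counted in stream time-base ticks into seconds.
 * av_q2d() turns the AVRational time base into a double (seconds per tick). */
double pts_seconds = pts * av_q2d(video_st->time_base);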
Add a trace under the decode call in tutorial05 and print the values:
len1 = avcodec_decode_video(is->video_st->codec, pFrame, &frameFinished,
                            packet->data, packet->size);
printf("-------------------------------------------------------------------------------\n");
printf("avcodec_decode_video packet->pts: %x, packet->dts: %x\n", packet->pts, packet->dts);
printf("avcodec_decode_video pFrame->pkt_pts: %x, pFrame->pkt_dts: %x, pFrame->pts: %x\n", pFrame->pkt_pts, pFrame->pkt_dts, pFrame->pts);
if (pFrame->opaque)
    printf("avcodec_decode_video *(uint64_t *)pFrame->opaque: %x\n", *(uint64_t *)pFrame->opaque);
Trace output for an MP4 file:
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 1ae, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 1ae
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 1af, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 1af
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 24c, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 1ac
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 24d, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 24d
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 24e, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 24e
Trace output for an RM file:
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 1831b, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 1831b
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 18704, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 18704
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 18aed, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 18aed
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 18ed6, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 18ed6
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 192bf, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 192bf
-----------------------------------------------------------------------------
avcodec_decode_video packet->pts: 196a8, packet->dts: 0
avcodec_decode_video pFrame->pkt_pts: 0, pFrame->pkt_dts: 80000000, pFrame->pts: 0
avcodec_decode_video *(uint64_t *)pFrame->opaque: 196a8
We can see that in the MP4 file the PTS increases by 1 each time, while in the RM file it increases in larger, irregular steps. When the packet fed to the decoder carries a PTS, the frame produced by the decoder is given that packet's PTS. When the packet contains only part of a frame, or is a B-frame, the decoder may not produce a frame at all, or may buffer it, because the frames it outputs must come out in proper PTS order; in those cases the frame's PTS can be empty. If the frame PTS (that is, the value in opaque) is empty, we look at the dts; if the dts is also missing, the frame's PTS is taken as 0.
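A sketch of that fallback logic as it sits around the decode call (tutorial-era API; the variable names are assumptions):

/* Pick a PTS for the frame we just decoded. */
double pts;
if (pFrame->opaque && *(uint64_t *)pFrame->opaque != AV_NOPTS_VALUE) {
    pts = *(uint64_t *)pFrame->opaque;   // PTS stashed by the get_buffer override
} else if (packet->dts != AV_NOPTS_VALUE) {
    pts = packet->dts;                   // fall back to the packet DTS
} else {
    pts = 0;                             // nothing usable: treat it as 0
}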
As for:
pts *= av_q2d(is->video_st->time_base); // i.e. convert the PTS into seconds using the stream time base
// Did we get a video frame?
if (frameFinished) {
    pts = synchronize_video(is, pFrame, pts);
    // synchronize_video does the following:
    // 1. If a PTS was obtained, use it.
    // 2. If no PTS was obtained, use the PTS (video clock) carried over from the previous frame.
    // 3. If the frame is to be displayed repeatedly, add (number of repeats * frame duration) on top.
    if (queue_picture(is, pFrame, pts) < 0) { // queue the decoded frame so it can be shown later
static double synchronize_video(VideoState *is, AVFrame *src_frame, double pts) {
    double frame_delay;
    if (pts != 0) {
        /* if we have pts, set the video clock to it */
        is->video_clock = pts;
    } else {
        /* if we aren't given a pts, set it to the clock */
        pts = is->video_clock;
    }
    /* update the video clock */
    // The key point: the pts passed in is the timestamp at which the current frame starts playing,
    // and frame_delay below is how long the frame will be shown, so (pts + frame_delay) is the
    // timestamp at which the next frame should be played.
    frame_delay = av_q2d(is->video_st->codec->time_base);
    /* if we are repeating a frame, adjust the clock accordingly */
    // If the frame is repeated, the extra display time has to be added as well.
    frame_delay += src_frame->repeat_pict * (frame_delay * 0.5);
    is->video_clock += frame_delay;
    return pts; // return this frame's display timestamp; is->video_clock now predicts the next frame's timestamp
}
A timer is then used to display the decoded frames from the frame queue. From the analysis above we know the frames were inserted into the queue in PTS order. The timer's job is to cope with the fact that the timestamps are not linear (frame durations differ and frames may repeat), so tutorial05 can only play by constantly adjusting the timer: it keeps catching up.
A netizen gave an intuitive and simple analogy:
ccq (183892517) 17:05:21
if (packet->dts == AV_NOPTS_VALUE — does that mean no DTS was obtained?
David CEN (3727567) 17:06:44
Think of the audio as a ruler and the video as an ant walking along it.
David CEN (3727567) 17:06:58
The ruler advances at constant speed, while the ant is sometimes fast and sometimes slow.
David CEN (3727567) 17:07:18
When the ant falls behind you whip it so it runs faster; when it gets ahead you hold it back.
David CEN (3727567) 17:07:38
That way the audio (the ruler) and the video (the ant) stay synchronized.
David CEN (3727567) 17:08:00
The biggest problem here is that the audio runs at constant speed while the video is non-linear.
In addition, the PTS stored in vp->pts has already been converted into a timestamp in seconds; together with the frame duration it determines when the next frame should be shown.
static void video_refresh_timer(void *userdata) {
    VideoState *is = (VideoState *)userdata;
    VideoPicture *vp;
    double actual_delay, delay, sync_threshold, ref_clock, diff;
    if (is->video_st) {
        if (is->pictq_size == 0) {
            schedule_refresh(is, 1);
        } else {
            vp = &is->pictq[is->pictq_rindex];
            delay = vp->pts - is->frame_last_pts; /* the pts from last time */
            // This is the interval between the frame about to be displayed and the previous one.
            if (delay <= 0 || delay >= 1.0) {
                /* if incorrect delay, use previous one */
                delay = is->frame_last_delay;
            }
            /* save for next time */
            is->frame_last_delay = delay;
            is->frame_last_pts = vp->pts;
            /* update delay to sync to audio */
            ref_clock = get_audio_clock(is); // get the timestamp of the audio currently being played
            diff = vp->pts - ref_clock; // how far this frame's display time is ahead of (or behind) the audio clock;
            // during that interval the audio advances at constant speed, while the video may run fast or slow.
            /* Skip or repeat the frame. Take delay into account.
               FFplay still doesn't "know if this is the best guess." */
            sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
            if (fabs(diff) < AV_NOSYNC_THRESHOLD) {
                if (diff <= -sync_threshold) {
                    // The video is behind the audio: show the next frame as soon as possible
                    // (after video_display shows the current frame, the timer fires almost immediately).
                    delay = 0;
                } else if (diff >= sync_threshold) {
                    // The video is ahead of the audio: stretch the interval between the two frames.
                    // For example, if the display interval between frame 1 and frame 2 is 40 ms but the
                    // audio that accompanies them lasts 55 ms, we cannot distort the audio, so we enlarge
                    // the display interval instead: where we would have waited 30 ms before showing the
                    // next frame, we now wait 60 ms.
                    delay = 2 * delay;
                }
            }
            // If diff exceeds AV_NOSYNC_THRESHOLD, the jump is too large (for example after a seek or fast
            // forward) and no audio/video synchronization is attempted.
            is->frame_timer += delay;
            /* compute the real delay */
            actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
            if (actual_delay < 0.010) {
                /* really, it should skip the picture instead */
                actual_delay = 0.010;
            }
            schedule_refresh(is, (int)(actual_delay * 1000 + 0.5)); // arm the timer for the next frame
            /* show the picture! */
            video_display(is); // display the current frame immediately
            /* update queue for next picture! */
            if (++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE) {
                is->pictq_rindex = 0;
            }
            SDL_LockMutex(is->pictq_mutex);
            is->pictq_size--;
            SDL_CondSignal(is->pictq_cond);
            SDL_UnlockMutex(is->pictq_mutex);
        }
    } else {
        schedule_refresh(is, 100);
    }
}