FFmpeg decoding process
1. Starting from the basics
First, a few concepts that will make the later analysis easier to follow.
Container: the container of audio and video. It generally refers to a specific file format, which specifies how the audio, video, subtitles, and other related data are laid out.
Stream: a subtle word used in many places (TCP, the SVR4 STREAMS subsystem, and so on). Here it can be understood as a pure sequence of audio or video data.
Frame: a somewhat fuzzy concept; it refers to one unit of data within a stream. To really understand it you may need to read some theory on audio/video encoding and decoding.
Packet: the raw (demultiplexed but still encoded) data of a stream.
Codec: coder + decoder.
In fact, these concepts map directly onto FFmpeg's data structures, as we will see in the analysis below.
2. Basic decoding process
I am lazy, so I borrowed the process overview from <An FFmpeg and SDL Tutorial>:
10 open video_stream from video.avi
20 read packet from video_stream into frame
30 if frame not complete goto 20
40 do something with frame
50 goto 20
This is the entire decoding process. At first glance it looks simple, doesn't it? :) However, going from the surface down into the depths, and then back from the depths to the surface, can be an interesting journey. Let's start the story.
3. Sample Code
<An FFmpeg and SDL Tutorial 1> provides a bare-bones decoder. Let's take a closer look.
For convenience, I have pasted the code below:
#include <ffmpeg/avcodec.h>
#include <ffmpeg/avformat.h>
#include <stdio.h>

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
    FILE *pFile;
    char szFilename[32];
    int y;

    // Open file
    sprintf(szFilename, "frame%d.ppm", iFrame);
    pFile = fopen(szFilename, "wb");
    if (pFile == NULL)
        return;

    // Write header
    fprintf(pFile, "P6\n%d %d\n255\n", width, height);

    // Write pixel data
    for (y = 0; y < height; y++)
        fwrite(pFrame->data[0] + y * pFrame->linesize[0], 1, width * 3, pFile);

    // Close file
    fclose(pFile);
}

int main(int argc, char *argv[]) {
    AVFormatContext *pFormatCtx;
    int i, videoStream;
    AVCodecContext *pCodecCtx;
    AVCodec *pCodec;
    AVFrame *pFrame;
    AVFrame *pFrameRGB;
    AVPacket packet;
    int frameFinished;
    int numBytes;
    uint8_t *buffer;

    if (argc < 2) {
        printf("Please provide a movie file\n");
        return -1;
    }

    // Register all formats and codecs
    /* ######################################## */
    /* [1]                                      */
    /* ######################################## */
    av_register_all();

    // Open video file
    /* ######################################## */
    /* [2]                                      */
    /* ######################################## */
    if (av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL) != 0)
        return -1; // Couldn't open file

    // Retrieve stream information
    /* ######################################## */
    /* [3]                                      */
    /* ######################################## */
    if (av_find_stream_info(pFormatCtx) < 0)
        return -1; // Couldn't find stream information

    // Dump information about file onto standard error
    dump_format(pFormatCtx, 0, argv[1], 0);

    // Find the first video stream
    videoStream = -1;
    for (i = 0; i < pFormatCtx->nb_streams; i++)
        if (pFormatCtx->streams[i]->codec->codec_type == CODEC_TYPE_VIDEO) {
            videoStream = i;
            break;
        }
    if (videoStream == -1)
        return -1; // Didn't find a video stream

    // Get a pointer to the codec context for the video stream
    pCodecCtx = pFormatCtx->streams[videoStream]->codec;

    // Find the decoder for the video stream
    pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
    if (pCodec == NULL) {
        fprintf(stderr, "Unsupported codec!\n");
        return -1; // Codec not found
    }

    // Open codec
    if (avcodec_open(pCodecCtx, pCodec) < 0)
        return -1; // Could not open codec

    // Allocate video frame
    pFrame = avcodec_alloc_frame();

    // Allocate an AVFrame structure
    pFrameRGB = avcodec_alloc_frame();
    if (pFrameRGB == NULL)
        return -1;

    // Determine required buffer size and allocate buffer
    numBytes = avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                                  pCodecCtx->height);
    buffer = (uint8_t *)av_malloc(numBytes * sizeof(uint8_t));

    // Assign appropriate parts of buffer to image planes in pFrameRGB
    // Note that pFrameRGB is an AVFrame, but AVFrame is a superset
    // of AVPicture
    avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
                   pCodecCtx->width, pCodecCtx->height);

    // Read frames and save first five frames to disk
    /* ######################################## */
    /* [4]                                      */
    /* ######################################## */
    i = 0;
    while (av_read_frame(pFormatCtx, &packet) >= 0) {
        // Is this a packet from the video stream?
        if (packet.stream_index == videoStream) {
            // Decode video frame
            avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                                 packet.data, packet.size);
            // Did we get a video frame?
            if (frameFinished) {
                // Convert the image from its native format to RGB
                img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                            (AVPicture *)pFrame, pCodecCtx->pix_fmt,
                            pCodecCtx->width, pCodecCtx->height);
                // Save the frame to disk
                if (++i <= 5)
                    SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height, i);
            }
        }
        // Free the packet that was allocated by av_read_frame
        av_free_packet(&packet);
    }

    // Free the RGB image
    av_free(buffer);
    av_free(pFrameRGB);

    // Free the YUV frame
    av_free(pFrame);

    // Close the codec
    avcodec_close(pCodecCtx);

    // Close the video file
    av_close_input_file(pFormatCtx);

    return 0;
}
The code is clearly commented, so there is not much more to explain. As for YUV420, RGB, PPM, and the other formats: if you are not familiar with them, please google them, or see the related articles at http://barrypopy.cublog.cn/.
In fact, this code is already a decent demo of how to implement frame capture, but we want to look at the magic behind the curtain instead of simply enjoying the show.
4. The Story Behind
The real difficulty lies in parts [1], [2], [3], and [4] above. Everything else is just conversion between data structures, and if you read the code carefully, those parts are not hard to understand.
[1]: There is nothing much to say. If you don't understand it, read the article on the FFmpeg framework that I reposted.
[2]: First, let's talk about the AVFormatContext *pFormatCtx structure.
As the name says, it is the context of an AVFormat (in fact, the container format we mentioned above);
it is the overall control structure that stores the container information, and you will see that
basically all the information can be obtained from it.
Let's take a look at what av_open_input_file() does:
[libavformat/utils.c]
int av_open_input_file(AVFormatContext **ic_ptr, const char *filename,
                       AVInputFormat *fmt,
                       int buf_size,
                       AVFormatParameters *ap)
{
    ......
    if (!fmt) {
        /* guess format if no file can be opened */
        fmt = av_probe_input_format(pd, 0);
    }
    ......
    err = av_open_input_stream(ic_ptr, pb, filename, fmt, ap);
    ......
}
Here it does only two things:
1) Detect the container file format
2) Obtain the stream information from the container file
These two tasks amount to calling the demuxer for the specific file format to separate out the streams.
The concrete flow is as follows:
av_open_input_file
    |
    +----> av_probe_input_format    traverses all registered demuxers, starting from
    |                               first_iformat, and calls each one's probe function
    |
    +----> av_open_input_stream     calls the read_header function of the chosen demuxer
                                    (ic->iformat->read_header) to obtain the stream information
If you cross-reference my reposted article on the FFmpeg framework, this should be clear now :)
[3]: This just retrieves the stream information from the AVFormatContext; there is nothing special to mention.
[4]: First a word about FFmpeg: in theory, a packet may contain only part of a frame.
However, for ease of implementation, FFmpeg ensures that each packet it returns contains at least one
complete frame. This is an implementation consideration, not a protocol requirement.
Therefore, the code above really does this:
read a packet from the file;
decode a frame from the packet;
if (a complete frame was decoded)
    do_something();
Now let's look at how a packet is obtained and how a frame is decoded from it:
av_read_frame
    |
    +----> av_read_frame_internal
               |
               +----> av_parser_parse    calls the s->parser->parser_parse function of the
                                         stream's parser to reconstruct frames from raw packets

avcodec_decode_video
    |
    +----> avctx->codec->decode    calls the decode function of the chosen codec
So from the flow above we can see that the whole thing splits into two parts: one part demultiplexes (the demuxer), and the other decodes:

av_open_input_file()   ----> demultiplexing

av_read_frame()        --+
                         +----> decoding
avcodec_decode_video() --+
5. What comes next?
Combining this part with the reposted FFmpeg framework article, we should now have a basic grasp of the decoding process. The remaining problems are the analysis of specific container formats and specific decoders, which we will continue with later.
References:
[1] <An FFmpeg and SDL Tutorial>
    http://www.dranger.com/ffmpeg/tutorial01.html
[2] <FFmpeg framework code reading>
    http://blog.csdn.net/wstarx/archive/2007/04/20/1572393.aspx