AVI Data Format
To learn about ffmpeg, we study the AVI container along the way, in order to understand how ffmpeg reads the data streams in a container.
AVI (Audio Video Interleaved) is a file format based on RIFF (Resource Interchange File Format). It is mostly used in audio/video capture, editing, and playback applications. An AVI file can contain several streams of different media types (typically one audio stream and one video stream), but a file containing only a single audio stream or a single video stream is also valid. AVI is the most basic and most commonly used media file format on Windows.
First, the RIFF file format. RIFF uses four-character codes (FOURCC) to identify data types, such as 'RIFF', 'AVI ', and 'LIST'. A four-character code may legally contain spaces, as in 'AVI '. Note that on Windows the byte order is little-endian, so the DWORD value 0xA8B9C0D1 is stored in the file (or in memory) as the byte sequence D1 C0 B9 A8.
A RIFF file starts with the four-character code 'RIFF', which identifies it as a RIFF file. The next four bytes give the size of the RIFF file, followed by a four-character code for the specific file type (such as 'AVI ' or 'WAVE'), and finally the actual data. The file size value is computed as: actual data length + 4 (the size of the file type field). In other words, it does not include the 'RIFF' field or the size field itself.
The actual data of a RIFF file is usually organized into lists (LIST) and chunks (also called blocks). A list can contain nested sublists and chunks. The structure of a LIST is: 'LIST' listSize listType listData. Here 'LIST' is a four-character code indicating that this is a list; listSize occupies 4 bytes and records the size of the list; listType is another four-character code giving the specific type of the list; listData is the actual list data.
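To make the little-endian storage and the 12-byte RIFF header described above concrete, here is a minimal C sketch. The helper names (read_le32, fourcc_is, riff_avi_size) are hypothetical and chosen only for illustration; this is not how ffmpeg itself is written.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer,
 * independent of the byte order of the host machine. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Compare the four bytes at p with a four-character code
 * such as "RIFF" or "AVI " (note the trailing space). */
static int fourcc_is(const uint8_t *p, const char *code)
{
    return memcmp(p, code, 4) == 0;
}

/* Hypothetical helper: checks a 12-byte RIFF header and returns the stored
 * size value (data length + 4), or 0 if the buffer is not a RIFF/AVI header. */
static uint32_t riff_avi_size(const uint8_t *buf, size_t len)
{
    if (len < 12 || !fourcc_is(buf, "RIFF") || !fourcc_is(buf + 8, "AVI "))
        return 0;
    return read_le32(buf + 4);
}
```

Reading the size byte by byte, rather than casting the buffer to a DWORD pointer, keeps the sketch correct on big-endian hosts as well.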
In a LIST, the listSize value is computed as: actual length of listData + 4 (the size of the listType field). In other words, listSize does not include the 'LIST' field or the listSize field itself. Next, the chunk structure: ckID ckSize ckData. ckID is a four-character code that identifies the chunk type; ckSize occupies 4 bytes and records the size of the chunk data; ckData is the actual chunk data. Note that ckSize is the length of ckData only, excluding the ckID and ckSize fields. (Note: in the following, a LIST is written as LIST ( listType ( listData ) ), a chunk is written as ckID ( ckData ), and an optional element is written in square brackets, as in [ optional element ].)
Next, the AVI file format. The AVI file type is identified by the four-character code 'AVI '. The overall structure of an AVI file is: a RIFF header + two lists (one describing the format of the media streams, one holding the media stream data) + an optional index chunk. The structure of an AVI file is roughly as follows:
/*
 * here's the general layout of an AVI riff file (new format)
 *
 * RIFF (3F??????) AVI         <- not more than 1 GB in size
 *     LIST (size) hdrl
 *         avih (0038)
 *         LIST (size) strl
 *             strh (0038)
 *             strf (????)
 *             indx (3ff8)     <- size may vary, should be sector sized
 *         LIST (size) strl
 *             strh (0038)
 *             strf (????)
 *             indx (3ff8)     <- size may vary, should be sector sized
 *         LIST (size) odml
 *             dmlh (????)
 *         JUNK (size)         <- fill to align to sector - 12
 *     LIST (7f??????) movi    <- aligned on sector - 12
 *         00dc (size)         <- sector aligned
 *         01wb (size)         <- sector aligned
 *         ix00 (size)         <- sector aligned
 *     idx1 (00??????)         <- sector aligned
 * RIFF (7F??????) AVIX
 *     JUNK (size)             <- fill to align to sector - 12
 *     LIST (size) movi
 *         00dc (size)         <- sector aligned
 * RIFF (7F??????) AVIX        <- not more than 2 GB in size
 *     JUNK (size)             <- fill to align to sector - 12
 *     LIST (size) movi
 *         00dc (size)         <- sector aligned
 *
 *-===================================================================*/
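A tree like the layout above can be produced by walking the nested lists and chunks. The following is a minimal C sketch of such a walk over a buffer that already holds the file contents. The function name walk_chunks is hypothetical. The even-byte padding of chunk data is an assumption taken from the RIFF format rather than from the text above.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read a 32-bit little-endian size field. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: prints every list and chunk found in buf[0..len)
 * as an indented tree and returns how many were visited. */
static int walk_chunks(const uint8_t *buf, size_t len, int depth)
{
    size_t pos = 0;
    int visited = 0;

    while (pos + 8 <= len) {
        const uint8_t *ck = buf + pos;
        uint32_t size = rd_le32(ck + 4);

        if (size > len - pos - 8)
            break;                       /* truncated or corrupt input */
        visited++;

        /* 'RIFF' and 'LIST' share one shape: code, size, type, data */
        if ((memcmp(ck, "LIST", 4) == 0 || memcmp(ck, "RIFF", 4) == 0) && size >= 4) {
            printf("%*s%.4s (%u) %.4s\n", depth * 4, "",
                   (const char *)ck, (unsigned)size, (const char *)(ck + 8));
            /* the size covers the 4-byte type plus the nested data */
            visited += walk_chunks(ck + 12, size - 4, depth + 1);
        } else {
            printf("%*s%.4s (%u)\n", depth * 4, "",
                   (const char *)ck, (unsigned)size);
        }
        /* Assumption (RIFF convention): data is padded to an even length. */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return visited;
}
```

Calling walk_chunks(data, data_len, 0) on the bytes of a whole file would start at the outer 'RIFF' element and descend from there.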
'hdrl'
First, RIFF ('AVI ' ...) identifies the file as an AVI file. The first list an AVI file requires is the 'hdrl' list, which describes the format of each stream in the file (each kind of media data in an AVI file is called a stream). The 'hdrl' list contains a series of chunks and sublists. The first is the 'avih' chunk, which records global information about the AVI file, such as the number of streams and the width and height of the video image. It can be accessed through an AVIMAINHEADER structure:
typedef struct _avimainheader {
    FOURCC fcc;                    // must be 'avih'
    DWORD  cb;                     // size of this structure, excluding the first 8 bytes (fcc and cb)
    DWORD  dwMicroSecPerFrame;     // video frame interval (in microseconds)
    DWORD  dwMaxBytesPerSec;       // maximum data rate of the AVI file
    DWORD  dwPaddingGranularity;   // granularity of data padding (in bytes)
    DWORD  dwFlags;                // global flags of the AVI file, such as whether it contains an index chunk
    DWORD  dwTotalFrames;          // total number of frames
    DWORD  dwInitialFrames;        // initial frame count for interleaved files (0 for non-interleaved files)
    DWORD  dwStreams;              // number of streams in the file
    DWORD  dwSuggestedBufferSize;  // suggested buffer size for reading the file (should hold the largest chunk)
    DWORD  dwWidth;                // width of the video image (in pixels)
    DWORD  dwHeight;               // height of the video image (in pixels)
    DWORD  dwReserved[4];          // reserved
} AVIMAINHEADER;
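As an illustration of how these fields might be read, the sketch below decodes a few of them from the 56-byte (0x38) payload of the 'avih' chunk, that is, the bytes after fcc and cb. The AviMainInfo type and the function names are hypothetical. The byte offsets are an assumption derived from the field order above with 4-byte DWORDs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields this sketch extracts. */
typedef struct {
    uint32_t usec_per_frame;  /* dwMicroSecPerFrame */
    uint32_t flags;           /* dwFlags */
    uint32_t total_frames;    /* dwTotalFrames */
    uint32_t streams;         /* dwStreams */
    uint32_t width;           /* dwWidth */
    uint32_t height;          /* dwHeight */
} AviMainInfo;

static uint32_t avih_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes the chunk payload field by field instead of casting it to a
 * struct, so host byte order and struct padding do not matter.
 * Returns 0 on success, -1 if the payload is shorter than 56 bytes. */
static int parse_avih(const uint8_t *payload, size_t len, AviMainInfo *out)
{
    if (len < 56)
        return -1;
    out->usec_per_frame = avih_le32(payload + 0);
    out->flags          = avih_le32(payload + 12);
    out->total_frames   = avih_le32(payload + 16);
    out->streams        = avih_le32(payload + 24);
    out->width          = avih_le32(payload + 32);
    out->height         = avih_le32(payload + 36);
    return 0;
}

/* Frames per second implied by the frame interval; 0.0 if it is unset. */
static double avi_frame_rate(const AviMainInfo *m)
{
    return m->usec_per_frame ? 1e6 / (double)m->usec_per_frame : 0.0;
}
```

For instance, a frame interval of 40000 microseconds corresponds to 25 frames per second.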
'strl'
Next come one or more 'strl' sublists: there is one 'strl' sublist for each stream in the file. Each 'strl' sublist contains at least one 'strh' chunk and one 'strf' chunk, while the 'strd' chunk (which holds configuration information needed by the decoder) and the 'strn' chunk (which holds the stream name) are optional. The first is the 'strh' chunk, which holds the header information of the stream. It can be accessed through an AVISTREAMHEADER structure:
typedef struct _avistreamheader {
    FOURCC fcc;                    // must be 'strh'
    DWORD  cb;                     // size of this structure, excluding the first 8 bytes (fcc and cb)
    FOURCC fccType;                // stream type: 'auds' (audio stream), 'vids' (video stream),
                                   // 'mids' (MIDI stream), 'txts' (text stream)
    FOURCC fccHandler;             // stream handler; for audio and video this is the decoder
    DWORD  dwFlags;                // flags: is output of this stream enabled? does the palette change?
    WORD   wPriority;              // stream priority (among several streams of the same type,
                                   // the one with the highest priority is the default stream)
    WORD   wLanguage;              // language
    DWORD  dwInitialFrames;        // initial frame count for interleaved files
    DWORD  dwScale;                // time scale used by the stream
    DWORD  dwRate;                 // dwRate / dwScale gives the number of samples per second
    DWORD  dwStart;                // start time of the stream
    DWORD  dwLength;               // length of the stream (in units defined by dwScale and dwRate)
    DWORD  dwSuggestedBufferSize;  // suggested buffer size for reading this stream's data
    DWORD  dwQuality;              // quality indicator of the stream data (0 to 10,000)
    DWORD  dwSampleSize;           // sample size
    struct {
        short int left;
        short int top;
        short int right;
        short int bottom;
    } rcFrame;                     // display position of the stream (video or text stream) in the main video window;
                                   // the main video window is defined by dwWidth and dwHeight in the AVIMAINHEADER structure
} AVISTREAMHEADER;
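The relationship between dwScale, dwRate, and dwLength can be sketched as follows. The function names are hypothetical. The formulas assume the usual interpretation that dwRate / dwScale is the number of samples (frames, for video) per second and that dwLength is counted in those samples.

```c
#include <stdint.h>

/* Samples (or video frames) per second; 0.0 if dwScale is 0. */
static double stream_samples_per_second(uint32_t dwScale, uint32_t dwRate)
{
    return dwScale ? (double)dwRate / (double)dwScale : 0.0;
}

/* Stream duration in seconds; 0.0 if dwRate is 0. */
static double stream_duration_seconds(uint32_t dwLength, uint32_t dwScale,
                                      uint32_t dwRate)
{
    return dwRate ? (double)dwLength * (double)dwScale / (double)dwRate : 0.0;
}
```

With dwScale = 1 and dwRate = 25, a video stream of dwLength = 250 frames lasts 10 seconds. Rates that are not whole numbers are expressed through the ratio, for example dwScale = 1001 with dwRate = 30000.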
'strf'
Next comes the 'strf' chunk, which describes the specific format of the stream. For a video stream it holds a BITMAPINFO structure; for an audio stream it holds a WAVEFORMATEX structure.
Once every stream in the AVI file has been described by a 'strl' sublist, the job of the 'hdrl' list is done. (Note: the order in which the 'strl' sublists appear determines the stream IDs: the first 'strl' sublist describes stream 0, the second describes stream 1, and so on.) It is followed by the second list an AVI file requires, the 'movi' list, which stores the actual media stream data (video frame data or audio sample data). How is this data organized? Data chunks can be embedded directly in the 'movi' list, or several data chunks can be grouped into a 'rec ' list, which is then placed in the 'movi' list. (Note: when reading an AVI file, it is recommended to read all data chunks of a 'rec ' list at once.)
When an AVI file contains several streams, how are the data chunks of one stream told apart from those of another? Each data chunk is identified by a four-character code made up of a two-digit stream number followed by a two-character type code. The standard type codes are: 'db' (uncompressed video frame), 'dc' (compressed video frame), 'pc' (palette change), and 'wb' (audio data). For example, if stream 0 is audio, its data chunks are identified by '00wb'; if stream 1 is video, its data chunks are identified by '01db' or '01dc'. For video data, a new palette can be defined in the middle of the AVI data sequence. Each palette change chunk is identified by 'xxpc', and the new palette is described by an AVIPALCHANGE structure.
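The data chunk naming scheme described above can be decoded with a few lines of C. This is only a sketch: the ChunkKind enum and the function name are hypothetical, and the stream number is assumed to be written as two decimal digits.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of the two-character type code. */
typedef enum {
    CK_UNKNOWN,              /* any other code, e.g. one used by a text stream */
    CK_VIDEO_UNCOMPRESSED,   /* 'db' */
    CK_VIDEO_COMPRESSED,     /* 'dc' */
    CK_PALETTE_CHANGE,       /* 'pc' */
    CK_AUDIO                 /* 'wb' */
} ChunkKind;

/* Splits a code such as "01dc" into its stream number and type.
 * Returns the stream number, or -1 if the code does not start with
 * two decimal digits (assumption: stream numbers are decimal). */
static int parse_data_chunk_id(const char *id, ChunkKind *kind)
{
    if (!isdigit((unsigned char)id[0]) || !isdigit((unsigned char)id[1]))
        return -1;

    if (memcmp(id + 2, "db", 2) == 0)
        *kind = CK_VIDEO_UNCOMPRESSED;
    else if (memcmp(id + 2, "dc", 2) == 0)
        *kind = CK_VIDEO_COMPRESSED;
    else if (memcmp(id + 2, "pc", 2) == 0)
        *kind = CK_PALETTE_CHANGE;
    else if (memcmp(id + 2, "wb", 2) == 0)
        *kind = CK_AUDIO;
    else
        *kind = CK_UNKNOWN;

    return (id[0] - '0') * 10 + (id[1] - '0');
}
```

A demuxer could use the returned stream number to route each chunk of the 'movi' list to the stream described by the matching 'strl' sublist.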
(Note: if the palette of a stream may change midway, the AVISF_VIDEO_PALCHANGES flag should be set in the description of the stream, that is, in dwFlags of the AVISTREAMHEADER structure.) In addition, the data chunks of a text stream may use any two-character type code.
Finally, after the 'hdrl' list and the 'movi' list comes the optional index chunk of the AVI file. It indexes every media data chunk in the file and records each chunk's offset (which may be relative to the 'movi' list or relative to the start of the AVI file). The index chunk is identified by the four-character code 'idx1', and its contents are described by the AVIOLDINDEX structure:
typedef struct _avioldindex {
    FOURCC fcc;                  // must be 'idx1'
    DWORD  cb;                   // size of this structure, excluding the first 8 bytes (fcc and cb)
    struct _avioldindex_entry {
        DWORD dwChunkId;         // four-character code identifying the data chunk
        DWORD dwFlags;           // whether the chunk is a key frame, whether it is a 'rec ' list, and so on
        DWORD dwOffset;          // offset of the data chunk in the file
        DWORD dwSize;            // size of the data chunk
    } aIndex[];                  // an array: one index entry for each media data chunk
} AVIOLDINDEX;
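Each index entry occupies 16 bytes, so the payload of the 'idx1' chunk can be decoded entry by entry. The sketch below assumes the flag values AVIIF_LIST = 0x01 and AVIIF_KEYFRAME = 0x10 as defined in the Windows SDK headers. The IndexEntry type and parse_idx1 are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AVIIF_LIST     0x00000001u  /* entry refers to a 'rec ' list */
#define AVIIF_KEYFRAME 0x00000010u  /* entry refers to a key frame */

/* Hypothetical decoded form of one index entry. */
typedef struct {
    char     id[4];   /* dwChunkId, e.g. "00dc" (not NUL-terminated) */
    uint32_t flags;   /* dwFlags */
    uint32_t offset;  /* dwOffset */
    uint32_t size;    /* dwSize */
} IndexEntry;

static uint32_t idx_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes up to max entries from the 'idx1' payload (the bytes after
 * fcc and cb) and returns the number of entries written to out. */
static size_t parse_idx1(const uint8_t *payload, size_t len,
                         IndexEntry *out, size_t max)
{
    size_t n = len / 16;
    if (n > max)
        n = max;
    for (size_t i = 0; i < n; i++) {
        const uint8_t *e = payload + i * 16;
        memcpy(out[i].id, e, 4);
        out[i].flags  = idx_le32(e + 4);
        out[i].offset = idx_le32(e + 8);
        out[i].size   = idx_le32(e + 12);
    }
    return n;
}
```

A seek implementation could then scan the decoded entries for those with AVIIF_KEYFRAME set. Whether offset is relative to the 'movi' list or to the start of the file still has to be determined separately, as noted above.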
Note: if an AVI file contains an index chunk, the AVIF_HASINDEX flag should be set in the main AVI header, that is, in dwFlags of the AVIMAINHEADER structure. There is also a special data chunk identified by the four-character code 'JUNK'. It is used for internal data alignment (padding), and applications should ignore the contents of these chunks.
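Both checks in the note above are short in code. The sketch below assumes the value AVIF_HASINDEX = 0x10 as defined in the Windows SDK headers. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define AVIF_HASINDEX 0x00000010u  /* the file contains an 'idx1' index chunk */

/* Returns 1 if dwFlags of the main header announces an index chunk. */
static int avi_has_index(uint32_t dwFlags)
{
    return (dwFlags & AVIF_HASINDEX) != 0;
}

/* Returns 1 if the four-character code marks a padding chunk whose
 * contents a reader should skip. */
static int chunk_is_padding(const uint8_t *ckid)
{
    return memcmp(ckid, "JUNK", 4) == 0;
}
```

A reader that finds a padding chunk would simply advance past its ckSize bytes without interpreting them.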