MP4 is one of the more complex media formats, and it originated from QuickTime. It took me a while to study it, and integrating it cleanly into a video-on-demand application took a great deal of effort; the main problem is handling the large "media header" of an MP4 file. Of course, on-demand streaming can also use the FLV format, and FLV can also encapsulate the video data, but Adobe does not recommend doing so; they say that, after all, MP4 is the best storage format for H.264.
Over the past few days I have reorganized and refactored my MP4 file parser and merged it with the splitting and merging programs. It was previously written in C as part of a server program running on Linux; I have now changed it to C++ so that it is convenient to use in other projects. As for porting it to C#, I have no use for that at the moment and will wait until it is necessary. This article first briefly introduces the general structure of an MP4 file and its segmentation algorithm; a later article will cover how to apply MP4 in on-demand projects.
I. MP4 format analysis
MP4 (MPEG-4 Part 14) is a common multimedia container format. It is defined in the "ISO/IEC 14496-14" standard and belongs to MPEG-4; it is an implementation of the format defined in "ISO/IEC 14496-12 (MPEG-4 Part 12: ISO base media file format)", which defines a general structure for media files. MP4 is a comprehensive container format that is considered able to embed any form of data, including video and audio in a variety of encodings, but most of the MP4 files we commonly see store AVC (H.264) or MPEG-4 encoded video and AAC encoded audio. The official file extension for the MP4 format is ".mp4", and there are other extended or reduced variants based on MP4, including M4V, 3GP, F4V, etc.
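An MP4 file is composed of "boxes". Large boxes contain smaller boxes, nested level by level, to store the media information. The basic structure of a box is a header, consisting of a 32-bit size followed by a 4-byte type, and then the box data.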
Here, size is the space occupied by the entire box, including the header. If the box is very large (for example, the mdat box that stores the actual media data) and its size exceeds the maximum value of a UInt32, size is set to 1 and the 8 bytes that follow the type field are used as a UInt64 that stores the real size.
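As a rough illustration of the header layout described above, here is a minimal Python sketch. The function name read_box_header and the two test buffers are made up for this example; the handling of size 0 (box runs to the end of the file) comes from the ISO base media file format rather than from this article.

```python
import struct

def read_box_header(buf, pos=0):
    """Return (box_type, box_size, header_size) for the box starting at pos."""
    # 32-bit big-endian size followed by a 4-character type code.
    size, raw_type = struct.unpack_from(">I4s", buf, pos)
    header_size = 8
    if size == 1:
        # size == 1: the real size is a 64-bit integer right after the type.
        (size,) = struct.unpack_from(">Q", buf, pos + 8)
        header_size = 16
    elif size == 0:
        # size == 0 (per the ISO spec): the box extends to the end of the data.
        size = len(buf) - pos
    return raw_type.decode("ascii"), size, header_size

# Hypothetical test data: a 16-byte "free" box and a 24-byte "mdat" box
# that uses the 64-bit size form.
small = struct.pack(">I4s", 16, b"free") + b"\x00" * 8
large = struct.pack(">I4sQ", 1, b"mdat", 24) + b"\x00" * 8

print(read_box_header(small))
print(read_box_header(large))
```

The header_size value is returned so that a caller knows where the box data starts, which is 8 or 16 bytes after the beginning of the box.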
An MP4 file may contain a very large number of box types, which greatly increases the complexity of parsing; the page http://mp4ra.org/atoms.html lists the currently registered box types. Seeing that many boxes, trying to support and parse every one of them would make your head explode. Fortunately, most MP4 files do not use that many box types. A simplified, common MP4 file consists of an ftyp box, a moov box that holds the media information (with one trak box per track, and the mdia, minf and stbl boxes nested beneath it), and an mdat box that holds the media data.
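To make the nesting concrete, the following Python sketch walks a buffer and lists the boxes it finds, descending only into a few well-known container boxes. The CONTAINERS set is deliberately partial, and walk_boxes, make_box and the tiny sample file are assumptions made for this example, not part of the original parser.

```python
import struct

# Boxes whose data is itself a sequence of boxes (partial list for the sketch).
CONTAINERS = {"moov", "trak", "mdia", "minf", "stbl"}

def walk_boxes(buf, start=0, end=None, depth=0):
    """Yield (depth, type, offset, size) for every box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:                      # 64-bit size follows the type
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:                    # box runs to the end of the range
            size = end - pos
        if size < header:
            break                          # corrupt size, stop instead of looping
        box_type = raw_type.decode("latin-1")
        yield depth, box_type, pos, size
        if box_type in CONTAINERS:
            yield from walk_boxes(buf, pos + header, pos + size, depth + 1)
        pos += size

def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size header (helper for the test data)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Hypothetical miniature file: ftyp, moov > trak > mdia, mdat.
sample = (make_box(b"ftyp", b"isom\x00\x00\x00\x00")
          + make_box(b"moov", make_box(b"trak", make_box(b"mdia")))
          + make_box(b"mdat", b"\x00" * 4))

tree = [(depth, box_type) for depth, box_type, _, _ in walk_boxes(sample)]
for depth, box_type in tree:
    print("  " * depth + box_type)
```

Unknown box types are simply skipped over by their size, which is what makes it practical to ignore most of the registered boxes.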
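In general, the most important things to obtain when parsing a media file are the video dimensions, duration, bitrate, encoding format, frame list, keyframe list, and the timestamp and file position of each frame. In MP4 this information is stored, according to specific rules, across several boxes under the stbl box, so you need to parse all the boxes below stbl to restore the media information. The important boxes and what they store are:
- stsd: sample descriptions (encoding format, video width and height, and so on)
- stts: the duration of each sample, from which the timestamps are derived
- stss: the list of keyframes (sync samples)
- stsc: the mapping between samples and chunks
- stsz: the size of each sample
- stco / co64: the offset of each chunk in the file (32-bit / 64-bit)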
As you can see, getting the frame list of an MP4 file is not easy: you have to parse layer by layer and then combine the information in stts, stsc, stsz, stss, stco and other boxes to restore the frame list with the timestamp and offset of each frame. You also have to take care of boxes that may or may not be present. MP4 groups frame samples into chunks and describes frames indirectly through chunks; the reason is to save storage space and reduce the share of the file taken up by media information. Among these boxes, stsc is relatively complex to parse. It uses a clever scheme to describe the mapping between samples and chunks, so it is introduced specifically here.
In the structure of the stsc box, the leading fields need no explanation. Each entry has three fields, first_chunk, samples_per_chunk and sample_description_index, which mean: "starting from the chunk numbered first_chunk, each chunk has samples_per_chunk samples, and the description of each sample can be found in the stsd box through the index sample_description_index." In other words, each entry describes a group of chunks with the same characteristic, namely that each chunk contains samples_per_chunk samples. You may then ask how many chunks are in such a group. That is calculated from the next entry: subtract this entry's first_chunk from the next entry's first_chunk to get the number of chunks in the group. The last entry means that every chunk from its first_chunk to the final chunk has samples_per_chunk samples. It's a mouthful, but that's what it means :). Since the stsc box alone does not tell you the total number of chunks in the file, you must get it from stco or co64. Parsing therefore takes two steps:
1. First, parse the entries directly.
2. Then, once you know the total number of chunks from stco or co64, restore the mapping table.
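The two steps above can be sketched in Python as follows. This is only an illustration, not the original C++ code: parse_stsc and expand_stsc are names chosen for this example, the payload is assumed to be the stsc box data after the 8-byte box header, and the test values are invented.

```python
import struct

def parse_stsc(payload):
    """Step 1: read the raw entries from the stsc box data.

    Layout assumed: 1 byte version + 3 bytes flags, a 32-bit entry count,
    then (first_chunk, samples_per_chunk, sample_description_index) triples.
    """
    (entry_count,) = struct.unpack_from(">I", payload, 4)
    return [struct.unpack_from(">III", payload, 8 + 12 * i)
            for i in range(entry_count)]

def expand_stsc(entries, chunk_count):
    """Step 2: restore one (samples_per_chunk, description_index) per chunk.

    chunk_count comes from stco or co64, because stsc does not store it.
    """
    per_chunk = []
    for i, (first_chunk, samples_per_chunk, desc_index) in enumerate(entries):
        if i + 1 < len(entries):
            next_first = entries[i + 1][0]   # the run ends where the next begins
        else:
            next_first = chunk_count + 1     # the last run goes to the final chunk
        run_length = next_first - first_chunk
        per_chunk.extend([(samples_per_chunk, desc_index)] * run_length)
    return per_chunk

# Hypothetical box data: chunks 1-3 hold 3 samples each, chunks 4+ hold 1.
payload = (struct.pack(">II", 0, 2)
           + struct.pack(">III", 1, 3, 1)
           + struct.pack(">III", 4, 1, 1))

entries = parse_stsc(payload)
mapping = expand_stsc(entries, chunk_count=5)
print(entries)
print(mapping)
```

With five chunks in total, the first entry covers chunks 1 to 3 and the second covers chunks 4 and 5, which is exactly the subtraction rule described in the text.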
After reading stsc, you can combine all the boxes under stbl to derive the video and audio frame lists with their timestamps and offsets, and from those the list of keyframes.
Once you have the list of keyframes, we can continue with our topic, namely MP4 file segmentation. Implementing MP4 segmentation is the most critical technical step in applying MP4 to an on-demand system; without it, you cannot support "dragging" (seeking) during on-demand playback of an MP4 film.
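One possible way to combine the tables is sketched below in Python. It assumes the boxes have already been parsed into plain lists (the argument names simply mirror the box names), that timestamps are decode times in the track's timescale units, and that a missing stss means every sample is a keyframe; composition offsets (ctts) are ignored. The Sample class and the numbers are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    number: int     # 1-based sample number
    dts: int        # decode timestamp in timescale units
    offset: int     # absolute position in the file
    size: int       # size in bytes
    keyframe: bool

def build_samples(stts, samples_per_chunk, stsz, stco, stss=None):
    """Combine the stbl tables into one list of samples."""
    # stts: (sample_count, sample_delta) runs -> one timestamp per sample.
    timestamps, t = [], 0
    for count, delta in stts:
        for _ in range(count):
            timestamps.append(t)
            t += delta
    sync = set(stss) if stss is not None else None
    samples, index = [], 0
    # Samples are stored back to back inside each chunk, so a sample's offset
    # is the chunk offset plus the sizes of the samples before it in the chunk.
    for chunk_offset, count in zip(stco, samples_per_chunk):
        offset = chunk_offset
        for _ in range(count):
            size = stsz[index]
            is_key = sync is None or (index + 1) in sync
            samples.append(Sample(index + 1, timestamps[index], offset, size, is_key))
            offset += size
            index += 1
    return samples

# Hypothetical track: 5 samples in 2 chunks, keyframes at samples 1 and 4.
samples = build_samples(
    stts=[(5, 1000)],
    samples_per_chunk=[3, 2],
    stsz=[100, 20, 30, 90, 25],
    stco=[48, 400],
    stss=[1, 4],
)
keyframes = [s for s in samples if s.keyframe]
for s in keyframes:
    print(s.number, s.dts, s.offset)
```

Dividing dts by the timescale of the track gives the time in seconds.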
II. MP4 file segmentation algorithm
The so-called "segmentation" means cutting a large file into smaller files. To implement MP4 segmentation:
- First, you need to get the keyframe list
- Then, select the time period you want to split out (starting at a keyframe)
- Next, regenerate the moov box (note that all the related boxes and box sizes need to change)
- Finally, copy the corresponding data and generate a new file
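The second step in the list above, picking the keyframe nearest to a requested time, could look like the following Python sketch. It assumes the keyframe timestamps are already sorted and non-empty; nearest_keyframe is a name made up for this example, and it uses a binary search from the standard bisect module instead of a linear traversal.

```python
from bisect import bisect_left

def nearest_keyframe(keyframe_times, target):
    """Return the index of the keyframe whose timestamp is closest to target.

    keyframe_times must be sorted in ascending order and non-empty.
    """
    i = bisect_left(keyframe_times, target)
    if i == 0:
        return 0                              # target is before the first keyframe
    if i == len(keyframe_times):
        return len(keyframe_times) - 1        # target is after the last keyframe
    before, after = keyframe_times[i - 1], keyframe_times[i]
    # Prefer the earlier keyframe on a tie so playback never starts late.
    return i - 1 if target - before <= after - target else i

# Hypothetical keyframe timestamps in timescale units.
times = [0, 3000, 6000, 9000]
print(nearest_keyframe(times, 4000))
print(nearest_keyframe(times, 5000))
```

If the split must never start after the requested time, always returning the keyframe at or before the target would be the safer variant.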
Step 1 has been covered above. Step 2 only requires traversing the keyframe list to find the keyframe closest to the time period you want to split out. Step 4 is "copy-paste" work. The key is step 3, because it involves all the boxes under stbl, whose entries must be regenerated. Most of the boxes are straightforward: you only need to keep the samples and chunks that belong to the selected keyframe range and delete the rest. Only the stsc box is more troublesome, because its entries have to be rebuilt from the chunks that remain.
After modifying these boxes, you need to regenerate the moov box. Its size and the length of the information inside it have changed, so the size field of every affected box must be updated accordingly. Do not forget this, otherwise the player will fail to parse the file. After rebuilding the boxes, you also have to calculate the length of the segmented data, because that has changed as well: update the size of the mdat box and, at the same time, the chunk offsets stored under stbl (stco or co64). Remember this!
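A simplified Python sketch of rebuilding stsc for a trimmed range follows. It is not the original code: it works on the expanded per-chunk sample counts, assumes a single sample description index, and the names trim_chunks and compress_stsc are invented for the example.

```python
def trim_chunks(samples_per_chunk, first_sample, last_sample):
    """Per-chunk sample counts after keeping samples first_sample..last_sample.

    Sample numbers are 1-based and inclusive. Chunks that lose all their
    samples disappear; the first and last kept chunks may become partial.
    """
    kept = []
    start = 1                                  # number of the first sample in this chunk
    for count in samples_per_chunk:
        end = start + count - 1
        lo, hi = max(start, first_sample), min(end, last_sample)
        if lo <= hi:
            kept.append(hi - lo + 1)
        start = end + 1
    return kept

def compress_stsc(samples_per_chunk, desc_index=1):
    """Run-length encode per-chunk counts back into stsc entries."""
    entries = []
    for chunk_number, count in enumerate(samples_per_chunk, start=1):
        # A new entry is needed only when the count differs from the previous run.
        if not entries or entries[-1][1] != count:
            entries.append((chunk_number, count, desc_index))
    return entries

# Hypothetical track: 11 samples in 5 chunks; keep samples 5 through 10.
old_counts = [3, 3, 3, 1, 1]
new_counts = trim_chunks(old_counts, first_sample=5, last_sample=10)
new_entries = compress_stsc(new_counts)
print(new_counts)
print(new_entries)
```

When the first kept chunk becomes partial, its chunk offset also has to move forward by the sizes of the samples dropped from the front of that chunk.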
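The offset fix-up can be illustrated with a short Python sketch. It assumes the new file is laid out as ftyp, moov, mdat and that the kept media data is copied as one contiguous block; the function names and all the sizes below are hypothetical.

```python
import struct

def shift_chunk_offsets(old_offsets, old_data_start, new_data_start):
    """Translate chunk offsets from the source file to the new file.

    old_data_start: position in the source file of the first copied byte.
    new_data_start: position in the new file where that byte is written.
    """
    delta = new_data_start - old_data_start
    return [offset + delta for offset in old_offsets]

def build_stco(offsets):
    """Serialize an stco box: size, type, version/flags, entry count, offsets."""
    payload = struct.pack(">II", 0, len(offsets))
    payload += b"".join(struct.pack(">I", offset) for offset in offsets)
    return struct.pack(">I4s", 8 + len(payload), b"stco") + payload

# Hypothetical numbers: the kept data started at byte 5000 in the source file;
# in the new file it follows a 24-byte ftyp, a 600-byte moov and the 8-byte
# mdat header.
ftyp_size, moov_size, mdat_header = 24, 600, 8
new_offsets = shift_chunk_offsets([5000, 5400, 6100],
                                  old_data_start=5000,
                                  new_data_start=ftyp_size + moov_size + mdat_header)
box = build_stco(new_offsets)
print(new_offsets, len(box))
```

Because every stco entry has a fixed width, the size of the moov box depends only on the number of chunks and not on the offset values, so it can be computed before the final offsets are known.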
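The entire logical process is therefore: parse the boxes, build the keyframe list, choose the split point, rebuild the boxes under stbl and the moov box, then write the new moov box and copy the media data into the new file.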
Well, once all of this has been implemented, we have what we need to build an MP4 on-demand system. However, MP4 on-demand raises some other issues that need to be addressed, which I will cover in the next article.
Parsing of MP4 file format and segmentation algorithm of MP4 file