MP4 File Format

Source: Internet
Author: User
In the MP4 file format, all content is stored in a container called a movie. A movie can be composed of multiple tracks. Each track is a media sequence that changes over time, for example a sequence of video frames. Each time unit in a track is a sample, which can be a frame of video or audio. Samples are arranged in chronological order. Note that a single frame of audio can decode into multiple audio samples, so in the definition of the MP4 file format the word "sample" is used to mean a timed frame or data unit rather than an individual audio sample. Each track has one or more sample descriptions. Each sample in the track is associated with a sample description by reference. The sample description defines how to decode the sample, for example which compression algorithm is used.

The MP4 file format differs from many other multimedia file formats in several concepts. Understanding these differences is the key to understanding the file format.

The physical format of the file does not constrain the media format. For example, many file formats split media data into frames, with a header or other data attached directly to each frame of video (for example, MPEG-2). This is not the case for MP4 files.

The physical layout of the file and the arrangement of media data are not constrained by the temporal order of the media. Video frames do not need to be stored in chronological order. This means that the file must contain structures that describe where the media data is placed and the corresponding timing information.

All the data in an MP4 file is encapsulated in boxes (previously called atoms). All metadata (media description metadata), including the data that defines media arrangement and timing, is contained in such structured boxes. The MP4 file format defines the formats of these boxes. The metadata references the media data (such as video frames). Media data can be contained in one or more boxes, or in other files. Metadata can use URLs to reference other files, and the arrangement of media data in these referenced files is described by the metadata in the primary file. The other files are not necessarily in MP4 format; for example, they may contain no boxes at all.
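
The box structure described above can be walked with very little code. The following is a minimal sketch, not taken from the original article. It assumes the usual box header layout: a 32-bit big-endian size followed by a four-character type, where a size of 1 signals that a 64-bit size follows and a size of 0 means the box runs to the end of the data. The names iter_boxes and make_box and the hand-built demo bytes are illustrative only.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad box size {size} at offset {pos}")
        yield box_type.decode("ascii"), pos + header, pos + size
        pos += size

def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size header (helper for the demo only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Hand-built demo data: an 'ftyp' box, then a 'moov' box containing 'mvhd'.
demo = make_box(b"ftyp", b"isom") + make_box(b"moov", make_box(b"mvhd", b"\x00" * 4))

top_level = list(iter_boxes(demo))
moov = top_level[1]
# Boxes nest, so the same walker can be applied to a box's payload range.
children = [box_type for box_type, _, _ in iter_boxes(demo, moov[1], moov[2])]
print(top_level)   # [('ftyp', 8, 12), ('moov', 20, 32)]
print(children)    # ['mvhd']
```

Because the walker only needs the size and type fields, it can skip any box it does not understand, which is one reason the container is easy to extend.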

There are many types of tracks, three of which are the most important. A video track contains video samples; an audio track contains audio samples; a hint track is slightly different. It describes how a streaming media server packs the media data in the file into packets that conform to a streaming protocol. If the file is only played locally, hint tracks can be ignored; they are relevant only to streaming.

Physical Structure of the media
Boxes define the tables used to find the arrangement of the media data. These include the data reference, the sample size table, the sample-to-chunk table, and the chunk offset table. Together, these tables locate the position and size of each sample of a track in the file.

A data reference allows media to be located in a secondary media file. In this way, a movie can be composed from multiple different files in a media database without copying them all into a new file. This is very helpful for video editing, for example.

To save space, these tables are compact. In addition, interleaving is not done sample by sample. Instead, several samples of one track are stored together, followed by several samples of another track. A run of consecutive samples from a single track is called a chunk. Each chunk has an offset in the file, measured from the beginning of the file. Within a chunk, the samples are stored contiguously.

In this way, if a chunk contains two samples, the location of the second sample is the offset of the chunk plus the size of the first sample. The chunk offset table gives the offset of each chunk. The mapping between sample numbers and chunk numbers is described in the sample-to-chunk table.
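
The lookup described above can be sketched as follows. This is an illustrative example with hypothetical, already-decoded tables, not code from the original article: chunk_offsets stands for the chunk offset table, sample_sizes for the sample size table, and samples_per_chunk for an expanded form of the sample-to-chunk table (the stored table is compressed, which this sketch does not model).

```python
def sample_offsets(chunk_offsets, samples_per_chunk, sample_sizes):
    """Return the absolute file offset of every sample in a track.

    chunk_offsets[i]     -- file offset of chunk i
    samples_per_chunk[i] -- number of samples stored in chunk i
    sample_sizes[j]      -- size in bytes of sample j
    """
    if len(chunk_offsets) != len(samples_per_chunk):
        raise ValueError("tables disagree on the number of chunks")
    if sum(samples_per_chunk) != len(sample_sizes):
        raise ValueError("tables disagree on the number of samples")
    offsets = []
    sample_index = 0
    for chunk_offset, count in zip(chunk_offsets, samples_per_chunk):
        pos = chunk_offset
        for _ in range(count):
            # Samples inside a chunk are contiguous, so each one starts
            # where the previous one ended.
            offsets.append(pos)
            pos += sample_sizes[sample_index]
            sample_index += 1
    return offsets

# Two chunks of two samples each; all values are made up.
offsets = sample_offsets(
    chunk_offsets=[1000, 5000],
    samples_per_chunk=[2, 2],
    sample_sizes=[300, 200, 150, 400],
)
print(offsets)  # [1000, 1300, 5000, 5150]
```

Note that the gap between the end of the first chunk (byte 1500) and the start of the second (byte 5000) is never touched by the lookup.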

Note that there may be dead space between chunks, where no media data is referenced, but there is no dead space inside a chunk. In this way, if some media data is no longer needed after editing, it can simply be left in place and unreferenced rather than deleted. Similarly, if the media is stored in a secondary file whose format differs from the MP4 file format, the headers or other structures of that foreign format can simply be skipped.

Temporal Structure of the media

Time in the file is described by a few structures. The movie and each track have a timescale. It defines a time axis by stating how many ticks there are per second. Choosing this number suitably allows exact timing. For an audio track, it is generally the audio sampling rate. For a video track, the situation is slightly more complicated and the value needs to be chosen with care. For example, a media timescale of 30000 with media sample durations of 1001 exactly defines NTSC video (often quoted, though not exactly, as 29.97 frames per second) and provides 19.9 hours of time in 32 bits.
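
The NTSC figures above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and treating the 32-bit counter as signed is an inference from the 19.9 hour figure, not something the text states.

```python
from fractions import Fraction

timescale = 30000        # ticks per second
sample_duration = 1001   # ticks per video frame

# The exact frame rate is a ratio, which 29.97 only approximates.
frame_rate = Fraction(timescale, sample_duration)
print(frame_rate, float(frame_rate))

# Longest time a signed 32-bit tick counter can hold at this timescale
# (assumption: signed, since that reproduces the 19.9 hour figure).
max_hours = (2**31 - 1) / timescale / 3600
print(round(max_hours, 1))  # 19.9
```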

The time structure of a track can be modified by an edit list, which serves two purposes: to move or reuse portions of a track's timeline within the overall movie, and to insert blank time, known as empty edits. Note that if a track does not start at the beginning of the program, the first edit in its edit list must be an empty edit.
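
As a rough illustration of how an edit list shapes a track's timeline, the sketch below turns a list of edits into presentation spans. It is not from the original article. It assumes each edit is a pair (segment_duration, media_time), with the segment duration in movie timescale ticks and a media time of -1 marking an empty edit; the function name edit_timeline and the sample values are hypothetical.

```python
EMPTY_EDIT = -1  # media_time value assumed to mark an empty edit

def edit_timeline(edits, movie_timescale):
    """Turn (segment_duration, media_time) edits into presentation spans.

    Returns a list of (start_seconds, end_seconds, media_time_or_None),
    where None means blank time with no media shown.
    """
    spans = []
    ticks = 0
    for segment_duration, media_time in edits:
        start = ticks / movie_timescale
        ticks += segment_duration
        end = ticks / movie_timescale
        spans.append((start, end, None if media_time == EMPTY_EDIT else media_time))
    return spans

# A track that starts 1 second into the movie and then plays for 2 seconds.
spans = edit_timeline([(600, EMPTY_EDIT), (1200, 0)], movie_timescale=600)
print(spans)  # [(0.0, 1.0, None), (1.0, 3.0, 0)]
```

The leading empty edit is what delays the track, matching the rule that a late-starting track must begin its edit list with an empty edit.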

The full duration of each track is defined in the file header, which gives a summary of the track. Each sample has a specified duration. The exact time of a sample, that is, its timestamp, is the sum of the durations of all preceding samples.
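
The timestamp rule above amounts to a running sum. A minimal sketch, using made-up durations and an illustrative function name:

```python
from itertools import accumulate

def sample_start_ticks(durations):
    """Start time of each sample in ticks: the sum of all earlier durations."""
    if not durations:
        return []
    return [0] + list(accumulate(durations))[:-1]

durations = [1001, 1001, 1001, 1001]   # made-up values, in ticks
timescale = 30000                      # ticks per second
starts = sample_start_ticks(durations)
track_duration = sum(durations)        # total track duration in ticks
print(starts)                          # [0, 1001, 2002, 3003]
print(track_duration / timescale)      # total duration in seconds
```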

Interleave

The temporal and physical structures of the file can be aligned, meaning that the media data is stored in the container in time order. In addition, if the media data of multiple tracks is contained in the same file, the media data can be interleaved. In general, to make it easy to read the media data of one track and to keep the tables compact, interleaving is done at a suitable time interval (for example, 1 second) rather than sample by sample. This reduces the number of chunks and therefore the size of the chunk offset table.
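
To give a feel for the saving, the sketch below counts chunk offset entries for one track under two interleaving choices. The figures (60 seconds of 30 frames per second video) and the function name are made up for illustration.

```python
import math

def chunk_offset_entries(total_samples, samples_per_chunk):
    """Number of chunk offset entries needed for one track."""
    return math.ceil(total_samples / samples_per_chunk)

total_samples = 60 * 30   # 60 seconds of 30 fps video (made-up figures)
per_sample = chunk_offset_entries(total_samples, 1)    # interleave every sample
per_second = chunk_offset_entries(total_samples, 30)   # interleave every second
print(per_sample, per_second)  # 1800 60
```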

Composition
If multiple audio tracks are included in the same file, they can be mixed together for playback, controlled by an overall track volume and left/right balance.

Similarly, video tracks can be composed according to the layer number of each track (from back to front). In addition, each track can be transformed by a matrix, and the movie as a whole can be transformed by a single matrix. This allows simple operations (such as enlarging an image or correcting a 90° rotation) as well as more complex ones (such as shearing or arbitrary rotation).
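
The kinds of transformation mentioned above can be illustrated with a plain 2D affine matrix. This is a simplified sketch, not the on-disk representation: the six-coefficient tuple (a, b, c, d, tx, ty), the row-vector convention, and all names are assumptions made for the example.

```python
def apply_matrix(point, m):
    """Transform point (x, y) by affine matrix m = (a, b, c, d, tx, ty).

    Convention assumed here: x' = a*x + c*y + tx and y' = b*x + d*y + ty.
    """
    x, y = point
    a, b, c, d, tx, ty = m
    return (a * x + c * y + tx, b * x + d * y + ty)

# Simple operation: enlarge the image by a factor of two.
SCALE_2X = (2, 0, 0, 2, 0, 0)

def rotate_90_clockwise(height):
    """90-degree clockwise rotation in screen coordinates (y points down),
    shifted right by the source height so the result has no negative x."""
    return (0, 1, -1, 0, height, 0)

# More complex operation: horizontal shear.
SHEAR_X = (1, 0, 0.5, 1, 0, 0)

width, height = 4, 3
scaled = apply_matrix((3, 5), SCALE_2X)
top_right = apply_matrix((width, 0), rotate_90_clockwise(height))
bottom_left = apply_matrix((0, height), rotate_90_clockwise(height))
sheared = apply_matrix((2, 4), SHEAR_X)
print(scaled)        # (6, 10)
print(top_right)     # (3, 4): the top-right corner moves to the bottom-right
print(bottom_left)   # (0, 0): the bottom-left corner moves to the top-left
print(sheared)       # (4.0, 4)
```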

This composition method is deliberately simple and serves as a default. Other MPEG-4 documents define more powerful methods (for example, MPEG-4 BIFS).

There are some good tools in the Darwin Streaming Server to help analyze the MP4 file format.

However, parsing the file byte by byte yourself gives a better understanding of the MP4 file format. Here I analyze the file structure byte by byte. The example file is sample_100kbit.mp4 from DSS.
