I had planned to start with the history of MP4 and a survey of common video file formats. On reflection that is unnecessary: this article is about the technical details of dragging (seeking) in MP4 video on demand, so a long introduction or preface would be superfluous.
Any discussion of MP4 has to start with the concept of a "digital container format". Wikipedia explains it as follows:
A container or wrapper format is a metafile format whose specification describes how different elements of data and metadata coexist in a computer file.
Here we simply call it a container. MP4 is a container: it holds information describing what is stored inside, how it is laid out, where each piece is placed, how large it is, and so on. MP4 is far from the only container; 3GP, ASF, AVI, MKV, RM and many other common video formats are containers as well.
A useful tool here is Mp4parse.exe, which can be found online. Open an MP4 file with it and you will see that the inside of an MP4 file looks like this:
The figure shows a hierarchy similar to the Windows registry; clicking a '+' expands a node to reveal its internal structure. The ftyp, moov, mvhd, trak, and mdat entries in the figure are called boxes in MP4. Every box has the same basic structure: a box header followed by box data.
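The "box header + box data" structure can be sketched in a few lines of Python. This is only an illustration, assuming the usual header layout of a 32-bit big-endian size followed by a four-character type, with a size of 1 meaning that a 64-bit size follows and a size of 0 meaning that the box runs to the end of the data. The function name parse_box_header and the sample bytes are made up for this example.

```python
import struct

def parse_box_header(buf, offset=0):
    """Return (box_type, header_size, box_size) for the box starting at offset."""
    # Assumed layout: 4-byte big-endian size, then a 4-byte ASCII type.
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    header_size = 8
    if size == 1:
        # A size of 1 means a 64-bit size follows the type field.
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_size = 16
    elif size == 0:
        # A size of 0 means the box extends to the end of the data.
        size = len(buf) - offset
    return box_type.decode("ascii"), header_size, size

# Hypothetical sample: a 16-byte ftyp box followed by an empty 8-byte moov box.
sample = struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 0) + struct.pack(">I4s", 8, b"moov")

first = parse_box_header(sample, 0)
second = parse_box_header(sample, first[2])  # the next box starts where the first one ends
print(first, second)
```

The box size covers the header as well as the data, which is why adding it to the current offset lands on the next box.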
The official MP4 specification defines many box types, and there are articles on the web explaining the role of each one, so they are not listed here. We will meet the relevant ones gradually as we work through the principle of file dragging.
To handle a drag request, the server has to parse the MP4 format. Assume the file already exists on our server's disk under the name Test.mp4. Parsing requires reading the file, and there are several strategies for doing so. One is to bring the whole file into memory with read or mmap, but this is not feasible for very large files. Another common approach is to read a piece of content only when it is needed: for example, read only box A, finish parsing it, and then read box B. The analysis in this article works this way. Its cost is a large number of I/O operations. A middle ground is to read a fixed-size chunk each time, parse it, and then read the next chunk, which trades off I/O against memory consumption. These are optimization questions, and we will not discuss them further for now.
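The read-on-demand strategy might look roughly like the following. It is a sketch rather than a finished parser: it assumes the common 8-byte box header (32-bit big-endian size plus four-character type, with the 64-bit and to-end-of-file size variants), and the helper names iter_top_level_boxes and fake_box are invented for the example. Only the headers are read; the box data is skipped with seek.

```python
import io
import struct

def iter_top_level_boxes(f):
    """Yield (box_type, offset, size) for each top-level box, reading only headers."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    offset = 0
    while offset + 8 <= file_size:
        f.seek(offset)
        size, box_type = struct.unpack(">I4s", f.read(8))
        if size == 1:
            # 64-bit size stored right after the type field
            (size,) = struct.unpack(">Q", f.read(8))
        elif size == 0:
            # box runs to the end of the file
            size = file_size - offset
        if size < 8:
            break  # malformed box; stop instead of looping forever
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size  # skip the box data without reading it

def fake_box(box_type, payload):
    """Build a box in memory so the example runs without a real file."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

demo = io.BytesIO(
    fake_box(b"ftyp", b"isom")
    + fake_box(b"moov", b"\x00" * 20)
    + fake_box(b"mdat", b"\x00" * 100)
)
layout = list(iter_top_level_boxes(demo))
print(layout)
```

With a real file, the same function could be called on open("Test.mp4", "rb"). Each box costs one seek and one small read, which is the I/O price mentioned above.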
An MP4 on-demand drag request is usually expressed through start and end parameters in the URL. For example:
http://test.com/vedio/mp4/test.mp4?start=10.01&&end=100.00
From the program's point of view, start and end look like values of type double. They are times, in seconds. The request above asks for the part of Test.mp4 from 10.01 seconds to 100 seconds of playback time. You may object that it is more common to send only start and no end: when a player drags the video it is just looking for a point to start playing from, so end can be omitted and defaults to the end of the file. However, a site may want to save bandwidth by loading only 5 minutes of data after a drag, that is, end = start + 5 min, in which case both start and end appear in the request.
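On the server side, extracting the two parameters could be sketched as follows. The function name parse_time_range and its max_window argument are assumptions made for this example; max_window stands in for the "only load 5 minutes" policy and is not something the URL itself carries.

```python
from urllib.parse import urlsplit, parse_qs

def parse_time_range(url, max_window=None):
    """Return (start, end) in seconds; end is None when the request runs to the end of the file."""
    query = parse_qs(urlsplit(url).query)
    start = float(query.get("start", ["0"])[0])
    end = float(query["end"][0]) if "end" in query else None
    if end is None and max_window is not None:
        # Hypothetical bandwidth-saving policy: serve at most max_window seconds.
        end = start + max_window
    if start < 0 or (end is not None and end <= start):
        raise ValueError("invalid start/end")
    return start, end

url = "http://test.com/vedio/mp4/test.mp4?start=10.01&&end=100.00"
print(parse_time_range(url))  # (10.01, 100.0)
```

parse_qs skips the empty field produced by the doubled '&', so the example URL parses as written.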
This raises a question: why not use byte offsets as the start and end parameters, so that the video server can simply send the file contents at those offsets? First, that case does not need URL parameters at all; an HTTP request with a Range header is enough. Second, much of the key offset information needed for dragging is stored in the video file on the server side, so it is hard for the front-end player to obtain it when it starts playing.
You might then ask: if byte offsets are hard for the player to obtain, how does it obtain times? Take Youku as an example:
Http://v.youku.com/v_show/id_XMTI3NjM1Nzk1Ng==_ev_1.html
This is only the HTML file of the playback page, not the actual video to be played. The request is still critical, because it gives the player's front end information such as the total duration of the video. That information can be found in the HTML source:
The final value, 3611.82, is the total video length in seconds, which is about 60 minutes and 11 seconds.
Later, when we drag the video progress bar, the front end takes the cursor position on the bar, converts it proportionally into a time, and sends that time as the start parameter in an HTTP request for the corresponding content.
mouse offset (known) / total length of progress bar (known) = time offset (unknown) / total duration (known)
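Solving the proportion for the unknown time offset gives a one-line calculation. The sketch below is illustrative only; the function name drag_to_start_time and the pixel values are invented, while 3611.82 is the total duration quoted above.

```python
def drag_to_start_time(mouse_offset, bar_length, total_duration):
    """Map a cursor position on the progress bar to a playback time in seconds."""
    if bar_length <= 0:
        raise ValueError("bar_length must be positive")
    # Clamp so that a cursor slightly outside the bar still maps to a valid time.
    ratio = min(max(mouse_offset / bar_length, 0.0), 1.0)
    return ratio * total_duration

# Hypothetical numbers: an 800-pixel bar clicked at 200 pixels.
start = drag_to_start_time(200, 800, 3611.82)
minutes, seconds = divmod(int(start), 60)
print(round(start, 3), minutes, seconds)
```

The resulting value is what the front end would send as the start parameter.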
(To be continued)
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
On the Principle and Analysis of MP4 Video Dragging (Part 1)