Streaming media: streaming media file formats in detail

Source: Internet
Author: User
Tags format definition icecast

Abstract

The streaming media file format plays an important role in a streaming media system, so designing a reasonable file format is the most direct and effective way to improve the efficiency of a streaming media server. Based on an analysis of common streaming media systems and file formats, this paper examines in depth the Ogg file format, a subproject of the open source streaming media project of the U.S. Xiph.org Foundation. It points out that the Ogg format is concise for storing, reading, and transmitting encoded media data, and that Ogg mapping and inverse mapping are relatively independent of the media encoding, which can effectively improve the efficiency of a streaming media server.

1 Introduction

Streaming media refers to continuous, time-based media such as audio, video, and other multimedia files delivered over the Internet or an intranet using streaming transmission technology. File formats and transfer protocols are the main techniques of streaming media applications. Viewed from different angles, streaming media data has three kinds of formats: compressed format, file format, and publishing format. The compressed format describes how the media data in a streaming media file is encoded and decoded. The file format describes how streaming media is organized on the server side and provides a standardized way to exchange data. The publishing format is the arrangement of media presented to clients. This article discusses the second type, the streaming media file format, and in particular analyzes an open source media file format: Ogg. Ogg is a subproject of the open source streaming media project developed by the Xiph.org Foundation, designed for the storage and transmission needs of its open source audio and video compression formats, Vorbis and Theora.
Based on research into and analysis of existing streaming media systems, combined with the experience and lessons learned in developing a new streaming media system, this paper analyzes the streaming media file format systematically and in depth, aiming to understand streaming media systems and find a way to improve their efficiency.

2 Streaming media file format analysis

2.1 The importance of file format in streaming media systems

A simplified streaming media system consists of a streaming media server, clients, and a transmission network; the server is its core. As streaming media technology develops and its applications expand, how to improve the efficiency of a streaming media system has become a topic of widespread concern. The main indicator is the number of concurrent media streams a server can sustain. That number depends on how efficiently the server processes each stream, and it determines how many clients the server can serve at the same time, so results in this area have both theoretical and considerable economic value.

Figure 1 shows some of the main factors affecting the performance of a streaming media system. A careful look at each of these factors shows that, for an existing streaming media server, there are not many ways to effectively improve its efficiency.
For example, upgrading the hardware of the client or server only buys performance; efficiency itself is not improved. The real-time transport and control protocols are common standards for media streaming, distilled from many years of practical use and confirmed by the IETF (Internet Engineering Task Force); they must be complied with and cannot be changed. The quality of the program source, the compression method, and similar factors are unpredictable or uncontrollable from the server's point of view. Given that these objective factors cannot be changed, optimizing the streaming media file format remains a possible way to improve the efficiency of the streaming media server.

The file format can affect server efficiency because of the way a streaming server operates. The main task of a streaming server is to deliver streaming media content to users, live or on demand: it reads the streaming media files stored on disk, encapsulates the data according to the real-time transport protocol, and sends the result to the client over the IP network. In short, its workflow has three steps: read, package, send. Each outgoing media stream needs its own process, and every process must run in real time, so when a server carries thousands of concurrent streams, a small difference in the efficiency of each process has a great impact on the efficiency of the server as a whole.

Figure 1 Simplified streaming media system structure and its influencing factors

The input of each workflow is a streaming media file, and the output is a media packet. The data content does not change between input and output: both are compressed multimedia bitstreams that differ only in format. From the data-flow point of view, then, the main work of the server is a format conversion. Because the format of the media packet is fixed in advance by the transport protocol, how convenient the streaming media file format is for the server to read and encapsulate determines the server's workload.

2.2 Analysis and comparison of streaming media file formats

Given the importance of streaming media files in streaming media systems, reference [2] analyzes in depth two representative formats, the QuickTime movie file (MOV) and the Microsoft Media Server movie file (ASF), together with the processing steps related to them. These file formats were found to have a negative impact on server efficiency:

(1) Disk access throughput is low. Packaging each media packet requires reading one frame of data; at roughly 1 KB per frame and about 25 frames per second, this causes frequent disk access and low throughput.

(2) In the QuickTime MOV file format, the media data is not preprocessed. For every packet, the server must obtain the packaging parameters from the hint track, then read the media data, encapsulate it, and send it in real time, so CPU usage is very high.

(3) In the Microsoft ASF file format, the media data is already stored in packets as semi-finished MMS packages. This saves the server the time needed to extract the media stream, but the server still has to select the media streams and assemble the MMS packages. Moreover, not all of the data in a packet needs to be sent, so memory space and disk I/O time are wasted.
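The cost of per-frame disk access described in point (1) can be estimated with simple arithmetic. The sketch below uses the frame size and frame rate quoted above; the stream count of 1,000 and the idea of fetching 64 frames per read are illustrative assumptions, not figures from reference [2].

```python
def disk_reads_per_second(streams, frames_per_second=25, frames_per_read=1):
    """Read requests per second when each read fetches `frames_per_read` frames."""
    return streams * frames_per_second / frames_per_read


FRAME_BYTES = 1024   # about 1 KB per frame, as quoted in the text
STREAMS = 1000       # assumed number of concurrent streams

per_frame = disk_reads_per_second(STREAMS)
batched = disk_reads_per_second(STREAMS, frames_per_read=64)  # hypothetical 64-frame reads

print(per_frame)                          # 25000.0 small reads per second
print(batched)                            # 390.625 larger reads per second
print(STREAMS * 25 * FRAME_BYTES / 1e6)   # 25.6 MB/s of payload in both cases
```

The payload rate is the same in both cases; only the number of separate disk requests changes, which is why a format that lets the server read larger units at a time helps throughput.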
Reference [3] proposes a new streaming media file format, NMF, which has the following basic structure (shown in Figure 2) and features:

Figure 2 NMF file structure

An NMF streaming media file is composed of a header file and body files. The header file mainly contains the necessary descriptive information, such as the file description, media description, and stream description; the body files contain all the media data. One NMF consists of a header file and several body files: different streams of the same media source (different transport protocols or different bit rates) are stored in different body files. This structure supports multiple-bit-rate switching (intelligent streaming) and keeps the format compatible with existing streaming media players. The header and body files divide the work as follows: while the server and client are establishing a connection (before any media data is sent), only the header file is read; once the connection is established, media data is read only from the body files. This reduces the coupling between modules in the server and improves efficiency. Because the header file and body files are relatively independent, the format is highly extensible and makes it possible to encapsulate and send with hardware.

The core idea of NMF is to make full use of preprocessing: organize the original media file into a format that is convenient for the server to process, reduce the workload of real-time encapsulation and sending, and increase the compatibility and extensibility of the file structure, thereby improving the efficiency of the streaming server and increasing the number of concurrent streams.

3 Ogg file format structure

3.1 The importance of file format in streaming media systems

Logical streams are linked into physical streams in units of pages, as shown in Figure 3:

Figure 3 Organization of an Ogg file

The file in Figure 3 links two physical streams: the three logical streams A, B, and C form one physical stream, and logical stream D is a separate physical stream. The BOS pages of all logical streams in a physical stream must be physically adjacent, as shown by the positions of the three BOS pages *a*, *b*, and *c* in Figure 3.

BOS: beginning of stream; EOS: end of stream

Each medium mapped to the Ogg format (such as Vorbis audio or Theora video) has its own detailed definition, which places more specific constraints on that medium. Ogg itself does not specify the timing relationship between multiple concurrent media streams; this must be specified when the concurrent streams are mapped to the Ogg format, where pages are usually interleaved in the order in which they are produced.

3.2 Ogg page structure

Each page is independent of the others and carries all the information it needs. The page size is variable, usually 4 KB to 8 KB, and cannot exceed 65307 bytes (27 + 255 + 255*255 = 65307). The page header format is shown in Figure 4.

The header fields are described below; for details see reference [4]. Multi-byte fields are little-endian (least significant byte first).

⑴ capture_pattern: capture pattern field, 4 bytes, marking the beginning of a page. It lets a decoder separate the Ogg encapsulation from the encoded media and recognize the start of a new page. It consists of four magic numbers (ASCII characters): 0x4f 'O', 0x67 'g', 0x67 'g', 0x53 'S'.

⑵ stream_structure_version: 1 byte, the version of the Ogg file format used in this stream, currently 0.

Figure 4 Ogg page header structure

⑶ header_type_flag: header type flag, 1 byte. It identifies the specific type of the current page. Three bits are defined:

* bit 0x01: if set, the page begins with data continuing a packet from the previous page of the same logical stream. If not set, the page begins with a new packet.
* bit 0x02: if set, this is the first page (BOS) of the logical stream. If not set, it is not the first page.
* bit 0x04: if set, this is the last page (EOS) of the logical stream. If not set, it is not the last page.

⑷ granule_position: 8 bytes (bytes 6 to 13), holding position information whose meaning depends on the media encoding. For an audio stream, it contains the total number of PCM samples encoded in the logical stream up to and including this page. For a video stream, it contains the total number of video frames encoded in the logical stream up to and including this page. A value of -1 means that no packet of the logical stream finishes on this page.

⑸ bitstream_serial_number: stream serial number, 4 bytes, identifying the logical stream this page belongs to and distinguishing it from other logical streams.

⑹ page_sequence_number: the sequence number of this page within its logical stream; an Ogg decoder can use it to detect page loss.

⑺ crc_checksum: cyclic redundancy check, 4 bytes, containing the 32-bit CRC of the page. The checksum covers the whole page, header and page data, computed with the crc_checksum field itself set to 0. Its generator polynomial is 0x04c11db7.

⑻ number_page_segments: 1 byte (byte 26 of the page header), giving the number of segments listed in the segment_table field of this page. Its range is 0x00 to 0xff (0 to 255), so a page holds at most 255 segments of at most 255 bytes each. The maximum physical size of a page is therefore 65307 bytes, just under 64 KB.
⑼ segment_table: the length of each segment of each packet in the logical stream, known as the lacing values. Every segment of a packet is 255 bytes except the last, which is shorter than 255 bytes. The values are stored in the order in which the segments occur. The number of bytes in this field is given by the number_page_segments field (that is, between 0 and 255).

    byte    value
    27      0xff (255)
    [...]
    n-1     0xff (255)
    n       0x00-0xfe (0-254, n = num_segments + 26)

Length of the page header:

    header_size = 27 + number_page_segments    [bytes]

The header length is the total number of bytes occupied by the nine fields above.

Total length of the page:

    page_size = header_size + sum(lacing_values: 1..number_page_segments)    [bytes]

The total page length is the header length plus the sum of the lengths of the segments that follow (the payload length).

3.3 Ogg encapsulation process

(1) Before being handed to the Ogg encapsulation layer, encoded audio and video are presented as "packets" with packet boundaries; where those boundaries fall depends on the specific encoding format. See Figure 5.

(2) Each packet of the logical stream is divided into segments. Every segment is fixed at 255 bytes, except the last segment of a packet, which is shorter than 255 bytes. A packet can be any length; its size is determined by the specific media encoder.

(3) Segments are encapsulated into pages, and each page is given a page header. Page length varies and is determined by the circumstances. The segment_table field of the page header records the "lacing_value" sizes; the last lacing value of a packet gives the length of that packet's final segment (which can be 0, or any value less than 255).
Packets are processed one at a time: each packet is encapsulated into one or more pages (page length has an upper limit, typically about 4 KB), and the header_type_flag field in each page header indicates whether the page begins with a new packet or continues a packet from the previous page.

(4) The logical streams that have been encapsulated into pages (such as voice, text, pictures, audio, and video) are combined into a physical stream in the order required by the application.

3.4 Mapping and inverse mapping of Ogg files

A compressed media stream encapsulated in the Ogg file format can be stored (as a disk file) or transferred directly (over TCP or a pipe). This works because the Ogg bitstream format provides framing and synchronization, resynchronization after errors, seeking landmarks, and enough other information to restore the scattered data completely to a compressed media stream in "packet" form, with the original packet boundaries. Restoring the packet boundaries does not depend on the codec used for compression. In other words, Ogg mapping and inverse mapping are relatively independent of the compression encoding and decoding of the media stream.
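The segmentation step of section 3.3 and the size formulas of section 3.2 can be illustrated with a short calculation. This is a minimal sketch, not libogg code: the function names lacing_values and page_sizes are hypothetical, and it handles one packet at a time without splitting a packet across pages.

```python
MAX_SEGMENTS = 255   # number_page_segments is a single byte
FIXED_HEADER = 27    # bytes of page header before segment_table


def lacing_values(packet_len):
    """Lacing values for one packet: 255 per full segment, then one value below 255."""
    full, last = divmod(packet_len, 255)
    return [255] * full + [last]


def page_sizes(lacing):
    """Return (header_size, page_size) in bytes for a page with these lacing values."""
    if len(lacing) > MAX_SEGMENTS:
        raise ValueError("a page holds at most 255 segments")
    header_size = FIXED_HEADER + len(lacing)
    return header_size, header_size + sum(lacing)


print(lacing_values(600))                  # [255, 255, 90]
print(lacing_values(510))                  # [255, 255, 0]: a closing 0 marks the packet end
print(page_sizes(lacing_values(600)))      # (30, 630)
print(page_sizes([255] * MAX_SEGMENTS))    # (282, 65307), the maximum page size
```

The last line reproduces the 65307-byte limit quoted in section 3.2: 27 fixed header bytes, 255 lacing bytes, and 255 segments of 255 bytes each.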

Figure 5 Ogg encapsulation process

An Ogg file needs to be de-encapsulated in two cases: (1) before a player decodes the media stream; (2) before the media stream is transmitted over RTP/UDP. De-encapsulation is the Ogg inverse mapping: the data is restored to a media stream in "packet" form with packet boundaries. For RTP, the RTP header fields are then filled in and bound to the corresponding piece of media data to form an RTP packet. This process converts the media stream from Ogg format to RTP format.

In the mapping direction, a media stream whose unit is the packet is mapped to an Ogg bitstream whose unit is the page. The stream is divided and regrouped by segment along the way, which makes it convenient to store and to transmit (over TCP). Working on the media data in the source buffer (packets) only requires setting up a few linked data structures; the media data itself is moved in memory just once, and manipulating pointers to it is enough to deliver it to the destination buffer (pages). The process can be expressed as two function calls: ogg_stream_packetin() → ogg_stream_pageout().

In the inverse mapping, the Ogg bitstream is restored to a packet media stream, either for playback decoding or for RTP packaging and UDP transmission. The intermediate step reassembles the segments in each page, in order, into packets. Here too the media data is copied in memory only once. The process can be expressed as three function calls: ogg_sync_pageout() → ogg_stream_pagein() → ogg_stream_packetout(); the copy of the media data takes place in the first function, ogg_sync_pageout().

The Ogg mapping and inverse mapping functions are implemented in the Ogg function library; the current version is libogg-1.1.3.
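To make the inverse mapping concrete, the following sketch parses a single page from a byte buffer, checks its CRC, and regroups its segments into packets using the lacing values. It is an independent Python illustration and does not call or reproduce libogg; the names parse_page and ogg_crc are hypothetical. The CRC details beyond the polynomial (zero initial value, most significant bit first, no final inversion) come from the Ogg framing specification rather than from the text above.

```python
import struct

# capture_pattern, version, header_type_flag, granule_position, serial number,
# page sequence number, crc_checksum, number_page_segments: 27 bytes, little-endian
OGG_HEADER = struct.Struct("<4sBBqIIIB")


def ogg_crc(data):
    """Bitwise CRC-32 with polynomial 0x04c11db7, zero initial value, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            crc = ((crc << 1) ^ 0x04C11DB7) if crc & 0x80000000 else (crc << 1)
            crc &= 0xFFFFFFFF
    return crc


def parse_page(buf):
    """Parse the page at the start of buf.

    Returns (fields, packets, tail): packets are the byte runs completed on this
    page, tail is an unfinished run that continues on the next page. If
    fields["continued"] is set, the first run continues the previous page's tail.
    """
    magic, version, flags, granule, serial, seq, crc, nseg = OGG_HEADER.unpack_from(buf)
    if magic != b"OggS" or version != 0:
        raise ValueError("not an Ogg page")
    lacing = buf[27:27 + nseg]
    header_size = 27 + nseg
    page_size = header_size + sum(lacing)
    page = bytearray(buf[:page_size])
    page[22:26] = b"\x00\x00\x00\x00"      # the CRC is computed with its own field zeroed
    if ogg_crc(page) != crc:
        raise ValueError("CRC mismatch")
    packets, current, pos = [], bytearray(), header_size
    for value in lacing:
        current += buf[pos:pos + value]
        pos += value
        if value < 255:                    # a lacing value below 255 closes a packet
            packets.append(bytes(current))
            current = bytearray()
    fields = {
        "continued": bool(flags & 0x01),
        "bos": bool(flags & 0x02),
        "eos": bool(flags & 0x04),
        "granule_position": granule,
        "serial": serial,
        "sequence": seq,
    }
    return fields, packets, bytes(current)
```

A caller walking a whole file would advance by the page size after each call and, whenever the continued flag is set, prepend the tail kept from the previous page of the same logical stream to the first run of the new page.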
4 Conclusion

The Ogg format draws on the strengths of other streaming media file formats. It is an open source streaming media file format designed for media streams that have "packet" boundaries, and it benefits both their storage and their transmission. It has been applied successfully in the Icecast streaming server: according to test results published on the official Icecast website, 14,000 concurrent client streams of Ogg Vorbis audio were served over a gigabit backbone. More importantly, although file formats are at the core of streaming media technology, most streaming media file formats are still not fully open and are protected by patents. To gain a place in today's rapid development of streaming media technology and applications, following the GNU GPL, taking the open source road, and developing streaming media file formats free of intellectual property constraints is a better way to catch up with advanced streaming media technology.

Several common streaming media file formats:

Introduction to Microsoft Advanced Streaming Format (ASF)

The core of Microsoft's Windows Media is ASF (Advanced Streaming Format). Microsoft defines ASF as a unified container file format for synchronized media. ASF is a data format in which multimedia information such as audio, video, images, and control command scripts is transmitted as network packets, enabling streaming multimedia content to be published.

The biggest advantage of ASF is its small size, which makes it suitable for network transmission; files in this format can be played directly with Microsoft's latest media player (Microsoft Windows Media Player). Users can combine graphics, sound, and animation data into an ASF file and can convert video and audio in other formats to ASF. They can also save data from microphones, VCRs, and other peripherals in ASF format through sound cards and video capture cards. In addition, an ASF video can carry command codes that trigger an event or action when playback of the video or audio reaches a specified time.

The characteristics of ASF

Extensible media types: ASF files allow the creator to easily define new media types. The ASF format provides a very efficient and flexible way to define new media stream types that conform to the ASF file format definition. Each stored media stream is logically independent of the others unless its relationship to another media stream is explicitly defined in the file header.

Component download: specific information about the playback components (such as the decompression algorithm and the player) can be stored in the ASF file header. The client can use this information to locate the appropriate version of a required playback component if it is not already installed on the client.

Scalable media types: ASF is designed to express the dependencies between the "bandwidths" of a scalable media type. ASF stores each bandwidth as a separate media stream. The dependencies between media streams are stored in the file header, giving clients rich information for interpreting the scalable options in a way that is independent of the compression method. A modern multimedia transmission system can adjust dynamically to network resource constraints (for example, insufficient bandwidth). Producers of multimedia content should be able to express their preferences in terms of stream priority, such as guaranteeing at least the transmission of the audio stream. With scalable media types, scheduling stream priorities becomes complex, because it is difficult to fix the order of every media stream at production time. ASF lets content producers express their preferences about media priority effectively, even when scalable media types are present.

Multi-language: ASF is designed to support multiple languages. A media stream can optionally indicate the language of the media it contains. This feature is commonly used for audio and text streams. A multilingual ASF file contains a set of media streams carrying the same content in different languages, allowing the client to select the most appropriate version during playback.

Directory information: ASF provides support for directory information that is both extensible and flexible. All directory information is stored in the file header in an unformatted encoding and supports multiple languages. Where needed, directory items can be predefined (for example, author and title) or defined by the content author. Directory information can apply to an entire file or to a single media stream.

RealSystem RealMedia file format

RealNetworks' RealMedia includes RealAudio, RealVideo, and RealFlash files. RealAudio is used to transmit audio data of near-CD quality, RealVideo is used to transmit continuous video data, and RealFlash is a high-compression-ratio animation format recently launched jointly by RealNetworks and Macromedia. The RealMedia file format allows RealSystem to transmit high-quality multimedia content over a variety of networks. Third-party developers can convert their own media formats to the RealMedia file format with the SDK provided by RealNetworks.

QuickTime Movie (Movie) file format

Apple's QuickTime movie file is now an industry standard in the digital media field. The QuickTime movie file format defines a standard way to store digital media content. It can store not only individual pieces of media content (such as video frames or audio samples) but also a complete description of a media work. The QuickTime file format is designed to accommodate the variety of data that must be stored in order to work with digital media. Because the format can describe almost any media structure, it is an ideal format for exchanging data between applications, regardless of the platform they run on. Media descriptions and media data are stored separately in the QuickTime file format. The media description, or metadata, is called the movie; it includes the number of tracks, the video compression format, and timing information. The movie also contains an index into the area where the media data is stored. The media data is all the sample data, such as video frames and audio samples; it can be stored in the same file as the QuickTime movie, in a separate file, or in several files.
