Android H.264 (3): Streaming Media Player Design Scheme

Source: Internet
Author: User

One person's ability is limited, but the power of the network is unlimited. The purpose of studying H.264 here is to build a custom streaming media player that plays video in real time.

Fortunately, many experts have shared their work online; standing on the shoulders of giants takes much of the pressure off.

Overall Player Design Scheme

Generally, a player works in three stages:
1) obtain the media data
2) decode the audio and video streams
3) display the decoded media data to the user

Layered player structure, from the bottom layer to the top:

1. Data extraction layer
This layer obtains local files and streaming media files.
2. Data preprocessing layer
For local files, this layer unpacks the container according to the media format, extracts the audio, video, or text subtitles, and places them frame by frame in the corresponding to-be-decoded buffer of the upper layer. For streaming media, it strips the RTP headers, reassembles the audio and video payloads into frames, and passes each complete frame to the upper layer's to-be-decoded buffer.
3. Audio/video decoding layer
This layer provides the decoder selection component, decoders for the mainstream audio and video formats, and synchronization between multiple media streams.
4. User interface
This layer mainly provides the interface for interaction between the user and the player.

FFmpeg provides a complete solution for recording, converting, and streaming audio and video.


Design of Streaming Media Player Based on Android platform

1. Introduction
With the rapid development of mobile communication and multimedia technology, video surveillance that combines mobile phones, networks, and multimedia has also advanced quickly, and it has become possible to provide streaming media services over mobile networks. The number of mobile users worldwide is very large, so mobile streaming media services have huge market potential and are becoming a focus of mobile business. In this context, a mobile streaming media client designed around the characteristics of mobile networks and mobile terminals is of practical value.
Based on the decoding process in the open-source FFmpeg code, this paper proposes a layered design for a streaming media player on mobile terminals. The design handles the differences in media stream processing for different file types in the lower layers and provides control functions for external camera devices. The player is finally implemented on the Android platform [1].
2. Overall Player Design
Whether it plays a local file or a network stream, a player goes through three processing stages: obtaining the media data, decoding the audio and video streams, and displaying the decoded media data to the user. Based on these three stages of file playback, this paper proposes a layered player structure.
Because local files and network streaming media are acquired differently, both must be preprocessed so that the upper layer receives data in the same format and decoding stays uniform. For these reasons, and following the file decoding process described in this paper, the real-time monitoring player uses a layered structure. Each layer completes its task independently, which reduces coupling and allows each layer to be extended without affecting the layers above and below it. From bottom to top, the layers are the data extraction layer, the data preprocessing layer, the audio/video decoding layer, and the user interface. The layered structure of the streaming media player is shown in Figure 1.
The user interface layer provides the interface for interaction between the user and the player. For example, the user can pause, fast-forward, and rewind when playing a local file; when viewing a stream, the user can control the camera's focal length and direction with the number keys, the navigation keys, or the direction buttons in the player.
The audio/video decoding layer provides the decoder selection component, decoders for the mainstream audio and video formats, and synchronization between multiple media streams. The decoder selection component reads the codec format from the local file header or the streaming media header and selects the matching decoder to decode the compressed media stream. This layer uses the optimized FFmpeg code as the player's decoding module. Multi-stream synchronization covers the video and audio streams; subtitle synchronization may also be required when playing a local file.
The data preprocessing layer unpacks a local file according to its media format, extracts the audio, video, or subtitle data, and places it frame by frame in the corresponding to-be-decoded buffer of the upper layer. For streaming media, it strips the RTP headers, reassembles the audio and video payloads into frames, and sends each complete frame to the upper layer's to-be-decoded buffer. The control information encapsulation component packs the user's control input into messages in the format defined by the PELCO-D/P protocol and passes them to the lower layer.
The data acquisition layer obtains local files and streaming media and sends the camera control information. Obtaining a local file only requires reading it, while obtaining streaming media requires fetching media data from the streaming media server. The streaming media acquisition part consists of session negotiation, data transmission, and data buffering. During session negotiation, the RTSP protocol [2] is used to negotiate the usual media stream information: the media types (video and audio), the transport protocol (RTP/UDP/IP ...), the media formats (H.263, MPEG ...), and the media transmission ports.
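RTSP session negotiation exchanges plain-text requests and responses. The following is a minimal sketch in Python of how a client might format DESCRIBE and SETUP requests. It is not taken from the paper: the URL, the track name, and the client ports are placeholders invented for illustration, and no network I/O is performed.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request: request line, headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Placeholder URL and ports, chosen only for this example.
URL = "rtsp://example.invalid/stream.3gp"

# DESCRIBE asks the server for the media description (usually SDP),
# which carries the media types and formats.
describe = rtsp_request("DESCRIBE", URL, 1, {"Accept": "application/sdp"})

# SETUP proposes the transport (RTP over UDP here) and the client's
# RTP/RTCP port pair for one track.
setup = rtsp_request(
    "SETUP",
    URL + "/trackID=1",
    2,
    {"Transport": "RTP/AVP;unicast;client_port=5000-5001"},
)
```

A real client would send these over the RTSP TCP connection, parse the server's replies, and follow with PLAY once every track has been set up.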
3. Porting FFmpeg to the Android Platform
FFmpeg is a complete open-source solution that combines recording, conversion, and audio/video encoding and decoding. The player in this paper, however, only needs FFmpeg's file parsing and audio/video decoding functions. If the whole of FFmpeg were ported to the target platform, much of the code would be redundant. In addition, FFmpeg is developed on Linux and does not take into account the limited processing power and battery capacity of mobile phones. It is therefore important to trim and optimize the FFmpeg code for the specific functions needed on the phone.
3.1 FFmpeg Trimming and Optimization
Finding the code this paper needs in a code base as large and complex as FFmpeg is difficult. Building FFmpeg on Linux takes three steps: configure, make, and make install. The configure step generates a config.h file and a config.mak file. From these files and the makefiles you can find out which source files a given build compiles.
The configure script accepts many parameters, including basic options, advanced options, and options provided specifically for optimization. The optimization options mainly select what is compiled. Trimming FFmpeg means removing the files the system does not need, so this paper finds the files the player requires by choosing appropriate options. After careful study, the following parameters were chosen:

./configure --enable-version3 --disable-encoders --enable-encoder=h263 --enable-encoder=amr_nb --disable-parsers --disable-bsfs --enable-muxer=tgp --disable-protocols --enable-protocol=file

With this configuration, only the H.263 and amr_nb encoders, the 3GP container format, all decoders, and the file parsing code are compiled into the link libraries.
The source files compiled into the libraries under this configuration are exactly the set this paper needs. A .o file is the object file generated by compiling a .c file, and each .c file produces one .o file. By searching config.h and the makefiles for names ending in .o, you can therefore see which source files are compiled under these parameters and obtain the minimum set of source files to build.
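The search for .o names can be automated. The sketch below, in Python, is one possible way to do it and is not the tool used in the paper: it scans makefile text for names ending in .o and maps each to the matching .c name. The sample text is made up for illustration, and the mapping assumes every object comes from a C file, which does not hold for objects built from assembly sources.

```python
import re


def objects_to_sources(make_text):
    """Return the sorted .c names implied by the .o names in make_text."""
    # A name is a run of word characters, dots, slashes, plus or minus
    # signs that ends in ".o" at a word boundary (so "x.out" is skipped).
    objects = set(re.findall(r"[\w./+-]+\.o\b", make_text))
    return sorted(name[:-2] + ".c" for name in objects)


# Illustrative makefile fragment, not copied from a real FFmpeg tree.
sample = """
OBJS = adler32.o avstring.o \\
       crc.o
OBJS-$(CONFIG_H263_DECODER) += h263dec.o h263.o
"""

print(objects_to_sources(sample))
# ['adler32.c', 'avstring.c', 'crc.c', 'h263.c', 'h263dec.c']
```

The resulting list can then be copied into the source list of the target build file.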
Although the open-source FFmpeg code can be compiled and run across platforms, it was designed for PCs, and PCs differ greatly from mobile phones in CPU power, energy, memory, and other resources. This paper optimizes the code for mobile phones in the following ways:
1. Remove redundant code and standardize the program structure: reduce if-else branches, adjust local and global variables, use register variables in place of local variables where appropriate, and remove the print statements used for FFmpeg debugging.
2. Replace multiplication and division with shift operations where possible, because multiply and especially divide instructions take much longer to execute than shift instructions.
3. Take care with functions called inside loops, minimize nested loops, reduce dependencies between one loop iteration and the next, and eliminate unnecessary computation.
4. Set a reasonable cache size suited to the target platform of the port, Android.
The specific code changes are not described here.
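As a small illustration of point 2 (not code from FFmpeg), the Python sketch below checks that multiplying or dividing a non-negative integer by a power of two gives the same result as a shift. It also shows a caveat that matters when the same rewrite is applied in C: an arithmetic right shift rounds toward negative infinity, while C's `/` truncates toward zero, so the two differ for negative values that are not exact multiples. The helper names are invented for this example.

```python
def mul_pow2(x, k):
    """x * 2**k written as a left shift."""
    return x << k


def div_pow2(x, k):
    """x / 2**k written as a right shift (rounds toward negative infinity)."""
    return x >> k


def c_div(a, b):
    """Integer division that truncates toward zero, as C's '/' does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


# For non-negative operands the shift forms match multiply and divide.
for x in range(1000):
    assert mul_pow2(x, 3) == x * 8
    assert div_pow2(x, 2) == c_div(x, 4)

# For a negative operand the right shift and C-style division disagree.
print(div_pow2(-7, 1), c_div(-7, 2))  # -4 -3
```

In practice the rewrite is therefore safest on unsigned or known non-negative values, and modern compilers often apply it automatically.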
3.2 Porting FFmpeg
The Android.mk syntax used by the NDK build system released by Google differs from ordinary makefile syntax, so the original makefiles cannot be used to cross-compile the FFmpeg source. The precondition for porting is therefore to replace the makefiles in FFmpeg with NDK Android.mk files.
Analysis of the FFmpeg module structure shows that avutil is the base module, avcodec is built on the compiled avutil module, and avformat is built on both. The modules are therefore compiled and ported in the order avutil, avcodec, avformat. The steps are as follows:
1. About config.h and config.mak
First, a word on FFmpeg's own makefile framework. After the configure command runs, FFmpeg generates a config.h file and a config.mak file. These two files contain a large number of macro definitions that describe every aspect of the build configuration: architecture, compiler, link libraries, header files, version, codecs, and so on. The definitions that differ between platforms must be changed here. For example, the architecture must be changed to armv5te for the Android platform so that the ARM instruction set is selected at compile time instead of x86. These two files are very important: many source files include config.h, and the compiler compiles code selectively based on it.
2. Compile libavutil
Create an Android.mk file in libavutil. The makefile in libavutil includes subdir.mak, which performs the actual compilation. Android.mk cannot include that file, so the source files must be listed directly. A standard makefile lists .o object files, whereas Android.mk must list the .c source files. The Android.mk file is as follows:

LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)
LOCAL_MODULE := avutil
LOCAL_SRC_FILES := adler32.c \
    …… \

include $(BUILD_STATIC_LIBRARY)

Many errors may occur during compilation, but most of them are caused by missing header files and are fixed by including the right header. For example, if size_t is not recognized in some files, the error disappears once the file includes stdio.h. Other similar errors are not listed here.
The Android.mk files for the other modules are written in the same way, and the modules are then ported to the Android platform as the player's decoding module.
4. Modules at Different Layers
4.1 Data Acquisition Layer
The main function of this layer is to negotiate the media details with the streaming media server, obtain the streaming media data from the server according to the negotiated result, and store it in the buffer. Packets are then passed to the data preprocessing layer according to the buffering policy described in this paper. The structure is shown in Figure 2.
This layer starts five threads. One thread opens a TCP connection for RTSP session negotiation, and this connection must be kept open for as long as RTP data is being transmitted. Two threads receive the audio and video RTP data, and the other two receive and send the RTCP packets for audio and video respectively.
4.2 Data Preprocessing Layer
For local files, preprocessing in this layer relies entirely on the file parsing and unpacking functions provided by FFmpeg. For streaming media, preprocessing reassembles one or more RTP packets into a frame; this technique is mature and is not described further here.
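Although the paper leaves this step out, it can be sketched briefly. The Python code below is an illustration based on the fixed RTP header layout in RFC 3550, not the player's actual code. It strips the header from each packet and joins payloads until a packet with the marker bit set, which video payload formats commonly use to flag the last packet of a frame. It assumes the packets arrive in order and ignores the extra payload headers that formats such as H.263 and H.264 define; the function names are invented for this example.

```python
import struct


def strip_rtp_header(packet):
    """Split one RTP packet into (header fields, payload bytes)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, sequence, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (first & 0x0F)      # skip the CSRC list
    if first & 0x10:                      # header extension present
        if len(packet) < offset + 4:
            raise ValueError("truncated extension header")
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(packet)
    if first & 0x20:                      # padding: last byte is its length
        end -= packet[-1]
    if offset > end:
        raise ValueError("truncated packet")
    fields = {
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
    return fields, packet[offset:end]


def assemble_frames(packets):
    """Yield one frame per run of in-order packets ending with the marker bit."""
    parts = []
    for packet in packets:
        fields, payload = strip_rtp_header(packet)
        parts.append(payload)
        if fields["marker"]:
            yield b"".join(parts)
            parts = []
```

A fuller implementation would also reorder packets by sequence number and drop frames with missing packets before handing them to the decoder buffer.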
Unlike common streaming media players, the player in this paper can control an external camera mounted on a pan-tilt (PTZ) head, adjusting settings such as the focal length and the up, down, left, and right direction. The PELCO-D protocol is used as the pan-tilt control protocol.
A PELCO-D message is seven bytes long. The first byte is the synchronization byte, which is 0xFF; the receiver uses it to check that the sender and receiver are synchronized. The second byte is the address of the target device. Command word 1 controls the camera's aperture (iris) and focus. Command word 2 controls focus and zoom; its bit4, bit3, bit2, and bit1 are the up, down, left, and right direction control bits, and bit0 is always 0. Data byte 1 holds the horizontal (pan) speed (0x00-0x3F), and data byte 2 holds the vertical (tilt) speed with the same range. The last byte is the checksum, the modulo-256 sum of bytes 2 through 6 (the synchronization byte is not included).
In the PELCO-D messages used in this paper, command word 1 and command word 2 default to 0, and data byte 1 and data byte 2 default to 0x20. The corresponding bits of command word 1 and command word 2 are then set according to the key messages from the upper layer.
The player in this paper provides only the six control functions above. Based on the key pressed in the upper layer, this module sets the corresponding bit to 1, calculates the byte values, and generates the seven-byte message to send to the external device. When the upper layer reports that the key has been released, the stop command {0xFF, 0x01, 0x00, 0x00, 0x00, 0x00, 0x01} is sent.
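The message construction can be sketched in a few lines of Python. This is an illustration rather than the player's code. The bit constants follow the command word 2 assignments as commonly documented for Pelco-D and should be checked against the protocol document for the actual device; the function names and the default speed of 0x20 (taken from the text above) are choices made for this example.

```python
SYNC = 0xFF

# Command word 2 bits, as commonly documented for Pelco-D (assumption).
RIGHT, LEFT, UP, DOWN = 0x02, 0x04, 0x08, 0x10
ZOOM_TELE, ZOOM_WIDE = 0x20, 0x40


def pelco_d_frame(address, command1=0x00, command2=0x00,
                  pan_speed=0x20, tilt_speed=0x20):
    """Build a 7-byte Pelco-D message for the given device address."""
    body = [address, command1, command2, pan_speed, tilt_speed]
    if any(not 0 <= value <= 0xFF for value in body):
        raise ValueError("every field must fit in one byte")
    # Checksum: sum of bytes 2..6 modulo 256; the sync byte is excluded.
    checksum = sum(body) % 256
    return bytes([SYNC] + body + [checksum])


def stop_frame(address):
    """Message with all command and data bytes cleared, which stops motion."""
    return pelco_d_frame(address, 0x00, 0x00, 0x00, 0x00)


# Key pressed: pan right at the default speed; key released: stop.
print(pelco_d_frame(0x01, command2=RIGHT).hex())  # ff010002202043
print(stop_frame(0x01).hex())                     # ff010000000001
```

Several bits can be combined in one message, for example `UP | LEFT` for a diagonal move, if the camera supports it.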
4.3 Decoding and Display Layer
The decoding layer uses the FFmpeg code ported to the Android platform as the player's decoding module. This code supports more than 90 decoding formats and file formats including AVI, 3GP, and MPEG-4, and the efficiency of the optimized FFmpeg code is greatly improved.
The display layer is implemented mainly with the open-source SDL library. SDL (Simple DirectMedia Layer) is free, cross-platform, open-source software written in C that provides simple interfaces to graphics, sound, and input devices on many platforms. It is often used to develop games and other multimedia applications, and it runs on many operating systems, including Linux, PSP, Windows, and Mac OS X. SDL also provides video, audio, thread, timer, and event functions.

5. Summary
This paper describes the layered design of a streaming media player on the Android platform and the detailed design of each layer. The player's decoder library is derived from trimmed and optimized FFmpeg source code, and the player also provides external camera control, which widens its range of applications.
Although a prototype of the streaming media player with control functions has been implemented, many issues, such as QoS and further code optimization, still need research.


Implementation and Research of RTSP-based on-demand mobile video streaming

With the arrival of 3G, bandwidth is higher and data charges are lower, so multimedia applications such as mobile TV are bound to grow. Drawing on my earlier experience, I will discuss how to build a video-on-demand (VoD) solution for mobile phones and then present a preliminary client implementation. Comments are welcome.
First, the architecture. For ease of management and expansion, and to cope with bandwidth limits and many concurrent users, commercial solutions adopt a streaming media server + web server + relay server + mobile client architecture:
The streaming media server collects and compresses the video sources and waits for RTSP connection requests from clients.
The web server makes it easy to publish and manage video information.
The relay server is optional. It forwards RTSP requests from clients to the streaming server and relays the server's real-time stream to the clients; the advantage is that more users can watch at the same time with the same bandwidth.
The mobile client can be a built-in player (such as RealPlayer on Nokia phones) or a separately developed player. The former lowers the barrier for users and suits large-scale deployment; the latter is easy to extend and customize to support more features.
The streaming server is the core of the whole solution. The current mainstream streaming media server solutions are:
Helix Server: backed by Real, this is currently the most popular solution. It supports all audio and video formats with stable performance, and it is the only streaming media platform that spans Windows, Mac, Linux, Solaris, and HP/UX. It supports playback on mobile phone players. The free version of Helix Server supports only 1 Mbit/s of traffic, and the Enterprise Edition is very expensive. Cracking it is, of course, another matter :)
Darwin Server: an open-source streaming media solution from Apple. It supports fewer formats than Helix, but because it is open source and free, it leaves developers plenty of room for further development.
Live555 Media Server: stable, but it supports few formats (only MP3, AMR, AAC, MPEG-4 ES, and similar streams). It is rarely used on its own and usually serves as one part of a system.
Windows Media Server: available only on the Microsoft platform.
The process framework on the mobile phone is as follows:

Two protocols are currently used between mobile clients and servers: HTTP and RTSP. Early mobile TV mostly used HTTP. HTTP needs no special server software (IIS is enough) and has no firewall/NAT problems, but it does not support real-time streams and wastes bandwidth. RTSP is the mainstream standard for streaming media transmission; even Microsoft has abandoned MMS in favor of RTSP. RTSP lets the client play, pause, and stop the stream, and the client does not need to handle audio/video synchronization itself (because the audio and video are read from different RTP ports and buffered). Note that once RTSP negotiation succeeds, RTP transmission begins, either as RTP over TCP or as RTP over UDP. The former guarantees that every packet arrives, retransmitting any that is lost, and has no firewall/NAT problems. The latter is best-effort only and never retransmits lost frames, so its real-time performance is good, but it does have firewall/NAT problems. For mobile TV with high real-time requirements, UDP is recommended: retransmitted data that arrives late is useless to the viewer and is better discarded.
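With RTP over TCP, the RTP and RTCP packets share the RTSP connection and have to be delimited. RFC 2326 does this with interleaved frames: a `$` byte, a one-byte channel number, a two-byte big-endian length, then the packet. The Python sketch below shows how a client might split a received byte buffer into such frames. It is an illustration, not live555 code, and it assumes the buffer holds only interleaved frames with no RTSP text responses mixed in.

```python
import struct


def split_interleaved(buffer):
    """Extract complete interleaved frames from a TCP byte buffer.

    Returns (frames, remainder): frames is a list of (channel, data)
    tuples, remainder is the unconsumed tail to keep for the next read.
    """
    frames = []
    pos = 0
    while len(buffer) - pos >= 4:
        if buffer[pos] != 0x24:           # 0x24 is the '$' marker byte
            raise ValueError("expected '$' at start of interleaved frame")
        channel, length = struct.unpack("!BH", buffer[pos + 1:pos + 4])
        if len(buffer) - pos - 4 < length:
            break                          # frame not fully received yet
        frames.append((channel, buffer[pos + 4:pos + 4 + length]))
        pos += 4 + length
    return frames, buffer[pos:]
```

The channel numbers are agreed during SETUP; by convention the even channel of a pair carries RTP and the odd one RTCP.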
I used the powerful open-source library live555 to implement the RTSP/RTP protocols. It is stable and supports transmission of most audio and video formats. (FFmpeg also implements the network transport part, which can be used after modification.) I trimmed live555 and ported it to Symbian and Windows Mobile; debugging on real Symbian devices was the time-consuming part of this work.
Video decoding of course uses FFmpeg, from which the MPEG4 SP and H.264 decoders were ported. Without any optimization they handle 32K, CIF, and 5-10 fps, which is sufficient for general streaming media applications; algorithmic and compiler optimizations will follow. After decoding, the picture still has to go through yuv2rgb conversion and scaling. Note that FFmpeg's decoded output is padded: the linesize of a QCIF image is 192 rather than 176. If the decoded image looks green or distorted, convert it with img_convert() (with destination format PIX_FMT_YUV420P). On Symbian, DSA is used to write directly to the screen. Windows
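The padding issue comes down to copying each row by its stride instead of treating the plane as one contiguous block. The Python sketch below illustrates the idea for a single plane; it is not FFmpeg code, and the function name and the byte buffer standing in for a decoded plane are assumptions made for the example.

```python
def pack_plane(data, width, height, linesize):
    """Copy `height` rows of `width` bytes from a plane whose rows start
    `linesize` bytes apart, returning a tightly packed plane."""
    if linesize < width:
        raise ValueError("linesize must be at least the row width")
    if len(data) < linesize * (height - 1) + width:
        raise ValueError("plane buffer is too small")
    out = bytearray(width * height)
    for row in range(height):
        src = row * linesize               # rows are linesize bytes apart
        out[row * width:(row + 1) * width] = data[src:src + width]
    return bytes(out)


# QCIF luma plane: 176x144 visible pixels stored with a stride of 192.
padded = bytes(192 * 144)
packed = pack_plane(padded, 176, 144, 192)
print(len(packed))  # 25344
```

For YUV 4:2:0 data the same copy is applied to each of the three planes, with the chroma planes using half the width and height and their own linesize values.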
Audio decoding mainly covers AAC, AMR-NB, and AMR-WB. AAC and AMR-NB are the audio formats that fit GPRS and EDGE bandwidth (AAC sounds better than AMR-NB), and AMR-WB is the audio format for 3G. The FFmpeg 0.5 release supports fixed-point decoding of AMR-NB/WB, which is very powerful.
Both Symbian and Windows Mobile were tested. The results on the 6122c and the Windows Mobile 5.0 emulator are as follows:

The demo video address is rtsp://v.starv.tv/later.3gp. The video is MPEG4 SP and the audio is AMR-WB. At present only the picture is shown; audio has not been added yet.
Note that the access point for streaming media applications is generally CMNET; CMWAP is used only for low-traffic applications such as web browsing.
