With the development of wireless networks and smartphones, mobile phones have become ever more closely woven into people's daily lives. Entertainment, business, financial, and navigation applications have made life richer, faster, and more convenient, and the smartphone has become an irreplaceable part of everyday life. Among these applications, multimedia, being intuitive and real-time, is finding an ever wider range of uses, and video decoding and playback has become a research hotspot. The H.264 standard is now mature: it adopts unified variable-length coding (UVLC), high-precision multi-mode motion estimation, an integer transform based on 4x4 blocks, a layered coding syntax, and other measures. Together these give the H.264 algorithm high coding efficiency, saving roughly 50% of the bit rate at the same reconstructed image quality. Moreover, its bitstream structure adapts well to the network and improves error resilience, which makes it well suited to wireless networks with limited bandwidth and high error rates. Building on the decoding routines in the open-source FFmpeg project, this paper adopts multi-threaded packet reception, multi-level data buffering, and parallel receive/decode threads to alleviate the data congestion, decoding errors, stalled video, and delays caused by large volumes of fast-arriving data, so that video transmission is both fast and stable. Finally, video transmission from a PC to an Android phone, together with decoding and playback on the phone, is realized. The technique can be applied to video conferencing, video surveillance, and similar applications.
I. The overall structure of the video transmission and playback system
The video transmission system is divided into two parts: a server side and a client. The server is responsible for reading the H.264 video data, packaging it in RTP/RTCP format and sending it to the client, and receiving feedback from the client to control the transmission rate. The Android phone client receives the real-time stream from the server, buffers it, parses the video data, passes it on for decoding, and finally displays the playback on the phone. The server side is implemented in C; the client is implemented mainly in Java.
II. Key technologies and their implementation
1. Packaging and unpacking based on the RTP protocol

(1) Single-NAL packaging. An H.264 NALU unit is usually composed of [start code][NALU header][NALU payload], where the start code marks the beginning of a NALU unit and must be "00 00 00 01" or "00 00 01". To package a NALU, the start code is removed and the remaining data is placed directly into an RTP packet.

(2) Fragmented packaging. Because 1500 bytes is the usual upper limit on the length of an IP datagram, removing the 20-byte IP header leaves 1480 bytes to hold the UDP datagram. So when the number of bytes in a frame exceeds this value, it must be packaged as fragments. Since UDP transmission also carries per-packet overhead, the maximum RTP packet size is set at 1400 bytes. Fragmented packets are distinguished by the FU indicator byte, which has the following format:

+---------------+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
|F|NRI|  Type   |
+---------------+

FU indicator Type values 28 and 29 denote FU-A and FU-B respectively. The NRI field must be set according to the NRI of the NAL unit being fragmented. The FU header has the following format:

+---------------+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
|S|E|R|  Type   |
+---------------+

S: start bit (1 bit). When set to 1, it indicates the start of a fragmented NAL unit; it is 1 in the first fragment and 0 in all others. E: end bit (1 bit). When set to 1, it indicates the end of a fragmented NAL unit, i.e. the FU payload is the last fragment; it is 0 otherwise. R: reserved bit (1 bit), must be set to 0. Type: 5 bits, the NAL unit type.

(3) Analysis of the packaging and unpacking process. Packaging:
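As a concrete illustration of the single-NAL and fragmentation rules above, the following is a minimal sketch of FU-A packetization. The class and constant names are ours, not the paper's code; the 1400-byte payload limit follows the text, and the input NALU is assumed to have its start code already stripped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split one NALU (start code already removed) into
// RTP payloads of at most MAX_PAYLOAD bytes, using FU-A fragmentation.
public class FuAPacketizer {
    static final int MAX_PAYLOAD = 1400; // RTP payload limit chosen in the text

    public static List<byte[]> packetize(byte[] nalu) {
        List<byte[]> payloads = new ArrayList<>();
        if (nalu.length <= MAX_PAYLOAD) {
            payloads.add(nalu);          // fits: send as a single NAL unit packet
            return payloads;
        }
        byte nalHeader = nalu[0];
        // FU indicator: F and NRI copied from the NALU header, Type = 28 (FU-A)
        byte fuIndicator = (byte) ((nalHeader & 0xE0) | 28);
        int offset = 1;                  // skip the original NAL header
        while (offset < nalu.length) {
            int chunk = Math.min(MAX_PAYLOAD - 2, nalu.length - offset);
            boolean first = (offset == 1);
            boolean last = (offset + chunk == nalu.length);
            byte fuHeader = (byte) ((first ? 0x80 : 0)   // S bit on first fragment
                                  | (last ? 0x40 : 0)    // E bit on last fragment
                                  | (nalHeader & 0x1F)); // original NAL unit type
            byte[] payload = new byte[chunk + 2];
            payload[0] = fuIndicator;
            payload[1] = fuHeader;
            System.arraycopy(nalu, offset, payload, 2, chunk);
            payloads.add(payload);
            offset += chunk;
        }
        return payloads;
    }
}
```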
Fragmentation details: ① For the first FU-A packet, the FU indicator is set as F = the F of the NALU header, NRI = the NRI of the NALU header, Type = 28; the FU header is set as S=1, E=0, R=0, Type = the Type of the NALU header. ② For a middle FU-A packet, the FU indicator is set the same way (F and NRI from the NALU header, Type = 28), and the FU header is S=0, E=0, R=0, Type = the NALU Type. ③ For the last FU-A packet, the FU indicator again takes F and NRI from the NALU header with Type = 28, and the FU header is S=0, E=1, R=0, Type = the NALU Type.

Unpacking: below we analyze the code that classifies fragments when an RTP packet is unpacked. The FU indicator sits at offset 12, immediately after the 12-byte RTP header, and the FU header at offset 13:

byte startBit = (byte) (recBuf[13] & 0x80);
byte endBit = (byte) (recBuf[13] & 0x40);

① If startBit == -128 (0x80 as a signed byte), this packet is the first fragment. The statement

nalBuf[0] = (byte) ((recBuf[12] & 0xE0) | (recBuf[13] & 0x1F));

reconstructs the combined NAL unit header from the F and NRI bits of the FU indicator and the Type bits of the FU header. ② If startBit == 0 and endBit == 0, this packet is a middle fragment. ③ If endBit == 64 (0x40), this packet is the last fragment. Once the classification is clear, each part can be processed accordingly, as in the analysis above.

2. Bitstream management mechanism. (1) Reception of the bitstream. The sender emits the stream very quickly, while the receiver must not only receive it but also parse and decode it, which takes considerably longer. If the receiver performed these steps sequentially, it could not keep up with the sender's packets; packets would be lost, causing decoding errors, failure to play the video properly, and even serious faults such as program crashes. We adopt a concurrent processing mechanism to solve this problem; one purpose of thread concurrency is to increase throughput on a single processor. In Java we use the Executor framework in the java.util.concurrent package to manage thread objects.
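Returning to the unpacking side, the fragment classification described earlier can be collected into a small reassembler. This is a sketch under the same assumption of a fixed 12-byte RTP header; the class and method names are hypothetical, not the paper's code.

```java
import java.io.ByteArrayOutputStream;

// Sketch: classify FU-A fragments and reassemble them into a NAL unit.
// Assumes a fixed 12-byte RTP header, so the FU indicator is recBuf[12]
// and the FU header is recBuf[13].
public class FuAReassembler {
    private final ByteArrayOutputStream nalu = new ByteArrayOutputStream();

    // Returns the complete NAL unit (with its 1-byte header restored) when
    // the last fragment arrives, or null while the NALU is still incomplete.
    public byte[] onRtpPacket(byte[] recBuf, int length) {
        byte startBit = (byte) (recBuf[13] & 0x80); // S bit of the FU header
        byte endBit = (byte) (recBuf[13] & 0x40);   // E bit of the FU header
        if (startBit == -128) {                     // 0x80 as a signed byte: first fragment
            nalu.reset();
            // rebuild the NAL header: F+NRI from the FU indicator, Type from the FU header
            nalu.write((recBuf[12] & 0xE0) | (recBuf[13] & 0x1F));
        }
        nalu.write(recBuf, 14, length - 14);        // append the fragment payload
        if (endBit == 64) {                         // 0x40: last fragment
            return nalu.toByteArray();
        }
        return null;                                // first or middle fragment
    }
}
```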
We create 20 tasks and submit them to a SingleThreadExecutor; the tasks are queued and run one at a time, each in its submission order, completing before the next begins. This not only enables fast reception but also guarantees that the received packet sequence stays in order. With this arrangement, reception and parsing/decoding can be separated into two parts: received data is placed temporarily in a buffer, and the receiver immediately goes on to receive the next packet instead of waiting for parsing and decoding to finish. This greatly improves reception efficiency and avoids packet loss. (2) Parsing and decoding of the video data. Because of the concurrency mechanism, more than one packet of data may arrive at a time, so how to handle the received data correctly becomes our next difficulty. We must still preserve packet order, and we can process only one packet at a time, which raises a problem of cooperation between threads. We use the producer-consumer pattern of thread collaboration. The parsing thread takes a packet from the buffer holding received data, parses it, puts the result into another buffer, notifies the decoder that it can fetch data from this buffer for decoding, and then enters a wait state. When decoding is complete, the decoder notifies the parsing thread to continue parsing and itself waits. The two threads alternate between executing and waiting.
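The ordered-submission behavior of a single-thread executor can be sketched as follows. The method name and the stand-in integer sequence numbers are ours; the real receiver would enqueue socket data instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: tasks submitted to a single-thread executor run strictly in
// submission order, so "packets" land in the buffer in arrival order.
public class OrderedReceiver {
    public static List<Integer> receiveInOrder(int packets) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<Integer> buffer = new ArrayList<>(); // stands in for the packet buffer
        for (int i = 0; i < packets; i++) {
            final int seq = i;
            // safe without locking: only the executor's single thread mutates buffer
            executor.submit(() -> buffer.add(seq));
        }
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        return buffer;
    }
}
```

Because there is only one worker thread, no two tasks ever overlap, which is exactly what guarantees the packet sequence stays correct.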
(3) Multi-level buffering mechanism. Several buffers were mentioned above; they are summarized here. ① The buffer after data reception: the server produces a constant stream of real-time data, and the concurrency mechanism delivers a large amount of it at once, so we cannot process everything immediately and must set a buffer here. ② The buffer between the receiving end and the processing end: because the network is unstable, the received data may arrive sometimes quickly and sometimes slowly, which would directly make decoding unstable and video playback discontinuous, so a buffer is placed here to play a smoothing, transitional role. This buffer must both store a large volume of incoming stream data and supply data for parsing, with a write-in and read-out process, so we use a first-in, first-out queue container as the buffer. ③ The buffer between the parsing and decoding stages: the amount of data here is not very large, and the speed at which data is fetched directly affects decoding speed, so an efficient buffer is needed. Because stack memory is allocated automatically by the system and is therefore relatively fast, we simply allocate an array on the stack for storage. ④ The buffer between decoding and playback: besides keeping playback continuous and stable, this buffer is mainly used for displaying the image, and also for some processing of the video frames, such as smoothing and filtering.

3. Implementation of decoding and playback. The H.264 decoder is obtained by porting the H.264 decoding part of FFmpeg to Android, with deep pruning and optimization. The interface, packet reception and processing, and video display are all done in Java, while the underlying video decoding is done in C to meet the speed requirement.
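One common way to realize the FIFO smoothing buffer between receiver and parser (buffer ② above) is a bounded blocking queue; the paper does not name its queue class, so the following is a sketch with hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: a bounded FIFO between the receiver thread and the parser thread.
// put() blocks when the network bursts faster than parsing can keep up;
// take() blocks when the network stalls, so downstream stages see a
// steadier flow of data.
public class SmoothingBuffer {
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(64);

    public void onPacketReceived(byte[] packet) throws InterruptedException {
        queue.put(packet);    // receiver side: blocks if the buffer is full
    }

    public byte[] nextForParsing() throws InterruptedException {
        return queue.take();  // parser side: blocks until data is available
    }
}
```

The bounded capacity also provides natural back-pressure: a full buffer slows the receiver instead of letting memory grow without limit.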
The segmentation of the bitstream into NAL units (the recovery work on the received video data) is done in the Java layer rather than in C, because the amount of data sent each time is variable. If a large amount is sent at once, the lower layer may decode several frames of video in one call, but the interface layer can display only one frame, causing dropped frames. If the amount sent each time is small, the lower layer will be called many times without any real decoding taking place. So, although this approach reduces coupling at some cost in speed, on balance it is still better to do the data parsing in the Java layer.
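The Java-layer segmentation described here amounts to scanning for Annex B start codes ("00 00 01" or "00 00 00 01") and cutting the stream between them. A minimal sketch (hypothetical class name, not the paper's code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split an Annex B byte stream into NAL units, so the C layer can
// be handed exactly one unit per decode call.
public class NalSplitter {
    public static List<byte[]> split(byte[] stream) {
        List<byte[]> nals = new ArrayList<>();
        int nalStart = -1; // index of the first byte after the current start code
        int i = 0;
        while (i + 2 < stream.length) {
            if (stream[i] == 0 && stream[i + 1] == 0 && stream[i + 2] == 1) {
                if (nalStart >= 0) {
                    // a preceding zero means this was a 4-byte start code
                    int end = (i > 0 && stream[i - 1] == 0) ? i - 1 : i;
                    nals.add(Arrays.copyOfRange(stream, nalStart, end));
                }
                i += 3;
                nalStart = i;
            } else {
                i++;
            }
        }
        if (nalStart >= 0) {
            nals.add(Arrays.copyOfRange(stream, nalStart, stream.length));
        }
        return nals;
    }
}
```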
We convert the decoded video data into a Bitmap and display it on the phone screen by drawing it to a SurfaceView. Because some phones do not support RGB24 while almost all phones support RGB565, the decoder returns RGB565 data.

4. Program flow and functional architecture
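In the system itself the C layer produces the RGB565 frames during decoding, but the bit packing behind the format choice above can be shown in a few lines of Java. This is purely illustrative: 8-bit R, G, B are truncated to 5, 6, and 5 bits and packed into one 16-bit value per pixel, halving the memory of RGB24.

```java
// Illustrative sketch of RGB24 -> RGB565 packing (one pixel per short).
public class Rgb565 {
    public static short pack(int r, int g, int b) {
        return (short) (((r & 0xF8) << 8)   // top 5 bits of red   -> bits 15..11
                      | ((g & 0xFC) << 3)   // top 6 bits of green -> bits 10..5
                      | ((b & 0xF8) >> 3)); // top 5 bits of blue  -> bits 4..0
    }
}
```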
III. Concluding remarks
This paper designs and realizes video transmission from a PC to an Android phone, with decoding and playback on the phone. The technical points and difficulties of the implementation are analyzed in detail, including the RTP packaging and unpacking process. To handle the varying speed of the incoming data, a multi-threaded concurrency mechanism is used; to cope with the large and unstable flow of packets, a multi-level buffer is set up according to the characteristics of each stage, making video playback smoother. To keep parsing and decoding in the right order, thread collaboration in the producer-consumer pattern guarantees the timing of the video data. For the decoding part, an existing decoder is ported to the platform, the division of work between the C layer and the Java layer is handled rationally, and the complete function is realized. Under good network conditions, video playback on the Android phone has short delay and runs smoothly. This research can be applied to a variety of video streaming applications and has strong practical value.