"Video Broadcast Technology details" series 2: processing,
At the end of June, Qiniu Cloud released LiveNet, a real-time live video streaming network, along with a complete live video cloud solution. Many developers are interested in the details and use cases of this network and solution.
In this series of seven articles, we give a systematic introduction to the key technologies of live video, currently in the middle of a boom, to help live video entrepreneurs build a more comprehensive and in-depth understanding of live video technology and make better technology choices.
The outline of this series is as follows; click the links to revisit previous installments:
(1) Capture
(2) Processing
(3) Encoding and Encapsulation
(4) Streaming and Transmission
(5) Latency Optimization
(6) Principles of Modern Players
(7) SDK Performance Test Model
In the previous article on capture, we introduced audio capture and image capture, which correspond to two completely different input sources and data formats. This article is part two of the live video technology series: processing. We will explain common video processing features such as beautification, video watermarks, filters, and livemix.
Capture yields the raw audio and video data. To enhance the live effect or add extra information, we generally process this raw data before encoding and compression, for example adding a timestamp or company logo watermark, smoothing away freckles for beautification, or obfuscating voices. In a co-broadcasting scenario, where the host talks with one or more viewers and shares the conversation with all other viewers in real time, part of the livemix work is also done on the streaming end.
Open Design
The processing stage covers both audio and video. Audio processing includes mixing, noise reduction, and special audio effects; video processing includes beautification, watermarks, and various custom filters. For a live video cloud service such as Qiniu, in order to meet the needs of all customers, the module must not only provide these "standard" processing functions but also be designed so that custom processing functions can be plugged in freely.
iOS SDK address: https://github.com/pili-engineering/PLMediaStreamingKit
Android SDK address: https://github.com/pili-engineering/PLDroidMediaStreaming
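In the SDKs, this openness is exposed through data-source callbacks: each captured frame is handed to your code before encoding, and you return the (possibly modified) frame. Below is a minimal sketch of that pattern; the delegate method name and the ApplyCustomFilter helper are illustrative assumptions rather than the SDK's actual API, so check the SDK headers for the real callback.

// Sketch of a pre-encode processing hook (method name is an assumption).
- (CVPixelBufferRef)cameraStreamingSession:(PLCameraStreamingSession *)session
             cameraSourceDidGetPixelBuffer:(CVPixelBufferRef)pixelBuffer {
    // Apply custom processing to the raw frame here:
    // a custom filter, an extra overlay, frame analysis, etc.
    ApplyCustomFilter(pixelBuffer); // hypothetical helper
    // Return the frame that should be encoded and streamed.
    return pixelBuffer;
}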
Common Video Processing Functions
1. Beautification
As the saying goes, "80% of hosts are unwatchable without a face filter." The face filter is one of the most common features of live video products. Meitu, the company recently preparing to list in Hong Kong, built its main products, a beauty camera and photo-retouching app, around it, and some media outlets even claim it will disrupt the cosmetics industry. That is really the credit of the face filter effect: with it, hosts can go live without makeup yet still with confidence, and beauty-camera users can present a "better version of themselves".
The basic principle of beautification is "skin smoothing + skin whitening". The technical term for skin smoothing is denoising, that is, removing or blurring the noise in the image. Common denoising algorithms include mean (box) blur, Gaussian blur, and median filtering. Of course, each part of the face is different: freckles on the cheeks may be noise to remove, while the dark pupils of the eyes look similar but must not be blurred away when denoising the whole image. This step therefore also involves face and skin detection techniques.
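To make "denoising" concrete, here is a minimal sketch of a 3x3 mean (box) blur over a single-channel 8-bit image. It is for illustration only; real beautification pipelines run on the GPU and restrict blurring to detected skin regions.

#import <Foundation/Foundation.h>

// 3x3 mean (box) blur over a single-channel 8-bit image buffer.
static void BoxBlur3x3(const uint8_t *src, uint8_t *dst, int width, int height) {
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int sum = 0, count = 0;
            // Average the pixel with its neighbors, clamping at the borders.
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int nx = x + dx, ny = y + dy;
                    if (nx >= 0 && nx < width && ny >= 0 && ny < height) {
                        sum += src[ny * width + nx];
                        count++;
                    }
                }
            }
            dst[y * width + x] = (uint8_t)(sum / count);
        }
    }
}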
The iOS and Android streaming SDKs in our live video cloud have built-in beautification. You can switch it on or off as needed and freely adjust parameters including smoothing, whitening, and reddening (rosiness). The relevant settings in the iOS SDK PLCameraStreamingKit are as follows:
1) Enable or disable beautification with the default parameters:
-(void)setBeautifyModeOn:(BOOL)beautifyModeOn;
2) Set the smoothing intensity, in the range 0 to 1:
-(void)setBeautify:(CGFloat)beautify;
3) Set the whitening intensity, in the range 0 to 1:
-(void)setWhiten:(CGFloat)whiten;
4) Set the reddening (rosiness) intensity, in the range 0 to 1:
-(void)setRedden:(CGFloat)redden;
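Putting these together, a typical call sequence looks like the sketch below; it assumes session is an already configured PLCameraStreamingSession instance, as in the PLCameraStreamingKit samples.

// Assumes `session` is a configured PLCameraStreamingSession.
[session setBeautifyModeOn:YES]; // turn beautification on
[session setBeautify:0.6];       // smoothing intensity, 0 to 1
[session setWhiten:0.5];         // whitening intensity, 0 to 1
[session setRedden:0.3];         // reddening (rosiness), 0 to 1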
2. Video Watermark
Watermarks are a common feature of image and video content, used for copyright protection or advertising. To meet regulatory requirements, the relevant national authorities also stipulate that live video must carry a watermark, that live streams must be recorded and stored for a certain period of time, and that a watermark must be added to the recorded video as well.
Video watermarks can be added in two ways: player-side watermarks and watermarks embedded in the video itself. For player-side watermarks, if there is no effective anti-leech mechanism and streams can be played without authentication, anyone who obtains the live stream can play it in any player without the watermark, and the video loses its protection. Considering the watermark requirement for cloud recording as well, we generally choose the embedded video watermark.
The iOS and Android streaming SDKs in our live video cloud also provide a built-in watermark function. You can add or remove the watermark as needed and freely set its size and position. The relevant settings in the iOS SDK PLCameraStreamingKit are as follows:
1) Add a watermark
-(void)setWaterMarkWithImage:(UIImage *)wateMarkImage position:(CGPoint)position;
This method adds a watermark to the live stream. The watermark's size is determined by the size of wateMarkImage and its location by position, and both are measured in the pixel coordinates of the captured data. For example, if capture uses AVCaptureSessionPreset1280x720 and wateMarkImage has size (100, 100) with position (200, 300), then within the 1280x720 capture frame the watermark is drawn at origin (200, 300) with size (100, 100).
2) Remove the watermark
-(void)clearWaterMark;
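For instance, the sketch below adds a 100x100 logo at (200, 300) and later removes it; session and the "logo" image name are illustrative assumptions.

// Assumes `session` is a configured PLCameraStreamingSession and
// "logo" is a 100x100 image in the app bundle (illustrative names).
UIImage *waterMark = [UIImage imageNamed:@"logo"];
// With AVCaptureSessionPreset1280x720, this draws the 100x100 mark
// at origin (200, 300) inside the 1280x720 capture frame.
[session setWaterMarkWithImage:waterMark position:CGPointMake(200, 300)];

// Later, remove the watermark from the stream:
[session clearWaterMark];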
3. Filters
In addition to the beautification and watermark functions above, many other processing effects can be applied to video. Following the open design described earlier, the SDKs provided by Qiniu's live video cloud support all kinds of custom filters through their data-source callback interfaces.
To achieve rich filter effects on iOS, you can use the GPUImage library, an open-source GPU-based image and video processing framework with more than 120 built-in filter effects. With it, adding a real-time filter takes only a few lines of code, and you can also write your own algorithms on top of it to achieve richer effects. GPUImage address: https://github.com/BradLarson/GPUImage
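To show how little code is involved, here is a minimal sketch based on the GPUImage README that previews the camera through a built-in sepia filter; the cast of self.view to a GPUImageView assumes the view controller's view was set up that way, as in the README example.

#import <GPUImage/GPUImage.h>

// Capture from the back camera at 720p, in portrait orientation.
GPUImageVideoCamera *videoCamera =
    [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset1280x720
                                        cameraPosition:AVCaptureDevicePositionBack];
videoCamera.outputImageOrientation = UIInterfaceOrientationPortrait;

// Insert one of the 120+ built-in filters between camera and view.
GPUImageSepiaFilter *sepiaFilter = [[GPUImageSepiaFilter alloc] init];
[videoCamera addTarget:sepiaFilter];

// Render the filtered frames into an on-screen GPUImageView.
GPUImageView *filterView = (GPUImageView *)self.view;
[sepiaFilter addTarget:filterView];

[videoCamera startCameraCapture];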
On Android, there is also a port of the GPUImage library: https://github.com/CyberAgent/android-gpuimage (Android filters based on OpenGL, with the idea taken from GPUImage for iOS).
Google has also open-sourced Grafika, a test app covering a lot of multimedia and graphics-related processing on Android: https://github.com/google/grafika
4. Livemix
Livemix is a common requirement in interactive live broadcasting: the host interacts with some viewers in real time, and the result of the interaction is played to all other viewers, also in real time.
Given this requirement, it is easy to start from the one-way broadcasting principle and imagine the host and the connected viewer each pushing and pulling streams in both directions to interact, with the two streams then merged on the server and pushed to the remaining audience. However, the latency introduced by RTMP makes this approach unacceptable for interactive broadcasts.
In fact, the main technical difficulties of interactive live broadcasting are:
1) Low-latency interaction: real-time interaction between the host and the connected viewers must feel like a phone call, so each side must be able to hear the other's voice and see the other's video within about a second;
2) Audio and video synchronization: the requirement is similar to that of one-way broadcasting, but interactive broadcasting demands lower latency, so audio and video must stay synchronized at second-level precision;
3) Real-time audio and video mixing: the other viewers need to watch the conversation in real time, so picture and sound must be composited in real time on the client or server side and then delivered to the audience at low cost and high quality.
In the video and teleconferencing field, the mature options today are commercial solutions such as those from Cisco or WebEx, but they have three drawbacks: first, they are not open source; second, they are relatively closed; and third, they are relatively expensive. For interactive broadcasts with a relatively small number of participants, real-time communication based on WebRTC is currently a mature choice.
In multi-party real-time communication based on the WebRTC protocol, connections between the local user (the host) and remote users (the connected viewers) are managed through the RTCPeerConnection API, which encapsulates the details of underlying connection management and signaling control. On top of this, real-time communication among a small group (fewer than 14 people) can be implemented fairly easily.
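As a rough sketch of what this looks like on iOS, the snippet below creates a peer connection using the WebRTC project's Objective-C wrapper. The class names follow the WebRTC framework but exact signatures vary across versions, and signaling (exchanging SDP offers/answers and ICE candidates) is omitted, so treat it as illustrative only.

#import <WebRTC/WebRTC.h>

// Illustrative only: one RTCPeerConnection per remote participant;
// `self` implements RTCPeerConnectionDelegate to receive remote
// tracks and ICE events.
RTCPeerConnectionFactory *factory = [[RTCPeerConnectionFactory alloc] init];

RTCConfiguration *config = [[RTCConfiguration alloc] init];
config.iceServers = @[ [[RTCIceServer alloc]
    initWithURLStrings:@[ @"stun:stun.l.google.com:19302" ]] ];

RTCMediaConstraints *constraints =
    [[RTCMediaConstraints alloc] initWithMandatoryConstraints:nil
                                          optionalConstraints:nil];

RTCPeerConnection *peerConnection =
    [factory peerConnectionWithConfiguration:config
                                 constraints:constraints
                                    delegate:self];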
Of course, the complexity of the communication depends on the number of participants; with two people it is relatively simple. When the number grows to four, however, more network topologies become possible: the participants can form a self-organizing mesh in which every vertex connects to every other (n(n-1)/2 links for n participants), a star network centered on one person (n-1 links), or a topology in which everyone communicates through a centralized server.
As a provider of high-performance, scalable basic live broadcasting services, Qiniu's live broadcasting cloud chose the star-shaped communication network centered on the host, which preserves the quality of interaction between the host and multiple viewers. To deliver the mixed audio and video to the other viewers in real time, transmission uses a modified UDP-based protocol.
In the next installment, we will introduce encoding and encapsulation in detail.
Coming soon!
Author: He Lishi, chief evangelist at Qiniu Cloud. For more technical insights on the cloud industry, visit the Qiniu Cloud blog.