C# Implementation of video conferencing system GGMeeting (with source)


Some time ago I worked on an online education/training project, and video conferencing is quite similar. So, following the WAN instant messaging system GG (a high-imitation of QQ), I decided to write a video conferencing system and share its implementation principles and source code for interested friends to refer to. Inheriting the GG name, I have named this video conferencing system GGMeeting. The current version is 1.0, and the features below will be continuously enhanced.

In general, the core functions of a video conferencing system are: multi-person voice, multi-person video, a shared electronic whiteboard, and conference room management. This article introduces these main functions and the principles behind them; the details of how each feature is implemented will follow.

I. Voice calls

1. Basic model

In video conferencing, network voice calls are usually many-to-many, but at the model level we only need to discuss a one-way channel: one side speaks and the other side hears the sound. It looks simple, but the process behind it is fairly complicated. The main links along this path can be simplified into the conceptual model shown below:

This is the most basic model, consisting of five key links: acquisition, encoding, transmission, decoding, and playback.

Voice acquisition means collecting audio data from the microphone, that is, converting sampled sound into a digital signal. It involves several important parameters: the sampling frequency, the number of bits per sample, and the number of channels.

Suppose the captured audio frames are not encoded at all but sent directly. Using typical capture parameters (for example 16 kHz sampling, 16-bit samples, mono, one frame every 10 ms, i.e. 320 bytes per frame and 100 frames per second), the required bandwidth is 320*100 = 32 KB/s, or 256 kb/s in bits per second. That is a lot of bandwidth. Yet with a network traffic monitor we can see that IM software such as QQ uses only about 3-5 KB/s for a voice call, an order of magnitude less than the raw traffic, and this is mainly thanks to audio coding. So in a real voice call application, the encoding link is indispensable. There are many common speech coding technologies, such as G.729, iLBC, AAC, Speex and so on.
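
A minimal sketch of that bandwidth arithmetic, assuming the example capture parameters above (16 kHz, 16-bit, mono, 10 ms frames):

    using System;

    class RawAudioBandwidth
    {
        static void Main()
        {
            int sampleRate = 16000;     // samples per second (assumed example value)
            int bytesPerSample = 2;     // 16-bit samples
            int channels = 1;           // mono
            int frameMilliseconds = 10; // one frame every 10 ms -> 100 frames per second

            int bytesPerFrame = sampleRate * bytesPerSample * channels * frameMilliseconds / 1000; // 320
            int framesPerSecond = 1000 / frameMilliseconds;                                        // 100
            int rawBytesPerSecond = bytesPerFrame * framesPerSecond;                               // 32,000

            // Prints: raw PCM needs 32 KB/s = 256 kb/s, which is why an encoder (Speex, iLBC, ...) is needed.
            Console.WriteLine("Raw PCM: {0} KB/s = {1} kb/s", rawBytesPerSecond / 1000, rawBytesPerSecond * 8 / 1000);
        }
    }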

Once an audio frame has been encoded, it can be sent over the network to the other party. For real-time applications such as voice calls, low latency and stability are very important, which requires the network transmission to be very smooth.

When the other party receives an encoded frame, it decodes it to recover data that the sound card can play directly.

Once the decoding is complete, the resulting audio frame can be submitted to the sound card for playback.

2. Advanced Features

If the conceptual model above were all it took, building a good WAN voice call system would be too easy. In reality, many practical factors pose challenges to this model, so implementing a network voice system is not that simple; it involves a lot of specialized techniques. A "good" voice call system should achieve the following: low latency, little background noise, smooth voice with no stutters or pauses, and no echo.

Regarding low latency: only with low latency do both parties in a call get a strong sense of real-time interaction. This mainly depends on the network speed and the physical distance between the callers, so from a purely software point of view there is little room for optimization.

(1) Echo cancellation

Almost everyone is now used to voice chatting directly through the speakers of a PC or notebook. When the speakers are used, the sound they play is picked up again by the microphone and sent back to the other side, so the other side hears their own echo.

The principle of echo cancellation, put simply, is that the echo cancellation module takes the audio frames that have just been played and uses them to subtract the echo from the captured audio frames. This process is quite complicated, because it also depends on the size of the room you are in and your position in it, since these determine how long the sound waves take to reflect back. A good echo cancellation module dynamically adjusts its internal parameters to adapt as well as possible to the current environment.
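
To make the idea concrete, here is a minimal sketch of the NLMS adaptive filter that most echo cancellers are built around. This is only an illustration, not the module GGMeeting uses; the filter length and step size are arbitrary example values, and production cancellers (Speex, WebRTC) are far more sophisticated.

    using System;

    // Minimal NLMS (normalized least-mean-squares) echo canceller sketch.
    // 'referenceSample' is the sample just sent to the speakers, 'micSample' is the captured sample;
    // the return value is the captured sample with the estimated echo subtracted.
    class NlmsEchoCanceller
    {
        private readonly double[] weights;   // adaptive filter taps (estimate of the echo path)
        private readonly double[] history;   // recent reference (playback) samples
        private readonly double stepSize;

        public NlmsEchoCanceller(int filterLength = 256, double stepSize = 0.1)
        {
            weights = new double[filterLength];
            history = new double[filterLength];
            this.stepSize = stepSize;
        }

        public double Process(double referenceSample, double micSample)
        {
            // Shift the playback history and insert the newest reference sample.
            Array.Copy(history, 0, history, 1, history.Length - 1);
            history[0] = referenceSample;

            // Estimate the echo as the filter output, then subtract it from the microphone signal.
            double echoEstimate = 0, power = 1e-6;
            for (int i = 0; i < weights.Length; i++)
            {
                echoEstimate += weights[i] * history[i];
                power += history[i] * history[i];
            }
            double cleaned = micSample - echoEstimate;

            // Adapt the filter towards the real echo path (normalized by the reference power).
            double scale = stepSize * cleaned / power;
            for (int i = 0; i < weights.Length; i++)
                weights[i] += scale * history[i];

            return cleaned;
        }
    }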

(2) Noise suppression

Noise suppression, also known as denoising, uses the characteristics of speech data to identify the background-noise portion of an audio frame and filter it out. Many encoders have this feature built in.

(3) Jitter buffer

A jitter buffer is used to deal with network jitter. Network jitter means the network delay keeps varying. In that case, even if the sender sends packets at a fixed interval (for example, one packet every 100 ms), the receiver cannot receive them at the same regular interval: sometimes a whole interval passes with no packet at all, sometimes several packets arrive within one interval. As a result, the sound the receiver hears is choppy.

The jitter buffer sits after the decoder and before playback. That is, once a speech frame has been decoded, it is put into the jitter buffer; when the sound card's playback callback arrives, the oldest frame is taken out of the jitter buffer and played.

The buffer depth of the jitter buffer depends on the degree of network jitter: the greater the jitter, the greater the buffer depth, and the greater the delay before audio is played. The jitter buffer therefore trades extra delay for smooth playback, because compared with choppy sound, a slightly larger but smoother delay gives a better subjective experience.

Of course, the buffer depth of the jitter buffer is not constant; it is adjusted dynamically according to the degree of network jitter. When the network becomes smooth again, the buffer depth becomes very small, so the extra playback delay introduced by the jitter buffer is negligible.
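
A minimal sketch of such a buffer, assuming decoded 16-bit frames and leaving the jitter measurement itself abstract (the initial depth of 3 frames is just an example value):

    using System.Collections.Generic;

    // Minimal jitter buffer sketch: decoded frames go in, the sound card's playback callback takes them out.
    class JitterBuffer
    {
        private readonly Queue<short[]> frames = new Queue<short[]>();
        private readonly object sync = new object();
        private int targetDepth;          // how many frames to hold back before starting playback
        private bool buffering = true;

        public JitterBuffer(int initialDepth = 3) { targetDepth = initialDepth; }

        // Call this when the measured network jitter changes.
        public void SetTargetDepth(int depth) { lock (sync) targetDepth = depth; }

        // Called after a received frame has been decoded.
        public void Push(short[] decodedFrame)
        {
            lock (sync) frames.Enqueue(decodedFrame);
        }

        // Called from the sound card's playback callback; never blocks.
        public short[] Pop(int samplesPerFrame)
        {
            lock (sync)
            {
                if (buffering && frames.Count >= targetDepth) buffering = false; // enough queued: start playing
                if (frames.Count == 0) buffering = true;                         // ran dry: buffer again

                return buffering ? new short[samplesPerFrame]                    // play silence while buffering
                                 : frames.Dequeue();
            }
        }
    }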

(4) Silence detection

In a voice call, if one party is not speaking, no traffic should be generated for them. Silence detection serves this purpose, and it is often integrated into the encoding module. The silence detection algorithm, combined with the noise suppression algorithm above, can determine whether there is currently any voice input; if there is none, the encoder can output a special frame (for example, one of length 0). In a multi-person video conference, where usually only one person speaks at a time, the bandwidth saved by silence detection is considerable.
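
Real silence detection is usually provided by the codec itself, but a crude energy-based check illustrates the idea; the threshold below is an arbitrary example value, not something taken from GGMeeting.

    using System;

    static class SilenceDetector
    {
        // Crude energy-based silence check for a frame of 16-bit PCM samples.
        public static bool IsSilent(short[] frame, double threshold = 500)
        {
            double sumOfSquares = 0;
            foreach (short sample in frame)
                sumOfSquares += (double)sample * sample;

            double rms = Math.Sqrt(sumOfSquares / frame.Length);  // average loudness of the frame
            return rms < threshold;                               // below the threshold: treat as silence
        }
    }

    // Usage in the capture callback (Encode/Send are placeholders for the real pipeline):
    //   if (!SilenceDetector.IsSilent(capturedFrame)) Send(Encode(capturedFrame));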

(5) Mixing

In a video conference, several people may speak at the same time, so we need to play voice data from multiple sources simultaneously; but the sound card has only one playback buffer, so the multiple speech streams must be mixed into one. That is what the mixing algorithm does.
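
The simplest mixing algorithm just adds the corresponding samples of each stream and clamps the result to the 16-bit range, as in this sketch (real mixers usually also attenuate the sum to avoid constant clipping):

    using System.Collections.Generic;

    static class Mixer
    {
        // Mix several decoded 16-bit frames of equal length into one frame for the sound card.
        public static short[] Mix(IList<short[]> streams, int samplesPerFrame)
        {
            short[] mixed = new short[samplesPerFrame];
            for (int i = 0; i < samplesPerFrame; i++)
            {
                int sum = 0;
                foreach (short[] stream in streams)
                    sum += stream[i];

                if (sum > short.MaxValue) sum = short.MaxValue;  // clamp so the sum does not wrap around
                if (sum < short.MinValue) sum = short.MinValue;
                mixed[i] = (short)sum;
            }
            return mixed;
        }
    }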

II. Video calls

1. Basic model

The conceptual model for video calls is exactly the same as for voice:

Camera capture means grabbing each frame of video from the camera. On Windows, it is usually implemented with VFW or DirectShow. The two key parameters of video capture are the frame rate (FPS) and the resolution.

In general, a camera supports several different capture resolutions and frame rates, and different cameras support different sets of them. For example, many high-definition cameras can capture 1920*1080 images at 30 fps.

Encoding compresses the video images and also determines how sharp the picture is. Commonly used video coding technologies include H.263, H.264, MPEG-4, Xvid and so on.

When the other party receives an encoded video frame, it decodes it to recover the frame image and then draws it on the UI.

2. Advanced Features

Compared with voice, the video-related processing is simpler.

(1) Dynamically adjust the sharpness of the video

On the Internet, network speed changes dynamically all the time, so in a video conference, in order to give priority to the quality of the voice call, the video parameters need to be adjusted in real time. The most important one is the encoding sharpness, because the higher the sharpness, the higher the bandwidth requirement, and vice versa.

For example, when the network is detected to be busy, the system automatically lowers the encoding sharpness to reduce bandwidth usage.

(2) Automatically discard video frames

Similarly, when the network is busy, another option is for the sender to actively discard some video frames instead of sending them, so from the receiver's point of view the frame rate (fps) drops.
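
A minimal sketch combining the two strategies above. How congestion is detected is left abstract, and the profiles and drop ratios are illustrative assumptions, not GGMeeting's actual values.

    // Illustrative sender-side adaptation: pick a lower-quality encoding profile and drop
    // frames when the network is busy. CongestionLevel and the numbers are example values.
    enum CongestionLevel { None, Moderate, Severe }

    class AdaptiveVideoSender
    {
        private long frameCounter;

        // Choose an encoding profile (resolution / bitrate) for the current network state.
        public (int Width, int Height, int KbitPerSecond) ChooseProfile(CongestionLevel level)
        {
            switch (level)
            {
                case CongestionLevel.Severe:   return (320, 240, 200);
                case CongestionLevel.Moderate: return (640, 480, 500);
                default:                       return (1280, 720, 1500);
            }
        }

        // Decide whether the current captured frame should be sent at all.
        public bool ShouldSendFrame(CongestionLevel level)
        {
            frameCounter++;
            switch (level)
            {
                case CongestionLevel.Severe:   return frameCounter % 3 == 0;  // keep one frame in three
                case CongestionLevel.Moderate: return frameCounter % 2 == 0;  // keep every other frame
                default:                       return true;                   // send every frame
            }
        }
    }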

III. Electronic whiteboard

In video conferencing, the whiteboard function is very important. Usually the host of the meeting draws on the whiteboard, and the other participants can watch and operate on the whiteboard content at the same time.

A typical whiteboard supports the following elements: line segment, single-arrow line, double-arrow line, horizontal elbow connector, vertical elbow connector, rectangle, triangle, ellipse (circle), text, freehand curve, inserted picture, and laser pointer.

In this implementation, the electronic whiteboard is drawn mainly with GDI+.

Whiteboard synchronization works like this: when the operator draws a shape on the whiteboard, the operation is encapsulated into a Command object (the command pattern) and then broadcast over the network to the other participants in the meeting. When another participant receives the command object, it is converted back into a whiteboard operation, so the contents of every whiteboard stay synchronized automatically.
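
A minimal sketch of that command-pattern idea, drawn with GDI+. The type names here are assumptions for illustration, not GGMeeting's actual classes, and BroadcastToRoom/Serialize/Deserialize stand in for the real networking code.

    using System;
    using System.Drawing;

    // Every drawing action becomes a serializable command that can be applied locally
    // and broadcast to the other participants.
    [Serializable]
    abstract class WhiteboardCommand
    {
        // Re-apply the operation on a whiteboard surface (drawn here with GDI+).
        public abstract void Apply(Graphics g);
    }

    [Serializable]
    class DrawRectangleCommand : WhiteboardCommand
    {
        public Rectangle Bounds;

        public override void Apply(Graphics g)
        {
            using (var pen = new Pen(Color.Black, 2))
                g.DrawRectangle(pen, Bounds);
        }
    }

    // Sender side:
    //   var cmd = new DrawRectangleCommand { Bounds = new Rectangle(10, 10, 200, 100) };
    //   cmd.Apply(localWhiteboardGraphics);   // draw on the local whiteboard
    //   BroadcastToRoom(Serialize(cmd));      // hypothetical network call
    //
    // Receiver side:
    //   WhiteboardCommand received = (WhiteboardCommand)Deserialize(bytes);
    //   received.Apply(remoteWhiteboardGraphics);   // every whiteboard ends up with the same content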

IV. Conference room management

In a typical video conferencing scenario, conference rooms are created dynamically and destroyed after use, so it is very natural to represent a meeting room with a dynamic group.

A "dynamic group" is a group that is dynamically created in server memory, does not need to serialize storage to a database or disk, creates one when needed, then joins multiple members for intra-group communication, and destroys it directly from memory when it is no longer in use .

Based on socket technology, we can implement a DynamicGroupManager class on the server side to manage dynamic groups.

Although dynamic groups exist only in memory, we can persist some of their important information to the database if the project requires it. When the server restarts, the important room information can then be loaded back from the DB.
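
A minimal sketch of such a DynamicGroupManager, assuming the server already has some routine for sending a byte[] to one connected user (the sendToUser delegate below is a placeholder for it):

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;

    // Sketch of server-side dynamic group management. Groups live only in memory:
    // they are created on demand, used for intra-group broadcasting, and removed when empty.
    class DynamicGroupManager
    {
        // groupId -> set of member user IDs
        private readonly ConcurrentDictionary<string, HashSet<string>> groups =
            new ConcurrentDictionary<string, HashSet<string>>();

        // Placeholder for the server's real "send to one client" routine.
        private readonly Action<string, byte[]> sendToUser;

        public DynamicGroupManager(Action<string, byte[]> sendToUser) { this.sendToUser = sendToUser; }

        public void Join(string groupId, string userId)
        {
            var members = groups.GetOrAdd(groupId, _ => new HashSet<string>());
            lock (members) members.Add(userId);          // the room is created on the first join
        }

        public void Leave(string groupId, string userId)
        {
            if (!groups.TryGetValue(groupId, out var members)) return;
            lock (members)
            {
                members.Remove(userId);
                if (members.Count == 0)
                    groups.TryRemove(groupId, out _);    // last member left: destroy the room
            }
        }

        // Forward a message (voice, video, whiteboard command, ...) to everyone else in the room.
        public void Broadcast(string groupId, string senderId, byte[] message)
        {
            if (!groups.TryGetValue(groupId, out var members)) return;
            string[] snapshot;
            lock (members) snapshot = new List<string>(members).ToArray();
            foreach (string userId in snapshot)
                if (userId != senderId)
                    sendToUser(userId, message);
        }
    }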

V. GGMeeting source code

GGMeeting is currently at version 1.0 and implements the four main functions described above. You can download the source code and study it.

GGMeeting-v1.0 source code

GGMeeting-v1.0 directly deployable version

Running effect (screenshot):

Deployment notes:

(1) Deploy Ggmeeting.server to the server and run it.

(2) Modify the value of ServerIP in the client configuration file GGMeeting.exe.config.

(3) Run the first client instance and enter the test room with a random account.

(4) Run the client on another machine and enter the test room with another random account; you can then hold a video conference in the test room.

Note: voice and video data are captured and played back in real time, so when testing, the server should ideally have dedicated bandwidth; shared bandwidth generally cannot meet the requirements of real-time voice and video.

________________________________________________________________________

You are welcome to discuss anything about GG and GGMeeting with me. My QQ: 2027224508. Let's talk!

If you have any questions or suggestions, you can leave a comment, or send an email to my address: [email protected].

If you find this useful, please follow me and give this post a thumbs up.
