Issues to Consider in an Instant Messaging Project

Source: Internet
Author: User

I. Protocol

The choice of protocol depends mainly on the following considerations.

1. Cross-platform versatility

If you want high versatility, for example so that your own IM server can interconnect with MSN, Gtalk, and other implementations, you can use the XMPP protocol.

2. Performance

Text protocols based on XML, JSON, and similar formats are bloated, because each field needs descriptive information (a tag or a key) to identify its meaning. In exchange, this kind of protocol is easy to extend.

For example, a chat message from A to B could be defined as follows:

    {"from": "A", "to": "B", "message": "Hello"}

With a custom binary protocol, the server and the client agree in advance on the order in which the fields appear and parse each field sequentially. The descriptive information for each field is no longer needed, so the protocol itself is much lighter. For mobile applications, which are sensitive to traffic, this kind of protocol has a more obvious advantage.
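To make the size difference concrete, here is a minimal sketch that encodes the same chat message both ways. The binary layout (a 1-byte message type followed by three length-prefixed UTF-8 fields), the MSG_CHAT type code, and the function names are assumptions made for illustration, not a layout taken from any real IM protocol.

```python
import json
import struct

MSG_CHAT = 1  # hypothetical type code meaning "chat message"


def encode_json(sender: str, receiver: str, text: str) -> bytes:
    """Self-describing text encoding: every field carries its key."""
    body = {"from": sender, "to": receiver, "message": text}
    return json.dumps(body).encode("utf-8")


def encode_binary(sender: str, receiver: str, text: str) -> bytes:
    """Assumed layout: 1-byte type, then from/to/message in a fixed order,
    each as a 2-byte big-endian length followed by UTF-8 bytes."""
    parts = [struct.pack("!B", MSG_CHAT)]
    for field in (sender, receiver, text):
        raw = field.encode("utf-8")
        parts.append(struct.pack("!H", len(raw)))
        parts.append(raw)
    return b"".join(parts)


def decode_binary(data: bytes):
    """Parse fields sequentially, relying only on the agreed order."""
    (msg_type,) = struct.unpack_from("!B", data, 0)
    offset = 1
    fields = []
    for _ in range(3):
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
        fields.append(data[offset:offset + length].decode("utf-8"))
        offset += length
    return msg_type, fields[0], fields[1], fields[2]


if __name__ == "__main__":
    print(len(encode_json("A", "B", "Hello")), "bytes as JSON")
    print(len(encode_binary("A", "B", "Hello")), "bytes as binary")
```

The price of the smaller encoding is that adding or reordering a field requires both sides to change together, which is the extensibility trade-off described above.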


II. Distributed deployment (clusters)

Whether the goal is greater capacity or disaster tolerance and robustness, the most effective approach is to build a cluster.

Once a cluster is in place, users are spread across different nodes. How do users on different nodes locate each other and exchange messages? There are two approaches.

The first is to synchronize the data across all nodes.

Large IM systems tend to have their own cache system, and many NoSQL databases (Redis, Mongo, Mnesia, etc.) can be used to implement this cache.

Information such as which users are online and which node each one is on can be stored in the cache. When user A logs in, the user and the node it logged in to are synchronized to all nodes. With this data, a user on any node knows how to find user A and send it a message. A load balancer can sit at the outermost layer and distribute users across the nodes. The disadvantage of this approach is that data synchronization costs performance, and the redundant data makes memory consumption somewhat higher.
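The synchronized approach can be sketched with a toy in-memory model. The Node class, its peers list, and its method names are hypothetical; in a real deployment the replicated table would live in a shared cache such as the ones named above, not in Python dictionaries.

```python
class Node:
    """One IM server node holding a full copy of the presence table."""

    def __init__(self, name: str):
        self.name = name
        self.peers = []      # other Node objects in the cluster
        self.presence = {}   # uid -> name of the node the user is logged in to

    def login(self, uid: str) -> None:
        # Record the login locally, then replicate it to every other node.
        self.presence[uid] = self.name
        for peer in self.peers:
            peer.presence[uid] = self.name

    def logout(self, uid: str) -> None:
        self.presence.pop(uid, None)
        for peer in self.peers:
            peer.presence.pop(uid, None)

    def locate(self, uid: str):
        # Any node can answer from its own copy; None means "not online".
        return self.presence.get(uid)


def build_cluster(names):
    """Create nodes and wire each one to all the others."""
    nodes = [Node(name) for name in names]
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]
    return nodes
```

Every login writes to every node, which is exactly the synchronization and memory cost mentioned above.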

The second is to use a hash to scatter users across the nodes.

This approach needs no data synchronization: each node holds only the data of users logged in to that node. At each login, the node a user belongs to is computed by hashing the user's uid. When a user sends a message to another user, the server applies the same hash algorithm to the target user's uid to work out which node the target user is on. To reduce the impact on the whole system when nodes are added or removed, consistent hashing can be used.

The disadvantage of this approach is that when a single node fails, the user data of the failed node has to be rebuilt on other nodes, which can put great strain on them. Some data may also be lost permanently.
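A minimal sketch of the consistent hashing idea follows, assuming MD5 as the hash function and 100 virtual points per node; both choices, along with the HashRing class and its method names, are illustrative assumptions rather than something the approach prescribes.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring mapping a uid to a node name."""

    def __init__(self, nodes, replicas: int = 100):
        self.replicas = replicas   # virtual points per node (assumed value)
        self._positions = []       # sorted hash positions on the ring
        self._owners = {}          # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node: str) -> None:
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if pos not in self._owners:
                bisect.insort(self._positions, pos)
            self._owners[pos] = node

    def remove(self, node: str) -> None:
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owners.get(pos) == node:
                del self._owners[pos]
                self._positions.remove(pos)

    def node_for(self, uid) -> str:
        """Return the owner of the first ring position after the uid's hash."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect(self._positions, self._hash(str(uid)))
        return self._owners[self._positions[idx % len(self._positions)]]
```

When a node is removed, only the users that were mapped to it move to other nodes; everyone else stays where they were, which is the property that limits the impact of adding or removing nodes.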


III. Avoiding message loss

The core business of an IM system is forwarding messages, and losing messages is intolerable. The key to not losing a message is how the server decides whether the client has received it.

The server cannot rely on the user's online status, because whatever method it uses, its view of that status is inaccurate. The only reliable way to know that the client has received a message is to require the client to send an ack to the server after receiving it. Once the server receives the ack, it can be sure the client got the message.

The processing logic for a message from user A to user B should then look like this:

A sends the message to the server. The server stores the message and forwards it to B. When B's ack is received, the server deletes the message or moves it to a message backup table.

If B is offline, the server waits until B comes online and then pushes all of B's messages to B. After receiving them, B's client sends acks to the server, and the server deletes or moves each message when its ack arrives.
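The store, forward, and ack flow above can be sketched as follows. This is an in-memory illustration under assumptions: the MessageServer class, its method names, and the delivery callback are hypothetical, and a real server would persist pending messages in a database rather than a dictionary.

```python
import itertools


class MessageServer:
    """Keeps every message until the receiving client acknowledges it."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}   # msg_id -> (receiver, sender, text), awaiting ack
        self.archive = {}   # acked messages, standing in for the backup table
        self.online = {}    # uid -> callback(msg_id, sender, text)

    def send(self, sender: str, receiver: str, text: str) -> int:
        msg_id = next(self._ids)
        self.pending[msg_id] = (receiver, sender, text)  # store before forwarding
        self._push(msg_id)
        return msg_id

    def _push(self, msg_id: int) -> None:
        entry = self.pending.get(msg_id)
        if entry is None:
            return  # already acknowledged
        receiver, sender, text = entry
        deliver = self.online.get(receiver)
        if deliver is not None:
            deliver(msg_id, sender, text)  # delivery alone proves nothing

    def connect(self, uid: str, deliver) -> None:
        # On login, push everything still pending for this user.
        self.online[uid] = deliver
        backlog = [m for m, (to, _, _) in self.pending.items() if to == uid]
        for msg_id in backlog:
            self._push(msg_id)

    def disconnect(self, uid: str) -> None:
        self.online.pop(uid, None)

    def ack(self, msg_id: int) -> None:
        # Only an ack from the client moves a message out of pending.
        entry = self.pending.pop(msg_id, None)
        if entry is not None:
            self.archive[msg_id] = entry
```

Because a message leaves the pending table only on ack, a message pushed to a connection that has silently died is simply pushed again at the next login. The client therefore needs to tolerate duplicates, for example by ignoring message ids it has already seen.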


IV. Web IM

There are several ways to implement web IM. They differ mainly in how the server pushes messages to the browser client. The ones I know of are polling, WebSocket, and BOSH.

Polling, as the name implies, has the client start a timer and periodically send an HTTP request to the server to fetch messages. Its performance is poor.

WebSocket is handled the same way as in a client/server (C/S) architecture, but not all browsers support WebSocket.

The approach I prefer is BOSH. Put simply, the browser client sends an HTTP request to the server, and the server does not respond immediately but holds the request while waiting for a message. When a message arrives, the server returns it as the body of the response. If the time-out is reached and still no message for the user has arrived, the server returns a response with an empty body. After receiving the response, the client issues the request again, the server holds it again, and so on. I learned this approach from XMPP.

To get a deeper look at this
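The hold-until-message-or-time-out behaviour described for BOSH can be sketched without any HTTP machinery. This is only an illustration of the long-polling idea, not an implementation of the BOSH protocol itself; the LongPollEndpoint class, its method names, and the 30-second default are assumptions.

```python
import queue


class LongPollEndpoint:
    """Holds each poll until a message arrives or the time-out expires."""

    def __init__(self, hold_seconds: float = 30.0):
        self.hold_seconds = hold_seconds   # assumed time-out value
        self._queues = {}                  # uid -> queue of pending messages

    def _queue_for(self, uid: str) -> queue.Queue:
        return self._queues.setdefault(uid, queue.Queue())

    def publish(self, uid: str, message: str) -> None:
        """Called when a message for uid reaches the server."""
        self._queue_for(uid).put(message)

    def handle_poll(self, uid: str) -> str:
        """Body of the HTTP response for one poll request from uid."""
        try:
            # Block here instead of answering immediately.
            return self._queue_for(uid).get(timeout=self.hold_seconds)
        except queue.Empty:
            return ""  # time-out reached: empty body, client polls again


def client_loop(endpoint: LongPollEndpoint, uid: str, rounds: int):
    """Re-issue the request after every response, as a browser client would."""
    received = []
    for _ in range(rounds):
        body = endpoint.handle_poll(uid)
        if body:
            received.append(body)
    return received
```

In a real server each held request ties up a connection, so the handler would normally run on an asynchronous or lightweight-process runtime rather than one blocking thread per user.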

V. Building quickly

If you want to build quickly, open source is the obvious place to look.

I recommend ejabberd, for the following reasons.

1. Good concurrency performance.

On a server with 16 GB of memory I reached 300,000 concurrent connections without any being dropped. The clients ran on several machines, and the figure stopped at 300,000 only because I ran out of machines to generate load.

Judging from the memory growth trend, about 1 GB of memory can carry roughly 50,000 online users (and our own code still has a lot of room for optimization).

2. Mature and complete.

I found an XMPP client written in JavaScript on GitHub that can be used directly. It is a BOSH-based XMPP client.

If you embed this web client in iOS, Android, and other mobile applications and in PC clients, and then deploy an ejabberd server, a product that would otherwise take several teams half a year to develop can be roughly put together by one person in a week.

Tsung, a stress-testing tool written in Erlang, can be used directly for load testing.

3. Good extensibility. ejabberd provides many hooks internally, which makes it convenient to develop ejabberd plug-ins. In addition, plug-ins (modules) can be selected through the configuration file and loaded dynamically, so you load only what you need.

4. Beyond these, there are the features of the Erlang language itself, such as fault tolerance, which need no further explanation.


Appendix:

https://github.com/processone/ejabberd ejabberd

https://github.com/erlang-synrc/xmpp.js XMPP client

https://github.com/processone/tsung Tsung stress-testing tool
