For reprints, please credit the source: http://blog.csdn.net/herm_lib/article/details/7252329
I have worked on several IM systems, one of which is still running in production. Here I will briefly introduce the architecture of an IM system for large-scale commercial use. One of the most important goals in designing this architecture is low coupling: the whole system is split into several independent subsystems. I divide it into the following parts: (1) status message system (2) friend system (3) P2P system (4) other extended business systems
Let's first look at the status message system.
Connd
The client access server. It can support UDP or TCP; TCP is generally recommended. Multiple connd servers can be deployed, and when clients connect, load balancing can be achieved through simple DNS round robin. Connd maintains connections and forwards message packets.
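As a rough illustration of the client side of DNS round robin, here is a minimal sketch. It assumes all connd servers are published under one DNS name; the function names, the `resolver` parameter, and the fallback-to-next-address behavior are my own assumptions, not part of the system described here.

```python
import socket

def resolve_connd(host, port, resolver=socket.getaddrinfo):
    """Return the distinct (ip, port) pairs the name resolves to, in DNS order."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    seen, addrs = set(), []
    for _family, _type, _proto, _canon, sockaddr in infos:
        addr = (sockaddr[0], sockaddr[1])
        if addr not in seen:          # the resolver may repeat an address
            seen.add(addr)
            addrs.append(addr)
    return addrs

def connect_to_connd(host, port, resolver=socket.getaddrinfo, timeout=5.0):
    """Try each resolved connd address in turn; return the first TCP connection."""
    last_error = None
    for addr in resolve_connd(host, port, resolver):
        try:
            return socket.create_connection(addr, timeout=timeout)
        except OSError as exc:        # this connd is unreachable, try the next one
            last_error = exc
    raise ConnectionError(f"no connd reachable for {host}:{port}") from last_error
```

With round robin, the DNS server rotates the order of the returned addresses, so different clients naturally start with different connd servers.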
Pconnd
Proxy connd, a proxy access server, is an extension of connd. In addition to connd's functions, it accepts connections from other servers, such as a web server.
Msgd
The message processing server. It provides the following functions: user status management, message forwarding (including validity checks), and offline message storage.
The following describes how all of a user's friends are notified after the user logs in successfully. In my system, user status is handled just as simply as a text chat message. In the flow below, user U, who has friends F1 and F2, comes online.
(1) connd receives a publish message from U and sends it to U's msgd.
(2) msgd looks up U's friends, F1 and F2. If F1 and F2 are not on the same msgd as U, msgd forwards the message through connd to the msgd instances where F1 and F2 are located.
(3) The final msgd sends the online notification to F1 and F2 through connd.
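The three steps above can be sketched in memory as follows. The `Connd` and `Msgd` classes, their method names, and the user-to-msgd table are hypothetical stand-ins for illustration only; real servers would exchange packets over the network, and offline message storage is left out.

```python
from collections import defaultdict

class Connd:
    """Hypothetical access server: keeps client connections and relays packets."""
    def __init__(self):
        self.delivered = defaultdict(list)  # user -> packets pushed to that client
        self.msgd_of = {}                   # user -> msgd responsible for the user

    def from_client(self, user, packet):
        # Step 1: pass the client's packet to the sender's msgd.
        self.msgd_of[user].on_publish(user, packet)

    def to_msgd(self, target, sender, packet):
        # Step 2: relay to the msgd where the friend is located.
        msgd = self.msgd_of.get(target)
        if msgd is not None:                # friend offline: storage omitted here
            msgd.deliver(target, sender, packet)

    def to_client(self, user, sender, packet):
        self.delivered[user].append((sender, packet))

class Msgd:
    """Hypothetical message server holding the status of a subset of users."""
    def __init__(self, connd, friends):
        self.connd, self.friends, self.users = connd, friends, set()

    def register(self, user):
        self.users.add(user)
        self.connd.msgd_of[user] = self

    def on_publish(self, sender, packet):
        for friend in self.friends.get(sender, []):
            if friend in self.users:        # friend lives on this msgd
                self.deliver(friend, sender, packet)
            else:                           # friend lives on another msgd
                self.connd.to_msgd(friend, sender, packet)

    def deliver(self, user, sender, packet):
        # Step 3: the final msgd pushes the notification out through connd.
        self.connd.to_client(user, sender, packet)

connd = Connd()
friends = {"U": ["F1", "F2"]}
msgd_a, msgd_b = Msgd(connd, friends), Msgd(connd, friends)
msgd_a.register("U")
msgd_a.register("F1")
msgd_b.register("F2")
connd.from_client("U", "online")
```

Here F1 shares a msgd with U and is notified directly, while the notification for F2 takes the extra hop through connd to the second msgd.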
How does U's msgd obtain U's latest friend list? I would like to address this question.
A user's friend data is stored in another subsystem, the friend subsystem. Msgd actively fetches it from the friend subsystem over TCP (why TCP?). Msgd also caches a copy of the friend data. When msgd needs a user's friends, it reads them directly from the cache if the cache is up to date; otherwise it fetches them from the friend subsystem. This raises the key question: how can msgd tell whether its copy of a user's friends is up to date? Different methods suit different businesses. Here is one efficient approach:
(1) The friend subsystem computes a hash value over each user's friend list (MD5 can be used).
(2) When the client fetches its friends, it also receives the hash value. When it sends a friend-related message, it includes that hash value in the message to msgd.
(3) msgd stores the hash value the first time it fetches a user's friends from the friend subsystem. For example, when a user's status message must be forwarded and the friend list is needed, msgd compares hash1 brought by the client with its own hash2...
In businesses such as IM, friend data is written rarely and read often, so the cost of writes is negligible relative to the cost of reads. With the above method, the two hash values are almost always equal, and the friend data is served directly from the cache. This method can also be applied to other application services. I recommend not fetching friend-like data from another process each time it is needed.
That concludes this brief introduction to the status message subsystem of the IM system.
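The hash check can be sketched as below. The refetch on mismatch follows the cache rule stated earlier (use the cache if it is up to date, otherwise go to the friend subsystem). The class names, the in-memory `FriendSubsystem` stand-in, and hashing the sorted friend IDs are my own assumptions.

```python
import hashlib

def friends_hash(friends):
    """MD5 over the sorted friend IDs, so the same set always gives the same hash."""
    return hashlib.md5(",".join(sorted(friends)).encode("utf-8")).hexdigest()

class FriendSubsystem:
    """In-memory stand-in for the remote friend subsystem."""
    def __init__(self):
        self.data = {}
        self.fetch_count = 0              # counts round trips, for illustration

    def set_friends(self, user, friends):
        self.data[user] = list(friends)

    def fetch(self, user):
        self.fetch_count += 1
        friends = self.data.get(user, [])
        return friends, friends_hash(friends)

class MsgdFriendCache:
    """msgd-side cache: user -> (friends, hash2)."""
    def __init__(self, subsystem):
        self.subsystem = subsystem
        self.cache = {}

    def get_friends(self, user, client_hash):
        entry = self.cache.get(user)
        if entry is not None and entry[1] == client_hash:
            return entry[0]               # hash1 == hash2: the cache is current
        # First access or hash mismatch: refresh from the friend subsystem.
        friends, hash2 = self.subsystem.fetch(user)
        self.cache[user] = (friends, hash2)
        return friends
```

Because the friend list rarely changes, nearly every call takes the first branch and never leaves the msgd process.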
As for the friend subsystem, it is quite simple: run several TCP access servers, and route each request to a friend logic server chosen according to the user. The friend logic server can use the LRU eviction algorithm for its cache (this algorithm is commonly used). If you have plenty of money, you can also cache all friend data in full.
The P2P system is similar to the architectures already described on the Internet.
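A minimal sketch of these two ideas follows: choosing a friend logic server per user, and an LRU cache inside that server. Routing by CRC32 modulo the server count is only one possible assumption (the text says only that the choice depends on the user), and `LRUCache` and `pick_logic_server` are hypothetical names.

```python
import zlib
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # drop the least recently used entry

def pick_logic_server(user_id, servers):
    """Map a user to one friend logic server, stably across calls and processes."""
    return servers[zlib.crc32(user_id.encode("utf-8")) % len(servers)]
```

Sending the same user to the same logic server every time keeps that user's friend list in a single cache, which is what makes a bounded LRU cache effective.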