Implementation Principles of Instant Messaging (IM)

Source: Internet
Author: User

Instant messaging (IM) software communicates over TCP and UDP, two transport protocols built on top of the lower-level IP protocol. TCP is stream-oriented: the data to be sent is segmented and packaged, then carried over a virtual circuit between the two machines, giving a continuous, two-way transport that strictly guarantees the correctness of the data. UDP is datagram-oriented: the data is split into independent datagrams, and the protocol places no requirement on the order in which they arrive.
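The difference between the two transports can be seen with a small loopback experiment. The following is only a sketch: the function names `udp_roundtrip` and `tcp_roundtrip` are made up for illustration, and both run entirely on 127.0.0.1, where loss and reordering are very unlikely in practice (on a real network UDP offers neither guarantee).

```python
import socket

def udp_roundtrip(messages):
    """Send each message as one UDP datagram over loopback and read them back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for msg in messages:
            sender.sendto(msg, receiver.getsockname())
        # Each recvfrom() returns exactly one datagram: message boundaries survive.
        return [receiver.recvfrom(2048)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()

def tcp_roundtrip(messages):
    """Send all messages over one TCP connection and read the byte stream back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()
    conn.settimeout(2.0)
    try:
        for msg in messages:
            client.sendall(msg)
        client.close()                   # closing the sender marks the end of the stream
        chunks = []
        while True:
            data = conn.recv(2048)
            if not data:                 # empty read: the peer closed the connection
                break
            chunks.append(data)
        # TCP delivers one ordered byte stream with no message boundaries.
        return b"".join(chunks)
    finally:
        conn.close()
        listener.close()
```

With UDP the receiver gets two separate datagrams back; with TCP it gets one continuous byte stream, so an IM protocol running over TCP has to add its own message framing.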


QQ sends and receives messages over UDP. Once QQ (originally called OICQ) is installed, your machine is in fact both a server and a client. When you log in, your QQ client connects to Tencent's main server as a client; when you check who is online, it again acts as a client and reads the list of online users from the QQ server. When you chat with a QQ contact and the connection between the two of you is reasonably stable, the chat content travels directly between the two computers as UDP datagrams. If the connection is not stable, the QQ server relays the chat content for you. Other instant messaging software works on similar principles.

General steps:

First, User A enters a username and password to log in to the instant messaging server. The server authenticates the user by reading the user database; if the username and password are correct, it registers User A's IP address, the version number of the IM client software, and the TCP/UDP port number in use. It then returns a login-success flag to User A, at which point User A's status (presence) in the IM system is online.


Second, using User A's buddy list stored on the IM server, the server sends User A's online information to the PCs of those buddies who are also online. This information includes the online status, IP address, and TCP port number used by the IM client. When the IM software on a buddy's PC receives it, a small notification window pops up on the desktop.


Third, the instant messaging server sends the buddy list that User A stores on the server, together with related information, back to User A's PC. This also includes each buddy's online status, IP address, and the TCP port number used by the IM client. When the IM client on User A's PC receives it, it displays the buddies and their online status.


Next, if User A wants to chat with his online friend User B, A's client uses the IP address, TCP port number, and other information about User B that the server sent over, and sends the chat message directly to User B's PC. User B's IM client displays the message on screen when it arrives, and User B then replies directly to User A's PC. The instant text messages of both sides are therefore not relayed by the IM server but travel point-to-point over the network. This is the peer-to-peer (P2P) communication mode.

In commercial instant messaging systems, if point-to-point communication between User A and User B is hard to establish or slow because of firewalls, network speed, or other reasons, the IM server also provides a message relay service: all instant messages between User A and User B are first sent to the IM server, which forwards them to the other party.

In early IM systems, the IM client and IM server communicated over UDP, an unreliable transport protocol, while direct communication between IM clients used TCP, which provides reliable transmission. As user requirements and the technical environment have evolved, mainstream instant messaging systems now tend to use TCP both between IM clients and between IM clients and IM servers.
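The "try direct, fall back to relay" behavior described above can be sketched as a small function. This is an illustrative sketch, not how any particular IM product does it: `deliver`, `send_direct`, and `send_via_server` are hypothetical names, and the two send functions are passed in as callables so the decision logic can be shown without real sockets.

```python
def deliver(message, peer, send_direct, send_via_server):
    """Try point-to-point delivery first; fall back to the server relay on failure."""
    try:
        send_direct(peer, message)
        return "direct"
    except OSError:
        # Connection refused, unreachable host, or timeout (all subclasses of
        # OSError in Python 3): let the IM server forward the message instead.
        send_via_server(peer, message)
        return "relayed"


if __name__ == "__main__":
    def blocked(peer, message):
        raise ConnectionRefusedError("peer is behind a firewall")

    def relay(peer, message):
        print(f"server forwards {message!r} to {peer}")

    print(deliver("hello", "User B", blocked, relay))
```

A real client would usually also remember that the direct path failed, so that later messages in the same session go straight to the relay instead of waiting for another timeout.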


S--C1
|
|   Every time C1 wants to communicate with C2, it first submits a request to S; once S accepts it, S forwards the information to C2. Every later exchange works the same way.
C2


S--C1
|
|   The first time C1 wants to communicate with C2, it submits a request to S; once S accepts it, S notifies both C1 and C2. C1 and C2 then establish a connection and communicate directly without going through S.
C2

In the first mode, the performance requirements on the server are high: because all information passes through the server, it must handle a large number of simultaneous connections. In return, the server can control the information that is passed.


In the second mode, a client connects to the server only when the user logs in or goes offline; ordinary communication runs over a direct point-to-point connection between users. This design is more reasonable.


QQ chat messages travel directly between the two users, whereas MSN messages are relayed through a server.

When a QQ user logs in, the client first connects to the QQ server. The server returns some information to the client, such as which friends are online and their IP addresses. The client can then open a point-to-point connection to the friend it wants to reach, and the two communicate with each other directly.


If C1 and C2 are both on private networks and reach the Internet through NAT on their routers, how is the socket between them established?
Take a look at the following protocol.

Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs) (STUN).

STUN is a protocol defined to make NAT traversal transparent. It lets a machine on a local intranet learn the public IP address assigned by its NAT gateway and discover the type of that NAT.

Why STUN is needed:

NAT eases the shortage of IP addresses, but it also brings many problems. For example, it breaks peer-to-peer applications such as file sharing, multimedia, and online games.
To solve this, application layer gateways (ALGs) were embedded in NATs.
ALGs have serious problems of their own: applications change quickly, and each application has to be implemented separately in the gateway, so ALGs cannot keep up with the development of applications.
To overcome the problems of ALGs, the Middlebox Communications (MIDCOM) protocol was proposed. MIDCOM lets a client control the behavior of the NAT or firewall, which separates the application protocol from the NAT and takes the ALG out of the basic NAT. But because it depends on MIDCOM support, every NAT and firewall would have to be upgraded.

For all of the reasons above:
The protocol described here, Simple Traversal of UDP Through NAT (STUN), allows entities behind a NAT to first discover the presence of a NAT and the type of NAT, and then to learn the address bindings allocated by the NAT. STUN requires no changes to NATs, and works with an arbitrary number of NATs in tandem between the application entity and the public Internet.

To traverse NAT, you first need to know some of its characteristics.
NAT is divided into 4 types (more if firewalls are counted):
1. Full cone NAT:
All data sent from the same internal host (in IP x) and port (in port x) is mapped to the same external IP (out IP x) and port (out port x) when it is sent to the external network.
If another server (Y) sends directly to the mapped IP (out IP x) and port (out port x), the data is forwarded to the internal host (in IP x, in port x).
// That is, the source IP and source port of packets entering the intranet are unrestricted.
2. Restricted cone NAT:
Data sent from the same internal host (in IP x) and port (in port x) is mapped to the same external IP (out IP x) and port when it is sent to the external network.
Unlike full cone NAT, a packet from an external machine is forwarded to the internal host (in IP x, in port x) only if the internal host has previously sent a packet to that machine's IP address.
// That is, for packets entering the intranet the source port is unrestricted but the source IP is restricted: it must be an IP the internal host has already sent data to through the mapping.
3. Port-restricted cone NAT:
Unlike restricted cone NAT, an unsolicited external packet is forwarded only if both its source IP and its source port equal the destination IP and port of a request previously sent from the intranet.
4. Symmetric NAT:
Packets sent from the same internal IP and port to the same destination IP and port always get the same mapped IP and port.
If the same internal machine and port sends to a different destination address, the mapped port is different as well,
so only the server that the host actively contacted knows its mapped port; any other server that wants to
reach it can only guess the port.
Summary:
For the first three NAT types, the mapped IP and port are determined only by the internal IP and port that sent the packet.
If the intranet IP and port of the data are the same, the mapped address and port are fixed.
This property provides good conditions for traversal.
With the fourth type, the mapped address and port learned while punching the hole are not reliable for other peers, so traversal is hard.
Note that ServerA and ServerB are two public addresses, not two machines.
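The mapping and filtering rules of the four NAT types can be captured in a toy model. This is a sketch for illustration only: the `Nat` class, its method names, and the addresses are invented here, and it models just the rules listed above, not timeouts, port allocation policy, or real packets.

```python
class Nat:
    """Toy model of one NAT's mapping and filtering rules."""

    KINDS = ("full_cone", "restricted", "port_restricted", "symmetric")

    def __init__(self, kind, public_ip="203.0.113.1"):
        if kind not in self.KINDS:
            raise ValueError(f"unknown NAT kind: {kind}")
        self.kind = kind
        self.public_ip = public_ip
        self.next_port = 40000
        self.mappings = {}   # mapping key -> public port
        self.owner = {}      # public port -> internal (ip, port)
        self.contacted = {}  # public port -> set of remote (ip, port) sent to

    def outbound(self, src, dst):
        """Internal endpoint src sends to dst; return the public (ip, port) used."""
        # Cone NATs key the mapping on the internal endpoint only;
        # a symmetric NAT also keys it on the destination.
        key = (src, dst) if self.kind == "symmetric" else src
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        port = self.mappings[key]
        self.owner[port] = src
        self.contacted.setdefault(port, set()).add(dst)
        return (self.public_ip, port)

    def inbound(self, remote, public_port):
        """Remote endpoint sends to public_port; return the internal endpoint, or None if dropped."""
        if public_port not in self.owner:
            return None
        peers = self.contacted[public_port]
        if self.kind == "full_cone":
            allowed = True                                   # anyone may send
        elif self.kind == "restricted":
            allowed = remote[0] in {ip for ip, _ in peers}   # source IP must match
        else:
            allowed = remote in peers                        # source IP and port must match
        return self.owner[public_port] if allowed else None
```

For example, with `nat = Nat("symmetric")`, two calls to `nat.outbound(host, server1)` and `nat.outbound(host, server2)` return different public ports, which is exactly why a mapping learned through one server is of little use to a different peer.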

A simple STUN operation:
The client sends a request. There are two kinds of request, Binding Requests and Shared Secret Requests, and six message types in all:
1. Binding Request, sent over UDP.
Used to discover whether a NAT is present, and to learn the public address and port that the NAT maps the client to.
2. Binding Response.
The server generates the Binding Response and returns the mapped IP and port to the client. The client compares the mapped address with its local address: if they are the same, the computer is directly on the public network; otherwise the client goes on to determine the NAT type (method: the client sends additional STUN Binding Requests).
3. Binding Error Response.
4. Shared Secret Request, sent over TLS [2] over TCP.
This request asks the server to return a temporary username and password, which are used in the subsequent Binding Request and Binding Response to verify message integrity.
5. Shared Secret Response.
6. Shared Secret Error Response.
STUN message structure
A STUN message consists of a STUN header followed by a STUN payload.
The STUN header is laid out as follows; values are stored in network byte order.

Field               Type        Description
STUN message type   short int   Message type
Length              short int   Payload length, not including the header
Transaction ID      octet[16]   Identifier used to match a request with its response
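The header layout above (two 16-bit fields followed by a 16-byte transaction ID, 20 bytes in total) maps directly onto Python's `struct` module. The sketch below uses the message type values defined in RFC 3489; the helper names `build_message` and `parse_header` are made up for this example.

```python
import os
import struct

# Message type values from RFC 3489.
BINDING_REQUEST = 0x0001
BINDING_RESPONSE = 0x0101
BINDING_ERROR_RESPONSE = 0x0111
SHARED_SECRET_REQUEST = 0x0002
SHARED_SECRET_RESPONSE = 0x0102
SHARED_SECRET_ERROR_RESPONSE = 0x0112

# "!" selects network byte order: type, payload length, 16-byte transaction ID.
HEADER = struct.Struct("!HH16s")

def build_message(msg_type, payload=b"", transaction_id=None):
    """Return header + payload; a random transaction ID is generated if none is given."""
    if transaction_id is None:
        transaction_id = os.urandom(16)
    return HEADER.pack(msg_type, len(payload), transaction_id) + payload

def parse_header(data):
    """Return (message type, payload length, transaction ID) from a raw message."""
    msg_type, length, transaction_id = HEADER.unpack_from(data)
    if len(data) - HEADER.size != length:
        raise ValueError("length field does not match the payload size")
    return msg_type, length, transaction_id
```

A client would send `build_message(BINDING_REQUEST)` to the STUN server over UDP and accept only a response whose transaction ID equals the one it sent.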

STUN payload
The STUN payload is a list of STUN attributes; which attributes appear depends on the message type.
The defined attributes are as follows:
MAPPED-ADDRESS: mandatory in a Binding Response; carries the mapped IP and port.
RESPONSE-ADDRESS: optional in a Binding Request; specifies where the response should be sent.
If it is not specified, the response is sent to the mapped IP and port.
CHANGE-REQUEST: optional in a Binding Request. Used to determine whether the client's NAT is a restricted cone NAT or a port-restricted cone NAT (it tells the server to send the response from a different IP and/or source port).
CHANGED-ADDRESS: used in a Binding Response; tells the client the IP and port that a changed response would come from.
SOURCE-ADDRESS: used only in a Binding Response; records the source IP and port the response was sent from.
USERNAME: optional; appears in a Shared Secret Response and in Binding Requests.
PASSWORD: mandatory in a Shared Secret Response.
MESSAGE-INTEGRITY: optional in a Binding Response and a Binding Request; carries an integrity check over the message.
ERROR-CODE: Binding Error Response and Shared Secret Error Response.
UNKNOWN-ATTRIBUTES
REFLECTED-FROM: Binding Response. Used for tracing and for preventing DDoS attacks.
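Each attribute in the payload is encoded as a type-length-value triple, and the client mostly cares about MAPPED-ADDRESS. The following sketch follows the RFC 3489 encoding (16-bit type, 16-bit length, then one unused byte, one address-family byte, a 2-byte port, and a 4-byte IPv4 address for MAPPED-ADDRESS). The helper names are made up for this example, and error handling for truncated payloads is left out.

```python
import socket
import struct

MAPPED_ADDRESS = 0x0001   # attribute type value from RFC 3489
FAMILY_IPV4 = 0x01

def encode_mapped_address(ip, port):
    """Build one MAPPED-ADDRESS attribute (what a server would put in a Binding Response)."""
    value = struct.pack("!BBH4s", 0, FAMILY_IPV4, port, socket.inet_aton(ip))
    return struct.pack("!HH", MAPPED_ADDRESS, len(value)) + value

def iter_attributes(payload):
    """Yield (type, value) for every type-length-value attribute in a payload."""
    offset = 0
    while offset + 4 <= len(payload):
        attr_type, length = struct.unpack_from("!HH", payload, offset)
        offset += 4
        yield attr_type, payload[offset:offset + length]
        offset += length

def mapped_address(payload):
    """Return the (ip, port) from the first MAPPED-ADDRESS attribute, or None."""
    for attr_type, value in iter_attributes(payload):
        if attr_type == MAPPED_ADDRESS:
            _, family, port, raw_ip = struct.unpack("!BBH4s", value)
            if family != FAMILY_IPV4:
                raise ValueError("only IPv4 is defined for this attribute")
            return socket.inet_ntoa(raw_ip), port
    return None
```

Attributes the parser does not recognize are simply skipped by length, which is what lets a client ignore attributes it has no use for.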

NAT traversal methods and process
Note that SERVER1 and SERVER2 are two public addresses, not two machines.
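The source does not spell the procedure out, but the discovery flow in RFC 3489 combines three Binding Request tests to classify the NAT. The sketch below encodes only that decision logic: the function name and parameters are invented for this example, and the test results are passed in as plain values rather than gathered from a real STUN server (a real client would run the tests lazily, one after another).

```python
def classify_nat(local_addr, test1, test2, test1_alt, test3):
    """
    Classify the client's network situation from STUN test results.

    local_addr: (ip, port) the client socket is bound to.
    test1:      mapped address from a plain Binding Request to the server's
                first address, or None if no response arrived.
    test2:      True if a response arrived after asking the server to reply
                from a different IP and port (CHANGE-REQUEST, both flags).
    test1_alt:  mapped address from a plain Binding Request sent to the
                server's second address, or None.
    test3:      True if a response arrived after asking the server to reply
                from a different port only.
    """
    if test1 is None:
        return "UDP blocked"
    if test1 == local_addr:
        # No address translation in the path.
        return "open Internet" if test2 else "symmetric UDP firewall"
    if test2:
        return "full cone NAT"
    if test1_alt != test1:
        # The mapping changed with the destination address.
        return "symmetric NAT"
    return "restricted cone NAT" if test3 else "port-restricted cone NAT"
```

This is also why the server needs two public addresses: the tests depend on responses arriving from an IP and port the client has never sent to.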

Appendix:

I. The concept of IM technology

IM stands for instant messaging. It is a technology that lets people identify online users on the Internet and exchange messages with them in real time, and it is the fastest-rising form of online communication since the invention of e-mail.

The emergence of IM is closely tied to the Internet: IM is implemented entirely on top of the TCP/IP protocol suite, which is the technical foundation of the whole Internet. The earliest instant messaging protocol was IRC (Internet Relay Chat), but it only supported conversation through text and symbols. As the Internet developed, instant messaging became far more than chat. After ICQ, the first IM product, appeared in 1996, IM technology and features began to take shape: voice, video, file sharing, SMS, and other advanced forms of information exchange can all be implemented in IM tools, and a powerful IM client is enough to build a complete communication platform. Today the most representative IM software includes MSN, Google Talk, Yahoo Messenger, and Tencent QQ.


II. IM technology principles and working methods

A typical IM session works as follows: you log in to the IM communication center (the IM server), retrieve the list of contacts you have talked to before (the buddy list), and mark yourself as online. Whenever someone on your buddy list logs in and tries to contact you, the IM system sends a notification to your computer; you can then open a chat session with that person and exchange messages by typing text, by voice, and so on.


Technically, the basic principles of IM are as follows:

Log in or log off through the IM server
User A finds User B through the buddy list; User B receives the message and talks with A
A establishes a separate communication channel with B under the guidance of the IM server


In the first step, User A enters a username and password to log in to the IM server. The server authenticates the user by reading the user database; if verification passes, it registers User A's IP address, the IM client software version number, and the TCP/UDP port number in use, and then returns a login-success flag to User A. At this point User A's status (presence) in the IM system is online.


In the second step, using the buddy list that User A stores on the IM server, the server sends User A's online information to the PCs of the buddies who are online at the same time. This includes the online status, IP address, and TCP port number used by the IM client. The buddies' IM clients show a notification when they receive this information.


In the third step, the IM server sends the buddy list that User A stores on the server, together with related information, back to User A's client machine. This includes each buddy's online status, IP address, and the TCP port number used by the IM client. When User A's IM client receives it, it displays the list of friends and their online status.
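The three login steps above can be sketched as a small in-memory server. This is an illustration under simplifying assumptions, not a real IM server: the `IMServer` and `Session` classes and their fields are invented here, passwords are compared in plain text purely for brevity, and "sending" a notification just appends it to an `outbox` list.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    ip: str
    port: int
    client_version: str

@dataclass
class IMServer:
    passwords: dict   # username -> password (a real server would store salted hashes)
    buddies: dict     # username -> list of buddy usernames
    online: dict = field(default_factory=dict)   # username -> Session
    outbox: list = field(default_factory=list)   # (recipient, notification) pairs

    def login(self, user, password, ip, port, client_version):
        # Step 1: authenticate, then register IP, port, and client version.
        if self.passwords.get(user) != password:
            return None
        self.online[user] = Session(ip, port, client_version)

        # Step 2: tell every buddy who is online that this user is reachable.
        for buddy in self.buddies.get(user, []):
            if buddy in self.online:
                notice = {"user": user, "status": "online", "ip": ip, "port": port}
                self.outbox.append((buddy, notice))

        # Step 3: return the buddy list with each buddy's presence and address.
        roster = {}
        for buddy in self.buddies.get(user, []):
            session = self.online.get(buddy)
            if session is None:
                roster[buddy] = {"status": "offline"}
            else:
                roster[buddy] = {"status": "online", "ip": session.ip, "port": session.port}
        return roster
```

The roster returned in step 3 is what later allows the client to contact an online buddy directly, since it already holds that buddy's IP address and port.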


III. IM communication modes

1. Online direct communication
If User A wants to chat with his online buddy User B, A's client uses User B's IP address, TCP port number, and other information sent over by the server to send chat messages directly to User B's PC. User B's IM client displays them on screen, and User B then replies directly to User A's PC. The instant text messages of both sides are no longer relayed by the IM server but travel point-to-point over the network. This is the peer-to-peer (P2P) communication mode.

2. Online proxy communication
If point-to-point communication between User A and User B is hard to establish or slow because of firewalls, network speed, or other reasons, the IM server steps in to provide a relay service: all instant messages between User A and User B are first sent to the IM server, which then forwards them to the other party.

3. Offline proxy communication
If User A and User B cannot be online at the same time for whatever reason and A sends a message to B, the IM server stores User A's message and automatically forwards it to B the next time B logs in.

4. Extended mode communication
User A can reach B through other channels by way of the IM server, for example by sending an SMS to B's cell phone, a fax to B's fax number, or an e-mail to B's mailbox.
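Of the four modes, offline proxy communication (mode 3) is essentially a store-and-forward queue on the server. A minimal sketch follows, with an invented `OfflineStore` class and a `delivered` list standing in for real network sends.

```python
from collections import defaultdict, deque

class OfflineStore:
    """Queue messages for offline users and flush them at the next login."""

    def __init__(self):
        self.online = set()
        self.pending = defaultdict(deque)   # recipient -> queued (sender, text) pairs
        self.delivered = []                 # (recipient, sender, text), stands in for a send

    def send(self, sender, recipient, text):
        if recipient in self.online:
            self.delivered.append((recipient, sender, text))
        else:
            # Recipient is offline: keep the message on the server.
            self.pending[recipient].append((sender, text))

    def login(self, user):
        self.online.add(user)
        # Forward everything that queued up, oldest first.
        queue = self.pending.pop(user, deque())
        while queue:
            sender, text = queue.popleft()
            self.delivered.append((user, sender, text))
```

A production server would persist the queue and cap its size, but the ordering guarantee shown here (messages arrive in the order they were sent) is the behavior users expect.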


In early IM systems, the IM client and IM server communicated over UDP, an unreliable transport protocol, while direct communication between IM clients used TCP, which provides reliable transmission. As user demand and the technical environment have evolved, mainstream IM systems now tend to use TCP both between IM clients and between IM clients and IM servers.


Compared with other means of communication such as telephone, fax, and e-mail, the greatest advantage of instant messaging is the immediacy and accuracy of message delivery: as long as both parties are on the network, they can exchange messages, and the delivery delay with instant messaging software is only about 1 second.


The emerging embedded IM tool

Traditional IM has ruled Internet instant communication for 10 years and, with its growing stability and strong user stickiness, still dominates this huge market. However, the technical elite of the software industry are not content with this. They have built on accumulated experience and kept working on better instant messaging tools. Continuous improvement of features is of course an inevitable direction, but in the Web 2.0 era their main goal has become increasing users' attachment to the website, not just to the IM tool. This is how the embedded IM tool came into being.

Traditional instant messaging tools require users to download and install a software package. For a website that has its own IM product, users cannot use the IM tool directly after logging in to the site, which hurts both traffic and user stickiness. Now that IM and websites depend on each other, no Internet company wants its IM tool isolated from its site.

So a new type of embedded IM tool has emerged. It does not need to be downloaded or installed: when the user logs in to the page, the IM tool is embedded directly in the web page and can be used right away.

In features it does not fall short of traditional IM. Whether for the speed and efficiency of traditional text communication, or for the audio and video functions that have become essential to IM tools in recent years, embedded IM can provide very stable transmission. Even more worth mentioning, because embedded IM is nested in the web page, software vendors can design an IM product that suits the style of the site according to the site's needs, unlike traditional IM tools, which all look alike and lack personality.

At present, embedded IM is already widely used on community, dating, association, and collaboration websites, and it will play an increasingly important role in the Web 2.0 era.
