Getting started: The most complete web-end instant messaging technology in history

Source: Internet
Author: User

Objective

For instant messaging scenarios such as IM (instant messaging) chat applications (for example, QQ) and message push (for example, the push module that is now standard in mobile apps), desktop and native applications dominate. Introductions to the communication principles of native IM (see "IM architecture", "IM Comprehensive Information", "IM/push Communication Format, protocol", "IM Heartbeat live", "IM Security", "real-time audio and video development") and of message push applications (see "Push Technology Good Article") are plentiful online, so they are not repeated here.

Web-side IM applications are different. Because of browser compatibility issues and the browser's inherent "client requests, server processes and responds" communication model, a web IM application that works well across browsers has to combine several techniques. This article discusses those techniques in detail and analyzes their principles and processes.

Learning and discussion

- More information on instant messaging: http://www.52im.net/forum.php?mod=collection&op=all

- Instant messaging development discussion group: 215891622

More information

For an overview of web-side instant messaging technologies, see:

"Web-Side IM Technology inventory: Short Polling, Comet, Websocket, SSE"

About Ajax short polling:

There is little point in hunting for material on this technique; unless it is only for show, consider the other three options instead.

For a detailed introduction to Comet, see:

"Comet technical details: Web-based real-time communication technology with long-connected HTTP"

"Web-side Instant Messaging: http long connection, long polling"

"Web-side Instant Messaging: No websocket can handle the immediacy of the message."

"Open Source Comet Server Icomet: Support for millions of concurrent web-enabled Instant Messaging solutions"

For a detailed description of WebSocket, see:

"WebSocket Detailed (i): Preliminary understanding of WebSocket technology"

"WebSocket Detailed (ii): Technical Principles, code demonstrations and application cases"

"WebSocket Detailed (iii): in-depth websocket communication protocol details"

"Socket.io: Supporting WebSocket, a framework for web-side instant Messaging"

"What is the relationship between Socket.io and WebSocket? What's the difference?"

For a detailed article on SSE, see:

"SSE Technical detail: A new HTML5 server push event Technology"

For more articles on web-based instant messaging, see:

http://www.52im.net/forum.php?mod=collection&action=view&ctid=15

1. How traditional web communication works

The browser is a thin client and cannot, through a system call, communicate directly with a browser on another remote machine. Desktop applications work differently: they commonly open a TCP connection to a process on the remote host and use that socket for full-duplex, real-time communication.

From the start, the browser has worked in a mode where the client sends a request and the server returns a result, and this has not changed. So communication between two clients must be relayed through a server. For example, for A to communicate with B, A first sends the message to the IM application server, which forwards it to B based on the recipient information the message carries; messages from B to A follow the same path.
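The relay role of the server can be sketched in a few lines. This is only an illustration under assumptions: the class name MessageRelay, its methods, and the in-memory mailbox are hypothetical and do not come from the original article's code.

```javascript
// Hypothetical in-memory relay; names and structure are illustrative only.
class MessageRelay {
  constructor() {
    this.mailboxes = new Map(); // recipient id -> queued messages
  }

  // Called when a client posts a message to the server.
  send(from, to, text) {
    if (!this.mailboxes.has(to)) this.mailboxes.set(to, []);
    this.mailboxes.get(to).push({ from, to, text });
  }

  // Called when a client asks the server for its pending messages.
  fetch(user) {
    const pending = this.mailboxes.get(user) || [];
    this.mailboxes.delete(user);
    return pending;
  }
}

const relay = new MessageRelay();
relay.send('A', 'B', 'hello');
console.log(relay.fetch('B')); // one message, from A
console.log(relay.fetch('B')); // empty: already delivered
```

Note that B only receives the message when B asks for it, which is exactly the limitation the rest of the article works around.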

2. Problems to solve when building IM on traditional web communication

A web-based IM application still works in the "browser requests the server" mode, so developing one means solving the following three problems:

Full-duplex communication:

The browser must be able to pull data from the server, and the server must be able to push data to the browser.

Low latency:

A message sent from browser A to B must be forwarded to B quickly by the server, and likewise from B to A. In practice this requires that any browser can quickly request data from the server and that the server can quickly push data to the browser.

Cross-domain support:

The client browser and the server are usually at different locations on the network. For security reasons, the browser does not allow direct access to a server under a different domain name, and even with the same domain name and the same IP address, access is not allowed if the port differs.
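The rule behind this restriction is the same-origin check: two URLs count as the same origin only when scheme, host, and port all match. A minimal sketch using the standard URL class follows; the function name sameOrigin and the example.com addresses are placeholders chosen for illustration.

```javascript
// Two URLs share an origin only when scheme, host, and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(sameOrigin('http://example.com/chat', 'http://example.com/api'));      // true
console.log(sameOrigin('http://example.com/chat', 'http://example.com:8080/api')); // false: port differs
console.log(sameOrigin('http://example.com/chat', 'https://example.com/chat'));    // false: scheme differs
```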

Instant Messaging Network note: regarding security issues caused by cross-domain access in browsers, there is a network attack known as CSRF; see the excerpt below.

CSRF (Cross-Site Request Forgery), also known as one-click attack or session riding, is abbreviated as CSRF or XSRF.

A CSRF attack can be understood like this: the attacker borrows your identity and sends malicious requests in your name. With CSRF an attacker can send mail or messages in your name, steal your account, and even purchase goods or transfer virtual currency. The resulting problems include leaks of personal privacy and threats to the security of property.

The CSRF attack method was described by security researchers abroad as early as 2000, but it only began to receive attention in China in 2006. In 2008, CSRF vulnerabilities were disclosed in a number of large communities and interactive sites in China and abroad, such as NYTimes.com (The New York Times), MetaFilter (a large blog site), YouTube, and Baidu Hi. Many sites on the Internet still remain defenseless against it, which is why the security industry calls CSRF a "sleeping giant".

Based on the above analysis, solutions to these three problems are given below.

3. Full-duplex, low-latency solutions

Solution 3.1: Client browser polls the server (polling)

This is the simplest solution. The client uses Ajax to send a request to the server at short, fixed intervals; the server returns the latest data, and the client updates the interface with it, which achieves instant communication indirectly. The advantage is simplicity. The disadvantages are heavy load on the server and wasted bandwidth, since in most requests the data has not changed.

The client code is as follows:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

The client creates an XHR object and, every 2 seconds, requests the server time and prints it.

Server-side code (Node.js):

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)
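Since the original listings are not reproduced here, the following is a minimal sketch of the same idea, not the author's code. It assumes Node.js on both sides and takes any Promise-returning request function in place of the browser's XHR object; the names createTimeServer and startPolling, and port 8080 in the usage comment, are hypothetical.

```javascript
const http = require('node:http');

// Server side: every request is answered immediately with the current time.
function createTimeServer() {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(new Date().toISOString());
  });
}

// Client side: ask again on a fixed interval, whether or not anything changed.
// `request` is any function that returns a Promise of the response text.
function startPolling(request, intervalMs, onData) {
  const timer = setInterval(() => {
    request()
      .then(onData)
      .catch((err) => console.error('poll failed:', err.message));
  }, intervalMs);
  return () => clearInterval(timer); // call the returned function to stop
}

// Usage (not run here), assuming Node 18+ for the global fetch:
// createTimeServer().listen(8080);
// const stop = startPolling(
//   () => fetch('http://localhost:8080/').then((r) => r.text()),
//   2000,
//   (time) => console.log('server time:', time),
// );
```

Every tick costs a full request and response even when nothing has changed, which is the bandwidth problem described above.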


Solution 3.2: Long polling (long-polling)

In the polling solution above, the connection is closed after each request completes, and the server sends data on every request regardless of whether the data has changed. Much of this traffic is unnecessary, which led to long polling (long-polling). Here the client sends a request and the server checks whether the requested data has changed (whether there is newer data). If it has, the server responds immediately; otherwise it holds the connection open and checks periodically until the data is updated or the connection times out. As soon as the connection ends, the client issues a new request. Compared with plain polling, this greatly reduces the number of requests the client makes over the same period. The code is as follows. (For a detailed article on the technique, see "Web-side Instant Messaging: http long connection, long polling".)

Client:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

When the readyState of the XHR object is 4, the server has returned data and the connection has ended, so the client requests the server again to establish a new connection.

Server-side code:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

The server generates a random number between 1 and 9 to simulate whether the data has changed. A value of 5 or less means the data has changed and the response is returned immediately; otherwise the connection is kept open and the check is repeated every 2 seconds.
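The hold-and-recheck behaviour can be sketched as follows. This is an illustration under assumptions rather than the original listing: the function names, the 30-second timeout, and the use of status 204 on timeout are hypothetical, and the random check assumes an integer from 1 to 9 with 5 or less meaning "changed".

```javascript
// Hold the request open until `hasChanged()` is true or the timeout expires.
// Resolves to true when a change was seen, false on timeout.
function waitForChange(hasChanged, checkEveryMs, timeoutMs) {
  return new Promise((resolve) => {
    const startedAt = Date.now();
    const check = () => {
      if (hasChanged()) return resolve(true);
      if (Date.now() - startedAt >= timeoutMs) return resolve(false);
      setTimeout(check, checkEveryMs);
    };
    check();
  });
}

// Simulated data source: an integer from 1 to 9, where 5 or less
// counts as "the data has changed" (an assumption for this sketch).
function simulatedChange(random = Math.random) {
  return Math.floor(random() * 9) + 1 <= 5;
}

// Hypothetical request handler for Node's http.createServer.
async function longPollHandler(req, res) {
  const changed = await waitForChange(simulatedChange, 2000, 30000);
  res.writeHead(changed ? 200 : 204, { 'Content-Type': 'text/plain' });
  res.end(changed ? new Date().toISOString() : '');
}

// Client side: as soon as one request finishes, issue the next one.
// `request` is any function returning a Promise of the response text.
async function longPollLoop(request, onData, keepGoing) {
  while (keepGoing()) {
    const data = await request();
    if (data) onData(data);
  }
}
```

The client loop plays the role of re-requesting when readyState reaches 4: a new request goes out only after the previous one has ended.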

The results show that the response times are irregular and that fewer responses are returned per unit of time than with plain polling.

Solution 3.3: HTTP streaming (http-stream) communication

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

The server periodically sends a random number to the client and calls the client's process function.
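One common way to have a streamed response call a function in the page is to write script tags into a response that is never finished. The sketch below assumes that approach; it is not the original listing. The helper name scriptChunk, the 2-second interval, and the text/html content type are assumptions, while process is the client function named in the text.

```javascript
// Build one chunk of the streamed response: a <script> tag that calls a
// function defined in the parent page. JSON.stringify keeps the payload a
// valid JavaScript literal; "<" is escaped so the data cannot close the tag.
function scriptChunk(callbackName, payload) {
  const literal = JSON.stringify(payload).replace(/</g, '\\u003c');
  return `<script>parent.${callbackName}(${literal});</script>\n`;
}

// Hypothetical Node handler: keep the response open and write a chunk
// every 2 seconds until the client goes away.
function streamHandler(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  const timer = setInterval(() => {
    res.write(scriptChunk('process', Math.random()));
  }, 2000);
  req.on('close', () => clearInterval(timer));
}

console.log(scriptChunk('process', 0.42)); // <script>parent.process(0.42);</script>
```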

Testing in IE5 shows that this achieves request and push communication between client and server even in old versions of IE.

3.3.3 Streaming based on htmlfile

A new problem arises. In IE, when an iframe requests the server and the server keeps the connection open without finishing the response, the browser's title bar stays in the loading state and a loading indicator is also shown at the bottom of the window, which is a poor user experience for a product. Google's engineers came up with a hack for this: in IE, dynamically create an htmlfile object. This object is a COM component exposed through ActiveX and is in effect an in-memory HTML document. The iframe is added to this in-memory htmlfile, and the iframe's streaming is used to achieve the effect described above. Because the htmlfile object is not added directly to the page, the browser does not show the loading state. The code is as follows.

Client:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

This is what the server sends to the iframe:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

Note that the Content-Type header of the server's output is set to application/javascript; otherwise some browsers will interpret it as text.


5. WebSocket

The solutions above are all hacks, built by combining techniques in which either the browser makes one-way requests to the server or the server pushes data one way to the browser. To enhance the capabilities of the web, HTML5 provides WebSocket, which is both a web communication mode and an application-layer protocol. It provides native, full-duplex, cross-domain communication between the browser and the server: a WebSocket connection (in fact a TCP connection) is established between them, over which the client and the server can send data to each other at the same time. For the principles of this technology, see "WebSocket Detailed (i): Preliminary understanding of WebSocket technology", "WebSocket Detailed (ii): Technical Principles, code demonstrations and application cases", and "WebSocket Detailed (iii): in-depth websocket communication protocol details"; they are not repeated here, and we go straight to the code. Before looking at the code, you need to understand the overall WebSocket workflow.

First the client creates a new WebSocket object, which sends an HTTP request to the server. The server recognizes it as a WebSocket request, agrees to switch protocols, and sends back a response with status code 101. This exchange is called the handshake. After the handshake, the TCP connection that carried it stays open as the WebSocket connection, and the server and the client can communicate in both directions. From then on the application-layer protocol is WS or WSS, and HTTP is no longer involved. The WS protocol requires the client and the server to send data messages (frames) in a defined format so that each side can understand the other.
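One detail of the handshake not spelled out above: under RFC 6455 the client sends a Sec-WebSocket-Key header, and the server must answer with a Sec-WebSocket-Accept header computed as the base64 SHA-1 of that key concatenated with a fixed GUID. The sketch below shows that computation; the function names acceptKey and handshakeResponse are hypothetical, and the sample key is the one used in the RFC.

```javascript
const crypto = require('node:crypto');

// Fixed GUID defined by RFC 6455 for computing the handshake response.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function acceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Build the raw 101 response that completes the handshake.
function handshakeResponse(secWebSocketKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Accept: ${acceptKey(secWebSocketKey)}`,
    '',
    '',
  ].join('\r\n');
}

console.log(acceptKey('dGhlIHNhbXBsZSBub25jZQ==')); // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```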

The WS protocol specifies the frame format that data must follow.

Among its fields, the more important ones are the following. The FIN field occupies 1 bit and marks whether this frame is the final fragment of a message. The opcode field occupies 4 bits: 1 means a text frame, 2 means a binary frame, 8 means the connection should be closed (whichever side sends it, client or server, is telling the other side that it is closing the connection), and 9 means a ping. The MASK field occupies 1 bit; 1 indicates that the Masking-key field is present. The Masking-key field, which is either absent (0 bytes) or 4 bytes long, is used to unmask the data sent by the client. The payload field carries the data actually being sent, which can be text or binary.

So whenever the client or the server sends a message to the other side, the data must be assembled into the frame format above.
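Reading those fields out of a received frame can be sketched as below. This is a simplified illustration, not the original listing: decodeFrame is a hypothetical name, the buffer is assumed to hold exactly one complete frame, and payloads of 2^32 bytes or more are not handled. The sample bytes are the masked "Hello" text frame from RFC 6455.

```javascript
// Parse a single, complete WebSocket frame held in a Buffer.
// Assumes the payload is shorter than 2^32 bytes.
function decodeFrame(buf) {
  const fin = (buf[0] & 0x80) !== 0;    // final fragment of the message?
  const opcode = buf[0] & 0x0f;         // 1 text, 2 binary, 8 close, 9 ping
  const masked = (buf[1] & 0x80) !== 0; // is a masking key present?
  let length = buf[1] & 0x7f;
  let offset = 2;
  if (length === 126) {
    length = buf.readUInt16BE(offset);  // 16-bit extended length
    offset += 2;
  } else if (length === 127) {
    // 64-bit extended length: skip the high 32 bits, read the low 32 bits.
    length = buf.readUInt32BE(offset + 4);
    offset += 8;
  }
  let payload;
  if (masked) {
    const key = buf.subarray(offset, offset + 4);
    offset += 4;
    payload = Buffer.alloc(length);
    for (let i = 0; i < length; i++) {
      payload[i] = buf[offset + i] ^ key[i % 4]; // unmask byte by byte
    }
  } else {
    payload = Buffer.from(buf.subarray(offset, offset + length));
  }
  return { fin, opcode, masked, payload };
}

const sample = Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
const decoded = decodeFrame(sample);
console.log(decoded.fin, decoded.opcode, decoded.payload.toString('utf8')); // true 1 Hello
```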

First, the server-side code:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

The server listens for the data event to receive what the client sends. If it is a handshake request, the server sends an HTTP 101 response. Otherwise it parses the received data and prints it, then checks whether it is a close request (opcode 8): if so, it closes the connection; otherwise it assembles the received data into a frame and sends it back to the client.
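Assembling an outgoing frame on the server can be sketched as follows. Again this is an illustration rather than the original listing: encodeFrame is a hypothetical name, the frame is always sent as a single final fragment, and it is left unmasked, which is what RFC 6455 requires of frames sent by a server.

```javascript
// Build an unmasked, single-fragment frame, as a server would send it.
// Assumes the payload is shorter than 2^32 bytes.
function encodeFrame(payload, opcode = 0x1) {
  const data = Buffer.isBuffer(payload) ? payload : Buffer.from(String(payload), 'utf8');
  let header;
  if (data.length < 126) {
    header = Buffer.from([0x80 | opcode, data.length]); // FIN=1, 7-bit length
  } else if (data.length < 65536) {
    header = Buffer.alloc(4);
    header[0] = 0x80 | opcode;
    header[1] = 126;                       // 16-bit extended length follows
    header.writeUInt16BE(data.length, 2);
  } else {
    header = Buffer.alloc(10);
    header[0] = 0x80 | opcode;
    header[1] = 127;                       // 64-bit extended length follows
    header.writeUInt32BE(0, 2);            // high 32 bits of the length
    header.writeUInt32BE(data.length, 6);  // low 32 bits of the length
  }
  return Buffer.concat([header, data]);
}

console.log(encodeFrame('Hello')); // <Buffer 81 05 48 65 6c 6c 6f>
```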

Client code:

(The publishing platform does not support code formatting; for the full code, see the simultaneously published article: http://www.52im.net/thread-338-1-1.html)

The client creates a WebSocket object. After the onopen event fires (that is, after the handshake succeeds), it attaches a handler to the button on the page that sends the content of the page's input box. The server prints the message it receives, assembles it into a frame, and sends it back to the client, which then appends it to the page.


As can be seen from the above, in browsers that support it, WebSocket does provide a full-duplex, cross-domain communication scheme, so among the solutions above WebSocket is the first choice.

Conclusion

The sections above covered the communication techniques involved in IM application development. In real projects we usually use real-time communication libraries that others have already written, such as Socket.io and SockJS. They work by wrapping the techniques above (and some other Flash-based push techniques) on both the client and the server, and then exposing a unified interface to the developer. This interface uses WebSocket where it is supported and falls back to some of the hack techniques described above where it is not.

In practice, none of the techniques mentioned above (except WebSocket) can by itself meet all the requirements set out at the beginning of the article: low latency, full duplex, and cross-domain support. Only a combination of them works well, so these libraries typically use different combinations in different browsers to achieve real-time communication.
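The fallback idea can be sketched as a simple capability check. This is purely illustrative: the function chooseTransport, the capability flags, the transport names, and their order are hypothetical and are not taken from Socket.io, SockJS, or any other library.

```javascript
// Hypothetical capability-based fallback; transport names and their
// order are illustrative and not taken from any specific library.
function chooseTransport(env) {
  if (env.webSocket) return 'websocket';
  if (env.xhrStreaming) return 'xhr-streaming';
  if (env.xhrPolling) return 'xhr-polling';
  return 'jsonp-polling'; // last resort when nothing else is available
}

console.log(chooseTransport({ webSocket: true }));  // websocket
console.log(chooseTransport({ xhrPolling: true })); // xhr-polling
console.log(chooseTransport({}));                   // jsonp-polling
```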

SockJS, for example, uses different combinations in different browsers. Modern browsers (IE10+, Chrome 14+, Firefox 10+, Safari 5+, and Opera 12+) support WebSocket very well. The remaining older browsers typically use XHR-based (or XDR-based) polling or streaming, or iframe-based polling or streaming. IE6 and IE7 support neither XDR nor cross-domain XHR, so for them only jsonp-polling can be used.
