Study, research and implementation of WebSocket protocol

Source: Internet
Author: User
1. What is WebSocket?

WebSocket is a protocol that was introduced alongside HTML5 and is standardized in RFC 6455.

WebSocket is a communication specification: through a handshake mechanism, the client (browser) and the server (web server) establish a TCP-like connection that makes communication between the two easier. Before the advent of WebSocket, web interaction generally relied on short-lived or long-lived connections based on the HTTP protocol.

WebSocket was created to address real-time communication between the client and the server. It is essentially a TCP-based protocol: the client first sends a special HTTP request over HTTP/HTTPS to set up a TCP connection for exchanging data, and after that the server and the client communicate in real time over this TCP connection.

Note: Once the connection has been established, the original HTTP protocol is no longer needed.

2. Advantages of WebSocket

Before WebSocket, web servers implemented push or instant messaging through polling: at a fixed time interval (such as every 1 second) the browser automatically issues a request and actively pulls back the server's messages. This means requests must be sent to the server constantly. However, the header of an HTTP request is long, while the data it carries may be just a small value, so polling can consume a lot of bandwidth and server resources.

A newer technique for achieving the effect of polling is Comet, which uses Ajax. Although this technique can simulate full-duplex communication, it still has to issue a request.

The great thing about the WebSocket API is that the server and the client can push information to each other at any time while the connection is open. The browser and the server only need to perform one handshake; once the connection is established, the server can actively send data to the client, and the client can send data to the server at any time. In addition, the header information exchanged between server and client is small.

WebSocket is not limited to Ajax-style (XHR) communication: Ajax requires the client to initiate every request, whereas WebSocket servers and clients can push information to each other.

So from a server perspective, WebSocket has the following benefits:

Saves headers on every request
HTTP headers typically take up dozens of bytes or more.

Server push
The server can proactively transmit data to clients.

3. History

3.1 HTTP protocol

In 1996 the IETF HTTP working group released version 1.0 of the HTTP protocol; counting up to the now widely used version 1.1, HTTP has been under development for more than 17 years. This distributed, stateless, TCP-based request/response protocol is in widespread use across today's Internet. Since the rise of the Internet, the web has passed through the Web 1.0 era dominated by portal sites, then, with the advent of Ajax, the Web 2.0 era dominated by web applications, and it is now moving toward Web 3.0. HTTP itself, by contrast, changed little: apart from making persistent connections the default, the move from version 1.0 to 1.1 brought only modest improvements in cache handling, bandwidth optimization, and security. It kept its stateless request/response pattern and never seemed to recognize that this should change.

3.2 HTTP requests sent through scripts (Ajax)

For a traditional web application to interact with a server, a form has to be submitted; the server receives and processes the form and then returns a new page. Because most of the data on the old and new pages is the same, this process transmits a lot of redundant data and wastes bandwidth. So Ajax technology was born.

Ajax stands for Asynchronous JavaScript and XML, a term first put forward by Jesse James Garrett. The technique made it possible for browser scripts (JS) to send HTTP requests. It was first used by the Outlook Web Access team in 1998 and soon became part of IE 5.0, but it remained little known until early 2005, when Google's widespread use of it in interactive applications such as Google Groups and Gmail brought Ajax rapid acceptance.

The advent of Ajax reduced the amount of data transferred between client and server and made transfers much faster, which met the needs of early Web 2.0 development with its emphasis on a rich user experience. Over time, though, its shortcomings became apparent. For example, it cannot satisfy the real-time data updates needed by richly interactive applications such as instant messaging. This small browser-side technique is still based on HTTP, and the request/response pattern that HTTP demands cannot change unless the HTTP protocol itself changes.

3.3 One hack technology (Comet)

For web applications such as instant messaging that need low-latency data, the traditional polling-based approach is not good enough and gives a poor user experience. So a "server push" technique based on long-lived HTTP connections was hacked together. It was named Comet, a term first proposed by Alex Russell, project lead of the Dojo Toolkit, in the blog post "Comet: Low Latency Data for the Browser", and the name has been in use ever since.

In fact, server push has existed for a long time and is widely used in the classic client/server model, but browsers did not provide good support for it. The advent of Ajax made the technique workable in the browser, and the integration of Google's Gmail and GTalk was the first to use it. Once some key issues were solved (such as the loading indicator problem in IE), the technique quickly gained recognition, and there are now many mature open-source Comet frameworks.

Comparing typical Ajax and Comet data transmission makes the difference clear. Typical Ajax communication is the classic use of HTTP: to obtain data, the client must first send a request. In a web application with demanding low-latency requirements, the only option is to increase the frequency of requests to the server. With Comet, by contrast, the client keeps a long-lived connection to the server, and the server pushes data to the client only when the data the client needs has been updated.

There are two main ways to implement Comet:

The Ajax-based long-polling approach

The iframe- and htmlfile-based streaming (HTTP streaming) approach

An iframe is an HTML tag; its src attribute points to a long-lived connection request to the specified server, and the server can keep returning data over it. This is closer to traditional server push than the first approach.
In the first approach, the browser calls the JS callback function directly after receiving the data. But how is the data handled in the streaming approach? The server can embed a JS script in the returned data, passing the data as arguments to a callback function; the browser executes this script as soon as it receives it.

3.4 WebSocket: the future solution

If the advent of Ajax was an inevitable step in the development of the Internet, the advent of Comet reveals a kind of helplessness: it exists only as a hack because there was no better solution. Who should properly solve the problem Comet addresses: the browser, the HTML standard, or the HTTP standard? Who should take the leading role? Since this is fundamentally about data transfer, HTTP should have been the first to act; it was time to change the request/response pattern of this lazy protocol.

The answer came with the new generation of the HTML standard, HTML5, which provides WebSocket, a network technology for full-duplex communication between browsers and servers. According to the WebSocket draft, WebSocket is a completely new, independent protocol built on TCP. It is compatible with HTTP but is not part of HTTP; it is simply one part of HTML5. Scripts are thereby given another ability: initiating a WebSocket request. This approach should be familiar, because Ajax does the same thing; the difference is that Ajax initiates HTTP requests.

4. WebSocket Logic

Unlike the request/response pattern of HTTP, WebSocket performs an opening handshake before the connection is established and a closing handshake before the connection is closed. Once the connection is established, both sides can send data in either direction.
Over the evolution of the WebSocket protocol there have been several versions of the handshake, which are explained below.

Flash-based handshake protocol
This handshake is used with most versions of IE, which do not support the WebSocket protocol, and with older versions of Firefox, Chrome, and other browsers that lack native WebSocket support. Here, the only thing the server has to do is prepare a WebSocket-Location header for the client; there is no encryption, and reliability is poor.

Client Request:

GET /ls HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: www.qixing318.com
Origin: http://www.qixing318.com

The server returns:

HTTP/1.1 101 Web Socket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
WebSocket-Origin: http://www.qixing318.com
WebSocket-Location: ws://www.qixing318.com/ls
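
As a rough illustration of how little the server does in this earliest handshake, the following Python sketch reads the request and echoes back the origin and location. The function names `parse_upgrade_request` and `build_legacy_response` are hypothetical, and the sketch assumes a well-formed request with `Host` and `Origin` headers; it is not taken from any particular server.

```python
def parse_upgrade_request(raw: bytes):
    """Split a raw HTTP request into its path and a lower-cased header dict."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    _method, path, _version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")  # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return path, headers


def build_legacy_response(raw_request: bytes) -> bytes:
    """Build the 101 response; assumes Host and Origin headers are present."""
    path, headers = parse_upgrade_request(raw_request)
    lines = [
        "HTTP/1.1 101 Web Socket Protocol Handshake",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        "WebSocket-Origin: " + headers["origin"],
        "WebSocket-Location: ws://" + headers["host"] + path,
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = (
    b"GET /ls HTTP/1.1\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
    b"Host: www.qixing318.com\r\n"
    b"Origin: http://www.qixing318.com\r\n"
    b"\r\n"
)
print(build_legacy_response(request).decode("ascii"))
```

Nothing in this exchange proves that the server actually understood the request, which is one reason the later handshake versions add a challenge and response.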

Handshake protocol based on MD5 hashing
Client request:

GET /demo HTTP/1.1
Host: example.com
Connection: Upgrade
Sec-WebSocket-Key2:
Upgrade: WebSocket
Sec-WebSocket-Key1:
Origin: http://www.qixing318.com

[8-byte security key]

Server response:

HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Origin: http://www.qixing318.com
Sec-WebSocket-Location: ws://example.com/demo

[16-byte hash response]

Here the Sec-WebSocket-Key1 and Sec-WebSocket-Key2 headers and the [8-byte security key] are the inputs the web server uses to generate its response, as defined in the draft draft-hixie-thewebsocketprotocol-76.
The web server generates the correct response using the following algorithm:

1. Read the value of the Sec-WebSocket-Key1 header character by character, concatenate the digit characters into a temporary string, and count the number of spaces;
2. Convert the digit string from step (1) to an integer and divide it by the number of spaces counted in step (1); the result is taken as an integer;
3. Convert the integer from step (2) to a 4-byte array in network byte order;
4. Apply steps (1) to (3) to the Sec-WebSocket-Key2 header to obtain a second 4-byte array;
5. Concatenate the byte arrays from steps (3) and (4) with the [8-byte security key] to form a 16-byte array;
6. Compute the MD5 hash of the byte array from step (5). The hash is returned to the client as the security key, indicating that the server has received the client's request and agrees to create the WebSocket connection.
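
The six steps above can be sketched in Python as follows. This is a minimal sketch rather than a complete server: the function names `key_to_part` and `challenge_response` are hypothetical, the sample key strings are illustrative values that do not come from this article, and the strict divisibility check is an added assumption about how malformed keys should be handled.

```python
import hashlib
import struct


def key_to_part(key_value: str) -> bytes:
    """Steps 1-3: digits -> integer, divide by space count, pack as 4 bytes."""
    digits = "".join(ch for ch in key_value if ch in "0123456789")
    spaces = key_value.count(" ")
    if not digits or spaces == 0:
        raise ValueError("key must contain digits and at least one space")
    number = int(digits)
    if number % spaces != 0:
        # Assumption: treat a non-integer quotient as a malformed key.
        raise ValueError("digit value is not a multiple of the space count")
    return struct.pack("!I", number // spaces)  # network byte order


def challenge_response(key1: str, key2: str, key3: bytes) -> bytes:
    """Steps 4-6: combine both 4-byte parts with the 8-byte key, then MD5."""
    if len(key3) != 8:
        raise ValueError("the security key must be exactly 8 bytes")
    return hashlib.md5(key_to_part(key1) + key_to_part(key2) + key3).digest()


# Illustrative inputs (hypothetical sample values).
sample_key1 = "4 @1  46546xW%0l 1 5"   # digits 4146546015, 5 spaces
sample_key2 = "12998 5 Y3 1  .P00"     # digits 1299853100, 5 spaces
sample_key3 = b"^n:ds[4U"

response = challenge_response(sample_key1, sample_key2, sample_key3)
print(len(response))  # 16
```

The 16 raw bytes of the digest are sent as the response body; they are not hex-encoded or Base64-encoded.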

Handshake protocol based on SHA-1 hashing
This is the handshake in common use today; it requires protocol version 13 or later.
Client request:

GET /ls HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: www.qixing318.com
Sec-WebSocket-Origin: http://www.qixing318.com
Sec-WebSocket-Key: 2scvxuep9ctjv+0mwb8j6a==
Sec-WebSocket-Version: 13

The server returns:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: mldknebnwz6t9sxu+o0fy/hgesw=

Here the server appends a GUID ("258EAFA5-E914-47DA-95CA-C5AB0DC85B11") to the key sent by the client, computes the SHA-1 hash of the resulting string, Base64-encodes the hash, and returns the result to the client.

4.1 Opening Handshake:

The client initiates the connection handshake request:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

Server-side response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat

Upgrade: websocket
Indicates that this is a special HTTP request whose purpose is to upgrade the communication protocol between client and server from HTTP to WebSocket.

Sec-WebSocket-Key
A Base64-encoded key generated by the browser. The server extracts the Sec-WebSocket-Key value and uses it to compute its reply.

Sec-WebSocket-Accept
The server appends the magic string "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to the received Sec-WebSocket-Key value, computes the SHA-1 hash of the result, Base64-encodes the hash, and returns it to the client as Sec-WebSocket-Accept. For example:

function encry($req)
{
    $key = $this->getkey($req);
    $mask = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
    # SHA-1 hash the key plus GUID (raw binary output), then Base64-encode the hash
    return base64_encode(sha1($key . $mask, true));
}
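
For comparison, both sides of this exchange can be sketched in Python using only the standard library. The function names `make_client_key` and `compute_accept` are hypothetical; the sample key and expected accept value are the example pair from RFC 6455.

```python
import base64
import hashlib
import os

GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_client_key() -> str:
    """Client side: a Sec-WebSocket-Key is 16 random bytes, Base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def compute_accept(key: str) -> str:
    """Server side: SHA-1 over key + GUID, then Base64 of the raw digest."""
    digest = hashlib.sha1((key.strip() + GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Note that the key is case-sensitive and the GUID must be appended exactly as written; hashing a lower-cased copy of either one produces a different Sec-WebSocket-Accept value and the client rejects the handshake.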
If the computed value is wrong, the client reports an error as soon as it checks the response. If the handshake succeeds, the client fires the onopen event.

Sec-WebSocket-Protocol
Lists the optional subprotocols the client requests; in the response it names the supported subprotocol selected by the server.

Origin
Used by the server to distinguish and reject WebSocket requests from unauthorized browser pages.

Sec-WebSocket-Version: 13
The client sends this version identifier in the handshake request. It indicates the upgraded version of the protocol, which is the one current browsers use.

HTTP/1.1 101 Switching Protocols
101 is the status code returned by the server; any status code other than 101 means the handshake was not completed.

4.2 Data Framing

The WebSocket protocol transmits data as a sequence of data frames. The frame format defines fields such as opcode, payload length, and payload data. Data frames sent from the client to the server must be masked: if the server receives an unmasked data frame, it must close the connection. Data frames sent from the server to the client must not be masked: if a client receives a masked data frame, it must close the connection.

In either case, the party that detects the error can send a close frame to the other side (with status code 1002, which indicates a protocol error) to close the connection.
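
A minimal sketch of the server-side half of this rule, in Python: inspect the mask bit of an incoming frame and, if it is not set, produce a close frame carrying status code 1002. The function names `close_frame` and `check_client_frame` are hypothetical, and the sketch only looks at the first two header bytes.

```python
import struct
from typing import Optional

PROTOCOL_ERROR = 1002  # close status code for a protocol error


def close_frame(status: int) -> bytes:
    """Unmasked close frame (FIN=1, opcode 0x8) whose payload is the status."""
    payload = struct.pack("!H", status)  # 2-byte status in network byte order
    return bytes([0x88, len(payload)]) + payload


def check_client_frame(frame: bytes) -> Optional[bytes]:
    """Return a close frame if the client frame is unmasked, otherwise None."""
    masked = bool(frame[1] & 0x80)  # top bit of the second byte is MASK
    return None if masked else close_frame(PROTOCOL_ERROR)


unmasked_text = b"\x81\x05Hello"  # text frame without a mask: a violation
print(check_client_frame(unmasked_text).hex())  # 880203ea
```

A client would apply the mirror-image check, closing the connection when the mask bit of a server frame is set.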
A data frame consists of the following fields:

FIN
Indicates whether this is the last frame of the message; 1 bit.

RSV1, RSV2, RSV3
Reserved for protocol extensions, normally 0; 1 bit each.

Opcode
Frame type; 4 bits.
0x0: continuation frame
0x1: text frame
0x2: binary frame
0x3-7: reserved
0x8: connection close frame
0x9: ping frame
0xA: pong frame
0xB-F: reserved

MASK
1 bit. Indicates whether the payload data is masked. If it is 1, the Masking-key field holds the masking key used to decode the payload data. Data frames sent by the client must be masked, so this bit is 1 for them.

Payload length
The length of the payload data, encoded in 7 bits, 7+16 bits, or 7+64 bits. If the 7-bit value is 0-125, it is the actual payload length. If the value is 126, the following 2 bytes form a 16-bit unsigned integer that is the actual payload length. If the value is 127, the following 8 bytes form a 64-bit unsigned integer that is the actual payload length. Both extended lengths are in network byte order and must be converted accordingly.

The length encoding follows one rule: use the minimum number of bytes needed to represent the length, to avoid unnecessary transmission. For example, if the actual payload length is 124, which lies within 0-125, it must be expressed in the first 7 bits; setting the 7-bit field to 126 or 127 and putting 124 in the extended length field violates this rule.
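
The minimal-length rule can be sketched as a small Python encoder that picks the shortest of the three forms. The function name `encode_payload_length` is hypothetical; it returns the second header byte (mask bit plus 7-bit length) followed by any extended length bytes.

```python
import struct


def encode_payload_length(length: int, masked: bool = False) -> bytes:
    """Encode a payload length using the fewest bytes the format allows."""
    if length < 0 or length >= 1 << 63:
        raise ValueError("payload length out of range")
    mask_bit = 0x80 if masked else 0x00
    if length <= 125:
        return bytes([mask_bit | length])                             # 7 bits
    if length <= 0xFFFF:
        return bytes([mask_bit | 126]) + struct.pack("!H", length)   # 7+16 bits
    return bytes([mask_bit | 127]) + struct.pack("!Q", length)       # 7+64 bits


print(encode_payload_length(124).hex())    # 7c
print(encode_payload_length(126).hex())    # 7e007e
print(encode_payload_length(65536).hex())  # 7f0000000000010000
```

Because each branch is only reached when the shorter form cannot hold the value, the encoder never emits 126 or 127 for a length that fits in 7 bits.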

Payload data
The application-layer data.

Server parsing of client data

The parsing rules after receiving client data are as follows:

Byte 1
1 bit: frame-fin. 0 means more frames of the message follow; 1 means this is the last frame of the message.
3 bits: frame-rsv1, frame-rsv2, and frame-rsv3, normally 0.
4 bits: frame-opcode. 0x0 is a continuation frame; 0x1 is a text frame; 0x2 is a binary frame; 0x3-7 are reserved for non-control frames; 0x8 closes the connection; 0x9 is a ping; 0xA is a pong; 0xB-F are reserved for control frames.

Byte 2
1 bit: mask. 1 means the frame contains a masking key; 0 means it does not.
7 bits, 7 bits + 2 bytes, or 7 bits + 8 bytes: take the 7-bit value. If it is 0-125, it is the payload length. If it is 126, the next 2 bytes, read as an unsigned 16-bit integer, are the payload length. If it is 127, the next 8 bytes, read as an unsigned 64-bit integer, are the payload length.

Bytes 3-6
Assuming the payload length is 0-125 and mask is 1, these 4 bytes are the masking key.

Byte 7 to the end
The payload, whose length is the value obtained above. It consists of extension data and application data; usually there is no extension data. If mask is 1, this data must be decoded: each data byte is XORed with a byte of the masking key, cycling through the 4 key bytes.

Example code:

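using System;
using System.Text;

/// <summary>Parses a single data frame sent by the client.</summary>
/// <param name="recBytes">Bytes received by the server</param>
/// <param name="recByteLength">Number of valid bytes in recBytes</param>
private static string AnalyticData(byte[] recBytes, int recByteLength)
{
    if (recByteLength < 2) { return string.Empty; }

    bool fin = (recBytes[0] & 0x80) == 0x80;       // 1 bit; 1 marks the last frame
    if (!fin) { return string.Empty; }             // multi-frame messages are not handled yet

    bool maskFlag = (recBytes[1] & 0x80) == 0x80;  // whether the frame contains a mask
    if (!maskFlag) { return string.Empty; }        // unmasked frames are not handled yet

    long payloadLen = recBytes[1] & 0x7F;          // 7-bit length field
    int maskOffset = 2;                            // the mask follows the length field

    if (payloadLen == 126)
    {
        // The next 2 bytes hold an unsigned 16-bit length in network byte order.
        if (recByteLength < 4) { return string.Empty; }
        payloadLen = (UInt16)(recBytes[2] << 8 | recBytes[3]);
        maskOffset = 4;
    }
    else if (payloadLen == 127)
    {
        // The source listing breaks off inside this branch; everything from here
        // on is a reconstruction that follows the parsing rules described above.
        if (recByteLength < 10) { return string.Empty; }
        payloadLen = 0;
        for (int i = 2; i < 10; i++) { payloadLen = (payloadLen << 8) | recBytes[i]; }
        maskOffset = 10;
    }

    int dataOffset = maskOffset + 4;
    if (payloadLen < 0 || payloadLen > recByteLength - dataOffset)
    {
        return string.Empty;                       // incomplete or oversized frame
    }

    byte[] masks = new byte[4];
    Array.Copy(recBytes, maskOffset, masks, 0, 4);

    byte[] payloadData = new byte[payloadLen];
    Array.Copy(recBytes, dataOffset, payloadData, 0, payloadData.Length);

    // Decode: XOR each data byte with the mask, cycling through its 4 bytes.
    for (int i = 0; i < payloadData.Length; i++)
    {
        payloadData[i] = (byte)(payloadData[i] ^ masks[i % 4]);
    }
    return Encoding.UTF8.GetString(payloadData);
}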
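
The same parsing rules can also be sketched compactly in Python. This is a minimal sketch under the assumption that one complete frame is already in the buffer; the names `Frame` and `parse_client_frame` are hypothetical, and fragmentation and extension data are not handled. The sample bytes are the masked "Hello" text frame given as an example in RFC 6455.

```python
import struct
from typing import NamedTuple


class Frame(NamedTuple):
    fin: bool
    opcode: int
    payload: bytes


def parse_client_frame(data: bytes) -> Frame:
    """Parse one complete, masked frame sent by a client."""
    if len(data) < 2:
        raise ValueError("need at least 2 header bytes")
    fin = bool(data[0] & 0x80)      # byte 1, top bit
    opcode = data[0] & 0x0F         # byte 1, low 4 bits
    masked = bool(data[1] & 0x80)   # byte 2, top bit
    length = data[1] & 0x7F         # byte 2, low 7 bits
    offset = 2
    if length == 126:               # 16-bit extended length
        if len(data) < offset + 2:
            raise ValueError("incomplete frame")
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:             # 64-bit extended length
        if len(data) < offset + 8:
            raise ValueError("incomplete frame")
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if not masked:
        raise ValueError("client frames must be masked")
    if len(data) < offset + 4 + length:
        raise ValueError("incomplete frame")
    key = data[offset:offset + 4]
    offset += 4
    body = data[offset:offset + length]
    # XOR every payload byte with the masking key, cycling through 4 bytes.
    payload = bytes(b ^ key[i % 4] for i, b in enumerate(body))
    return Frame(fin, opcode, payload)


sample = bytes.fromhex("818537fa213d7f9f4d5158")
print(parse_client_frame(sample))  # Frame(fin=True, opcode=1, payload=b'Hello')
```

Raising an error for an unmasked frame, rather than returning an empty string, makes it easier for the caller to respond with a close frame as the protocol requires.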
