Implementing a simple WebSocket server in Python that is compatible with both the old and the new WebSocket protocol

Source: Internet
Author: User
Tags: sha1 encryption
In a recent project I needed to use WebSocket, the technology introduced with HTML5. I thought it would be easy to handle, but real development turned up a lot of trouble. We are a front-end development and design team, and as a second-rate programmer I am hardly worth noticing, but to save friends with the same need a few detours, I decided to write the implementation down here.

The basic concepts of WebSocket are explained clearly on Wikipedia and in plenty of other places online, so I will skip them here and go straight to the topic.

The premise of this problem is that the server must be implemented in Python. If you have no restriction on the language, I recommend a third-party library for Node.js, Socket.IO. It is very good, gets a WebSocket server running in ten minutes with no fuss, and writing the back end in JS will probably also appeal to a lot of hipster developers.

If you choose Python, though, Google search results are almost useless. The most fatal problem is that the WebSocket protocol itself is still a draft, so different browsers support different versions of it: Safari 5.1 supports the old version, Hybi-00 (also known as Hixie-76), while Chrome 15 and Firefox 8.0 support the new version, Hybi-10. The two versions differ in both the handshake used to establish a connection and the required data transmission format. As a result, most implementations found online work only with Safari, and Safari cannot exchange messages with Chrome and Firefox.

First, the handshakes of the old and the new version of the WebSocket protocol. Let's look at the handshake data sent by three different browsers:

Chrome:

GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: 127.0.0.1:1337
Sec-WebSocket-Origin: http://127.0.0.1:8000
Sec-WebSocket-Key: erwjbdvalynhvhnulgrw8q==
Sec-WebSocket-Version: 8
Cookie: csrftoken=xxxxxx; sessionid=xxxxx


Firefox:

GET / HTTP/1.1
Host: 127.0.0.1:1337
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0) Gecko/20100101 Firefox/8.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive, Upgrade
Sec-WebSocket-Version: 8
Sec-WebSocket-Origin: http://127.0.0.1:8000
Sec-WebSocket-Key: 1t3f81iaxnize2txqwv+8a==
Cookie: xxx
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket


Safari:

GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: 127.0.0.1:1337
Origin: http://127.0.0.1:8000
Cookie: sessionid=xxxx; Calview=day; daycurrentdate=1314288000000
Sec-WebSocket-Key1: CV ' p1* 42#7 ^9}_ 647 08{
Sec-WebSocket-Key2: o8 415 8x37r A8 4
;" ######
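
The three requests above share one shape: a request line, a set of headers, a blank line, and (for Safari only) eight extra bytes after it. A minimal sketch of splitting such a request, assuming the whole request has already been read into one bytes object; the names `parse_handshake` and `is_legacy` are illustrative and not taken from the original server:

```python
def parse_handshake(request: bytes):
    """Split a raw handshake request into (headers, body).

    Header names are lower-cased so lookups do not depend on how a
    particular browser capitalises them. `body` is whatever follows the
    blank line: empty for Chrome and Firefox, the 8-byte Key3 for Safari.
    """
    head, _, body = request.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    headers = {}
    for line in lines[1:]:  # lines[0] is the request line "GET / HTTP/1.1"
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers, body


def is_legacy(headers: dict) -> bool:
    """True when the client sent the old two-key handshake."""
    return "sec-websocket-key1" in headers
```

Only the first colon of each line is treated as the separator, so values such as `127.0.0.1:1337` survive intact.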

As you can see, Chrome and Firefox implement the new protocol, so they send only a single "Sec-WebSocket-Key" header for the server to generate the handshake token from. Safari, following the old protocol, sends two keys, "Sec-WebSocket-Key1" and "Sec-WebSocket-Key2", so the server has to check which case it is dealing with when generating the token. Let's look at Safari first. The token generation algorithm of the old protocol is as follows:

Take all the digit characters in Sec-WebSocket-Key1 and join them into one number, here 1427964708, then divide it by the number of spaces in Key1, here 6. Keep the integer part of the result to get the value N1 (here 237994118). Apply the same steps to Sec-WebSocket-Key2 to get the second integer, N2. Write N1 and N2 as 32-bit big-endian integers, concatenate them, and append a third key, Key3, to get the raw sequence ser_key. So what is Key3? At the very end of the handshake request sent by Safari there is a strange 8-byte string (the last line of the Safari request above); that is Key3. Back to ser_key: computing the MD5 of this raw sequence yields a 16-byte digest, which is the token the old protocol requires. Append the token to the end of the handshake response sent back to the client, and the handshake is complete.
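
The old-protocol steps can be sketched in a few lines of Python 3. This is a minimal illustration rather than the code from the original server; the function names `key_number` and `legacy_token` are made up for the example, and `ser_key` follows the naming used in the text:

```python
import hashlib
import struct


def key_number(key: str) -> int:
    """Join the digits of the key into one number, then divide by the space count."""
    digits = "".join(ch for ch in key if ch in "0123456789")
    spaces = key.count(" ")
    if not digits or spaces == 0:
        raise ValueError("malformed Sec-WebSocket-Key1/Key2 value")
    return int(digits) // spaces


def legacy_token(key1: str, key2: str, key3: bytes) -> bytes:
    """Return the 16-byte MD5 token for the old handshake."""
    if len(key3) != 8:
        raise ValueError("Key3 must be exactly 8 bytes")
    # N1 and N2 as 32-bit big-endian integers, followed by the 8 bytes of Key3.
    ser_key = struct.pack(">II", key_number(key1), key_number(key2)) + key3
    return hashlib.md5(ser_key).digest()


print(key_number("CV ' p1* 42#7 ^9}_ 647 08{"))  # 1427964708 // 6 = 237994118
```

Key3 has to be passed as raw bytes exactly as received, because it may contain bytes that are not printable text.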

Generating the token under the new protocol is simpler: concatenate the Sec-WebSocket-Key value with the fixed GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11", compute the SHA-1 hash of the concatenated string, and Base64-encode the resulting digest to get the token.
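
A minimal Python 3 sketch of the new-protocol token, using only `hashlib` and `base64` from the standard library. The function name `accept_token` is illustrative; the sample key and expected result are the ones used as the worked example in the WebSocket specification:

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_token(sec_websocket_key: str) -> str:
    """SHA-1 of key + GUID, Base64-encoded, as sent in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((sec_websocket_key.strip() + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(accept_token("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The key is used exactly as the browser sent it; it is not Base64-decoded before hashing.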

Note also that the handshake response sent back to the client is structured differently in the new and the old protocol. The server source code in the attachment spells this out clearly and is easy to follow.
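
Since the attachment is not reproduced here, the following is a hedged sketch of what the two responses look like, based on the respective protocol drafts rather than on the original server code. It assumes `headers` is a dictionary with lower-cased header names, that the request path is `/` as in the captures above, and that `key3` holds the 8 bytes after Safari's headers; `handshake_response` is a hypothetical name:

```python
import base64
import hashlib
import struct

GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def _key_number(key: str) -> int:
    digits = "".join(ch for ch in key if ch in "0123456789")
    return int(digits) // key.count(" ")


def handshake_response(headers: dict, key3: bytes = b"") -> bytes:
    """Build the handshake reply for either protocol version."""
    if "sec-websocket-key1" in headers:
        # Old protocol: the 16-byte MD5 token is the body, after the blank line.
        token = hashlib.md5(
            struct.pack(">II",
                        _key_number(headers["sec-websocket-key1"]),
                        _key_number(headers["sec-websocket-key2"])) + key3
        ).digest()
        lines = [
            "HTTP/1.1 101 WebSocket Protocol Handshake",
            "Upgrade: WebSocket",
            "Connection: Upgrade",
            "Sec-WebSocket-Origin: " + headers.get("origin", ""),
            "Sec-WebSocket-Location: ws://" + headers.get("host", "") + "/",
        ]
        return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1") + token

    # New protocol: the token travels in the Sec-WebSocket-Accept header.
    digest = hashlib.sha1((headers["sec-websocket-key"] + GUID).encode("ascii")).digest()
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Accept: " + base64.b64encode(digest).decode("ascii"),
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")
```

The main structural difference is where the token goes: in a header for the new protocol, as raw bytes after the headers for the old one.
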
Completing the handshake is only half of a WebSocket server. So far the server can only accept connections from both kinds of browser; if you try to send a message from Chrome to Safari, Safari will not receive it. The reason is that the two protocol versions frame data differently, so the data a client sends and receives after the handshake has a different structure in each.

The first step is to recover the original data sent by the client under each version of the protocol. The old protocol is simple: a '\x00' is added before the original data and a '\xff' after it. If a Safari client sends the string 'test', the WebSocket server actually receives '\x00test\xff', so it only needs to strip those two bytes.
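
A minimal sketch of the old-protocol framing in Python 3, assuming messages are UTF-8 text. The function names are illustrative; `unwrap_legacy` also returns any incomplete trailing bytes, since a TCP read may end in the middle of a message:

```python
def unwrap_legacy(buffer: bytes):
    """Extract complete '\\x00...\\xff' messages; return (messages, leftover)."""
    messages = []
    while True:
        start = buffer.find(b"\x00")
        end = buffer.find(b"\xff", start + 1)
        if start == -1 or end == -1:
            break  # no complete message left in the buffer
        messages.append(buffer[start + 1:end].decode("utf-8"))
        buffer = buffer[end + 1:]
    return messages, buffer


def wrap_legacy(text: str) -> bytes:
    """Frame a text message for a client using the old protocol."""
    return b"\x00" + text.encode("utf-8") + b"\xff"


print(unwrap_legacy(b"\x00test\xff"))  # (['test'], b'')
```

Searching for the raw byte 0xFF is safe here because that byte never occurs inside valid UTF-8 text.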

The new protocol's data is the troublesome part. According to the draft, a data message sent by Chrome or Firefox consists of the following parts. First comes a byte that is normally fixed (1000 0001 for a text frame or 1000 0010 for a binary frame); this simple server can ignore it. The trouble starts with the second byte. Suppose it is 1011 1100. Its first bit is always 1 for data sent by a browser and marks the frame as "masked"; the remaining 7 bits form a number, here 011 1100, which is 60. This value has to be interpreted as follows:

If the value is between 0000 0000 and 0111 1101 (0 to 125), it is itself the length of the actual data. If it is exactly 0111 1110 (126), the next 2 bytes give the real data length as a big-endian integer. If it is exactly 0111 1111 (127), the next 8 bytes give the data length.
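
The length rule can be sketched as a small Python 3 function. `parse_frame_header` is an illustrative name, and the sketch assumes the caller has already buffered enough bytes of the frame:

```python
import struct


def parse_frame_header(data: bytes):
    """Return (masked, payload_length, header_size) for a new-protocol frame.

    header_size counts the first two bytes plus any extended length bytes;
    it does not include the 4 mask bytes that follow for masked frames.
    """
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    second = data[1]
    masked = bool(second & 0x80)   # first bit of the second byte
    length = second & 0x7F         # remaining 7 bits
    if length <= 125:
        return masked, length, 2
    if length == 126:
        if len(data) < 4:
            raise ValueError("need 2 extended length bytes")
        (length,) = struct.unpack(">H", data[2:4])
        return masked, length, 4
    if len(data) < 10:
        raise ValueError("need 8 extended length bytes")
    (length,) = struct.unpack(">Q", data[2:10])
    return masked, length, 10


print(parse_frame_header(bytes([0b10000001, 0b10111100])))  # (True, 60, 2)
```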

With this check you know where the bytes that describe the data length end. Take our example of 60: the value is between 0 and 125, so the second byte itself gives the length of the original data (60 bytes), and the 4 bytes starting from the third byte are the "masks" (the masking key). The data after the mask is not the actual data but its "sibling": it has been transformed bitwise using the mask. To recover the original data, XOR each byte x of the sibling data with byte i % 4 of the mask, where i is the index of x in the sibling data.
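
The unmasking step can be sketched as follows. This is not the snippet from the original server; `unmask` and `decode_short_frame` are illustrative names, and the second function only handles the simple case discussed here, where the length fits in the second byte. The sample frame is the masked "Hello" example from the WebSocket specification:

```python
def unmask(mask: bytes, masked: bytes) -> bytes:
    """XOR byte i of the masked data with byte i % 4 of the mask."""
    return bytes(b ^ mask[i % 4] for i, b in enumerate(masked))


def decode_short_frame(frame: bytes) -> str:
    """Decode a masked text frame whose payload length is 0 to 125 bytes."""
    length = frame[1] & 0x7F
    if not frame[1] & 0x80 or length > 125:
        raise ValueError("expected a masked frame with a 7-bit length")
    mask = frame[2:6]                  # 4 mask bytes start at the third byte
    payload = frame[6:6 + length]      # masked data follows the mask
    return unmask(mask, payload).decode("utf-8")


frame = bytes([0x81, 0x85, 0x37, 0xFA, 0x21, 0x3D, 0x7F, 0x9F, 0x4D, 0x51, 0x58])
print(decode_short_frame(frame))  # Hello
```

Because XOR is its own inverse, the same `unmask` function also applies a mask to plain data.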

Sending goes the other way: the server builds the same kind of frame header, but without a mask. The code is as follows:

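import struct

def send_data(raw_str):
    """Frame a text message for a client that uses the new protocol."""
    payload = raw_str.encode("utf-8")
    data_length = len(payload)

    # First byte: FIN bit set, opcode 0x1 (text frame).
    back_str = [b"\x81"]

    if data_length <= 125:
        # Short payload: the length fits in the second byte itself.
        back_str.append(bytes([data_length]))
    elif data_length <= 0xFFFF:
        # 126 means the next 2 bytes hold the length (big-endian).
        back_str.append(struct.pack(">BH", 126, data_length))
    else:
        # 127 means the next 8 bytes hold the length (big-endian).
        back_str.append(struct.pack(">BQ", 127, data_length))

    # Frames sent by the server are not masked, so the payload follows as-is.
    back_str.append(payload)
    return b"".join(back_str)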

The resulting back_str can be sent to Chrome or Firefox clients, which use the new version of the protocol.
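
Putting both framings side by side shows how one message can be relayed to clients on different protocol versions. This is a hedged sketch, not the original server's logic: `relay`, `frame_new`, and `frame_legacy` are hypothetical names, and the `clients` dictionary simply records which clients completed the old handshake:

```python
import struct


def frame_new(text: str) -> bytes:
    """Unmasked text frame for the new protocol."""
    payload = text.encode("utf-8")
    n = len(payload)
    if n <= 125:
        header = struct.pack(">BB", 0x81, n)
    elif n <= 0xFFFF:
        header = struct.pack(">BBH", 0x81, 126, n)
    else:
        header = struct.pack(">BBQ", 0x81, 127, n)
    return header + payload


def frame_legacy(text: str) -> bytes:
    """'\\x00' + data + '\\xff' frame for the old protocol."""
    return b"\x00" + text.encode("utf-8") + b"\xff"


def relay(text: str, clients: dict) -> dict:
    """Map each client id to the bytes that should be written to its socket.

    clients: {client_id: True if the client used the old handshake}
    """
    return {cid: frame_legacy(text) if legacy else frame_new(text)
            for cid, legacy in clients.items()}


print(relay("test", {"safari": True, "chrome": False}))
# {'safari': b'\x00test\xff', 'chrome': b'\x81\x04test'}
```

The server decodes each incoming message to plain text once, then re-frames it per recipient, so the sender's protocol version never matters to the receiver.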

At this point the simple WebSocket server is complete. It accepts socket connections using either the old or the new protocol, and it can pass data between clients on different versions. The server's source code is available for download. Note that it uses the Twisted framework to run the TCP service; the code is not well written and is for reference only.
