This article presents a simple WebSocket server, written in Python, that is compatible with both the old and the new WebSocket protocols. A recent project needed the WebSocket technology introduced with HTML5. I thought it would be easy to handle, but once development really started it turned out to be a lot of trouble. Although our team has long experience in development and design, it still took some effort, so I decided to post the implementation here to save detours for anyone with the same requirement.
The basic concepts of WebSocket are clearly explained on Wikipedia and are easy to find online, so we skip them here and go straight to the topic.
The premise here is that the server must be implemented in Python. If you are not tied to a particular language, we recommend trying Node.js first, with the third-party library Socket.IO. It is very easy to use, gives you a working WebSocket server in about ten minutes with no pain at all, and lets you write the back end in JavaScript, which should also appeal to many developers.
However, if you have to use Python, the search results on Google are almost useless. The most serious problem is that the WebSocket protocol itself is still a draft, so different browsers support different protocol versions: Safari 5.1 supports the old draft (Hixie-76, also published as Hybi-00), while Chrome 15 and Firefox 8.0 support the newer Hybi-10. The old and new versions differ both in the handshake that establishes the connection and in the data transmission format. As a result, most implementations found online only work with Safari, and Safari clients cannot communicate with Chrome and Firefox clients.
Let's start with the handshake of the old and new WebSocket protocols. Here is the handshake data sent by three different browsers:
Chrome:
The code is as follows:
GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: 127.0.0.1:1337
Sec-WebSocket-Origin: http://127.0.0.1:8000
Sec-WebSocket-Key: erWJbDVAlYnHvHNulgrW8Q==
Sec-WebSocket-Version: 8
Cookie: csrftoken=xxxxxx; sessionid=xxxxx
Firefox:
The code is as follows:
GET / HTTP/1.1
Host: 127.0.0.1:1337
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0) Gecko/20100101 Firefox/8.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7
Connection: keep-alive, Upgrade
Sec-WebSocket-Version: 8
Sec-WebSocket-Origin: http://127.0.0.1:8000
Sec-WebSocket-Key: 1t3F81iAxNIZE2TxqWv+8A==
Cookie: xxx
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket
Safari:
The code is as follows:
GET / HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: 127.0.0.1:1337
Origin: http://127.0.0.1:8000
Cookie: sessionid=xxxx; calView=day; dayCurrentDate=1314288000000
Sec-WebSocket-Key1: cV 'p1*42 #7 ^ 9} _ 647 08 {
Sec-WebSocket-Key2: O8 415 8x37R A8 4
;"######
We can see that Chrome and Firefox implement the new protocol, so they send only one "Sec-WebSocket-Key" header, from which the server generates the handshake token. Safari, which uses the old protocol, sends two keys instead, "Sec-WebSocket-Key1" and "Sec-WebSocket-Key2", so the server has to decide which algorithm to use when generating the handshake token. First, let's look at Safari and the old protocol. The token generation algorithm is as follows:
Take all the digit characters in Sec-WebSocket-Key1 and read them as one number, here 1427964708. Divide it by the number of spaces in Key1 (six, it seems) and keep the integer part; this gives the value N1. Apply the same procedure to Sec-WebSocket-Key2 to get the second integer N2. Concatenate N1 and N2 as 32-bit big-endian integers, then append Key3, to obtain the raw sequence ser_key. So what is Key3? At the end of the handshake request sent by Safari there is a strange 8-byte string, ;"######, and that is Key3. Back to ser_key: computing the MD5 of this sequence yields a 16-byte digest, which is the token required by the old protocol. Append the token to the handshake response and send it back to the client to complete the handshake.
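The old-protocol token calculation described above can be sketched in a few lines of Python 3. This is only a minimal sketch, not the attached server source: the function names key_to_number and old_protocol_token are hypothetical, and it assumes that N1 and N2 are packed as 32-bit unsigned big-endian integers and that Key3 is passed in as raw bytes.

```python
import hashlib
import struct


def key_to_number(key):
    """Digits of the key, read as one number, divided by its space count."""
    digits = "".join(ch for ch in key if ch.isdigit())
    spaces = key.count(" ")
    if not digits or spaces == 0:
        raise ValueError("malformed Sec-WebSocket-Key1/Key2 value")
    return int(digits) // spaces


def old_protocol_token(key1, key2, key3):
    """Return the 16-byte MD5 token for the old handshake.

    key1 and key2 are the header values as text; key3 is the 8 raw bytes
    that follow the request headers.
    """
    n1 = key_to_number(key1)
    n2 = key_to_number(key2)
    # N1 and N2 as 32-bit big-endian integers, followed by Key3.
    ser_key = struct.pack(">II", n1, n2) + key3
    return hashlib.md5(ser_key).digest()
```

For a key whose digits read 1427964708 and which contains six spaces, key_to_number returns 237994118.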
Generating the token for the new protocol is simpler: first concatenate the Sec-WebSocket-Key with the fixed GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11", then hash the concatenated string with SHA-1, and finally base64-encode the digest to obtain the token.
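As a rough illustration of the new-protocol token, here is a short Python 3 sketch using only the standard library. The names WS_GUID and new_protocol_token are hypothetical, and the GUID is written as the protocol specification defines it.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket protocol.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def new_protocol_token(sec_websocket_key):
    """Return the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    spliced = (sec_websocket_key.strip() + WS_GUID).encode("ascii")
    digest = hashlib.sha1(spliced).digest()       # 20-byte SHA-1 digest
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from the protocol specification, "dGhlIHNhbXBsZSBub25jZQ==", this returns "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=".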
Note also that the structure of the handshake response sent back to the client differs between the old and new protocols; the server source code in the attachment shows this clearly.
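For readers without the attachment, the sketch below shows one way to tell the two handshakes apart and to build the two kinds of response. It is based on the published protocol drafts, not on the attached source, and all function names and parameters (parse_handshake, uses_old_protocol, origin, location) are assumptions made for illustration.

```python
def parse_handshake(request):
    """Split a raw handshake request into (headers, body)."""
    head, _, body = request.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the "GET ..." request line
        name, sep, value = line.partition(b":")
        if sep:
            headers[name.strip().lower().decode("latin-1")] = (
                value.strip().decode("latin-1"))
    return headers, body


def uses_old_protocol(headers):
    """The old protocol sends Key1 and Key2 instead of a single Key."""
    return ("sec-websocket-key1" in headers
            and "sec-websocket-key2" in headers)


def old_handshake_response(origin, location, token):
    """Old protocol: the 16 raw token bytes follow the header block."""
    head = ("HTTP/1.1 101 WebSocket Protocol Handshake\r\n"
            "Upgrade: WebSocket\r\n"
            "Connection: Upgrade\r\n"
            "Sec-WebSocket-Origin: %s\r\n"
            "Sec-WebSocket-Location: %s\r\n"
            "\r\n" % (origin, location))
    return head.encode("latin-1") + token


def new_handshake_response(token):
    """New protocol: the token travels in the Sec-WebSocket-Accept header."""
    head = ("HTTP/1.1 101 Switching Protocols\r\n"
            "Upgrade: websocket\r\n"
            "Connection: Upgrade\r\n"
            "Sec-WebSocket-Accept: %s\r\n"
            "\r\n" % token)
    return head.encode("ascii")
```

For the old protocol, the body returned by parse_handshake is the 8-byte Key3 that follows the headers.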
The handshake is only half of what a WebSocket server does. At this point the server can only establish connections with browsers of both versions. If you try to send a message from Chrome to Safari, Safari will not receive it. The reason is that the two protocol versions use different data framing: after the handshake has established the connection, the structure of the data the client sends and receives differs.
The first step is to recover the raw data sent by the client under each protocol. The old protocol is simple: it just puts '\x00' in front of the original data and '\xFF' after it. So if a Safari client sends the string 'test', the WebSocket server actually receives '\x00test\xff', and you only need to strip the first and last bytes.
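The old framing is simple enough to sketch in a couple of Python 3 functions. The names decode_old_frame and encode_old_frame are hypothetical, and the sketch assumes the payload is UTF-8 text and that exactly one complete frame is passed in.

```python
def decode_old_frame(frame):
    """Strip the leading 0x00 and trailing 0xFF from an old-protocol frame."""
    if len(frame) < 2 or frame[:1] != b"\x00" or frame[-1:] != b"\xff":
        raise ValueError("not a complete old-protocol text frame")
    return frame[1:-1].decode("utf-8")


def encode_old_frame(text):
    """Wrap text the same way before sending it to an old-protocol client."""
    return b"\x00" + text.encode("utf-8") + b"\xff"
```

Here decode_old_frame(b"\x00test\xff") gives back "test".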
The data of the new protocol is more troublesome. According to the new draft, a data packet sent by Chrome or Firefox consists of the following parts. The first is a fixed byte (1000 0001 or 1000 0010), which can be ignored. The troublesome part is the second byte. Suppose the second byte is 1011 1100. Its first bit must be 1; this is the "masked" bit. The remaining 7 bits encode a value: here the remaining bits are 011 1100, which gives 60. This value is then interpreted as follows:
If the value is between 0000 0000 and 0111 1101 (0 to 125), it is the actual data length; if the value equals 0111 1110 (126), the next two bytes give the actual data length; if the value equals 0111 1111 (127), the next eight bytes give the data length.
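The length rule above might look like this in Python 3. The function name read_payload_length and its return shape are assumptions made for illustration; the sketch also assumes the extended length fields are big-endian unsigned integers and that the whole frame header is already in the buffer.

```python
import struct


def read_payload_length(frame):
    """Return (masked, payload_length, offset_of_the_next_field)."""
    second = frame[1]
    masked = bool(second & 0x80)      # highest bit is the "masked" bit
    length = second & 0x7F            # remaining 7 bits
    offset = 2
    if length == 126:                 # real length is in the next 2 bytes
        (length,) = struct.unpack(">H", frame[2:4])
        offset = 4
    elif length == 127:               # real length is in the next 8 bytes
        (length,) = struct.unpack(">Q", frame[2:10])
        offset = 10
    return masked, length, offset
```

For the example byte 1011 1100 (0xBC), this reports a masked frame with a 60-byte payload, and the next field starts at the third byte (offset 2).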
With this rule we know where the bytes that encode the data length end. In the example, 60 lies between 0 and 125, so the second byte itself gives the length of the original data (60 bytes). Starting from the third byte we then take 4 bytes; this string of bytes is called the "mask". The data after the mask is the actual payload, but in masked form, because the real data has been combined with the mask by a bitwise operation. To recover the original data, XOR each byte x of the masked data with the mask byte at index i % 4, where i is the index of x in the masked data.
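The unmasking step can be sketched as follows in Python 3. The names unmask and parse_short_frame are hypothetical, and to keep the example short it only handles frames whose length fits in the second byte (0 to 125).

```python
def unmask(mask, masked_data):
    """XOR every byte of masked_data with the mask byte at index i % 4."""
    return bytes(x ^ mask[i % 4] for i, x in enumerate(masked_data))


def parse_short_frame(frame):
    """Decode a masked frame whose payload is at most 125 bytes long."""
    length = frame[1] & 0x7F
    if not frame[1] & 0x80 or length > 125:
        raise ValueError("expected a masked frame with a 7-bit length")
    mask = frame[2:6]                  # 4 mask bytes start at the third byte
    masked_data = frame[6:6 + length]  # payload follows the mask
    return unmask(mask, masked_data)
```

Because XOR is its own inverse, applying unmask twice with the same mask returns the input unchanged.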
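Sending data to a new-protocol client is simpler, because the server does not mask the frames it sends; it only prepends the header byte and the length. The code is as follows: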
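import struct

def send_data(raw_str):
    """Wrap raw_str in an unmasked text frame for new-protocol clients."""
    payload = raw_str.encode('utf-8')  # Python 3: frames are bytes, not str
    data_length = len(payload)
    back_str = [b'\x81']  # fixed first byte: final fragment, text frame
    if data_length <= 125:
        back_str.append(struct.pack('>B', data_length))
    elif data_length <= 0xFFFF:
        # 126 means the real length follows in 2 bytes (big-endian)
        back_str.append(struct.pack('>BH', 126, data_length))
    else:
        # 127 means the real length follows in 8 bytes (big-endian)
        back_str.append(struct.pack('>BQ', 127, data_length))
    back_str.append(payload)
    return b''.join(back_str)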
In this way, the generated frame can be sent to Chrome or Firefox, which use the new protocol.
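To relay a message between browsers of different versions, the server has to frame the same text once per protocol. The following Python 3 sketch shows the idea under some assumptions: frame_for and broadcast are hypothetical names, each client is represented as a (protocol, write) pair where protocol is "old" or "new", and write is any callable that accepts bytes (in a Twisted server this could be the connection's transport.write).

```python
import struct


def frame_for(protocol, text):
    """Frame text for a client speaking the "old" or "new" protocol."""
    payload = text.encode("utf-8")
    if protocol == "old":
        return b"\x00" + payload + b"\xff"
    length = len(payload)
    if length <= 125:
        header = struct.pack(">BB", 0x81, length)
    elif length <= 0xFFFF:
        header = struct.pack(">BBH", 0x81, 126, length)
    else:
        header = struct.pack(">BBQ", 0x81, 127, length)
    return header + payload


def broadcast(clients, text):
    """Send text to every client; clients is a list of (protocol, write)."""
    for protocol, write in clients:
        write(frame_for(protocol, text))
```

Keeping the protocol version next to each connection means the incoming message only has to be decoded once, and is then re-framed for whichever protocol each receiver negotiated.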
At this point the simple WebSocket server is complete: it accepts connections over both the old and the new protocol and relays data between the different versions. The source code of the server is available for download. Note that it uses the Twisted framework to run the TCP service. The code is not particularly well written and is for reference only.