In a typical web application, the client sends a request through the browser, the server receives and processes it and returns the result, and the browser renders the information. This mechanism works well for applications whose information does not change very often, but it is strained by applications that demand real-time behavior and massive concurrency. With the mobile internet booming, high concurrency and real-time responsiveness are problems web applications face all the time: real-time quotes for financial securities, location lookups in web navigation apps, real-time message push for social networks, and more.
When handling such scenarios, web development built on the traditional request-response pattern usually falls back on a workaround for real-time communication. The most common is polling. The principle is simple: the client sends requests to the server at a fixed time interval to keep client and server data in sync. The problem is just as obvious: when the client requests at a fixed frequency, the server-side data may not have changed, so many requests are unnecessary, which wastes bandwidth and is inefficient.
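The waste in fixed-interval polling can be illustrated with a small simulation. This is a minimal sketch, not a real HTTP client: `FakeServer`, `poll`, and the update schedule are hypothetical names and numbers chosen purely for illustration.

```python
class FakeServer:
    """Stands in for a real server; its data changes only at the given ticks."""

    def __init__(self, update_ticks):
        self.update_ticks = set(update_ticks)
        self.version = 0

    def advance(self, tick):
        # The server-side data changes only on the scheduled ticks.
        if tick in self.update_ticks:
            self.version += 1

    def handle_request(self):
        # In a real system every poll costs a full request-response round trip.
        return self.version


def poll(server, total_ticks, interval):
    """Poll every `interval` ticks; return (requests sent, requests that saw new data)."""
    last_seen = 0
    requests = useful = 0
    for tick in range(1, total_ticks + 1):
        server.advance(tick)
        if tick % interval == 0:
            requests += 1
            version = server.handle_request()
            if version != last_seen:
                useful += 1
                last_seen = version
    return requests, useful


requests, useful = poll(FakeServer(update_ticks=[7, 40]), total_ticks=60, interval=2)
print(f"{requests} requests sent, {useful} returned new data")
# prints "30 requests sent, 2 returned new data"
```

With these assumed numbers, 28 of the 30 requests come back with nothing new, which is the waste the paragraph above describes.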
Another approach is based on Flash: Adobe Flash performs the data exchange through its own socket implementation and then exposes the corresponding interface for JavaScript to call, achieving real-time transmission. This approach is more efficient than polling and was widely used because of Flash's high installation rate. However, Flash is poorly supported on mobile devices: iOS does not support Flash at all, and although Android did support it, the results in practice were unsatisfactory and placed high demands on the device hardware. In 2012, Adobe officially announced that it would no longer support Android 4.1 and later, which marked the death of Flash on mobile devices.
When faced with high concurrency and real-time requirements, the traditional web model hits a bottleneck it cannot overcome; what is needed is an efficient, resource-saving bidirectional communication mechanism that guarantees real-time data transmission. Against this background WebSocket, sometimes called the TCP of the Web, emerged as part of the HTML5 effort. Early HTML5 had not yet settled into an industry-wide specification, and browser and application server vendors each had their own comparable solutions, such as IBM's MQTT and the open-source Comet frameworks. Only in 2014 was HTML5 finally settled and formally published as a standard, after which application server and browser vendors gradually converged; Java EE 7 also implemented the WebSocket protocol. At that point WebSocket support was complete on both the client and the server. Readers can consult the HTML5 specification to become familiar with the new HTML specification and its WebSocket support.
Why do we need WebSocket?
Anyone meeting WebSocket for the first time asks the same question: we already have the HTTP protocol, so why do we need another protocol? What benefits does it bring?
The answer is simple: the HTTP protocol has a flaw, namely that communication can only be initiated by the client.
For example, if we want to know today's weather, the client has to send a request to the server, and the server returns the query result. HTTP gives the server no way to push information to the client on its own initiative.
Because requests flow in only one direction, it is very troublesome for the client to learn about continuous state changes on the server. We can only use polling: every so often the client sends a query to see whether the server has any new information. The most typical scenario is a chat room.
Polling is inefficient and wastes resources (the client must keep reconnecting, or the HTTP connection must stay open the whole time). Engineers therefore kept looking for a better way, and that is how WebSocket came to be invented.
WebSocket mechanism
The following is a brief introduction to the principle and operating mechanism of WebSocket.
WebSocket is a new protocol introduced alongside HTML5. It provides full-duplex communication between the browser and the server, which saves server resources and bandwidth and makes real-time communication possible. Like HTTP, it transmits data over an established TCP connection, but it differs from HTTP in two major ways:
- WebSocket is a two-way communication protocol. Once the connection is established, the WebSocket server and client can each send data to and receive data from the other, just like a socket.
- Like TCP, WebSocket requires a connection to be established first; the two sides can communicate only after the connection succeeds.
In traditional HTTP, every request-response exchange requires the client to establish a connection with the server; WebSocket instead resembles the long-lived TCP connection model of sockets. Once the WebSocket connection is established, subsequent data is transmitted as a sequence of frames. Neither the client nor the server needs to initiate a new connection request until the client closes the WebSocket connection or the server terminates it. Under massive concurrency and heavy client-server interaction, this greatly reduces network bandwidth consumption and gives a clear performance advantage; and because the client sends and receives messages on the same persistent connection, the real-time advantage is clear as well.
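To make "a sequence of frames" concrete, here is a minimal sketch of how a single unfragmented text frame is laid out according to RFC 6455. The helper name `encode_text_frame` is hypothetical, and real libraries also handle fragmentation, binary and control opcodes, and extensions. The mask key below is the fixed one from the RFC's own example so that the output is reproducible; a real client would pick a fresh random 4-byte key for every frame.

```python
import struct


def encode_text_frame(text, mask_key=None):
    """Encode `text` as one unfragmented WebSocket text frame (FIN=1, opcode=0x1)."""
    payload = text.encode("utf-8")
    first_byte = 0x80 | 0x1  # FIN bit set, opcode 0x1 = text
    mask_bit = 0x80 if mask_key is not None else 0x00

    # Payload length uses 7 bits, or 7 bits + 16 bits, or 7 bits + 64 bits.
    length = len(payload)
    if length < 126:
        header = struct.pack("!BB", first_byte, mask_bit | length)
    elif length < 65536:
        header = struct.pack("!BBH", first_byte, mask_bit | 126, length)
    else:
        header = struct.pack("!BBQ", first_byte, mask_bit | 127, length)

    if mask_key is None:
        # Server-to-client frames are sent unmasked.
        return header + payload
    # Client-to-server frames XOR each payload byte with the 4-byte mask key.
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + masked


print(encode_text_frame("Hello").hex())
# prints "810548656c6c6f"
print(encode_text_frame("Hello", mask_key=bytes.fromhex("37fa213d")).hex())
# prints "818537fa213d7f9f4d5158"
```

The unmasked frame adds only two bytes of header to the five-byte payload, which is where the bandwidth saving over repeated HTTP headers comes from.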
Compared with HTTP long connections, WebSocket has the following features:
- It is truly full-duplex: once the connection is established, the client and the server are complete equals, and either can send to the other without being asked. HTTP long connections are still HTTP, which follows the traditional model of the client requesting from the server.
- With an HTTP long connection, every data exchange carries a large amount of HTTP header data between server and client in addition to the real data, so information exchange is very inefficient. With WebSocket, after the connection is established through the first request, later exchanges no longer need to send HTTP headers. This clearly differs from the original HTTP protocol, so both the server and the client must be upgraded to support it (mainstream browsers already support HTML5). Multiplexing, in which different URLs reuse the same WebSocket connection, has also been proposed as an extension. HTTP long connections can do none of this.
The following compares WebSocket communication with traditional HTTP by looking at the messages exchanged between client and server:
On the client, new WebSocket instantiates a WebSocket client object that requests a server-side WebSocket URL of the form ws://yourdomain:port/path. The client WebSocket object automatically parses the URL, recognizes it as a WebSocket request, connects to the server port, and performs the handshake between the two sides. The data the client sends looks similar to this:
GET /webfin/websocket/ HTTP/1.1
Host: localhost
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: xqbt3imnzjbyqrinxeflkg==
Origin: http://localhost:8080
Sec-WebSocket-Version: 13
As can be seen, the WebSocket connection request sent by the client resembles a traditional HTTP message. The Upgrade: websocket header indicates that this is a WebSocket request. Sec-WebSocket-Key is a base64-encoded random value sent by the WebSocket client; it is a nonce rather than ciphertext. The server must return a matching Sec-WebSocket-Accept answer derived from it, otherwise the client throws an "Error during WebSocket handshake" error and closes the connection.
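The client side of this handshake can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular browser or library builds the request: `build_handshake_request` is a hypothetical helper, and the host, path, and origin simply reuse the values from the example request above.

```python
import base64
import os


def build_handshake_request(host, path, origin):
    """Return (key, request_bytes) for a WebSocket opening handshake."""
    # The key is 16 random bytes, base64-encoded; it is a nonce, not a secret.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        f"Origin: {origin}",
        "Sec-WebSocket-Version: 13",
    ]
    # Header lines end with CRLF, and a blank line terminates the request head.
    return key, ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


key, request = build_handshake_request(
    "localhost", "/webfin/websocket/", "http://localhost:8080"
)
print(request.decode("ascii"))
```

The client keeps `key` so that it can later check the server's Sec-WebSocket-Accept answer against it.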
After receiving the message, the server returns data in a similar format:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: k7djldlooiwig/mopvwfb3y3fe8=
The Sec-WebSocket-Accept value is computed by the server from the key the client sent, using an algorithm both sides agree on, and is returned to the client. The HTTP/1.1 101 Switching Protocols status line indicates that the server accepts the client's WebSocket connection. After this request-response exchange the handshake between the two ends succeeds, and subsequent communication proceeds over the TCP connection. Readers can consult the WebSocket protocol specification for the detailed format of the data exchanged between the WebSocket client and the server.
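The computation behind Sec-WebSocket-Accept is defined in RFC 6455: append a fixed GUID to the client's key, take the SHA-1 digest, and base64-encode it. A minimal sketch follows; the function name `compute_accept` is hypothetical, and the sample key and expected answer are the ones given in the RFC itself, not the values from the messages above.

```python
import base64
import hashlib

# Fixed GUID specified by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key):
    """Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

A client can run the same function on the key it sent and compare the result with the header it receives; a mismatch means the handshake must fail.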
On the development side, the WebSocket API is just as simple: instantiate a WebSocket, create the connection, and the server and the client can then send messages to and respond to each other. The detailed WebSocket API and code implementations can be found in the WebSocket implementation and case analysis section.
You can think of WebSocket as a big patch to the HTTP protocol that adds support for long connections. It has some things in common with HTTP, and it is an improved design meant to solve problems that HTTP itself cannot. In earlier HTTP, a so-called keep-alive connection means completing multiple HTTP requests over a single TCP connection, but each request still sends its own headers; so-called polling means the client (typically the browser) proactively sending HTTP requests to the server to ask whether there is new data. A shared drawback of both modes is that, besides the real data, the server and client must exchange large amounts of HTTP headers, so information exchange is very inefficient. The "long connection" they establish is a pseudo long connection. Their one benefit is that they can be implemented without modifying the existing HTTP server and browser architecture.
The first problem WebSocket solves is this: after the TCP connection is established through the first HTTP request, subsequent data exchange no longer needs to send HTTP requests, which makes the long connection a true long connection. Exchanging data without HTTP headers is clearly different from the original HTTP protocol, so both the server and the client must be upgraded to support it. On top of this, WebSocket is a full-duplex connection: data can be both sent and received on the same TCP connection. Multiplexing, in which several different URIs reuse the same WebSocket connection, has also been proposed as an extension. None of this is possible with plain HTTP.
One more technical detail, since people sometimes ask why a WebSocket connection can end up in a kind of half-dead state. This is really a consequence of some flawed design in the existing network world. WebSocket's true long connection solves the problem on both the server and the client side, but the catch is that a network application involves more than the server and the client: the other huge presence is the network link in between. An HTTP or WebSocket connection often has to pass through countless routers and firewalls. You think your data is sent over one "connection", but in fact it crosses mountains and rivers and is forwarded and filtered countless times before it finally reaches the other end. Along the way, how the intermediate nodes handle it may well surprise you.
For example, these troublesome intermediate nodes may decide that a connection that has carried no data for a while is useless, and cut it off on their own. When that happens, neither the server nor the client receives any notification. Each wishfully believes that the red thread between them is still tied, and keeps sending messages that can never reach the other shore. Because the implementation of the network protocol stack includes a layer of buffering, your program will not see any error unless those buffers fill up. In this way a perfectly good WebSocket long connection can slip into a half-dead state without anyone noticing.
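Because neither side is notified, a silently dropped connection can only be noticed through the absence of traffic. The following is a minimal sketch of such idle tracking at the application level. `HeartbeatMonitor` and the 30-second timeout are hypothetical choices for illustration, and the timestamps are passed in explicitly so that the logic can be followed without a real clock or socket.

```python
class HeartbeatMonitor:
    """Tracks when the peer was last heard from; the timeout is an assumed value."""

    def __init__(self, timeout_seconds, now):
        self.timeout_seconds = timeout_seconds
        self.last_heard = now

    def record_activity(self, now):
        # Call this whenever any frame arrives from the peer.
        self.last_heard = now

    def is_half_dead(self, now):
        # Silence longer than the timeout: assume the link was cut somewhere in between.
        return now - self.last_heard > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=30, now=0)
monitor.record_activity(now=10)
print(monitor.is_half_dead(now=35))  # prints "False" (25 seconds of silence)
print(monitor.is_half_dead(now=45))  # prints "True" (35 seconds of silence)
```

In a real program the timestamps would come from a monotonic clock, and a connection flagged this way would be closed and re-established.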
The WebSocket designers anticipated this, and the solution is to let the server and the client send Ping/Pong frames (RFC 6455, The WebSocket Protocol). Such a frame is a special kind of packet: it carries only some metadata and needs no real data payload, so it keeps the connection alive in the intermediate network without affecting the application.
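The Ping and Pong frames are small enough to build by hand. The sketch below covers only the unmasked direction (server to client); frames sent by a client must additionally be masked. The helper names `encode_control_frame` and `pong_for` are hypothetical, while the opcodes 0x9 and 0xA and the 125-byte payload limit come from RFC 6455.

```python
OPCODE_PING = 0x9
OPCODE_PONG = 0xA


def encode_control_frame(opcode, payload=b""):
    """Encode an unmasked control frame (FIN=1), as a server would send it."""
    if len(payload) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    return bytes([0x80 | opcode, len(payload)]) + payload


def pong_for(ping_frame):
    """Build the pong answering an unmasked ping: same payload, opcode 0xA."""
    if ping_frame[0] != (0x80 | OPCODE_PING):
        raise ValueError("not a ping frame")
    length = ping_frame[1] & 0x7F
    return encode_control_frame(OPCODE_PONG, ping_frame[2:2 + length])


ping = encode_control_frame(OPCODE_PING)
print(ping.hex())            # prints "8900"
print(pong_for(ping).hex())  # prints "8a00"
```

An empty ping is just two bytes on the wire, so sending one periodically is cheap compared with the cost of a connection that has silently died.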