Summary:
1. Persistent-connection mechanisms: distinguishing WebSocket, HTTP/2, and SSE:
HTTP/2 introduced Server Push, which lets the server proactively send data to the client's cache. However, it does not allow data to be pushed to the client application itself. Server pushes are handled only by the browser and never surface in application code, so there is no API through which the application can be notified of them.
SSE (Server-Sent Events) implements a one-way push from server to client. It is built on HTTP, and data flows in one direction only.
WebSocket establishes full-duplex communication between client and server over a single TCP connection.
This is the fifth chapter of the How JavaScript Works series.
This time we dive into the world of communication protocols, mapping out and discussing their features and inner workings. We give a quick comparison of WebSockets and HTTP/2, and at the end of the article we share some thoughts on how to choose the right network protocol.
Brief introduction
Today's complex web applications are rich in functionality thanks to the dynamic interactivity of web pages. That is not surprising: a long time has passed since the Internet was born.
Initially, the Internet was not built to support such dynamic and complex web applications. It was conceived as a collection of HTML pages, each linking to others, forming a "web" of information. Everything was built around the so-called HTTP request/response paradigm: the client loads a page, and nothing more happens until the user clicks and navigates to the next page.
Around 2005, AJAX was introduced, and many people began exploring ways to make communication between client and server bidirectional. Still, all HTTP communication was steered by the client, which required either user interaction or periodic polling to load new data from the server.
Making HTTP bidirectional
Technologies that let the server proactively push data to the client have been around for quite some time, for example "Push" and "Comet".
One of the most common hacks for creating the illusion that the server is sending data to the client is long polling. With long polling, the client opens an HTTP connection to the server, and the server keeps it open until it has a response to send. Whenever the server has new data, it sends it as the response.
Let's take a look at a simple long polling code snippet:
(function poll() {
  setTimeout(function() {
    $.ajax({
      url: 'https://api.example.com/endpoint',
      success: function(data) {
        // Do something with `data`
        // ...

        // Set up the next poll recursively
        poll();
      },
      dataType: 'json'
    });
  }, 10000);
})();
This is basically a self-executing function that runs once automatically. It schedules a request ten seconds out, and once each asynchronous Ajax call to the server completes, the callback calls poll again to schedule the next one.
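The snippet above polls on a timer. In long polling proper, the server holds each request open until it has data, and the client reissues the request as soon as a response arrives. A minimal sketch of that loop follows. The names `startLongPolling`, `sendRequest` and `maxRounds` are hypothetical and introduced only for illustration, and the transport is injected so that the loop can run without a real server.

```javascript
// Hypothetical helper: keeps exactly one request outstanding at a time.
// `sendRequest(onResponse)` is an assumed transport function that calls
// `onResponse(data)` once the server finally answers.
function startLongPolling(sendRequest, onData, maxRounds) {
  let rounds = 0;

  function poll() {
    if (rounds >= maxRounds) {
      return; // stop condition so that the example terminates
    }
    rounds += 1;
    sendRequest(function (data) {
      onData(data); // hand the payload to the application
      poll();       // immediately open the next request
    });
  }

  poll();
}

// Usage with a fake transport that answers right away.
const received = [];
let counter = 0;
startLongPolling(
  function fakeRequest(onResponse) {
    counter += 1;
    onResponse('update ' + counter);
  },
  function (data) {
    received.push(data);
  },
  3
);
console.log(received); // [ 'update 1', 'update 2', 'update 3' ]
```

In a real page, `sendRequest` would wrap an Ajax call such as the `$.ajax` request shown earlier, and the stop condition would usually be replaced by error handling and a retry delay.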
Other techniques involved Flash, XHR multipart requests, and so-called htmlfiles.
All of these workarounds share the same problem: they carry the overhead of HTTP, which makes them unsuitable for low-latency applications. Think of a multiplayer first-person shooter in the browser, or any other online game with a real-time component.
The advent of WebSockets
The WebSocket specification defines an API for establishing "socket" connections (on top of TCP) between a web browser and a server. In plain words: there is a persistent connection between the client and the server, and both parties can start sending data at any time.
The client establishes a WebSocket connection through a process known as the WebSocket handshake. The process starts with the client sending a regular HTTP request to the server. The request includes an Upgrade header that tells the server that the client wants to establish a WebSocket connection.
Let's look at how to create a WebSocket connection on the client:
// Create a new WebSocket connection (unencrypted, because the URL uses ws://)
var socket = new WebSocket('ws://websocket.example.com');
WebSocket URLs use the ws scheme. There is also wss for secure WebSocket connections, the equivalent of HTTPS.
Creating the socket only starts the process of opening a WebSocket connection to websocket.example.com.
The following is a simplified example of the initial request headers.
GET ws://websocket.example.com/ HTTP/1.1
Origin: http://example.com
Connection: Upgrade
Host: websocket.example.com
Upgrade: websocket
If the server supports the WebSocket protocol, it agrees to the upgrade and communicates this through the Upgrade header in its response.
Let's look at how this can be implemented in Node.js:
// We'll be using https://github.com/theturtle32/WebSocket-Node for the WebSocket implementation
var WebSocketServer = require('websocket').server;
var http = require('http');

var server = http.createServer(function(request, response) {
  // Process HTTP requests
});
server.listen(1337, function() { });

// Create the server
var wsServer = new WebSocketServer({
  httpServer: server
});

// WebSocket server
wsServer.on('request', function(request) {
  var connection = request.accept(null, request.origin);

  // This is the most important callback:
  // all messages coming from users are handled here
  connection.on('message', function(message) {
    // Process the WebSocket message
  });

  connection.on('close', function(reasonCode, description) {
    // The connection is closed
  });
});
The server then answers the request with the upgrade response:
HTTP/1.1 101 Switching Protocols
Date: Wed, 25 Oct 2017 10:07:34 GMT
Connection: Upgrade
Upgrade: WebSocket
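The simplified headers above leave out one part of the real handshake: under RFC 6455 the client also sends a Sec-WebSocket-Key header, and the server proves that it understood the request by returning a Sec-WebSocket-Accept value derived from it. The sketch below shows that derivation with Node's built-in crypto module. The helper names `computeAcceptValue` and `buildUpgradeResponse` are hypothetical, and libraries such as WebSocket-Node do this for you.

```javascript
const nodeCrypto = require('crypto');

// Fixed GUID that RFC 6455 tells servers to append to the client's key.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Returns the value for the Sec-WebSocket-Accept response header:
// base64(SHA-1(key + GUID)).
function computeAcceptValue(secWebSocketKey) {
  return nodeCrypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Hypothetical helper: builds the raw 101 response for an upgrade request,
// or returns null when the request does not ask for a WebSocket upgrade.
// `headers` is assumed to use lower-cased header names, as Node's http
// module provides them.
function buildUpgradeResponse(headers) {
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  const connection = (headers['connection'] || '').toLowerCase();
  const key = headers['sec-websocket-key'];
  if (upgrade !== 'websocket' || !connection.includes('upgrade') || !key) {
    return null;
  }
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    'Sec-WebSocket-Accept: ' + computeAcceptValue(key),
    '',
    ''
  ].join('\r\n');
}

// The sample key below is the one used as an example in RFC 6455.
const response = buildUpgradeResponse({
  host: 'websocket.example.com',
  upgrade: 'websocket',
  connection: 'Upgrade',
  'sec-websocket-key': 'dGhlIHNhbXBsZSBub25jZQ=='
});
console.log(response);
```

The accept value lets the client confirm that the server really speaks WebSocket, rather than being an HTTP server or cache that echoed the request.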
Once the connection has been established, the open event is fired on the client's WebSocket instance.
var socket = new WebSocket('ws://websocket.example.com');

// Log a message once the WebSocket connection is open
socket.onopen = function(event) {
  console.log('WebSocket is connected.');
};
Now that the handshake is complete, the initial HTTP connection is replaced by a WebSocket connection that uses the same underlying TCP/IP connection. From this point on, either party can start sending data.
With WebSockets, you can transfer as much data as you like without incurring the overhead of traditional HTTP requests. Data is transferred through a WebSocket as messages, each of which consists of one or more frames containing the data you are sending (the payload). To ensure that the message can be properly reconstructed when it reaches the client, each frame is prefixed with a short header describing the payload (2 to 14 bytes in the frame layout covered later in this article). This frame-based messaging system reduces the amount of non-payload data that is transferred, which significantly reduces latency.
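To make the per-frame overhead concrete, here is a small sketch that computes the header size for a given payload, assuming the frame layout of RFC 6455 (a 2-byte base header, an optional 2- or 8-byte extended length, and a 4-byte masking key on client-to-server frames). The function name `frameHeaderSize` is hypothetical.

```javascript
// Size in bytes of the header that precedes a payload of `payloadLength` bytes.
function frameHeaderSize(payloadLength, masked) {
  let size = 2; // fin/rsv/opcode byte + mask bit and 7-bit length byte
  if (payloadLength > 65535) {
    size += 8; // 64-bit extended payload length
  } else if (payloadLength > 125) {
    size += 2; // 16-bit extended payload length
  }
  if (masked) {
    size += 4; // masking key, present on client-to-server frames
  }
  return size;
}

console.log(frameHeaderSize(11, true));    // 6
console.log(frameHeaderSize(200, false));  // 4
console.log(frameHeaderSize(70000, true)); // 14
```

Compare this with a typical HTTP request, where the headers alone often run to several hundred bytes.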
NOTE: The client is notified about a new message only once all of its frames have been received and the original message payload has been reconstructed.
WebSocket URLs
We briefly mentioned earlier that WebSockets introduce a new URL scheme. In reality, they introduce two new schemes: ws:// and wss://.
URLs have scheme-specific grammar. WebSocket URLs are special in that they do not support anchors (#sample_anchor).
Otherwise, the same rules apply to WebSocket URLs as to HTTP URLs. ws is unencrypted and defaults to port 80, while wss requires TLS encryption and defaults to port 443.
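These rules can be checked with the standard URL class, which is available as a global in both browsers and Node.js. The following is a sketch only; the helper name `describeWebSocketUrl` is hypothetical.

```javascript
// Hypothetical helper: validates a WebSocket URL and reports the port
// that a connection to it would use.
function describeWebSocketUrl(address) {
  const url = new URL(address);
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:') {
    throw new Error('Not a WebSocket URL: ' + address);
  }
  if (url.hash !== '') {
    throw new Error('WebSocket URLs must not contain an anchor: ' + address);
  }
  const secure = url.protocol === 'wss:';
  // `url.port` is an empty string when the scheme's default port is used.
  const port = url.port === '' ? (secure ? 443 : 80) : Number(url.port);
  return { secure: secure, host: url.hostname, port: port };
}

console.log(describeWebSocketUrl('ws://websocket.example.com/'));
// { secure: false, host: 'websocket.example.com', port: 80 }
console.log(describeWebSocketUrl('wss://websocket.example.com/'));
// { secure: true, host: 'websocket.example.com', port: 443 }
```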
Framing protocol
Let's take a closer look at the framing protocol. This is what the RFC provides:
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-------+-+-------------+-------------------------------+
 |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
 |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
 |N|V|V|V|       |S|             |   (if payload len==126/127)   |
 | |1|2|3|       |K|             |                               |
 +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
 |     Extended payload length continued, if payload len == 127  |
 + - - - - - - - - - - - - - - - +-------------------------------+
 |                               |Masking-key, if MASK set to 1  |
 +-------------------------------+-------------------------------+
 | Masking-key (continued)       |          Payload Data         |
 +-------------------------------- - - - - - - - - - - - - - - - +
 :                     Payload Data continued ...                :
 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
 |                     Payload Data continued ...                |
 +---------------------------------------------------------------+
As of the WebSocket version specified by the RFC, there is only a header in front of each packet. It is quite a complex header, however. Here are its building blocks:
fin (1 bit): indicates whether this frame is the final frame of a message. Most of the time a message fits into a single frame, so this bit is usually set. Experiments show that Firefox starts a second frame after 32K.
rsv1, rsv2, rsv3 (1 bit each): must be 0 unless an extension has been negotiated that defines the meaning of non-zero values. If a non-zero value is received and no negotiated extension defines its meaning, the receiving endpoint must fail the connection.
opcode (4 bits): says what the frame represents. The following values are currently in use:
- 0x00: the frame continues the payload of the previous frame.
- 0x01: the frame contains text data.
- 0x02: the frame contains binary data.
- 0x08: the frame terminates the connection.
- 0x09: the frame is a ping.
- 0x0a: the frame is a pong.
(As you can see, plenty of values are unused; they are reserved for future use.)
mask (1 bit): indicates whether the frame's payload is masked. As it stands, every message sent from the client to the server must be masked, and the specification requires the connection to be terminated if it is not.
payload_len (7 bits): the length of the payload. WebSocket frames come in the following length brackets:
- 0-125 gives the payload length directly.
- 126 means that the next 2 bytes hold the payload length.
- 127 means that the next 8 bytes hold the payload length.
So the length of a payload is encoded in roughly 7, 16, or 64 bits.
masking-key (32 bits): every frame sent from the client to the server is masked with a 32-bit value contained in the frame.
payload: the actual data, which in most cases is masked. Its length is the one given by payload_len.
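The fields described above can be read out of a raw frame with a few bit operations. The sketch below decodes one complete frame held in a Node.js Buffer. It assumes the whole frame has already arrived and does not validate reserved bits or opcodes. The function name `decodeFrame` is hypothetical, and the sample bytes are the masked "Hello" text frame given as an example in RFC 6455.

```javascript
// Decodes a single, complete WebSocket frame held in a Buffer.
function decodeFrame(buffer) {
  const fin = (buffer[0] & 0x80) !== 0;    // first bit of byte 0
  const opcode = buffer[0] & 0x0f;         // low 4 bits of byte 0
  const masked = (buffer[1] & 0x80) !== 0; // first bit of byte 1
  let payloadLength = buffer[1] & 0x7f;    // low 7 bits of byte 1
  let offset = 2;

  if (payloadLength === 126) {
    payloadLength = buffer.readUInt16BE(offset);
    offset += 2;
  } else if (payloadLength === 127) {
    // Lengths beyond Number.MAX_SAFE_INTEGER are not handled in this sketch.
    payloadLength = Number(buffer.readBigUInt64BE(offset));
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    maskingKey = buffer.subarray(offset, offset + 4);
    offset += 4;
  }

  // Copy the payload so that unmasking does not modify the input buffer.
  const payload = Buffer.from(buffer.subarray(offset, offset + payloadLength));
  if (masked) {
    for (let i = 0; i < payload.length; i++) {
      payload[i] ^= maskingKey[i % 4]; // XOR with the repeating 4-byte key
    }
  }
  return { fin, opcode, masked, payloadLength, payload };
}

const frame = decodeFrame(
  Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58])
);
console.log(frame.opcode, frame.payload.toString('utf8')); // prints: 1 Hello
```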
Why are WebSockets frame-based rather than stream-based? I don't know, and just like you I would love to learn more, so if you have any ideas, feel free to add comments and resources in the comment section below. There is also a good discussion of the topic on Hacker News.
Frame data
As mentioned earlier, the data can be split across multiple frames. The first frame that carries the data has an opcode that indicates what kind of data is being transmitted. This is necessary because JavaScript had practically no support for binary data when the specification was started. 0x01 indicates UTF-8 encoded text data, while 0x02 indicates binary data. Most people transmit JSON, in which case the text opcode is the natural choice. Binary data is exposed to the application as a browser-specific Blob.
The API to transfer data via WebSocket is very simple:
var socket = new WebSocket('ws://websocket.example.com');

socket.onopen = function(event) {
  socket.send('Some message'); // Send data to the server
};
When the WebSocket receives data (on the client side), a message event is fired. The event carries a data property that holds the contents of the message.
// Handle messages sent by the server
socket.onmessage = function(event) {
  var message = event.data;
  console.log(message);
};
You can easily inspect the data in each frame of a WebSocket connection using the Network tab in Chrome DevTools.
Fragmentation
Payload data can be split into multiple individual frames. The receiving end buffers them until the fin bit is set. So you could transmit the string "Hello world" in 11 frames of 6 (header length) + 1 bytes each. Fragmentation is not allowed for control frames. However, the specification wants you to be able to handle control frames interleaved between the fragments of a message, so that a ping or a close is not held up behind a large fragmented message.
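As an illustration of the "Hello world" example, the sketch below splits a text message into frames of a chosen payload size. Frames are represented as plain objects rather than serialized bytes, and the function name `fragmentText` is hypothetical.

```javascript
// Splits a text message into frames of at most `chunkSize` payload bytes.
// Each frame is a plain object: { fin, opcode, payload }.
// Note: an empty string yields no frames in this simplified sketch.
function fragmentText(text, chunkSize) {
  const data = Buffer.from(text, 'utf8');
  const frames = [];
  for (let start = 0; start < data.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, data.length);
    frames.push({
      fin: end === data.length,        // set only on the last frame
      opcode: start === 0 ? 0x1 : 0x0, // text first, continuation afterwards
      payload: data.subarray(start, end)
    });
  }
  return frames;
}

const frames = fragmentText('Hello world', 1);
console.log(frames.length);                           // 11
console.log(frames[0].opcode, frames[0].fin);         // 1 false
console.log(frames[10].opcode, frames[10].fin);       // 0 true
```

With a 6-byte header on each client-to-server frame, those 11 frames add up to 77 bytes on the wire, compared with 17 bytes for the same text sent as a single frame.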
The logic for joining frames is roughly this:
- Receive the first frame
- Remember its opcode
- Concatenate the frame payloads until the fin bit is set
- Assert that the opcode of every subsequent frame is zero
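The steps above can be sketched as a small assembler. Frames are assumed to be already-decoded objects of the form { fin, opcode, payload }, with `payload` as a Buffer. The names `createAssembler`, `onMessage` and `onControl` are hypothetical. Control frames (opcode 0x8 and above) are passed through without disturbing the message being assembled.

```javascript
// Reassembles data messages from a sequence of decoded frames.
function createAssembler(onMessage, onControl) {
  let firstOpcode = null; // opcode remembered from the first frame
  let parts = [];

  return function push(frame) {
    if (frame.opcode >= 0x8) {
      onControl(frame); // ping, pong or close arriving between fragments
      return;
    }
    if (firstOpcode === null) {
      if (frame.opcode === 0x0) {
        throw new Error('Continuation frame without a preceding first frame');
      }
      firstOpcode = frame.opcode;
    } else if (frame.opcode !== 0x0) {
      throw new Error('Expected a continuation frame (opcode 0)');
    }
    parts.push(frame.payload);
    if (frame.fin) {
      onMessage(firstOpcode, Buffer.concat(parts));
      firstOpcode = null;
      parts = [];
    }
  };
}

const messages = [];
const controls = [];
const push = createAssembler(
  (opcode, data) => messages.push({ opcode, text: data.toString('utf8') }),
  (frame) => controls.push(frame.opcode)
);

push({ fin: false, opcode: 0x1, payload: Buffer.from('Hel') });
push({ fin: true, opcode: 0x9, payload: Buffer.alloc(0) }); // interleaved ping
push({ fin: true, opcode: 0x0, payload: Buffer.from('lo') });
console.log(messages); // [ { opcode: 1, text: 'Hello' } ]
console.log(controls); // [ 9 ]
```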
The primary purpose of fragmentation is to allow sending a message whose size is unknown when the message is started. With fragmentation, the server can choose a reasonably sized buffer and write out a fragment to the network whenever the buffer is full. A secondary use case is multiplexing, where it is undesirable for a large message on one logical channel to monopolize the whole output channel; multiplexing needs to be free to split the message into smaller fragments so that the output channel is shared more fairly.
Heartbeating
At any point after the handshake, either the client or the server can choose to send a ping to the other party. When a ping is received, the recipient must send back a pong as soon as possible. That is a heartbeat. You can use it to make sure that the client is still connected.
A ping or pong is just a regular frame, but it is a control frame. Pings have an opcode of 0x9, and pongs have an opcode of 0xA. When you get a ping, send back a pong with exactly the same payload data as the ping (the maximum payload length for pings and pongs is 125 bytes). You might also get a pong without ever having sent a ping. If that happens, ignore it.
Heartbeating can be very useful. Some services, such as load balancers, terminate idle connections. In addition, the receiving side cannot see whether the remote side has gone away; only the next send would reveal that something went wrong.
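To show what a ping and its matching pong look like on the wire, here is a sketch that builds unmasked control frames, as a server would send them (frames sent by a client would additionally be masked). The helper names `buildControlFrame` and `pongFor` are hypothetical.

```javascript
const OPCODE_PING = 0x9;
const OPCODE_PONG = 0xa;

// Builds an unmasked control frame from an opcode and a Buffer payload.
function buildControlFrame(opcode, payload) {
  if (payload.length > 125) {
    throw new Error('Control frame payload must not exceed 125 bytes');
  }
  // 0x80 sets the fin bit: control frames cannot be fragmented.
  return Buffer.concat([Buffer.from([0x80 | opcode, payload.length]), payload]);
}

// Answers a ping with a pong that echoes the ping's payload.
function pongFor(pingPayload) {
  return buildControlFrame(OPCODE_PONG, pingPayload);
}

const ping = buildControlFrame(OPCODE_PING, Buffer.from('Hello'));
const pong = pongFor(Buffer.from('Hello'));
console.log(ping.toString('hex')); // 890548656c6c6f
console.log(pong.toString('hex')); // 8a0548656c6c6f
```

Browser code never builds these frames itself: the browser answers pings automatically, and the WebSocket API does not expose ping or pong to scripts.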
Error handling
You can handle any errors that occur by listening for the error event.
It looks like this:
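var socket = new WebSocket('ws://websocket.example.com');

// Handle any errors that occur
socket.onerror = function(error) {
  // Pass the event as a separate argument so that it is not printed as "[object Event]"
  console.log('WebSocket Error:', error);
};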
Close connection
To close a connection, either the client or the server sends a control frame with opcode 0x8. Upon receiving such a frame, the other peer sends a Close frame in response. The first peer then closes the connection. Any further data received after the connection is closed is discarded.
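For illustration, here is what such a Close frame can look like in bytes. Under RFC 6455 the frame may carry a body made of a 2-byte status code followed by a UTF-8 reason, and 1000 is the code defined for a normal closure; neither detail is covered in the text above, so treat this as a sketch under those assumptions. The helper name `buildCloseFrame` is hypothetical, and the frame is unmasked, as a server would send it.

```javascript
// Builds an unmasked Close frame (opcode 0x8) with a status code and reason.
function buildCloseFrame(statusCode, reason) {
  const reasonBytes = Buffer.from(reason || '', 'utf8');
  const body = Buffer.alloc(2 + reasonBytes.length);
  body.writeUInt16BE(statusCode, 0); // status code, big-endian
  reasonBytes.copy(body, 2);         // optional human-readable reason
  if (body.length > 125) {
    throw new Error('Close frame payload must not exceed 125 bytes');
  }
  // 0x88 = fin bit (0x80) + Close opcode (0x8)
  return Buffer.concat([Buffer.from([0x88, body.length]), body]);
}

console.log(buildCloseFrame(1000, '').toString('hex'));    // 880203e8
console.log(buildCloseFrame(1000, 'bye').toString('hex')); // 880503e8627965
```

In the browser the same information is supplied through the optional arguments of `socket.close(code, reason)`.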
This is how you initiate the closing of a WebSocket connection from the client:
// Close the connection if it is open
if (socket.readyState === WebSocket.OPEN) {
  socket.close();
}
Also, to perform any cleanup after the connection has closed, you can attach an event listener for the close event:
// Do the necessary cleanup
socket.onclose = function(event) {
  console.log('Disconnected from WebSocket.');
};
The server has to listen for the close event as well, in order to process it if needed:
connection.on('close', function(reasonCode, description) {
  // The connection is being closed
});
Comparison of WebSockets and HTTP/2
While HTTP/2 has a lot to offer, it does not completely replace the need for existing push/streaming technologies.
The first important thing to notice about HTTP/2 is that it is not a replacement for all of HTTP. The verbs, status codes, and most of the headers remain the same as today. HTTP/2 is about improving the efficiency of how data is transferred on the wire.
If we compare WebSocket with HTTP/2, we find quite a few similarities.
As noted earlier, HTTP/2 introduces Server Push, which lets the server proactively send resources to the client's cache. However, it does not allow data to be pushed to the client application itself. Server pushes are processed only by the browser and never surface in application code, which means there is no API through which the application can be notified of them.
This is where Server-Sent Events (SSE) come in handy. SSE is a mechanism that allows the server to asynchronously push data to the client once the client-server connection is established. The server can then send data whenever a new chunk is available. It can be thought of as a one-way publish-subscribe model. It also offers a standard JavaScript client API named EventSource, implemented in most modern browsers as part of the HTML5 standard by the W3C. Browsers that do not support the EventSource API natively can be polyfilled.
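On the wire, SSE is a plain text stream served with the text/event-stream content type, in which each event is a group of "field: value" lines terminated by a blank line. The sketch below serializes one event in that format. The function name `formatSseEvent` and the event name `price` are hypothetical.

```javascript
// Serializes one event in the text/event-stream format used by SSE.
// `event` and `id` are optional. Multi-line data becomes several
// "data:" lines, and a blank line terminates the event.
function formatSseEvent({ event, id, data }) {
  const lines = [];
  if (event !== undefined) lines.push('event: ' + event);
  if (id !== undefined) lines.push('id: ' + id);
  for (const line of String(data).split('\n')) {
    lines.push('data: ' + line);
  }
  return lines.join('\n') + '\n\n';
}

console.log(JSON.stringify(formatSseEvent({ event: 'price', data: '42' })));
// "event: price\ndata: 42\n\n"
```

A server would write such strings to a response that it keeps open. In the browser, `new EventSource(url)` consumes the stream, and a listener registered with `addEventListener('price', handler)` receives the payload in `event.data`.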
Since SSE is based on HTTP, it fits naturally with HTTP/2, and the two can be combined to get the best of both: HTTP/2 provides an efficient transport layer based on multiplexed streams, and SSE provides the API that lets applications receive pushes from the server.
To fully understand what streams and multiplexing are about, let's first look at the IETF definition: a "stream" is an independent, bidirectional sequence of frames exchanged between the client and the server within an HTTP/2 connection. One of its main characteristics is that a single HTTP/2 connection can contain multiple concurrently open streams, with either endpoint interleaving frames from multiple streams.
Remember that SSE is HTTP-based. This means that with HTTP/2, not only can several SSE streams be interleaved onto a single TCP connection, but the same can be done with a combination of several SSE streams (server-to-client push) and several client requests (client-to-server). Thanks to HTTP/2 and SSE, we now have a pure HTTP bidirectional connection with a simple API that lets application code register for server pushes. The lack of bidirectional capability has often been seen as SSE's major drawback compared to WebSocket. Thanks to HTTP/2, this is no longer the case, which opens up the opportunity to stick with HTTP-based signaling and skip WebSockets.
Usage scenarios for WebSocket and HTTP/2
WebSockets will certainly survive the dominance of HTTP/2 + SSE, mainly because the technology is already well adopted and because, in very specific use cases, it has an advantage over HTTP/2: it was built for bidirectional communication from the start and carries less overhead (headers, for example).
Say you want to build a massively multiplayer online game that needs a huge number of messages from both ends of the connection. In such a case, WebSockets will perform much better.
In general, use WebSockets whenever you need a truly low-latency, near real-time connection between the client and the server. Keep in mind that this may require rethinking how you build your server-side applications, as well as shifting the focus to technologies such as event queues.
If your use case calls for displaying real-time market news, market data, chat applications, and so on, HTTP/2 + SSE gives you an efficient bidirectional communication channel while you keep the benefits of staying in the HTTP world:
- WebSockets can often be a source of pain when it comes to compatibility with existing web infrastructure, because they upgrade an HTTP connection to a completely different protocol that has nothing to do with HTTP.
- Scale and security: web components (firewalls, intrusion detection, load balancers) are built, maintained, and configured with HTTP in mind, an environment that large or critical applications prefer for its resilience, security, and scalability.
You also have to take browser compatibility into account. For WebSockets, compatibility is good.
The situation with HTTP/2, however, is not as rosy:
- TLS-only (which is not so bad)
- Partial support in IE 11, and only on Windows 10
- Supported in Safari only on OSX 10.11+
- HTTP/2 is supported only if it can be negotiated via ALPN (which the server needs to support explicitly)
SSE support is better:
Only IE/Edge lack support. (Opera Mini supports neither SSE nor WebSockets, so we can leave it out of the equation.) There are some decent polyfills for SSE support in IE/Edge.
How JavaScript Works (5): a deep dive into WebSockets and HTTP/2 with SSE, and how to pick the right path