WebSocket and Reverse Proxying in Practice

Source: Internet
Author: User
Tags: rfc, socket, node, server, haproxy, nodejs, websocket


Directory

What is WebSocket
Why use WebSocket
Performance analysis
Building a WebSocket server and proxies
Proxy performance testing
WebSocket and HTTP
What is WebSocket

WebSocket is a full-duplex communication protocol built on a single TCP connection. It was initially defined as part of the HTML5 effort, but it is independent of the HTML5 specification and is standardized by RFC 6455; it is still habitually called "HTML5 WebSocket".

WebSocket works on HTTP ports 80 and 443 and uses the ws:// or wss:// (over TLS) prefix as its scheme, but in practice the protocol has little to do with HTTP; see RFC 6455 Section 1.7. The connection is established through an HTTP/1.1 handshake that switches protocols with the 101 status code, and the current standard does not support establishing a WebSocket connection without HTTP (see StackOverflow). A handshake request looks like this:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com
The server returns the response:


HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat

Why use WebSocket?

Basic HTTP is a half-duplex request-response pattern: the client must initiate a request before the server can return a response. The original design considered neither the server sending unsolicited data nor real-time updates. Before WebSocket, the earliest and crudest way to keep data fresh and let the server push to the client was to have the client send requests continuously (that is, polling). Later came the Comet model and SSE (Server-Sent Events). Comet is a class of techniques (a model rather than a standard) that lets the server push data without an explicit browser request, mostly by delaying the completion of the HTTP response; it can be implemented through long polling or streaming.

Polling: the client initiates a request periodically, and the server returns a response immediately whether or not the requested content exists. Polling is inefficient for content that appears at unpredictable times.
Long polling: implemented with an inserted <script> tag or with XMLHttpRequest (XHR). The client sends a request, the server holds the connection open for some time and returns as soon as the desired content appears within that window; if it does not appear, the server closes the request. When message volume is high, however, long polling offers no performance gain over plain polling, and can even be worse because of the cost of maintaining long-lived connections.
Streaming: implemented with a hidden iframe or XHR. The client sends a request, and the server keeps the connection open and continuously sends back responses; for example, Google Talk used an embedded hidden iframe, and Gmail used XHR.
But all of these techniques run over HTTP, and the HTTP/1.1 standard (RFC 7230) recommends that clients not open too many connections. HTTP-based transport tricks are not only complex and expensive (frequent TCP handshakes and repeated HTTP headers) but remain constrained by HTTP's one-way request-response model and connection limits.

In a nutshell, HTTP was not designed for real-time full-duplex communication, and simulating full duplex on top of HTTP is too heavyweight. To improve performance, a protocol independent of HTTP is necessary, and an independent protocol can shed these limitations entirely.

Performance analysis

As mentioned earlier, the problem with implementing continuous two-way communication over HTTP is the headers: sometimes the header is larger than the message itself. When WebSocket was first introduced, it claimed as little as 2 bytes of framing overhead and a reduction of latency to one third of HTTP's. A big reason is that WebSocket's per-frame overhead is very small; the overhead by payload size is shown in the following table:

Payload size    Client-to-server    Server-to-client
< 126 bytes     6 bytes             2 bytes
< 64 KiB        8 bytes             4 bytes
< 2^63 bytes    14 bytes            10 bytes
In addition, WebSocket allows binary payloads to be transferred directly, reducing transcoding costs.
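The table can be reproduced from the framing rules in RFC 6455 Section 5.2: a 2-byte base header, an extended length field of 2 or 8 bytes for larger payloads, and a 4-byte masking key that only client-to-server frames carry. A small sketch:

```javascript
// Frame header size in bytes for a given payload length, following
// the framing rules in RFC 6455 Section 5.2.
function frameHeaderSize(payloadLength, masked) {
  var size = 2; // FIN/opcode byte + mask-bit/length byte
  if (payloadLength >= 126 && payloadLength <= 0xffff) {
    size += 2; // 16-bit extended payload length
  } else if (payloadLength > 0xffff) {
    size += 8; // 64-bit extended payload length
  }
  return size + (masked ? 4 : 0); // client frames carry a 4-byte mask
}

console.log(frameHeaderSize(100, true));     // small client frame: 6
console.log(frameHeaderSize(100, false));    // small server frame: 2
console.log(frameHeaderSize(50000, true));   // 8
console.log(frameHeaderSize(50000, false));  // 4
console.log(frameHeaderSize(100000, false)); // 10
```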

The performance-cost analysis above draws on http://tavendo.com/blog/post/dissecting-websocket-overhead/

As you can see, the general conclusion is that the overhead of TCP itself is much larger than the overhead WebSocket adds.

In comparison tests of WebSocket against Comet, a scenario with a large number of users sending very small messages shows WebSocket dramatically reducing unnecessary network traffic and improving bandwidth utilization relative to polling:


Nginx has published a performance test of WebSocket proxying: NGINX WebSocket Performance

Building a WebSocket server and proxies

On the server side, the WebSocket API is specified by the W3C, but few general-purpose web servers implement WebSocket directly. Since WebSocket is more a tool than an end in itself, a bare socket server has little practical use on its own; what exists instead are WebSocket libraries for back-end languages, such as Socket.IO and ws for Node.js and Tornado for Python.

WebSocket is negotiated by sending an Upgrade header, but both Upgrade and Connection are hop-by-hop headers: they are valid for a single hop only and are not passed through to the origin. A proxy normally strips these headers, so some setup is needed on the reverse proxy.

A practical example on the Nginx blog: Using Nginx as a WebSocket Proxy

Forward proxy: in forward-proxy mode, the client is aware of the proxy and uses the CONNECT method to ask the proxy to open a tunnel to the origin, which avoids the problem:
CONNECT example.com:80 HTTP/1.1
Host: example.com

Reverse proxy: in reverse-proxy mode, the client is unaware of the proxy. Since version 1.3.13, Nginx uses a special mode: when the origin answers with HTTP 101, Nginx maintains the tunnel between client and server. Because the hop-by-hop Upgrade headers are not passed to the backend, you need to add the headers manually:
location /chat/ {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}


A more elegant solution uses the map directive from ngx_http_map_module, which derives a new variable from existing ones. The following configuration decides which Connection header to pass to the origin based on whether the client sent an Upgrade header:
http {
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    server {
        ...

        location /socket/ {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
        }
    }
}


By default, the connection is closed after 60 seconds with no data transfer. The proxy_read_timeout parameter can extend this window; alternatively, the origin can send ping frames periodically to keep the connection alive and confirm it is still in use.
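Such a server-side keepalive loop can be sketched as follows. The `clients`/`ping()` shape mirrors the ws library used later in this article but is an assumption here, so the loop is exercised with a stub rather than a real server:

```javascript
// Send a ping frame on every open connection at an interval shorter
// than the proxy's read timeout, so the proxy never sees an idle link.
function startKeepalive(wss, intervalMs) {
  return setInterval(function () {
    wss.clients.forEach(function (ws) {
      if (ws.readyState === 1 /* OPEN */) ws.ping();
    });
  }, intervalMs);
}

// Stub standing in for a real WebSocket server, so the loop can run:
var pinged = 0;
var fakeServer = {
  clients: [{ readyState: 1, ping: function () { pinged++; } }]
};
var timer = startKeepalive(fakeServer, 10);
setTimeout(function () {
  clearInterval(timer);
  console.log('pings sent:', pinged > 0);
}, 50);
```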
Reverse-proxying WebSocket with Nginx

As mentioned, WebSocket is a tool protocol, and its open-source implementations are mostly built on back-end languages such as Python and Node.js. For the choice of server and client, the Nginx website has an example, and we roughly follow that article's demonstration for this test. Note that because of forks in the Node.js community, the version numbering has a historically large gap, and Node.js itself iterates quickly; the Node.js version in Debian 8's default repositories is too old for this test, while the mainstream LTS line is already at 4.x. We therefore use the officially recommended method to install Node.js 4.x:

curl -sL https://deb.nodesource.com/setup_4.x | sudo -E bash -
apt-get update && apt-get install -y nodejs
Then run npm install -g ws wscat to install ws and wscat: ws is a Node.js WebSocket implementation, which we use to build a simple WebSocket echo server; wscat is an executable WebSocket client, which we use to check that the service works.

After installation, implement a simple server (server.js):

console.log('Server started');
var WebSocketServer = require('ws').Server
  , wss = new WebSocketServer({ port: 8010 });
wss.on('connection', function (ws) {
    ws.on('message', function (message) {
        console.log('Received from client: %s', message);
        ws.send('Server received from client: ' + message);
    });
});


This simple server echoes back whatever message the client sends. Run node server.js to start the echo service.

Use wscat --connect ws://127.0.0.1:8010 and type anything to test; getting the same content back means the service is working. Then use the Nginx configuration above to reverse-proxy this simple server, with the proxy listening on port 8020.

The common stress-testing tool Apache JMeter can test WebSocket with a plugin, but the plugin is old and JMeter itself is cumbersome to use. Among lightweight, ab-like tools we found Thor and artillery, both written in Node.js; nginx.com used Thor in its test.

Thor:

Can define the degree of concurrency
Output is simple and straightforward
WebSocket only; no HTTP features

Artillery:

Can customize payloads and round-robin across multiple weighted backends
Cannot define the degree of concurrency
Can also test HTTP, and supports cookies
Test with artillery quick --duration [seconds] --rate [number] ws://127.0.0.1:8020 or thor --amount 10000 --concurrent [number] ws://127.0.0.1:8020. The payloads the two tools send in their simple modes differ, so their results cannot be compared directly.

Because WebSocket holds each connection open, the test ends once all ports on a network interface are exhausted, even as the total request count keeps rising. Testing showed that at high transfer rates the front-end Nginx's CPU share stays low, while the backend node process saturates a full CPU.

The main reason is that the origin server is too simplistic, and given WebSocket's nature as a tool protocol it is hard to define a representative performance benchmark. All we can conclude is that Nginx as a WebSocket reverse proxy easily withstands tens of thousands of concurrent connections.

Reverse-proxying WebSocket with HAProxy

Besides Nginx, HAProxy can also proxy WebSocket, and it gained this feature slightly earlier than Nginx (HAProxy had WebSocket proxying support in 2012; Nginx added it in early 2013). HAProxy supports WebSocket in both of its modes: TCP mode forwards the protocol's packets as-is, while HTTP mode performs the WebSocket handshake on behalf of the client and origin. This part references: WebSocket load-balancing with HAProxy - blog.haproxy.com

HAProxy's TCP mode leaves little to discuss; HTTP mode is somewhat like the Nginx setup, and by inspecting the Sec-WebSocket-Protocol header HAProxy can route to different backends.


defaults
  mode http
  log global
  option httplog
  option http-server-close
  option dontlognull
  option redispatch
  option contstats
  backlog 10000
  timeout client   25s
  timeout connect   5s
  timeout server   25s
  timeout tunnel 3600s
  timeout tarpit   60s
  option forwardfor

frontend ft_web
  bind *:8020 name http
  maxconn 10000
  default_backend bk_web

backend bk_web
  balance roundrobin
  server websrv1 127.0.0.1:8010 maxconn 10000
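The Sec-WebSocket-Protocol routing mentioned earlier can be sketched with HAProxy ACLs; the backend name bk_chat and the subprotocol value chat are illustrative additions, not part of the setup above:

```
frontend ft_web
  bind *:8020 name http
  # route on the subprotocol the client requested in its handshake
  acl is_chat hdr(Sec-WebSocket-Protocol) -i chat
  use_backend bk_chat if is_chat
  default_backend bk_web
```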

Testing again with Thor and artillery, there is no significant performance difference among Nginx, HAProxy, and a direct connection, though that may simply be because the test server is so trivial.

WebSocket and HTTP

One of the biggest motivations for WebSocket is that HTTP is half-duplex: WebSocket solves the problem of the server actively pushing data to the client. With the arrival of HTTP/2 and its server-push feature, does HTTP/2 eliminate the need for an HTTP-independent, TCP-based duplex protocol like WebSocket? The conclusion from InfoQ and StackOverflow is that HTTP/2 cannot replace push technologies like WebSocket. HTTP/2 server push actually works through a PUSH_PROMISE that tells the client, before the main response arrives, which important resources it should preload; it can only be used by the browser to load files. If the client does not make a request, the server does not push arbitrary streams, so HTTP/2 does not really change the half-duplex nature of HTTP.
