HTML5 WebSocket: Unveiling the Next Web Communication Revolution


Recently, HTML5 WebSocket has been generating buzz everywhere. It defines a full-duplex communication channel that operates over a single socket on the web. HTML5 WebSocket is not merely an enhanced version of ordinary HTTP communication; it represents a huge step forward, especially for real-time, event-driven web applications.

Ian Hickson, a Google engineer, put it this way: "The data is reduced sharply to 2 bytes, and the latency is reduced from 150 milliseconds to 50 milliseconds. In fact, these two factors alone are enough to make it interesting to Google." By providing full-duplex connections in the browser, HTML5 WebSocket significantly improves web communication.

Let's look at how HTML5 WebSocket reduces unnecessary network traffic and latency compared with traditional solutions.

Current web communication: the polling headache

Generally, when a browser visits a web page, it sends an HTTP request to the web server hosting that page. The web server acknowledges the request and returns a response. In many cases, however, the response may already be stale by the time the browser renders the page, for example with stock prices, news reports, ticket sales, traffic patterns, or medical device readings. If you want the latest "real-time" information, you can refresh the page manually, but that is obviously not the best approach.

Today, real-time web applications are built mainly around polling and other server-push techniques. The best known of these is Comet, which delays the completion of an HTTP response in order to deliver messages to the client. Comet-based push is usually implemented in JavaScript and uses either a long-polling or a streaming connection strategy.

With polling, the browser sends HTTP requests at regular intervals and immediately receives a response. This was the first attempt to deliver real-time information. Polling is a good solution if you know the exact interval at which messages are delivered, because you can synchronize the client requests with the moments when information is available on the server. But real-time data is often unpredictable, so unnecessary requests are inevitable, and as a result many connections are opened and closed needlessly.

With long polling, the browser sends a request to the server, and the server keeps the request open for a set period. If a notification arrives within that period, the server sends a response containing the message to the client; if nothing arrives, the server sends a response that terminates the open request. The important thing to understand is that when the message volume is high, long polling offers no real performance advantage over traditional polling. In fact, it can be worse, because long polling can spin out of control into a continuous, unthrottled loop of requests.
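To make the pattern concrete, here is a minimal long-polling sketch in client-side JavaScript. The /updates endpoint, the plain-text message format, and the handling shown are hypothetical; the point is only the re-request loop that keeps one request pending on the server at all times.

function longPoll() {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/updates", true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return;              // wait for the held request to complete
    if (xhr.status === 200 && xhr.responseText) {
      console.log("update:", xhr.responseText);    // a message was pushed within the window
    }
    longPoll();                                    // immediately reconnect so the server always
                                                   // has a pending request to answer
  };
  xhr.send();
}
longPoll();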

With streaming, the browser sends a complete request, but the server sends and maintains an open response that it keeps updating (and keeps open indefinitely or for a set period of time). Whenever a message is ready, the server appends it to the response but never marks the response as complete, so the connection stays open and subsequent messages can be delivered over it. However, because streaming is still encapsulated in HTTP, intervening firewalls and proxy servers may choose to buffer the response, which delays message delivery. For this reason, many streaming Comet solutions fall back to long polling when a buffering proxy is detected. Alternatively, TLS (SSL) connections can be used to shield the response from buffering, but then the setup and teardown of each connection consumes more server resources.
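On the client, the streaming variant can be sketched as an XMLHttpRequest whose response never completes: the page reads whatever new text has arrived each time readyState 3 fires. The /stream endpoint and the newline-delimited message format below are hypothetical, and this particular trick does not work in older Internet Explorer versions, which is one reason classic Comet also relied on hidden-iframe techniques.

var xhr = new XMLHttpRequest();
var offset = 0;                                    // how much of the growing response has been read
xhr.open("GET", "/stream", true);
xhr.onreadystatechange = function () {
  // readyState 3 fires repeatedly while the server keeps appending to the open response.
  if (xhr.readyState === 3 || xhr.readyState === 4) {
    var chunk = xhr.responseText.substring(offset);
    offset = xhr.responseText.length;
    chunk.split("\n").forEach(function (line) {
      if (line) console.log("streamed message:", line);
    });
  }
};
xhr.send();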

Ultimately, all of these methods deliver real-time data wrapped in HTTP request and response headers, which add a lot of unnecessary header data and latency. On top of that, full-duplex connectivity requires more than just the downstream connection from server to client. To simulate full-duplex communication over half-duplex HTTP, many current solutions use two connections: one for downstream traffic and one for upstream traffic. Maintaining and coordinating these two connections requires significant overhead and adds complexity. In short, HTTP was not designed for real-time, full-duplex communication. Figure 1 shows the complexity of a Comet web application that uses the publish/subscribe model to display real-time data from a back-end data source over half-duplex HTTP.

Figure 1: The complexity of a Comet application

The situation gets worse when you try to scale out those Comet solutions. Simulating two-way communication over HTTP is error-prone, and even if the end result feels like a real-time web application to the user, this "real-time" experience comes at a very high price: extra latency, unnecessary network traffic, and a drag on CPU performance.

HTML5 WebSocket to the rescue

HTML5 WebSocket is defined in the Communications section of the HTML5 specification and represents the next evolution of web communication: a full-duplex, bidirectional communication channel that operates over a single socket. HTML5 WebSocket provides a true standard that you can use to build scalable real-time web applications. In addition, because it provides a socket that is native to the browser, it eliminates many of the problems of Comet solutions and dramatically reduces overhead and complexity.

To establish a WebSocket connection, the client and server upgrade from the HTTP protocol to the WebSocket protocol during their initial handshake, as shown in the following example:

Example 1: WebSocket handshake (browser request and server response)

GET /text HTTP/1.1
Upgrade: websocket
Connection: Upgrade
HOST: www.websocket.org
...
HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: websocket
Connection: Upgrade
...
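For reference, here is a minimal sketch of the client-side code that triggers a handshake like the one above, using the standard browser WebSocket API. The ws://www.websocket.org/text URL simply mirrors the host and path shown in Example 1 and is used only as a placeholder endpoint.

var ws = new WebSocket("ws://www.websocket.org/text");
ws.onopen = function () {
  ws.send("hello");                       // text can be sent as soon as the handshake completes
};
ws.onmessage = function (event) {
  console.log("received:", event.data);   // full duplex: the server can push at any time
};
ws.onclose = function () {
  console.log("connection closed");
};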

After the connection is established, WebSocket data frames can be sent back and forth between the client and the server in full-duplex mode. Both text and binary frames can be sent in either direction at the same time, and each frame adds as little as 2 bytes of overhead. A text frame starts with a 0x00 byte, ends with a 0xFF byte, and carries UTF-8 encoded data in between. In other words, WebSocket text frames use a terminator, while binary frames use a length prefix.

Note: Although the WebSocket protocol is ready to support a diverse set of clients, it cannot deliver raw binary data to JavaScript, because JavaScript does not support a byte type. Therefore, binary data is ignored if the client is JavaScript, but it can be delivered to clients that do support the byte type.
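For illustration only, the following sketch shows how a text message would be laid out under the early draft framing described above: a 0x00 start byte, the UTF-8 payload, and a 0xFF terminator. Browsers build these frames internally, so application code never does this by hand; TextEncoder is used here purely as a convenient way to obtain UTF-8 bytes.

function frameTextMessage(text) {
  var payload = new TextEncoder().encode(text);   // UTF-8 bytes of the message
  var frame = new Uint8Array(payload.length + 2); // two bytes of framing overhead
  frame[0] = 0x00;                                // frame start marker
  frame.set(payload, 1);
  frame[frame.length - 1] = 0xFF;                 // frame terminator
  return frame;
}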

Comet versus HTML5 WebSocket

What people care about most is how much HTML5 WebSocket actually reduces unnecessary network traffic and latency, so let's find out by comparing a polling application with a WebSocket application.

For the polling example, I created a simple web application in which a web page requests real-time stock data from a RabbitMQ message broker using the traditional publish/subscribe model, implemented by polling a Java servlet hosted on a web server. The RabbitMQ message broker receives data from a fictitious stock price feed that continuously updates prices. The web page connects to the broker and subscribes to a specific stock channel (a topic on the message broker), then uses an XMLHttpRequest to poll once every second (a minimal sketch of this loop appears after Figure 2). When updates are received, it performs some calculations and displays the stock data in the table shown in Figure 2.

Figure 2: A JavaScript stock market application
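A minimal sketch of the page's polling loop might look like the following, assuming the servlet answers at /PollingStock/PollingStock (the path visible in the request headers below) and returns the latest quote as plain text; the element ID and the parsing are hypothetical.

setInterval(function () {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/PollingStock/PollingStock", true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      var price = parseFloat(xhr.responseText);                  // ~20 characters of quote data
      document.getElementById("price").textContent = price.toFixed(2);
    }
  };
  xhr.send();                                   // one full GET, headers and all, every second
}, 1000);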

Note: The back-end stock feed actually produces a large number of price updates per second, so polling once per second is more prudent here than long polling, which would trigger a rapid series of continuous polls; polling effectively throttles the incoming updates.

All of this looks good, but if you look under the hood you will find that this application has serious problems. For example, using Firefox's Firebug plug-in (which lets you debug web pages and monitor page load and script execution times), you can see that a GET request hits the server every second. Opening Live HTTP Headers (another Firefox plug-in, which shows live HTTP header traffic) reveals the astonishing amount of header overhead associated with each request. The following two examples show the HTTP header data of a single request and response.

Example 2: HTTP Request Header

GET /PollingStock/PollingStock HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://www.example.com/PollingStock/
Cookie: showInheritedConstant=false; showInheritedProtectedConstant=false; showInheritedProperty=false; showInheritedProtectedProperty=false; showInheritedMethod=false; showInheritedProtectedMethod=false; showInheritedEvent=false; showInheritedStyle=false; showInheritedEffect=false

Example 3: HTTP Response Header

HTTP/1.x 200 OK
X-Powered-By: Servlet/2.5
Server: Sun Java System Application Server 9.1_02
Content-Type: text/html;charset=UTF-8
Content-Length: 21
Date: Sat, 07 Nov 2009 00:32:46 GMT

In total, the HTTP request and response headers add up to 871 bytes of overhead, and that does not include any actual data. Of course, this is just an example; your header data may be smaller than 871 bytes, but I have also seen header data exceeding 2,000 bytes. In this example, the stock topic message data itself is only about 20 characters long.

What happens when you deploy such an application to a large number of users? Let's look at three different use cases and the network throughput consumed purely by the HTTP request and response headers of the polling application (the arithmetic behind them is sketched after the cases).

Use case A: 1,000 clients polling once per second
Network throughput: (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (about 6.6 Mbps)

Use case B: 10,000 clients polling once per second
Network throughput: (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (about 66 Mbps)

Use case C: 100,000 clients polling once per second
Network throughput: (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (about 665 Mbps)

That is a huge amount of unnecessary network throughput, and this is exactly where HTML5 WebSocket helps. I rebuilt the application with HTML5 WebSocket, adding an event handler to the web page that asynchronously listens for stock update messages from the message broker (a sketch of such a handler follows the cases below). Each message is a WebSocket frame with only 2 bytes of overhead instead of 871. Let's see what that does to the network throughput in our three use cases.

Use case A: 1,000 clients receiving one message per second
Network throughput: (2 x 1,000) = 2,000 bytes = 16,000 bits per second (about 0.015 Mbps)

Use case B: 10,000 clients receiving one message per second
Network throughput: (2 x 10,000) = 20,000 bytes = 160,000 bits per second (about 0.153 Mbps)

Use case C: 100,000 clients receiving one message per second
Network throughput: (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (about 1.526 Mbps)

As you can see in Figure 3, HTML5 WebSocket provides a dramatic reduction in unnecessary network traffic compared with the polling solution.

Figure 3: Comparison of the network throughput of a polling application and a WebSocket application

What about the reduction in latency? The top half of Figure 4 shows the latency of the half-duplex polling solution, assuming it takes 50 milliseconds for a message to travel from the server to the browser. Polling introduces a lot of extra latency, because when a response completes, a new request has to be sent to the server, and that new request takes another 50 milliseconds. During that interval the server cannot send any messages to the browser, which results in additional server memory consumption.

The bottom half of Figure 4 shows the latency with the WebSocket solution. Once the connection has been upgraded to WebSocket, messages flow much more promptly. It still takes 50 milliseconds for a message to travel from the server to the browser, but the WebSocket connection remains open, so there is no need to send another request to the server.

Figure 4: Latency comparison between a polling application and a WebSocket application

HTML5 WebSocket and the Kaazing WebSocket Gateway

At the time of writing, only Google's Chrome browser natively supports HTML5 WebSocket, though other browsers will follow. To bridge the gap, the Kaazing WebSocket Gateway provides complete WebSocket emulation for all older browsers (IE 5.5+, Firefox 1.5+, Safari 3.0+, and Opera 9.5+), so you can start using the HTML5 WebSocket API today.

WebSocket is great, but what can you do once you have a full-duplex socket connection available in your browser? To take full advantage of HTML5 WebSocket, Kaazing provides a ByteSocket library for binary communication and higher-level libraries for protocols such as STOMP, AMQP, XMPP, and IRC, all built on top of WebSocket.

For example, if you use one of the higher-level libraries for the STOMP or AMQP protocol, you can communicate directly with a back-end message broker such as RabbitMQ by connecting straight to the service. No additional application server logic is needed to translate these bidirectional, full-duplex TCP back-end protocols into unidirectional, half-duplex HTTP connections, because the browser itself can speak these protocols.

Figure 5: The Kaazing WebSocket Gateway scales over TCP and delivers better performance

Summary

HTML5 WebSocket represents a huge step forward in the scalability of real-time web applications. As you have seen in this article, HTML5 WebSocket can provide roughly a 500:1 or, depending on the size of the HTTP headers, even greater reduction in unnecessary HTTP header traffic, along with a significant reduction in communication latency. That is not an incremental improvement; it is a revolutionary leap.

The Kaazing WebSocket Gateway makes HTML5 WebSocket code run in all browsers today, and it provides additional protocol libraries that let you take full advantage of the full-duplex socket connections HTML5 WebSocket offers to communicate directly with back-end services.
