This article is based mainly on documents from the Google SPDY project homepage. Its purpose is to introduce the SPDY protocol as a whole. In later articles I plan to analyze specific SPDY implementations (such as the one in nginx), look at SPDY performance tests, and discuss how to deploy SPDY in real production applications.
I. Main Problems in HTTP
1. One request per connection. The browser and the Web server interact over short-lived connections, and each connection serves only one request. A page that needs to load many resources therefore suffers high latency.
2. Only the client can initiate a request; the server merely completes the response. The server cannot proactively push resources that it knows the client will need.
3. HTTP can compress only the body, not the headers. On a site that uses many cookies, this seriously wastes bandwidth.
These problems barely mattered for the Web of ten years ago, but today's Web is far more demanding, and users expect an ever better experience from Web applications. This is why Google launched the "Let's make the web faster" initiative; SPDY is part of that project.
II. SPDY Goals
Initially, the SPDY project aimed not only at accelerating HTTP but also at improving transport-layer and session-layer protocols. For example, the Stream Control Transmission Protocol (SCTP) is a transport-layer protocol intended to replace TCP; other related research can be found in the corresponding white papers. Designing a new transport-layer protocol may be simple, but once deployment, adoption, and compatibility with existing infrastructure are taken into account, that path would almost certainly fail. The current SPDY effort therefore focuses on improving the application layer in order to accelerate HTTP.
SPDY's main goals:
1. Support concurrent HTTP requests over a single TCP connection, solving the problem that one HTTP connection can serve only one request at a time.
2. Compress request/response headers.
3. Define a protocol that is easier to implement than HTTP.
4. Use SSL-encrypted connections, which both protects user data and keeps SPDY compatible with the existing network infrastructure. In fact, TLS is made mandatory mainly for this historical compatibility rather than for security; this is discussed in detail below.
5. Allow the server to actively initiate communication with the client and push data.
III. The SPDY Protocol Stack
After SPDY is introduced into the HTTP protocol stack, it looks like the diagram on the Google official website: HTTP runs on top of a SPDY session layer, which in turn runs over SSL and TCP.
The key change is the addition of a SPDY session layer on top of the TCP connection. The SPDY draft also calls this the framing layer; it provides the stream mechanism and the framing of messages. The following sections describe the protocol in detail.
IV. TLS-NPN
TLS-NPN stands for Transport Layer Security - Next Protocol Negotiation, a TLS extension defined by Google so that SPDY can be negotiated as the application-layer protocol. The extension exists only to make SPDY easier to deploy and use; it is not, and will never be, part of SPDY itself. A detailed introduction is available at http://technotes.googlecode.com/git/nextprotoneg.html.
Since TLS-NPN is not part of the SPDY protocol, why is TLS mandatory? Why can't users choose freely whether to use SSL, as with HTTP? Google's reasoning goes as follows:
1. In practice, only ports 80 and 443 are universally reachable on today's Internet. If SPDY served Web traffic on a brand-new port, it would very likely be blocked by firewalls and similar middleboxes. Pushing the entire network ecosystem to treat a new port as equivalent to 80 or 443 would be anything but easy, so Google chose to stay compatible with the current network environment rather than try to promote a new port.
2. With a new port ruled out, only ports 80 and 443 remain. However, many Web intermediaries today accept only the HTTP protocol on port 80; if SPDY ran on port 80, it would inevitably be broken by these proxies. Port 80 is therefore unusable.
3. That leaves port 443, which currently carries encrypted HTTPS traffic. In principle, an encrypted socket on port 443 can carry any application-layer protocol.
Therefore, in order to negotiate the SPDY protocol on top of SSL, Google defined the NPN extension, which is already implemented in OpenSSL and NSS.
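As a rough illustration of what NPN provides, here is a minimal client-side sketch using Python's ssl module. It assumes a Python/OpenSSL build that still exposes NPN (many recent builds ship only ALPN, NPN's successor), and the host name is purely hypothetical; the point is only that the two sides agree on "spdy/2" during the TLS handshake, before any application data is exchanged.

```python
# Minimal NPN negotiation sketch; assumes an NPN-capable Python/OpenSSL build.
import socket
import ssl

HOST = "www.example.com"  # hypothetical SPDY-capable host, for illustration only

if not ssl.HAS_NPN:
    raise SystemExit("this Python/OpenSSL build has no NPN support")

context = ssl.create_default_context()
# Advertise the protocols we are willing to speak, in preference order.
context.set_npn_protocols(["spdy/2", "http/1.1"])

with socket.create_connection((HOST, 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=HOST) as tls_sock:
        chosen = tls_sock.selected_npn_protocol()
        if chosen == "spdy/2":
            print("server agreed to speak SPDY over this TLS connection")
        else:
            print("falling back to ordinary HTTPS, negotiated:", chosen)
```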
V. SPDY Draft 2
Google publishes the SPDY protocol drafts at http://www.chromium.org/spdy/spdy-protocol. The drafts have reached draft 4, but this article follows draft 2, because the current nginx SPDY implementation is based on draft 2.
Logically, SPDY consists of two parts. The first is the framing layer, which frames messages and provides the foundation for request/response exchanges. The second is the HTTP layer, which defines the behavior of requests and responses on top of it.
Based on this simple model, consider the three important components of SPDY: the TCP connection, the framing layer, and the HTTP layer. The accompanying figure shows a TCP connection between client and server carrying two streams; each stream handles one request/response. A request/response consists of multiple frames (each small arrow in the figure is a frame), which are divided into control frames (red arrows) and data frames (blue arrows). The framing layer mainly defines the frame format, while the HTTP layer mainly defines request/response semantics.
Once a request/response completes, its stream is released, but the TCP connection is not. To issue a new request, the client first creates a new stream. In effect, SPDY adds a stream mechanism on top of the TCP connection to simulate HTTP's short-lived connections. A stream is purely a logical layer over TCP: it is created and released very quickly, without TCP's multiple handshakes, which avoids TCP's slow-start cost for every request. Looked at another way, if TCP connections could be established and torn down with negligible delay, there would be little need to improve HTTP at all; I suspect this is why Google first considered solving the problem at the transport layer.
Framing
Frames are divided into control frames and data frames. The following describes in detail which control frames and data frames SPDY defines and what each is for.
Control Frame Structure
```
+----------------------------------+
|C| Version (15 bits)|Type (16 bits)|
+----------------------------------+
| Flags (8) |   Length (24 bits)    |
+----------------------------------+
|               Data                |
+----------------------------------+
```
As shown, a control frame has at least eight bytes of header information.
C: a single bit identifying whether the frame is a control frame or a data frame.
Version: 15 bits, the SPDY version number (currently 2).
Type: 16 bits, the type of the control frame, i.e. what the control frame is for; each type is introduced below.
Flags: 8 bits, a field for additional per-frame attributes.
Length: 24 bits, an unsigned 24-bit integer giving the length of the data that follows the length field.
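To make the bit layout concrete, here is a small sketch (not taken from the draft) of how the 8-byte control frame header described above could be unpacked; the function name is illustrative, and the field layout follows the structure shown, with network (big-endian) byte order as in the SPDY draft.

```python
# Sketch: unpack the 8-byte SPDY control frame header described above.
import struct

def parse_control_frame_header(header: bytes):
    """Parse the first 8 bytes of a SPDY control frame."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes")
    first_word, second_word = struct.unpack("!II", header[:8])
    is_control = (first_word >> 31) & 0x1      # C bit: 1 for control frames
    version    = (first_word >> 16) & 0x7FFF   # 15-bit SPDY version (2 here)
    frame_type = first_word & 0xFFFF           # 16-bit control frame type
    flags      = (second_word >> 24) & 0xFF    # 8-bit flags
    length     = second_word & 0x00FFFFFF      # 24-bit length of the data that follows
    return is_control, version, frame_type, flags, length

# Example: a version-2 control frame of type 1 with 10 bytes of data.
print(parse_control_frame_header(b"\x80\x02\x00\x01\x00\x00\x00\x0a"))
# -> (1, 2, 1, 0, 10)
```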
Data Frame Structure
```
+----------------------------------+
|C|       Stream-ID (31 bits)      |
+----------------------------------+
| Flags (8) |   Length (24 bits)    |
+----------------------------------+
|               Data                |
+----------------------------------+
```
A data frame likewise has at least eight bytes of header information.
C: same as in the control frame.
Stream-ID: 31 bits, the ID of the stream this frame belongs to.
Flags and Length: same as in the control frame.
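Continuing the sketch above, a data frame header has the same 8-byte shape, but the top bit is 0 and the remaining 31 bits of the first word carry the stream ID; again, the function name is only illustrative.

```python
# Sketch: unpack the 8-byte SPDY data frame header described above.
import struct

def parse_data_frame_header(header: bytes):
    """Parse the first 8 bytes of a SPDY data frame."""
    first_word, second_word = struct.unpack("!II", header[:8])
    assert (first_word >> 31) == 0, "top bit set: this is a control frame"
    stream_id = first_word & 0x7FFFFFFF    # 31-bit stream ID
    flags     = (second_word >> 24) & 0xFF # e.g. FLAG_FIN = 0x01
    length    = second_word & 0x00FFFFFF   # payload length in bytes
    return stream_id, flags, length

# Example: data frame on stream 1, FLAG_FIN set, 5 bytes of payload.
print(parse_data_frame_header(b"\x00\x00\x00\x01\x01\x00\x00\x05"))
# -> (1, 1, 5)
```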
Stream
A SPDY stream can be created by either the client or the server, which means that not only the client but also the server can initiate a request. It is also possible to create a unidirectional stream simply by sending a specific control frame; a unidirectional stream can be thought of as a request that will never receive a response. Stream creation, shutdown, and other operations are all carried out with control frames.
Control Frames
SPDY defines a number of control frames. They are listed briefly here without full detail; if you plan to implement SPDY, consult the draft itself.
SYN_STREAM control frame - used by the sender to create a stream.
SYN_REPLY control frame - the receiver of a SYN_STREAM uses it to begin responding to the peer.
RST_STREAM control frame - used to abnormally terminate a stream.
SETTINGS control frame - used to exchange configuration values that affect the connection and its streams.
NOOP control frame - a no-operation frame; the receiver simply ignores it.
PING control frame - similar to the ping command, it can be used to measure round-trip time (RTT); the ping tool works at a lower network layer, while the PING control frame measures RTT at the application layer.
GOAWAY control frame - the sender tells the peer that it will no longer accept new streams, which can be read as a signal that the current TCP connection is about to be shut down.
HEADERS control frame - adds additional headers to a stream.
This concludes the brief introduction to the control frames; for the precise definition of each frame, see the draft. A sketch of the type codes assigned to them in draft 2 follows below. The data frame, by contrast, is much simpler: it has no type field and does just one thing, carrying data on a stream.
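For reference, here are the control frame type codes as I read them from SPDY draft 2; treat the numeric values as something to double-check against the draft text before relying on them.

```python
# Control frame type codes per SPDY draft 2 (verify against the draft).
SYN_STREAM = 1   # sender creates a new stream
SYN_REPLY  = 2   # receiver of SYN_STREAM starts its response
RST_STREAM = 3   # abnormally terminate a stream
SETTINGS   = 4   # exchange configuration values
NOOP       = 5   # no operation; receiver ignores it
PING       = 6   # measure round-trip time at the SPDY layer
GOAWAY     = 7   # stop accepting new streams on this connection
HEADERS    = 8   # attach additional headers to an existing stream
```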
HTTP Layer
To remain compatible with existing applications, HTTP itself is modified as little as possible, ideally not at all. The HTTP layer therefore simply defines how HTTP request/response semantics run on top of the framing layer.
The client initiates a request with a SYN_STREAM control frame; all request headers are carried in that frame. If the request has a body, the body is sent in data frames, and FLAG_FIN must be set on the last data frame to mark the end of the request. The server responds with a SYN_REPLY control frame carrying all the response headers and sends the response data in data frames; likewise, FLAG_FIN must be set on the last data frame to mark the end of the response.
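To make the frame sequence concrete, here is an illustrative trace, written as Python literals, of the exchange just described for a simple GET with no request body. The header-block compression required by the draft is omitted and the header names are paraphrased, so this is a sketch of the ordering, not wire-accurate frames.

```python
# Illustrative frame sequence for one request/response on stream 1.
request_frames = [
    ("SYN_STREAM", {"stream_id": 1, "flags": "FLAG_FIN",
                    "headers": {"method": "GET", "url": "/index.html",
                                "version": "HTTP/1.1"}}),
    # No DATA frames: FLAG_FIN on SYN_STREAM marks the request as complete.
]

response_frames = [
    ("SYN_REPLY", {"stream_id": 1,
                   "headers": {"status": "200 OK", "version": "HTTP/1.1"}}),
    ("DATA",      {"stream_id": 1, "flags": 0,          "payload": b"<html>..."}),
    ("DATA",      {"stream_id": 1, "flags": "FLAG_FIN", "payload": b"</html>"}),
    # FLAG_FIN on the last DATA frame marks the end of the response.
]
```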
SPDY also specifies in detail how its request/response headers relate to HTTP's; refer to the draft for the specifics.
Server Push
With server push, a single client request can produce multiple responses. Instead of the client having to request each resource individually, the server proactively pushes resources it knows the client will need. The push mechanism is fairly involved; see the SPDY draft for details.
SPDY Deployment
1. Advertising SPDY with the Alternate-Protocol header. When a SPDY-capable server receives an ordinary HTTP request, it adds an Alternate-Protocol header to the response to tell the client that SPDY is available (an illustrative response is sketched after this list). After receiving this header, the client attempts to talk to the server over SPDY; if that attempt fails, it will not try again for the current domain.
2. Advertising SPDY with the TLS-NPN extension. The SPDY server accepts connections on port 443 and uses the TLS-NPN extension to tell the client which SPDY versions it supports; the client then communicates with the server over SPDY. Across the public Internet, the TLS-NPN approach is effectively the only viable one; the Alternate-Protocol method works only when no firewall or other middlebox between client and server blocks the ports involved.
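As a sketch of the first approach, here is what an advertising response might look like. The header value "443:npn-spdy/2" follows the convention described in the Chromium/SPDY documentation; treat it as illustrative rather than normative.

```python
# Sketch of a plain-HTTP response that advertises SPDY on port 443.
advertise_spdy_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Alternate-Protocol: 443:npn-spdy/2\r\n"
    "\r\n"
)
```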
SPDY Status Quo
Companies such as Google, Twitter, and Facebook have begun deploying SPDY to accelerate their Web applications. On the server side, Apache, nginx, Netty, Jetty, Node.js, and others have initial SPDY support. On the browser side, Chrome already supports SPDY, and Firefox appears to as well. A search for SPDY on GitHub turns up many more SPDY-related projects.