Explanation of concurrent connections during website construction

Last Update:2018-12-03 Source: Internet

Author: User


Recently, our website has often returned a 503 error: "HTTP Error 503. The service is unavailable." The page loads normally again after one or two refreshes, so we suspect that the website's maximum number of concurrent connections is being exceeded.

What is an HTTP connection? When a page is loaded, images, styles, and scripts are used. Do requests for these items share one connection or multiple connections?

Some people online say that, to save connections, we should merge external CSS and JS files or inline them, and even combine images into a single image and position each piece with CSS. This suggests that each request uses one connection, and that the connection is closed once the request completes.

However, IIS has an option to keep HTTP connections alive, and its timeout can be configured. If every request opened its own connection and that connection then stayed alive, how many connections would ever be enough? So this option implies that one connection can be reused for multiple requests.

Which one is true?

Actually, both are right.

HTTP is stateless and connectionless. "Connectionless" means that each connection handles only one request: once the client has received the response, the connection is closed. That, however, describes HTTP/1.0.

HTTP/1.1 introduced the persistent connection: the same connection can process multiple requests in sequence, and most browsers reportedly support this. It makes sense when you think about it. Establishing a connection for HTTP is expensive, much like opening a database connection, which is why we try to complete all operations within one database connection. It is like going to the supermarket: you do not make a separate trip for each item, or it would be dark before you had bought everything.
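As a rough illustration of a persistent connection, the sketch below starts a throwaway local server with Python's standard `http.server` and sends several requests through one `http.client.HTTPConnection`. The handler class, the function name `fetch_over_one_connection`, and the paths are made up for this example; it only shows that the client's socket (identified by its local port) stays the same across requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # so the next response can follow on the same connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_over_one_connection(paths):
    """Request every path over a single connection to a local test server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read().decode()
            # The client's local port changes only if a new TCP connection is opened.
            local_port = conn.sock.getsockname()[1]
            results.append((local_port, response.status, body))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results
```

Calling `fetch_over_one_connection(["/a.css", "/b.js", "/c.png"])` should return three results that all report the same local port, which is the sign that no new connection was opened between requests.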

However, even with persistent connections, I still have questions. Does a single page really use only one connection? What if some resource, an image for example, is so large that the other elements cannot wait for it? Will another connection be opened? And if the HTTP timeout is set to 20 minutes, isn't keeping idle connections open that long wasteful?

In addition, even if a page used only one connection, merging CSS, JS, and images would still make sense: fewer requests mean fewer packets sent, which should also improve performance.

Appendix 1:

A typical webpage consists of an HTML file plus embedded elements: the images, CSS files, and JavaScript files on the page. At the HTTP level, each embedded element is no different from the HTML file: the browser must fetch it from the server. An early browser typically worked as follows: after the user clicks a URL, the browser establishes a connection to the server, requests the HTML page, and then receives and parses the HTML that the server sends. When it encounters an embedded element, it can immediately open a second connection to request it, and if there are many embedded elements it may open several connections and request them simultaneously. When all the required elements have been downloaded, the browser draws the page. This is the browser behavior envisaged by the earliest HTTP/1.0 protocol.

The multi-connection model of HTTP/1.0 leaves room for improvement. Establishing a TCP connection works like this: the client sends the server a packet saying "I want to establish a connection with you"; the server replies with a packet saying "I am willing"; and the client sends one more packet saying "Okay, let's start transmitting data." So it takes three packets just to establish a TCP connection. Once the connection is established, the browser sends its request and the server sends its response. After that, several more packets go back and forth to close the TCP connection. If a page contains many small elements and each one needs its own connection, a large number of packets are spent purely on opening and closing TCP connections. In addition, TCP has a feature called slow start, which can be roughly explained as follows. The sender transmits a certain amount of data, and the receiver returns an "I received it" packet. Because every packet carries headers that each router on the path must process, sending more data per round trip is more efficient as long as no packets are lost. A new TCP connection, however, starts by sending only a small amount of data, and the two ends gradually increase it according to network conditions until it matches the available bandwidth. Therefore, if the connection is closed after every request, its transfer rate rarely gets a chance to reach the speed that the bandwidth can carry.
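The cost described above can be estimated with a little arithmetic. The sketch below is an idealized model, not a measurement: it assumes three packets to open a connection (as in the text), four packets to close it (an assumption, since the text only says "several"), and a slow-start window that doubles every round trip with no packet loss. The function names are invented for this illustration.

```python
HANDSHAKE_PACKETS = 3  # the three packets described in the text
TEARDOWN_PACKETS = 4   # assumption: the text only says "several" packets


def connection_overhead_packets(num_resources, persistent):
    """Packets spent only on opening and closing TCP connections."""
    connections = 1 if persistent else num_resources
    return connections * (HANDSHAKE_PACKETS + TEARDOWN_PACKETS)


def slow_start_round_trips(total_segments, initial_window=1):
    """Round trips needed to send the data if the window doubles each trip."""
    sent, window, trips = 0, initial_window, 0
    while sent < total_segments:
        sent += window   # send a full window in this round trip
        window *= 2      # idealized slow start: no loss, window doubles
        trips += 1
    return trips
```

Under these assumptions, a page with 20 small elements costs `connection_overhead_packets(20, persistent=False)` = 140 overhead packets with one connection per element, against 7 with a single persistent connection. Likewise, 15 segments sent on one warmed-up connection need 4 round trips, while five fresh connections carrying 3 segments each need 2 round trips apiece.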

For these reasons, HTTP/1.1 soon followed and introduced the persistent connection: the same connection can process multiple requests in sequence, with a mechanism that keeps consecutive requests and responses separate. In practice, the server does not close the connection immediately after sending a response; once the browser determines that it has received the complete response to the previous request, it can send a second request on the same connection. This mode greatly reduces the number of network packets, and experiments show that it is very effective. However, maintaining a connection consumes server resources, so servers do not keep persistent connections open forever, and browsers are advised not to open too many persistent connections to the same server.

Persistent connections can be sped up further with pipelining. As described above, the browser must wait until the response to the previous request has been completely received before it sends the next request. If the link to the server is slow, a persistent connection spends most of its time waiting rather than sending or receiving data. With pipelining, the browser can send several requests at once on a persistent connection, and the server returns the responses on that connection in the same order. This is especially effective in combination with the browser cache. For example, an image is cached in the browser after it is first viewed; when the browser requests it again, it tells the server: "I have this image cached, and its modification time is XXXX. If the image on the server has not been modified, you do not need to resend it." In that case the server sends a very short 304 Not Modified response. Without pipelining, each such check costs a full network round trip. With pipelining, the browser can ask about four images at once, and if the server supports pipelining well, it can even put all four responses in the same network packet, which is a big speedup.
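The four-image scenario can be sketched with a raw socket against a toy local server. Everything here is illustrative: the handler answers 304 to any request carrying `If-Modified-Since` (a real server would compare timestamps), and the function name, paths, and date are assumptions made up for the example. The point is only that all four requests are written in one send before any response is read.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ConditionalHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        if self.headers.get("If-Modified-Since"):
            # Toy rule: treat every cached copy as still fresh.
            self.send_response(304)
            self.end_headers()
        else:
            body = b"image bytes"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def pipeline_conditional_gets(paths, cached_at="Wed, 01 Jan 2020 00:00:00 GMT"):
    """Send all requests in one write, then read every response."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), ConditionalHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        requests = []
        for index, path in enumerate(paths):
            is_last = index == len(paths) - 1
            requests.append(
                f"GET {path} HTTP/1.1\r\n"
                "Host: 127.0.0.1\r\n"
                f"If-Modified-Since: {cached_at}\r\n"
                # Ask the server to close after the last response so reading can stop.
                + ("Connection: close\r\n" if is_last else "")
                + "\r\n"
            )
        with socket.create_connection(server.server_address) as sock:
            sock.settimeout(5)
            sock.sendall("".join(requests).encode("ascii"))
            raw = b""
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                raw += chunk
    finally:
        server.shutdown()
        server.server_close()
    # Return only the status lines, in the order the server sent them.
    return [line.decode() for line in raw.split(b"\r\n") if line.startswith(b"HTTP/1.1 ")]
```

With four paths, the function should return four `HTTP/1.1 304 Not Modified` status lines, one per pipelined request.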

When pipelining was first proposed, there was a further idea: a server that supports pipelining well could process two requests from the same pipeline on two CPUs, speeding up responses even more. In practice, this may not help much.

==========================================================

Appendix 2:

Introduction

HTTP is an application-layer, object-oriented protocol. Its simplicity and speed make it well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continuously improved and extended through years of use. When this text was written, the sixth revision of HTTP/1.0 was in use on the WWW, standardization of HTTP/1.1 was in progress, and HTTP-NG (Next Generation of HTTP) had been proposed.

The main features of HTTP are as follows:

1. Supports the client/server model.

2. Simple and fast: to request a service, a client only needs to send the request method and the path. Common request methods include GET, HEAD, and POST; each method specifies a different type of interaction between the client and the server. Because the protocol is simple, HTTP server programs are small and communication is fast.

3. Flexible: HTTP allows any type of data object to be transferred. The type being transferred is marked by Content-Type.

4. Connectionless: each connection handles only one request. After the server has processed the client's request and received the client's acknowledgment, it closes the connection. This saves transmission time.

5. Stateless: the protocol has no memory of previous transactions. If later processing needs earlier information, that information must be retransmitted, which may increase the amount of data sent per connection. On the other hand, when the server does not need earlier information, it responds faster.

I. HTTP protocol details: the URL

HTTP (Hypertext Transfer Protocol) is a stateless, application-layer protocol based on a request/response model. It usually runs over TCP, and HTTP/1.1 provides a persistent connection mechanism. Most web applications are built on top of HTTP.

The format of an HTTP URL (a URL is a special type of URI that contains enough information to locate a resource) is as follows:

http://host[":"port][abs_path]

"http" indicates that the resource is to be located through the HTTP protocol; host is a valid Internet host domain name or IP address; port specifies a port number, and if it is omitted the default port 80 is used; abs_path specifies the URI of the requested resource. If the URL contains no abs_path, it must be given as "/" when used as the request URI. Browsers usually do this automatically.

Eg:

1. Input: www.hualai.net.cn; the browser automatically converts it to: http://www.hualai.net.cn/
2. http://192.168.0.116:8080/index.jsp
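The defaulting rules for port and path can be tried out with Python's standard `urllib.parse`. This is a small sketch rather than what any browser actually does internally; the helper name `http_url_parts` is made up here, and prepending `http://` to a bare host name is an assumption that mimics the address-bar behavior described in the examples.

```python
from urllib.parse import urlsplit


def http_url_parts(url):
    """Return (host, port, abs_path) with the defaults filled in."""
    if "://" not in url:
        url = "http://" + url  # mimic the browser adding the scheme
    parts = urlsplit(url)
    port = parts.port or 80    # an omitted port means the default, 80
    path = parts.path or "/"   # an omitted abs_path is sent as "/"
    return parts.hostname, port, path
```

For the two examples, `http_url_parts("www.hualai.net.cn")` gives `("www.hualai.net.cn", 80, "/")` and `http_url_parts("http://192.168.0.116:8080/index.jsp")` gives `("192.168.0.116", 8080, "/index.jsp")`.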

II. HTTP protocol details: the request

An HTTP request consists of three parts: request line, message header, and request body.

1. The request line begins with a method token, followed by the Request-URI and the protocol version, separated by spaces. The format is as follows: Method Request-URI HTTP-Version CRLF

Method is the request method, Request-URI is a uniform resource identifier, HTTP-Version is the HTTP protocol version of the request, and CRLF is a carriage return followed by a line feed (apart from the terminating CRLF, a bare CR or LF character is not allowed).

There are several request methods (method names are all uppercase). They are described as follows:

GET: requests the resource identified by Request-URI
POST: appends new data to the resource identified by Request-URI
HEAD: requests the response message headers for the resource identified by Request-URI
PUT: requests that the server store a resource, using Request-URI as its identifier
DELETE: requests that the server delete the resource identified by Request-URI
TRACE: requests that the server echo back the request it received; mainly used for testing or diagnosis
CONNECT: reserved for future use
OPTIONS: queries the server's capabilities, or the options and requirements associated with a resource

Example:

GET method: when you enter a URL in the browser's address bar to visit a webpage, the browser uses the GET method to obtain the resource from the server. For example: GET /form.html HTTP/1.1 (CRLF)

The POST method asks the server to accept the data attached to the request. It is often used to submit forms.

Eg: POST /reg.jsp HTTP/ (CRLF)

Accept: image/gif, image/x-xbit, ... (CRLF)
...
HOST: www.guet.edu.cn (CRLF)
Content-Length: 22 (CRLF)
Connection: Keep-Alive (CRLF)
Cache-Control: no-cache (CRLF)
(CRLF) // This CRLF indicates that the message headers have ended; everything before it is the message header
user=jeffrey&pwd=1234 // The line below the blank line is the submitted data

The HEAD method is almost the same as the GET method. The HTTP headers in the response to a HEAD request contain the same information as the response to a GET request. With this method, you can obtain information about the resource identified by Request-URI without transferring the entire resource. It is often used to test whether a hyperlink is valid, whether it is accessible, and whether it has been updated recently.

2. Request headers: described later

3. Request body (omitted)
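To make the three-part layout concrete, here is a minimal sketch that assembles a request message from a request line, headers, and an optional body. The function name `build_request` and the sample form data in the usage note are invented for the illustration; `Content-Length` is computed from the body rather than typed by hand.

```python
CRLF = "\r\n"


def build_request(method, request_uri, headers, body=b"", version="HTTP/1.1"):
    """Assemble request line + headers + empty line + body into bytes."""
    lines = [f"{method} {request_uri} {version}"]  # the request line
    if body:
        # Copy the dict so the caller's headers are not modified.
        headers = {**headers, "Content-Length": str(len(body))}
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A line containing only CRLF marks the end of the headers.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body
```

For instance, `build_request("GET", "/form.html", {"Host": "www.guet.edu.cn"})` yields `b"GET /form.html HTTP/1.1\r\nHost: www.guet.edu.cn\r\n\r\n"`, and passing `body=b"a=1&b=2"` with `"POST"` adds a `Content-Length: 7` header before the empty line.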

III. HTTP protocol details: the response

After receiving and interpreting the request message, the server returns an HTTP Response Message.

HTTP response is composed of three parts: Status line, message header, and response body.

1. The status line format is as follows:

HTTP-Version Status-Code Reason-Phrase CRLF

HTTP-Version is the HTTP protocol version used by the server, Status-Code is the response status code sent back by the server, and Reason-Phrase is a textual description of the status code.

The status code consists of three digits. The first digit defines the category of the response and has five possible values:

1xx: Informational: the request has been received and processing continues
2xx: Success: the request has been successfully received, understood, and accepted
3xx: Redirection: further action must be taken to complete the request
4xx: Client error: the request has a syntax error or cannot be fulfilled
5xx: Server error: the server failed to fulfill a valid request

Common status codes, status descriptions, and explanations:

200 OK // The client request succeeded.
400 Bad Request // The client request has a syntax error and cannot be understood by the server.
401 Unauthorized // The request is unauthorized; this status code must be used together with the WWW-Authenticate header field.
403 Forbidden // The server received the request but refuses to serve it.
404 Not Found // The requested resource does not exist, eg: an incorrect URL was entered.
500 Internal Server Error // An unexpected error occurred on the server.
503 Service Unavailable // The server cannot process the client's request at present and may return to normal after a period of time.

Eg: HTTP/1.1 200 OK (CRLF)

2. Response headers: described later

3. The response body is the content of the resource returned by the server.
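As a small illustration of the status line, the sketch below splits it into its three fields and maps the first digit of the status code to its category. The names `parse_status_line` and `CATEGORIES` are made up for this example; `http.HTTPStatus` is the standard-library table of status codes and reason phrases.

```python
from http import HTTPStatus

# First digit of the status code -> response category.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line):
    """Split 'HTTP-Version Status-Code Reason-Phrase CRLF' into its parts."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    status = int(code)
    return version, status, reason, CATEGORIES[status // 100]


def reason_phrase(code):
    """Look up the standard reason phrase for a status code."""
    return HTTPStatus(code).phrase
```

For example, `parse_status_line("HTTP/1.1 200 OK\r\n")` returns `("HTTP/1.1", 200, "OK", "success")`, and `reason_phrase(503)` returns `"Service Unavailable"`, the error this article started from.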

IV. HTTP protocol details: message headers

An HTTP message is either a request from the client to the server or a response from the server to the client. Both request and response messages consist of a start line (the request line for a request message, the status line for a response message), message headers (optional), an empty line (a line containing only CRLF), and a message body (optional).

HTTP message headers include general headers, request headers, response headers, and entity headers.

Each header field consists of a name + ":" + space + value. Header field names are case-insensitive.
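The message layout can be sketched as a short parser. This is a simplified illustration (it ignores repeated header fields and other details of real HTTP parsing), and the function name `split_message` is an assumption. Header names are lowercased so that lookups behave case-insensitively.

```python
def split_message(raw):
    """Split a raw HTTP message into (start line, headers dict, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body
```

Given `b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nCONTENT-LENGTH: 5\r\n\r\nhello"`, the function returns the status line, a dictionary with the keys `content-type` and `content-length`, and the body `b"hello"`.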

1. General headers

General headers include a small number of header fields that apply to all request and response messages. They describe the message being transmitted rather than the entity it carries.

Eg:

Cache-Control specifies caching directives. Caching directives are unidirectional (a directive in a response does not necessarily appear in the request) and independent (a directive in one message does not affect how another message is cached). The similar header field used by HTTP/1.0 is Pragma. Caching directives for requests include: no-cache (indicates that the request or response message cannot be cached), no-store, max-age, max-stale, min-fresh, and only-if-cached. Caching directives for responses include: public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, max-age, and s-maxage.

For example, to tell the IE browser (the client) not to cache a page, the server-side JSP program can be written as follows: response.setHeader("Cache-Control", "no-cache");

// response.setHeader("Pragma", "no-cache"); has the same effect as the line above; the two are usually used together

This code sets the general header field Cache-Control: no-cache in the response message that is sent.

The Date general header field indicates the date and time at which the message was generated.

The Connection general header field lets the sender specify connection options. For example, it can specify that the connection is persistent, or specify the "close" option to tell the server to close the connection once the response is complete.

2. Request Header

The request header allows the client to send additional request information and client information to the server.

Common request headers

The Accept request header field specifies the types of content the client accepts. Eg: Accept: image/gif indicates that the client wants resources in GIF image format; Accept: text/html indicates that the client wants HTML text.

The Accept-Charset request header field specifies the character sets the client accepts. Eg: Accept-Charset: iso-8859-1, gb2312. If this field is not set in the request message, any character set is acceptable by default.

The Accept-Encoding request header field is similar to Accept, but specifies the acceptable content encodings. Eg: Accept-Encoding: gzip, deflate. If this field is not set in the request message, the server assumes that the client accepts all content encodings.

The Accept-Language request header field is similar to Accept, but specifies a natural language. Eg: Accept-Language: zh-cn. If this field is not set in the request message, the server assumes that the client accepts all languages.

The Authorization request header field is mainly used to prove that the client has the right to view a resource. When a browser accesses a page and the server responds with 401 (Unauthorized), the browser can send a request containing the Authorization request header field, asking the server to verify it.

Host (this header field is required when sending a request): the Host request header field specifies the Internet host and port number of the requested resource. It is usually extracted from the HTTP URL, eg:

We enter http://www.hualai.net.cn in the browser.

The request message sent by the browser contains the Host request header field, as follows:

Host: www.hualai.net.cn

The default port number 80 is used here. If a port number is specified, the field becomes Host: www.hualai.net.cn:port, where port is the specified port number.
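The rule for the Host field can be expressed in a few lines. This is only a sketch of the behavior described above, with a made-up helper name `host_header_value`; it derives the field value from a URL and appends the port only when it is not the default.

```python
from urllib.parse import urlsplit


def host_header_value(url):
    """Derive the value of the Host request header field from an HTTP URL."""
    parts = urlsplit(url)
    if parts.port in (None, 80):
        return parts.hostname  # the default port is left implicit
    return f"{parts.hostname}:{parts.port}"
```

So `host_header_value("http://www.hualai.net.cn")` gives `"www.hualai.net.cn"`, while `host_header_value("http://www.hualai.net.cn:8080/")` gives `"www.hualai.net.cn:8080"` (8080 is just an example port).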

User-Agent

When we log in to a forum, we often see a welcome message listing the name and version of our operating system and browser, which many people find surprising. In fact, the server application obtains this information from the User-Agent request header field, which lets the client tell the server about its operating system, browser, and other attributes. This header field is not required, however. If we wrote our own browser and did not send the User-Agent request header field, the server would have no way of knowing this information about us.

Example of request header:

GET /form.html HTTP/1.1 (CRLF)
Accept: image/gif, image/x-xbitmap, image/jpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */* (CRLF)
Accept-Language: zh-cn (CRLF)
Accept-Encoding: gzip, deflate (CRLF)
If-Modified-Since: Wed, 05 Jan 2007 11:21:25 GMT (CRLF)
If-None-Match: W/"80b1a4c018f3c41:8317" (CRLF)
User-Agent: Mozilla/4.0 (compatible; MSIE6.0; Windows NT 5.0) (CRLF)
Host: www.guet.edu.cn (CRLF)
Connection: Keep-Alive (CRLF)

3. Response Header

The Response Header allows the server to transmit additional response information that cannot be placed in the status line, as well as information about the server and the next access to the resource identified by the request-Uri.

Common Response Headers

Location

The location response header field is used to redirect the receiver to a new location. Location response header fields are often used when domain names are changed.

Server

The Server response header field contains information about the software the server uses to process requests. It is the counterpart of the User-Agent request header field. Below is an example of the Server response header field:

Server: Apache-Coyote/1.1

WWW-Authenticate

The WWW-Authenticate response header field must be included in a 401 (Unauthorized) response message. When the client receives the 401 response and then sends a request carrying the Authorization header field to ask the server to verify it, the server's response contains this header field.

Eg: WWW-Authenticate: Basic realm="Basic Auth Test!" // This shows that the server uses the Basic authentication mechanism for the requested resource.
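For the Basic mechanism mentioned above, the client answers the 401 challenge with an Authorization header whose value is the word `Basic` followed by the Base64 encoding of `username:password`. The sketch below shows that encoding; the helper name `basic_authorization` is made up for this illustration.

```python
import base64


def basic_authorization(username, password):
    """Build the value of an Authorization header for Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"
```

With the sample credentials used in the HTTP authentication RFCs, `basic_authorization("Aladdin", "open sesame")` produces `"Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="`. Note that Base64 is an encoding, not encryption, so Basic authentication offers no confidentiality on its own.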

4. Entity headers

Both request and response messages can carry an entity. An entity consists of entity header fields and an entity body, but the two do not always have to be sent together: it is possible to send only the entity header fields. Entity headers define metadata about the entity body (eg: whether there is an entity body) and about the resource identified by the request.

Common entity headers

Content-Encoding

The Content-Encoding entity header field is used as a modifier of the media type. Its value indicates which additional content encodings have been applied to the entity body, and therefore which decoding mechanisms must be applied to obtain the media type referenced by the Content-Type header field. Content-Encoding is used to record how a document is compressed. Eg: Content-Encoding: gzip
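A minimal sketch of how Content-Encoding: gzip works in practice, using Python's standard `gzip` module. The two helper names are invented for this illustration: one compresses a body and produces the matching header fields, the other undoes the encoding on the receiving side.

```python
import gzip


def gzip_entity(body):
    """Compress a body and return (headers, payload) describing it."""
    payload = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        # Content-Length counts the bytes actually sent, i.e. the compressed size.
        "Content-Length": str(len(payload)),
    }
    return headers, payload


def decode_entity(headers, payload):
    """Reverse the content encoding named in the headers, if any."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(payload)
    return payload
```

A round trip such as `decode_entity(*gzip_entity(b"<html>...</html>"))` returns the original bytes, while a payload without a Content-Encoding header is passed through unchanged.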

Content-language

The Content-Language entity header field describes the natural language used by the resource. If this field is not set, the entity content is assumed to be intended for readers of all languages. Eg: Content-Language: da

The Content-Length entity header field specifies the length of the entity body, expressed as a decimal number of bytes.

The Content-Type entity header field indicates the media type of the entity body sent to the recipient. Eg: Content-Type: text/html; charset=UTF-8

The Last-Modified entity header field indicates the date and time at which the resource was last modified.

Expires

The Expires entity header field specifies the date and time at which the response expires. So that a proxy server or browser refreshes its cache after a period of time (when a previously visited page is accessed again, it is loaded directly from the cache, which shortens response time and reduces server load), we can use the Expires entity header field to specify when the page expires. Eg: Expires: Thu, 15 Sep 2006 16:23:12 GMT

HTTP/1.1 clients and caches must treat invalid date formats (including 0) as already expired. Eg: to prevent the browser from caching a page, we can also use the Expires entity header field and set it to 0. The JSP program is as follows: response.setDateHeader("Expires", 0);
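The expiry rule can be sketched from the cache's point of view. The helpers below are illustrative, with made-up names: `http_date` formats a timestamp in the HTTP date format, and `is_expired` treats any value that cannot be parsed as a date, such as "0", as already expired.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment):
    """Format an aware UTC datetime as an HTTP date string."""
    return format_datetime(moment, usegmt=True)


def is_expired(expires_value, now):
    """Return True if a cached response with this Expires value is stale."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Invalid formats such as "0" count as already expired.
        return True
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return expires_at <= now
```

With `now = datetime(2020, 1, 1, tzinfo=timezone.utc)`, both `is_expired("0", now)` and `is_expired("Thu, 15 Sep 2006 16:23:12 GMT", now)` are true, whereas a date in 2021 is not yet expired.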

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
