Many web front-end optimization principles are about making network communication more efficient, but applying them is full of traps. If we do not understand the technical reasoning hidden behind these principles, we can easily fall into those traps and end up without the result we expected. Today I will analyze some details of the communication between the browser and the server. I hope this analysis helps you understand the reasoning behind the optimization principles and, in the end, apply them better.
Website communication is built on the HTTP protocol, and HTTP in turn uses TCP/IP as its underlying transport. In TCP, however, both establishing and closing a connection are expensive, mainly because of the three-way handshake used to establish a connection and the four-way wave used to close it. Let's take a look at the following figure:
The three messages in the red box in the middle of the figure are the ones TCP/IP sends to confirm that a connection has been established, and the four in the blue boxes are the ones it sends to confirm that the connection is finally closed. The actual HTTP request and response account for only two messages. This means that if the browser and the server open and close a TCP/IP connection for every interaction, nine messages travel between them, and only two of them actually serve the user's request. In other words, roughly 80% (7 of 9) of the communication in such a request does nothing for the business requirement and is effectively lost. Of course, this ratio assumes that all nine messages are about the same size. If the request and response carry a larger amount of data, the share lost to connecting and disconnecting shrinks, but even then the cost comes on top of the time needed to handle the request itself. If the browser and the server are far apart, the seven extra messages hurt even more. In any case, whenever the TCP/IP three-way handshake and four-way wave happen, they have a noticeable effect on the efficiency of a network request.
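The arithmetic behind the "about 80%" figure can be written down in a few lines. This is only a sketch of the message-counting argument above: the function name `overhead_ratio` and its parameters are my own, and it assumes, as the text does, that every message costs about the same.

```python
def overhead_ratio(requests: int = 1, setup: int = 3, teardown: int = 4,
                   per_request: int = 2) -> float:
    """Fraction of messages on one TCP connection that carry no HTTP payload.

    setup       -- messages in the three-way handshake
    teardown    -- messages in the four-way wave
    per_request -- one HTTP request plus one HTTP response
    requests    -- how many request/response pairs share the connection
    """
    control = setup + teardown
    total = control + per_request * requests
    return control / total


# One request per connection is the case discussed in the text: 7 of 9 messages.
for n in (1, 5, 20):
    print(f"{n:>2} request(s) per connection: {overhead_ratio(n):.0%} overhead")
```

With one request per connection the result is 7/9, about 78%. The ratio drops quickly once several requests share the same connection, which is the motivation for reusing connections.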
To cut down on these extra messages, the HTTP protocol itself changed: HTTP began to use persistent (long) connections. With persistent connections, a site only needs to open a connection once, and the browser keeps reusing it until the user closes the browser. HTTP 1.0 does not enable persistent connections by default, so with HTTP 1.0 we have to turn them on manually by setting Connection: keep-alive in the HTTP header. In HTTP 1.1 persistent connections are on by default and need no manual setting. Almost all current browsers support HTTP 1.1, so most of the time we do not need to enable persistent connections ourselves.
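The default behaviour of the two protocol versions can be summarized as a small decision function. This is a minimal sketch, not part of any HTTP library: `is_persistent` is a name I made up, and the rule it encodes is the one described above (HTTP 1.0 stays open only with Connection: keep-alive, HTTP 1.1 stays open unless Connection: close is sent).

```python
def is_persistent(http_version: str, headers: dict[str, str]) -> bool:
    """Return True if the connection should stay open after this message."""
    # Header names are case-insensitive, so look for "Connection" in any casing.
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value.lower()
    tokens = {token.strip() for token in connection.split(",")}

    if http_version == "HTTP/1.0":
        # HTTP 1.0: short connection unless keep-alive is requested explicitly.
        return "keep-alive" in tokens
    # HTTP 1.1: persistent unless one side asks to close.
    return "close" not in tokens


print(is_persistent("HTTP/1.0", {}))                            # False
print(is_persistent("HTTP/1.0", {"Connection": "keep-alive"}))  # True
print(is_persistent("HTTP/1.1", {}))                            # True
print(is_persistent("HTTP/1.1", {"Connection": "close"}))       # False
```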
Although persistent connections reduce the number of three-way handshakes and four-way waves, a persistent connection has to be maintained by both the browser and the server for a long time once it is established, and that consumes resources on both sides. On the server in particular, holding many persistent connections open reduces its capacity to handle concurrent requests. For this reason early browsers limited the number of HTTP 1.1 connections. An antique like IE7, for example, allows at most 2 persistent connections under HTTP 1.1, while under HTTP 1.0, which uses short connections by default, it can open 4. The figure below illustrates this:
Besides making each connection more efficient, another way to speed up page loading is to load over several connections in parallel. It is like several people working together on one task: they will finish faster than one person alone. Page loading suits concurrent loading very well; loading the images of a page in parallel, for example, is certainly faster than loading them one after another. Now back to the number of connections a browser supports. Because early browsers allowed different numbers of connections for HTTP 1.0 and HTTP 1.1, some sites with a lot of static resources, Wikipedia for example, served those resources from servers that used HTTP 1.0 in order to exploit the concurrency advantage. More static resources could then be loaded in parallel, and the overall gain was much larger than the loss from the extra TCP/IP handshakes and waves. Today this trick has little effect, because newer browsers have evened out the number of connections they allow for the two HTTP versions, and since persistent connections can be reused, they are now more efficient than short connections.
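To see why the HTTP 1.0 trick once paid off and no longer does, here is a deliberately crude model. Everything in it is an assumption for illustration: the function names, the abstract time units, and the simplification that resources load in equal-sized batches, one per connection.

```python
import math


def short_connection_time(resources: int, connections: int,
                          setup: float, transfer: float) -> float:
    """Short connections: every resource pays the connect/disconnect cost."""
    batches = math.ceil(resources / connections)
    return batches * (setup + transfer)


def persistent_connection_time(resources: int, connections: int,
                               setup: float, transfer: float) -> float:
    """Persistent connections: each connection is set up once, then reused."""
    batches = math.ceil(resources / connections)
    return setup + batches * transfer


# Assumed numbers: 40 static resources, connection overhead 1 unit, transfer 2 units.
N, SETUP, TRANSFER = 40, 1, 2

# Old browser limits from the text: 4 connections for HTTP 1.0, 2 for HTTP 1.1.
print(short_connection_time(N, 4, SETUP, TRANSFER))       # 30
print(persistent_connection_time(N, 2, SETUP, TRANSFER))  # 41

# Newer browsers allow the same number for both, for example 6.
print(short_connection_time(N, 6, SETUP, TRANSFER))       # 21
print(persistent_connection_time(N, 6, SETUP, TRANSFER))  # 15
```

Under these assumed numbers, the extra parallelism of HTTP 1.0 outweighs the repeated handshakes when the limits are 4 versus 2, and the advantage disappears as soon as both versions get the same number of connections.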
The connection limit discussed above applies per domain name. If some of a page's static resources live under a different domain name, the page can therefore load more in parallel. For example, we can put static resources that rarely change, such as images, external CSS files and external JavaScript files, on a separate static resource server whose URL uses a different domain name from the page itself; the number of connections available for loading the page in parallel then doubles. It also means the browser has to maintain more persistent connections. Yahoo! engineers once concluded that the reasonable number of domain names for a page is two. That conclusion is many years old, and browser and server performance has changed since then, so the number could probably be raised a little.
Personally, though, I think a page should not use too many domain names. If we apply front-end optimization techniques properly, two domain names are enough, and more add little value unless your site is unusual. Look at how many connections browsers already support: mostly 6 per domain name, and IE9 even reaches 10. Doubling those gives 12 and 20 connections, and doubling again gives 24 and 40. Those numbers are frightening. If one page opens that many concurrent connections, and every site open in the browser does the same, I suspect the computer itself will struggle. A dozen or so connections are enough; if we make good use of them, performance will already improve a great deal. A site that needs too many concurrent connections is really showing that it has not done a good job of reducing the number of HTTP requests.
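Spreading static resources over more than one domain name can be sketched as follows. The host names `static1.example.com` and `static2.example.com` and both function names are hypothetical. The one design point worth noting is the stable hash: the same path should always map to the same host, otherwise the browser cache is defeated.

```python
import zlib

# Hypothetical static resource hosts; the text argues that two are usually enough.
SHARDS = ["static1.example.com", "static2.example.com"]


def shard_url(path: str, shards: list[str] = SHARDS) -> str:
    """Map a resource path to one of the static hosts, always the same one."""
    # zlib.crc32 is stable across runs, unlike the built-in hash() for strings.
    index = zlib.crc32(path.encode("utf-8")) % len(shards)
    return f"http://{shards[index]}{path}"


def max_parallel_connections(per_host_limit: int, hosts: int) -> int:
    """Upper bound on concurrent connections when the limit applies per host."""
    return per_host_limit * hosts


print(shard_url("/img/logo.png"))
print(max_parallel_connections(6, 2), max_parallel_connections(10, 2))  # 12 20
print(max_parallel_connections(6, 4), max_parallel_connections(10, 4))  # 24 40
```

The last two lines reproduce the numbers in the text: with per-host limits of 6 and 10, two hosts give 12 and 20 connections, and four hosts give 24 and 40.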
Back to front-end optimization techniques. If we look at them carefully, we find that many of them apply to the synchronous request scenario. They can also be used in the asynchronous loading scenario, but before any concurrent loading can happen there, a single-threaded asynchronous load has to run first. That single-threaded asynchronous load is a bit like a single point of failure in a distributed system and can easily become the Achilles heel of the whole process, so using synchronous requests sensibly also prepares the ground for better asynchronous performance. Above I talked about how many connections a browser can open under one domain name. But anyone who does front-end development knows that we cannot actually control the number of connections when we build a page. So the question is: under what conditions does the browser open all these connections? It is an interesting question, so let's look at the waterfall chart below:
In the waterfall above, the resources downloaded in parallel are images. Incidentally, on pages that have been designed for concurrent loading, the resources loaded concurrently are always purely static ones. So is the number of concurrent connections related to how we design the page? Let us first list the static resources of a page. There is the HTML itself, and any inline CSS and JavaScript code counts as part of the HTML. Besides the HTML there are external CSS files, external JavaScript files and the images used in the page. How do these elements drive parallel loading? In other words, how do they prompt the browser to open more connections at the same time?
First we need to be clear about one thing: there is a precondition for the browser to open more connections and run them in parallel, which is that the resources do not affect one another when they are loaded in parallel. External CSS files and images, for example, are ready to use as soon as they are downloaded. They contain no logic that has to be processed afterwards, so loading them in parallel does not interfere with the display of the page. JavaScript is more troublesome. External JavaScript code contains logic, and some of that logic may well affect how the page is displayed, so once a JavaScript file has been downloaded the browser has to execute it immediately. As a result we see a waterfall like the one shown here:
The white space in the figure above is the time the browser spends executing JavaScript code. How many connections the browser opens is the browser's own decision, driven mainly by its wish to download concurrently and efficiently. Since browser connections basically use HTTP 1.1, that is, persistent connections, a connection is kept open for a long time once it has been established. If the persistent connections go to a separate static resource server, this is not a problem. If they go to the main domain name, it is.
During the initial page load the main domain name is used to download the HTML. If we also put other static resources under the main domain name to improve concurrent downloading, the browser may end up holding more persistent connections to the main domain's server. After the page has loaded, it mostly works through Ajax, and Ajax will often reuse only one of those persistent connections. The remaining connections then sit idle, and idle connections still consume system resources on both the browser and the server. So the types of resources requested from the main domain name must be controlled carefully: whatever can be moved to a separate static resource server should be moved, so that as far as possible the main domain name only handles requests that involve business logic. That improves the utilization of system resources effectively.
Thinking further about this, suppose a reverse proxy sits in front of the business application servers on the server side. A reverse proxy is typically built on a static resource server, whose capacity for concurrent load far exceeds that of a business application server. So if we have accidentally put too many static resources under the main domain name, a reverse proxy in the back end can also mitigate the computing resources lost to those persistent connections.
All of the scenarios above concern synchronous browser requests. Is serving static resources in parallel still useful for asynchronous requests? Before answering, let's first ask whether asynchronous loading can cause new static resources to be loaded. It certainly can, especially with front-end MVC, where templating is done in the browser. Some HTML templates may initially be stored as variables in the JavaScript code, and such a template may well use many new images. When Ajax gets data from the server, we render the template and add the result to the page's DOM. When the browser re-renders the page and finds many new images to load, it may open several connections to load them in parallel and speed things up. If external CSS files are loaded dynamically through Ajax, parallel loading becomes even more pronounced, because a CSS file may reference a large number of images. If we keep rarely changing static resources on a separate static resource server, this parallel loading will not open more persistent connections under the main domain name. So serving static resources from a static resource server with its own domain name brings many benefits.
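One way to check where the images of a client-side template will be fetched from is to parse the template and group its image URLs by host. The sketch below uses only the Python standard library; the template string, the host names and the function `images_by_host` are made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def images_by_host(template: str, page_host: str) -> dict[str, list[str]]:
    """Group image URLs by the host the browser would connect to."""
    collector = ImageCollector()
    collector.feed(template)
    collector.close()
    grouped = defaultdict(list)
    for src in collector.sources:
        # A relative URL has no host part, so it is loaded from the page's own host.
        host = urlparse(src).netloc or page_host
        grouped[host].append(src)
    return dict(grouped)


# Hypothetical template of the kind stored in a JavaScript variable.
TEMPLATE = (
    '<li><img src="http://static.example.com/avatars/{{id}}.png">'
    '<img src="/img/badge.png"></li>'
)
print(images_by_host(TEMPLATE, "www.example.com"))
```

Any image that ends up under the page's own host (here `/img/badge.png`) is a candidate for moving to the static resource server, so that rendering the template does not open extra persistent connections to the main domain name.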
The HTTP 2.0 protocol is still being drafted. If it lands, it will have a major impact on web front-end optimization. HTTP 2.0 intends to use only one TCP/IP connection on a page, but it multiplexes that connection, so that a single connection can carry operations in parallel and is used much more fully. Once HTTP 2.0 lands, the front-end techniques that exist to reduce the number of HTTP requests will lose their market, because the protocol itself handles the concurrency problem. Techniques such as merging external CSS files, merging external JavaScript files and CSS sprites will probably become history.
It seems this topic is not finished yet, so I will continue it in the next article. Today is the Lantern Festival, and I wish you all a happy holiday.
Original link: http://www.cnblogs.com/sharpxiajun/p/4316840.html
[Repost] Thoughts on the evolution of large-scale website technology (20) -- Website static processing -- Web front-end optimization -- Middle part (12)