High-performance networking in Google Chrome (1)

The following is a draft chapter from "The Performance of Open Source Applications" (POSA), the successor to "The Architecture of Open Source Applications". POSA collects a number of papers on performance optimization and design, as well as performance management during development. It is expected to be released in the spring of 2013.

By Ilya Grigorik on January 31, 2013 (translated by horky [http://blog.csdn.net/horkychen])

Google Chrome's history and guiding principles

This part is not translated in detail; only the core points are listed.

The core principles driving Chrome's continued development include:

  • Speed: build the fastest browser.
  • Security: provide users with the most secure environment.
  • Stability: provide a resilient and stable platform.
  • Simplicity: sophisticated technology wrapped in a simple user experience.

This article focuses on the first point, speed.

Performance

A modern browser is a platform, just like an operating system. Browsers before Chrome were single-process applications in which all pages shared the same address space and resources. Introducing a multi-process architecture is Chrome's best-known improvement.

Within a process, a web application mainly has to perform three tasks: fetching resources, page layout and rendering, and running JavaScript. Rendering and script execution are interleaved on a single thread, so the DOM is never modified concurrently; this is partly because JavaScript itself is a single-threaded language. Optimizing how rendering and scripts run together is therefore extremely important for both page developers and browser developers.

Chrome's rendering engine is WebKit, and its JavaScript engine is the heavily optimized V8 JavaScript runtime. However, if the network is slow, optimizing V8's JavaScript execution or WebKit's parsing and rendering helps very little. Even the cleverest cook cannot make a meal without rice: if the data has not arrived, the browser can only wait.


For the user experience, the most noticeable gains come from optimizing the order, priority, and latency of each resource load. You may not have noticed, but Chrome's network module improves every day, gradually reducing the loading cost of each resource: it learns from DNS lookups, remembers the topology of the web, pre-connects to likely target URLs, and more. From the outside it is a simple resource-loading mechanism, but inside it is a fascinating world.

About Web Applications

Before getting started, let's take a look at the current network needs of web pages or web applications.

The HTTP Archive project tracks how web pages are built. In addition to page content, it analyzes the number and types of resources, the headers, and other metadata of popular pages. The following are the averages across the roughly 300,000 popular pages it tracks:
  • 1280 KB in size
  • Made up of 88 resources (images, JavaScript, CSS, ...)
  • Connects to more than 15 distinct hosts

These figures have kept growing over the past few years, with no sign of stopping, which shows that we are building ever larger and more ambitious web applications. Note that, on average, each resource is only about 15 KB (1280 KB / 88 resources), so the vast majority of network transfers are short and bursty. This runs against the design of TCP, which is optimized for large, streaming downloads, and it introduces some complications. The following example gives a glimpse of them.

The W3C Navigation Timing specification defines a set of APIs through which we can observe the timing and performance data of each request in the browser. In detail: given the URL of a web resource, the browser first checks its local cache and application cache. If the resource was fetched before and came with appropriate cache headers (for example Expires, Cache-Control, etc.), the request is served from the cache. After all, the fastest request is a request not made. Otherwise, if the resource has to be revalidated, has expired, or has never been seen at all, a costly network request must be sent.
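
The cache check described above can be sketched in a few lines of Python. This is a simplified illustration and not Chrome's actual logic: the function name `is_fresh` is hypothetical, header lookup is case-sensitive for brevity, and only `max-age`, `no-cache`, `no-store`, and `Expires` are considered.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers: dict, fetched_at: datetime, now: datetime) -> bool:
    """Return True if a cached response may be reused without touching the network."""
    cache_control = headers.get("Cache-Control", "")
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False  # must not be reused without revalidation
    match = re.search(r"max-age=(\d+)", cache_control)
    if match:
        # max-age takes precedence over Expires
        return now - fetched_at < timedelta(seconds=int(match.group(1)))
    expires = headers.get("Expires")
    if expires:
        try:
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False  # an unusable Expires value is treated as expired
    return False  # no freshness information: revalidate or refetch


fetched = datetime(2013, 1, 31, 12, 0, 0, tzinfo=timezone.utc)
print(is_fresh({"Cache-Control": "max-age=60"}, fetched, fetched + timedelta(seconds=30)))
```

Only when a check like this fails does the request continue to the connection and DNS steps described next.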

Given a host name and a resource path, Chrome first checks whether an existing open connection can be reused; sockets are pooled by {scheme, host, port}. If a proxy is configured, or a proxy auto-config (PAC) script is specified, Chrome checks for connections through the appropriate proxy instead. A PAC script can return different proxies based on the URL or on other rules it defines, and each proxy can have its own socket pool. Finally, if none of the above applies, the request starts with a DNS lookup to obtain the host's IP address.

If we are lucky, the host name is already cached. Otherwise, a DNS query must be issued first. The time this takes depends on factors such as the ISP, the popularity of the requested site, the likelihood that the host name is in intermediate caches, and the response time of the authoritative servers. In other words, there are many variables here, and it is not unusual for a DNS lookup to take several hundred milliseconds. Once the resolved IP address is available, Chrome opens a new TCP connection to the destination, which means performing the three-way handshake: SYN > SYN-ACK > ACK. This exchange is required for every new TCP connection and there is no shortcut. Depending on the distance and the routing path, it may take several hundred milliseconds or even several seconds. Up to this point, we have not received a single byte of application data.

Once the TCP handshake is complete, if we are connecting to an HTTPS address there is also an SSL handshake, which adds up to two more round trips of latency. If the SSL session is cached, only one extra round trip is needed.

Finally, Chrome can send the HTTP request (marked as requestStart in the Navigation Timing sequence). After receiving the request, the server sends the response data back to the client. This costs at least one network round trip plus the server's processing time. Then the request is complete. But what if the response is an HTTP redirect? Then we have to start the whole process over again. If your pages contain unnecessary redirects, you had better think twice!

Have you been counting all the latencies? Let's assume a typical broadband environment: no local cache, a relatively fast DNS lookup (50 ms), a TCP handshake, an SSL negotiation, a fast server response time (100 ms), and a round-trip time of 80 ms (the average in the United States):
  • 50 ms for DNS
  • 80 ms for the TCP handshake (one RTT)
  • 160 ms for the SSL handshake (two RTTs)
  • 40 ms for sending the request to the server
  • 100 ms for server processing
  • 40 ms for the response from the server
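
As a quick check, the components listed above can be tallied in a few lines of Python. The variable names are illustrative only; the figures are the ones assumed in this example.

```python
RTT_MS = 80  # assumed round-trip time from the example

budget_ms = {
    "dns": 50,
    "tcp_handshake": RTT_MS,        # one round trip
    "ssl_handshake": 2 * RTT_MS,    # two round trips
    "request": RTT_MS // 2,         # one-way trip to the server
    "server_processing": 100,
    "response": RTT_MS // 2,        # one-way trip back to the client
}

total_ms = sum(budget_ms.values())
network_ms = total_ms - budget_ms["server_processing"]
network_share = network_ms / total_ms

print(total_ms, network_ms, round(network_share * 100, 1))  # 470 370 78.7
```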

A single request took 470 milliseconds, of which about 80% was network latency. We clearly have a lot of work to do! In fact, 470 ms is optimistic:

  • If the server response does not fit into the initial TCP congestion window (4-15 KB), more round trips of latency will be introduced.
  • SSL latency can also get worse. If we need to fetch a missing certificate or perform an Online Certificate Status Protocol (OCSP) check, a new TCP connection is required, which can add hundreds or even thousands of milliseconds of latency.
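
To illustrate the first point, here is a rough sketch of how many round trips an idealized TCP slow start needs to deliver a response. It assumes the congestion window doubles every round trip, no packet loss, no receive-window limit, and a segment size of 1460 bytes; the function name and defaults are hypothetical, chosen only for this illustration.

```python
def round_trips_to_deliver(size_bytes: int, init_cwnd_segments: int = 4,
                           mss: int = 1460) -> int:
    """Count round trips needed under idealized slow start (window doubles each RTT)."""
    cwnd = init_cwnd_segments
    sent = 0
    trips = 0
    while sent < size_bytes:
        sent += cwnd * mss  # one full window is delivered per round trip
        cwnd *= 2           # slow start: the window doubles after each round trip
        trips += 1
    return trips


print(round_trips_to_deliver(5_000))        # 1: fits in 4 segments (5840 bytes)
print(round_trips_to_deliver(15_000))       # 2: exceeds the initial window
print(round_trips_to_deliver(15_000, 10))   # 2: just over 10 segments (14600 bytes)
```

With an 80 ms round-trip time, each extra trip adds another 80 ms on top of the 470 ms estimated above.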

What is "fast enough"?

We can see that the server response time is only about 20% of the total latency; the rest is taken up by DNS, the handshakes, and network transit. Past user experience research shows that users react differently to different latencies:

Latency        User response
0-100 ms       Fast
100-300 ms     A little slow
300-1000 ms    The machine is still running
1 s+           Thinking about other things ...
10 s+          I will try again later ...
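
The table can also be expressed as a small lookup, which makes it easy to classify a measured latency. This is only a sketch: the function name is hypothetical, and treating each range as including its lower bound is an assumption, since the table does not say which side the boundaries belong to.

```python
import bisect

# Lower bounds (in ms) at which the next row of the table begins.
THRESHOLDS_MS = [100, 300, 1000, 10000]
LABELS = [
    "Fast",
    "A little slow",
    "The machine is still running",
    "Thinking about other things",
    "I will try again later",
]


def perceived_response(latency_ms: float) -> str:
    """Map a latency in milliseconds to the user response from the table."""
    return LABELS[bisect.bisect_right(THRESHOLDS_MS, latency_ms)]


print(perceived_response(470))  # The machine is still running
```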

The table above also applies to page performance: a page should render, or at least give some visible response, within 250 ms to keep the user engaged. This is not speed for its own sake. Studies at Google, Amazon, Microsoft, and thousands of other sites show that extra latency directly affects results: a faster page attracts more page views, higher user engagement, and higher conversion rates.


Now we know that the ideal latency budget is 250 ms, yet the previous example shows that the DNS lookup, the TCP and SSL handshakes, and the request and response transit add up to 370 ms. Even if we ignore the server processing time, we are almost 50% over budget.

For most users and web developers, the latency of DNS, TCP, and SSL is invisible, and few people ever think about it. That is why Chrome's network module is so sophisticated.

We have identified the problem. Let's take a deeper look at the implementation details ...


A closer look at Chrome's network module

Multi-process architecture

Chrome's multi-process architecture has important implications for how the browser handles network requests. It currently supports four different execution models.


By default, desktop Chrome uses the process-per-site model, which isolates the pages of different sites from each other and groups pages of the same site together. To keep things simple, we can assume the simplest case, in which each tab is a separate process (process-per-tab). From the perspective of network performance there is no essential difference, but the process-per-tab model is easier to understand.

Each tab has a render process, which contains the layout engine (WebKit) used to interpret and lay out the HTML, the V8 JavaScript engine, and the DOM bindings that connect the two. If you are curious about how this part works, there is a great introduction to the plumbing.

Each render process runs in a sandbox and has only limited access to the user's computer, including the network. To use these resources, each render process must communicate with the browser (kernel) process, which enforces the security and access policies for each renderer.

Inter-process communication (IPC) and multi-process resource loading

Communication between the render processes and the kernel process is done through IPC. On Linux and Mac OS X, a socketpair() is used, which provides a named pipe transport for asynchronous communication. Messages from each render process are serialized and passed to a dedicated I/O thread, which dispatches them to the kernel process. On the receiving end, the kernel process provides a filter interface (ResourceMessageFilter) that intercepts resource-related IPC requests; this part is the responsibility of the network module.
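
To make the mechanism concrete, here is a minimal Python sketch of request and reply messages exchanged over a socketpair. It is only an illustration under several assumptions: both ends live in one process, messages are length-prefixed JSON (Chrome uses its own serialization format), and the helper names and message fields are hypothetical.

```python
import json
import socket
import struct


def send_message(sock: socket.socket, payload: dict) -> None:
    """Serialize a message and send it with a 4-byte length prefix."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(body)) + body)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> dict:
    """Read one length-prefixed message and deserialize it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# One end stands in for a render process, the other for the browser process.
renderer_end, browser_end = socket.socketpair()

send_message(renderer_end, {"request_id": 1, "url": "http://example.com/style.css"})
request = recv_message(browser_end)

# The browser side answers using the same request ID so the renderer can match it.
send_message(browser_end, {"request_id": request["request_id"], "status": 200})
reply = recv_message(renderer_end)

renderer_end.close()
browser_end.close()
print(request, reply)
```

The request ID is what lets a renderer with many outstanding requests match each reply to the request that caused it.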

One advantage of this design is that all resource requests are handled entirely on the I/O thread, so UI activity and network events do not interfere with each other. The I/O thread of the browser (kernel) process intercepts the resource request messages and forwards them to the ResourceDispatcherHost singleton for processing.

This singleton interface allows the browser to control each render process's access to the network, and it also enables efficient and consistent resource sharing:

  • Socket pool and connection limits: the browser can cap the number of open sockets at 256 per profile, 32 per proxy, and 6 per {scheme, host, port} group. Note that this allows up to six HTTP and six HTTPS connections to the same {host, port}.
  • Socket reuse: the socket pool keeps persistent TCP connections available for reuse, which avoids the extra DNS, TCP, and SSL (if required) setup time of a new connection.
  • Socket late-binding: a request is associated with an underlying TCP connection only once the socket is ready to send data. This allows better request prioritization (for example, a higher-priority request may arrive while the socket is still connecting) and better throughput (for example, a "warm" TCP connection may become available for reuse while a new one is being opened). It also serves as a general mechanism for TCP pre-connect and a number of other optimizations.
  • Consistent session state: authentication, cookies, and cached data are shared among all render processes.
  • Global resource and network optimizations: the browser can make better decisions across all render processes and outstanding requests, for example by giving priority to requests from the foreground tab.
  • Predictive optimizations: by observing network activity, Chrome builds and continuously refines predictive models to improve performance.
  • ... and the list keeps growing.
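
The connection limits and late binding described in the list above can be sketched as follows. This is a toy model and not Chrome's implementation: the class and method names are hypothetical, the per-proxy limit is left out for brevity, and sockets are represented only by counters.

```python
import heapq
import itertools
from collections import defaultdict

MAX_PER_PROFILE = 256  # limits quoted in the list above
MAX_PER_GROUP = 6      # per {scheme, host, port}


class SocketPoolSketch:
    def __init__(self):
        self.open_per_group = defaultdict(int)
        self.total_open = 0
        # group -> heap of (-priority, sequence number, request)
        self.pending = defaultdict(list)
        self._seq = itertools.count()

    def can_open(self, group) -> bool:
        return (self.open_per_group[group] < MAX_PER_GROUP
                and self.total_open < MAX_PER_PROFILE)

    def submit(self, group, request, priority: int) -> bool:
        """Queue a request; return True if a new connection may be started for it."""
        heapq.heappush(self.pending[group], (-priority, next(self._seq), request))
        if self.can_open(group):
            self.open_per_group[group] += 1
            self.total_open += 1
            return True
        return False

    def socket_ready(self, group):
        """Late binding: give the ready socket to the best pending request."""
        if not self.pending[group]:
            return None
        _, _, request = heapq.heappop(self.pending[group])
        return request


pool = SocketPoolSketch()
group = ("http", "example.com", 80)
pool.submit(group, "image.png", priority=1)
pool.submit(group, "app.js", priority=5)  # arrives while the first socket connects
print(pool.socket_ready(group))  # app.js
```

Because requests are bound to a socket only when one becomes ready, the first connection to finish its handshake serves the higher-priority request, even though that request arrived later.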

From the point of view of a single render process, sending a resource request is simple: it sends a request message, tagged with a unique ID, over IPC to the browser kernel process, and the kernel process handles the rest.

Cross-platform resource loading

Cross-platform support is also a major consideration for Chrome's network module, which covers Linux, Windows, OS X, Chrome OS, Android, and iOS. To this end, the network module is implemented as a mostly single-threaded, cross-platform library (only the cache and the proxy run on separate threads), so that the infrastructure can be shared between platforms, the same performance optimizations apply everywhere, and all platforms can be optimized at the same time.

The code lives in the "src/net" subdirectory. This article will not go into each component, but understanding the code layout helps us understand what the module can do. For example:

net/android       Bindings to the Android runtime
net/base          Common network utilities, such as host resolution, cookies, network change detection, and SSL certificate management
net/cookies       Cookie storage, management, and retrieval
net/disk_cache    Disk and memory cache implementation
net/dns           Asynchronous DNS resolver implementation
net/http          HTTP protocol implementation
net/proxy         Proxy (SOCKS and HTTP) configuration, resolution, script fetching, ...
net/socket        Cross-platform implementations of TCP sockets, SSL streams, and socket pools
net/spdy          SPDY protocol implementation
net/url_request   URLRequest, URLRequestContext, and URLRequestJob implementations
net/websockets    WebSockets protocol implementation

Each of the above items is worth reading. The code is well organized and you will find a lot of unit tests.

Architecture and performance on the mobile platform

Mobile browser usage is growing, and the Chrome team treats optimizing the mobile experience as a top priority. Note that mobile Chrome is not a direct port of the desktop version, because that would not deliver a good user experience. The mobile environment is inherently far more resource constrained, and some of its basic operating parameters are different:

  • Desktop users navigate with the mouse, may have overlapping windows, have a large screen, and mostly do not need to worry about battery. The network is usually very stable, and storage and memory are plentiful.
  • Mobile users navigate with touch and gestures, have a small screen, and are limited by battery power. They are often on metered, expensive connections, and storage and memory are quite limited.

Furthermore, there is no single typical mobile device; there is a wide range of devices with varying hardware. Chrome has to adapt to the constraints of each of them. Fortunately, Chrome's different execution models make this possible.

On Android, Chrome uses the same multi-process architecture as the desktop version: one browser kernel process and one or more render processes. However, because of memory limits, mobile Chrome may not be able to run a dedicated render process for each tab. Instead, it determines the optimal number of render processes based on available memory and other constraints, and shares the render processes among multiple tabs.

If memory is insufficient, or Chrome cannot run multiple processes for other reasons, it switches to a single-process, multi-threaded model. On iOS devices, for example, sandboxing restrictions mean that Chrome can only run in this mode.

For network performance, Chrome on Android and iOS uses the same network module as on other platforms. This enables cross-platform network optimizations, which is one of Chrome's significant advantages. What differs, and is tuned according to network conditions and device capabilities, includes the priority of speculative optimizations, socket timeouts and management logic, and cache sizes.

For example, to preserve battery life, mobile Chrome tends to close idle sockets lazily: old sockets are closed only when new ones are opened, which minimizes radio use. In addition, because pre-rendering (introduced later) can consume significant network and processing resources, it is usually enabled only on Wi-Fi.

The mobile browsing experience deserves a chapter of its own, perhaps in the next installment of the POSA series.

(Unfinished, To be continued ......)

When reprinting, please credit the source: http://blog.csdn.net/horkychen
Original article address: http://www.igvita.com/posa/high-performance-networking-in-google-chrome/
