Predictive optimization with the Chrome Predictor
Chrome gets faster as you use it. This is implemented by a singleton Predictor object, which is instantiated in the browser kernel process. Its sole responsibility is to observe network activity and learn from it, so that it can anticipate the user's likely next actions. Some examples:
- A user hovering over a link is a good indicator of a likely upcoming navigation. At this point Chrome can start a DNS lookup and a TCP handshake in advance. On average it takes a user nearly 200 ms to actually click, which is enough time to finish the DNS and TCP steps, so hundreds of milliseconds of latency are eliminated.
- When a high-likelihood suggestion is triggered in the address bar (Omnibox/URL bar), Chrome likewise starts a DNS lookup and a TCP pre-connect, and may even pre-render the page in a hidden tab.
- Each of us has a list of sites we visit every day. Chrome can learn the subresources on those pages and speculatively pre-resolve their hostnames, and possibly even prefetch them, to speed up browsing.
These three items are only a few of many.
Chrome learns the topology of the web as you use it, not just your browsing patterns. Ideally, this saves hundreds of milliseconds of latency on each navigation and brings the user closer to instant page loads. To that end, Chrome relies on the following core optimization techniques:
| Technique | Description |
| --- | --- |
| DNS pre-resolve | Resolve hostnames ahead of time to reduce DNS latency |
| TCP pre-connect | Connect to the destination server ahead of time to reduce TCP handshake latency |
| Resource prefetching | Fetch the page's critical resources ahead of time to speed up rendering of the page |
| Page prerendering | Fetch the entire page and its subresources ahead of time so that it can be displayed immediately |
Each decision to use one or more of these techniques is weighed against a number of constraints. After all, each one is only a speculative optimization: if the prediction is poor, it introduces redundant work and network traffic, and may even hurt load time.
How does Chrome deal with this? The Predictor gathers as many signals as it can, including user actions, historical browsing data, and signals from the renderer and from the network stack itself.
Alongside the ResourceDispatcherHost, which is responsible for scheduling network transactions in Chrome, the Predictor object creates a set of filters on user and network activity:
- An IPC channel filter monitors signals from the render processes.
- A ConnectInterceptor object is added to each request so that the traffic patterns and the success metrics of each request can be tracked.
The render process sends messages to the browser process on a number of events, and these events are defined in an enumeration, ResolutionMotivation, for ease of use (url_info.h):
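enum ResolutionMotivation {
  MOUSE_OVER_MOTIVATED,       // Mouse-over by the user.
  OMNIBOX_MOTIVATED,          // Omnibox suggested resolving this.
  STARTUP_LIST_MOTIVATED,     // This resource is on the top 10 startup list.
  EARLY_LOAD_MOTIVATED,       // Sometimes the prefetcher is used to set up a connection in advance.

  // The following describe predictive prefetching triggered by a navigation.
  // referring_url_ is also set when these are used.
  STATIC_REFERAL_MOTIVATED,   // An external database suggested this resolution.
  LEARNED_REFERAL_MOTIVATED,  // A prior navigation suggested this resolution.
  SELF_REFERAL_MOTIVATED,     // Guess about whether another connection is needed.
  // <snip> ...
};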
Given such a signal, the Predictor's goal is to evaluate its likelihood of success and then trigger the corresponding action where appropriate. Each hint carries a likelihood of success, a priority, and a timestamp, which together are used to maintain a priority queue of speculative optimizations. Finally, the success rate of each request dispatched from this queue is reported back to the Predictor, which uses that data to refine its future decisions.
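The priority queue described above can be sketched as follows. This is only an illustration: the `Prediction` fields, the `PredictionQueue` class, and the ordering rule (priority first, then likelihood, then age) are assumptions made for the example, not Chrome's actual data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <string>
#include <vector>

// Hypothetical record for one speculative action; field names are illustrative.
struct Prediction {
  std::string hostname;
  double likelihood;       // estimated chance the user needs this host, 0..1
  int priority;            // larger means more urgent
  std::int64_t queued_ms;  // timestamp at which the hint arrived
};

// Returns true when `a` is less urgent than `b`: lower priority, then lower
// likelihood, then newer timestamp (older hints are served first).
struct LessUrgent {
  bool operator()(const Prediction& a, const Prediction& b) const {
    if (a.priority != b.priority) return a.priority < b.priority;
    if (a.likelihood != b.likelihood) return a.likelihood < b.likelihood;
    return a.queued_ms > b.queued_ms;
  }
};

class PredictionQueue {
 public:
  void Add(const Prediction& p) {
    assert(p.likelihood >= 0.0 && p.likelihood <= 1.0);
    queue_.push(p);
  }

  bool Empty() const { return queue_.empty(); }

  // Removes and returns the most urgent prediction. Requires !Empty().
  Prediction Next() {
    assert(!queue_.empty());
    Prediction top = queue_.top();
    queue_.pop();
    return top;
  }

 private:
  std::priority_queue<Prediction, std::vector<Prediction>, LessUrgent> queue_;
};
```

A caller would `Add` a prediction for each incoming hint and drain the queue with `Next` whenever network resources are free. Feeding the observed success rate back into `likelihood` is left out of this sketch.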
Chrome Network Architecture Summary
- Chrome uses a multi-process architecture, which isolates the render processes from the browser process.
- Chrome maintains a single instance of the resource dispatcher, which runs in the browser kernel process and is shared across all render processes.
- The network stack is cross-platform and, in most cases, a single-process library.
- The network stack uses non-blocking operations to manage all network tasks.
- The shared network stack allows efficient resource prioritization and reuse, and lets the browser optimize globally across multiple processes.
- Each render process communicates with the resource dispatcher via IPC.
- The resource dispatcher intercepts resource requests through a custom IPC filter.
- The Predictor observes resource requests and network responses in order to learn, and optimizes subsequent network requests.
- Based on the traffic patterns it has learned, the Predictor may speculatively schedule DNS resolution, TCP handshakes, and even resource requests, saving the user hundreds of milliseconds.
With the internals out of the way, let's look at the optimizations that users can actually feel. It all starts with a freshly launched Chrome.
Optimizing the cold-boot experience
The first time you launch the browser, it cannot know your habits or your favorite pages. In practice, though, most of us follow a similar routine after a cold boot: checking email, then visiting a favorite news site, a social site, an internal portal, and so on. The specific sites differ, but the sessions are similar enough for the Predictor to speed up this routine.
Chrome remembers the 10 hostnames that the user most commonly visits after launching the browser. When the browser starts, Chrome speculatively pre-resolves these hostnames. You can view the list by opening chrome://dns in Chrome; the startup hostnames are listed in the top table of that page.
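As a rough illustration of the startup list, the sketch below counts hostnames visited after launch and returns the most frequent ones. The `StartupHostTracker` class and its methods are hypothetical names for this example; how Chrome actually records and ranks these hosts is not described in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tracker that counts hostnames visited shortly after startup.
class StartupHostTracker {
 public:
  void RecordVisit(const std::string& hostname) {
    assert(!hostname.empty());
    ++counts_[hostname];
  }

  // Returns up to `limit` hostnames, most frequently visited first.
  // Ties are broken alphabetically so the result is deterministic.
  std::vector<std::string> TopHosts(std::size_t limit = 10) const {
    typedef std::pair<std::string, int> Entry;
    std::vector<Entry> entries(counts_.begin(), counts_.end());
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) {
                if (a.second != b.second) return a.second > b.second;
                return a.first < b.first;
              });
    std::vector<std::string> hosts;
    for (const Entry& entry : entries) {
      if (hosts.size() == limit) break;
      hosts.push_back(entry.first);
    }
    return hosts;
  }

 private:
  std::map<std::string, int> counts_;
};
```

On the next launch, the hostnames returned by `TopHosts()` would be the candidates for DNS pre-resolution.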
Optimizing interactions with the Omnibox
The Omnibox was one of Chrome's innovations: it is more than a field for typing a destination URL. In addition to remembering the URLs of previously visited pages, it integrates with the search engine and supports full-text search over your history (for example, by typing the page name directly).
As the user types, the Omnibox automatically proposes an action, which is either a URL from the browsing history or a search query. Each proposed action is scored against its past performance. You can open chrome://predictors in Chrome to see this data.
Chrome maintains a history of the prefixes the user has typed, the actions it proposed, and the hit rate of each. In the example list, typing "g" gives a 76% chance that the user is heading to Gmail. Adding an "m" (that is, "gm") raises the likelihood of opening Gmail to 99.8%.
So what does the network stack do with this? The yellow and green entries in that list are important signals for the ResourceDispatcher. For a medium-likelihood page (yellow), Chrome may start a DNS pre-resolution. For a high-likelihood page (green), Chrome also starts a TCP pre-connect once the hostname is resolved. If both complete while the user is still typing, Chrome may pre-render the page in a hidden tab.
Alternatively, if the typed text has no suitable match, Chrome starts a DNS pre-resolution and a TCP pre-connect to the search provider, in anticipation of a search request.
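The escalation from DNS pre-resolution to pre-connect to pre-render can be sketched as a small decision function. The threshold values below are hypothetical; the text only distinguishes medium (yellow) and high (green) likelihood and does not give the cut-offs Chrome uses.

```cpp
#include <cassert>

enum class SpeculativeAction { kNone, kPreResolve, kPreConnect, kPreRender };

// Hypothetical thresholds, chosen only for illustration.
constexpr double kMediumConfidence = 0.5;  // "yellow"
constexpr double kHighConfidence = 0.8;    // "green"

// Picks the speculative action for an Omnibox suggestion.
// `connection_ready` means DNS and TCP setup have already completed.
SpeculativeAction ChooseAction(double confidence, bool connection_ready) {
  assert(confidence >= 0.0 && confidence <= 1.0);
  if (confidence >= kHighConfidence) {
    return connection_ready ? SpeculativeAction::kPreRender
                            : SpeculativeAction::kPreConnect;
  }
  if (confidence >= kMediumConfidence) return SpeculativeAction::kPreResolve;
  return SpeculativeAction::kNone;
}
```

Calling `ChooseAction` again after each keystroke, with the updated confidence, gives the step-by-step escalation described above.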
On average, it takes a user hundreds of milliseconds to type a query and evaluate the suggestions. During that time Chrome can pre-resolve, pre-connect, and even pre-render in the background. By the time the user presses Enter, much of the network latency has already been absorbed.

Optimizing Cache Performance
The fastest request is a request not made. Any discussion of performance has to include caching. Surely you already provide Expires, ETag, Last-Modified, and Cache-Control response headers for all resources on your pages. What's that? Not yet? Then go fix that first!
Chrome has two different implementations of its internal cache: one backed by the local disk and one that lives in memory. The in-memory implementation is mainly used in Incognito browsing mode and is wiped when the window is closed. Both implement the same internal interfaces (disk_cache::Backend and disk_cache::Entry), which greatly simplifies the architecture. If you want to experiment with a caching algorithm of your own, it is easy to plug in.
Internally, the disk cache implements its own set of data structures, which are stored in a dedicated cache directory. There are index files (loaded into memory when the browser starts) and data files (which store the actual data along with the HTTP headers and other metadata). Interestingly, resources smaller than 16 KB are stored in shared block files, where many small entries are packed into one large file, while larger resources get files of their own. Finally, the eviction policy of the disk cache is an LRU, managed with metrics such as access frequency and resource age.
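For readers unfamiliar with LRU eviction, here is a minimal sketch of the idea. It is a plain least-recently-used cache keyed by URL and does not model the access-frequency and age metrics mentioned above; the `LruCache` class is an illustrative name and is unrelated to the `disk_cache` interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Minimal LRU cache: when full, the least recently used entry is evicted.
class LruCache {
 public:
  explicit LruCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Inserts or updates an entry and marks it as most recently used.
  void Put(const std::string& key, const std::string& value) {
    auto it = index_.find(key);
    if (it != index_.end()) {
      it->second->second = value;
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    if (order_.size() == capacity_) {
      index_.erase(order_.back().first);  // evict the least recently used
      order_.pop_back();
    }
    order_.emplace_front(key, value);
    index_[key] = order_.begin();
  }

  // Returns a pointer to the cached value, or nullptr on a miss.
  // A hit marks the entry as most recently used.
  const std::string* Get(const std::string& key) {
    auto it = index_.find(key);
    if (it == index_.end()) return nullptr;
    order_.splice(order_.begin(), order_, it->second);
    return &it->second->second;
  }

  bool Contains(const std::string& key) const { return index_.count(key) != 0; }

 private:
  using Entry = std::pair<std::string, std::string>;
  std::size_t capacity_;
  std::list<Entry> order_;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

The list keeps the usage order and the map gives constant-time lookup, so both `Put` and `Get` avoid scanning the whole cache.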
To inspect the state of the cache, open a tab in Chrome and enter chrome://net-internals/#httpCache. If you want to see the actual HTTP metadata and the cached responses, open chrome://cache, which lists all the resources currently available in the cache. Opening an entry shows its details, including the cached headers.
Optimizing DNS Pre-resolution
DNS pre-resolution has already come up several times. Before going into the details, here is a summary of the cases in which it is triggered, and why:
- The WebKit document parser, which runs in the render process, can provide a list of hostnames for all the links on the current page, and Chrome may choose whether to resolve them in advance.
- The render process may trigger a mouse hover or button-down event as an early signal that the user is about to open a page.
- The Omnibox may trigger a resolution request for a high-likelihood suggestion.
- The Chrome Predictor may request hostname resolution based on past browsing history and resource request data. (This is explained in more detail below.)
- The page itself may explicitly ask Chrome to pre-resolve certain hostnames.
In every case, these are only hints to Chrome. Chrome does not guarantee that the pre-resolution will happen; every hint is evaluated by the Predictor, which decides what to do next. In the worst case, the hostname is not resolved in time and the user has to wait for the DNS lookup, then the TCP connection, and finally the resource fetch. The Predictor takes note of such cases and adjusts its future decisions accordingly. In short, it should only get faster over time.
As mentioned before, Chrome can remember the topology of each page and use that information to accelerate later visits. Recall that an average page consists of 88 resources delivered from more than 30 distinct hosts. Each time you open a page, Chrome records the hostnames of its most commonly used resources. On subsequent visits, Chrome may start DNS resolution, or even a TCP pre-connect, for some or all of those hosts.
Open chrome://dns to see this data. In the example, for a Google+ page, there are 6 subresource hostnames, and for each one Chrome records the number of DNS pre-resolutions, the number of TCP pre-connects, and the number of requests made to that host. This data is what allows the Chrome Predictor to make its optimization decisions.
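The per-page bookkeeping described above can be sketched as follows. The `ReferrerHistory` class, its methods, and the `min_ratio` cut-off are hypothetical and exist only to illustrate the idea of learning which subresource hosts a page usually needs; this is not Chrome's actual storage format or policy.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical history of which subresource hosts each page has used.
class ReferrerHistory {
 public:
  // Records one load of `page_host` and the subresource hosts it contacted.
  // Each host is counted at most once per page load.
  void RecordPageLoad(const std::string& page_host,
                      const std::vector<std::string>& subresource_hosts) {
    PageEntry& entry = pages_[page_host];
    ++entry.loads;
    std::set<std::string> unique(subresource_hosts.begin(),
                                 subresource_hosts.end());
    for (const std::string& host : unique) ++entry.hosts[host];
  }

  // Returns the hosts that appeared in at least `min_ratio` of past loads
  // of `page_host`, in alphabetical order.
  std::vector<std::string> HostsToPreResolve(const std::string& page_host,
                                             double min_ratio) const {
    assert(min_ratio >= 0.0 && min_ratio <= 1.0);
    std::vector<std::string> result;
    auto it = pages_.find(page_host);
    if (it == pages_.end()) return result;
    for (const auto& kv : it->second.hosts) {
      if (kv.second >= min_ratio * it->second.loads) result.push_back(kv.first);
    }
    return result;
  }

 private:
  struct PageEntry {
    int loads = 0;
    std::map<std::string, int> hosts;  // host -> number of loads that used it
  };
  std::map<std::string, PageEntry> pages_;
};
```

When the user navigates to a known page again, the hosts returned by `HostsToPreResolve` are candidates for DNS pre-resolution or TCP pre-connect.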
In addition to these internal signals, the page author can embed the following markup in the page to ask the browser to pre-resolve a hostname:
<link rel="dns-prefetch" href="//host_name_to_prefetch.com">
A typical case where this is useful is redirects. Chrome has no way of inferring such a pattern on its own, so the hint lets the browser resolve the hostname ahead of time.
The implementation varies by Chrome version. In general, there are two main implementations of DNS resolution in Chrome:
1. Historically, Chrome called the platform-independent getaddrinfo() system function and delegated the actual lookup to the operating system.
2. This approach is being superseded by an asynchronous DNS resolver implemented by Chrome itself.
Relying on the operating system keeps the code small and simple, but getaddrinfo() is a blocking system call, so many lookups cannot be run efficiently in parallel. Empirical data also shows that too many parallel requests can overload some routers. Chrome built a fairly elaborate mechanism to cope with this. For pre-resolution it uses a worker pool: a worker thread simply issues the getaddrinfo() call and blocks until the response arrives. Because the operating system keeps a DNS cache, the call returns immediately for hostnames that have already been resolved. The approach is simple and effective.
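To make the blocking behavior concrete, here is a minimal sketch of the kind of call a worker thread would issue. It assumes a POSIX system; `ResolveBlocking` is a hypothetical helper for this example, and the thread pool around it is deliberately left out.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cassert>
#include <string>
#include <vector>

// Resolves `hostname` with the blocking POSIX getaddrinfo() call and returns
// the addresses as text. The calling thread waits until the lookup finishes,
// which is why such calls are pushed onto worker threads.
std::vector<std::string> ResolveBlocking(const std::string& hostname) {
  assert(!hostname.empty());
  addrinfo hints{};
  hints.ai_family = AF_UNSPEC;      // accept both IPv4 and IPv6
  hints.ai_socktype = SOCK_STREAM;  // one result per address, for TCP

  addrinfo* results = nullptr;
  std::vector<std::string> addresses;
  if (getaddrinfo(hostname.c_str(), nullptr, &hints, &results) != 0) {
    return addresses;  // lookup failed; the caller sees an empty list
  }
  for (const addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
    char text[INET6_ADDRSTRLEN] = {};
    const void* raw = nullptr;
    if (ai->ai_family == AF_INET) {
      raw = &reinterpret_cast<const sockaddr_in*>(ai->ai_addr)->sin_addr;
    } else if (ai->ai_family == AF_INET6) {
      raw = &reinterpret_cast<const sockaddr_in6*>(ai->ai_addr)->sin6_addr;
    }
    if (raw != nullptr &&
        inet_ntop(ai->ai_family, raw, text, sizeof(text)) != nullptr) {
      addresses.push_back(text);
    }
  }
  freeaddrinfo(results);
  return addresses;
}
```

Note what the return value does not contain: the record's TTL and whether the answer came from the operating system's cache are both invisible to the caller.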
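But it is not enough! getaddrinfo() hides too much useful information, such as the time-to-live (TTL) of each record and the state of the DNS cache. So Chrome decided to implement its own cross-platform asynchronous DNS resolver.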
The new resolver enables the following optimizations:
- Better control over retransmission timers, and the ability to run multiple queries in parallel.
- Visibility into record TTLs.
- Better handling of dual-stack (IPv4 and IPv6) setups.
- Failover to different servers based on RTT and other signals.
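As an illustration of what TTL visibility makes possible, the sketch below caches resolved addresses and honors each record's TTL. The `TtlHostCache` class is hypothetical and is not Chrome's host cache; time is passed in explicitly, in seconds, to keep the example deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical host cache that expires entries according to their TTL.
class TtlHostCache {
 public:
  // Stores `address` for `hostname`, valid for `ttl_s` seconds from `now_s`.
  void Store(const std::string& hostname, const std::string& address,
             std::int64_t now_s, std::int64_t ttl_s) {
    assert(ttl_s >= 0);
    entries_[hostname] = Entry{address, now_s + ttl_s};
  }

  // Returns the cached address, or an empty string if the entry is missing
  // or has expired. Expired entries are removed on lookup.
  std::string Lookup(const std::string& hostname, std::int64_t now_s) {
    auto it = entries_.find(hostname);
    if (it == entries_.end()) return std::string();
    if (now_s >= it->second.expires_at_s) {
      entries_.erase(it);
      return std::string();
    }
    return it->second.address;
  }

 private:
  struct Entry {
    std::string address;
    std::int64_t expires_at_s;
  };
  std::unordered_map<std::string, Entry> entries_;
};
```

With the expiry time known, a resolver could also refresh a frequently used record shortly before it expires, which a getaddrinfo()-based approach cannot do because it never sees the TTL.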
Chrome continues to be optimized. DNS metrics can be inspected at chrome://histograms/dns:
The histogram shows the distribution of DNS pre-resolution latencies: for example, nearly 50% of the queries (rightmost column) completed within 20 ms. The data is based on recent browsing activity (9,869 samples). Users can choose whether to report usage data, which is then analyzed anonymously by the engineering team to see how the feature performs and how it should be tuned further.
When reprinting, please credit the source: http://blog.csdn.net/horkychen
Original article: http://www.igvita.com/posa/high-performance-networking-in-google-chrome/
From: http://blog.csdn.net/horkychen/article/details/10421523
High-performance networks in Google Chrome (ii)