Chrome source code analysis: [Preface] and [1]

Source: Internet
Author: User
[Preface]
Open source is a good thing: it turns a lot of what would otherwise be industrial waste into teaching material, toys and reference code, and gives code a slightly more artistic air and a chance at beauty. It gives everyone the opportunity to stand on the shoulders of giants, whether you are from New York or from Tieling, China. And if you cannot climb onto the shoulders, you can at least hug a thigh... Right now I just want to hug a thigh. The thick leg in question belongs to Chrome, Google's ambitious browser (the open-source project is actually called Chromium; I thought the name Chrome was obscure enough, and did not expect the project name to take obscurity one level further...). Anyone born with a golden spoon in the mouth is bound to be both admired and picked apart, and Chrome is no exception. Its pros and cons have been discussed so much that the topic has basically been chewed into sugarcane dregs, and nobody wants to take another bite. As the saying goes, the layman watches the spectacle. Most laymen judge Chrome by how it actually feels to use, which is without doubt the best way. But there are also some self-proclaimed experts who like to pronounce on how it is built: as soon as they see that Chrome uses multiple processes, they declare it rubbish and certainly inefficient. Come on, we are all engineers. If you know the drawbacks of multiple processes, so does Google. They are not politicians who do things purely as a gimmick; they have a reputation to protect too. Would they open the source code just to be laughed at? The merits of a technology cannot be judged in the abstract: the same technology behaves differently under different implementations and in different environments. Since Chrome uses a lot of techniques that do not look pretty at first sight, shouldn't we find out why and how it uses them before passing judgment? (You are on your own here; no hospitality provided...)...
As the saying goes, whether it is a mule or a horse, lead it out and walk it around. Google has led its donkey, Chrome, out in front of the whole world, so anyone is free to walk it. We like to say that we do science, which is what sets us apart from so-called artists. If you are a judge on a Super Girl talent show, you can say whatever nonsense you like about a contestant, and people can only conclude that you have a unique artistic taste. In science that does not work: say something wrong and it is either ignorance or academic fraud, and the consequences are grim. So now that all the code is available, the only option is to read it before talking again... I have already begun to pat Chrome's rump. To be exact, it is a sturdy, fat donkey: the whole project is close to 2 GB. Inspecting every pore of such a huge beast from head to foot would probably make you cough blood in a single breath; we are not doing a code review, so there is no need to be that desperate. Every good open-source project is like a beauty. There is no perfect beauty in the world, and naturally no perfect open-source project either. Every beauty has one or two features that you find most exciting or most mysterious, and you focus most of your attention on savoring those. The same goes for reading open source. For me, Chrome is attractive for the following reasons (in no particular order...):

1. How it uses multiple processes (in practice combined with multiple threads) for concurrency, and how it solves the problems that come with multiple processes, such as inter-process communication and process overhead;
2. As a latecomer, how extensible it is, how it balances compatibility with existing plug-ins, and what kind of plug-in model it offers;
3. What its overall framework looks like, and whether it contains any really impressive architectural ideas;
4. How it implements a cross-platform UI control system;
5. Why the legendary V8 is so fast.

Chrome is a cross-platform browser, but its Linux and Mac versions are still under development, so I have focused entirely on the Windows version: all the code analysis here is based on Windows. I should also say that I am a newcomer to browsers, clueless about the Windows API, and a Martian when it comes to concurrency. I have thrown myself into this purely out of curiosity, so it is inevitable that I will be out of my depth at times. If you spot mistakes, please correct them. If you cannot stand reading it, feel free to go home... Writing really is physical labor, so I will keep the chatter down from here on... For how to download the Chrome source and set up the environment, see here (Windows version). I only want to stress that you must configure the environment strictly according to the instructions, in particular the VS2005 patches and the Windows SDK installation; otherwise it will definitely not compile... Finally, something that is not idle chatter: remember the figure below, which is Chrome in miniature. If you have time, be sure to read the material here; the most important piece is this article...
Figure 1 Chrome thread and process model
[1] Chrome's multithreading model
0. Chrome's concurrency model

If you studied the figure above carefully, you should already have a basic picture of Chrome's thread and process framework. Chrome has one main process, called the Browser process, which is the boss and manages most of Chrome's day-to-day affairs. Then there are many Renderer processes, each staking out its own turf and managing the display and communication of a group of sites (in its publicity Chrome has always claimed that one tab corresponds to one process, which is actually quite inaccurate...). They talk only to the boss, and the boss is responsible for balancing the interests of all parties. The channel for talking to the boss is called IPC (inter-process communication), a set of inter-process communication mechanisms that Google built itself. Its implementation will be covered in a later installment...

Chrome's process model
In its publicity, Google has always described Chrome as one tab, one process. In fact this is just convenient for marketing: the gap between that slogan and the code is about the same as the gap between an advertisement and the actual product. The process models Chrome supports are far richer than the publicity suggests. For details, see here. In short, Chrome supports the following process models:

  1. Process-per-site-instance: you open a site, and the chain of sites you then open from it all belong to one process. This is Chrome's default mode.
  2. Process-per-site: sites under the same domain are placed in one process. For example, www.google.com and www.google.com/bookmarks belong to the same domain (Google has its own mechanism for deciding this), so they share a process whether or not one was opened from the other. Enable it with the command-line flag --process-per-site.
  3. Process-per-tab: this one is simple. One tab is one process, regardless of whether the sites in the tabs are related. Enable it with --process-per-tab.
  4. Single process: the familiar traditional browser model, with no multi-process at all, only multiple threads. Enable it with --single-process.

There is an official write-up of the pros and cons of the various models. If nothing else, it shows that Google did not adopt the multi-process strategy out of stupidity, but as the result of experiments...
You can press Shift+Esc to observe the processes under each model. At least, I tried and failed to observe any difference (every mode looked the same as the default...). The cause remains to be tracked down...

Neither the Browser process nor a Renderer process is a commander without troops: each has a series of threads handling various duties for it. A Renderer process usually has two threads. One is the main thread, which handles contact with the boss and works somewhat behind the scenes; the other is the render thread, which handles page rendering and interaction and is plainly the public face of the gang. The Browser process, being the boss, naturally has more underlings. Besides the brain-like main thread and the IO thread that handles communication with the Renderer gangs, it also has the file thread for file management, the DB thread for database management, and so on (for a more detailed list, see here). Each has its own job, and together they work for the boss. Their relationship is unlike that between the Renderer processes: threads within the same process often need to cooperate closely, and this management of concurrency between threads is one of the most brilliant parts of Chrome...
An aside on concurrency
Single-process, single-threaded programming is the most comfortable kind: what you see is what you get, and you only have to think in one dimension. But a programmer's world is never that pleasant. On many occasions we need multiple threads, multiple processes or multiple machines to roll up their sleeves and finish a task together, which is collectively called concurrency (not an official definition...). In my view there are two main reasons for using concurrency:

    1. For a better user experience. Some work is too slow, such as database reads and writes, remote communication, and complex computation. Doing it on the same thread or in the same process tends to hurt the user experience, so you move it to another thread or process in the background. This works thanks either to time-slicing on a single CPU or to several CPUs working together. On a single CPU, time-slicing two tasks takes longer in total than running them one after the other, but because they no longer block each other, it feels more natural to the user.
    2. To finish a job faster. The well-known map/reduce does exactly this: it splits a large job into small tasks, assigns them to several processes, and gets the final result sooner. This is only achievable with multiple CPUs; on a single CPU (single machine, single CPU...) it cannot be done.

In the second scenario, we naturally pay attention to how data is partitioned so that multiple CPUs are put to good use. In the first scenario, we are used to the single-CPU way of thinking and usually pay little attention to the relationship between data and behavior. As a result, on multi-CPU machines performance fails to improve, or even gets worse...

1. Chrome's thread model
Let's think carefully about how we use threads most of the time. In my meagre multithreading experience, it usually goes like this: start a thread and pass in a specific entry function, then check whether that function has side effects. If it does, and multithreaded data access is involved, inspect the code carefully and add locks wherever it looks suspicious... Chrome's thread model takes a different path: it tries to avoid locks altogether. More precisely, the Chrome thread model confines locking to a very small scope (only when a task is put into the message queue...), and the upper layers do not need to care about locks at all (provided, of course, that you follow its programming model: wrap the function in a task and send it to the appropriate thread for execution...), which greatly simplifies development... From the implementation point of view, though, there is nothing mysterious about Chrome's thread model (as with pretty girls, a little left to the imagination goes further than full disclosure...): at its core is the message loop. The entry function of every Chrome thread is much the same: it starts a message loop (see the MessagePump class), then waits for tasks and executes them. The only difference is that the type of message loop varies with the kind of work the thread handles. For example, threads that handle inter-process communication (note that Chrome calls these IO threads, presumably a slip made at design time...) use the MessagePumpForIO class, threads that handle the UI use the MessagePumpForUI class, and ordinary threads use the MessagePumpDefault class (Windows only, Windows only, Windows only...). Message loops differ in two main ways: which kinds of messages and tasks the loop needs to process, and how the loop runs (for example, whether it spins forever or blocks on a semaphore...).
Figure 2 shows the full version of Chrome's message loop: it processes Windows messages, processes the various tasks (what a task is will be revealed shortly, so stay tuned...), processes the watchers of the various semaphores, and then blocks on a semaphore waiting to be woken up...
Figure 2 Chrome message loop
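
The core idea of the loop described above (one thread draining a queue, with the lock held only while the queue itself is touched) can be sketched in standard C++. This is a minimal illustration, not Chromium's code: SimpleMessageLoop, PostQuit and RunDemoLoop are hypothetical names, and the sketch leaves out system messages, watchers and delayed tasks entirely.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a message loop; not Chromium's real class.
class SimpleMessageLoop {
 public:
  using Task = std::function<void()>;

  // May be called from any thread. The lock covers only the queue operation.
  void PostTask(Task task) {
    assert(task);  // An empty task would throw when invoked.
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(task));
    }
    wakeup_.notify_one();
  }

  // Asks the loop to stop after the tasks queued before this call have run.
  void PostQuit() {
    PostTask([this] { quit_ = true; });
  }

  // Runs on the owning thread: block until work arrives, run it, repeat.
  void Run() {
    while (!quit_) {
      Task task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        wakeup_.wait(lock, [this] { return !queue_.empty(); });
        task = std::move(queue_.front());
        queue_.pop_front();
      }
      task();  // Executed outside the lock.
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable wakeup_;
  std::deque<Task> queue_;
  bool quit_ = false;  // Only read and written on the loop's own thread.
};

// Posts two tasks and a quit request, runs the loop, and returns the order
// in which the tasks executed.
std::vector<int> RunDemoLoop() {
  SimpleMessageLoop loop;
  std::vector<int> order;
  loop.PostTask([&order] { order.push_back(1); });
  loop.PostTask([&order] { order.push_back(2); });
  loop.PostQuit();
  loop.Run();
  return order;
}
```

Calling RunDemoLoop() from main runs the two tasks in posting order and then returns. The task bodies never take a lock themselves, which is the property the text attributes to Chrome's model.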

Of course, not every message loop class needs to run such a big lap. Some threads do not involve that many things or that much logic, and making them go through all of it would be an unforgivable waste of effort and time. So the different MessagePump classes are implemented differently, as the following table shows:

Property                       MessagePumpDefault   MessagePumpForIO   MessagePumpForUI
Processes system messages?     No                   Yes                Yes
Processes tasks?               Yes                  Yes                Yes
Processes watchers?            No                   Yes                No
Blocks on a semaphore?         No                   Yes                Yes
2. Tasks in Chrome
From the table above it is easy to see that there is one thing every kind of message loop must process: tasks (forget about system messages and watchers for now; we will come back to them later...). If we set everything else aside and look only at tasks, we can put it this way: Chrome's threads do not differ in implementation, only in responsibility; threads with different responsibilities process different tasks. Finally, before the tomatoes start flying, let me say what a task is... Simply put, a task is a class with an abstract void Run() method (see the Task class...). A concrete task derives from Task and implements Run. Each MessagePump holds an object of the MessagePump::Delegate class (for an implementation of MessagePump::Delegate, see the MessageLoop class...), and that object maintains several task queues. When you want some logic executed on a particular thread, you derive a Task, wrap your logic in its Run method, instantiate an object, and call PostTask on the target thread to put the task object into its queue to await execution. I know plenty of people are already reaching for bricks, because this approach is all too common: it is nothing more than simple dependency inversion, and you see it everywhere in thread pools, undo/redo and other modules... But what I want to say is that although everyone eats dumplings at New Year, the craftsmanship still varies, and they cannot all be lumped together. In Chrome the thread model is uniform and it is the only one, which makes it a kind of standard that must serve the dozens or hundreds of kinds of tasks executed on each thread. It therefore has to perform well and be flexible and easy to use; that is the difficulty of designing a standard. To meet these requirements, Chrome put a great deal of effort into the underlying library:
    1. It provides a large set of template wrappers (see task.h) that free a task from restrictions on inheritance structure, function name and function parameters (in other words, template-based functors; for more, go straight to the originator, Modern C++ Design and its Loki library...);
    2. It derives subclasses such as CancelableTask, ReleaseTask and DeleteTask to provide better default implementations;
    3. Within a message loop, tasks are logically divided into immediate tasks, delayed tasks and idle-time tasks to suit different scenarios;
    4. Task derives from tracked_objects::Tracked. Tracked implements logging, statistics and similar functions in a multithreaded environment, so tasks are inherently debuggable and measurable;
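
To make the first point concrete, here is a minimal sketch of a Task interface plus a template helper that turns any method call into a task. It is an illustration under stated assumptions rather than the contents of task.h: ClosureTask, MakeMethodTask and History are hypothetical names, and the adapter is built on std::function and std::bind instead of Chrome's own templates.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// The abstract interface described in the text: a class with a Run() method.
class Task {
 public:
  virtual ~Task() = default;
  virtual void Run() = 0;
};

// Hypothetical adapter: stores any zero-argument callable as a Task.
class ClosureTask : public Task {
 public:
  explicit ClosureTask(std::function<void()> closure)
      : closure_(std::move(closure)) {
    assert(closure_);  // An empty closure would throw in Run().
  }
  void Run() override { closure_(); }

 private:
  std::function<void()> closure_;
};

// Hypothetical helper: binds an object, one of its methods and the call
// arguments, so the receiver needs no particular base class or method name.
// The object is not owned; the caller must keep it alive until Run().
template <typename T, typename Method, typename... Args>
std::unique_ptr<Task> MakeMethodTask(T* object, Method method, Args... args) {
  return std::unique_ptr<Task>(
      new ClosureTask(std::bind(method, object, args...)));
}

// Example receiver with an ordinary method; it knows nothing about Task.
struct History {
  std::string log;
  void AddVisit(const std::string& url, int count) {
    log += url + ":" + std::to_string(count) + ";";
  }
};
```

With these pieces, MakeMethodTask(&history, &History::AddVisit, std::string("example.test"), 2) yields a Task whose Run() performs the call later, on whichever thread ends up executing it.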
Only with all of this in place do we have a complete task model. As you can see, this particular dumpling takes a lot of work to make...
3. Chrome's multithreading model
To do a good job, one must first sharpen one's tools. Chrome spent so much effort sharpening its underlying framework precisely so that slaying the multithreading monster would be easier. In Chrome's multithreading model, locking happens only when a task is put into a thread's task queue; no other operation on any data needs a lock. Of course, there is no free lunch: to post tasks to the right place you need to know which thread each data object belongs to. But compared with complicated locking logic, that is child's play many times over...
Figure 3 Execution model of a task
If you are familiar with design patterns, you will recognize this as the Command pattern: it separates the environment in which a behavior is created from the environment in which it is executed, so a behavior is created on one thread and executed on another. The first advantage of the Command pattern is that it decouples the execution of an operation from its construction, which is what sidesteps the locking problem and unifies the multithreaded and single-threaded programming models. Its second advantage is that commands are easy to combine and extend; in Chrome, this effectively unifies the logic of synchronous and asynchronous processing...
The Command pattern
The Command pattern looks like a rather cool pattern. In traditional object-oriented programming we usually encapsulate data; in the Command pattern we want to encapsulate behavior. In functional programming this is perfectly ordinary: you wrap up a function and pass it around as a parameter. In object-oriented programming, however, we need inheritance, templates, function pointers and similar techniques to achieve it...
With the Command pattern, we expect a behavior to be executed in an environment different from the one it was born in. In short, it is a "give birth, but let someone else do the raising" kind of behavior. When we implement undo/redo, commands created in any environment are put into a queue for unified scheduling. Chrome does the same: a task is created in the environment of one thread but executed on another. This way of living is useful in many situations...

In a typical multithreading model, we need to distinguish clearly between synchronous and asynchronous. In synchronous mode everything looks no different from a single thread, but the advantage of multithreading is lost (it degenerates into multithreaded serial execution...). Asynchronous mode is much harder to write: you have to register callbacks and manage object lifetimes carefully, and the resulting code is unpleasant. Under Chrome's multithreading model, the difference between the synchronous and asynchronous programming models disappears. Consider this scenario: thread A needs thread B to do some work and then wants to continue on thread A. In Chrome you can do it like this: create a task and put it in thread B's queue; at the end of that task's Run method, create another task and put it back into thread A's queue, to be executed by A. Synchronous or asynchronous, everything is unified as passing tasks around, and it would be hard to make it any simpler...
Figure 4 Chrome's asynchronous execution solution
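
The A to B to A round trip described above can be sketched with two worker threads that each drain their own task queue. This is a simplified model and not Chromium's API: TaskThread and RoundTrip are hypothetical names, and the std::promise exists only so the calling thread can wait for the result of the demonstration.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical worker: one thread draining one locked task queue.
class TaskThread {
 public:
  using Task = std::function<void()>;

  TaskThread() : thread_([this] { Loop(); }) {}

  // Runs the tasks posted so far, then stops and joins the thread.
  ~TaskThread() {
    PostTask([this] { done_ = true; });
    thread_.join();
  }

  // Callable from any thread; the lock guards only the queue.
  void PostTask(Task task) {
    assert(task);
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(task));
    }
    wakeup_.notify_one();
  }

 private:
  void Loop() {
    while (!done_) {
      Task task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        wakeup_.wait(lock, [this] { return !queue_.empty(); });
        task = std::move(queue_.front());
        queue_.pop_front();
      }
      task();
    }
  }

  std::mutex mutex_;
  std::condition_variable wakeup_;
  std::deque<Task> queue_;
  bool done_ = false;   // Only touched on the worker thread.
  std::thread thread_;  // Declared last so it starts after the other members.
};

// Starts work on A, hands one step to B, and continues on A with B's answer.
std::string RoundTrip() {
  // Declared before the threads so they outlive every posted task.
  std::promise<std::string> finished;
  std::future<std::string> result = finished.get_future();
  TaskThread a;
  TaskThread b;

  a.PostTask([&] {
    std::string request = "query";  // Step 1, on thread A.
    b.PostTask([&, request] {
      std::string answer = request + "->handled-on-B";  // Step 2, on thread B.
      // The last act of B's task is to post the continuation back to A.
      a.PostTask([&, answer] {
        finished.set_value(answer + "->finished-on-A");  // Step 3, on A.
      });
    });
  });
  return result.get();
}
```

No step blocks waiting for another thread; each one simply ends by posting the next task, which is why the same code shape serves both the "synchronous" and the "asynchronous" case.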
4. Pros and cons of Chrome's multithreading model
We have been talking all along about how Chrome avoids locks. So what is so bad about locks? What terrible crime have they committed that people are so eager to get rid of them? In Chapter 24, "Beautiful Concurrency", of "Beautiful Code", Simon Peyton Jones, one of the designers of Haskell, summarized the difficulties of using locks. I reproduce them below:
    1. Too few locks: two threads modify the same variable at the same time.
    2. Too many locks: at best concurrency suffers, at worst you get a deadlock.
    3. The wrong lock: the connection between a lock and the data it protects exists only in the programmer's head, so this happens all too easily.
    4. The wrong locking order: maintaining lock order is a difficult and error-prone problem.
    5. Error recovery.
    6. Forgotten wakeups and erroneous retries.
    7. The most fundamental flaw: locks and condition variables do not support modular programming. Take a bank transfer: account A is debited 100 yuan and account B is credited 100 yuan. Even if each of these two actions is independently protected by a lock and correct on its own, you cannot simply chain the two operations together to obtain a correct transfer. You have to expose their locks and redesign them. Two perfectly good functions that cannot be used together: this is the greatest sorrow of locks.
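
The transfer example in the last item can be made concrete with a small sketch. The names here (Account, NaiveTransfer, ObservedTotalDuringTransfer) are hypothetical, and the between_steps callback is an assumption introduced for illustration: it stands in for whatever another thread might do in the gap between the two individually locked calls, which keeps the demonstration deterministic.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical account whose individual operations are each correctly locked.
class Account {
 public:
  explicit Account(int balance) : balance_(balance) {}

  void Withdraw(int amount) {
    assert(amount >= 0);
    std::lock_guard<std::mutex> lock(mutex_);
    balance_ -= amount;
  }

  void Deposit(int amount) {
    assert(amount >= 0);
    std::lock_guard<std::mutex> lock(mutex_);
    balance_ += amount;
  }

  int balance() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return balance_;
  }

 private:
  mutable std::mutex mutex_;
  int balance_;
};

// Naive composition of two safe operations. between_steps simulates another
// thread running in the gap; neither account's lock is held at that point.
void NaiveTransfer(Account& from, Account& to, int amount,
                   const std::function<void()>& between_steps) {
  from.Withdraw(amount);
  between_steps();
  to.Deposit(amount);
}

// Returns the combined balance an observer would see in mid-transfer.
int ObservedTotalDuringTransfer() {
  Account a(500);
  Account b(500);
  int observed = 0;
  NaiveTransfer(a, b, 100, [&] { observed = a.balance() + b.balance(); });
  return observed;
}
```

The observer sees a combined balance of 900 rather than 1000: each method is correct in isolation, yet the composed transfer exposes an intermediate state. Closing that gap with locks means holding both accounts' locks at once, that is, exposing them to the caller, which is exactly the loss of modularity the list item describes.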
Having listed these shortcomings, we can appreciate the advantage of Chrome's multithreading model. It addresses the most fundamental flaw of locks by supporting modular programming: you only need to maintain the mapping between objects and threads, which is far simpler than the mess that locks create. For programmers, the burden drops at once from the weight of Mount Tai to the weight of a feather... The main difficulty of Chrome's multithreading model lies in designing the relationship between threads and data: you need to divide the responsibilities of the threads well. If one thread owns so much data that its tasks account for the greater part of the total workload, multithreading degenerates into single threading, and the lock on the task queue becomes a major bottleneck...
The designer's responsibility
Is the design of an underlying framework successful? Is the designer competent? I have always thought there is a very simple yardstick. You do not need to count how many impressive techniques the designer used; you only need to ask whether the design makes life hard for other developers. A great design concentrates all the difficulty in the lower layers and lets other developers work without having to think. A lousy design, after a great deal of fuss, merely hands other developers a list of 250 precautions and then boasts: just follow this manual and there won't be any problems...

Fundamentally, Chrome's thread model addresses concurrency for the sake of user experience rather than for working faster together (see my earlier aside on concurrency). It focuses not on splitting data and execution steps, as map/reduce does, but on the correspondence between threads and data, which fits the environment a browser works in. Design always depends on the environment. After all, a client does not face the massive concurrent workloads of a server; it just needs to improve the user experience as much as possible. From that angle, Chrome's multithreading model looks, at the very least, rather pretty...
