WebKit CSS engine Analysis

Source: Internet
Author: User
When reprinting, please cite the source: http://blog.csdn.net/cnnzp/article/details/6590087

The browser's CSS module is responsible for parsing CSS and computing the style of each element. Although the module is small, it does a great deal of computation, and a poor design often makes it a browser performance bottleneck. Two characteristics shape its implementation: CSS objects are numerous (small-grained but many), and computation is frequent (a style is computed for every element). These characteristics determine the design and algorithms WebKit adopts in its CSS engine. Computing styles efficiently is both the key task and the main difficulty for a browser kernel.

    Front-end engineers may be more concerned with:
  1. Writing CSS that browsers can execute efficiently
    Browser kernel engineers may be more concerned with:
  1. Organizing CSS internal data
  2. Calculating styles
  3. Questions
  4. Summary
Efficient CSS Script Execution

Here, "efficient CSS" refers only to how fast WebKit executes it; CSS design issues are not covered.

  1. If WebKit can tell, without computing, that two or more elements will have equal computedStyle values, it computes the style only once and the remaining elements share that computedStyle.

    For example:

    <table><tr class='row'><td class='cell' width=300 nowrap>Cell One</td></tr><tr class='row'><td class='cell' width=300 nowrap>Cell Two</td></tr></table>

    The two tr elements share one computedStyle, and so do the two td elements. The kernel computes the computedStyle only for the first tr and the first td.

    How to make elements eligible to share a computedStyle:

    1. An element cannot share if it has an id attribute and the CSS contains a style rule for that id, even if the rule does not match the element. For example:

      div#id1 {color:red}
      <p>paragraph1</p><p id="id1">paragraph2</p>

      The two p elements have the same computedStyle, but it is not shared.

    2. The tag name and class attribute must be the same.
    3. The mappedAttribute values must be equal.
    4. The style attribute must not be used. Even if the style attributes are equal, the elements do not share. For example:
      <p style="color:red">paragraph1</p><p style="color:red">paragraph2</p>

      They do not share a computedStyle.

    5. Sibling-dependent selectors such as :first-child, :last-child, and the + selector must not be used.
  2. ID selectors are very efficient. Because an ID is unique, there is no need to specify both the ID and the tag name. For example:
    Bad:  p#id1 {color:red;}
    Good: #id1 {color:red;}
  3. The same advice applies to class selectors. In the kernel implementation, matching an ID selector and matching a class selector differ very little. If the same class needs different CSS on different tags, you can write:
    Bad:  p.class1 {color:red;} div.class1 {color:black;}
    Good: .p-class1 {color:red;} .div-class1 {color:black;}

    Of course, this increases the number of class names on the page, so weigh the trade-off yourself.

    To select a node that is nested relatively deep, you can write:

    Bad:  div > div > div > p {color:red;}
    Good: .p-class {color:red;}

    Child selectors are slow to match.

  4. Avoid attribute selectors such as p[att1="val1"]; matching them is very slow. In particular, do not write p[id="id1"], which degrades an ID selector into an attribute selector.
    Bad:  p[id="id1"] {color:red;} p[class="class1"] {color:red;}
    Good: #id1 {color:red;} .class1 {color:red;}
  5. Rely on inheritance. If a property is inherited, there is no need to declare it again.
  6. Other selectors are hard for the kernel to optimize, so avoid them where possible.
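The sharing conditions listed under item 1 can be read as a single predicate. The following is a minimal Python sketch of that reading, not WebKit code: the `Element` class, the `can_share_style` function, and its parameters are hypothetical names introduced for illustration, and the real engine checks more conditions than the five listed here.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical, simplified element; not a WebKit class."""
    tag: str
    classes: tuple = ()
    id: str = ""
    mapped_attrs: dict = field(default_factory=dict)  # e.g. {"width": "300"}
    inline_style: str = ""


def can_share_style(a, b, ids_in_rules, uses_sibling_rules):
    """Return True if b may reuse a's computed style under the listed conditions."""
    if uses_sibling_rules:                    # condition 5: sibling-dependent selectors
        return False
    for el in (a, b):
        if el.id and el.id in ids_in_rules:   # condition 1: id targeted by some rule
            return False
        if el.inline_style:                   # condition 4: any style attribute
            return False
    return (a.tag == b.tag and a.classes == b.classes   # condition 2
            and a.mapped_attrs == b.mapped_attrs)       # condition 3
```

With the table example above, two `td` elements with class `cell` and `width=300` pass the check, while `<p>` and `<p id="id1">` fail as soon as any rule mentions `#id1`.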
WebKit CSS module implementation

Here I mostly want to share problems I ran into during CSS development. Working through them should give a better feel for WebKit's design.

Glossary

Some WebKit kernel terms are explained here. Without them, reading the source code is considerably harder.

  • MappedAttribute: an HTML attribute that can affect the CSS computedStyle.
    For example, in <p align="middle">paragraph</p> the attribute align="middle" is a mappedAttribute. Most of us know that each element has an inline style declaration. There is also an implicit declaration, the mapped style declaration, whose priority is higher than ordinary CSS and lower than the inline style.
  • RenderStyle: WebKit's representation of the familiar computedStyle.
  • Bloom filter: a probabilistic set-membership structure that may report false positives but never false negatives. Search online if you have not met it before.
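For readers who have not met a Bloom filter, here is a minimal generic sketch in Python. It is not WebKit's implementation: the class name, the bit-array size, and the use of salted SHA-256 hashes are all assumptions chosen for clarity.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def may_contain(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))
```

The useful property for selector matching is the "definitely absent" answer: it lets a caller skip an expensive check without ever skipping one that would have succeeded.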
Organize CSS internal data

I will not draw inheritance diagrams of the CSS objects here; for the object inheritance diagram, refer to this article. I also assume the reader is familiar with the CSS standards and concepts.

After the CSS is parsed, a CSSStyleSheetList is generated and saved on the document object. To compute styles faster, the contents of this CSSStyleSheetList must be reorganized. (Think about it: could you compute styles directly from the CSSStyleSheetList?)

Computing a style means finding, in the CSSStyleSheetList, all property-value pairs that match the element. Matching is verified through the CSSSelector and must follow the cascade rules.

The array model: a simple but inefficient organization.

All properties in all declarations are organized into one large array. Each item records the selector, the property and its value, and the weight of the property (from the cascade rules). For example:

<style> p > a {color: red; background-color: black;} a {color: yellow} div {margin: 1px;} </style>

After reorganization, the array data is as follows (the weights only indicate relative order, not actual values):

  Index  Selector  Property                  Weight
  1      a         color: yellow             1
  2      p > a     color: red                2
  3      p > a     background-color: black   2
  4      div       margin: 1px               3

Each property occupies one array item, and items whose selectors target the same tag name are adjacent. For example, selector a and selector p > a are next to each other. All properties are stored in order of the selector's tag name. With this organization, think about how you would compute a style.

The hash model: a more efficient organization.

WebKit uses the CSSRuleSet object to organize the data. A CSSRuleSet contains four hash tables: idRules, classRules, tagNameRules, and universalRules.

These hash tables are defined as: HashMap<String*, CSSRuleDataList*>

CSSRuleDataList is a list in which each item is a CSSRuleData.

CSSRuleData stores a CSS style rule together with the specificity of that rule's selector (which can be thought of as a weight). The CSSRuleData constructor also computes the Bloom filter value of the selector.

The following is a rough outline; it does not fully define each class, but it helps show how these classes relate:

  • The default style sheet, user style sheet, and author style sheet are stored in separate CSSRuleSets, whereas the array model merges all style sheets into one array. Do not underestimate this step; I consider it a highlight, and it matters for the matching algorithm.
  • Each CSSRuleSet organizes all style rules into idRules, classRules, tagNameRules, and universalRules. For example:
    #id1 {color: red;}    --> stored in idRules.
    .class1 {color: red;} --> stored in classRules.
    .class1 {color: red;} --> same as above.
    p {color: red;}       --> stored in tagNameRules.
    * {color: red;}       --> stored in universalRules.
  • The kernel organizes this data after all style sheet requests have completed and the CSS parser has finished.
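The bucketing described in the bullets above can be sketched as follows. This is an illustrative Python model, not WebKit's C++: the `RuleSet` class and `add_rule` method are hypothetical, the selector parsing is a crude regular-expression approximation, and the choice of bucket by the rightmost compound selector (id first, then class, then tag) is an assumption about how rules are keyed.

```python
import re
from collections import defaultdict


class RuleSet:
    """Hypothetical sketch of the four-bucket layout; not WebKit's class."""

    def __init__(self):
        self.id_rules = defaultdict(list)
        self.class_rules = defaultdict(list)
        self.tag_rules = defaultdict(list)
        self.universal_rules = []

    def add_rule(self, selector, declarations):
        # Bucket by the rightmost compound selector, e.g. "a" in "p > a".
        rightmost = re.split(r"[\s>+~]+", selector.strip())[-1]
        rule = (selector, declarations)
        id_match = re.search(r"#([\w-]+)", rightmost)
        class_match = re.search(r"\.([\w-]+)", rightmost)
        tag_match = re.match(r"[A-Za-z][\w-]*", rightmost)
        if id_match:
            self.id_rules[id_match.group(1)].append(rule)
        elif class_match:
            self.class_rules[class_match.group(1)].append(rule)
        elif tag_match:
            self.tag_rules[tag_match.group(0).lower()].append(rule)
        else:
            self.universal_rules.append(rule)
```

The point of the layout is that an element only has to look at the few lists keyed by its own id, classes, and tag name, plus the universal list, instead of scanning every rule.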

Calculation Style

A poorly designed style calculation directly hurts browser kernel performance, so the algorithms here deserve careful analysis.

In the array model, the matching results are placed in an array whose initial size is the number of CSS properties. Call it the result array.

  1. Organize the default, user, and author style sheets into one large array. To match a tag, for example to compute the style of the a tag in:

    <p> <a href="#">link</a> </p>

    Because the array is ordered by tag name, a binary search locates the start and end positions of a, which are 1 --> 3.

  2. Run check selector on each item in that range. If it succeeds, store the item in the result array.
  3. When storing a match in the result array, check whether the property is already present. If it is, compare the weights of the two properties, and replace the existing item if the new weight is greater.
  4. Treat all universal-tag-name items in the array as candidates too, and repeat steps 2-3.

Conclusion: this algorithm must test every item with the same tag name plus every universal item, and after each successful check selector it must also determine whether the property already exists before inserting into the result array.

Another serious problem is that the algorithm does not record information about the matched selectors during check selector. This creates a lot of uncertainty for later partial updates, so partial layout cannot determine whether a relayout is needed. WebKit, by contrast, does a great deal to resolve such uncertainty and so minimizes the layout work required.
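The array-model steps can be sketched in Python using the example rules from the array-model section. Everything here is illustrative: `ENTRIES`, `check_selector`, and `compute_style` are hypothetical names, and the toy matcher understands only `tag` and `parent > tag` selectors.

```python
from bisect import bisect_left, bisect_right

# Entries: (rightmost tag, selector, property, value, weight), sorted by tag.
# "*" would stand for universal rules; the example data has none.
ENTRIES = sorted([
    ("a", "a", "color", "yellow", 1),
    ("a", "p > a", "color", "red", 2),
    ("a", "p > a", "background-color", "black", 2),
    ("div", "div", "margin", "1px", 3),
])
TAGS = [entry[0] for entry in ENTRIES]


def check_selector(selector, tag, parent_tag):
    """Toy matcher: supports only 'tag' and 'parent > tag' selectors."""
    parts = [part.strip() for part in selector.split(">")]
    if parts[-1] not in (tag, "*"):
        return False
    return len(parts) == 1 or parts[-2] == parent_tag


def compute_style(tag, parent_tag):
    result = {}  # property -> (value, weight); plays the role of the result array
    for key in (tag, "*"):
        # Step 1: binary search for the range of items with this tag name.
        lo, hi = bisect_left(TAGS, key), bisect_right(TAGS, key)
        for _, selector, prop, value, weight in ENTRIES[lo:hi]:
            if not check_selector(selector, tag, parent_tag):  # step 2
                continue
            # Step 3: keep the declaration with the larger weight.
            if prop not in result or weight > result[prop][1]:
                result[prop] = (value, weight)
    return {prop: value for prop, (value, _) in result.items()}
```

For an `a` inside a `p`, the `p > a` declarations win over the plain `a` rule because of their larger weight; note that every insertion pays for the "does this property already exist" lookup, which is the cost the conclusion above points out.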

The hash model: as in the array model, the matching results are stored in an array:
Vector<const RuleData*, 32> m_matchedRules;

  1. First, determine whether the element can share an existing RenderStyle. There are many conditions that decide this, which are not detailed here; for more information, see here. The policy pays off because many tags on a typical web page can share. When a style can be shared, the matching algorithm is skipped entirely, so execution is naturally very fast.
    In my tests on www.sina.com.cn, styles were requested 17864 times and shared 4764 times. Nearly 27% of the style computations were unnecessary, which means style calculation performance improved by about 27%. The site has 9686 nodes and 3412 elements.
  2. Match the default style sheet, user style sheet, and author style sheet in that order, and store the results in the result array. Record where each style sheet's matches start in the result array.
      The matching algorithm for each style sheet:
    1. If the element has an id attribute, fetch the corresponding CSSRuleDataList from the id hash table of the CSSRuleSet.
    2. Test each CSSRuleData item in the CSSRuleDataList in turn. The Bloom filter is applied first to discard style rules that cannot match.
    3. Run check selector; if the match succeeds, add the rule to the result array.
    4. Sort the result array by weight.
    5. If the element has a class attribute, fetch the corresponding CSSRuleDataList from the class hash table of the CSSRuleSet. Repeat steps 2 --> 4.
    6. Fetch the CSSRuleDataList for the element's tag name from the CSSRuleSet. Repeat steps 2 --> 4.
    7. Repeat steps 2 --> 4 for the CSSRuleDataList of all universal rules.
  3. The flowchart of how the preceding steps generate the result array is as follows:

  4. After obtaining all matched style rules, generate the RenderStyle from this result. The algorithm steps are shown in the following figure. Pay attention to:
    1. How the cascade priority of the different style sheets is reflected.
    2. How, within one style sheet, a later declaration of the same property with the same weight overrides the earlier one.
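The hash-model steps above can be condensed into a short Python sketch. It is a simplification under stated assumptions, not WebKit's algorithm: `Rule`, `build_rule_set`, `match_sheet`, and `compute_style` are hypothetical names, each rule is keyed by a single bucket string, the Bloom filter and the full check selector step are omitted (every bucket hit is treated as a match), and `!important` is ignored.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule record; not a WebKit class."""
    key: str            # bucket key: "#id", ".class", "tag", or "*"
    specificity: int    # the "weight" of the selector
    position: int       # source order within its style sheet
    declarations: dict


def build_rule_set(rules):
    """Group rules by bucket key, standing in for the four hash tables."""
    buckets = {}
    for rule in rules:
        buckets.setdefault(rule.key, []).append(rule)
    return buckets


def match_sheet(rule_set, tag, element_id="", classes=()):
    """Collect one sheet's candidate rules, sorted by weight then source order."""
    keys = [f"#{element_id}"] if element_id else []
    keys += [f".{name}" for name in classes]
    keys += [tag, "*"]
    matched = [rule for key in keys for rule in rule_set.get(key, [])]
    return sorted(matched, key=lambda rule: (rule.specificity, rule.position))


def compute_style(sheets, tag, element_id="", classes=()):
    """Apply default, user, then author matches; later declarations override."""
    style = {}
    for rule_set in sheets:
        for rule in match_sheet(rule_set, tag, element_id, classes):
            style.update(rule.declarations)
    return style
```

Because the sheets are applied in order and each sheet's matches are already sorted, a plain "last write wins" update is enough; no per-property weight comparison is needed, in contrast to step 3 of the array model.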

Questions

It is hard to cover every aspect of a kernel module in one article, so I list some questions here for discussion and study. They are specific problems I ran into in practice. They are hard to appreciate without having actually developed a CSS module, and WebKit handles them well, which taught me a lot.

  1. If a style tag is written in the body, what should WebKit do after parsing it? Should a style tag be written in the body at all?
  2. The :hover pseudo-class is widely used in CSS. How should the browser kernel handle it? Must CSS be recalculated when an element enters the hover state?
  3. CSS objects are small-grained but numerous: everything from the CSSStyleSheetList down to a single CSSValue is represented as an object. When the CSS document is large, will memory become too fragmented?
  4. CSS objects are small-grained but numerous, and repeatedly allocating and freeing them takes a lot of time. Is there a better way? Should the CSS module manage the memory of its objects itself?
  5. CSS objects are small-grained but numerous. Can these small objects be shared? The flyweight design pattern is well reflected in CSS.
  6. When the class or id attribute of an element changes, its RenderStyle must be recalculated to see whether the element needs a relayout. Given a sibling selector such as p.class1 + a, does that mean all siblings of the changed element need their RenderStyle recalculated or need a relayout?
  7. Given the :first-child pseudo-class, does inserting a new first child into a parent node mean that all of its children must be restyled, because a :first-child selector might apply to them? How does WebKit handle this?
  8. A CSS value can be inherit, meaning the parent's property value is used. If the parent's property value changes, how is the child's value updated?
  9. This article says little about the Bloom filter in CSS, but it greatly optimizes the check selector step. Interested readers can start from these three points in the WebKit source code:
    1. When a CSSRuleData object is created, its Bloom filter data is generated.
    2. During HTML parsing, the Bloom filter is updated in the beginParsingChildren event of each element.
    3. In matchRulesForList, the fastRejectSelector method is where the filtering takes place.

    Note: this optimization exists only in newer WebKit versions, not in older ones.

  10. computedStyle records all property values of each element, and the layout engine reads property values from it very frequently. Should computedStyle be designed as an array? WebKit uses the RenderStyle object to represent computedStyle. What are the advantages of this design? Consider two points:
    1. This object uses much less memory than an array would.
    2. This object takes much less retrieval time than an array would.
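As a starting point for question 9, the fast-reject idea can be sketched in Python. This is an assumption-laden illustration, not the WebKit code: `CountingFilter` and `fast_reject` are hypothetical names, the filter is assumed to hold the tag names, ids, and classes of the current element's ancestors, and the use of counters (so identifiers can be removed when an element's children are finished) is an assumption rather than something stated in this article.

```python
import zlib


class CountingFilter:
    """Counting Bloom filter sketch: supports removal, may give false positives."""

    def __init__(self, size=4096):
        self.counts = [0] * size

    def _slots(self, key):
        data = key.encode()
        # Two cheap hashes from the standard library.
        return (zlib.crc32(data) % len(self.counts),
                zlib.adler32(data) % len(self.counts))

    def add(self, key):
        for slot in self._slots(key):
            self.counts[slot] += 1

    def remove(self, key):
        for slot in self._slots(key):
            self.counts[slot] -= 1

    def may_contain(self, key):
        return all(self.counts[slot] > 0 for slot in self._slots(key))


def fast_reject(ancestor_filter, required_ancestor_keys):
    """True if an identifier the selector needs is definitely not on any ancestor."""
    return any(not ancestor_filter.may_contain(key)
               for key in required_ancestor_keys)
```

A selector such as `.sidebar a` would carry `.sidebar` as a required ancestor key; if the filter says it is definitely absent, the full check selector walk up the tree can be skipped. A false positive only costs one unnecessary full check, never a wrong result.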
Summary

A CSS engine does only a few things (parsing and computing styles) and is often overlooked, but designing a flexible and efficient one is genuinely hard. While analyzing WebKit's CSS implementation, I was often excited for a long time by its design highlights, so do not overlook the WebKit CSS module. It will surprise you!
