Working mechanism of the browser

Source: Internet
Author: User

I. Overview

Web browsers are probably the most widely used software. In this article I'll explain how they work behind the scenes: we'll see what happens from the moment you enter "google.com" in the address bar until Google's page appears on the browser screen.

1. The browsers we are going to discuss

There are five major browsers in use today: IE, Firefox, Safari, Chrome and Opera. According to web browser statistics, as of September 2009 the combined usage share of Firefox, Safari and Chrome is nearly 60%.

So at the moment, open source browsers are a substantial part of the browser market.

2. The main function of the browser

The main function of a browser is to present the web resource you choose, by requesting it from the server and displaying it in the browser window. The resource is usually an HTML document, but it may also be a PDF, an image, or some other type of content. The location of the resource is specified by the user with a URI (Uniform Resource Identifier). The way the browser interprets and displays HTML files is specified in the HTML and CSS specifications. These specifications are maintained by the W3C (World Wide Web Consortium), the standards organization for the web.
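As a small illustration of how a URI locates a resource, the sketch below splits one into its parts with Python's standard urllib.parse module. The URI itself is made up for the example, and a real browser uses its own URL parser rather than this module.

```python
from urllib.parse import urlsplit

# Hypothetical URI, chosen only for illustration.
uri = "http://www.example.com:8080/docs/index.html?lang=en#intro"

parts = urlsplit(uri)

# Each component tells the browser something different about the resource.
print(parts.scheme)    # protocol used to fetch the resource
print(parts.hostname)  # server to contact
print(parts.port)      # port on that server
print(parts.path)      # location of the resource on the server
print(parts.query)     # extra parameters sent to the server
print(parts.fragment)  # position inside the document, handled by the browser
```

Only the scheme, host, port, path and query are needed to request the resource; the fragment stays on the client side.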

The current version of HTML is 4 (http://www.w3.org/TR/html401/), and version 5 is still in development. The current version of CSS is 2 (http://www.w3.org/TR/CSS2/), and version 3 is still in development. For years, browsers conformed to only part of the specifications and developed their own extensions. This caused serious compatibility problems for web authors. Today most browsers conform to the specifications more or less.

Browser user interfaces have a lot in common with each other. The common user interface elements are:

    • The address bar where the URI is written
    • Back and Forward Buttons
    • Bookmark options
    • Refresh and stop buttons for reloading the current document and for stopping it from loading

Strangely enough, the browser's user interface is not specified in any formal specification. It is the result of many years of mutual imitation and continuous improvement among browser vendors. The HTML5 specification does not define UI elements that a browser must have, but it lists some common elements, among them the address bar, the status bar, and the toolbar. Some browsers also have unique features of their own, like the Firefox download manager.

3. The main structure of the browser

The main parts of the browser are:

    1. User interface: Includes the address bar, back and forward buttons, bookmarks menu, and more. That is, every part of the browser display except the main window where the requested page is shown.
    2. Browser engine: The interface for querying and manipulating the rendering engine.
    3. Rendering engine: Responsible for displaying the requested content. For example, if the requested content is HTML, it parses the HTML and CSS and displays the parsed content on the screen.
    4. Networking: Used for network calls, such as HTTP requests. It has a platform-independent interface, with a separate implementation underneath for each platform.
    5. UI backend: Used to draw basic widgets such as combo boxes and windows. It exposes a generic interface that is not platform specific; underneath, it uses the user interface methods of the operating system.
    6. JS interpreter: Used to parse and execute JavaScript code.
    7. Data storage: The persistence layer. The browser needs to save various kinds of data on the hard disk, such as cookies. The latest HTML specification (HTML5) defines a "web database", a complete, although lightweight, database in the browser.

Figure: Main components of the browser

  PS: Note that, unlike most browsers, Chrome gives each tab its own instance of the rendering engine, and each tab runs in a separate process.

I'll explain each of these components in a later chapter.

4. Communication between components

  Both Firefox and Chrome have developed a special communication infrastructure, which will be discussed in a dedicated chapter later.

II. Rendering engine

  The duty of the rendering engine is, well, rendering: displaying the requested content on the browser screen. By default, the rendering engine can display HTML and XML documents as well as images. Other types of content can be displayed through plugins (a kind of browser extension), for example a PDF document through a PDF viewer plugin. We will discuss plugins and extensions in a dedicated chapter. Here we focus on the main use case of the rendering engine: displaying HTML and images formatted with CSS.

1. Rendering engines

The browsers we are discussing, Firefox, Chrome and Safari, are built on two rendering engines. Firefox uses Gecko, a rendering engine developed by Mozilla itself. Safari and Chrome both use WebKit.

  WebKit is an open source rendering engine that started as an engine for the Linux platform and was later modified by Apple to support Mac and Windows. See http://webkit.org for more details.

2. Main flow

  The rendering engine starts by getting the contents of the requested document from the networking layer, usually in 8 KB chunks.

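  The basic flow of the rendering engine is: parse the HTML to build the DOM tree, build the render tree, lay out the render tree, and paint the render tree.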

The rendering engine first parses the HTML document and converts the tags into DOM nodes in a tree called the content tree. It then parses the style information, both in external CSS files and in style elements. This style information, together with the visual instructions in the HTML, is used to build another tree, the render tree. The render tree consists of rectangles with visual attributes such as color and dimensions, and the rectangles are in the right order to be displayed on the screen.
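To make the first step more concrete, here is a minimal sketch of turning tags into a tree of nodes, using the HTMLParser class from Python's standard library. The Node and TreeBuilder classes and the dump helper are invented for this illustration and assume well-nested markup; a real engine's tree construction handles malformed HTML and is far more involved.

```python
from html.parser import HTMLParser


class Node:
    """A node of the content tree: an element or a piece of text."""

    def __init__(self, name, text=None):
        self.name = name          # tag name, or "#text" for text nodes
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Turns start tags, end tags and text into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]   # stack of currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.open_elements[-1].children.append(node)
        self.open_elements.append(node)

    def handle_endtag(self, tag):
        # Simplification: assumes every end tag closes the innermost element.
        if len(self.open_elements) > 1:
            self.open_elements.pop()

    def handle_data(self, data):
        if data.strip():
            self.open_elements[-1].children.append(Node("#text", data.strip()))


def dump(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label = node.name if node.text is None else f'"{node.text}"'
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><body><p>Hello</p><div>World</div></body></html>")
builder.close()
tree_lines = dump(builder.root)
print("\n".join(tree_lines))
```

The stack of open elements is what lets each new node find its parent: the element on top of the stack is always the innermost one that has not been closed yet.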

  After the render tree is built, it goes through a layout process, which gives each node the exact coordinates where it should appear on the screen. The next step is painting: the render tree is traversed and each node is painted using the UI backend layer.
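The two steps can be sketched with a toy model. Everything here is an assumption made for illustration: the Box class, the fixed heights, the rule that children are simply stacked vertically, and the tuple format of the draw commands. Real layout engines implement the CSS box model and are much more complex.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A render tree node: a rectangle with visual attributes."""
    name: str
    height: int                 # assumed to be known in advance
    color: str = "black"
    children: list = field(default_factory=list)
    x: int = 0                  # filled in by layout()
    y: int = 0


def layout(box, x=0, y=0, indent=10):
    """Assign coordinates; children are stacked vertically inside the parent.

    Returns the y coordinate just below this box.
    """
    box.x, box.y = x, y
    cursor = y
    for child in box.children:
        cursor = layout(child, x + indent, cursor, indent)
    # A container is at least as tall as its stacked children.
    box.height = max(box.height, cursor - y)
    return y + box.height


def paint(box, commands=None):
    """Traverse the tree in order and record one draw command per box."""
    if commands is None:
        commands = []
    commands.append((box.name, box.x, box.y, box.height, box.color))
    for child in box.children:
        paint(child, commands)
    return commands


root = Box("body", 0, children=[
    Box("h1", 30, "blue"),
    Box("p", 20),
    Box("p", 20, "gray"),
])
layout(root)
commands = paint(root)
for command in commands:
    print(command)
```

Layout and painting are kept as separate passes over the same tree, mirroring the order described above: coordinates first, drawing second.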

  It is worth mentioning that this is a gradual process. For a better user experience, the rendering engine tries to display content on the screen as early as possible. It does not wait until all of the HTML has been parsed before it starts to build and lay out the render tree. Parts of the content are parsed and displayed while the rest is still being downloaded from the network.
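The gradual nature of the process can be imitated with Python's standard HTMLParser, which accepts its input piece by piece. In this sketch the ProgressiveParser class, the chunks helper and the 16-character chunk size are assumptions chosen to keep the demonstration small; they are not how a browser is implemented.

```python
from html.parser import HTMLParser


class ProgressiveParser(HTMLParser):
    """Records each element as soon as its start tag has been parsed."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)


def chunks(data, size):
    """Yield successive pieces of `data`, imitating data arriving in chunks."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


document = "<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>"

parser = ProgressiveParser()
progress = []
for chunk in chunks(document, 16):   # tiny chunk size, just for the demo
    parser.feed(chunk)
    # Elements known so far could already be laid out and painted.
    progress.append(list(parser.seen))
parser.close()

for step, tags in enumerate(progress, 1):
    print(step, tags)
```

After the first chunk the parser already knows about the html, body and h1 elements, long before the last paragraph has arrived.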

3. Main flow examples

Figure: WebKit main flow

Figure: Mozilla's Gecko rendering engine main flow

  

The above two images show that although WebKit and Gecko use slightly different terminology, the flow is essentially the same. Gecko calls the tree of visually formatted elements a "frame tree", and each element is a frame. WebKit uses the term "render tree". WebKit calls the placing of elements "layout", while Gecko calls it "reflow". WebKit's term for connecting DOM nodes and style information to build the render tree is "attachment". Gecko also has an extra layer between the HTML and the DOM tree, called the "content sink", which is a factory for making DOM elements. The stages of the flow are discussed below.

4. Parsing

  Because parsing is a very important process in the rendering engine, we'll look at it in more depth. Let's begin with a short introduction to parsing.

  Parsing a document means translating it into a meaningful structure, one that code can understand and use. The result of parsing is usually a tree of nodes that represents the structure of the document, called a parse tree or a syntax tree.

Example: parsing the expression "2+3-1" could return a tree whose inner nodes are the operators and whose leaves are the numbers.
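One possible tree for this expression can be written down directly in code. The nested-tuple representation and the evaluate function below are assumptions made for this sketch; they only show that the tree captures the structure of the expression well enough for a program to use it.

```python
# One possible parse tree for "2+3-1", written as nested tuples of the
# form (operator, left, right); leaves are plain integers.
tree = ("-", ("+", 2, 3), 1)


def evaluate(node):
    """Walk the tree bottom-up and compute the value it represents."""
    if isinstance(node, int):
        return node
    operator, left, right = node
    if operator == "+":
        return evaluate(left) + evaluate(right)
    if operator == "-":
        return evaluate(left) - evaluate(right)
    raise ValueError(f"unknown operator: {operator!r}")


result = evaluate(tree)
print(result)  # prints 4
```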

  

4.1. Grammars

  Parsing is based on the syntax rules that the document obeys: the language or format it was written in. Every format that can be parsed must have a deterministic grammar consisting of vocabulary and syntax rules. This is called a context-free grammar. Human languages do not have this property and therefore cannot be parsed with conventional parsing techniques.

4.2. Parser and lexical analyzer

  Parsing can be divided into two sub-processes: lexical analysis and syntax analysis.

  Lexical analysis is the process of breaking the input into tokens. Tokens are the vocabulary of the language: the collection of valid building blocks. In a human language, they correspond to all the words that appear in the dictionary for that language.

  Syntax analysis is the application of the language's syntax rules.

Parsers usually divide the work between two components: the lexical analyzer (or lexer), which is responsible for breaking the input into valid tokens, and the parser, which is responsible for building the parse tree by analyzing the document structure according to the syntax rules of the language. The lexer knows how to strip irrelevant characters such as whitespace and line breaks.
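A lexical analyzer for a tiny arithmetic language (integers, plus and minus signs) might look like the following sketch. The token names INTEGER, PLUS, MINUS and SKIP and the tokenize function are chosen for this example; production lexers are usually generated by tools or written with more attention to error reporting.

```python
import re

# Token definitions for a toy arithmetic language (integers, "+", "-").
# The group names are chosen for this sketch; SKIP marks characters
# that the lexer silently drops.
TOKEN_PATTERN = re.compile(r"""
    (?P<INTEGER>0|[1-9][0-9]*)
  | (?P<PLUS>\+)
  | (?P<MINUS>-)
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Break `text` into (kind, value) tokens, skipping whitespace."""
    tokens = []
    position = 0
    while position < len(text):
        match = TOKEN_PATTERN.match(text, position)
        if match is None:
            raise SyntaxError(f"unexpected character {text[position]!r}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


tokens = tokenize("2 + 3\n- 1")
print(tokens)
```

The spaces and the line break in the input never reach the parser: the lexer consumes them and emits only the five meaningful tokens.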

  

The parsing process is iterative. The parser usually asks the lexer for a new token and tries to match the token with one of the syntax rules. If a rule is matched, a node corresponding to the token is added to the parse tree and the parser asks for another token. If no rule matches, the parser stores the token internally and keeps asking for tokens until a rule that matches all the internally stored tokens is found. If no such rule is found, the parser raises an exception, which means the document is invalid or contains syntax errors.

4.3. Conversion

In many cases the parse tree is not the final product. Parsing is often used in translation: converting the input document to another format. Compilation is an example. A compiler that turns source code into machine code first parses it into a parse tree and then translates the tree into a machine code document.
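The translation step can be sketched with a toy compiler that turns an arithmetic parse tree into instructions for an imaginary stack machine. The tuple-based tree, the instruction names PUSH, ADD and SUB, and the run function are all invented for this illustration and do not correspond to any real machine code.

```python
def compile_tree(node):
    """Translate a parse tree into instructions for a tiny stack machine.

    The tree uses nested (operator, left, right) tuples with integer leaves.
    """
    if isinstance(node, int):
        return [("PUSH", node)]
    operator, left, right = node
    opcode = {"+": "ADD", "-": "SUB"}[operator]
    # Post-order traversal: operands first, then the operation.
    return compile_tree(left) + compile_tree(right) + [(opcode,)]


def run(program):
    """Execute the instructions and return the value left on the stack."""
    stack = []
    for instruction in program:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right = stack.pop()
            left = stack.pop()
            if instruction[0] == "ADD":
                stack.append(left + right)
            else:
                stack.append(left - right)
    return stack.pop()


program = compile_tree(("-", ("+", 2, 3), 1))
print(program)
print(run(program))
```

The compiler never looks at the original text: once the parse tree exists, the translation works on the tree alone.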

  

4.4. Parsing example

In the example above, we built a parse tree from a mathematical expression. Let's define a simple mathematical language and look at the parsing process.

Vocabulary: Our language can include integers, plus signs and minus signs.

Syntax:

1. The syntax building blocks of the language are expressions, terms and operations.
2. The language can include any number of expressions.
3. An expression is defined as a term followed by an operation followed by another term.
4. An operation is a plus token or a minus token.
5. A term is an integer token or an expression.

  Now let's analyze the input "2+3-1":

The first substring that matches a rule is "2": according to rule 5 it is a term. The second match is "2+3": a term followed by an operation followed by another term, which matches rule 3. The next match is only found at the end of the input. "2+3-1" is an expression because we already know that "2+3" is a term, so we again have a term followed by an operation followed by another term. "2++" does not match any rule and is therefore an invalid input.

4.5. Formal definitions of vocabulary and syntax

Vocabulary is usually expressed by regular expressions.

For example, the vocabulary of our language will be defined as:

INTEGER: 0|[1-9][0-9]*

PLUS: +

MINUS: -

As can be seen, integers are defined by a regular expression.
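The integer definition can be tried out with Python's re module. The sample strings are made up for this sketch; fullmatch is used so that the whole string, not just a prefix, has to fit the definition.

```python
import re

# The integer definition from the text: a lone zero, or a non-zero digit
# followed by any number of digits (so no leading zeros).
INTEGER = re.compile(r"0|[1-9][0-9]*")

samples = ["0", "7", "42", "007", "-5", "3a"]
results = {text: INTEGER.fullmatch(text) is not None for text in samples}
for text, ok in results.items():
    print(f"{text!r}: {'INTEGER' if ok else 'not an INTEGER'}")
```

Note that "-5" is rejected: in this language the minus sign is a separate token, not part of the integer.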

Syntax is usually defined in a format called BNF. Our language will be defined as:

expression := term operation term

operation := PLUS | MINUS

term := INTEGER | expression

We said that a language can be parsed by regular parsers if its grammar is a context-free grammar. An intuitive definition of a context-free grammar is a grammar that can be entirely expressed in BNF. For a formal definition, see http://en.wikipedia.org/wiki/Context-free_grammar

4.6. Types of parsers

  There are two basic types of parsers: top-down parsers and bottom-up parsers. An intuitive explanation is that top-down parsers look at the high-level structure of the syntax and try to match one of its rules. Bottom-up parsers start with the input and gradually transform it into syntax rules, starting from the low-level rules until the high-level rules are matched.

  Let's take a look at how these two types of parsers parse our example:

  The top-down parser starts from the highest-level rule: it identifies "2+3" as an expression and then identifies "2+3-1" as an expression.
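A top-down parser for this language can be sketched as a small recursive descent parser. The Parser class and its method names are invented for the example. One deliberate simplification: because "a term may itself be an expression" would make a naive top-down parser call itself forever, the sketch uses a loop that folds each parsed expression into the left-hand term of the next one.

```python
import re


class Parser:
    """Top-down parser for inputs of the form: term (operation term)+"""

    def __init__(self, text):
        # Integers become one token each; any other non-space character
        # becomes a one-character token.
        self.tokens = re.findall(r"[0-9]+|\S", text)
        self.position = 0

    def peek(self):
        if self.position < len(self.tokens):
            return self.tokens[self.position]
        return None

    def next_token(self):
        token = self.peek()
        self.position += 1
        return token

    def parse_expression(self):
        """Start from the top-level rule and work down to the tokens."""
        tree = self.parse_term()
        while True:
            operation = self.parse_operation()
            right = self.parse_term()
            # The expression parsed so far becomes the left-hand term.
            tree = (operation, tree, right)
            if self.peek() is None:
                return tree

    def parse_operation(self):
        token = self.next_token()
        if token not in ("+", "-"):
            raise SyntaxError(f"expected an operation, got {token!r}")
        return token

    def parse_term(self):
        token = self.next_token()
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected an integer, got {token!r}")
        return int(token)


tree = Parser("2+3-1").parse_expression()
print(tree)
```

Parsing "2++" with the same class raises a SyntaxError, because after the first "+" the parser expects an integer and finds another "+" instead.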

  The bottom-up parser scans the input until a rule is matched, and then replaces the matching input with the rule. This goes on until the end of the input. The partly matched expression is placed on the parser's stack.

  This type of bottom-up parser is called a shift-reduce parser, because the input is shifted to the right (imagine a pointer that starts at the beginning of the input and moves right) and is gradually reduced to syntax rules.
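The shifting and reducing can be traced with a short sketch. The shift_reduce function, the way rules are checked against the top of the stack, and the decision to treat an expression as a term only while input remains are all assumptions of this example, not the algorithm of any particular parser generator.

```python
def shift_reduce(tokens):
    """Bottom-up parse of a token list; returns (stack, remaining) steps."""
    stack = []
    remaining = list(tokens)
    trace = [(list(stack), list(remaining))]

    def reduce_once():
        # Rule: term := INTEGER
        if stack and stack[-1].isdigit():
            stack[-1] = "term"
            return True
        # Rule: expression := term operation term
        if stack[-3:] == ["term", "operation", "term"]:
            del stack[-3:]
            stack.append("expression")
            return True
        # Rule: operation := PLUS | MINUS
        if stack and stack[-1] in ("+", "-"):
            stack[-1] = "operation"
            return True
        # Rule: term := expression, applied only while input remains
        if stack and stack[-1] == "expression" and remaining:
            stack[-1] = "term"
            return True
        return False

    while True:
        while reduce_once():
            trace.append((list(stack), list(remaining)))
        if not remaining:
            break
        stack.append(remaining.pop(0))   # shift the next token onto the stack
        trace.append((list(stack), list(remaining)))

    if stack != ["expression"]:
        raise SyntaxError(f"could not reduce the input, stack is {stack}")
    return trace


trace = shift_reduce(["2", "+", "3", "-", "1"])
for stack, remaining in trace:
    print(stack, remaining)
```

Each printed line shows the stack on the left and the unread input on the right. The input shrinks as tokens are shifted, and the stack ends up holding a single expression.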

4.7. Generating parsers automatically

There are tools that can generate a parser for you, called parser generators. You feed them the grammar of your language, its vocabulary and syntax rules, and they generate a working parser. Creating a parser requires a deep understanding of parsing, and it is not easy to write a well-performing parser by hand, so parser generators can be very useful.

  WebKit uses two well-known parser generators: Flex for creating a lexer and Bison for creating a parser. The input to Flex is a file containing regular expression definitions of the tokens, and the input to Bison is the syntax rules of the language in BNF format.

III. Remarks

This article is a translation of a foreign article. The original is available at http://www.html5rocks.com/en/tutorials/internals/howbrowserswork/. Further updates are pending.


