How Rendering Works (in WebKit and Blink)

Source: Internet
Author: User

Many articles about rendering have been written since browser engines first appeared. Still, I have always wanted to write one in the spirit of How Browsers Work: an article that systematically and comprehensively describes how a browser's rendering engine works and how it optimizes the rendering performance of web pages. I kept putting it off for fear of the time and effort it would take. In any case, I have finally worked up the courage to write it, and I hope I can do it justice...


The main contents of this article are as follows:

  • Rendering basics: DOM & RenderObject & RenderLayer
  • WebView, painting and compositing, multi-threaded rendering
  • Hardware acceleration
  • Tile rendering
  • Accelerated compositing of layers
  • Web game rendering: Canvas & WebGL


First, let's clarify what rendering means in this article. The browser engine is usually called the web page rendering engine, but rendering in that name is used in a broad, generic sense: it covers all the main work of the engine, namely loading, parsing, layout, and painting. In this article, rendering refers only to the painting-related part, that is, how the browser displays the result of layout on the screen. If you want a general understanding of browser engines, especially WebKit, then How Browsers Work, How WebKit Works, and WebKit for Developers are good introductory guides.


Second, this article mainly describes the implementation in the WebKit engine. However, because Blink forked from WebKit only recently, the two still share essentially the same overall rendering architecture, so the article does not clearly distinguish between them.


Finally, I hope this article will provide a quick start guide for developers engaged in browser kernel development, especially rendering engine development, and provide sufficient knowledge and help for front-end developers to optimize webpage rendering performance.


As the article may be revised and supplemented in the future, if you want to view the latest content, visit the original article on the author's personal blog: Author (roger2yi@gmail.com).


Rendering basics: DOM & RenderObject & RenderLayer

Image from GPU Accelerated Compositing in Chrome


When the browser loads an HTML file from the network or the local file system and parses it, the engine generates its most important data structure, the DOM tree. Each node in the DOM tree corresponds to an element in the web page, and the page can manipulate this tree through JavaScript to change its structure dynamically. However, the DOM tree itself cannot be used directly for layout and rendering. The engine therefore generates another tree, the Render tree. Each node in the Render tree is a RenderObject, and RenderObjects correspond almost one-to-one with DOM nodes. When a visible DOM node is added to the DOM tree, the engine creates a corresponding RenderObject for it and adds it to the Render tree.
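
The relationship between the two trees can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `DomNode` and `RenderObject` classes and the `build_render_tree` function are hypothetical names, not WebKit APIs, and visibility is reduced to a single `display` value.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    display: str = "block"  # simplified stand-in for the computed CSS display value
    children: list = field(default_factory=list)


@dataclass
class RenderObject:
    node: DomNode
    children: list = field(default_factory=list)


def build_render_tree(node):
    """Return a RenderObject for a visible DOM node, or None if it is not rendered."""
    if node.display == "none":
        return None  # an invisible subtree gets no RenderObject at all
    render_object = RenderObject(node)
    for child in node.children:
        child_render = build_render_tree(child)
        if child_render is not None:
            render_object.children.append(child_render)
    return render_object


dom = DomNode("html", children=[
    DomNode("head", display="none"),
    DomNode("body", children=[DomNode("p"), DomNode("script", display="none")]),
])
render_root = build_render_tree(dom)
```

In this sketch the `head` and `script` nodes stay in the DOM tree but have no counterpart in the Render tree, which is why the correspondence is only "almost" one-to-one.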


Picture from How WebKit Works


The Render tree is the main object that the browser's layout engine works on. Based on the style definitions from the DOM tree and the CSS style sheets, the layout engine applies predefined layout rules to determine the final structure of the Render tree, including the size and position of each RenderObject. A laid-out Render tree is in turn the main input of the browser's rendering engine. You can think of the Render tree as a bridge between the layout engine and the rendering engine: it is the output of the former and the input of the latter.


Picture from How WebKit Works


However, the rendering engine does not paint directly from the Render tree. To handle positioning, clipping, overflow scrolling, CSS Transform/Opacity/Animation/Filter, masks and reflections, z-ordering, and so on, the browser generates yet another tree, the Layer tree. The rendering engine creates a RenderLayer for certain RenderObjects, and each of these RenderObjects belongs directly to its own RenderLayer. A descendant that has no RenderLayer of its own belongs to the RenderLayer of its nearest ancestor that has one. In the end, every RenderObject belongs, directly or indirectly, to a RenderLayer.


Conditions under which a RenderObject gets its own RenderLayer, from GPU Accelerated Compositing in Chrome:

  • It's the root object for the page
  • It has explicit CSS position properties (relative, absolute or a transform)
  • It is transparent
  • Has overflow, an alpha mask or reflection
  • Has a CSS filter
  • Corresponds to canvas element that has a 3D (WebGL) context or an accelerated 2D context
  • Corresponds to a video element
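
The conditions above, together with the rule that every RenderObject ends up in some RenderLayer, can be sketched roughly as follows. The `RenderObject` class, the `style` dictionary keys (including the `accelerated` flag for canvas), and both function names are hypothetical simplifications for illustration, not the actual WebKit logic.

```python
from dataclasses import dataclass, field


@dataclass
class RenderObject:
    tag: str
    style: dict = field(default_factory=dict)  # simplified computed style
    parent: object = None                      # None marks the root object


def needs_layer(obj):
    """Rough translation of the listed conditions into a predicate."""
    s = obj.style
    return (
        obj.parent is None                               # root object for the page
        or s.get("position") in ("relative", "absolute")
        or "transform" in s
        or s.get("opacity", 1.0) < 1.0                   # transparent
        or s.get("overflow", "visible") != "visible"
        or "mask" in s
        or "reflection" in s
        or "filter" in s
        or (obj.tag == "canvas" and s.get("accelerated", False))
        or obj.tag == "video"
    )


def layer_owner(obj):
    """Walk up until a RenderObject that owns a RenderLayer is found."""
    while not needs_layer(obj):
        obj = obj.parent
    return obj  # the loop always ends, because the root owns a layer


root = RenderObject("html")
box = RenderObject("div", {"position": "absolute"}, root)
text = RenderObject("span", {}, box)
plain = RenderObject("p", {}, root)
```

Here `text` belongs to the layer created for `box`, while `plain` falls back to the root layer.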


The rendering engine traverses the Layer tree, visits each RenderLayer, and then traverses the RenderObjects that belong to that RenderLayer, painting each one. You can think of it this way: the Layer tree determines the order in which the layers of the page are painted, the RenderObjects belonging to a RenderLayer determine the content of that Layer, and all the RenderLayers and RenderObjects together determine the final content displayed on the page.


The order in which the browser paints RenderLayers and RenderObjects in software rendering mode, from GPU Accelerated Compositing in Chrome:


In the software path, the page is rendered by sequentially painting all the RenderLayers, from back to front. The RenderLayer hierarchy is traversed recursively starting from the root and the bulk of the work is done in RenderLayer::paintLayer() which performs the following basic steps (the list of steps is simplified here for clarity):

In this mode RenderObjects paint themselves into the destination bitmap by issuing draw calls into a single shared GraphicsContext (implemented in Chrome via Skia).
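
A back-to-front traversal of this kind might look like the sketch below. It is a simplified model, not the real `paintLayer()`: the `RenderLayer` class is hypothetical, a plain list stands in for the shared GraphicsContext, and the ordering rule (children with a negative z-index first, then the layer's own RenderObjects, then the remaining children) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RenderLayer:
    name: str
    z_index: int = 0
    render_objects: list = field(default_factory=list)  # names of owned RenderObjects
    children: list = field(default_factory=list)


def paint_layer(layer, context):
    """Paint one layer and its descendants, back to front, into a shared context."""
    # sorted() is stable, so layers with equal z_index keep their document order.
    ordered = sorted(layer.children, key=lambda child: child.z_index)
    for child in ordered:
        if child.z_index < 0:
            paint_layer(child, context)       # layers below this one
    for obj in layer.render_objects:
        context.append(f"{layer.name}/{obj}")  # the RenderObject "paints itself"
    for child in ordered:
        if child.z_index >= 0:
            paint_layer(child, context)       # layers above this one


root = RenderLayer("root", render_objects=["body", "p"], children=[
    RenderLayer("popup", z_index=2, render_objects=["div"]),
    RenderLayer("backdrop", z_index=-1, render_objects=["img"]),
])
draw_calls = []  # stands in for the single shared GraphicsContext
paint_layer(root, draw_calls)
```

After the call, `draw_calls` holds the draw order: the backdrop first, then the root's own objects, then the popup on top.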


WebView, painting and compositing, multi-threaded rendering

The image is from [UC browser 9.7 for Android], with a WebView in the middle and a title bar and toolbar at the top.


The browser cannot change the pixels on the screen directly; it has to go through the system's own GUI toolkit. So a browser generally wraps the web page to be displayed in a UI component, usually called a WebView, and places that WebView in the application's UI. That is how the web page ends up on the screen.


Some GUI toolkits, such as Android's, do not give UI components their own bitmap cache by default. All the UI components that make up the interface are drawn directly into the current window cache, so every time a WebView is drawn, the RenderLayers and RenderObjects in its visible area are drawn into the window cache one by one. This approach has a very serious problem: when the user drags a web page or triggers an inertial scroll, scrolling performance is very poor. Even if the page moves by only one pixel, the entire WebView has to be redrawn, which means painting every RenderLayer and RenderObject in a WebView-sized area. That usually takes a long time. For some complex desktop web pages, a single paint may take hundreds of milliseconds on a mobile device, whereas a smooth 60 frames per second requires that each frame be rendered in no more than 16.7 milliseconds. In this rendering mode it is therefore clearly impossible to scroll a page smoothly, and scrolling smoothness is the most direct and most important impression users have of a browser's rendering performance.


To improve scrolling performance, a simple method is to let the WebView hold its own independent cache and to draw the WebView in two steps: (1) update the internal cache as needed by drawing web content into it, and (2) copy the internal cache to the window cache. The first step is usually called painting or rasterization; it converts drawing commands into actual pixel color values. The second step is called compositing; it is responsible for copying the cache and may also involve translation, scaling, rotation, and alpha blending. At first glance, rendering has become more complex than the original one-step operation. In practice, however, compositing usually takes far less time than painting web content; compositing generally finishes within a few milliseconds even on mobile devices. And most of the time the first step only needs to paint a small area rather than the whole WebView, which effectively reduces the cost of painting. Take page scrolling as an example: only the part that newly enters the WebView's visible area needs to be painted. If the page scrolls up by 10 pixels, the area to be painted is 10 x the width of the WebView, which is much faster to paint than the whole WebView area as before.
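
To put rough numbers on the scrolling example, the following sketch compares the pixels that must be painted with and without a WebView cache. The 720 x 1280 WebView size is a hypothetical value chosen for illustration, and the function names are not from any browser code base.

```python
def full_repaint_pixels(view_width, view_height):
    """Without a cache, every scroll step repaints the whole WebView area."""
    return view_width * view_height


def scroll_repaint_pixels(view_width, view_height, scroll_dy):
    """With a cache, only the strip that newly enters the view is painted."""
    exposed_rows = min(abs(scroll_dy), view_height)
    return view_width * exposed_rows


WIDTH, HEIGHT = 720, 1280  # hypothetical WebView size in pixels
full = full_repaint_pixels(WIDTH, HEIGHT)        # 921600 pixels
partial = scroll_repaint_pixels(WIDTH, HEIGHT, 10)  # 7200 pixels
ratio = full / partial                           # 128.0
```

Under these assumptions, a 10-pixel scroll paints 128 times fewer pixels than a full redraw; the rest of the frame is a cache copy done in the compositing step.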


In addition, the browser can use a multi-threaded rendering architecture: painting web content into the cache is moved to a separate thread (the painting thread), while the thread that originally rendered the WebView is now responsible only for copying the cache (the compositing thread). The painting thread and the compositing thread can work together synchronously, partially synchronously, or fully asynchronously, so the browser can trade off performance against visual quality as needed. For example, in asynchronous mode, when the browser needs to copy the WebView cache to the window cache but the content to be updated has not been painted yet, it can fill the stale part with a background color or leave it blank. The visual result is worse, but the update interval of each window frame stays within the ideal range. The browser can also create a cache for the WebView that is larger than the WebView itself, which lets it cache more web content and paint not-yet-visible areas in advance. This effectively reduces the blank areas seen in asynchronous mode and strikes a better balance between performance and visual quality.
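
The effect of an oversized cache in asynchronous mode can be illustrated with a one-dimensional model that counts how many rows of the viewport would have to be shown blank. The function name, the 1280-pixel viewport height, the 300-pixel fling, and the 640-pixel margin are all assumptions made for this example.

```python
def blank_rows(viewport_top, viewport_height, cache_top, cache_height):
    """Rows of the viewport the cache does not cover (drawn as background color)."""
    visible_start, visible_end = viewport_top, viewport_top + viewport_height
    cached_start, cached_end = cache_top, cache_top + cache_height
    covered = max(0, min(visible_end, cached_end) - max(visible_start, cached_start))
    return viewport_height - covered


VIEW_H = 1280  # hypothetical WebView height in pixels
MARGIN = 640   # hypothetical extra rows cached above and below the view

# The user flings 300 px before the painting thread has caught up.
tight = blank_rows(300, VIEW_H, cache_top=0, cache_height=VIEW_H)
padded = blank_rows(300, VIEW_H, cache_top=-MARGIN, cache_height=VIEW_H + 2 * MARGIN)
```

With a cache exactly the size of the view, 300 rows are blank in this frame; with the padded cache, none are, at the cost of the extra memory the margin requires.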


Hardware acceleration

The rendering modes above are carried out entirely by the CPU, for both painting and compositing, and make no use of the GPU. Painting is a complex task that is hard to move to the GPU: the GPU is sometimes less efficient than the CPU at drawing complex graphics and text, and the system resource overhead is high. Compositing is different. GPUs excel at processing many pixels in parallel, so the GPU is much faster than the CPU at compositing, especially when scaling, rotation, or alpha blending is involved. Compositing is also a comparatively simple task, so moving it to the GPU is not difficult.


In the multi-threaded rendering mode, painting and compositing run on different threads, with painting using the CPU and compositing using the GPU, so the CPU and GPU work concurrently and the browser's overall rendering performance improves. Furthermore, window updates are driven by the compositing thread: the more efficient compositing is, the shorter the interval between window updates and the smoother UI changes appear. As long as the window update interval always stays within 16.7 milliseconds, the UI can sustain the maximum smoothness of 60 frames per second. (Displays generally refresh at 60 Hz, so 60 frames per second is already the maximum useful frame rate. Exceeding it has little value, and the OS graphics subsystem itself forces UI updates to be synchronized with screen refreshes.)
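
The 16.7 millisecond figure and the benefit of running painting and compositing concurrently can be checked with a little arithmetic. The model below is deliberately idealized: it assumes perfect overlap between the two stages and ignores synchronization costs, and the 12 ms and 6 ms stage times are hypothetical.

```python
def frame_budget_ms(refresh_hz):
    """Time available per frame at a given display refresh rate."""
    return 1000.0 / refresh_hz


def serial_interval_ms(paint_ms, composite_ms):
    """One thread does both steps, so each frame costs their sum."""
    return paint_ms + composite_ms


def concurrent_interval_ms(paint_ms, composite_ms):
    """Idealized pipeline: the slower stage sets the steady-state frame interval."""
    return max(paint_ms, composite_ms)


budget = frame_budget_ms(60)                       # about 16.7 ms
paint_ms, composite_ms = 12.0, 6.0                 # hypothetical stage times
serial_ok = serial_interval_ms(paint_ms, composite_ms) <= budget
concurrent_ok = concurrent_interval_ms(paint_ms, composite_ms) <= budget
```

With these numbers the serial path needs 18 ms per frame and misses the budget, while the concurrent path is bounded by the 12 ms paint stage and fits within it.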


Therefore, for modern browsers, hardware acceleration means using the GPU for compositing while still using the CPU for painting.


Tile rendering

The image is from [UC browser 9.7 for Android] and is segmented by X.

The cache of a web page is usually not one large block but a grid of small blocks, typically 256x256 or 512x512 in size. This rendering method is called tile rendering. The main reasons for using tile rendering are:

In short, using small fixed-size caches managed by a unified cache pool has many advantages over having each WebView hold one large cache, and it is especially well suited to the multi-threaded model in which the CPU and GPU render concurrently. Browsers that support hardware acceleration therefore almost all use tile rendering.
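
A minimal sketch of the idea, assuming 256x256 tiles: one function maps a rectangle of the page to the grid tiles it touches, and a small pool hands out fixed-size buffers for reuse. `tiles_for_rect` and `TilePool` are hypothetical names, and integer ids stand in for real bitmap buffers.

```python
TILE = 256  # tile edge length in pixels


def tiles_for_rect(x, y, width, height, tile_size=TILE):
    """Return the (col, row) grid coordinates of every tile the rectangle touches."""
    first_col, first_row = x // tile_size, y // tile_size
    last_col = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


class TilePool:
    """Hands out a fixed number of equally sized buffers and takes them back."""

    def __init__(self, capacity):
        self.free = list(range(capacity))  # buffer ids stand in for real bitmaps
        self.in_use = {}                   # (col, row) -> buffer id

    def acquire(self, key):
        if key not in self.in_use:
            if not self.free:
                return None                # pool exhausted; release a tile first
            self.in_use[key] = self.free.pop()
        return self.in_use[key]

    def release(self, key):
        self.free.append(self.in_use.pop(key))


# A hypothetical 720 x 1280 view scrolled 100 px down the page.
visible_tiles = tiles_for_rect(0, 100, 720, 1280)

pool = TilePool(capacity=2)
first = pool.acquire((0, 0))
second = pool.acquire((1, 0))
exhausted = pool.acquire((2, 0))   # None: both buffers are in use
pool.release((0, 0))               # tile (0, 0) scrolled out of view
reused = pool.acquire((2, 0))      # gets the buffer that (0, 0) gave back
```

Because every buffer has the same size, a tile that scrolls out of view can give its buffer to any tile that scrolls in, without reallocation.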


Accelerated compositing of layers

The picture is from [UC browser 9.7 for Android]. It shows four Layers in the page that have their own caches: the Base Layer at the bottom, the fixed title bar at the top, the hot news bar in the middle, and the fixed jump button in the lower right.


The Accelerated Compositing rendering architecture was developed by Apple and first implemented in Safari; Chrome, Android, Qt, GTK+, and others have since completed their own implementations. If you are familiar with GUI programming on iOS or Mac OS, it will look familiar, because it is similar to the layer rendering architecture of iOS CoreAnimation. It mainly addresses the poor rendering performance of the original architecture when the content of a Layer changes frequently, or when a Layer runs a 2D/3D transform or fade animation so that properties such as its position, scale, rotation, and opacity change continuously.


In a rendering architecture without accelerated compositing, RenderLayers do not have their own independent caches; they are all painted, in order, into the same cache. So whenever the content of a Layer changes, or one of its CSS style properties such as Transform or Opacity changes, the cache for the changed region must be regenerated. Not only the changed Layer has to be repainted, but also every other Layer that intersects the damage region. As mentioned earlier, painting web content is very time-consuming. That does not matter if a Layer changes only occasionally, but if a JavaScript or CSS animation keeps driving the Layer to change, it is practically impossible for the animation to run at a smooth 60 frames per second.


In the accelerated compositing architecture, some RenderLayers have their own independent caches; these are called compositing layers. WebKit creates a corresponding GraphicsLayer for each of these RenderLayers, and each browser has to provide its own GraphicsLayer implementation to manage cache allocation, release, updates, and so on. A RenderLayer that has a GraphicsLayer is painted into its own cache. A RenderLayer without one walks up its parent/ancestor chain, as far as the root RenderLayer if necessary, until it finds a RenderLayer that has a GraphicsLayer, and is painted into that ancestor's cache. The root RenderLayer always creates a GraphicsLayer and has its own cache. In the end, the GraphicsLayers form a tree parallel to the RenderLayer tree, and the relationship between RenderLayer and GraphicsLayer is similar to that between RenderObject and RenderLayer.
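
The lookup described here, from a RenderLayer to the cache it is painted into, amounts to a walk up the ancestor chain. The sketch below assumes a hypothetical `RenderLayer` class with a `composited` flag that is true when the layer owns a GraphicsLayer; it is not WebKit's actual data structure.

```python
from dataclasses import dataclass


@dataclass
class RenderLayer:
    name: str
    parent: object = None     # None marks the root RenderLayer
    composited: bool = False  # True when the layer owns a GraphicsLayer


def backing_layer(layer):
    """Return the nearest RenderLayer (self or ancestor) that owns a GraphicsLayer."""
    current = layer
    while current.parent is not None and not current.composited:
        current = current.parent
    return current  # the walk stops at the root, which always has its own cache


root = RenderLayer("root", composited=True)
header = RenderLayer("fixed-header", parent=root, composited=True)
title = RenderLayer("title", parent=header)
article = RenderLayer("article", parent=root)
```

In this example `title` is painted into the cache of `fixed-header`, while `article` is painted into the root's cache.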


In the accelerated compositing architecture, compositing a page is also more complex than before. Instead of simply copying one cache to the window cache, the browser has to copy several caches belonging to different Layers, apply any 2D/3D transforms, and alpha-blend the caches together. With hardware acceleration, however, the GPU still completes this compositing quickly.

Conditions under which a RenderLayer gets its own GraphicsLayer, from GPU Accelerated Compositing in Chrome


In the accelerated compositing architecture, when the content of a Layer changes, only the cache of the GraphicsLayer it belongs to needs to be updated, and only the RenderLayers that belong directly or indirectly to that GraphicsLayer need to be painted, rather than all RenderLayers. Moreover, changes to certain CSS style properties do not change the content at all: the browser only needs to change some compositing parameters of the GraphicsLayer and composite again, which is very fast compared with painting. Such CSS properties are generally called accelerated. Which properties are accelerated varies between browsers, but CSS Transform and Opacity are accelerated in essentially every browser that supports accelerated compositing. Animating accelerated CSS properties makes it much easier to reach a smooth 60 frames per second.
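
The difference between an accelerated property and an ordinary one can be modeled by counting repaints. In this sketch the `GraphicsLayer` class and its `set_style` method are hypothetical, and the set of accelerated properties is limited to the two named in the text.

```python
ACCELERATED = {"transform", "opacity"}  # the two properties named in the text


class GraphicsLayer:
    """Toy model: a cache plus the parameters used when compositing it."""

    def __init__(self, name):
        self.name = name
        self.compositing_params = {"transform": "none", "opacity": 1.0}
        self.repaint_count = 0

    def set_style(self, prop, value):
        if prop in ACCELERATED:
            # Content is unchanged: update a parameter and composite again.
            self.compositing_params[prop] = value
        else:
            # Content changed: the layer's cache has to be repainted.
            self.repaint_count += 1


leaf = GraphicsLayer("leaf")
for frame in range(60):                   # one second of a fade-out animation
    leaf.set_style("opacity", 1.0 - frame / 60)
repaints_after_animation = leaf.repaint_count

leaf.set_style("color", "red")            # a non-accelerated property
repaints_after_color_change = leaf.repaint_count
```

Sixty opacity updates leave the repaint counter at zero, while a single change to a non-accelerated property such as `color` forces one repaint of the layer's cache.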


The picture comes from Understanding Hardware Acceleration on Mobile Browsers. It shows how the layers are composited in the classic CSS animation demo Falling Leaves, whose Transform and Opacity animations are accelerated in all browsers that support accelerated compositing.


However, more RenderLayers with independent caches is not always better. Too many of them bring serious side effects. First, they greatly increase memory overhead, which hits mobile devices hardest and can even make it impossible for a browser to support layer-based accelerated compositing on low-memory devices. Second, they increase the time cost of compositing and so reduce compositing performance. Compositing performance is closely tied to the smoothness of scrolling and zooming, so the user ends up feeling that these operations are not smooth enough.
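
A back-of-the-envelope estimate shows how quickly layer caches add up. The calculation assumes 4 bytes per pixel (32-bit RGBA) and ignores rounding up to whole tiles; the four layer sizes are hypothetical values loosely modeled on a page with a base layer, a title bar, a news bar, and a button.

```python
BYTES_PER_PIXEL = 4  # assuming a 32-bit RGBA cache


def layer_cache_bytes(width, height):
    """Memory needed for one layer's cache, ignoring tile rounding."""
    return width * height * BYTES_PER_PIXEL


# Hypothetical layer sizes in pixels: base layer, title bar, news bar, button.
layers = [(720, 1280), (720, 96), (720, 200), (96, 96)]
total_bytes = sum(layer_cache_bytes(w, h) for w, h in layers)
total_mib = total_bytes / (1024 * 1024)
```

Under these assumptions the four caches take roughly 4.4 MiB, most of it for the base layer; each additional full-screen layer would add about 3.5 MiB more.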


In chrome://flags, enable "Composited render layer borders" to see which Layers are compositing layers, that is, which have their own independent cache. Front-end developers can use this to keep the creation of compositing layers under control. In general, compositing layers improve painting performance but reduce compositing performance. Only by using compositing layers sensibly can a page strike a good balance between painting and compositing and improve overall rendering performance.


The image is from Chrome and shows the composited render layer borders of the classic CSS animation demo Falling Leaves.


Web game rendering: Canvas & WebGL

The image is from [UC browser 9.7 for Android] and shows a 2D Canvas-based game, which can run at a smooth 60 frames per second on mainstream mobile phones.


In the past, web games were generally implemented in Flash. However, as Flash has been phased out on mobile devices, more and more web games are being developed with Canvas and WebGL. For the basic drawing process of Canvas in the browser, refer to my previous article Introduce My Work. Although ordinary web page elements are painted by the CPU, accelerated 2D Canvas and WebGL are drawn directly by the GPU. They therefore usually have a GL FBO (Framebuffer Object) as their own cache: the Canvas/WebGL content is drawn into this FBO, and the texture attached to the FBO is then copied to the window cache during compositing. Put simply, for accelerated 2D Canvas and WebGL, both painting and compositing use the GPU.


The image is from [UC browser 9.7 for Android] and shows a web page demo of WebGL.


For details about how to optimize the performance of Canvas games, refer to my previous article, High Performance Canvas Game for Android (High Performance Android Canvas Game Development).


Reference Index

How Browsers Work: Behind the scenes of modern web browsers
How WebKit Works
WebKit for Developers
GPU Accelerated Compositing in Chrome
Understanding Hardware Acceleration on Mobile Browsers
Web Page Rendering and Accelerated Compositing
My 2013-Year-end Summary + thoughts on browser rendering development
Introduce My Work
High Performance Canvas Game for Android (High Performance Android Canvas Game Development)
OpenGL Frame Buffer Object (FBO)
