Chromium Graphics: Principle and Implementation of the Synchronization Mechanism Between GPU Clients, Part I

Source: Internet
Author: User

Abstract: The GPU process architecture in Chromium allows multiple GPU clients to access the GPU service at the same time. Because there may be data dependencies between GPU clients, a synchronization mechanism is needed to guarantee the order of GPU operations. This article discusses synchronization between GPU clients in a multi-process architecture and the basic principle of the SyncPoint mechanism.

GPU process architecture and other basic concepts

Chromium is a multi-process software system. For security and stability, Chromium has a dedicated process (or thread) that interacts with the GPU device and performs GL operations; that is, every GL command must pass through this process before being submitted to the GPU driver. This process is the GPU process. The Renderer process or Browser process must use IPC to interact with the GPU process and tell it which GL commands to execute. In this arrangement, the Renderer process or Browser process is the GPU client, while the GPU process itself is the GPU server.

GL commands travel between the GPU client and the GPU server through the command buffer (CommandBuffer). Chromium designed CommandBuffer to transmit GL commands efficiently between client and server in a multi-process architecture. The buffer lives in shared memory, and the name and parameters of each GL command are serialized and stored in it. When the client needs to submit all GL commands in the command buffer to the GPU server, it sends a GpuCommandBufferMsg_AsyncFlush message to the server. On receiving this message, the server reads the GL commands from CommandBuffer at the specified offset, deserializes them, and submits them to the GPU driver.
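The write, flush, and read cycle described above can be sketched roughly as follows. This is a minimal illustration, not Chromium's implementation: the ToyCommandBuffer class, the Command record layout, and the method names are all assumptions made for the example, and a plain byte vector stands in for the shared memory region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size command record. Chromium's real wire format
// differs; this layout exists only for illustration.
struct Command {
  uint32_t id;    // which GL command to run
  uint32_t arg0;  // first argument
  uint32_t arg1;  // second argument
};

// Hypothetical, simplified command buffer shared by one client and the server.
class ToyCommandBuffer {
 public:
  explicit ToyCommandBuffer(size_t bytes) : shared_(bytes) {}

  // Client side: serialize one command at the current put offset.
  bool Write(const Command& cmd) {
    if (put_ + sizeof(Command) > shared_.size()) return false;  // buffer full
    std::memcpy(shared_.data() + put_, &cmd, sizeof(Command));
    put_ += sizeof(Command);
    return true;
  }

  // Client side: stands in for the flush message. Only the offset is sent;
  // the commands themselves stay in the shared buffer.
  size_t Flush() const { return put_; }

  // Server side: deserialize every command between the get offset and the
  // put offset carried by the flush.
  std::vector<Command> Read(size_t put_offset) {
    assert(put_offset <= shared_.size());
    std::vector<Command> out;
    while (get_ + sizeof(Command) <= put_offset) {
      Command cmd;
      std::memcpy(&cmd, shared_.data() + get_, sizeof(Command));
      out.push_back(cmd);
      get_ += sizeof(Command);
    }
    return out;
  }

 private:
  std::vector<uint8_t> shared_;  // stands in for the shared memory region
  size_t put_ = 0;               // next write offset (client)
  size_t get_ = 0;               // next read offset (server)
};
```

The point of the design is that the IPC message carries only an offset, so the cost of a flush does not grow with the number of GL commands written since the previous flush.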

In Chromium, GPU hardware acceleration is enabled for page rendering and compositing. Any process that requires GPU hardware acceleration is a client of the GPU process. For example, the Renderer process is a GPU client because it asks the GPU process to draw and composite the page content. The Browser process is also a GPU client because it performs the final compositing of the page content together with the address bar and other elements and displays the result on the screen.

To summarize the basic relationship between the GPU client, the GPU server, and the command buffer: each client writes GL commands into its command buffer, and the server reads and executes them.


GPU clients can share textures through the mailbox mechanism. Put simply, the mailbox mechanism generates a unique identifier (name) for each texture, and the GPU process centrally manages the mapping between names and textures. Because these GPU clients belong to the same context share group, all texture resources can be shared among them.
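The name-to-texture mapping can be pictured with a small sketch. Everything here is hypothetical: the ToyMailboxManager class, its Produce and Consume methods, and the counter-based names are assumptions made for illustration and are not taken from Chromium's code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using MailboxName = std::string;  // unique identifier handed between clients
using TextureId = uint32_t;       // texture handle inside the GPU process

// Hypothetical registry that lives in the GPU process and maps each
// mailbox name to the texture it refers to.
class ToyMailboxManager {
 public:
  // Called on behalf of the producing client: publish a texture under a
  // fresh name. A counter keeps the example deterministic; it is only an
  // assumption about how uniqueness could be achieved.
  MailboxName Produce(TextureId texture) {
    MailboxName name = "mailbox-" + std::to_string(next_++);
    textures_[name] = texture;
    return name;
  }

  // Called on behalf of the consuming client: look up the texture behind a
  // name. Returns false if the name is unknown.
  bool Consume(const MailboxName& name, TextureId* out) const {
    std::map<MailboxName, TextureId>::const_iterator it = textures_.find(name);
    if (it == textures_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  std::map<MailboxName, TextureId> textures_;
  uint64_t next_ = 1;
};
```

In this picture, only the name crosses the process boundary (for example inside an IPC message), while the texture itself never leaves the GPU process.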

Synchronization between GPU clients

As described above, the GPU process handles requests from multiple GPU clients, and these clients may have data dependencies on each other's textures. The following uses the rendering of the WebGL page http://get.webgl.org on the Android platform as an example to explain why different GPU clients have data dependencies.

First, the Browser process is a GPU client. It creates a Browser compositor that asks the GPU process to perform the final compositing of the page content, the address bar, and other UI elements (if any) and to render the result to a SurfaceView. The Renderer process is also a GPU client. It creates a Renderer-side compositor for compositing the page content, and this client also contains two independent 3D contexts: one is used for compositing the render layers of the page, and the other serves as the rendering context of WebGL.

Second, the DelegatedRenderer is enabled by default on the Android platform. Therefore, all GPU resources managed by the Renderer compositor (including those of WebGL) are forwarded to the Browser compositor through IPC, and the Browser compositor is then responsible for compositing all textures together. After the final compositing is complete, the Browser compositor notifies the Renderer. At that point, the Renderer-side compositor can safely delete GPU resources that are no longer in use.

Finally, consider how WebGL is rendered. With hardware-accelerated page rendering, the page content is composed of multiple render layers, and each render layer is backed by a texture. WebGL is such a render layer with its own backing texture. For each frame, the Renderer process asks the GPU process to create a new texture to hold the WebGL rendering result and renders WebGL into this texture through a framebuffer object. After the WebGL commands are complete, the Renderer compositor sends the texture to the Browser compositor as a resource. After the Browser compositor has finished compositing the WebGL content, it notifies the Renderer to delete the texture.

So what problems can be inferred from the above description?

First, there is a data dependency between the Renderer compositor and the Browser compositor. It is a producer-consumer relationship: the texture content is produced by the Renderer's WebGL context and consumed by the Browser compositor.

Second, there is a synchronization problem between the WebGL context and the Browser compositor. All GL commands, including those of the Browser compositor and of WebGL, are executed on the same thread of the GPU process, and the 3D context is the basic unit of scheduling on that thread; that is, only GL commands within the same 3D context are guaranteed to execute one after another in order. On the one hand, different GPU clients may run in different processes or threads. On the other hand, different 3D contexts of the same GPU client may run in different threads; for example, the Renderer compositor runs in a separate thread, while WebGL runs in the main thread of the Renderer process. Clearly, no execution order is guaranteed between GL commands in different 3D contexts. This raises a problem, because rendering a WebGL page requires the following two guarantees:

  • The Browser compositor may use the texture only after the GL commands in the WebGL context have been executed;
  • The WebGL context may safely delete the texture only after the Browser compositor has finished using it.

Otherwise, the texture used by the Browser compositor will be incomplete. This is the problem of synchronization between GPU clients. The rest of this article focuses on how Chromium solves it.

Basic principle of the synchronization point mechanism

Chromium uses a GL extension to provide a synchronization mechanism that solves the synchronization problem between different GPU clients. The mechanism must meet two conditions:

First, context A can wait for context B to execute a group of GL commands before executing its own subsequent GL commands;

Second, the wait in context A must be non-blocking; that is, it must not block the execution of the GPU client code.

According to the gpu/gles2/extensions/GL_CHROMIUM_sync_point.txt file, the synchronization point mechanism defines two GL extension interfaces specific to Chromium:

        uint InsertSyncPointCHROMIUM()
        void WaitSyncPointCHROMIUM(uint sync_point)

InsertSyncPointCHROMIUM creates a synchronization point in the current context and inserts it into the command stream. The synchronization point acts as a fence. It is signaled when the commands before it have been submitted to the server, or when the context is destroyed. The call returns the identifier of the synchronization point. Once signaled, the synchronization point is deleted. On a given server, synchronization point identifiers are unique across all contexts, including contexts in the same share group.

WaitSyncPointCHROMIUM makes the current context stop submitting GL commands until the specified synchronization point is signaled. The wait is implemented on the server. The sync_point parameter is the synchronization point identifier returned by InsertSyncPointCHROMIUM. If sync_point is invalid, the command is a no-op and no error is reported.
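From the client's point of view, the two calls might be used roughly as in the sketch below. FakeContext is a hypothetical stand-in that only records what is enqueued; its two *CHROMIUM methods mirror the extension signatures, while the class itself, the shared identifier counter, and the ProduceFrame and ConsumeFrame helpers are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical client-side context. Each call only appends to the command
// stream and returns at once, so the client is never blocked.
class FakeContext {
 public:
  // next_id points at a counter shared by all contexts, modeling the
  // server-wide uniqueness of synchronization point identifiers.
  explicit FakeContext(uint32_t* next_id) : next_id_(next_id) {}

  void Enqueue(const std::string& gl_command) { log_.push_back(gl_command); }

  // Inserts a synchronization point into this context's command stream and
  // returns its identifier.
  uint32_t InsertSyncPointCHROMIUM() {
    uint32_t id = (*next_id_)++;
    log_.push_back("InsertSyncPoint(" + std::to_string(id) + ")");
    return id;
  }

  // Records the wait in this context's command stream; the actual waiting
  // happens later, on the server.
  void WaitSyncPointCHROMIUM(uint32_t sync_point) {
    log_.push_back("WaitSyncPoint(" + std::to_string(sync_point) + ")");
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  uint32_t* next_id_;
  std::vector<std::string> log_;
};

// Producer side (for example a WebGL context): render into the texture,
// then insert a synchronization point after the drawing commands.
uint32_t ProduceFrame(FakeContext& producer) {
  producer.Enqueue("BindFramebuffer");
  producer.Enqueue("DrawArrays");
  return producer.InsertSyncPointCHROMIUM();
}

// Consumer side (for example a compositor context): wait on the
// synchronization point before any command that reads the texture.
void ConsumeFrame(FakeContext& consumer, uint32_t sync_point) {
  consumer.WaitSyncPointCHROMIUM(sync_point);
  consumer.Enqueue("BindTexture");
  consumer.Enqueue("DrawArrays");
}
```

The identifier returned by the producer would be handed to the consumer alongside the texture's name, and the consumer issues the wait before the first command that reads the texture.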

The description in the specification is somewhat terse. The following simplified example shows how SyncPoint works in Chromium:

 

Suppose there are two contexts, A and B, which may live in different GPU clients but belong to the same share group. In the GPU client code, InsertSyncPointCHROMIUM is called to insert a synchronization point sp into the command stream of context A, and WaitSyncPointCHROMIUM is called in context B to wait for sp. The resulting execution order on the GPU server is: GL command B3 in context B runs only after GL commands A1, A2, and A3, which precede sp in context A, have been executed.

More precisely, calling WaitSyncPointCHROMIUM(sp) in context B instructs the GPU server to hold back the subsequent GL commands of context B from the GPU device until sp has been signaled:

  • If the GL commands before sp in context A have already been executed, that is, sp has been signaled and deleted, WaitSyncPointCHROMIUM is ignored and the subsequent GL commands in context B continue to execute;
  • If the GL commands before sp in context A have not yet been executed, context B waits until they have been. This wait occurs on the GPU server and does not block the code that runs on the GPU client afterwards (for example, code in the Renderer).
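The two cases above can be simulated with a toy scheduler. This is a sketch under stated assumptions rather than Chromium's scheduler: ToyGpuServer, Cmd, Context, and RunExample are hypothetical names, contexts are served round-robin one command per turn, and commands B1 and B2 are assumed to precede the wait in context B.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <set>
#include <string>
#include <vector>

// One entry in a context's command stream (hypothetical representation).
struct Cmd {
  enum Kind { kGL, kInsert, kWait } kind;
  std::string name;     // label, used by kGL commands
  uint32_t sync_point;  // identifier, used by kInsert and kWait
};

struct Context {
  std::deque<Cmd> pending;  // commands the client has already flushed
};

class ToyGpuServer {
 public:
  // Allocates a server-wide unique identifier and marks it unsignaled.
  uint32_t CreateSyncPoint() {
    uint32_t id = next_id_++;
    unsignaled_.insert(id);
    return id;
  }

  // Serves the contexts round-robin, one command per turn, until no context
  // can make progress. Returns the GL commands in execution order.
  std::vector<std::string> Run(const std::vector<Context*>& contexts) {
    std::vector<std::string> executed;
    bool progress = true;
    while (progress) {
      progress = false;
      for (Context* ctx : contexts) {
        if (ctx->pending.empty()) continue;
        const Cmd cmd = ctx->pending.front();
        if (cmd.kind == Cmd::kWait && unsignaled_.count(cmd.sync_point)) {
          continue;  // not signaled yet: skip this context, retry next round
        }
        ctx->pending.pop_front();
        progress = true;
        if (cmd.kind == Cmd::kInsert) unsignaled_.erase(cmd.sync_point);
        if (cmd.kind == Cmd::kGL) executed.push_back(cmd.name);
        // A kWait on a signaled or unknown identifier falls through: no-op.
      }
    }
    return executed;
  }

 private:
  std::set<uint32_t> unsignaled_;
  uint32_t next_id_ = 1;
};

// Builds the A/B scenario; B is served first to show that it still waits.
std::vector<std::string> RunExample() {
  ToyGpuServer server;
  const uint32_t sp = server.CreateSyncPoint();
  Context a;
  Context b;
  a.pending = std::deque<Cmd>{Cmd{Cmd::kGL, "A1", 0}, Cmd{Cmd::kGL, "A2", 0},
                              Cmd{Cmd::kGL, "A3", 0}, Cmd{Cmd::kInsert, "", sp}};
  b.pending = std::deque<Cmd>{Cmd{Cmd::kGL, "B1", 0}, Cmd{Cmd::kGL, "B2", 0},
                              Cmd{Cmd::kWait, "", sp}, Cmd{Cmd::kGL, "B3", 0}};
  return server.Run({&b, &a});
}
```

With this setup, RunExample() yields B1, A1, B2, A2, A3, B3: commands before the wait interleave freely, but B3 runs only after A1, A2, and A3. The skipped turn affects only the server-side schedule of context B, which is why the client is not blocked.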

In conclusion, the synchronization point mechanism lets the client impose an order on GL command execution across different contexts: context B waits on a synchronization point of context A, which ensures that the commands before that point in A are executed before the subsequent commands in B.

To be continued. The next part explains how the synchronization point mechanism is implemented in the Chromium source code.




