Abstract: Chromium's GPU process architecture allows multiple GPU clients to access the GPU service at the same time. These clients may have data dependencies on one another, for example when rendering a WebGL page, so a synchronization mechanism is required to guarantee the order of GPU operations. This article discusses synchronization between GPU clients in the multi-process architecture and the basic principles of the sync point mechanism.
GPU process architecture and other basic concepts
We know that Chromium is a multi-process software system. For security and stability, Chromium has a dedicated process (or thread) that interacts with the GPU device and performs GL operations; that is, any GL command must pass through this process before being submitted to the GPU driver. This process is the GPU process. The Renderer process or the browser process must use IPC to interact with the GPU process and tell it which GL commands to execute. In this arrangement, the Renderer process or browser process is the GPU process client, while the GPU process itself plays the role of the server.
GL commands are transmitted between the GPU client and the server through the command buffer (CommandBuffer), which is designed for efficient data transmission in a multi-process architecture. The buffer uses shared memory to store the GL commands that the client issues to the server: for each GL command issued by the GPU client, both the command name and its parameters are serialized and stored in the command buffer. When the client needs the server to execute all GL commands stored in the buffer, it sends the GpuCommandBufferMsg_AsyncFlush message to the server. After receiving the message, the server deserializes the GL commands from the command buffer based on the specified offset.
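The write-then-flush flow described above can be sketched with a toy model. This is only an illustration under stated assumptions: the class ToyCommandBuffer, its method names, and the record encoding (32-bit little-endian integers for a command id, an argument count, and the arguments) are hypothetical and are not Chromium's actual layout. A bytearray stands in for the shared memory region.

```python
import struct


class ToyCommandBuffer:
    """Hypothetical stand-in for a shared-memory command buffer."""

    def __init__(self, size=256):
        self.shm = bytearray(size)  # stands in for the shared memory region
        self.put = 0                # client-side write offset
        self.get = 0                # service-side read offset

    # ---- client side ----
    def write(self, cmd_id, *args):
        """Serialize one command (id, argument count, arguments) into the buffer."""
        record = struct.pack("<%dI" % (2 + len(args)), cmd_id, len(args), *args)
        if self.put + len(record) > len(self.shm):
            raise BufferError("toy buffer full; a real ring buffer would wrap")
        self.shm[self.put:self.put + len(record)] = record
        self.put += len(record)

    def flush(self):
        """Stands in for the async flush message: only an offset is sent."""
        return self.put

    # ---- service side ----
    def execute_up_to(self, put_offset):
        """Deserialize every command between the read offset and put_offset."""
        executed = []
        while self.get < put_offset:
            cmd_id, argc = struct.unpack_from("<II", self.shm, self.get)
            self.get += 8
            args = tuple(
                struct.unpack_from("<I", self.shm, self.get + 4 * i)[0]
                for i in range(argc)
            )
            self.get += 4 * argc
            executed.append((cmd_id, args))
        return executed
```

The point of the design is that the flush message carries only an offset; the command data itself never travels over IPC because both sides see the same memory.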
In Chromium, GPU hardware acceleration is enabled for both page rendering and compositing. Any process that requires GPU acceleration is a client of the GPU process. In other words, the Renderer process is a GPU client because it needs to ask the GPU process to draw and composite the page content, and the browser process is also a GPU client because it needs to composite the page content with the address bar and other UI elements and display the result on the screen.
Different GPU clients share textures through the mailbox mechanism. Simply put, the mailbox mechanism generates a unique identifier for each texture, and the GPU process centrally manages the mapping between identifiers and textures.
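As a rough illustration of the mailbox idea, the sketch below keeps a single service-side table from identifier to texture. The class ToyMailboxManager, the method names produce and consume, and the 16-byte random name are assumptions made for this example, not Chromium's real interface.

```python
import secrets


class ToyMailboxManager:
    """Hypothetical service-side table: mailbox name -> texture."""

    def __init__(self):
        self._textures = {}

    def produce(self, texture):
        """Register a texture and return a unique name for it."""
        name = secrets.token_bytes(16)  # name length is an assumption
        self._textures[name] = texture
        return name

    def consume(self, name):
        """Look up the texture behind a name; None if the name is unknown."""
        return self._textures.get(name)
```

In this model, one client calls produce, passes the returned name to another client, and that client calls consume to reach the same texture object held by the GPU process.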
Synchronization between GPU clients
From the above description, it is not difficult to see that the GPU process needs to process requests from multiple GPU clients, and these GPU clients may have texture data dependencies.
The following uses rendering the WebGL page http://get.webgl.org on the Android platform as an example to explain why different GPU clients have data dependencies.
First, the browser process is a GPU client. It creates a browser-side compositor that asks the GPU process to perform the final compositing of the page content, the address bar, and other UI elements (if any), and to render the result to the SurfaceView. The Renderer process is also a GPU client. It creates a Renderer-side compositor for page compositing, and it also contains two independent 3D contexts: one for page rendering and one for WebGL rendering.
Second, the delegated renderer (DelegatedRenderer) is enabled by default on the Android platform. Therefore, all GPU resources (including WebGL) managed by the Renderer compositor are transferred to the browser compositor, which then performs the unified compositing. After the final compositing is complete, the browser compositor tells the Renderer side that the resources are no longer in use and can be safely deleted.
Finally, let's look at the rendering process of WebGL. In the hardware-accelerated page rendering mechanism, the page content is composed of multiple render layers, and the backing store of each render layer is a texture. WebGL is one such render layer with its own texture backing store. When rendering each frame, the Renderer process asks the GPU process to create a new texture to hold the WebGL rendering result, and renders WebGL into this texture through a framebuffer object. After the WebGL commands are complete, the Renderer compositor transfers the texture to the browser compositor as a resource, and the browser compositor renders it at the specified coordinates of the SurfaceView. After the browser compositor finishes compositing the WebGL content, the Renderer process is notified to delete the texture.
So what problems can be inferred from the above description?
First, there is a data dependency between the Renderer compositor and the browser compositor. It is a producer-consumer relationship: the texture content is produced by the Renderer's WebGL context and consumed by the browser compositor.
Second, there is a synchronization problem between the WebGL context and the browser compositor. All GL commands are executed on the same thread in the GPU process, and the 3D context is the basic unit of scheduling on that thread; that is, only GL commands within the same 3D context are guaranteed to execute sequentially, one after another. On the one hand, different GPU clients may run in different processes or threads. On the other hand, different 3D contexts of the same GPU client may run in different threads; for example, the Renderer compositor runs in a separate thread, while WebGL runs in the main thread of the Renderer process. As a result, the relative execution order of GL commands in different 3D contexts is not defined. This raises a problem. For WebGL page rendering, how can we ensure that:
- The browser compositor uses the WebGL texture only after the GL commands in the WebGL context have been executed;
- The WebGL context can safely delete the texture only after the browser compositor has finished using it;
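The ordering hazard behind the first requirement can be made concrete with a small simulation. It assumes a deliberately simplified GPU service that executes flushed batches strictly in arrival order; the function run_service, the "draw" and "sample" operations, and the texture name "tex1" are all hypothetical.

```python
def run_service(batches):
    """Execute batches in arrival order, as a single GPU thread would.

    Returns what each "sample" command actually read from its texture.
    """
    textures = {}  # texture name -> contents
    observed = []
    for _context, commands in batches:
        for op, tex in commands:
            if op == "draw":
                textures[tex] = "webgl frame"
            elif op == "sample":
                observed.append(textures.get(tex, "uninitialized"))
    return observed


# Producer (WebGL context) and consumer (browser compositor) command batches.
webgl = ("webgl", [("draw", "tex1")])
browser = ("browser", [("sample", "tex1")])
```

If the batches arrive as [webgl, browser], the consumer reads the finished frame; if they arrive as [browser, webgl], it samples a texture that has not been drawn yet. Nothing in this model ties the two contexts together, which is the gap the rest of the article addresses.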
This is the problem of synchronization between GPU clients. The rest of this article focuses on how Chromium solves it.
Basic principle of the synchronization point mechanism
Chromium uses a GL extension to implement a synchronization mechanism that solves the synchronization problem between different GPU clients. This mechanism must meet two conditions:
First, context A can wait until context B has executed certain GL commands before executing its own subsequent GL commands;
Second, the wait in context A must be non-blocking, that is, it must not block execution of the GPU client code.
According to the GPU/gles2/extensions/gl_chromium_sync_point.text file, the synchronization point mechanism defines two GL extension interfaces specific to chromium:
uint InsertSyncPointCHROMIUM()
void WaitSyncPointCHROMIUM(uint sync_point)
InsertSyncPointCHROMIUM creates a sync point in the current context and inserts it into the command stream. The sync point acts as a fence: it is signaled when the commands before it have been submitted to the server, or when the context is destroyed. The call returns the identifier of the sync point. Once the sync point has been signaled, it is deleted. On a given server, sync point identifiers are unique across all contexts, including contexts in other share groups.
WaitSyncPointCHROMIUM causes the current context to stop submitting GL commands until the specified sync point is signaled; the wait is implemented on the server side. The sync_point parameter is the sync point identifier returned by InsertSyncPointCHROMIUM. If sync_point is not a valid sync point, or has already been signaled, the command is a no-op and no error is reported.
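The bookkeeping implied by these two calls can be modeled in a few lines. This is a minimal sketch rather than Chromium's implementation: the class ToySyncPointRegistry and its methods insert, signal, and must_wait are names invented here, and a single integer counter is assumed as the source of identifiers.

```python
import itertools


class ToySyncPointRegistry:
    """Hypothetical server-wide registry of sync points."""

    def __init__(self):
        self._ids = itertools.count(1)  # one counter for the whole server
        self._pending = set()           # inserted but not yet signaled

    def insert(self):
        """Create a sync point and return its server-unique identifier."""
        sp = next(self._ids)
        self._pending.add(sp)
        return sp

    def signal(self, sp):
        """Signal a sync point; a signaled sync point is deleted."""
        self._pending.discard(sp)

    def must_wait(self, sp):
        """True only for a live sync point; unknown or signaled ids never wait."""
        return sp in self._pending
```

Because signaling removes the entry, a wait on an already signaled identifier and a wait on an identifier that never existed look the same to the server, and both fall through without error.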
The documented description above is somewhat obscure. The following uses a simplified example to describe how syncpoint works in chromium:
Suppose there are two contexts, A and B (which may be in different GPU clients but belong to the same share group). In the GPU client code, InsertSyncPointCHROMIUM is called to insert a sync point SP into the command stream of context A, and WaitSyncPointCHROMIUM is called in context B to wait for SP. The resulting order in which GL commands run on the GPU server is: GL command B3 in context B executes only after GL commands A1, A2, and A3, which precede sync point SP in context A, have been executed.
Furthermore, calling WaitSyncPointCHROMIUM(SP) in context B actually tells the GPU server to stop submitting subsequent GL commands of context B to the GPU device. What happens next depends on the state of SP:
- If the GL commands before sync point SP in context A have already been executed, that is, SP has been signaled and deleted, WaitSyncPointCHROMIUM is ignored and the subsequent GL commands in context B continue to execute;
- If the GL commands before sync point SP in context A have not yet been executed, context B waits until they have been. This wait occurs on the GPU server and does not block subsequent code running on the GPU client (for example, code running in the Renderer).
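The example with contexts A and B, including both cases above, can be replayed with a toy scheduler. Everything here is an assumption for illustration: the function run, the command tuples, and the round-robin policy (one command per context per pass) are hypothetical and much simpler than the real GPU service.

```python
from collections import deque

INSERT, WAIT, GL = "insert", "wait", "gl"


def run(contexts):
    """Round-robin over contexts; return the order in which GL commands ran.

    contexts maps a context name to a list of (kind, payload) commands.
    """
    queues = {name: deque(cmds) for name, cmds in contexts.items()}
    # A sync point exists from the moment the client inserts it, so every
    # inserted id starts out pending (not yet signaled).
    pending = {p for cmds in contexts.values() for k, p in cmds if k == INSERT}
    order = []
    while any(queues.values()):
        progressed = False
        for q in queues.values():
            if not q:
                continue
            kind, payload = q[0]
            if kind == WAIT and payload in pending:
                continue  # this context is parked; other contexts keep running
            q.popleft()
            progressed = True
            if kind == GL:
                order.append(payload)
            elif kind == INSERT:
                # All earlier commands of this context have run: signal it.
                pending.discard(payload)
        if not progressed:
            raise RuntimeError("every context waits on an unsignaled sync point")
    return order


SP = 1
order = run({
    "B": [(GL, "B1"), (GL, "B2"), (WAIT, SP), (GL, "B3")],
    "A": [(GL, "A1"), (GL, "A2"), (GL, "A3"), (INSERT, SP)],
})
```

Only the service-side queue for B is parked while SP is pending; nothing in the model blocks the code that filled the queues, which corresponds to the non-blocking requirement on the client side.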
In conclusion, the sync point mechanism allows the client to set the order in which GL commands execute across different contexts: context B waits on a sync point of context A to ensure that the commands before that sync point in A are executed before the subsequent commands in B.
To be continued... The next section will explain how the sync point mechanism is implemented, based on the Chromium source code.
Chromium graphics: Analysis of the Principle and Implementation of synchronization between GPU clients