Chromium Graphics: Analysis of the Principle and Implementation of the Synchronization Mechanism between GPU Clients - Part II


Abstract: Part I analyzed the synchronization problems between GPU clients and the basic principle of Chromium GL's extended synchronization point mechanism. This article analyzes the implementation of the synchronization point mechanism from the source code perspective. The implementation mainly involves how the two GL extension interfaces, InsertSyncPointCHROMIUM and WaitSyncPointCHROMIUM, are implemented, and how the synchronization point wait is implemented on the GPU server.

GPU Client

The GPU client encapsulates all GL commands in GLES2Implementation, which serializes the client's GL commands and writes them into the command buffer. Each WebGraphicsContext3DImpl on the client creates a GLES2Implementation instance, so calling GLES2Implementation methods feels like calling OpenGL directly to operate the GPU device. The interaction between the GPU client and the GPU process is encapsulated in the CommandBufferProxyImpl class. This proxy class sends IPC messages to the GpuCommandBufferStub on the GPU server to request GPU operations, such as sending a GpuCommandBufferMsg_AsyncFlush message to commit (flush) the command buffer.

To let client code use the synchronization point mechanism, the two GL extension interfaces must first be implemented in the GLES2Implementation class. GLES2Implementation::InsertSyncPointCHROMIUM sends a synchronous IPC message to the GPU server through CommandBufferProxyImpl to register a new synchronization point. Before calling CommandBufferProxyImpl::InsertSyncPoint, GLES2Implementation flushes the command buffer so that all of the client's pending GL commands are submitted to the server, as shown in the following code:

GLuint GLES2Implementation::InsertSyncPointCHROMIUM() {
  GPU_CLIENT_SINGLE_THREAD_CHECK();
  GPU_CLIENT_LOG("[" << GetLogPrefix() << "] glInsertSyncPointCHROMIUM");
  helper_->CommandBufferHelper::Flush();
  return gpu_control_->InsertSyncPoint();
}
...
uint32 CommandBufferProxyImpl::InsertSyncPoint() {
  if (last_state_.error != gpu::error::kNoError)
    return 0;
  uint32 sync_point = 0;
  Send(new GpuCommandBufferMsg_InsertSyncPoint(route_id_, true, &sync_point));
  return sync_point;
}

The sync_point parameter of the synchronous IPC message GpuCommandBufferMsg_InsertSyncPoint returns the unique identifier that the GPU server allocates centrally for the synchronization point.
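Before looking at WaitSyncPointCHROMIUM, it helps to see how a pair of GPU clients typically uses the two extension entry points. The following fragment is only an illustration: the mailbox variable and the way the sync point value is transferred between the two clients are assumptions of the example, not part of the extension itself.

// Producer context (e.g. a context in a Renderer process):
// make the texture available through a mailbox, then insert a sync point
// after all producing GL commands have been written to the command buffer.
glProduceTextureCHROMIUM(GL_TEXTURE_2D, mailbox_name);
GLuint sync_point = glInsertSyncPointCHROMIUM();
// ... pass mailbox_name and sync_point to the consumer (e.g. via IPC) ...

// Consumer context (a different GPU client, i.e. a different GpuChannel):
// the wait happens on the GPU server; this call does not block the client.
glWaitSyncPointCHROMIUM(sync_point);
glConsumeTextureCHROMIUM(GL_TEXTURE_2D, mailbox_name);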

Unlike InsertSyncPointCHROMIUM, WaitSyncPointCHROMIUM does not send a dedicated IPC message to the GPU server. Instead, it is added to the command buffer as a GL extension command; when the GPU server processes this GL command, it decides whether to pause the execution of subsequent commands and wait for the synchronization point to be retired.
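For comparison with InsertSyncPointCHROMIUM above, the client side of WaitSyncPointCHROMIUM is only a few lines: it serializes the extension command into the command buffer through the command buffer helper and sends no IPC message of its own. The following is a simplified sketch of GLES2Implementation::WaitSyncPointCHROMIUM; the exact macros and logging may differ slightly between Chromium versions.

void GLES2Implementation::WaitSyncPointCHROMIUM(GLuint sync_point) {
  GPU_CLIENT_SINGLE_THREAD_CHECK();
  GPU_CLIENT_LOG("[" << GetLogPrefix() << "] glWaitSyncPointCHROMIUM("
                     << sync_point << ")");
  // No IPC here: the helper only writes a WaitSyncPointCHROMIUM command
  // into the command buffer; the GPU server decides later whether to wait.
  helper_->WaitSyncPointCHROMIUM(sync_point);
  CheckGLError();
}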

GPU Server

The GPU server can be either an independent GPU process or a thread in the browser process. All GPU operations are executed on the same thread, known as the GPU thread. In addition to submitting GL commands to the GPU device, the GPU thread also has to manage multiple GpuChannels.

Message Processing in GpuChannel

Each GpuChannel corresponds to one GPU client (a Renderer process or the browser process). When GpuChannel::OnMessageReceived receives an IPC message from the client, it does not process the message immediately. Instead, the message is appended to a queue of unprocessed messages, and a GpuChannel::HandleMessage task is posted to the main message loop to process the queued messages, as sketched after the list below. This deferred handling of IPC messages has at least two advantages:

First, it makes it easy to optimize multi-context message processing based on specific patterns in the IPC message sequence;

Second, it makes it easy to control how GL command execution is synchronized across multiple GpuChannels, which is exactly the synchronization point mechanism discussed in this article.
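The deferral itself is simple. The following is a simplified sketch of GpuChannel::OnMessageReceived and GpuChannel::OnScheduled based on the source of this period; member names may differ slightly.

bool GpuChannel::OnMessageReceived(const IPC::Message& message) {
  // Queue the message instead of dispatching it immediately.
  deferred_messages_.push_back(new IPC::Message(message));
  OnScheduled();
  return true;
}

void GpuChannel::OnScheduled() {
  if (handle_messages_scheduled_)
    return;
  // Post a single HandleMessage task to the GPU main thread; it drains the
  // deferred queue as far as the stubs' scheduling states allow.
  base::MessageLoop::current()->PostTask(
      FROM_HERE,
      base::Bind(&GpuChannel::HandleMessage, weak_factory_.GetWeakPtr()));
  handle_messages_scheduled_ = true;
}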

The HandleMessage task traverses the unprocessed IPC messages one by one and, based on each message's routing information, hands it over to the corresponding GpuCommandBufferStub for processing. See the following code:

void GpuChannel::HandleMessage() {
  handle_messages_scheduled_ = false;
  if (deferred_messages_.empty())
    return;

  bool should_fast_track_ack = false;
  // Find the corresponding GpuCommandBufferStub instance based on the
  // routing information of the IPC message.
  IPC::Message* m = deferred_messages_.front();
  GpuCommandBufferStub* stub = stubs_.Lookup(m->routing_id());

  do {
    if (stub) {
      // When the GpuCommandBufferStub is in the "unscheduled" state,
      // return directly.
      if (!stub->IsScheduled())
        return;
      if (stub->IsPreempted()) {
        OnScheduled();
        return;
      }
    }

    // Pop the front message off the queue and hand it to the
    // GpuCommandBufferStub (or the control handler) for processing.
    scoped_ptr<IPC::Message> message(m);
    deferred_messages_.pop_front();
    bool message_processed = true;

    currently_processing_message_ = message.get();
    bool result;
    if (message->routing_id() == MSG_ROUTING_CONTROL)
      result = OnControlMessageReceived(*message);
    else
      result = router_.RouteMessage(*message);
    currently_processing_message_ = NULL;

    if (!result) {
      // Respond to sync messages even if router failed to route.
      if (message->is_sync()) {
        IPC::Message* reply = IPC::SyncMessage::GenerateReply(&*message);
        reply->set_reply_error();
        Send(reply);
      }
    } else {
      // If the command buffer becomes unscheduled as a result of handling the
      // message but still has more commands to process, synthesize an IPC
      // message to flush that command buffer.
      if (stub) {
        if (stub->HasUnprocessedCommands()) {
          deferred_messages_.push_front(
              new GpuCommandBufferMsg_Rescheduled(stub->route_id()));
          message_processed = false;
        }
      }
    }
    if (message_processed)
      MessageProcessed();

    // ... some irrelevant code is omitted here ...
  } while (should_fast_track_ack);

  // If the message queue is not empty, call OnScheduled to post another
  // HandleMessage task to the main message loop.
  if (!deferred_messages_.empty()) {
    OnScheduled();
  }
}

GpuCommandBufferStub is a very important class on the GPU process side. It encapsulates most of the functionality the GPU server needs to execute GL commands: it creates and initializes the context, responds to the client's GPU operation requests, and schedules the parsing and deserialization of the GL commands in the command buffer. When a GPU client requests a new 3D context, it is actually requesting a new GpuCommandBufferStub instance. Therefore, for a Renderer process that contains hardware-accelerated canvases, the corresponding GpuChannel manages multiple GpuCommandBufferStub instances and distributes incoming IPC messages to the right instance based on the routing information in the IPC message header.

Synchronization point insertion and waiting

To pre-process the IPC messages arriving on the IO thread, each GpuChannel installs a message filter that runs on the IO thread. Every IPC message from the GPU client is first preprocessed by this message filter and then forwarded to the GpuChannel for further processing.

When the GPU server receives a GpuCommandBufferMsg_InsertSyncPoint message, the message filter GpuChannelMessageFilter on the IO thread processes it in two steps:

1. Call SyncPointManager::GenerateSyncPoint on the IO thread to generate a unique identifier for the synchronization point and send it back to the GPU client immediately (a simplified sketch of this step appears after the code below);

2. Post a new task, InsertSyncPointOnMainThread, to the main message loop of the GPU server; this task adds the synchronization point to the GpuChannel on the main thread:

static void InsertSyncPointOnMainThread(
    base::WeakPtr<GpuChannel> gpu_channel,
    scoped_refptr<SyncPointManager> manager,
    int32 routing_id,
    bool retire,
    uint32 sync_point) {
  // This function must ensure that the sync point will be retired. Normally
  // we'll find the stub based on the routing ID, and associate the sync point
  // with it, but if that fails for any reason (channel or stub already
  // deleted, invalid routing id), we need to retire the sync point
  // immediately.
  if (gpu_channel) {
    GpuCommandBufferStub* stub = gpu_channel->LookupCommandBuffer(routing_id);
    if (stub) {
      stub->AddSyncPoint(sync_point);
      if (retire) {
        GpuCommandBufferMsg_RetireSyncPoint message(routing_id, sync_point);
        gpu_channel->OnMessageReceived(message);
      }
      return;
    } else {
      gpu_channel->MessageProcessed();
    }
  }
  manager->RetireSyncPoint(sync_point);
}
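For completeness, step 1 runs entirely on the IO thread. The following is a heavily simplified sketch of the GpuCommandBufferMsg_InsertSyncPoint branch in GpuChannelMessageFilter::OnMessageReceived; reply generation, parameter unpacking, and error handling are condensed, and the member names are approximations of the source of this period.

bool GpuChannelMessageFilter::OnMessageReceived(const IPC::Message& message) {
  bool handled = false;
  if (message.type() == GpuCommandBufferMsg_InsertSyncPoint::ID) {
    // Step 1: allocate the identifier on the IO thread and reply to the
    // client immediately, without waiting for the GPU main thread.
    uint32 sync_point = sync_point_manager_->GenerateSyncPoint();
    IPC::Message* reply = IPC::SyncMessage::GenerateReply(&message);
    GpuCommandBufferMsg_InsertSyncPoint::WriteReplyParams(reply, sync_point);
    Send(reply);
    // Step 2: post InsertSyncPointOnMainThread so the sync point is added to
    // the GpuChannel in order on the main thread. The "retire" flag is really
    // read out of the message; the client sends true (see InsertSyncPoint above).
    message_loop_->PostTask(
        FROM_HERE,
        base::Bind(&InsertSyncPointOnMainThread, gpu_channel_,
                   sync_point_manager_, message.routing_id(),
                   true /* retire */, sync_point));
    handled = true;
  }
  return handled;
}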

In addition to calling GpuCommandBufferStub::AddSyncPoint to add the synchronization point to the current context, InsertSyncPointOnMainThread also explicitly calls GpuChannel::OnMessageReceived to deliver a GpuCommandBufferMsg_RetireSyncPoint message to the current GpuChannel; this message is appended to the end of the GpuChannel's message queue. For the main thread, which processes multiple GpuChannels at the same time, two situations may occur (assuming the synchronization point was inserted on GpuChannel A):

Case I: GpuChannel A is fully scheduled on the main thread, and all of its unprocessed IPC messages are handled, including the GpuCommandBufferMsg_AsyncFlush message issued when the synchronization point was inserted and the GpuCommandBufferMsg_RetireSyncPoint message created by the GPU process after the synchronization point was added. This means that all GL commands before the synchronization point have been submitted to the GPU driver, so the synchronization point can be "retired". As described in Part I, a synchronization point that has been signaled is automatically removed; if another GpuChannel later issues the WaitSyncPointCHROMIUM GL command for it, the command is effectively a no-op.

Case II: GpuChannel A is not yet fully scheduled on the main thread, while GpuChannel B is scheduled, and a context in B calls WaitSyncPointCHROMIUM to wait for the synchronization point created in A. As mentioned above, WaitSyncPointCHROMIUM is written into the command buffer as an extended GL command. Therefore, when the GpuCommandBufferStub in GpuChannel B parses and executes the GL commands in its command buffer one by one and encounters WaitSyncPointCHROMIUM, GLES2DecoderImpl::HandleWaitSyncPointCHROMIUM performs the following operations:

  • It triggers the wait-sync-point callback GpuCommandBufferStub::OnWaitSyncPoint registered by GpuCommandBufferStub. This callback decides whether the GpuCommandBufferStub in GpuChannel B continues to parse GL commands or stops parsing and waits for the synchronization point to be "retired" (see the sketch after this list);
  • OnWaitSyncPoint first checks whether the synchronization point has already been "retired". If so, it returns true, meaning the current GpuCommandBufferStub of GpuChannel B can continue parsing and executing GL commands. Otherwise, it sets the current GpuCommandBufferStub to the "unscheduled" state and asks SyncPointManager to register a callback, GpuCommandBufferStub::OnSyncPointRetired, for the awaited synchronization point; this callback restores the GpuCommandBufferStub to the "scheduled" state.
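The following is a simplified sketch of this callback pair in GpuCommandBufferStub; trace events and per-stub bookkeeping are omitted, and the member names are approximations of the source of this period.

bool GpuCommandBufferStub::OnWaitSyncPoint(uint32 sync_point) {
  if (!sync_point)
    return true;
  SyncPointManager* manager =
      channel_->gpu_channel_manager()->sync_point_manager();
  // Already retired: nothing to wait for, keep parsing GL commands.
  if (manager->IsSyncPointRetired(sync_point))
    return true;
  // Not retired yet: make this stub unscheduled and ask the SyncPointManager
  // to call us back when the sync point is retired.
  scheduler_->SetScheduled(false);
  manager->AddSyncPointCallback(
      sync_point,
      base::Bind(&GpuCommandBufferStub::OnSyncPointRetired, AsWeakPtr()));
  return scheduler_->IsScheduled();
}

void GpuCommandBufferStub::OnSyncPointRetired() {
  // Make the stub schedulable again so GpuChannel::HandleMessage can resume
  // processing this channel's deferred messages and GL commands.
  scheduler_->SetScheduled(true);
}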

Careful readers may notice that two questions have not yet been clearly explained:

First, how is the scheduling state of GpuCommandBufferStub used?

Second, when does SyncPointManager invoke the callback GpuCommandBufferStub::OnSyncPointRetired?

As mentioned above, GpuChannel::HandleMessage traverses the unprocessed IPC messages one by one and hands them to the corresponding GpuCommandBufferStub, but a message can only be processed when that GpuCommandBufferStub is in the "scheduled" state; otherwise HandleMessage returns immediately, and the GpuChannel's message processing stalls. In other words, GpuChannel B stops and waits for the synchronization point to be "retired" until the GpuCommandBufferStub::OnSyncPointRetired callback is triggered.

Once GpuChannel B's message processing is stopped on the main thread, GpuChannel A gets more opportunities to be scheduled. The accumulated GpuCommandBufferMsg_AsyncFlush and GpuCommandBufferMsg_RetireSyncPoint messages in GpuChannel A are then processed together. The AsyncFlush message causes GLES2DecoderImpl to submit all of the client's GL commands to the GPU driver, while the RetireSyncPoint message triggers the GpuCommandBufferStub::OnRetireSyncPoint handler, which asks SyncPointManager to retire the given synchronization point. When the synchronization point is retired, SyncPointManager runs all callbacks associated with it; GpuCommandBufferStub::OnSyncPointRetired is therefore called, and GpuChannel B's message processing can resume.
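The retirement side can be sketched as follows; the real SyncPointManager also guards its map with a lock and handles edge cases, and the names below are approximations of the source of this period.

void GpuCommandBufferStub::OnRetireSyncPoint(uint32 sync_point) {
  // Sync points of one context are retired in the order they were added.
  DCHECK(!sync_points_.empty() && sync_points_.front() == sync_point);
  sync_points_.pop_front();
  channel_->gpu_channel_manager()->sync_point_manager()->RetireSyncPoint(
      sync_point);
}

void SyncPointManager::RetireSyncPoint(uint32 sync_point) {
  std::vector<base::Closure> callbacks;
  {
    base::AutoLock lock(lock_);
    callbacks.swap(sync_point_map_[sync_point]);
    sync_point_map_.erase(sync_point);
  }
  // Run every callback registered for this sync point, e.g.
  // GpuCommandBufferStub::OnSyncPointRetired, which reschedules the waiters.
  for (size_t i = 0; i < callbacks.size(); ++i)
    callbacks[i].Run();
}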

Note that although GpuChannel B's message processing is stopped because of the scheduling state of its GpuCommandBufferStub, only the server-side execution of GL commands is paused; the channel does not stop receiving messages from the client. The client can still send IPC messages to GpuChannel B, but they are cached in the message queue until the awaited synchronization point is "retired". This is why it is called a "server-side wait".

Summary

To solve the data synchronization problem among multiple GPU clients, Chromium introduces the sync point GL extension, which allows clients to control the relative order in which GL commands in different contexts are executed. The synchronization point mechanism guarantees that, before the current context executes the GL commands that follow a wait, all GL commands issued before the corresponding synchronization point have been submitted to the GPU device. The wait happens on the server side, so client code is never blocked. In the implementation, the synchronization point mechanism relies on GpuChannel's message processing mechanism: when the awaited synchronization point has not yet been "retired", the current GpuCommandBufferStub is set to the "unscheduled" state until all GL commands before the synchronization point have been submitted.

