GraphicBuffer synchronization mechanism in Android: Fence

Source: Internet
Author: User

Fence is a synchronization mechanism. In Android it is mainly used to synchronize GraphicBuffer in the graphics system. What distinguishes it from other synchronization mechanisms? It is designed mainly for cross-hardware scenarios, in particular synchronization among the CPU, the GPU, and HWC (Hardware Composer), and it can also synchronize multiple points in time.

A major difference between GPU programming and pure CPU programming is that GPU programming is asynchronous. When a GL call returns, the command has not necessarily completed; it has only been placed in the local command buffer, and the CPU does not know when it will actually execute. The CPU can find out by calling glFinish() and waiting until all commands have executed; the other way is the Fence mechanism, which is based on synchronization objects.

Take a producer handing a GraphicBuffer to a consumer as an example. The producer is the renderer in the App and the consumer is SurfaceFlinger. GraphicBuffers are queued in a BufferQueue. The App side uses BufferQueue through the IGraphicBufferProducer interface, wrapped by the Surface class; SurfaceFlinger uses it through the IGraphicBufferConsumer interface, wrapped by the SurfaceFlingerConsumer class. Each GraphicBuffer in BufferQueue has a BufferState that marks its status:
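As a rough illustration of the BufferState bookkeeping, the sketch below models one buffer slot. The four state names are the ones used by BufferQueue in Android 4.x; the Slot class and its checks are hypothetical helpers written for this example, and cancelBuffer() (DEQUEUED back to FREE) is left out for brevity.

```cpp
#include <cassert>
#include <stdexcept>

// Buffer slot states as named in BufferQueue (Android 4.x era).
enum class BufferState { FREE, DEQUEUED, QUEUED, ACQUIRED };

// Hypothetical helper: one slot that only allows the legal transitions.
class Slot {
public:
    BufferState state() const { return mState; }

    // Producer takes a free buffer to draw into.
    void dequeueBuffer() { move(BufferState::FREE, BufferState::DEQUEUED); }
    // Producer hands the buffer over to the queue.
    void queueBuffer() { move(BufferState::DEQUEUED, BufferState::QUEUED); }
    // Consumer takes the queued buffer.
    void acquireBuffer() { move(BufferState::QUEUED, BufferState::ACQUIRED); }
    // Consumer returns the buffer to the queue.
    void releaseBuffer() { move(BufferState::ACQUIRED, BufferState::FREE); }

private:
    void move(BufferState from, BufferState to) {
        if (mState != from) {
            throw std::logic_error("illegal BufferState transition");
        }
        mState = to;
    }

    BufferState mState = BufferState::FREE;
};
```

Note that every transition here is pure CPU-side bookkeeping: nothing in it says whether the GPU has actually finished with the buffer contents.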


This state indicates the ownership of a GraphicBuffer to a certain extent, but it only reflects the status on the CPU side, whereas the real user of a GraphicBuffer is usually the GPU. When the producer puts a GraphicBuffer into BufferQueue, only the ownership transfer at the CPU level is complete; the GPU may still be writing to the buffer, and while it is, the consumer cannot use the buffer for composition. At this point the relationship between the GraphicBuffer and the producer and consumer is ambiguous: the consumer owns the GraphicBuffer but does not yet have the right to use it. It must wait for a signal telling it that the GPU has finished before it may use the buffer. A simplified model is as follows:


This notification, the signal that the previous user has finished with the GraphicBuffer, is delivered by Fence. Fence exists for a simple reason: from the very beginning its purpose has been to send a signal at the right moment.

One might ask why the producer does not simply call glFinish() and wait for the GPU before handing the GraphicBuffer to the consumer. Ownership and the right of use would then be transferred together and no Fence would be needed. Functionally this works, but performance suffers, because glFinish() blocks and the CPU can do no other work while it waits for the GPU. With Fence, blocking is deferred until the consumer actually needs to use the GraphicBuffer, and until then the CPU and the GPU work in parallel. In effect, this implements lazy handover of the right to use a critical resource.
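The performance argument can be made concrete with a deliberately simplified cost model. The struct, the function names, and the millisecond figures in the usage note are all hypothetical, and real frames have more stages than this; the sketch only shows why overlapping CPU and GPU work shortens the elapsed time.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-frame costs in milliseconds.
struct FrameCost {
    int gpuRenderMs;  // GPU work still outstanding when the buffer is queued
    int cpuOtherMs;   // independent CPU work, e.g. preparing the next frame
};

// Strategy 1: the producer calls glFinish() before queueing, so the CPU
// idles for the whole GPU time and only then continues with its other work.
inline int elapsedWithFinish(const FrameCost& f) {
    return f.gpuRenderMs + f.cpuOtherMs;
}

// Strategy 2: the producer queues the buffer with a fence and keeps working.
// The wait happens only when the buffer is used, so the two costs overlap.
inline int elapsedWithFence(const FrameCost& f) {
    return std::max(f.gpuRenderMs, f.cpuOtherMs);
}
```

With 8 ms of GPU work and 6 ms of unrelated CPU work, the first strategy takes 14 ms in this model and the second takes 8 ms.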

Having covered what Fence is for, let's look at how it is implemented. A fence, as the name implies, holds back whoever arrives first until those arriving later catch up, and then they all move forward together. In abstract terms, a Fence contains multiple time points on the same or different timelines, and it is triggered only when all of these points have been reached. For more details, refer to this article (http://netaz.blogspot.com/2013/10/android-fences-introduction-in-any.html).
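The abstract description above (time points on timelines, triggered only when all points are reached) can be sketched in a few lines. This is a user-space model for illustration only, not the kernel sync driver; Timeline, SyncPoint, ModelFence, and merge() are names chosen for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A timeline is a monotonically increasing counter (for example, one per
// hardware queue).
struct Timeline {
    unsigned value = 0;
    void advance(unsigned n = 1) { value += n; }
};

// A sync point means "this timeline has reached at least `target`".
struct SyncPoint {
    std::shared_ptr<const Timeline> timeline;
    unsigned target;
    bool reached() const { return timeline->value >= target; }
};

// A fence is a set of sync points; it is signaled only when all are reached.
struct ModelFence {
    std::vector<SyncPoint> points;
    bool signaled() const {
        for (const SyncPoint& p : points) {
            if (!p.reached()) return false;
        }
        return true;
    }
};

// Merging two fences yields a fence that waits for the points of both.
inline ModelFence merge(const ModelFence& a, const ModelFence& b) {
    ModelFence out = a;
    out.points.insert(out.points.end(), b.points.begin(), b.points.end());
    return out;
}
```

A merged fence over a GPU timeline and a display timeline, for instance, stays unsignaled until both timelines have advanced far enough.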

Fence can be implemented in hardware (the graphics driver) or in software (sw_sync in the Android kernel). EGL provides a synchronization-object extension, KHR_fence_sync (http://www.khronos.org/registry/vg/extensions/KHR/EGL_KHR_fence_sync.txt). It provides eglCreateSyncKHR() and eglDestroySyncKHR() to create and destroy synchronization objects. A synchronization object is a special operation inserted into the GL command queue; when it is executed, it sends a signal indicating that all the commands ahead of it in the queue have completed. The eglClientWaitSyncKHR() function blocks the caller until that signal arrives.
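To make the "special operation inserted into the command queue" idea concrete, here is a small model of a command queue with a fence marker. It does not call EGL; CommandQueue, insertFence(), and execute() are hypothetical names, with insertFence() playing the role that eglCreateSyncKHR() plays for a real GL command stream.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical model of a GPU command queue, not an EGL API.
class CommandQueue {
public:
    using SyncObject = std::shared_ptr<bool>;  // *sync is true once signaled

    // Like issuing a GL call: the command is only recorded, not executed.
    void submit(std::function<void()> cmd) {
        mPending.push_back(std::move(cmd));
    }

    // Append a marker command that signals the sync object when reached.
    SyncObject insertFence() {
        SyncObject sync = std::make_shared<bool>(false);
        mPending.push_back([sync] { *sync = true; });
        return sync;
    }

    // Stand-in for the GPU making progress: run up to `n` commands in order.
    void execute(std::size_t n) {
        for (; n > 0 && !mPending.empty(); --n) {
            std::function<void()> cmd = std::move(mPending.front());
            mPending.pop_front();
            cmd();
        }
    }

private:
    std::deque<std::function<void()>> mPending;
};
```

Because commands run strictly in order, the sync object can only become signaled after every command submitted before it has executed.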

On this basis, Android adds its own extension, ANDROID_native_fence_sync (http://www.khronos.org/registry/egl/extensions/ANDROID/EGL_ANDROID_native_fence_sync.txt), which introduces the eglDupNativeFenceFDANDROID() interface. It converts a synchronization object into a file descriptor (conversely, eglCreateSyncKHR() can convert a file descriptor into a synchronization object). With this extension the CPU side holds a handle to a synchronization object in the GPU, and because a file descriptor can be passed between processes (through IPC mechanisms such as binder or a domain socket), it provides the basis for synchronization across processes. In Unix everything is a file, so this extension greatly increases Fence's versatility.
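Why a file descriptor is such a convenient handle can be shown with ordinary POSIX calls. In the sketch below a pipe stands in for a fence, which is purely an assumption made for illustration: real fence descriptors come from the sync driver, and PipeFence and isSignaled() are hypothetical names. The point is only that a dup()'d descriptor refers to the same underlying object, so a signal is visible through every copy.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Returns true if `fd` is readable right now (used here as "signaled").
inline bool isSignaled(int fd) {
    struct pollfd p = {fd, POLLIN, 0};
    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN) != 0;
}

// A pipe stands in for a fence: the read end is the waitable handle, and
// writing to the write end plays the role of the hardware signaling.
struct PipeFence {
    int waitFd = -1;
    int signalFd = -1;

    PipeFence() {
        int fds[2];
        if (pipe(fds) == 0) {
            waitFd = fds[0];
            signalFd = fds[1];
        }
    }
    ~PipeFence() {
        if (waitFd >= 0) close(waitFd);
        if (signalFd >= 0) close(signalFd);
    }
    PipeFence(const PipeFence&) = delete;
    PipeFence& operator=(const PipeFence&) = delete;

    // Returns true if the signal byte was written.
    bool signal() const {
        char one = 1;
        return write(signalFd, &one, 1) == 1;
    }
};
```

A consumer that received a duplicate of waitFd (within a process via dup(), or across processes via an IPC mechanism) would observe the same signaled state as the original holder.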

Android further builds out the software stack around Fence. It is distributed across three parts: the C++ Fence class in /frameworks/native/libs/ui/Fence.cpp; the C library libsync in /system/core/libsync/sync.c; and the kernel driver in /drivers/base/sync.c. In general, the kernel driver is the main implementation of the synchronization, libsync wraps the driver interface, and Fence is a further C++ wrapper around libsync. A Fence is attached to a GraphicBuffer and travels with it between producer and consumer. In addition, the software implementation of Fence is in /drivers/base/sw_sync.c. SyncFeatures, in /frameworks/native/libs/gui/SyncFeatures.cpp, is used to query which synchronization mechanisms are supported.


The following describes the usage of Fence in Android. It is mainly used to synchronize GraphicBuffer between App, GPU, and HWC.

First, let's follow a GraphicBuffer on its journey from the App to the Display. The GraphicBuffer is first drawn by the App as producer and then put into BufferQueue, where it waits for the consumer to take it out for the next step, composition. As consumer, SurfaceFlinger obtains the GraphicBuffer corresponding to each layer and generates an EGLImageKHR object from it. During composition, a GraphicBuffer is handled in one of two ways. For an Overlay layer, SurfaceFlinger puts the buffer handle directly into HWC's layer list. For layers that need GPU rendering (because they exceed the number of layers HWC can handle or involve complex transformations), SurfaceFlinger binds the previously generated EGLImageKHR as a texture through glEGLImageTargetTexture2DOES() (http://snorp.net/2011/12/16/android-direct-texture.html) and composites it. After composition, SurfaceFlinger in turn acts as producer and sets the handle of the GPU-composited framebuffer as the FramebufferTarget in HWC (the hwc_layer_1_t list in HWC's hwc_display_contents_1_t has a slot for the buffer that holds the GPU rendering result). HWC then, as consumer, combines this with the Overlay layers and sends the result to the Display. The general process is as follows:


We can see that for non-Overlay layers, the GraphicBuffer passes through two producer-consumer stages in succession. The core of a GraphicBuffer is its buffer_handle_t. The native_handle_t it points to contains the file descriptor and other basic attributes of the graphic buffer allocated through gralloc, and this file descriptor is mapped into both the client and the server as shared memory.


Because both the server and client processes can access the same physical memory, errors can occur without synchronization. To coordinate the two sides, a Fence travels with the GraphicBuffer and indicates whether the previous user has finished with it. There are two kinds of Fence, acquireFence and releaseFence. The former notifies the consumer that production has completed; the latter notifies the producer that consumption has completed. The following describes how these two kinds of Fence are generated and used. First, the acquireFence flow:


When the App inserts a GraphicBuffer into BufferQueue through queueBuffer(), it passes in a Fence that indicates whether the producer has finished with the GraphicBuffer. The consumer then takes the GraphicBuffer through acquireBuffer(), and the acquireFence comes out with it. When the consumer (SurfaceFlinger) wants to render the buffer, it must wait until the Fence is triggered. If the layer is rendered by the GPU, this happens in Layer::onDraw(), where the texture is bound through bindTextureImage():
    status_t err = mSurfaceFlingerConsumer->bindTextureImage();
This function eventually calls doGLFenceWaitLocked() to wait for the acquireFence to trigger. Drawing comes next, so if it went ahead without waiting, it could render incorrect content.

If the layer is an Overlay layer rendered by HWC, it does not pass through the GPU, so the acquireFence of each such layer must be handed to HWC. HWC can then confirm that the producer has finished with the buffer before compositing, which means a well-behaved HWC must wait for all these acquireFences to trigger before drawing. This is set up in SurfaceFlinger::doComposeSurfaces(), which calls Layer::setAcquireFence() for each layer:
    if (layer.getCompositionType() == HWC_OVERLAY) {
        sp<Fence> fence = mSurfaceFlingerConsumer->getCurrentFence();
    ...
        fenceFd = fence->dup();
    ...
    layer.setAcquireFenceFd(fenceFd);
Here the non-Overlay layers are skipped, because HWC does not need to synchronize directly with them; it only needs to synchronize with the FramebufferTarget that holds the result of compositing them. Once the GPU composition of the non-Overlay layers has been submitted, queueBuffer() puts the GraphicBuffer into the BufferQueue belonging to FramebufferSurface, and FramebufferSurface::onFrameAvailable() is then called. It first acquires a GraphicBuffer, together with its acquireFence, from the BufferQueue through nextBuffer() -> acquireBufferLocked(). It then calls HWComposer::fbPost() -> setFramebufferTarget(), which sets the acquired GraphicBuffer, along with its acquireFence, into the FramebufferTarget slot of HWC's layer list:
    acquireFenceFd = acquireFence->dup();
    ...
    disp.framebufferTarget->acquireFenceFd = acquireFenceFd;
To sum up, the precondition for HWC to perform final processing is that the acquireFence of the Overlay layer and the acquireFence of FramebufferTarget are all triggered.
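The precondition summarized above can be written down as a small predicate. This is a model for illustration, not HWC code: CompositionType mirrors the HWC_OVERLAY, HWC_FRAMEBUFFER, and HWC_FRAMEBUFFER_TARGET layer types, while ModelLayer, its boolean field, and readyToDisplay() are hypothetical.

```cpp
#include <cassert>
#include <vector>

enum class CompositionType { Overlay, Framebuffer, FramebufferTarget };

struct ModelLayer {
    CompositionType type;
    bool acquireFenceSignaled;  // stand-in for waiting on acquireFenceFd
};

// HWC waits only on Overlay layers and on the FramebufferTarget. Layers
// composited by the GPU (Framebuffer) are covered by the target's fence.
inline bool readyToDisplay(const std::vector<ModelLayer>& layers) {
    for (const ModelLayer& l : layers) {
        if (l.type == CompositionType::Framebuffer) continue;
        if (!l.acquireFenceSignaled) return false;
    }
    return true;
}
```

In this model a pending fence on a GPU-composited layer never blocks HWC directly, whereas a pending fence on the FramebufferTarget or on any Overlay layer does.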

After reading acquireFence, let's look at the procedure of using releaseFence:


As described above, composition first goes through the GPU: doComposition() composites the non-Overlay layers and places the result in the framebuffer. SurfaceFlinger then calls postFramebuffer() to hand over to HWC. The most important step in postFramebuffer() is the call to HWC's set() interface, which tells HWC to composite and display; afterwards, the releaseFence generated by HWC (if any) is passed on to SurfaceFlingerConsumer. This is implemented in the Layer's onLayerDisplayed() function:
    mSurfaceFlingerConsumer->setReleaseFence(layer->getAndResetReleaseFence());
The above mainly concerns Overlay layers. What about layers drawn by the GPU? When SurfaceFlinger receives the INVALIDATE message, it calls handleMessageInvalidate() -> handlePageFlip() -> Layer::latchBuffer() -> SurfaceFlingerConsumer::updateTexImage(), which in turn calls GLConsumer::updateAndReleaseLocked() on the layer's consumer. This function releases the old GraphicBuffer. Before releasing it, it inserts a releaseFence through syncForReleaseLocked(); when this fence triggers, the consumer has finished with the GraphicBuffer. It then calls releaseBufferLocked() to return the buffer to BufferQueue, carrying the releaseFence with it. When the producer later takes the GraphicBuffer out again through dequeueBuffer(), it can use the releaseFence to determine whether the consumer is still using it.

On the other hand, after HWC composition, SurfaceFlinger calls DisplayDevice::onSwapBuffersCompleted() -> FramebufferSurface::onFrameCommitted() in sequence. The core code of onFrameCommitted() is as follows:
    sp<Fence> fence = mHwc.getAndResetReleaseFence(mDisplayType);
    ...
    status_t err = addReleaseFence(mCurrentBufferSlot,
            mCurrentBuffer, fence);
This gets the releaseFence of the FramebufferTarget generated by HWC and attaches it to the corresponding GraphicBuffer slot in FramebufferSurface. The GraphicBuffer belonging to FramebufferSurface can then be released back to BufferQueue. When EGL later obtains this buffer again, it must likewise wait for the releaseFence to trigger before using it.
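The round trip of the two fences can be summarized in a small model. The function names echo the BufferQueue operations discussed in this article, but the signatures, ModelFence, FenceSlot, and the two predicates are hypothetical simplifications that use a plain flag where the real code waits on a fence file descriptor.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in fence: a shared flag that the previous user sets when done.
struct ModelFence {
    bool signaled = false;
};
using FencePtr = std::shared_ptr<ModelFence>;

// The fences that travel with one buffer slot.
struct FenceSlot {
    FencePtr acquireFence;  // set by the producer, checked by the consumer
    FencePtr releaseFence;  // set by the consumer, checked by the producer
};

// Producer -> consumer: "my GPU work on this buffer will end at this fence".
inline void queueBuffer(FenceSlot& slot, FencePtr fence) {
    slot.acquireFence = std::move(fence);
}

// Consumer -> producer: "my reads of this buffer will end at this fence".
inline void releaseBuffer(FenceSlot& slot, FencePtr fence) {
    slot.releaseFence = std::move(fence);
}

// The consumer may read only after the acquireFence has signaled.
inline bool consumerMayRead(const FenceSlot& slot) {
    return slot.acquireFence && slot.acquireFence->signaled;
}

// The producer may write again once the releaseFence (if any) has signaled.
inline bool producerMayWrite(const FenceSlot& slot) {
    return !slot.releaseFence || slot.releaseFence->signaled;
}
```

Each side hands over the buffer immediately and leaves a fence behind; the other side checks that fence only at the moment it needs the buffer contents.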

