(Reprint) (Official) UE4 Graphics Programming: Threaded Rendering

Source: Internet
Author: User

The Rendering Thread

In Unreal Engine 4 (UE4), the entire renderer runs on its own thread, which is one or two frames behind the game thread.

When performing rendering operations, you must think carefully about which memory is read and written, and when, to ensure thread safety and deterministic behavior. When the behavior of a function depends on the relative execution speed of two threads, that is a race condition. Race conditions must be avoided because they are hard to reproduce, and because the relative speed can depend on the machine, platform, debugger, or configuration. This kind of bug is hard to debug and takes roughly ten times longer to fix than an ordinary, reproducible bug.

This is a simple example of a race condition/thread bug:

    /** The FStaticMeshSceneProxy constructor is called on the game thread when a component is registered with the scene. */
    FStaticMeshSceneProxy::FStaticMeshSceneProxy(UStaticMeshComponent* InComponent):
        FPrimitiveSceneProxy(...),
        Owner(InComponent->GetOwner())   // <======== Note: the AActor pointer is cached
        ...

    /** DrawDynamicElements is called on the render thread when the renderer makes a pass over the scene. */
    void FStaticMeshSceneProxy::DrawDynamicElements(...)
    {
        if (Owner->AnyProperty)   // <========== Race condition! The game thread owns all AActor / UObject state
            // and may be writing to it at any time. The UObject may even have been garbage collected, causing a crash.
            // This could have been done safely by mirroring the value of AnyProperty in this proxy.
    }

Development methods

Race conditions cannot be found through exhaustive testing. It is important to understand this: guess-and-test development and reactive bug fixing do not produce reliable threaded code. The best approach is to fully understand how the game thread and the render thread interact, and to use mechanisms that guarantee determinism. You should be able to explain the sequence of events that makes each interaction deterministic; if you cannot, there is probably a race condition.

Thread-specific data structures

It is therefore advisable to keep data in separate structures that are "owned" by different threads, so that it is obvious who modifies what and when. The same applies to functions: each function should always be called from the same thread, otherwise things become extremely complicated. Most UE4 structures follow this principle. For example, UPrimitiveComponent is the basic game thread class for anything that can be rendered, cast shadows, or has its own visibility state. The render thread cannot touch a UPrimitiveComponent's memory directly, because the game thread may be writing to its members at any time. The render thread has its own class that represents this functionality: FPrimitiveSceneProxy. Once an FPrimitiveSceneProxy has been created and registered with the render thread, the game thread can no longer touch its memory. UActorComponent::RegisterComponent adds a component to the scene and creates an FPrimitiveSceneProxy to represent it to the renderer. After the component is registered, FPrimitiveSceneProxy::DrawDynamicElements is called for each pass that needs it, provided the component is visible.
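The ownership split described above can be illustrated with a minimal sketch in standard C++. It does not use engine types: GameComponent, SceneProxy, and DemoMirroredState are hypothetical names chosen for illustration, standing in for the roles that UPrimitiveComponent and FPrimitiveSceneProxy play. The point is only that the proxy copies the values it needs at creation time and keeps no pointer back to the game-thread object.

```cpp
#include <memory>

// Hypothetical stand-in for a game-thread object (the role UPrimitiveComponent plays).
struct GameComponent {
    bool bVisible = true;
    float DrawScale = 1.0f;
};

// Hypothetical stand-in for the render-thread representation (the role of
// FPrimitiveSceneProxy). It mirrors the values it needs in its constructor
// and deliberately stores no pointer back to the component.
class SceneProxy {
public:
    explicit SceneProxy(const GameComponent& InComponent)
        : bVisible(InComponent.bVisible), DrawScale(InComponent.DrawScale) {}

    // Render-side code reads only the mirrored values.
    float ScaledSize(float BaseSize) const {
        return bVisible ? BaseSize * DrawScale : 0.0f;
    }

private:
    const bool bVisible;
    const float DrawScale;
};

// Simulates registration followed by later game-side changes.
float DemoMirroredState() {
    auto Component = std::make_unique<GameComponent>();
    Component->DrawScale = 2.0f;

    SceneProxy Proxy(*Component);   // created at registration time

    Component->DrawScale = 5.0f;    // later game-side write: the proxy does not see it
    Component.reset();              // component destroyed (think: garbage collected)

    return Proxy.ScaledSize(10.0f); // still safe, because the proxy owns its copy
}
```

Calling DemoMirroredState() returns 20.0f: the proxy keeps using the value it mirrored at registration, regardless of what happens to the component afterwards. To propagate a later change, the game side would have to send the new value to the proxy explicitly.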

Performance considerations

The game thread blocks at the end of each Tick() event until the render thread has caught up to within one or two frames. Because the render thread lags behind, blocking the game thread during gameplay until the render thread has caught up completely is unacceptable. Blocking while loading or garbage collecting individual objects is also unacceptable, because UE4 supports asynchronous level streaming. Many operations therefore have asynchronous mechanisms that avoid blocking.

Asynchronous inter-thread communication

The main way for the two threads to communicate is through the ENQUEUE_UNIQUE_RENDER_COMMAND_XXXPARAMETER macros. Such a macro creates a local class with a virtual Execute function that contains the code you pass to the macro. The game thread inserts the command into the render command queue, and the render thread calls the Execute function when it reaches that command.
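The command queue idea can be sketched in standard C++ as follows. This is not the engine's implementation and does not reproduce the macros: RenderCommandQueue, Enqueue, and DemoEnqueue are hypothetical names, and lambdas play the role of the generated command classes. The sketch assumes a single producer (the "game thread") and a single consumer (the "render thread").

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical single-consumer command queue.
class RenderCommandQueue {
public:
    RenderCommandQueue() : Worker([this] { Run(); }) {}

    // Drains any remaining commands, then joins the render thread.
    ~RenderCommandQueue() {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            bStopping = true;
        }
        Wake.notify_one();
        Worker.join();
    }

    // Game thread: hand a command to the render thread and return immediately.
    void Enqueue(std::function<void()> Command) {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Commands.push(std::move(Command));
        }
        Wake.notify_one();
    }

private:
    // Render thread: execute commands in the order they were enqueued.
    void Run() {
        for (;;) {
            std::function<void()> Command;
            {
                std::unique_lock<std::mutex> Lock(Mutex);
                Wake.wait(Lock, [this] { return bStopping || !Commands.empty(); });
                if (Commands.empty()) return;  // stopping and fully drained
                Command = std::move(Commands.front());
                Commands.pop();
            }
            Command();  // run outside the lock so the producer can keep enqueuing
        }
    }

    std::mutex Mutex;
    std::condition_variable Wake;
    std::queue<std::function<void()>> Commands;
    bool bStopping = false;
    std::thread Worker;  // declared last so the other members exist before it starts
};

// Enqueues three commands that capture their parameter by value, much as the
// macro parameters are copied into the command object.
std::vector<int> DemoEnqueue() {
    std::vector<int> RenderThreadLog;  // written only by the render thread
    {
        RenderCommandQueue Queue;
        for (int Frame = 1; Frame <= 3; ++Frame) {
            Queue.Enqueue([&RenderThreadLog, Frame] {
                RenderThreadLog.push_back(Frame * 10);
            });
        }
    }  // the destructor drains and joins, so reading the log below is safe
    return RenderThreadLog;
}
```

DemoEnqueue() returns {10, 20, 30}. The game side never waits for a command to run; it only relies on commands being executed in order, which is the property the rest of this article builds on.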

FRenderCommandFence is a convenient way to track the progress of the render thread from the game thread. The game thread calls FRenderCommandFence::BeginFence to start the fence. It can then call FRenderCommandFence::Wait to block until the render thread has processed the fence, or check GetNumPendingFences to poll the render thread's progress. When GetNumPendingFences returns zero, the render thread has processed the fence.
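A fence of this kind can be sketched as a marker command inserted into an ordered queue. The code below is an illustration in standard C++, not FRenderCommandFence itself: RenderThread, RenderFence, and DemoFence are hypothetical names, and only the three operations named in the text (begin, wait, poll) are modeled. It assumes the fence object outlives every command that refers to it.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical minimal render thread: runs queued commands in order.
class RenderThread {
public:
    RenderThread() : Worker([this] { Run(); }) {}
    ~RenderThread() {
        Enqueue(nullptr);  // an empty command tells the loop to exit
        Worker.join();
    }
    void Enqueue(std::function<void()> Command) {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Commands.push(std::move(Command));
        }
        Wake.notify_one();
    }
private:
    void Run() {
        for (;;) {
            std::unique_lock<std::mutex> Lock(Mutex);
            Wake.wait(Lock, [this] { return !Commands.empty(); });
            std::function<void()> Command = std::move(Commands.front());
            Commands.pop();
            Lock.unlock();
            if (!Command) return;
            Command();
        }
    }
    std::mutex Mutex;
    std::condition_variable Wake;
    std::queue<std::function<void()>> Commands;
    std::thread Worker;  // declared last so the other members exist before it starts
};

// Hypothetical fence offering the three operations described in the text.
class RenderFence {
public:
    // Game thread: insert a marker behind everything already enqueued.
    void BeginFence(RenderThread& Thread) {
        ++NumPending;
        Thread.Enqueue([this] {
            std::lock_guard<std::mutex> Lock(Mutex);
            --NumPending;
            Passed.notify_all();
        });
    }
    // Game thread: poll without blocking.
    int GetNumPendingFences() const { return NumPending.load(); }
    // Game thread: block until the render thread has executed every marker.
    void Wait() {
        std::unique_lock<std::mutex> Lock(Mutex);
        Passed.wait(Lock, [this] { return NumPending.load() == 0; });
    }
private:
    std::atomic<int> NumPending{0};
    std::mutex Mutex;
    std::condition_variable Passed;
};

int DemoFence() {
    int RenderedFrames = 0;  // touched only by render commands until the fence has passed
    RenderFence Fence;       // declared before the thread so it outlives every command
    RenderThread Thread;

    for (int i = 0; i < 3; ++i) {
        Thread.Enqueue([&RenderedFrames] { ++RenderedFrames; });
    }
    Fence.BeginFence(Thread);
    Fence.Wait();            // the three commands above ran before the marker
    assert(Fence.GetNumPendingFences() == 0);
    return RenderedFrames;   // safe to read: Wait() synchronized with the render thread
}
```

DemoFence() returns 3. Because commands execute in order, the marker being processed implies that every command enqueued before it has finished, which is what lets the game side reason about render-side progress without sharing any other state.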

Blocking

FlushRenderingCommands is the standard way to block the game thread until the render thread has caught up. It is useful for offline (editor) operations that modify memory used by the render thread.

Rendering Resources

FRenderResource provides the base rendering resource interface, along with hooks for initialization and release. Resources derived from FRenderResource (FVertexBuffer, FIndexBuffer, etc.) must be initialized before they are used for rendering and released before they are deleted. FRenderResource::InitResource can only be called from the render thread, so there is a helper function, BeginInitResource, that can be called on the game thread to enqueue a render command that calls FRenderResource::InitResource. RHI functions can only be called from the render thread (except for those that create devices, viewports, etc.).

Uobjects and garbage collection

Garbage collection (GC) happens on the game thread and operates on UObjects. Because the render thread lags behind the game thread, the game thread may delete a UObject while the render thread is still processing a command that references it. The render thread should therefore never dereference a UObject pointer unless some mechanism guarantees that the UObject will not be deleted while the render thread is referencing it. UPrimitiveComponent, for example, uses an FRenderCommandFence called DetachFence to prevent the GC from deleting the UObject before the render thread has processed the detach command.

Game Thread FRenderResource Handling

There are two common scenarios in which the game thread has to interact with rendering resources: static resources (modified only at load time or in the editor, such as index buffers) and dynamic resources (updated every frame with the latest results of the game thread simulation).

Static resources

This section describes how static resource interactions are handled in UE4, using USkeletalMesh as an example.

  • After loading, USkeletalMesh::PostLoad is called, which calls InitResources. This calls BeginInitResource on every static FRenderResource the mesh owns, such as its index buffer. BeginInitResource enqueues a render command that calls FRenderResource::InitResource. From this point on, the game thread cannot modify the index buffer memory unless it takes ownership back.

  • A component is registered and starts rendering with the USkeletalMesh index buffer.

  • At some point (for example, when the level is unloaded or the component is no longer referenced) the GC detaches the component. Note: the game thread cannot delete the index buffer memory at this point, because the render thread may not have processed the detach yet and may still be rendering with the index buffer.

  • The GC calls USkeletalMesh::BeginDestroy, which gives the game thread object a chance to enqueue commands that release its rendering resources, so it calls BeginReleaseResource(&IndexBuffer). The game thread still cannot delete the index buffer memory, because the render thread may not have processed the release yet. The game thread could block until the render thread catches up, but that would cause a hitch and slow things down, so an asynchronous mechanism is used instead: a fence is started to track the render thread's progress through the release commands.

  • The GC calls USkeletalMesh::IsReadyForFinishDestroy and does not destroy the UObject until this function returns true. The function returns true only after the render thread has passed the fence, which means the index buffer memory can now be safely deleted by the game thread.

  • Finally, the GC calls UObject::FinishDestroy, which frees memory in one central place. The index buffer memory is freed when the USkeletalMesh destructor calls the FRawStaticIndexBuffer destructor, which in turn calls the destructor of the TArray that holds the index buffer memory.

This mechanism is very efficient (no thread blocks, initialization happens in one central place, and there is no need to check every frame whether initialization is required) and works well for static resources.
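The static resource lifecycle above can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not engine code: RenderThread, IndexBufferResource, MeshAsset, and DemoStaticResource are hypothetical names, the method names only echo the engine functions they stand in for, and the fence is reduced to a single atomic flag set by a marker command.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal render thread: runs queued commands in order.
class RenderThread {
public:
    RenderThread() : Worker([this] { Run(); }) {}
    ~RenderThread() {
        Enqueue(nullptr);  // an empty command tells the loop to exit
        Worker.join();
    }
    void Enqueue(std::function<void()> Command) {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Commands.push(std::move(Command));
        }
        Wake.notify_one();
    }
private:
    void Run() {
        for (;;) {
            std::unique_lock<std::mutex> Lock(Mutex);
            Wake.wait(Lock, [this] { return !Commands.empty(); });
            std::function<void()> Command = std::move(Commands.front());
            Commands.pop();
            Lock.unlock();
            if (!Command) return;
            Command();
        }
    }
    std::mutex Mutex;
    std::condition_variable Wake;
    std::queue<std::function<void()>> Commands;
    std::thread Worker;  // declared last so the other members exist before it starts
};

// Hypothetical static resource: index data plus a render-side "initialized" flag.
struct IndexBufferResource {
    std::vector<int> Indices;   // off limits to the game thread between init and release
    bool bInitialized = false;  // read and written by render commands only
};

// Hypothetical owner that follows the load / release / destroy steps.
class MeshAsset {
public:
    // Deleting before the fence has passed would be the bug the text warns about.
    ~MeshAsset() { assert(bReleaseFencePassed.load()); }

    void PostLoad(RenderThread& Thread) {      // game thread
        IndexBuffer.Indices = {0, 1, 2};
        Thread.Enqueue([this] { IndexBuffer.bInitialized = true; });   // init command
    }
    void BeginDestroy(RenderThread& Thread) {  // game thread
        Thread.Enqueue([this] { IndexBuffer.bInitialized = false; });  // release command
        Thread.Enqueue([this] { bReleaseFencePassed.store(true); });   // fence marker
    }
    bool IsReadyForFinishDestroy() const { return bReleaseFencePassed.load(); }

private:
    IndexBufferResource IndexBuffer;
    std::atomic<bool> bReleaseFencePassed{false};
};

bool DemoStaticResource() {
    RenderThread Thread;
    auto Mesh = std::make_unique<MeshAsset>();
    Mesh->PostLoad(Thread);
    const bool bReadyTooEarly = Mesh->IsReadyForFinishDestroy();  // false: nothing released yet

    Mesh->BeginDestroy(Thread);
    while (!Mesh->IsReadyForFinishDestroy()) {
        std::this_thread::yield();  // a collector would do other work and poll again later
    }
    Mesh.reset();  // "FinishDestroy": no pending command touches the mesh any more
    return !bReadyTooEarly;
}
```

DemoStaticResource() returns true. The game side never blocks on the render thread; it only postpones the final deletion until the marker placed after the release command has been executed.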

Dynamic Resources

A good example of a dynamic resource update is the set of bone transforms for a skeletal mesh, which animation generates on the game thread every frame. The goal is to get the transforms from the game thread, after animation has been updated each frame, into an array on the render thread, where they can be set as shader constants. The same steps apply if an index or vertex buffer is updated every frame. Here is the sequence of operations:

  • USkinnedMeshComponent::CreateRenderState_Concurrent allocates USkinnedMeshComponent::MeshObject. From this point on, the game thread can only write to the MeshObject pointer, not to the FSkeletalMeshObject memory it points to.

  • USkinnedMeshComponent::UpdateTransform is called at least once per frame to update the component's movement. In the GPU skinning case it calls FSkeletalMeshObjectGPUSkin::Update. The game thread now has the latest transforms, and they need to be passed to the render thread. To do this, it first allocates memory on the heap (an FDynamicSkelMeshObjectData), copies the bone transforms into it, and then passes this copy to the render thread with ENQUEUE_UNIQUE_RENDER_COMMAND_TWOPARAMETER. The render thread now owns the copy and is responsible for deleting it. The code inside the ENQUEUE_UNIQUE_RENDER_COMMAND_TWOPARAMETER macro copies the transforms to their final destination so that they can be set as shader constants. If vertex positions were being updated instead, this is where the vertex buffer would be locked and updated.

  • At some point the component is detached. The game thread enqueues render commands to release all dynamic FRenderResources and can now set the MeshObject pointer to NULL, but the actual memory is still referenced by the render thread and cannot be deleted yet. This is where the deferred deletion mechanism comes in. Classes derived from FDeferredCleanupInterface can be deleted asynchronously in a thread-safe way. FSkeletalMeshObject implements this interface. The game thread starts the deferred deletion of the FSkeletalMeshObject by calling BeginCleanup(MeshObject). The memory is then deleted later, once it is safe to do so.
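The per-frame handoff in the steps above (allocate a heap copy, pass it to the render thread, let the render thread delete it) can be sketched in standard C++ as follows. RenderThread, DynamicBoneData, SkinnedMeshObject, and DemoDynamicUpdate are hypothetical names for illustration; plain floats stand in for bone transforms, and the deferred deletion step is not modeled.

```cpp
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal render thread: runs queued commands in order.
class RenderThread {
public:
    RenderThread() : Worker([this] { Run(); }) {}
    ~RenderThread() {
        Enqueue(nullptr);  // an empty command tells the loop to exit
        Worker.join();
    }
    void Enqueue(std::function<void()> Command) {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Commands.push(std::move(Command));
        }
        Wake.notify_one();
    }
private:
    void Run() {
        for (;;) {
            std::unique_lock<std::mutex> Lock(Mutex);
            Wake.wait(Lock, [this] { return !Commands.empty(); });
            std::function<void()> Command = std::move(Commands.front());
            Commands.pop();
            Lock.unlock();
            if (!Command) return;
            Command();
        }
    }
    std::mutex Mutex;
    std::condition_variable Wake;
    std::queue<std::function<void()>> Commands;
    std::thread Worker;  // declared last so the other members exist before it starts
};

// Hypothetical per-frame snapshot (the role FDynamicSkelMeshObjectData plays).
struct DynamicBoneData {
    std::vector<float> BoneTransforms;
};

// Hypothetical render-side object; only render commands touch its members.
struct SkinnedMeshObject {
    std::vector<float> ShaderConstants;

    // Render thread: take ownership of the snapshot, copy it to its final
    // destination, and free it when Owned goes out of scope.
    void UpdateOnRenderThread(DynamicBoneData* Raw) {
        std::unique_ptr<DynamicBoneData> Owned(Raw);
        ShaderConstants = Owned->BoneTransforms;
    }
};

std::vector<float> DemoDynamicUpdate() {
    SkinnedMeshObject MeshObject;  // outlives the render thread below
    std::vector<float> GameThreadBones = {0.0f, 0.0f};
    {
        RenderThread Thread;
        for (int Frame = 1; Frame <= 3; ++Frame) {
            // Game thread: "animation" writes its own working copy...
            GameThreadBones[0] = static_cast<float>(Frame);
            GameThreadBones[1] = static_cast<float>(Frame) * 0.5f;
            // ...then snapshots it on the heap and gives the snapshot away.
            DynamicBoneData* Snapshot = new DynamicBoneData{GameThreadBones};
            Thread.Enqueue([&MeshObject, Snapshot] {
                MeshObject.UpdateOnRenderThread(Snapshot);
            });
        }
    }  // the thread drains its queue and joins here
    return MeshObject.ShaderConstants;
}
```

DemoDynamicUpdate() returns {3.0f, 1.5f}, the values from the last frame. The game side keeps writing to its own working array without any lock, because each command operates on a private snapshot whose ownership passed to the render side when it was enqueued.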

Updating State vs. Traversing the Rendering Scene

When developing a system that has its own update and rendering operations, it is tempting to merge the two into DrawDynamicElements, but this is not a good idea. A better approach is to separate the update from the rendering traversal, for example by enqueuing the update from the game thread Tick event.

DrawDynamicElements is called by high-level rendering code to draw the elements of a primitive component. The high-level code assumes that RHI state is not changed, and it may call DrawDynamicElements any number of times per frame (depending on the shading passes, the number of views, and the scene captures in the scene). DrawDynamicElements may even be called and then have its results discarded by the underlying drawing policy for various reasons (for example, a translucent FMeshElement submitted in the depth pass is discarded). If the primitive is not actually visible, the occlusion system may or may not call DrawDynamicElements, depending on the heuristics it uses. All of these factors conflict with a state update that should happen once per frame.

A better solution is to separate the update from the rendering traversal. The game thread Tick event enqueues a render command that performs the update. The render command can skip the update based on visibility, using the LastRenderTime of the primitive's scene info, if the use case allows it. Any RHI function can be used (including setting different render targets) when the update is enqueued separately in this way.

State caching (as opposed to updating) is an exception to this rule. State caching stores intermediate results of the rendering traversal as an optimization. It is tightly coupled to the traversal and does not change RHI state, so it does not suffer from the drawbacks mentioned above, provided the caching is done at the right time.
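The separation described above can be illustrated with a small single-threaded simulation in standard C++. Everything here is hypothetical and chosen for illustration: PrimitiveInfo, UpdateCommand, the MaxAge threshold, and the pass counts per frame. A plain vector of callables stands in for the render command queue, and the draw function only records that it ran.

```cpp
#include <functional>
#include <vector>

// Hypothetical render-side bookkeeping for one primitive.
struct PrimitiveInfo {
    double LastRenderTime = -1000.0;  // "never rendered"
    int UpdateCount = 0;
    int DrawCount = 0;
};

// Traversal: may run any number of times per frame, so it must not advance
// the simulated state. Here it only records when it last ran.
void DrawDynamicElements(PrimitiveInfo& Info, double Now) {
    Info.LastRenderTime = Now;
    ++Info.DrawCount;
}

// Update: enqueued exactly once per game Tick. It is skipped when the
// primitive has not been drawn within MaxAge time units.
void UpdateCommand(PrimitiveInfo& Info, double Now, double MaxAge) {
    if (Now - Info.LastRenderTime <= MaxAge) {
        ++Info.UpdateCount;
    }
}

PrimitiveInfo DemoUpdateVersusDraw() {
    PrimitiveInfo Info;
    std::vector<std::function<void()>> Queue;    // stand-in for the render command queue
    const int PassesPerFrame[4] = {3, 2, 0, 0};  // drawn in frames 0 and 1, hidden afterwards

    for (int Frame = 0; Frame < 4; ++Frame) {
        const double Now = Frame;
        // Game Tick: one update command per frame, however many passes follow.
        Queue.push_back([&Info, Now] { UpdateCommand(Info, Now, 1.0); });
        // Renderer: zero or more traversal calls in the same frame.
        for (int Pass = 0; Pass < PassesPerFrame[Frame]; ++Pass) {
            Queue.push_back([&Info, Now] { DrawDynamicElements(Info, Now); });
        }
    }
    for (auto& Command : Queue) {
        Command();  // executed in order, as a render thread would
    }
    return Info;
}
```

In this run the draw function executes five times, while the update executes twice: it is skipped in frame 0 (the primitive has never been drawn) and in frame 3 (the last draw is more than one time unit old). Had the update lived inside the draw function, it would have run five times and not at all once the primitive was hidden.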

Original: https://docs.unrealengine.com/latest/CHN/Programming/Rendering/ThreadedRendering/index.html
