With the spread of multi-core CPUs, multi-threaded design has become increasingly important for 3D engines. It is hard to imagine a 3D engine released a year or two from now still taking a single-threaded approach. However, multithreading makes an engine more complex, and a poor design yields only very limited performance gains; in a single-core environment, performance may even drop significantly. It is therefore critical to find a clear, concise, and efficient design. At present, write-ups on the design of multi-threaded 3D engines are rare, and most of them discuss only general principles without analyzing concrete examples.
This article proposes a concrete multi-threading scheme for a 3D engine. It is only a preliminary implementation that has not been tested extensively or run over long periods, but the ideas may serve as a useful reference. The currently popular division of thread responsibilities gives resource loading and image rendering a dedicated thread each, which is also a very intuitive division. This article is no exception and adopts the same division.
A traditional way to design a thread module is to assign a group of strongly cohesive jobs, such as resource loading or image display, to an independent thread and then design a set of command messages for it. The main thread submits a command message to the worker thread; the worker thread completes the request in the background and passes the result back to the main thread in another message. This design has two problems. First, the command messages that must be implemented are numerous, complex, and difficult to synchronize. Second, asynchronous command execution and result delivery break the natural control flow of the main thread: the main thread has to keep the context of each submitted command, obtain the result by polling or through a callback, and only then finish processing the command using the saved context.
The design introduced in this article avoids both problems and simplifies the command messages. Ideally, command requests flow in one direction only: once a request has been submitted, the main thread can forget about it, and no polling or callback is required. If polling or callbacks are used at all, they are enclosed within the thread module and are transparent to the main thread.
There is no free lunch, and this design has its limitations. First, it narrows what a worker thread does: the worker performs a single, clearly defined task, and the complicated work is left to the main thread, whose workload is therefore larger. Second, the resource loading thread must work together with a resource-load prediction algorithm. In prefetch mode, background loading is transparent to the main thread. If you want to load resources in the background without prefetching, the main thread has to poll or respond to callbacks.
The following describes the design and features of the resource loading and drawing threads.
To avoid complexity, the resource loading thread deals only with game resource objects and never creates game objects in the background. For example, it does not create the model objects that represent the main character or an NPC; it is concerned only with the mesh and texture objects that those model objects use. In addition, creating a resource object is divided into two steps: first, loading the resource file into an in-memory data block; second, parsing that data block and creating the final resource object (including allocating video memory resources). The loading thread performs only the first step, loading the resource file into memory. Parsing and object creation stay on the main thread, but they remain an internal part of the loading module and are transparent to the rest of the main thread's logic. The purpose is to keep the loading thread's work as simple as possible. Although data parsing and object creation are left to the main thread, the loading thread has already done the file I/O, which is the most expensive part, so it still takes on a considerable share of the work and keeps the main thread from stalling on file I/O.
There is a second reason the loading thread does not create resource objects: creating them involves allocating video memory and talking to the 3D device, which would require communication and synchronization with the drawing thread. In the design above, the loading thread does not need to know that the drawing thread exists, and synchronization with the drawing thread is left to the main thread.
Now consider how the loading thread works. The main thread can issue a load request at any time (note: this means a prefetch request). Its main parameter is the name of the file to load. The loading module puts the request into a queue of pending requests.
The loading thread continuously takes command messages from the request queue, opens and reads the file, saves a pointer to the loaded memory block in the command message structure, and then adds the finished command message to a queue of completed commands. At the end of each frame, the main thread calls a post-processing function of the loading module. This function takes each command message from the completed queue, performs the final resource parsing and resource object creation, and adds the created object to a hash table keyed by the resource's source file name for later lookup. Note that this step runs on the main thread, so it needs no synchronization with other parts of the main thread and can easily carry out complicated operations. The synchronization between resource creation and the drawing thread is described below. By the time the main thread actually needs to create a game object, the resource objects it depends on have already been loaded, so the game object can be created quickly.
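The request queue, completed queue, and end-of-frame post-processing described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the article's actual implementation: the names `LoadCommand`, `Resource`, `Loader`, `Prefetch`, `PostProcess`, and `Find` are hypothetical, and the "parsing" step merely records the size of the loaded data.

```cpp
#include <condition_variable>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical command message: carries the request and, once served, the result.
struct LoadCommand {
    std::string fileName;     // set by the main thread
    std::vector<char> data;   // filled in by the loading thread
    bool ok = false;          // true if the file could be opened
};

// Hypothetical stand-in for a parsed resource (mesh, texture, ...).
struct Resource {
    std::size_t byteSize = 0;
};

class Loader {
public:
    Loader() : worker_([this] { Run(); }) {}

    ~Loader() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_one();
        worker_.join();
    }

    // Main thread: one-way prefetch request, no result is awaited.
    void Prefetch(const std::string& fileName) {
        LoadCommand cmd;
        cmd.fileName = fileName;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(cmd));
        }
        wake_.notify_one();
    }

    // Main thread, end of frame: parse finished loads and create resources.
    // Returns the number of completed commands that were processed.
    std::size_t PostProcess() {
        std::queue<LoadCommand> done;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done.swap(completed_);
        }
        const std::size_t count = done.size();
        while (!done.empty()) {
            LoadCommand& cmd = done.front();
            if (cmd.ok) {
                auto res = std::make_shared<Resource>();
                res->byteSize = cmd.data.size();   // placeholder for real parsing
                resources_[cmd.fileName] = res;    // table is touched by the main thread only
            }
            done.pop();
        }
        return count;
    }

    // Main thread: look up a resource by source file name; null if not loaded.
    std::shared_ptr<Resource> Find(const std::string& fileName) const {
        auto it = resources_.find(fileName);
        if (it == resources_.end()) return nullptr;
        return it->second;
    }

private:
    // Loading thread: file I/O only, no parsing and no 3D device access.
    void Run() {
        for (;;) {
            LoadCommand cmd;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return quit_ || !pending_.empty(); });
                if (pending_.empty()) return;   // quit requested and queue drained
                cmd = std::move(pending_.front());
                pending_.pop();
            }
            std::ifstream in(cmd.fileName, std::ios::binary);
            if (in) {
                cmd.data.assign(std::istreambuf_iterator<char>(in),
                                std::istreambuf_iterator<char>());
                cmd.ok = true;
            }
            std::lock_guard<std::mutex> lock(mutex_);
            completed_.push(std::move(cmd));
        }
    }

    std::mutex mutex_;                    // guards both queues and quit_
    std::condition_variable wake_;
    std::queue<LoadCommand> pending_;
    std::queue<LoadCommand> completed_;
    std::unordered_map<std::string, std::shared_ptr<Resource>> resources_;
    bool quit_ = false;
    std::thread worker_;                  // declared last so it starts after the other members exist
};
```

In this sketch the main thread would call `Prefetch` whenever its prediction logic expects a file to be needed, call `PostProcess` once at the end of every frame, and use `Find` when it finally builds a game object. The hash table needs no lock because only the main thread reads or writes it.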
Now consider the drawing thread. Likewise, to keep the drawing thread simple, it does not create video memory resources; that work is left to the main thread. This departs from the many articles that recommend handing all operations on the 3D device to a single thread. Communication between the drawing thread and the main thread goes through a set of drawing contexts. After each logic frame, the main thread submits one drawing context to the drawing thread for every object to be drawn. A drawing context contains all the information needed to complete the draw, such as the vertex buffer (VB), index buffer (IB), vertex declaration, shader, and all shader parameters. With this context, the drawing thread can do its work independently, without communicating with any other object.
At the beginning of each frame, the main thread runs the logic for the current frame while the drawing thread simultaneously processes the drawing contexts submitted in the previous frame and draws that frame. If the main thread finishes the current frame's logic before the drawing thread has finished drawing the previous frame, or the drawing thread finishes drawing before the main thread has finished its logic, the one that finishes first waits until both are done. In other words, at the end of each frame the main thread synchronizes with the drawing thread. After this synchronization point, the drawing thread is in a waiting (blocked) state. The main thread then calls the loading module's post-processing function to create resource objects, including their video memory resources; because the drawing thread is blocked at this point, operations on the 3D device need no further synchronization. The main thread then submits the set of drawing contexts for the current frame.
At this point, the main thread notifies the drawing thread to start drawing the submitted contexts, and then begins the logic processing for the next frame.
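The per-frame handshake between the main thread and the drawing thread might look like the following sketch. It is only an illustration under assumptions: `DrawContext`, `DrawThread`, `WaitUntilIdle`, `Submit`, and `RunFrame` are hypothetical names, resource handles are plain integers, and the actual draw call is replaced by a caller-supplied callback.

```cpp
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical drawing context: everything needed to draw one object.
// Handles are plain integers here; a real engine would store device objects.
struct DrawContext {
    std::uint32_t vertexBuffer = 0;
    std::uint32_t indexBuffer = 0;
    std::uint32_t vertexDeclaration = 0;
    std::uint32_t shader = 0;
    std::vector<float> shaderParams;
};

class DrawThread {
public:
    // drawFn is a hypothetical callback standing in for the real draw call.
    explicit DrawThread(std::function<void(const DrawContext&)> drawFn)
        : drawFn_(std::move(drawFn)), worker_([this] { Run(); }) {}

    ~DrawThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        cv_.notify_all();
        worker_.join();
    }

    // Main thread, end of frame: block until the previous frame has been drawn.
    // Afterwards the drawing thread is waiting, so the main thread can use
    // the 3D device without further locking.
    void WaitUntilIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !busy_; });
    }

    // Main thread: hand over this frame's contexts and start the drawing thread.
    // Call only after WaitUntilIdle().
    void Submit(std::vector<DrawContext> contexts) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            frame_ = std::move(contexts);
            busy_ = true;
        }
        cv_.notify_all();
    }

private:
    void Run() {
        for (;;) {
            std::vector<DrawContext> frame;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return quit_ || busy_; });
                if (!busy_) return;   // quit requested and nothing left to draw
                frame.swap(frame_);
            }
            for (const DrawContext& ctx : frame) drawFn_(ctx);
            {
                std::lock_guard<std::mutex> lock(mutex_);
                busy_ = false;        // frame finished, drawing thread goes idle
            }
            cv_.notify_all();
        }
    }

    std::function<void(const DrawContext&)> drawFn_;
    std::mutex mutex_;                // guards frame_, busy_ and quit_
    std::condition_variable cv_;
    std::vector<DrawContext> frame_;
    bool busy_ = false;
    bool quit_ = false;
    std::thread worker_;              // declared last so it starts after the other members exist
};

// One main-thread frame in the order described in the text.
// updateLogic, createResources and buildContexts are hypothetical hooks.
template <class Logic, class Create, class Build>
void RunFrame(DrawThread& draw, Logic updateLogic, Create createResources,
              Build buildContexts) {
    updateLogic();                  // overlaps with drawing of the previous frame
    draw.WaitUntilIdle();           // end-of-frame synchronization point
    createResources();              // drawing thread is blocked: device access is safe
    draw.Submit(buildContexts());   // drawing thread starts on this frame
}
```

With this arrangement the drawing thread is always one frame behind the logic, and the only window in which the main thread touches the 3D device is between `WaitUntilIdle` and `Submit`, which is where the loading module's post-processing would run.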
A picture is worth a thousand words. To conclude, the sequence diagram of the three threads is included at the end.