Designing the Framework of a Parallel Game Engine


Author: Jeff Andrews

Translation: Vincent


An architecture built on functional and data decomposition can deliver large-scale parallel execution while making full use of multi-core processors.

With the advent of multi-core processors, the need for parallel game engines has grown steadily. Although it is still feasible to rely on the GPU and a single-threaded game engine, using all the cores on a system brings the player a much richer experience. For example, a game that exploits more cores can add more rigid bodies for richer physics, or develop smarter, more human-like AI.

A parallel game engine framework, or multi-threaded engine, is designed to use every processor on the platform to improve performance. Through parallel processing, each functional module of the engine can use all available processors. Of course, this is easier said than done: many parts of a game engine interact with one another, and those interactions commonly cause threading errors. We therefore need a design that handles data synchronization properly while avoiding the cost of synchronization locks, along with a way to keep serial overhead as low as possible when data is synchronized in parallel. This article assumes the reader has a good understanding of modern game development and some experience with multi-threaded game engine programming.

2. Parallel Execution State

The concept of a parallel execution state is essential to an efficient multi-threaded engine. For an engine to achieve true parallelism, with minimal synchronization overhead, its systems must interact as little as possible while they run. Although data still needs to be shared, each system should keep its own copy of that data rather than access it through a shared location; this removes data dependencies between systems. Any change a system makes to shared data is posted to a state manager and placed in a change queue, which can also be thought of as a message queue. Once the systems finish their work for the tick, they are notified of the changes and update their internal data structures (this is part of processing the message queue). This mechanism greatly reduces synchronization overhead and lets the systems operate more independently.
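The change-queue idea can be sketched in a few lines of C++. The names here (`StateManager`, `Change`, `ChangeType`) are illustrative assumptions, not an API the article defines: systems post changes while running, and nothing is applied until the queue is distributed at the end of the tick.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative names only: the article does not define this API.
enum class ChangeType { Position, Orientation };

struct Change {
    int objectId;     // which universal object changed
    ChangeType type;  // what kind of change it was
    float value[3];   // the new value, e.g. a position
};

class StateManager {
public:
    // Systems post changes here while running; nothing is applied yet.
    void queueChange(const Change& c) { queue_.push_back(c); }

    // Once all systems finish the tick, every queued change is handed
    // to whoever registered for it, and the queue is cleared.
    template <typename Fn>
    void distribute(Fn&& onChange) {
        for (const Change& c : queue_) onChange(c);
        queue_.clear();
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::vector<Change> queue_;
};
```

Because systems only append to the queue and read their own copies while running, no system ever blocks on another mid-tick.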

2.1 Execution Mode

Execution state management works best when the systems run synchronously, that is, when all systems operate on the same clock tick. The tick rate may equal the frame rate, but it does not have to, and the tick length need not even be fixed. However, if the tick length corresponds to the time needed to process one frame, the actual rate stops mattering. Your implementation of execution state management determines the tick length. Figure 1 shows the states of different systems driven by a free-running clock, where the systems do not all finish on the same tick. Figure 2 shows how all systems execute on a single, locked tick.

Figure 1. Execution state in free-step mode

2.1.1 Free-step mode

In this mode, a system runs for as long as its task requires. "Free" does not mean a system is idle until its task completes; it means the system is free to choose how many ticks it needs.

With this mode, a bare change notification is not enough for the state manager: the relevant data must be included in the notification. A system that modifies shared data may still be executing when other systems need the updated values, so the data cannot be read from the system itself. This requires more and more memory for copies, which is clearly not ideal.

2.1.2 Lock-step mode

This mode requires all systems to complete their work within the same tick. It is easier to implement and does not require attaching data to notifications, because a system interested in a change can simply query the other system for the value at the end of the tick.

Lock-step mode can also approximate free-step behavior by interleaving work across multiple ticks. For example, an AI system might compute its initial coarse "macro" goal in the first tick, then refine it into more specific goals in the next tick, instead of simply recomputing the macro goal each time.

Figure 2. Execution state in lock-step mode

2.2 Data Synchronization

When multiple systems can change the same shared data, we need to decide which of those changes is the correct one to keep. Two mechanisms can be used:

- Time: the value written by the last system to make a change wins.

- Priority: the value written by the system with the higher priority wins. When systems share the same priority, the time mechanism breaks the tie.

Under either mechanism, data considered stale is overwritten or dropped from the notification queue.

Because the data is shared and changes may arrive out of order, relative values can be hard to apply correctly. To avoid this, use absolute values when updating data, so that the newest value simply replaces the old. A mix of absolute and relative values is ideal, but it depends on the data. For example, shared data such as position and orientation should be absolute, because the order in which changes arrive matters when building a transformation matrix. By contrast, a custom system that generates particles, and fully owns the particle information, can get away with relative updates.
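The two arbitration rules above can be sketched as follows. The `PendingChange` fields and the `resolve` helper are assumptions for illustration; the point is only that priority wins outright and time breaks ties:

```cpp
#include <cassert>
#include <vector>

// Assumed representation of one queued change to a shared value.
struct PendingChange {
    int tick;      // when the change was made ("time" rule)
    int priority;  // priority of the system that made it ("priority" rule)
    float value;   // the proposed new (absolute) value
};

// Pick the winning change: higher priority wins outright; equal
// priorities fall back to the time rule (the latest change wins).
float resolve(const std::vector<PendingChange>& changes) {
    const PendingChange* best = &changes.front();
    for (const PendingChange& c : changes) {
        if (c.priority > best->priority ||
            (c.priority == best->priority && c.tick > best->tick)) {
            best = &c;
        }
    }
    return best->value;
}
```

All losing changes are simply discarded, which is exactly the "overwritten or dropped from the queue" behavior described above.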

3. Engine

When designing the engine, structural flexibility should be a priority so that functionality is easy to extend. Built this way, the engine can be tuned for platforms with various constraints (such as memory).

The engine consists of two parts: the framework and the managers. The framework (section 3.1) contains the parts of the game that are replicated in multiple instances, as well as everything that appears in the main loop. The managers (section 3.2) are singletons that are independent of the game logic.

The following figure describes the components of the engine:

Figure 3: High-level engine framework

It is worth noting that the game's processing functionality, that is, the systems, is treated as separate from the engine. For the sake of modularity, the engine acts as the "glue" linking the different pieces of functionality together. Modularity allows systems to be loaded or unloaded as needed.

Interfaces are the channel of communication between the engine and the systems. When a system implements an interface, the engine can use that system's functionality; conversely, when the engine implements an interface, a system can access the engine's managers.

Appendix A, "Engine diagram", illustrates this concept more clearly. As described in chapter 2, the notion of a parallel execution state makes the systems essentially discrete, so they do not interfere with one another when running in parallel. However, this separation alone cannot keep data consistent when systems need to communicate. Systems need to communicate for two reasons:

- To notify another system that shared data (such as position or orientation) has changed.

- To request functionality the system itself does not provide (for example, the AI system asking the geometry or physics system to perform a ray-cast collision check).

The first communication problem is solved by the state manager described in the previous chapter; the state manager is covered in detail in section 3.2.2, "State manager".

To solve the second problem, a mechanism is needed by which a system can offer services to other systems. Section 3.2.3, "Service manager", explains this in depth.

3.1 Framework

The framework's job is to link together the different parts of the engine. Engine initialization happens within the framework, except for the managers, which are instantiated globally and are not owned by the framework. The scene information is also stored in the framework. For the sake of flexibility, a scene is implemented as a universal scene: a container whose extensions make up the whole scene. Section 3.1.2 gives more detail.

The game loop is also executed within the framework. The following is the game loop process:

Figure 4: Main game loop

Because the engine runs in a windowed environment, the first step of the game loop is to process window messages from the operating system; if no messages are pending, no extra work is needed. Next, the scheduler issues the systems' tasks to the task manager; this is discussed further in section 3.1.1. Then the changes tracked by the state manager (section 3.2.2) are distributed to the parties that should respond to them. Finally, the framework checks the execution status to determine whether the engine should quit or continue, for example by advancing to the next scene. The execution status is held by the environment manager, discussed in section 3.2.4.
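The steps above can be sketched as a minimal loop skeleton. The enum values, `FrameStats`, and `tick` are placeholder names for the managers described in later sections, not part of any real API; each line of `tick` stands in for one step of figure 4:

```cpp
#include <cassert>

// Placeholder names; each comment maps a line to a step in figure 4.
enum class RunStatus { Running, NextScene, Quit };

struct FrameStats { int messagesPumped = 0, tasksIssued = 0, changesSent = 0; };

RunStatus tick(FrameStats& f, RunStatus statusAfterFrame) {
    ++f.messagesPumped;       // 1. platform layer pumps OS window messages
    ++f.tasksIssued;          // 2. scheduler issues system tasks to the task manager
    ++f.changesSent;          // 3. state manager distributes queued changes
    return statusAfterFrame;  // 4. environment manager reports the run status
}
```

A driver would repeat `tick` until the reported status is no longer `Running`, which is exactly the check-and-exit step described in section 6.3.3.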

3.1.1 Scheduler

The scheduler holds the master clock for execution, whose tick rate is set in advance. The clock can also run unthrottled, for example in a benchmarking mode, so that tasks are not cut off before they complete. On each tick, the scheduler submits systems for execution through the task manager. In free-step mode (section 2.1.1), the scheduler communicates with the systems to determine how many ticks each needs, and therefore which systems are ready to run and which will finish by a given tick. In lock-step mode (section 2.1.2), all systems start and end on the same tick, so the scheduler only needs to wait for them to finish.

3.1.2 Universal scenes and objects

Universal scenes and objects are containers for functionality implemented in the systems. By themselves they have no functionality beyond the ability to interact with the engine, but they can be extended to incorporate the functionality a system provides. This way, the containers take on the properties of whatever systems are available through a loose coupling, with no need to bind to any specific system. Loose coupling keeps the systems independent of one another, which makes parallel execution possible. The following diagram shows how universal scenes and objects are extended by systems:

Figure 5: Universal scene and object extensions

A universal scene is extended to become a container holding graphics, physics, and other properties. The graphics scene extension initializes the display and other rendering state, while the physics scene extension sets up the rigid-body world, for example its gravity. Scenes contain objects, so a universal scene holds several universal objects. A universal object, likewise, is extended to become a container holding graphics, physics, and other properties. The graphics object extension renders the object on screen, and the physics object extension handles rigid-body collision and interaction. The relationship between the engine and the systems is shown further in the diagram in appendix B, "Engine and system relationship diagram".

Another point to note is that universal scenes and universal objects must register their extensions with the state manager in order to receive changes produced by other extensions (that is, by other systems). For example, once a graphics extension has registered, it can receive the notifications generated when the physics extension changes an object's position or orientation.

For more information about system components, see section 5.2, "System components".

3.2 Managers

The managers provide global functionality within the engine as singletons, meaning each manager has only one instance. They are singletons because the resources they manage should not be duplicated; duplication would cause redundancy and potentially hurt performance. The managers also provide common functionality that spans all systems.

3.2.1 Task Manager

The task manager schedules system tasks on its own thread pool. The pool allocates one thread per processor for optimal n-way processing, which avoids oversubscribing threads and incurring unnecessary task switching in the operating system.

The task manager receives its list of tasks from the scheduler, and the scheduler obtains that list from the systems. Each system has just one primary task; this is called functional decomposition. The primary task can, however, be split into any number of subtasks over the data being processed; this is called data decomposition.
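Data decomposition can be sketched as splitting a primary task into subtasks over ranges of objects. The `SubTask` type, the `decompose` helper, and the fixed grain size are illustrative assumptions; a real task manager would pick the split from the processor count:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One slice of a system's primary task: a half-open range of objects.
struct SubTask {
    std::size_t begin, end;
};

// Split work on `count` objects into subtasks of at most `grain`
// objects each. The fixed grain size is an illustrative choice.
std::vector<SubTask> decompose(std::size_t count, std::size_t grain) {
    std::vector<SubTask> subtasks;
    for (std::size_t i = 0; i < count; i += grain)
        subtasks.push_back({i, std::min(i + grain, count)});
    return subtasks;
}
```

Each subtask covers a disjoint range, so the thread pool can run them concurrently without the subtasks contending over the same objects.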

The figure below illustrates how the Task Manager assigns tasks to threads on a quad-core system:

Figure 6: Task manager and thread pool example

In addition to serving the scheduler, the task manager has an initialization mode in which it calls each system serially from each worker thread, so that the systems can initialize any thread-local storage they need. Appendix D, "Tips for implementing tasks", offers help with implementing a task manager.

3.2.2 State manager

State management is part of the messaging mechanism. It tracks the change notifications a system generates and distributes them to the other systems that should respond. To cut down on unnecessary broadcasts, a system must register for the kinds of notifications it is interested in. The mechanism is based on the observer pattern, explained in more detail in appendix C, "Observer pattern". In short: an observer watches a subject for any change of interest, while a change controller acts as the go-between that delivers the change to the observer.

The mechanism works as follows:

1. The observer registers with the controller (the state manager) for an object it wants to observe.

2. When a property of the object changes, the object passes the change to the controller.

3. When the framework signals it, the controller forwards the queued changes to the observers.

4. The observer queries the object to get the data that actually changed.
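The four steps above can be sketched in C++. The class names here are assumptions (the article only describes the roles), and the change controller is reduced to a counting stand-in:

```cpp
#include <cassert>
#include <vector>

class Subject;  // here, the observed object

// Names are assumptions; the article only describes the roles.
class IObserver {
public:
    virtual ~IObserver() = default;
    virtual void changeOccurred(Subject* object) = 0;
};

class Subject {
public:
    // Step 1: the controller registers for this object's changes.
    void attach(IObserver* o) { observers_.push_back(o); }

    // Step 2: the object reports a change to whoever registered.
    void postChange() {
        for (IObserver* o : observers_) o->changeOccurred(this);
    }

private:
    std::vector<IObserver*> observers_;
};

// Stand-in for the change controller, which is itself an observer
// (steps 3-4 would forward the change and let observers query the object).
class CountingController : public IObserver {
public:
    int notifications = 0;
    void changeOccurred(Subject*) override { ++notifications; }
};
```

In the real mechanism the controller would queue each notification rather than act on it immediately, and forward the batch when the framework signals distribution.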

Free-step mode (section 2.1.1) adds extra complexity to this mechanism. First, data must be included with each notification, because the system that generated it may still be running, making it impossible to obtain the shared data by querying that system. Second, if a receiving system is not ready for a notification at the end of the tick, the state manager must hold the notification until that system is ready.

The framework implements two state managers: one handling changes at the scene level and one at the object level. Scenes and objects mostly care about different messages, so separating the two avoids unnecessary message processing; however, any object change that is relevant to the scene is also registered with the scene, so the scene still receives those notifications. To reduce synchronization overhead, the state manager keeps a separate change queue for each thread the task manager creates, so no synchronization is needed when posting to the queues. After execution, the queues are merged using the methods described in section 2.2.
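The per-thread queues mentioned above can be sketched as follows. The class and method names are assumptions; the key property is that each worker thread only ever appends to its own queue, so no locks are taken while tasks run, and the merge happens once per tick after all workers have joined:

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

struct Change { int objectId; float value; };

// One change queue per worker thread: pushes need no locks because
// each thread only ever touches its own queue. Names are assumptions.
class PerThreadQueues {
public:
    explicit PerThreadQueues(std::size_t threadCount) : queues_(threadCount) {}

    // Called from worker thread `tid` while tasks run.
    void push(std::size_t tid, const Change& c) { queues_[tid].push_back(c); }

    // Called once per tick, after all worker threads have joined.
    std::vector<Change> merge() {
        std::vector<Change> all;
        for (auto& q : queues_) {
            all.insert(all.end(), q.begin(), q.end());
            q.clear();
        }
        return all;
    }

private:
    std::vector<std::vector<Change>> queues_;
};
```

The merged list is where the time and priority rules of section 2.2 would then be applied.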

Figure 7: Internal universal object change notification

Although it may seem that these change notifications must be distributed serially, they can in fact be processed in parallel. While the systems execute their tasks, they operate across all their objects; for example, as objects interact, the physics system moves them, checks for collisions, and applies new forces. During change distribution, however, a system's objects no longer interact with other objects of the same system; they interact only with the other extensions attached to the same universal object. That means the universal objects are independent of one another at this point, so each can be updated in parallel. Note that a few edge cases may still require synchronization, but work that previously looked inherently serial can now be parallelized.

3.2.3 Service Manager

The service manager gives systems access to functionality they do not themselves possess. Importantly, the service manager does not provide this functionality directly; it works through predefined interfaces. Any system that implements one of these published interfaces can register itself with the service manager, making the service available.
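A minimal sketch of this registration scheme follows. The `IRayCastService` interface is hypothetical (the article names a ray cast only as an example of a service), and the single-slot `ServiceManager` is a deliberate simplification:

```cpp
#include <cassert>

// Hypothetical service interface; the article does not define one.
class IRayCastService {
public:
    virtual ~IRayCastService() = default;
    virtual bool rayHits(float x, float y, float z) const = 0;
};

// The service manager only hands out interfaces that were registered.
class ServiceManager {
public:
    void registerRayCast(IRayCastService* s) { rayCast_ = s; }
    IRayCastService* rayCast() const { return rayCast_; }  // may be null
private:
    IRayCastService* rayCast_ = nullptr;
};

// A physics system might implement and register the service:
class PhysicsRayCast : public IRayCastService {
public:
    // Toy rule: anything below y = 0 counts as a hit.
    bool rayHits(float, float y, float) const override { return y < 0.f; }
};
```

The AI system would then call `services.rayCast()` rather than holding any reference to the physics system itself, which keeps the two systems decoupled.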

Because the engine is designed to keep the systems as discrete as possible, only a few services are actually available. Furthermore, systems cannot offer arbitrary services of their own choosing: only those defined through the service manager.

Figure 8: Service manager example

The service manager's other role is to give systems access to each other's properties. Properties are values that are not passed through the messaging system, such as the window resolution or the physics system's gravity. The service manager does not give systems direct access to these values; instead, property changes are queued and distributed serially. Note that accessing another system's properties is rare and should not be treated as a common operation. It happens, for example, when a console command toggles wireframe mode in the graphics system, or when the player changes the screen resolution through the UI; such accesses hardly ever occur every frame.

3.2.4 Environment Manager

The environment manager provides functionality related to the engine's running environment. It offers the following function groups:

- Variables: variable names and values shared across the engine. The variables are usually set when a scene is loaded or from user settings, and are queried by the engine and the systems.

- Execution: information about execution, such as the end of a scene or the end of the program. These values can be set or queried by both the engine and the systems.

3.2.5 Platform manager

The platform manager handles calls into the operating system and provides added functionality on top of them. The advantage is that several common calls can be bundled into one, so callers do not have to implement each call themselves or worry about the subtle differences between them.

An example is the platform manager's call for loading a system's dynamic link library. In addition to loading the library, it obtains the function entry points and calls the library's initialization function. It also keeps a handle to the library and unloads it after the engine exits.

The platform manager also provides processor information, such as which SIMD instructions are supported, and initializes certain behavior for the process. This functionality is available to the systems only through queries.

4 Interfaces

Interfaces are the means of communication between the framework, the managers, and the systems. The managers reside in the engine, so the framework can access them directly. The systems, however, live outside the engine and vary in functionality, so a common method of accessing them is required. Likewise, the systems cannot access the managers directly, so each manager must provide the systems with a way in; this access need not be comprehensive, because some things should be visible only to the framework.

Interfaces provide a standardized set of access methods. The framework can then communicate with a system through those well-known methods, without needing to know anything about the system's internals.

4.1 Object and observer interfaces

The object and observer interfaces are used to register an observer with an object, so that changes to the object are delivered to that observer. Registering and unregistering an observer is functionality common to every object.

4.2 Manager interface

Although the managers are singletons, they are directly accessible only to the framework, not to the systems. To give the systems access, each manager provides an interface exposing a subset of its functionality; the interface is handed to a system during system initialization, and the system can then use that subset.

The interface definitions depend on the manager in question; they are not generic, but defined separately for each manager.

4.3 System Interface

The systems likewise need to implement interfaces so that the framework can access them. Without these interfaces, the framework would have to handle each newly added system as a special case.

A system is made up of four components, so it must implement four corresponding interfaces: system, scene, object, and task. These components are described in chapter 5, "Systems", and the interfaces are the means of accessing them. The system interface creates and destroys scenes; the scene interface creates and destroys objects, and also retrieves the scene's task; the task interface is used by the task manager when issuing tasks into its thread pool.
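The four interfaces can be sketched as abstract classes. The names mirror the text (system, scene, object, task), but the exact signatures are assumptions, followed by a minimal stub system that plugs into them:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Interface names mirror the text; the signatures are assumptions.
class ISystemObject {
public:
    virtual ~ISystemObject() = default;
};

class ISystemTask {
public:
    virtual ~ISystemTask() = default;
    virtual void update(float deltaTime) = 0;  // issued by the task manager
};

class ISystemScene {
public:
    virtual ~ISystemScene() = default;
    virtual std::unique_ptr<ISystemObject> createObject() = 0;
    virtual ISystemTask* task() = 0;  // the scene hands out its task
};

class ISystem {
public:
    virtual ~ISystem() = default;
    virtual std::string name() const = 0;
    virtual std::unique_ptr<ISystemScene> createScene() = 0;
};

// Minimal stub showing a system plugging into the interfaces:
class StubTask : public ISystemTask {
public:
    int updates = 0;
    void update(float) override { ++updates; }
};

class StubScene : public ISystemScene {
public:
    std::unique_ptr<ISystemObject> createObject() override {
        return std::make_unique<ISystemObject>();
    }
    ISystemTask* task() override { return &task_; }
private:
    StubTask task_;
};

class StubSystem : public ISystem {
public:
    std::string name() const override { return "stub"; }
    std::unique_ptr<ISystemScene> createScene() override {
        return std::make_unique<StubScene>();
    }
};
```

The framework only ever sees the four abstract interfaces, which is what allows a new system to be dropped in without special-casing it.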

The scene and object interfaces also derive from the object and observer interfaces, so the systems can communicate through the universal scenes and universal objects they are attached to.

4.4 Change interfaces

There are also special interfaces used to pass data between systems. Any system that makes the corresponding kind of change must implement the interface. Geometry is one example: the geometry interface reads an object's position, orientation, and scale. Any system that changes these values must implement the geometry interface, so a system consuming geometry changes does not need to know which system produced them.
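A sketch of such a change interface follows. The interface name, the `Vec3` type, and the toy physics object are assumptions for illustration; the point is that observers read through the common interface and never see the producing system:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Assumed shape of a geometry change interface: every system that
// moves objects exposes the same accessors.
class IGeometryObject {
public:
    virtual ~IGeometryObject() = default;
    virtual Vec3 position() const = 0;
    virtual Vec3 orientation() const = 0;
    virtual Vec3 scale() const = 0;
};

// A physics system's object, read through the common interface:
class PhysicsObject : public IGeometryObject {
public:
    void integrate(float dt) { pos_.y -= 9.8f * dt; }  // toy gravity step
    Vec3 position() const override { return pos_; }
    Vec3 orientation() const override { return {0.f, 0.f, 0.f}; }
    Vec3 scale() const override { return {1.f, 1.f, 1.f}; }
private:
    Vec3 pos_{0.f, 10.f, 0.f};
};
```

A graphics extension observing this object only holds an `IGeometryObject*`, so the same code works whether the movement came from physics, animation, or a script.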

5 Systems

The systems provide the engine with its game functionality; without them, the engine would just spin endlessly with nothing to do. To keep the engine and the systems independent of each other, each system must implement the interfaces described in section 4.3, "System interfaces". That way, adding a new system to the engine requires no knowledge of its internals, which keeps the process simple.

5.1 Types

The engine should ship with some predefined system types as standard game components, for example: geometry, graphics, physics (rigid-body collision), audio, input, AI, and animation.

Beyond these common functions, custom systems are also possible. Note that a custom system must supply its own interfaces to any other system that needs them, because the engine does not provide that information.

5.2 System Components

A system must implement several components: the system, scene, object, and task components. They are the means by which the system communicates with the rest of the engine.

The following diagram shows how the components relate to one another:

Figure 9: System Components

Appendix B, "Engine and system relationship diagram", shows the relationship between the engine and the systems in more detail.

5.2.1 System

The system component initializes system resources that remain more or less constant throughout the engine's run. For example, the graphics system analyzes the locations of all its resources to determine how to load them faster, but it does not perform the loading itself. The screen resolution set by the graphics system is another resource of this kind.

The system component is also the framework's main entry point into the system. It reports information about itself, such as its type, and provides methods for creating and destroying its scenes.

5.2.2 Scene

The scene component, also called a system scene, manages resources that belong to a particular scene. A universal scene uses system scenes as extensions, making the properties those systems provide available. An example is a physics scene creating its world on scene initialization and setting the world's gravity.

Scenes provide methods for creating and destroying objects. They also own the task component, which operates on the scene, and provide a method to retrieve that task component.

5.2.3 Object

The object component, also called a system object, corresponds to an object that the player can see in the scene. A universal object uses system objects as functional extensions, so that the properties of the system objects can be accessed through the interface the universal object exposes.

For example, a universal object might be extended with geometry, graphics, and physics to put a wooden beam on screen: the geometry system holds the object's position, orientation, and scale; the graphics system draws its mesh on screen; and the physics system applies rigid-body collision detection and gravity to it.

In certain situations, a system object may be interested in another universal object or in changes to one of its extensions, such as a change in scale. In those cases a link can be established through the universal object so that the system object can observe the other object.

5.2.4 Task

The task component, or system task, operates on the scene. When the task receives an update command from the task manager, it invokes the system's functionality on the objects within the scene.

With the task manager's help, a task can split itself into subtasks to spread its work across more threads, which makes it easier for the engine to scale to processors with more cores. This technique is the data decomposition described earlier.
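A minimal sketch of such a split follows. The two-way split and the trivial per-object work are illustrative assumptions; because the subtasks cover disjoint ranges, they share no data and need no locks:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Stand-in for real per-object work in a scene update.
void updateRange(std::vector<float>& positions,
                 std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        positions[i] += 1.0f;
}

// The task splits itself into two subtasks over disjoint ranges.
// A real task manager would pick the split from the processor count.
void runTask(std::vector<float>& positions) {
    std::size_t mid = positions.size() / 2;
    std::thread worker(updateRange, std::ref(positions),
                       std::size_t{0}, mid);
    updateRange(positions, mid, positions.size());  // this thread takes the rest
    worker.join();
}
```

Any changes the subtasks generate would go into the per-thread change queues described in section 3.2.2, and be posted to the state manager after the join.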

While the scene is being updated, any changes made to its objects are posted to the state manager. See section 3.2.2 for details on the state manager.

6. Conclusion

Because the chapters above cross-reference one another, the information is hard to absorb in a single pass. The engine's workflow can be broken into the following stages.

6.1 Initialization stage

The engine starts by initializing the managers and the framework:

- The framework calls the scene loader to load the scene.

- The loader determines which systems the scene uses, then tells the platform manager to load those modules.

- The platform manager loads each module and, through the manager interface, instructs it to create a new system; the module returns a pointer to the system instance.

- Each system module registers the services it can provide with the service manager.

Figure 10: Initialization of the engine managers and systems

6.2 Scene loading stage

In this stage, control is handed to the scene loader:

- The loader creates a universal scene, instantiates a system scene through each system's interface, and attaches the system scenes to the universal scene as extensions.

- The universal scene queries each system scene to determine what shared data it may change and what changes to shared data it wants to receive.

- The universal scene registers the matching system scenes with the state manager so that they will receive those changes.

- For every object in the scene, the loader creates a universal object and determines which systems should extend it; the universal objects are registered with the state manager in the same way as the universal scene.

- The loader instantiates system objects through the interfaces of the system scenes and attaches them to the universal objects as extensions.

- The scheduler queries the system scene interfaces for their primary tasks and issues those tasks to the task manager.

Figure 11: Initialization of universal scenes and objects

6.3 Game loop stage

- The platform manager is called to process all window messages and other platform-specific operations.

- Control then passes to the scheduler, which waits for the current tick to end.

- In free-step mode, the scheduler checks which system tasks finished during the previous tick, then issues all tasks that are ready to run to the task manager.

- The scheduler also determines which tasks will finish in the current tick and prepares to end the tick accordingly.

- In lock-step mode, the scheduler issues all tasks every tick and then waits, checking for their completion.

6.3.1 Task execution

Control is passed to the task manager:

- The task manager orders its tasks and begins dispatching them to available threads.

- As a task executes, it operates on the internal data of the whole scene or of specific objects.

- Any shared data, such as position and orientation, is duplicated in other systems, so system tasks have their system scenes or system objects notify their observers of each change; here the observer is in fact the change controller inside the state manager.

- The change controller queues the change information for later processing; changes that no observer cares about are simply dropped.

- A task can call on the service manager for services it needs. The service manager can also be used to change system properties that are not exposed through the messaging mechanism (for example, the player changing the graphics system's screen resolution via the input system).

- A task can also call the environment manager to read environment variables and to change the run state (for example, pausing, or advancing to the next scene).

Figure 12: Task manager and tasks

6.3.2 Distribution

Once all the tasks for the current tick have finished, the main loop tells the state manager to distribute the changes:

- The state manager commands its change controllers to distribute the changes in their queues; each queued change is checked against the observers registered for it.

- The change controller notifies each observer of the change, passing along a pointer to the object that produced it. In free-step mode, the changed data is delivered through the change controller rather than the observer reading the object directly.

- An observer interested in a system object's changes is usually another system object attached to the same universal object. This makes it possible to distribute the changes across parallel tasks; to reduce synchronization, tasks can be batched so that changes to the same universal object are processed together.

6.3.3 Run state check and exit

The last step of the main loop is to check the run state, which may be running, paused, next scene, quit, and so on. If the state is running, the whole game loop repeats. If it is quit, the loop exits, resources are freed, and the program ends.

7. Final Thoughts

The key to this entire article is chapter 2, "Parallel Execution State". Designing systems around functional and data decomposition provides large-scale parallelization, while also ensuring performance on the even greater core counts of future processors. Remember to use the state manager together with the messaging mechanism to keep data synchronization costs to a minimum.

The observer pattern is what makes the messaging mechanism work. It is worth spending some time learning it well enough to implement it in the way that best fits your engine; after all, it is the inter-system communication mechanism that keeps shared data synchronized.

Task mechanisms play an important role in load balancing. Appendix D helps your engine create an efficient task manager.

As you can see, it is feasible to design a highly parallel engine using clearly defined messages and architectures. Moderate parallelism can improve the performance of your game engine while using the current and future processors.

Appendix A engine Diagram

The main game loop starts running (see Figure 4, "Main game loop").

Appendix B engine and system Relationship Diagram

Appendix C observer Mode

The observer mode can be found in design patterns: the basis for reusable object-oriented software.

The basic concept of this model is that it is not necessary to query the changes of data or status changes at all times. This pattern defines an object and an observer to handle change prompts. The working principle is: The Observer observes whether the object has changed. The change Controller acts as a transmitter between the two. Describes the relationship:

Figure 13: Observer Mode

The flow of events is as follows:

1. The observer registers itself, together with the subject it wants to observe, with the change controller.

2. The change controller is itself an observer. Unlike other observers, it does not register itself with a subject; instead, it keeps a list recording which observers are registered with which subjects.

3. The subject adds the observer (in fact the change controller) to its observer list, optionally along with the types of change the observer is interested in, which speeds up the distribution of change notifications.

4. When a subject's data or state changes, it notifies the observer of the change type through a callback mechanism.

5. The change controller queues the change and waits for the signal to distribute it.

6. During distribution, the change controller calls the corresponding observers.

7. The observer queries the subject for the changed data or state (or receives the data directly in the message).

8. When the observer is no longer interested in the subject, or the subject is being destroyed, the observer tells the change controller to remove its registration with that subject.
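The eight steps above can be condensed into a minimal observer sketch. For brevity this omits the change controller and the queuing step and notifies observers directly; all names (`Subject`, `Observer`, `OnChange`) are illustrative, not the article's actual interfaces.

```cpp
#include <algorithm>
#include <string>
#include <vector>

class Subject;

class Observer {
public:
    virtual ~Observer() = default;
    // Step 4/6: callback invoked when the subject changes.
    virtual void OnChange(Subject& subject) = 0;
};

class Subject {
public:
    // Steps 1-3: the observer is added to the subject's observer list.
    void Attach(Observer* o) { m_observers.push_back(o); }

    // Step 8: the observer deregisters when no longer interested.
    void Detach(Observer* o) {
        m_observers.erase(std::remove(m_observers.begin(), m_observers.end(), o),
                          m_observers.end());
    }

    // A state change notifies every registered observer via callback.
    void SetState(std::string s) {
        m_state = std::move(s);
        for (Observer* o : m_observers) o->OnChange(*this);
    }

    // Step 7: observers query the subject for the changed state.
    const std::string& State() const { return m_state; }

private:
    std::string m_state;
    std::vector<Observer*> m_observers;
};
```

In the engine described by the article, the change controller sits between these two classes, batching step 4's callbacks into a queue so that step 6 happens only at the end of the tick.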

Appendix D. Suggestions for implementing the task mechanism

There are many ways to implement task distribution, but the best is to keep the number of worker threads equal to the number of available logical processors on the platform. Avoid dedicating threads to particular systems: the systems rarely finish their tasks at the same time, so bound threads sit idle and the load across threads becomes unbalanced, which greatly weakens concurrency. It is also worth studying a job library, such as Intel's Threading Building Blocks, which greatly simplifies this work.
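As a rough illustration of the "one worker thread per processor" advice, here is a minimal thread-pool sketch using `std::thread::hardware_concurrency()`. This is an assumed, simplified design (the names `TaskManager`, `Submit`, and `WaitIdle` are illustrative), not the article's actual task manager, and a production engine would likely use a job library instead.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class TaskManager {
public:
    TaskManager() {
        // One worker per available logical processor (at least one).
        unsigned n = std::max(1u, std::thread::hardware_concurrency());
        for (unsigned i = 0; i < n; ++i)
            m_workers.emplace_back([this] { WorkerLoop(); });
    }

    ~TaskManager() {
        { std::lock_guard<std::mutex> lock(m_mutex); m_stop = true; }
        m_cv.notify_all();
        for (auto& t : m_workers) t.join();
    }

    // Any system may submit work; tasks are not bound to threads.
    void Submit(std::function<void()> task) {
        { std::lock_guard<std::mutex> lock(m_mutex); m_tasks.push(std::move(task)); }
        m_cv.notify_one();
    }

    // Block until every submitted task has finished (end of tick).
    void WaitIdle() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_idle.wait(lock, [this] { return m_tasks.empty() && m_active == 0; });
    }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_stop || !m_tasks.empty(); });
                if (m_stop && m_tasks.empty()) return;
                task = std::move(m_tasks.front());
                m_tasks.pop();
                ++m_active;
            }
            task();  // run outside the lock so workers stay parallel
            { std::lock_guard<std::mutex> lock(m_mutex); --m_active; }
            m_idle.notify_all();
        }
    }

    bool m_stop = false;
    unsigned m_active = 0;
    std::queue<std::function<void()>> m_tasks;
    std::vector<std::thread> m_workers;
    std::mutex m_mutex;
    std::condition_variable m_cv, m_idle;
};
```

Because tasks are pulled from a shared queue rather than assigned to fixed threads, a thread that finishes early simply grabs the next task, which is the load-balancing property the paragraph above asks for.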

To make the task manager cache-friendly, two optimizations are worth considering:

- Reverse issuing. If the order of the primary tasks is fairly static, the tasks can be issued in alternating (reversed) order each frame. The last task executed in the previous frame is likely to still have its data in cache, so issuing the tasks in reverse order for the next frame makes it likely that the data needed is already in the CPU cache.

- Cache sharing. Some multi-core processors split their shared cache into sections so that two processors share one section. If sub-tasks from the same system are issued to processors that share a cache section, the tasks' data is more likely to already be in that shared cache.
