On the ECS Architecture in "Overwatch"

https://blog.codingnow.com/2017/06/overwatch_ecs.html

Today I read an article on the architecture design and network synchronization of "Overwatch". It is a translation of the video of the GDC 2017 talk "Overwatch Gameplay Architecture and Netcode", so there is no original written text. An hour-long talk cannot cover everything, so it was hard to follow. I read it three times, then watched the English video (available with a GDC Vault subscription; it is copyrighted), and came away with a rough understanding of the ECS framework. I am writing this post to record my understanding of ECS, combined with my own years of experience in game development; it may not match the ideas of the original talk exactly.

The Entity Component System (ECS) is a gameplay-level framework built on top of the rendering engine and the physics engine. The main problem it addresses is how to model the update operations of game objects.

Many traditional game engines are based on object-oriented design: everything in the game is an object, each object has a method called Update, and the framework iterates over all the objects and calls each one's Update method in turn. Some engines even define several Update methods that are called at different points within the same frame.

This is actually a serious flaw, and I believe many programmers who have done game development will recognize it. A game object is really an aggregate of many parts, an engine has many functional modules, and the parts that different modules care about are often unrelated. For example, the rendering module does not care about network connections, and the game logic does not care about the player's name or which model is used. It is natural to aggregate the properties of a game object into one object, and that is the most reasonable way to manage the object's life cycle. But for the individual modules, processing objects that aggregate everything, with the processing methods bound to the objects themselves, is much less natural. This results in poor cohesion within modules and unnecessary coupling between them.

I think the Overwatch team designed a new framework to solve this problem because the problem they faced was more complex still: how to use prediction to achieve more accurate network synchronization. Network synchronization cares about only a few object properties, and there is no need to involve unrelated things when designing the synchronization module. To be accurate, the client and server need to run the same code. The server does not need to render anything, so it should be easy to strip out the display systems. The client and server logic are not exactly the same either: some systems need to be shared, while others are implemented separately on each side ...

In general, you need a way to break down complex problems, narrow each problem to a smaller set, and improve the cohesion of each sub-task.

The E in ECS, the Entity, corresponds to the Game Object in a traditional engine. But in this system it is just a combination of Components (the C). Its purpose is lifetime management, and it is represented by a 32-bit ID rather than a pointer, plus the resource ID used for rendering. An integer ID is more robust here because the Entity is only responsible for lifetime management and is not involved in calling methods on the object. An integer ID can also safely refer to an object that is no longer valid, which is hard to do with a pointer.
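To illustrate why an integer ID is safer than a pointer, here is a minimal Python sketch. The `EntityRegistry` class and its methods are hypothetical names of my own, not anything from the talk, and the sketch does not handle ID reuse after the 32-bit counter wraps around.

```python
import itertools


class EntityRegistry:
    """Hypothetical registry: an entity is nothing more than a 32-bit integer ID."""

    def __init__(self):
        self._next_id = itertools.count(1)  # 0 is reserved to mean "no entity"
        self._alive = set()

    def create(self) -> int:
        entity_id = next(self._next_id) & 0xFFFFFFFF  # keep the ID within 32 bits
        self._alive.add(entity_id)
        return entity_id

    def destroy(self, entity_id: int) -> None:
        self._alive.discard(entity_id)

    def is_alive(self, entity_id: int) -> bool:
        # A stale ID is just an integer that is no longer in the set.
        # Checking it is always safe, unlike dereferencing a dangling pointer.
        return entity_id in self._alive


registry = EntityRegistry()
player = registry.create()
registry.destroy(player)
```

After `destroy`, any system still holding `player` can call `registry.is_alive(player)` and simply get `False`.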

C and S are the heart of this framework. A System is the module I mentioned above. In a game, each module should focus on doing one thing well, and each thing either acts on every individual in a group of like objects in the game world, or concerns a particular interaction between such objects. For example, the collision system cares only about an object's volume and position, not its name, connection status, sound effects, hostile relationships, and so on. Nor does it necessarily care about every object in the game world; objects that do not take part in collisions are of no interest to it. So for each subsystem, it is the framework's responsibility to filter out the subset of objects that the system cares about and to expose only the data it cares about.

In the ECS framework, each individual attribute of an object may be a separate Component: an object's name is one Component, and its position state is another. Each Entity is composed of multiple Components that share one lifetime, and combinations of Components serve as the filter criteria for Systems. During development we define a System as caring about a fixed set of Components, and the framework filters out the Entities in the game world that have this combination for the System to traverse. If an Entity has only some of the Components in the set, it does not enter the filtered collection and the System ignores it.
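The filtering described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `World` class, its `view` method, and the `Name` and `Position` Components are hypothetical, and a real framework would cache the filtered sets rather than intersecting them on every call.

```python
from dataclasses import dataclass


@dataclass
class Name:
    value: str


@dataclass
class Position:
    x: float
    y: float


class World:
    """Hypothetical world: one dict per Component type, keyed by entity ID."""

    def __init__(self):
        self._stores = {}  # component type -> {entity_id: component}
        self._next_id = 1

    def create(self, *components) -> int:
        entity_id = self._next_id
        self._next_id += 1
        for component in components:
            self._stores.setdefault(type(component), {})[entity_id] = component
        return entity_id

    def view(self, *types):
        """Yield (entity_id, components...) for entities that have ALL the given types."""
        stores = [self._stores.get(t, {}) for t in types]
        for entity_id in sorted(set.intersection(*(set(s) for s in stores))):
            yield (entity_id, *(s[entity_id] for s in stores))


world = World()
hero = world.create(Name("hero"), Position(0.0, 0.0))
rock = world.create(Position(5.0, 5.0))  # has no Name Component
matched = [entity_id for entity_id, _name, _pos in world.view(Name, Position)]
```

A System that asks for `Name` and `Position` sees only `hero`; `rock` has just part of the set, so it never enters the filtered collection.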

The talk gives an example: a system that decides, based on input state, whether to kick players who have produced no input for a long time. Such an object must have a connection Component, an input Component, and so on. The AFK system traverses all objects that qualify and, based on the time of the most recent input event, takes offline those that have gone too long without input. The speaker points out in particular that AI-controlled bots have no connection Component. Even though they have state Components, they lack the full set of Components the AFK system requires, so they are never traversed and no computing resources are wasted on them. I think this is an advantage of ECS over the traditional object Update model, where you would probably have to write an empty Update function.
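A rough Python sketch of the AFK example follows. The Component names, the dictionary layout, and the 60-second timeout are my own illustrative assumptions; the talk does not give these details.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    kicked: bool = False


@dataclass
class InputState:
    last_input_time: float


AFK_TIMEOUT = 60.0  # seconds; an illustrative value, not taken from the talk


def afk_system(connections, inputs, now):
    """Visit only entities that have BOTH a Connection and an InputState."""
    visited = []
    for entity_id in connections.keys() & inputs.keys():
        visited.append(entity_id)
        if now - inputs[entity_id].last_input_time > AFK_TIMEOUT:
            connections[entity_id].kicked = True
    return sorted(visited)


# Entities 1 and 2 are players; entity 3 is an AI bot with no Connection.
connections = {1: Connection(), 2: Connection()}
inputs = {1: InputState(95.0), 2: InputState(10.0), 3: InputState(0.0)}
visited = afk_system(connections, inputs, now=100.0)
```

The bot (entity 3) is never visited even though it has an `InputState`, so there is nothing like an empty Update function to write for it.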

A game's main loop invokes many different Systems. Each System traverses the objects it is interested in, and only the predefined Component parts are visible to it, so each System can be highly cohesive. Note that this is very different from the traditional object-oriented or Actor model. OO and Actor emphasize that an object handles its own business, while the framework manages the collection of objects and drives them with messages. In ECS, each System cares about a different set of objects, and the objects it handles share a common slice. This suits a MOBA-style game like Overwatch. This kind of game is concerned with relationships between objects. When A attacks B and damages B, the matter lies between A and B, and in the traditional model you agonize over whether the damage calculation belongs in a method of A or a method of B. In ECS there is no such dilemma, because the work is done in the damage calculation System, which cares about the small amount of damage-related data across all objects.

ECS is designed to manage complexity, and it offers a guideline: a Component is a pure bundle of data with no methods for operating on that data, and a System is a pure bundle of methods with no internal state of its own. A System is either a side-effect-free pure function that computes a result from the Component combinations it can see, or it updates the state of specific Components. Systems do not need to call each other (which reduces coupling); instead the game world (the external framework) drives the Systems. If these prerequisites hold, each System can be developed independently: it only needs to traverse the Component collection the framework provides, process it correctly, and update Component state. The people who write gameplay are mostly gluing Systems together. They only need to know exactly what each System does and which Components it affects, and to write the Systems' update order correctly. For a given System, most Components are read-only and only a few are written, and this can be declared clearly in advance. With this knowledge, first, complexity is easier to manage, and second, room is left for parallel optimization.
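As a small illustration of this guideline, here is a Python sketch in which Components are plain data, Systems are stateless functions, and the world alone decides the order in which they run. The two Systems and the `SYSTEM_ORDER` list are hypothetical examples of mine, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Position:
    x: float


@dataclass
class Velocity:
    dx: float


def movement_system(components, dt):
    # Reads Velocity, writes Position; keeps no state of its own.
    positions, velocities = components[Position], components[Velocity]
    for entity_id in positions.keys() & velocities.keys():
        positions[entity_id].x += velocities[entity_id].dx * dt


def friction_system(components, dt):
    # Reads and writes Velocity only (halves it each tick, for illustration).
    for velocity in components[Velocity].values():
        velocity.dx *= 0.5


# The world, not the Systems, decides the order in which Systems run.
SYSTEM_ORDER = [movement_system, friction_system]


def tick(components, dt):
    for system in SYSTEM_ORDER:
        system(components, dt)


components = {Position: {1: Position(0.0)}, Velocity: {1: Velocity(4.0)}}
tick(components, dt=1.0)
tick(components, dt=1.0)
```

Neither System calls the other. Swapping the two entries in `SYSTEM_ORDER` changes the result, which is why the update order has to be written down correctly.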

The talk describes how the development team's understanding of ECS design evolved gradually.

In the beginning, for example, they thought of Components as a filter over Entities that share the same kind of attributes: the ECS framework assists the filtering, and each System iterates over the Components of the relevant Entities, processing every individual in the same way. Later they found that, for a given game world, some kinds of Component can and should have only one instance. The player's keyboard input Component is an example; there are never many of them. Many Systems need to read the state of this unique Component (which keys are pressed), and one System can be assigned to update it. The talk calls this kind of Component a Singleton Component. I think this differs somewhat from the problem ECS originally set out to solve, where different kinds of Entity share the same groups of attributes and the framework manages the collections of like Components. We could still create an Entity called the player keyboard, made up of a keyboard Component, and add it to the game world. But we do not need to iterate over this Entity, since there is certainly only one; we can put the object directly in the game world. Putting it inside a System, however, is not a good design, because that breaks the principle that Systems are stateless, and it does not support multiple game worlds. The talk gives an example: the live game and the game replay are two different game worlds, and different game worlds mean different combinations of business processes, which glue the already developed Systems together in different ways. Building the keyboard state into a particular System would not be appropriate. From this point of view, the essence of ECS is the separation of data (C) from operations (S). An operation S is not limited to managing collections of like Components; it can also work on a single Component. The speaker himself says that in the end 40% of the Components were singletons.
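The point about keeping the singleton in the game world rather than in a System can be sketched as follows. The `World.singleton` accessor and the `KeyboardInput` Component are hypothetical names of mine; the talk does not show how the real accessor looks.

```python
from dataclasses import dataclass, field


@dataclass
class KeyboardInput:
    pressed: set = field(default_factory=set)


class World:
    """Hypothetical world that owns its Singleton Components, keyed by type."""

    def __init__(self):
        self._singletons = {}

    def singleton(self, component_type):
        # Created on first use; exactly one instance per world.
        if component_type not in self._singletons:
            self._singletons[component_type] = component_type()
        return self._singletons[component_type]


def input_system(world, raw_keys):
    # The one System that writes the singleton.
    world.singleton(KeyboardInput).pressed = set(raw_keys)


def is_jumping(world):
    # Any other System can read it without iterating over entities.
    return "space" in world.singleton(KeyboardInput).pressed


live, replay = World(), World()
input_system(live, ["w", "space"])
input_system(replay, ["s"])
```

Because the state lives in each `World`, the live game and the replay keep separate keyboard states while sharing the same stateless Systems.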

A singleton by itself is not very different from the traditional object-oriented model. But separating data from methods is still very meaningful. When we develop in an object-oriented style, we run into objects that have several different kinds of methods: some care about one part of the state, others care about another part, and still others care about a combination of those groups of state. The methods here are the Systems of ECS, and the state is the Components. Separating data from methods lets the different methods be decoupled. With the traditional C++ object-oriented model, you would likely end up using multiple inheritance, composition with forwarding, and other complex language mechanisms.

The talk also mentions some common techniques for dealing with complex problems under ECS.

A Component has no methods, and a System has no state, only the procedures that process Component state. Many Systems are likely to deal with the same kind of problem, involving the same Component types. If this common problem involves only one Entity at a time, the intuitive approach is to design a System that iterates over the Entities, computes the result for each one, and saves it as Component state; other Systems can then read that result as state later.

However, the behavior may involve multiple Entities. For example, several different Systems may need to query whether two Entities are hostile to each other. We cannot use one System to compute the hostile relationships between all pairs of Entities, because that would inevitably generate a lot of unnecessary computation. Or the behavior may not want to modify Component state at all and should stay free of side effects. For example, if I want to continuously simulate how an object's position changes over time, I cannot compute it in one System and then read it out from another.

This is why the concept of a Utility function is introduced for this type of operation; the Utility function is then shared among the Systems that call it. To keep complexity down, one of two conditions is required. Either the function has no side effects, so it can be called anywhere without problems, as in the hostility query above. Or the places that call it are restricted to very few, and the caller carefully manages the effects of the side effects, as in the continuous position simulation above.
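The first kind of Utility function, the side-effect-free one, might look like this in Python. The `Team` Component and both calling Systems are hypothetical; I am only assuming that hostility can be derived from team membership.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    team_id: int


def is_hostile(teams, a, b):
    """Side-effect-free utility: computed on demand, for one pair only."""
    team_a, team_b = teams.get(a), teams.get(b)
    if team_a is None or team_b is None:
        return False  # assumption: entities without a Team are hostile to no one
    return team_a.team_id != team_b.team_id


def targeting_system(teams, shooter, candidates):
    # One caller of the shared utility.
    return [c for c in candidates if is_hostile(teams, shooter, c)]


def healing_system(teams, healer, candidates):
    # Another caller; neither System stores the answer anywhere.
    return [c for c in candidates if c != healer and not is_hostile(teams, healer, c)]


teams = {1: Team(0), 2: Team(0), 3: Team(1)}
```

Both Systems call `is_hostile` only for the pairs they actually need, so no all-pairs hostility table is ever computed.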

If a behavior with the side effect of changing state must exist, and it can be triggered from many Systems, then to reduce the number of call sites we need to concentrate the point where the side effect really happens. The trick is to defer the action: save the state needed when the behavior occurs, put it in a queue, and let a single System process the queue in a separate phase.

For example, different shooting behaviors can create new objects, destroy scenery, and affect the state of existing objects. Different bullet holes left on the same wall do not need to stack up; only the last one needs to be kept and the earlier ones deleted. We can let different Systems trigger these object creations and deletions without really performing them, and defer the actual work to the end of the current frame or the beginning of the next. This ensures that most Systems work without side effects on most Components, while the behaviors with serious side effects are concentrated at a single point and handled carefully.
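A minimal Python sketch of this deferral, using the bullet-hole example: the `World` fields and the two Systems are hypothetical, and keeping one decal per wall is a simplification of "keep only the last one".

```python
from dataclasses import dataclass, field


@dataclass
class World:
    decals: dict = field(default_factory=dict)   # wall_id -> bullet-hole position
    pending: list = field(default_factory=list)  # deferred (wall_id, position) requests


def shooting_system(world, hits):
    # Only records the request; the world's decals are not touched here.
    for wall_id, position in hits:
        world.pending.append((wall_id, position))


def deferred_spawn_system(world):
    # The single place where the side effect actually happens.
    for wall_id, position in world.pending:
        world.decals[wall_id] = position  # a later hole replaces an earlier one
    world.pending.clear()


world = World()
shooting_system(world, [("wall_a", (1, 2)), ("wall_b", (0, 0))])
shooting_system(world, [("wall_a", (3, 4))])
decals_before_flush = dict(world.decals)
deferred_spawn_system(world)  # run once, at the end of the frame
```

Until `deferred_spawn_system` runs, the shooting Systems have changed nothing that other Systems can observe except the queue itself.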

The most complex and most central problem that ECS is meant to solve is perhaps network synchronization. I think this is also the main motivation for designing a framework that strictly separates state from behavior. A good network synchronization system must implement prediction, and where there is prediction there are failed predictions whose conflicts must be resolved, so state rollback must be supported. State rollback also includes rolling back only part of the state, rather than simply rolling back the whole world.

I actually talked about this issue on this blog last year. My point was that keeping state separately is very important. In the ECS model, C is pure data, so taking snapshots and rolling back is very convenient. Splitting an Entity into Components also suits recording just the critical state. Last year a colleague and I built a demo of a shooter-style MOBA; in the final implementation we extracted the game objects' position (movement) state and shooting state specifically to implement predictive synchronization, and the results were very good.

This talk does not go into the specific techniques of prediction and synchronization; instead it explains how ECS helps reduce the complexity of using them. It also mentions some interesting details.

For example, ECS provides a function called UpdateFixed for each System whose behavior depends on input. Overwatch's synchronization logic is based on 60fps, so UpdateFixed is called every 16ms, specifically to compute the state of that logical frame. Depending on the player's latency, the server defers a little and calls UpdateFixed somewhat later than the client. As I also said in last year's blog post on synchronization, players do not really care whether the client and server are absolutely consistent at every instant (absolute consistency is impossible to achieve); what they care about is whether the different clients and the server show the same process. It is like a live broadcast of a movie: different places play it a little earlier or later, and it is enough that everyone sees the same content. Whether they are watching at the same moment does not matter.
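One common way to call a fixed-rate update from a variable-rate render loop is a time accumulator. The sketch below shows that general technique in Python; I am not claiming it is how Overwatch schedules UpdateFixed, and the `FixedStepper` class is a hypothetical name.

```python
FIXED_DT = 1.0 / 60.0  # one logical frame at 60 fps, roughly 16 ms


class FixedStepper:
    """Hypothetical driver that converts variable frame time into fixed ticks."""

    def __init__(self, update_fixed):
        self._update_fixed = update_fixed
        self._accumulator = 0.0
        self.tick = 0

    def advance(self, frame_time):
        # Bank the elapsed time, then spend it in whole fixed-size ticks.
        self._accumulator += frame_time
        while self._accumulator >= FIXED_DT:
            self._accumulator -= FIXED_DT
            self.tick += 1
            self._update_fixed(self.tick, FIXED_DT)


ticks_seen = []
stepper = FixedStepper(lambda tick, dt: ticks_seen.append(tick))
stepper.advance(0.040)  # a slow 40 ms render frame runs two logical frames
stepper.advance(0.012)  # a 12 ms frame plus the leftover time runs one more
```

Numbering the ticks matters for what follows: the client can tag each input with the logical frame it belongs to.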

However, the difference between a game and a movie is that the player's own actions affect the plot. The server needs to arbitrate how the player's input affects the world. The player tells the server at what minute and second of the movie the operation was issued, and the server inserts the operation into the world's timeline at that moment. If the client waited for the server to send back the result of the operation, the game would feel far too laggy, so the client simulates the consequences itself once the operation is issued. If the operation is not interrupted, the client's simulated result and the server's arbitrated result are the same. When the server later sends back the state of the object at some past point in time, it matches what the client simulated back then, and in that case the client happily keeps running forward.

Things differ only when the prediction fails. For example, the player has been running forward, but the server sees that another player has cast a freeze on him and pinned him in place. The position data the server sends back, saying that he stayed at a certain place at a certain moment, then differs from the position he predicted for that moment. After such a failed prediction, the client needs to adjust itself. With the help of ECS this is relatively simple: roll the state back to the version where the divergence occurred, take into account the result returned by the server and the newly learned changes in the world, and re-apply the operations issued since then on top of the state at that moment.
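The rollback-and-replay step can be sketched in Python with a one-dimensional position. `PredictingClient`, `step`, and the tick-indexed dictionaries are hypothetical, and the replay here re-applies only the client's own inputs; a real client would also apply the newly learned world changes, such as the freeze still being in effect.

```python
def step(position, move):
    """Deterministic simulation shared by client and server (1-D for brevity)."""
    return position + move


class PredictingClient:
    def __init__(self):
        self.tick = 0
        self.position = 0
        self.inputs = {}       # tick -> input applied during that tick
        self.history = {0: 0}  # tick -> predicted position after that tick

    def apply_input(self, move):
        self.tick += 1
        self.inputs[self.tick] = move
        self.position = step(self.position, move)
        self.history[self.tick] = self.position

    def on_server_state(self, tick, position):
        if self.history[tick] == position:
            return False  # the prediction was right; keep running forward
        # Misprediction: roll back to the server's state, then replay later inputs.
        self.history[tick] = position
        for t in range(tick + 1, self.tick + 1):
            position = step(position, self.inputs[t])
            self.history[t] = position
        self.position = position
        return True


client = PredictingClient()
for _ in range(5):
    client.apply_input(1)                 # runs forward, predicting 1, 2, 3, 4, 5
ok = client.on_server_state(2, 2)         # the server agrees about tick 2
corrected = client.on_server_state(3, 2)  # frozen during tick 3: stayed at 2
```

Only the position history is rewound here, which is the kind of partial rollback that pure-data Components make cheap.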

The server assumes by default that the client keeps pushing new operations to it at a fixed rate. As mentioned earlier, the server's clock is deliberately behind the client's, so it does not process the client's input immediately. Instead it puts the input into a buffer and then takes input out of the buffer at the same fixed rate as the client (60fps). Thanks to this small buffer, slight network jitter (the travel time of each packet is not exactly the same) has no effect at all. But if the network is unstable, the time may come when the client's operation has not yet arrived. In that case the server also tries to predict what the client did. When the real operation packet arrives, the server compares it with its own prediction and computes the next state from the past state where the divergence arose and the operations actually uploaded.
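Here is a hedged Python sketch of such a server-side buffer. The class name and methods are hypothetical, and the guessing rule (repeat the last known input) is my own assumption; the text only says that the server tries to predict.

```python
class ServerInputBuffer:
    """Hypothetical per-client buffer, drained once per fixed server tick."""

    def __init__(self):
        self._by_tick = {}
        self._last_input = 0
        self.guessed = {}  # tick -> input the server had to guess

    def receive(self, tick, value):
        self._by_tick[tick] = value

    def take(self, tick):
        if tick in self._by_tick:
            self._last_input = self._by_tick.pop(tick)
            return self._last_input
        # Input is late. Assumption: the player keeps doing what they did last tick.
        self.guessed[tick] = self._last_input
        return self._last_input

    def reconcile(self, tick, actual):
        """Return True if the late input differs from the guess (re-simulation needed)."""
        return self.guessed.pop(tick) != actual


buffer = ServerInputBuffer()
buffer.receive(1, 1)
buffer.receive(2, 1)                        # arrived early and waits in the buffer
used = [buffer.take(t) for t in (1, 2, 3)]  # the input for tick 3 has not arrived
needs_resim = buffer.reconcile(3, 0)        # the real tick-3 input was "stop"
```

When `reconcile` returns `True`, the server would recompute from the state at that tick using the real input.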

At the same time, the server realizes that the network is in bad shape and actively tells the client so. The protocol both sides follow at this point is rather interesting. On receiving this message, the client starts compressing time and runs the game at a higher frequency, raising it from 60fps to 65fps. The player feels a slight speed-up, and the result is that the client produces new input at a higher rate: from once every 16 ms to once every 15.2 ms. In other words, for a short time the client's clock runs further ahead of the server's, and the lead keeps growing. The server's read-ahead queue thus receives more of the operations that will happen in the future, and it becomes less likely that a tick arrives without the server knowing the client's input. The total traffic does not increase, however. Suppose a match consists of 10,000 ticks: no matter how the client compresses or advances time, the total data is still the operations of those 10,000 ticks.
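The arithmetic behind the growing lead is simple enough to write down. The Python sketch below uses the two intervals quoted in the text, expressed in microseconds to keep the numbers exact; the function names are hypothetical.

```python
NORMAL_TICK_US = 16_000      # 16 ms per tick, the interval quoted for 60 fps
COMPRESSED_TICK_US = 15_200  # 15.2 ms per tick while the client compresses time


def lead_gained_us(ticks):
    """Extra lead over the server after `ticks` compressed client ticks."""
    return ticks * (NORMAL_TICK_US - COMPRESSED_TICK_US)


def wall_time_us(ticks):
    """Real time the client needs to produce `ticks` compressed ticks."""
    return ticks * COMPRESSED_TICK_US


ticks = 20
lead = lead_gained_us(ticks)    # 16_000 us: one whole extra tick buffered
elapsed = wall_time_us(ticks)   # 304_000 us of real time
```

With these figures each compressed tick gains 0.8 ms, so about 20 ticks (roughly 0.3 seconds) put one extra input into the server's buffer, while the number of inputs sent per tick stays at one.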

Once the unstable period is over, the server notifies the client that things are back to normal. The client knows how much lead its time compression has built up, so it dilates time by the corresponding amount, slowing down (sending operations to the server less frequently) until the state returns to where it started.

BTW, Overwatch communicates over UDP. Judging from the talk, they deal with UDP packet loss in a simple and blunt way: every time, the client bundles all the packets that the server has not yet acknowledged and sends them together. Because the operations in each logical frame are tiny, bundling them does not exceed the MTU limit.
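The resend-until-acknowledged idea can be sketched without any real networking. The `InputSender` class and its packet format (a list of tick and input pairs) are hypothetical; the talk does not describe the wire format.

```python
class InputSender:
    """Hypothetical client-side sender: every packet carries all unacknowledged inputs."""

    def __init__(self):
        self._unacked = {}  # tick -> input, kept in insertion order

    def build_packet(self, tick, value):
        self._unacked[tick] = value
        return list(self._unacked.items())

    def on_ack(self, acked_tick):
        # The server confirmed everything up to and including acked_tick.
        self._unacked = {t: v for t, v in self._unacked.items() if t > acked_tick}


sender = InputSender()
p1 = sender.build_packet(1, "fwd")   # suppose this packet is lost in transit
p2 = sender.build_packet(2, "fwd")   # still carries the input for tick 1
sender.on_ack(2)
p3 = sender.build_packet(3, "jump")  # the acknowledged inputs are dropped
```

Losing `p1` costs nothing, because `p2` repeats its contents; no retransmission timer is needed.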

Where ECS really shows its power in this process is the stage of correcting errors after a failed prediction. Once a past error needs correcting, you have to roll back and re-execute the instructions. Moving and shooting are conventional mechanics, so rolling them back and re-executing them is fairly easy. Abilities are built on StateScript, developed by Blizzard, which achieves the same effect. The power of ECS is that these elements are separated into Components and can be handled individually.

For example, shooting hit determination is a separate System, based on a Component of the target object called ModifyHealthQueue. This Component records all the damage and healing effects the Entity has received. It can be used in the Entity filter: an object without this Component cannot be harmed, so it does not need to take part in hit determination. What really affects hit determination is the MovementState Component, which is also part of the filter for the hit determination System and actually takes part in the computation. After querying the hostile relationship, hit determination gets from MovementState the position of the object to test against, in order to predict whether it is hit (it may need to play the corresponding animation). The damage calculation, however, that is, the data in ModifyHealthQueue, can only be filled in on the server and pushed to the client.

MovementState is rolled back when a wrong prediction needs correcting, and some state other than MovementState is rolled back as well, such as the state of doors and platforms. This rollback is the behavior of a Utility function, and it may affect how the hit is presented, while the damage is the consequence of another fixed behavior (a push determined by the server). They happen on different Component slices of the Entity, so they can be separated orthogonally.

Shooting prediction and confirmation can use an object's range of movement to reduce the amount of hit-test computation. If you always maintain the maximum range over which the object has moved in the recent past (that is, the union of its bounding boxes over that period), then to evaluate a shot that happened earlier you only need to compare the shot's trajectory against the current detection regions of all objects. Only where they intersect is further testing needed: roll the relevant objects back to the moment of the shot and do a strict hit test. If the originally predicted hit result agrees with the current verification, the result does not need fixing (if it hit, exactly where it hit is not important; if it missed, it does not matter where the bullet went).
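The two-phase test can be sketched in one dimension. Everything here is a hypothetical simplification of mine: `MotionBounds` keeps a short position history, the "bounding box" is just a min and max, and a shot is a single coordinate rather than a trajectory.

```python
from collections import deque


class MotionBounds:
    """Hypothetical 1-D record of an entity's recent positions."""

    def __init__(self, history_ticks=4):
        self._positions = deque(maxlen=history_ticks)  # (tick, position) pairs

    def record(self, tick, position):
        self._positions.append((tick, position))

    def bounds(self):
        xs = [p for _, p in self._positions]
        return min(xs), max(xs)

    def position_at(self, tick):
        return dict(self._positions)[tick]


def shot_hits(target, shot_x, shot_tick, radius=0.5):
    low, high = target.bounds()
    if not (low - radius <= shot_x <= high + radius):
        return False  # broad phase: the shot cannot have hit this target
    # Narrow phase: rewind the target to the tick of the shot and test strictly.
    return abs(target.position_at(shot_tick) - shot_x) <= radius


target = MotionBounds()
for tick, x in enumerate([0.0, 1.0, 2.0, 3.0], start=10):
    target.record(tick, x)
```

A shot at `9.0` is rejected by the cheap bounds check alone, while a shot at `2.9` for tick 11 passes the bounds check but fails the strict test, because the target was at `1.0` at that tick.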

If the ping is high, it is often pointless for the client to predict hits; it only adds computation. So once the ping exceeds 220ms, the client no longer predicts hit events in advance and simply waits for the server's result.

Here the ECS framework makes it possible to roll back and recompute only the relevant Components: a System knows which Entities it really cares about and how to roll back what it cares about. This reduces the complexity of development. The game itself is complex, but only a few of the Systems involved in network synchronization affect the game logic, and the Components that take part are almost all read-only. So we can decouple this complex problem from the rest of the engine as much as possible.

ECS is a good framework, but certain rules must be followed for it to deliver its intended effect: greatly reducing the coupling between systems. Not every problem is suited to development under ECS rules, however. Some legacy modules in particular can hardly expose their data structures as Components and concentrate their state-changing methods in independent Systems. In such cases you should do some encapsulation work. For example, some systems already use a multithreaded model for parallel optimization, so the work already done needs to be isolated outside the ECS framework, exposing only a few interfaces that connect to it.

Posted by Cloud Wind in June, 10:56 PM
