Original article: Optimization for C++ Games - Game Programming Gems II
Translated by: carvenson
--------------------------------------------------------------------------------
C++ games are generally considered more reusable and maintainable than C games. But is that really worth it? Can complex C++ programs hope to match traditional C programs for speed?
With a good compiler and a thorough knowledge of the language, it is indeed possible to write efficient game code in C++. This article describes several typical techniques you can use to speed up your game. It assumes that you are already convinced of the benefits of using C++ and that you are familiar with the basic concepts of optimization.
The first general concept, and one that bears repeating, is the importance of profiling. Without profiling, programmers tend to make two kinds of mistakes. The first is optimizing the wrong code: most of a program is not performance critical, so any time spent speeding it up is wasted. Intuition about which code is performance critical cannot be trusted; only direct measurement can tell you. The second is making "optimizations" that actually slow the code down. This is a particular problem in C++, where a deceptively simple line of source can generate a large amount of machine code. Examine the output of your compiler, and profile often.
1. Object Construction and Destruction
The construction and destruction of objects is one of the core concepts of C++, and it is the main area where the compiler generates code behind your back. Carelessly designed programs often spend a great deal of time calling constructors, copying objects, and creating temporary objects. Fortunately, common sense and a few simple rules can make object-heavy code run almost as fast as C.
Do not construct an object until it is needed.
The fastest code is code that never runs. Why create an object that you are not going to use? Consider the following code:
void Function(int arg)
{
    Object obj;
    if (arg == 0)
        return;
    ...
}
Even when arg is 0, we pay the cost of calling the Object constructor and destructor. If arg is often 0, and especially if Object itself allocates memory, the waste adds up quickly. The obvious fix is to move the declaration of obj below the if check.
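A minimal sketch of that fix follows. The Object class here is a hypothetical stand-in (the article does not define it); it counts its own constructions so that the difference between the two layouts can be observed. FunctionEager and FunctionLazy are illustrative names.

```cpp
#include <string>

// Hypothetical stand-in for an expensive class. It counts constructions
// so the effect of moving the declaration is visible.
struct Object
{
    static int sConstructed;
    std::string mData;
    Object() : mData(256, 'x') { ++sConstructed; }  // allocates memory
};
int Object::sConstructed = 0;

// Original layout: the object is built even when we return immediately.
void FunctionEager(int arg)
{
    Object obj;
    if (arg == 0)
        return;
    obj.mData[0] = 'y';
}

// Delayed layout: the early-out path never reaches the constructor.
void FunctionLazy(int arg)
{
    if (arg == 0)
        return;
    Object obj;
    obj.mData[0] = 'y';
}
```

Calling FunctionEager(0) increments the counter, while FunctionLazy(0) leaves it unchanged.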
Be careful about declaring nontrivial objects inside loops, however. If you follow the rule above and delay construction of an object until it is needed inside a loop, you pay for its construction and destruction on every iteration. It is better to construct it once, outside the loop. Likewise, if a function is called in an inner loop and that function constructs an object on the stack, you can construct the object outside the loop and pass it to the function by reference.
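The following sketch illustrates the last point. FormatLabel and LabelLengths are hypothetical names invented for this example; the scratch string is constructed once by the caller and reused on every iteration, on the assumption that clear() keeps the capacity already allocated (which is what common standard library implementations do).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: writes into a caller-supplied buffer, so no
// std::string is constructed or destroyed inside this function.
void FormatLabel(int id, std::string& out)
{
    out.clear();                 // typically retains the existing capacity
    out += "unit_";
    out += std::to_string(id);
}

std::vector<std::size_t> LabelLengths(int count)
{
    std::vector<std::size_t> lengths;
    lengths.reserve(static_cast<std::size_t>(count));

    std::string scratch;         // constructed once, outside the loop
    for (int i = 0; i < count; ++i)
    {
        FormatLabel(i, scratch); // passed by reference and reused
        lengths.push_back(scratch.size());
    }
    return lengths;
}
```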
1.1 Use the initialization list
Consider the following class:
class Vehicle
{
public:
    Vehicle(const std::string& name)
    {
        mName = name;
    }
private:
    std::string mName;
};
Because member variables are constructed before the body of the constructor runs, this code first calls the default constructor of the string mName and then calls operator= to copy in the name. What makes this example particularly bad is that the default string constructor may well allocate memory, and possibly more than is actually needed. The following code is better because it avoids the call to operator=. Furthermore, the non-default constructor can be more efficient because it is given more information, and the compiler may be able to optimize the Vehicle constructor away when its body is empty.
class Vehicle
{
public:
    Vehicle(const std::string& name) : mName(name)
    {}
private:
    std::string mName;
};
1.2 Prefer preincrement to postincrement (i.e., ++i rather than i++)
When you write x = y++, the postincrement operator has to make a copy of the original value of y, increment y, and then return the copy. Postincrement therefore constructs a temporary object, whereas preincrement does not. For integers there is no extra overhead, but for user-defined types this is wasteful. Use preincrement whenever you can; the most common opportunity is the loop variable.
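The difference is easiest to see in how the two operators are conventionally written for a user-defined type. Counter is a hypothetical class used only for illustration; its copy constructor counts copies so the extra temporary can be observed.

```cpp
// Hypothetical iterator-like type used only for illustration.
class Counter
{
public:
    static int sCopies;

    explicit Counter(int value) : mValue(value) {}
    Counter(const Counter& other) : mValue(other.mValue) { ++sCopies; }

    // Preincrement: modify in place and return a reference. No temporary.
    Counter& operator++()
    {
        ++mValue;
        return *this;
    }

    // Postincrement: the old value must be saved, so a copy is unavoidable.
    Counter operator++(int)
    {
        Counter old(*this);   // the temporary the text refers to
        ++mValue;
        return old;
    }

    int Value() const { return mValue; }

private:
    int mValue;
};
int Counter::sCopies = 0;
```

With this class, ++c performs no copies, while c++ performs at least one, even when the returned value is discarded.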
Avoid operators that return by value. Vector addition in C++ is usually written like this:
Vector operator+(const Vector& v1, const Vector& v2)
This operator must return a new Vector object, and it must return it by value. Although this lets you write expressions such as v = v1 + v2, the cost of constructing a temporary and copying the object is usually too high for something called as often as vector addition. Sometimes the code can be arranged so that the compiler is able to optimize the temporary away (this is known as the return value optimization). In general, though, it is better to swallow your pride and write the uglier but usually faster:
void Vector::Add(const Vector& v1, const Vector& v2)
Note that operator+= does not have the same problem: it modifies its first argument in place and does not need to return a temporary. Where possible, therefore, prefer += to +.
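As a rough sketch, the three forms might look like this for a simple three-component vector. The member layout (x, y, z) is an assumption made for the example; the article only gives the signatures.

```cpp
struct Vector
{
    float x, y, z;

    // Writes v1 + v2 into *this; nothing is returned by value.
    void Add(const Vector& v1, const Vector& v2)
    {
        x = v1.x + v2.x;
        y = v1.y + v2.y;
        z = v1.z + v2.z;
    }

    // Modifies the left operand in place and returns a reference to it.
    Vector& operator+=(const Vector& v)
    {
        x += v.x;
        y += v.y;
        z += v.z;
        return *this;
    }
};

// Conventional form, kept for comparison: the result is a new object.
inline Vector operator+(const Vector& v1, const Vector& v2)
{
    Vector result = v1;
    result += v2;
    return result;
}
```

A call such as v.Add(v1, v2) or v1 += v2 writes straight into an existing object, whereas v = v1 + v2 depends on the compiler eliminating the temporary.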
1.3 Use lightweight constructors
In the previous example, should the Vector constructor initialize its elements to zero? That may be convenient in a few places in your code, but it forces every caller to pay for the initialization whether it is needed or not. Temporary vectors and member variables, in particular, silently incur the extra cost.
A good compiler may well remove some of the unnecessary code, but why take the chance? As a general rule, you want a constructor to initialize every member variable, because uninitialized data leads to subtle bugs. However, for small classes that are instantiated frequently, especially as temporaries, you should be prepared to bend this rule for the sake of performance. The prime candidates in many games are the vector and matrix classes. These classes should provide member functions to set themselves to zero and to the identity, respectively, but their default constructors should be empty.
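A vector class following this advice could look roughly like the sketch below. The member names and the SetZero method are illustrative assumptions, not taken from the article.

```cpp
class Vector
{
public:
    // Deliberately empty: temporaries and members pay nothing here.
    // The elements hold indeterminate values until they are assigned.
    Vector() {}

    // Alternate constructor for callers that already know the values.
    Vector(float x, float y, float z) : mX(x), mY(y), mZ(z) {}

    // Explicit zeroing for the callers that actually need it.
    void SetZero() { mX = mY = mZ = 0.0f; }

    float X() const { return mX; }
    float Y() const { return mY; }
    float Z() const { return mZ; }

private:
    float mX, mY, mZ;
};
```

The trade-off is that reading a default-constructed Vector before assigning to it is a bug, so this pattern is best reserved for small, heavily used classes where the cost has been measured.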
A corollary of this principle is that you should provide additional constructors for classes where they can improve performance. Suppose the Vehicle class from the earlier example were instead written like this:
class Vehicle
{
public:
    Vehicle()
    {
    }
    void SetName(const std::string& name)
    {
        mName = name;
    }
private:
    std::string mName;
};
Here we would pay the cost of default-constructing mName and then pay again to set its value later through SetName. Similarly, using the copy constructor is cheaper than constructing an object and then calling operator=. Prefer constructing an object like this: Vehicle v1(v2); rather than like this:
Vehicle v1; v1 = v2;
If you want to prevent the compiler from copying an object for you, declare the copy constructor and operator= as private, but do not implement either of them. Any attempt to copy the object will then produce a compile-time error. Also get into the habit of declaring single-argument constructors as explicit unless you really intend them to be used for type conversion. This prevents the compiler from generating hidden temporary objects when it converts types.
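Both habits can be sketched as follows. Texture and Handle are hypothetical class names chosen for the example. The static_assert checks use type traits that require C++11 or later; on such compilers, marking the two functions as = delete is a more direct alternative to the private-and-unimplemented idiom described above.

```cpp
#include <type_traits>

// Copying is disabled the way the text describes: both functions are
// declared private and intentionally never defined.
class Texture   // hypothetical class name
{
public:
    Texture() {}

private:
    Texture(const Texture&);
    Texture& operator=(const Texture&);
};

// 'explicit' stops the compiler from silently turning an int into a Handle.
class Handle    // hypothetical class name
{
public:
    explicit Handle(int id) : mId(id) {}
    int Id() const { return mId; }

private:
    int mId;
};

// Compile-time checks (C++11 or later).
static_assert(!std::is_copy_constructible<Texture>::value,
              "Texture must not be copy-constructible");
static_assert(!std::is_copy_assignable<Texture>::value,
              "Texture must not be copy-assignable");
static_assert(!std::is_convertible<int, Handle>::value,
              "int must not convert to Handle implicitly");
```

With these declarations, a statement such as Handle h = 7; no longer compiles, while Handle h(7); still does.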
1.4 Preallocate and cache objects
A game typically has a few classes that it allocates and frees very frequently, such as weapons. In a C game, you would allocate a large array up front and use its elements as needed. With a little planning, you can do the same thing in C++. The idea is that instead of continually constructing and destroying objects, you request a new object from a cache and return the old one to it. The cache can be implemented as a template, so that it works for any class that has a default constructor. A sample cache template can be found on the accompanying CD.
You can either allocate objects to fill the cache as they are needed, or preallocate them all up front. If, in addition, you maintain a stack discipline on the objects (meaning that before you delete object X, you first delete all objects allocated after X), you can allocate the cache in one contiguous block of memory.
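The sample from the CD is not reproduced here. The following is only a minimal sketch of what such a cache could look like, assuming a fixed capacity chosen up front; ObjectCache, Acquire, Release, and Weapon are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cache template; not the sample from the book's CD.
template <typename T>
class ObjectCache
{
public:
    explicit ObjectCache(std::size_t capacity)
        : mStorage(capacity)          // default-constructs every T up front
    {
        mFree.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i)
            mFree.push_back(&mStorage[i]);
    }

    // Hands out a preconstructed object, or a null pointer when exhausted.
    T* Acquire()
    {
        if (mFree.empty())
            return nullptr;
        T* object = mFree.back();
        mFree.pop_back();
        return object;
    }

    // Returns an object to the cache. It is recycled, not destroyed.
    void Release(T* object)
    {
        assert(object != nullptr);
        mFree.push_back(object);
    }

    std::size_t Available() const { return mFree.size(); }

private:
    // Not copyable: mFree stores pointers into mStorage.
    ObjectCache(const ObjectCache&);
    ObjectCache& operator=(const ObjectCache&);

    std::vector<T> mStorage;   // one contiguous block of objects
    std::vector<T*> mFree;     // objects currently available
};

// Hypothetical cached class with the required default constructor.
struct Weapon
{
    int ammo;
    Weapon() : ammo(0) {}
};
```

Because a released object keeps whatever state it had, the caller is expected to reset it after Acquire. That is the price of skipping the constructor and destructor calls.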
2. Memory Management
C++ applications generally need to be more aware of the details of memory management than C applications do. In C, every allocation is explicit, through malloc() and free(), whereas C++ can also allocate memory implicitly while constructing temporary objects and member variables. Most C++ games will need their own memory manager. Because a C++ game is likely to perform many allocations, it must be especially careful about heap fragmentation. One option is to take one of the traditional approaches: either allocate no memory at all once the game has started, or maintain one large contiguous block of memory that is freed periodically (between levels, for example). On modern machines such strict measures are unnecessary, provided you are willing to be vigilant about your memory usage.
The first step is to override the new and delete operators, and to use your own implementations to redirect the game's most frequent allocations away from malloc() and into preallocated blocks of memory. For example, suppose you find that you have at most 10,000 4-byte allocations outstanding at any one time. You can allocate 40,000 bytes up front and hand the blocks out as needed. To keep track of which blocks are free, maintain a free list in which each free block points to the next free block. On allocation, remove the block at the front of the list; on deallocation, put the freed block back at the front. Figure 1 shows how the free list might wind its way through a contiguous block of memory after a series of allocations and frees.
Figure 1 A linked free list
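The free list described above can be sketched as a small fixed-size block pool. BlockPool and its member names are hypothetical. One assumption differs from the 4-byte example in the text: because each free block stores the pointer to the next free block in its own bytes, the sketch rounds the block size up to a multiple of the pointer size, which matters on 64-bit machines where a pointer no longer fits in 4 bytes. The block count is assumed to be greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical fixed-size block pool built on an intrusive free list.
class BlockPool
{
public:
    BlockPool(std::size_t blockSize, std::size_t blockCount)
        : mBlockSize(RoundUp(blockSize)),
          mMemory(static_cast<unsigned char*>(
              std::malloc(RoundUp(blockSize) * blockCount))),
          mFreeHead(nullptr)
    {
        assert(mMemory != nullptr);
        // Thread the list back to front so block 0 ends up at the head.
        for (std::size_t i = blockCount; i > 0; --i)
            Free(mMemory + (i - 1) * mBlockSize);
    }

    ~BlockPool() { std::free(mMemory); }

    // Removes the front block; returns a null pointer if none are free.
    void* Allocate()
    {
        if (mFreeHead == nullptr)
            return nullptr;
        FreeNode* block = mFreeHead;
        mFreeHead = block->next;
        return block;
    }

    // Puts the freed block back at the front of the list.
    void Free(void* p)
    {
        FreeNode* block = static_cast<FreeNode*>(p);
        block->next = mFreeHead;
        mFreeHead = block;
    }

private:
    struct FreeNode { FreeNode* next; };

    // Rounds up to a nonzero multiple of the pointer size.
    static std::size_t RoundUp(std::size_t size)
    {
        const std::size_t unit = sizeof(FreeNode);
        const std::size_t units = (size + unit - 1) / unit;
        return (units == 0 ? 1 : units) * unit;
    }

    BlockPool(const BlockPool&);             // not copyable
    BlockPool& operator=(const BlockPool&);

    std::size_t mBlockSize;
    unsigned char* mMemory;
    FreeNode* mFreeHead;
};
```

An overridden operator new for a frequently allocated class could forward to Allocate() on such a pool and the matching operator delete to Free(). That wiring is left out here because it depends on how the game handles pool exhaustion.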
You will typically find that a game makes many small, short-lived allocations, so you will want to reserve space for many small blocks. Reserving many large blocks, however, wastes a substantial amount of memory on blocks that are not currently in use. Above a certain size, you should pass allocations on to a separate large-block allocator, or simply to malloc().
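One way to sketch that size-based dispatch is shown below. All names and both constants are assumptions made for the example: requests of at most kSmallLimit bytes come from a static pool of fixed-size blocks, and everything else (including small requests when the pool is exhausted) goes to malloc(). The free function decides where a pointer came from by checking whether it lies inside the pool.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>

const std::size_t kSmallLimit = 16;     // hypothetical threshold, in bytes
const std::size_t kBlockCount = 1024;   // hypothetical pool capacity

// A block is either a link in the free list or raw storage for the caller.
union alignas(std::max_align_t) Block
{
    Block* next;
    unsigned char bytes[kSmallLimit];
};

static Block gBlocks[kBlockCount];
static Block* gFreeHead = nullptr;
static bool gPoolReady = false;

static void InitPool()
{
    if (gPoolReady)
        return;
    for (std::size_t i = kBlockCount; i > 0; --i)
    {
        gBlocks[i - 1].next = gFreeHead;
        gFreeHead = &gBlocks[i - 1];
    }
    gPoolReady = true;
}

// True if p points into the static pool. std::less gives a well-defined
// ordering even for pointers into unrelated objects.
static bool IsPoolBlock(const void* p)
{
    std::less<const void*> before;
    const void* first = gBlocks;
    const void* last = gBlocks + kBlockCount;
    return !before(p, first) && before(p, last);
}

void* GameAlloc(std::size_t size)
{
    if (size <= kSmallLimit)
    {
        InitPool();
        if (gFreeHead != nullptr)
        {
            Block* block = gFreeHead;
            gFreeHead = block->next;
            return block;
        }
    }
    return std::malloc(size);   // large request, or pool exhausted
}

void GameFree(void* p)
{
    if (p == nullptr)
        return;
    if (IsPoolBlock(p))
    {
        Block* block = static_cast<Block*>(p);
        block->next = gFreeHead;
        gFreeHead = block;
    }
    else
    {
        std::free(p);
    }
}
```

The right threshold and pool size depend on the allocation pattern of the particular game, which is another reason to measure before choosing them.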
3. Virtual Functions
Critics of C++ in games often point to virtual functions as a mysterious feature that drains performance. Conceptually, the mechanism is simple. To call a virtual function on an object, the compiler accesses the object's virtual function table, retrieves a pointer to the member function, sets up the call, and jumps to the member function's address. Compare this with a function call in C, where the compiler sets up the call and jumps to a fixed address. The extra overhead of a virtual function call is the indirection through the virtual function table. Because the target address is not known in advance, there can also be a penalty for missing the processor's cache.
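To make the mechanism concrete, the sketch below writes a one-entry virtual function table out by hand using plain function pointers. It illustrates the idea only; the layout a real compiler generates is implementation-defined, and every name here (Shape, ShapeVTable, and so on) is hypothetical.

```cpp
// Hand-written equivalent of a one-entry virtual function table.
struct Shape;

struct ShapeVTable
{
    float (*Area)(const Shape*);
};

struct Shape
{
    const ShapeVTable* vtable;   // what the compiler would hide in the object
    float a, b;
};

float RectArea(const Shape* s) { return s->a * s->b; }
float TriArea(const Shape* s)  { return 0.5f * s->a * s->b; }

const ShapeVTable kRectVTable = { RectArea };
const ShapeVTable kTriVTable  = { TriArea };

// Direct call: the target address is fixed when the program is built.
float DirectArea(const Shape* s)
{
    return RectArea(s);
}

// "Virtual" call: load the table pointer, load the function pointer,
// then jump to an address that is only known at run time.
float VirtualArea(const Shape* s)
{
    return s->vtable->Area(s);
}
```

The two extra loads in VirtualArea are the indirection the text describes. The compiler also cannot inline the call, because it does not know which function will be reached.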
Any substantial C++ program makes heavy use of virtual functions, so the idea is to avoid virtual function calls in performance-critical areas. Here is a typical example:
class BaseClass
{
public:
    virtual char* GetPointer() = 0;
};
class Class1 : public BaseClass
{
    virtual char* GetPointer();
};
class Class2 : public BaseClass
{
    virtual char* GetPointer();
};
void Function(BaseClass* pObj)
{
    char* ptr = pObj->G