First, let's talk about dynamic memory allocation. malloc and free are the most commonly used allocation functions in C, and new, delete, new[] and delete[] are their C++ counterparts. These functions are the basis of dynamic memory allocation. They are among the most frequently called routines and also among the most CPU-consuming, since they may have to request memory from the operating system. In addition, heavy use easily fragments memory. If memory becomes too fragmented, the system may fail to allocate large blocks or may only be able to satisfy requests from paged virtual memory, which is why some programs tend to slow down and crash after running for two or three hours. Another important factor is that programmers often allocate memory and forget to release it. In a large program especially, it is easy to lose track of where memory was allocated. Memory management is therefore very important for the stability of a game. After all, some players will play for 10 hours without a break.
Currently, the most popular solution is to write your own memory management functions on top of the allocation functions provided by the system. In C, wrap or replace malloc and free so that you can track the allocation and use of every block of memory. In C++, overload the operators new and delete. With your own library, you can easily detect memory leaks. By allocating one sufficiently large block from the operating system when the program starts and managing it yourself, you can guard against leaks, support object reuse, and improve the speed and stability of the game. Of course, you can also use a memory leak detection tool to check memory usage (such as the Firefox memory leak detection tool or Visual Leak Detector).
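As a rough illustration of wrapping malloc and free to track usage, here is a minimal sketch. The names tracked_malloc, tracked_free, tracked_live_blocks and tracked_live_bytes are hypothetical and not from any library. It assumes a C11 compiler (for max_align_t) and a single thread. Each block carries a small header recording its size, so the wrapper can report how many blocks and bytes are still outstanding.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every block. The union with max_align_t
   keeps the user pointer as strictly aligned as plain malloc would. */
typedef union {
    size_t size;        /* payload size requested by the caller */
    max_align_t align;  /* padding only, never read */
} block_header;

static size_t live_blocks = 0;  /* blocks allocated and not yet freed */
static size_t live_bytes = 0;   /* payload bytes not yet freed */

/* Hypothetical wrapper around malloc that records the allocation. */
void *tracked_malloc(size_t size)
{
    block_header *h;

    if (size > SIZE_MAX - sizeof *h)
        return NULL;                 /* request would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    live_blocks++;
    live_bytes += size;
    return h + 1;                    /* user memory starts after the header */
}

/* Hypothetical wrapper around free; only accepts tracked_malloc pointers. */
void tracked_free(void *p)
{
    block_header *h;

    if (p == NULL)
        return;
    h = (block_header *)p - 1;       /* step back to the header */
    live_blocks--;
    live_bytes -= h->size;
    free(h);
}

size_t tracked_live_blocks(void) { return live_blocks; }
size_t tracked_live_bytes(void)  { return live_bytes; }
```

If tracked_live_blocks() is not zero when the program shuts down, some allocation was never released. A fuller version might also record the file and line of each call.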
In fact, game code rarely allocates memory dynamically while the game is running; most memory is allocated in advance. Even data structures such as linked lists and trees are often implemented on top of arrays.
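One possible way to build a linked list on a preallocated array is sketched below. All names here (pool, pool_init, pool_alloc, pool_release, list_push_front, list_sum, POOL_CAPACITY, NIL) are made up for illustration. Nodes link to each other by array index instead of by pointer, and unused nodes are chained into a free list, so no call to malloc is needed at run time.

```c
#define POOL_CAPACITY 256   /* illustrative fixed capacity */
#define NIL (-1)            /* "no node" marker, used instead of NULL */

typedef struct {
    int value;
    int next;               /* index of the next node, or NIL */
} node;

static node pool[POOL_CAPACITY];  /* all nodes live in this array */
static int free_head = NIL;       /* first unused node */

/* Chain every node into the free list. Call once at startup. */
void pool_init(void)
{
    int i;
    for (i = 0; i < POOL_CAPACITY - 1; i++)
        pool[i].next = i + 1;
    pool[POOL_CAPACITY - 1].next = NIL;
    free_head = 0;
}

/* Take a node from the free list; returns NIL if the pool is exhausted. */
int pool_alloc(int value)
{
    int i = free_head;
    if (i == NIL)
        return NIL;
    free_head = pool[i].next;
    pool[i].value = value;
    pool[i].next = NIL;
    return i;
}

/* Return a node to the free list so that it can be reused. */
void pool_release(int i)
{
    pool[i].next = free_head;
    free_head = i;
}

/* Insert at the front of the list *head; returns 1 on success, 0 if full. */
int list_push_front(int *head, int value)
{
    int i = pool_alloc(value);
    if (i == NIL)
        return 0;
    pool[i].next = *head;
    *head = i;
    return 1;
}

/* Walk the list by index and add up the values. */
int list_sum(int head)
{
    int sum = 0;
    for (; head != NIL; head = pool[head].next)
        sum += pool[head].value;
    return sum;
}
```

Because released nodes go back onto the free list, the same storage is reused for the whole run and the heap never fragments.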
====================================
The following describes some precautions to take in code. Among memory-related considerations, memory alignment is the top priority. Alignment means that the starting address of a block of memory must be divisible by 2, 4, 8, 16, 32, or 64. Different CPUs have different requirements for this number.
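The divisibility rule can be checked with a little bit arithmetic. The helpers below, is_aligned and align_up, are hypothetical names for a minimal sketch. Both assume the alignment is a power of two, which holds for every value listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if the address p is a multiple of alignment.
   alignment must be a power of two (2, 4, 8, 16, 32, 64, ...). */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Rounds n up to the next multiple of alignment (a power of two). */
size_t align_up(size_t n, size_t alignment)
{
    return (n + alignment - 1) & ~(alignment - 1);
}
```

For example, align_up(65, 64) gives 128, and an address of 130 is 2-byte aligned but not 64-byte aligned.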
For Intel's latest Pentium dual-core series, the Xeon series, and the earlier Pentium 4 series, we recommend 64-byte or 128-byte memory alignment. On the Pentium 4 series, when a program accesses memory, a hardware prefetch unit reads data from memory into the level 1 cache in advance, and the amount of data read each time is 64 bytes (128 bytes on the Xeon series). If the data is not aligned, for example a 4-byte int whose first byte lies in one 64-byte block and whose last three bytes lie in the next, the CPU has to fetch data from memory one more time to read that int, which greatly increases the running time of the code. Let's take a look at an example:
__declspec(align(64)) int test[128];    // 64-byte aligned (Visual C++ syntax)
int *pint = (int *)((char *)test + 1);  // misaligned pointer, one byte past the boundary
int *pint2 = test;                      // aligned pointer
int F1(void)                            // sums 16 ints through the misaligned pointer
{
    int i, k = 0;
    for (i = 0; i < 16; i++) k += pint[i];
    return k;
}
int F2(void)                            // sums 16 ints through the aligned pointer
{
    int i, k = 0;
    for (i = 0; i < 16; i++) k += pint2[i];
    return k;
}
Based on the VTune test results in the attachment (see Appendix 1), the running time (clockticks value) of the access that is not 64-byte aligned is almost three times that of the aligned access. Therefore, whether you use dynamic or static memory, pay attention to alignment. In Visual Studio .NET, you can use __declspec(align(64)) to align static variables, arrays, or structures. For dynamic memory allocation, you can use _aligned_malloc() and _aligned_free().
Current compilers generally take care of these alignment problems for ordinary variables. However, if you write your own memory management functions, you need to handle alignment yourself.
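As a sketch of what handling alignment yourself can look like, the functions below build an aligned allocator on top of plain malloc. The names my_aligned_malloc and my_aligned_free are hypothetical, and this is one common technique, not how _aligned_malloc() is necessarily implemented. It allocates extra room, rounds the address up to the requested boundary, and stores the original pointer just in front of the aligned block so that it can be freed later.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical aligned allocator layered on malloc.
   alignment must be a power of two and at least sizeof(void *). */
void *my_aligned_malloc(size_t size, size_t alignment)
{
    void *raw;
    uintptr_t addr;
    void **aligned;

    if (alignment < sizeof(void *) || (alignment & (alignment - 1)) != 0)
        return NULL;                              /* unsupported alignment */
    if (size > SIZE_MAX - alignment - sizeof(void *))
        return NULL;                              /* request would overflow */

    /* Room for the payload, the worst-case padding, and one saved pointer. */
    raw = malloc(size + alignment + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Skip space for the saved pointer, then round up to the boundary. */
    addr = (uintptr_t)raw + sizeof(void *);
    addr = (addr + alignment - 1) & ~(uintptr_t)(alignment - 1);

    aligned = (void **)addr;
    aligned[-1] = raw;                            /* remember what to free */
    return aligned;
}

/* Releases a block obtained from my_aligned_malloc. */
void my_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

A call such as my_aligned_malloc(100, 64) returns an address divisible by 64. The cost is up to alignment + sizeof(void *) extra bytes per block, which is one reason a custom memory manager usually aligns whole pools rather than individual small objects.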
====================================
The following describes the array-of-structures problem. An array of structures is usually declared in the following format:
struct mystructure {
    int firstnumber;
    int secondnumber;
    int thirdnumber;
    int fourthnumber;
} structurearray[100];   // array of 100 structures
Another way to organize this kind of data is a structure of arrays. The format is as follows:
struct mystructure {
    int firstnumber[100];
    int secondnumber[100];
    int thirdnumber[100];
    int fourthnumber[100];
} arraystructure;        // one structure holding four arrays
Which of the two forms is better? It depends on how the data is accessed. In general, if you access the same member of every structure in sequence, for example to sum firstnumber across all 100 structures, the second form is much faster, because the values you read sit next to each other in memory. If you compute the sum of all the members of each structure separately, the first form is much faster, because the members of one structure sit next to each other.
=================Calculate the sum of the first member of all structures ============
// Poor choice: each value read is 16 bytes away from the previous one
for (i = 0; i < 100; i++) sum += structurearray[i].firstnumber;
// Better choice: the values read are contiguous
for (i = 0; i < 100; i++) sum += arraystructure.firstnumber[i];
===================Calculate the sum of all members of each structure ==============
// Poor choice: each iteration reads from four separate arrays
// (here sum[] is an int array that holds one total per structure)
for (i = 0; i < 100; i++)
    sum[i] = arraystructure.firstnumber[i]
           + arraystructure.secondnumber[i]
           + arraystructure.thirdnumber[i]
           + arraystructure.fourthnumber[i];
// Better choice: each iteration reads one contiguous structure
for (i = 0; i < 100; i++)
    sum[i] = structurearray[i].firstnumber
           + structurearray[i].secondnumber
           + structurearray[i].thirdnumber
           + structurearray[i].fourthnumber;
The principle needs little explanation: in programming, organize your data according to the operations you perform on it.
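To put rough numbers on the first comparison, the sketch below counts how many cache lines a strided read touches. It is a back-of-the-envelope model, not a measurement. The function lines_touched and the constant CACHE_LINE are hypothetical, and the model assumes a 64-byte cache line, 4-byte ints, and data that starts on a cache line boundary.

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* Counts the distinct cache lines touched when reading `count` elements of
   `elem_size` bytes each, placed `stride` bytes apart, starting at a
   line-aligned address. Assumes stride > 0 and elem_size > 0. */
size_t lines_touched(size_t count, size_t elem_size, size_t stride)
{
    size_t i, lines = 0;
    size_t next_new = 0;   /* lowest line index not counted yet */

    for (i = 0; i < count; i++) {
        size_t first = (i * stride) / CACHE_LINE;
        size_t last = (i * stride + elem_size - 1) / CACHE_LINE;

        if (first < next_new)
            first = next_new;          /* skip lines already counted */
        if (last >= first) {
            lines += last - first + 1;
            next_new = last + 1;
        }
    }
    return lines;
}
```

Under these assumptions, summing firstnumber over the array of structures is lines_touched(100, 4, 16), which gives 25 cache lines, while the same sum over the structure of arrays is lines_touched(100, 4, 4), which gives 7. For the per-structure sum, each 16-byte structure lies within a single cache line, whereas the structure of arrays has to draw from four different lines in every iteration.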
There are many other things to watch for in memory access, such as aliasing and store forwarding. We recommend that you refer to Intel's Pentium documentation.